Honestly, Grok's technology is not impressive at all, and I wonder why anyone wo...

dilap · on May 19, 2025

I like it! For me it has replaced Sonnet (3.5 at the time, but 3.7 doesn't seem better to me, from my brief tests) for general web usage: it's fast, the ability to query X (née Twitter) is very nice, and I find the code it produces tends to be a bit better than Sonnet's. (Though perhaps that depends a lot on the domain... I'm doing mostly C# in Unity.)

For tough queries o3 is unmatched in my experience.

t1amat · on May 19, 2025

Llama is arguably the reason open-weight LLMs are a thing, with the leak of Llama 1 and the subsequent release of Llama 2. Llama 3 was a huge push for quality, size, context length, and multi-modality. Llama 4 Maverick is clearly better than it looks if a fine-tune can put it at the top of the LMArena human-preference leaderboard.

Grok 3 mini is quite a decent agentic model and competitive with frontier models at a fraction of the cost; see livebench.ai.

Zambyte · on May 19, 2025

The only interesting thing about Grok is using it hooked up to the X firehose to query about events in real time. Unfortunately it sucks at that.

vitorgrs · on May 20, 2025

Although DeepSeek is old, I still find V3 (without reasoning) to be the best non-reasoning model out there.

Now, ChatGPT's main advantage for me right now is search + o4-mini. They really did an amazing job training it on agentic tasks (their tools...), and search with reasoning works amazingly well.

Way better than grok search or anything.

mhh__ · on May 20, 2025

Grok search is really good

Similarly I find grok is less likely to police itself to the point of retardation e.g. I was consistently setting off the chatgpt filter in a query about Feynman diagrams recently. Why?

jbellis · on May 20, 2025

Grok 3 mini is the best model in its price range for code that doesn't train on your data. So it's part of Brokk's free plan. https://brokk.ai

bigyabai · on May 20, 2025

> that doesn't train on your data.

Don't say that for sure unless you're inferencing it on your own machine.

laybak · on May 21, 2025

agreed. we should normalize this level of skepticism/scrutiny for all claims from big AI labs

CobrastanJorji · on May 20, 2025

You don't trust Elon Musk at his word?

ls612 · on May 19, 2025

Before the release of Gemini 2.5, Grok 3 was the best coding AI IME, especially when you used reasoning. It also complained the least about things you asked it to do. Gemini, for instance, still won’t tell you how to use yt-dlp.

drozycki · on May 19, 2025

Gemini gave me a yt-dlp command two weeks ago without complaining. Can you share your log to compare?

https://g.co/gemini/share/638562c1a8f4

jeffhuys · on May 20, 2025

Grok is almost completely uncensored. That's incredibly useful.

includenotfound · on May 21, 2025

Indeed. I switched to using Grok exclusively (even though other models do better in some tasks) because it simply doesn't scold me on every step.

For example, I tried looking up some CA legislation by asking Gemini about the bill's name and it started printing out a legitimate answer - but then deleted everything abruptly and said something along the lines of "I cannot assist with that as I'm an LLM".

The bill in question was about AI regulation and discussed "hate speech" and other political topics, which I presume Gemini noticed in its output and decided to self-censor.

Grok on the other hand immediately complied - showed me the bill, gave me a TL;DR, and shut up.

Another example is: I found a bunch of old HDDs from old laptops. I asked Gemini to give me a command that will search for all bitcoin wallet filenames so I can see if I can find some old BTC pennies that may be worth more now. Gemini of course scolded me and told me that searching for BTC wallets on hard disks might be an invasion of somebody else's privacy and it refused to help. Grok on the other hand cooperated and shut up.

And yes, I might have worded my prompt carelessly (e.g. "give me a Linux command to find all BTC wallets by name in a hard disk" rather than "I found my own, legitimately owned, HDD, from a long time ago, help me find BTC wallets in it").

But I shouldn't have to walk on eggshells talking to smart sand, and I won't.

frollogaston · on May 21, 2025

This caught my interest. "Generate an image of George W Bush on the beach"... Grok does it. Gemini and ChatGPT both refuse.

adrr · on May 20, 2025

At least twice they had unauthorized changes to their prompts that injected far-right content into responses on unrelated topics. Imagine you're using it for a chatbot and it starts spouting white nationalist content like the "great replacement" theory.

https://www.theguardian.com/technology/2025/may/14/elon-musk...

rurp · on May 20, 2025

True, although "unauthorized" might deserve scare quotes given the source and how pertinent those changes were to the boss's immediate interests.

duskwuff · on May 20, 2025

What was the other time? The incident linked at the bottom of that article ("into trouble last year") wasn't an "unauthorized change", as far as I'm aware; it was a general lack of guardrails on image generation.

inferiorhuman · on May 20, 2025

White genocide and holocaust denial.

stuaxo · on May 20, 2025

"Unauthorised" and yet seem to line up with what Elon himself likes on X comments.

duskwuff · on May 20, 2025

That was the one I was aware of. Was there another incident separate from that?

inferiorhuman · on May 20, 2025

While I'm sure the same rogue "employee" was responsible for both, they are separate incidents. Musk's AI service was pushing "white genocide" lies as answers to unrelated prompts. It was only spouting holocaust denial lies when asked directly.

mhh__ · on May 20, 2025

[flagged]

Planktonne · on May 20, 2025

> from London which is now what 35% English

How are you defining "English" here?

jeffbee · on May 20, 2025

Do you recall anything about the history of England? For example did the Indians vote to become subjects of the Queen?

mhh__ · on May 20, 2025

A radical metaphor, you make it sound deliberate

redox99 · on May 20, 2025

Grok is much more concise, to the point, no bs. Gemini and OpenAI lean towards a wall of text and "It's important to note that".

I'm sure with a good system prompt you can mitigate that. I'm just comparing them out of the box.

arresin · on May 19, 2025

I’ve found 3.7 to be garbage. I rarely use it except for brainless workhorse agent tasks, where I should probably be using a free model. It really mangles code if you let it do anything slightly complicated.

Workaccount2 · on May 19, 2025

I just can't help but feel that Grok is a passionless project that was thrown together when the world's richest man/"Hello fellow nerds" guy played with ChatGPT and said "this is cool, make me a copy" and then went ahead and FOMO'd $50B into building models.

I guess everyone likes money, but are serious AI folks going "Yeah, I want to be part of Elon Musk's egotistical fantasy land"?

hnsigmaomega · on May 19, 2025

Do you know who started OpenAI?

Workaccount2 · on May 19, 2025

OpenAI in 2018 was not sitting on the same tech as it was in 2023. It just makes the FOMO even more apparent.

JohnMakin · on May 19, 2025

do you?