
Honestly, Grok's technology is not impressive at all, and I wonder why anyone would use it:

- Gemini is state-of-the-art for most tasks

- ChatGPT has the best image generation

- Claude is leading in coding solutions

- DeepSeek is getting old, but it is open-source

- Qwen has impressive lightweight models.

But Grok (and Llama) is even worse than DeepSeek for most of the use cases I tried it with. The only thing it has going for it is the money behind its infamous founders. Other than that, their existence would barely be acknowledged.



I like it! For me it has replaced Sonnet (3.5 at the time, but 3.7 doesn't seem better to me, from my brief tests) for general web usage: it's fast, the ability to query X (née Twitter) is very nice, & I find the code it produces tends to be a bit better than Sonnet's. (Though perhaps that depends a lot on the domain... I'm doing mostly C# in Unity.)

For tough queries o3 is unmatched in my experience.


Llama is arguably the reason open-weight LLMs are a thing, with the leak of Llama 1 and subsequent release of Llama 2. Llama 3 was a huge push for quality, size, context length, and multi-modality. Llama 4 Maverick is clearly better than it looks if a fine-tune can put it at the top of the LMArena human preferences leaderboard.

Grok 3 mini is quite a decent agentic model and competitive with frontier models at a fraction of the cost; see livebench.ai.


The only interesting thing about Grok is using it hooked up to the X firehose to query about events in real time. Unfortunately it sucks at that.


Although DeepSeek is old, I still find V3 (without reasoning) to be the best non-reasoning model out there.

ChatGPT's main advantage for me right now is search + o4-mini. They really did an amazing job training it on agentic tasks (their tools...), and search with reasoning works amazingly well.

Way better than Grok search or anything else.


Grok search is really good

Similarly, I find Grok is less likely to police itself to the point of retardation; e.g., I was consistently setting off the ChatGPT filter in a query about Feynman diagrams recently. Why?


Grok 3 mini is the best model for code in its price range that doesn't train on your data. So it's part of Brokk's free plan. https://brokk.ai


> that doesn't train on your data.

Don't say that for sure unless you're inferencing it on your own machine.


Agreed. We should normalize this level of skepticism/scrutiny for all claims from big AI labs.


You don't take Elon Musk at his word?


Before the release of Gemini 2.5, Grok 3 was the best coding AI IME, especially when you used reasoning. It also complained the least about things you asked it to do. Gemini, for instance, still won’t tell you how to use yt-dlp.


Gemini gave me a yt-dlp command two weeks ago without complaining. Can you share your log to compare?

https://g.co/gemini/share/638562c1a8f4


Grok is almost completely uncensored. That's incredibly useful.


Indeed. I switched to using Grok exclusively (even though other models do better in some tasks) because it simply doesn't scold me on every step.

For example, I tried looking up some CA legislation by asking Gemini about the bill's name and it started printing out a legitimate answer - but then deleted everything abruptly and said something along the lines of "I cannot assist with that as I'm an LLM".

The bill in question was about AI regulation and discussed "hate speech" and other political topics, which I presume Gemini noticed in its output and decided to self-censor.

Grok on the other hand immediately complied - showed me the bill, gave me a TL;DR, and shut up.

Another example is: I found a bunch of old HDDs from old laptops. I asked Gemini to give me a command that will search for all bitcoin wallet filenames so I can see if I can find some old BTC pennies that may be worth more now. Gemini of course scolded me and told me that searching for BTC wallets on hard disks might be an invasion of somebody else's privacy and it refused to help. Grok on the other hand cooperated and shut up.

And yes, I might have worded my prompt carelessly (e.g. "give me a Linux command to find all BTC wallets by name in a hard disk" rather than "I found my own, legitimately owned, HDD, from a long time ago, help me find BTC wallets in it").

But I shouldn't have to walk on eggshells talking to smart sand, and I won't.


This caught my interest. "Generate an image of George W Bush on the beach"... Grok does it. Gemini and ChatGPT both refuse.


At least twice, they had unauthorized changes to their prompts that injected far-right content into responses to unrelated queries. Imagine you're using it for a chatbot and it starts spouting off white nationalist content like the "great replacement" theory.

https://www.theguardian.com/technology/2025/may/14/elon-musk...


True, although "unauthorized" might deserve scare quotes given the source and how pertinent those changes were to the boss's immediate interests.


What was the other time? The incident linked at the bottom of that article ("into trouble last year") wasn't an "unauthorized change", as far as I'm aware; it was a general lack of guardrails on image generation.


White genocide and Holocaust denial.


"Unauthorised", and yet they seem to line up with what Elon himself likes in X comments.


That was the one I was aware of. Was there another incident separate from that?


While I'm sure the same rogue "employee" was responsible for both, they are separate incidents. Musk's AI service was pushing "white genocide" lies as answers to unrelated prompts. It was only spouting Holocaust denial lies when asked directly.


[flagged]


> from London which is now what 35% English

How are you defining "English" here?


Do you recall anything about the history of England? For example did the Indians vote to become subjects of the Queen?


A radical metaphor, you make it sound deliberate


Grok is much more concise, to the point, no bs. Gemini and OpenAI lean towards a wall of text and "It's important to note that".

I'm sure with a good system prompt you can mitigate that. I'm just comparing them out of the box.


I’ve found 3.7 to be garbage. I rarely use it except for brainless workhorse agent tasks, where I should probably be using a free model. It really mangles code if you let it do anything slightly complicated.


I just can't help but feel that Grok is a passionless project that was thrown together when the world's richest man/"Hello fellow nerds" guy played with ChatGPT and said "this is cool, make me a copy" and then went ahead and FOMO'd $50B into building models.

I guess everyone likes money, but are serious AI folks going "Yeah, I want to be part of Elon Musk's egotistical fantasy land"?


Do you know who started OpenAI?


OpenAI in 2018 was not sitting on the same tech as it was in 2023. It just makes the FOMO even more apparent.


do you?



