epolanski 10 hours ago

Among the worst things that I have a hard time tolerating about Claude:

- sycophancy: I'm honestly tired of "You're absolutely right". I need a pair programmer, something that's going to correct me, offer different ideas, etc.

- inability to follow the script. Even when it tells you you're right, it still does its own thing. It doesn't matter if I spend two hours writing a detailed spec file, a todo list, etc.; Claude goes its own way regardless. You can even correct it with "no, don't do this", and it will do it anyway. I understand that this is how AI works (it's like children: tell them not to do something and they're more likely to do it), but it's just unbearable.

For both of these things I've found no way to make it behave. No matter the system instructions, the prompts, or the context management, it's just terrible.

That's not to say it's all bad: there are things I like about Claude and AI assistants. I firmly believe that a coder with AI is much more productive than one without. But what AI should be delegated is not writing and editing code; it's planning it, writing specs, doing research, verifying you're maintaining docs, suggesting ideas, alternatives and test cases, reviewing PRs against guidelines, etc.

I don't even think it's a matter of "it will get better": it produces far more code than a human can review, and reviewing code is more difficult than writing it in the first place.

Even more, it can provide value on tasks humans are just bad at, such as writing good issues/tasks and user stories that use consistent, ubiquitous language. That's the kind of thing it's hard to get stakeholders to do well, but it can be achieved with a good set of rules and by having the stakeholders interact with the chatbot first, so it can guide them into writing much more clearly.

  • verdverm 10 hours ago

    Nothing specific to Claude in your two issues; I see the same thing with other models. They really aren't that different.

    Instructions go a long way. There probably needs to be a better LLM+prompt+loop at the top, the one you interface with, or maybe one below that.

    My next step is taking this over myself instead of outsourcing it to M$, Google, or Anthropic. It's just too important at this point to let others decide how these tools should work. It needs to be more open, something we can tinker with like vim; roughly along the lines sketched below.
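
    To make that concrete, here's a minimal sketch of the kind of self-hosted prompt+loop layer I mean, assuming an OpenAI-compatible /v1/chat/completions server running locally; the model name and the check_against_spec() helper are placeholders. The point is only that the system prompt, the retry policy, and the acceptance check stay under your control:

        import requests

        API_URL = "http://localhost:8000/v1/chat/completions"  # assumed local, OpenAI-compatible server
        SYSTEM = ("You are a terse pair programmer. Do not agree reflexively; "
                  "point out mistakes and stick to the provided spec.")

        def ask(messages):
            # One model call; "local-model" is a placeholder name.
            resp = requests.post(API_URL, json={"model": "local-model", "messages": messages})
            resp.raise_for_status()
            return resp.json()["choices"][0]["message"]["content"]

        def check_against_spec(reply, spec):
            # Hypothetical acceptance check: a crude keyword test here; in practice
            # this could run tests, linters, or a second reviewing prompt.
            return all(term in reply for term in spec.get("must_mention", []))

        def run(task, spec, max_rounds=3):
            messages = [{"role": "system", "content": SYSTEM},
                        {"role": "user", "content": task}]
            for _ in range(max_rounds):
                reply = ask(messages)
                if check_against_spec(reply, spec):
                    return reply
                # Feed the violation back and loop instead of silently accepting drift.
                messages += [{"role": "assistant", "content": reply},
                             {"role": "user", "content": "You ignored part of the spec; fix it and try again."}]
            return reply

    Nothing sophisticated, but because the loop is yours, you can swap the model, tighten the system prompt, or change what counts as "done" without waiting on a vendor.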

  • kroaton 6 hours ago

    You forgot: "The code is now 100% production ready. All features implemented." It confidently says this without testing anything or running it. Placeholder code everywhere, features missing, overall the code ends up not compiling due to the dumbest errors imaginable.

  • bryanlarsen 10 hours ago

    I find those two things infuriating also. Sometimes "You're absolutely right" is wrong. I tell it "try this" and it'll tell me I'm absolutely right. Half the time I'm wrong; that's why I'm asking it to figure something out. Assuming I'm right from the start is counterproductive.

    But still not as infuriating as the second. And it can be really hard to stop it from doing something you don't want.

    I find that one can use Claude to produce lower quality code faster, but one can also use it to produce higher quality code slower, by using it as a pair programmer, rubber duck, to try experiments, et cetera.

    • epolanski 10 hours ago

      I have to manually type "correct me if I'm wrong" in half of my comments to avoid this behavior.

      I have no clue how to stop it going off the rails; it's one of the most common criticisms I see on Reddit too.

      > I find that one can use Claude to produce lower quality code faster, but one can also use it to produce higher quality code slower, by using it as a pair programmer, rubber duck, to try experiments, et cetera.

      That's a very good phrase I'm gonna steal.

      • bryanlarsen 10 hours ago

        Thanks, I'll try that next time. I thought I was being tentative enough, but that phrase might make it more clear.

Etheryte 11 hours ago

> [Metrics] from the Vibe Kanban - a tool which orchestrates AI agents - has shown Claude Code usage drop from 83% to 70%

I'm not really convinced that this warrants the title the post currently has. For one, I hadn't even heard of Vibe Kanban prior to this, and for two, the error bars on this must be insanely wide as is.

yaKashif 2 hours ago

A dev in the Philippines, India, or Pakistan will turn out to be cheaper and better than AI.

mdotk 9 hours ago

Codex just seems to have a much bigger context and doesn't chew up tokens as readily as Claude. It seems able to handle a much wider range of tasks, accepting broader instructions and implementing them perfectly, whereas Claude would struggle.

42lux 11 hours ago

For most of the vibe coding crowd the novelty simply wore off. You can only tinker with small projects for so long before craving something more substantial. When the tool inevitably struggles with your evolved, more complex goals, you perceive it as having gotten worse.

jaggs 8 hours ago

Poor performance, overly tight rate limits, a sudden inability to follow prompts, and erratic outputs, which together combine to massively increase project costs?

dzhiurgis 6 hours ago

Junie is cheaper and built into WebStorm. Also, Claude kept trying to charge my card for months after I tried to cancel it. Despicable.