Hacker News | thtmnisamnstr's comments

The general rule of thumb is that you need at least as much VRAM as the model's size on disk. Quantized 30B models are usually around 19GB, so most likely a GPU with 24GB of VRAM.
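As a rough back-of-envelope check (the 5 bits/param here is a hypothetical figure for a 4-bit quant plus metadata, not any specific model's format), weight memory is roughly parameter count times bytes per parameter:

```python
def weights_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB: params * bits / 8, ignoring runtime overhead."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# A 30B model at ~5 bits/param lands near the ~19GB figure above.
print(round(weights_gb(30, 5), 2))  # 18.75
```

The same math shows why a 24GB card is the usual recommendation: it leaves a few GB of headroom for the KV cache and runtime.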


But this also means tiny context windows. You can't fit gpt-oss:20b + more than a tiny file + instructions into 24GB


Gpt-oss is natively 4-bit, so you kinda can


You can fit the weights + a tiny context window into 24GB, absolutely. But you can't fit a context of any reasonable size. Or maybe Ollama's implementation is broken, but when I last tried it, the context had to be restricted beyond usability to keep it from freezing up the entire machine.
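The context-window pressure comes from the KV cache, which grows linearly with sequence length. A sketch under made-up, illustrative transformer dimensions (not gpt-oss's actual config):

```python
def kv_cache_gb(seq_len: int, layers: int, kv_heads: int,
                head_dim: int, bytes_per_value: int = 2) -> float:
    """KV cache size: 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value / 1e9

# Illustrative config: 24 layers, 8 KV heads, head_dim 64, fp16 cache.
print(round(kv_cache_gb(131_072, 24, 8, 64), 2))  # 6.44 GB at a 128k context
```

So on a 24GB card that already holds ~19GB of weights, even a modest cache config forces a much shorter context.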


Gemini CLI is a solid alternative to Claude Code. The limits are restrictive, though. If you're paying for Max, I can't imagine Gemini CLI will take you very far.


Gemini CLI isn't even close to the quality of Claude Code as a coding harness. Codex and even OpenCode are much better alternatives.


Gemini CLI regularly gets stuck failing to do anything after declaring its plan to me. There seems to be no way to unlock it from this state except closing and reopening the interface, losing all of its progress.


You should be able to copy the entire conversation and paste it in (including thinking/reasoning tokens).

When you have a conversation with an AI, in simple terms, each time you type a new line and hit enter, the client sends the entire conversation to the LLM. It has always worked this way, and it's how "reasoning tokens" were first realized: you allow a client to "edit" the context, and the client deletes the hallucination, appends "Wait..." at the end of the context, and hits enter.

The LLM is tricked into thinking it's confused/wrong/unsure, and "reasons" more about that particular thing.
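A minimal sketch of that statelessness, assuming an OpenAI-style messages list (the edit step is illustrative; most hosted clients don't expose it):

```python
# Each turn resends the ENTIRE conversation; the server keeps no state.
history = [
    {"role": "user", "content": "What's the capital of Australia?"},
    {"role": "assistant", "content": "Sydney."},  # a hallucination
]

# The "edit the context" trick: drop the bad reply and seed doubt,
# so the model re-reasons instead of defending its earlier answer.
history.pop()
history.append({"role": "assistant", "content": "Wait..."})

# response = client.chat.completions.create(model=..., messages=history)
print(history[-1]["content"])  # Wait...
```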


Depending on task complexity, I like to write a small markdown file with the list of features or tasks. If I lose a session (with any model), I'll start with "we were disconnected, please review the desired features in 'features.md', verify current state, and complete anything remaining."

That has reliably worked for me with Gemini, Codex, and Opus. If you can get them to check off features as they complete them, it works even better (i.e., success criteria and an empty checkbox for them to mark off).


Well, I use Gemini a lot (because it's one of three allowed families), but tbh it's pretty bad. I mean, it can get the job done but it's exhausting. No pleasure in using it.


I tried Gemini a year or so ago, and I gave up after it flatly refused to write me a script and instead tried to tell me how to learn to code. I am not making this up.


That's at least two major updates ago. Probably worth another try.


Gemini is my preferred LLM for coding, but it still does goofy shit once in a while even with the latest version.

I'm 99.9999% sure Gemini has a dynamic scaling system that will route you to smaller models when it's overloaded, and that seems to be when it will still occasionally do things like tell you it edited some files without actually presenting the changes to you, or go off on other strange tangents.


I tried it on Tuesday and, having used CC a lot lately, was shocked at how bad it was - I'd forgotten.


Kilocode is a good alt as well. You can plug into OpenRouter or Kilocode to access their models.


> 1. Unlike most developed countries, in India (and many other developing countries), people in authority are expected to be respected unconditionally (almost). Questioning a manager, teacher, or senior is often seen as disrespect or incompetence. So, instead of asking for clarification, many people just "do something" and hope it is acceptable. You can think of this as a lighter version of Japanese office culture, but not limited to the office... it's kind of everywhere in society.

I was a manager at Deloitte in their tech consulting practice. I led fairly large teams of devs based in India. This is very true, and it takes a lot of time and trust-building to overcome. Making Indian devs, especially early-career ones, comfortable enough to oppose something or offer feedback is non-trivial, and often Indian engineering managers make it more difficult. Overcoming cultural hierarchy is hard.


Uncertainty is frequently a contributor to depression. Uncertainty is one of the most reliable stress triggers, which, over prolonged periods of time, especially when paired with low perceived control, is a direct path to increased depression. So if something is uncertain, it is often depressing as well.


What they are describing sounds like the Glove80. I've been using the Glove80 for a year, and I'm a huge fan of it. Took a while to get used to, but now typing is way more comfortable than it was on even my old Kinesis Freestyle.


I'm a marketer. I write a lot. GPT-4.5 is really good at natural-sounding writing. It's nearing the point where it would be worth $200/month for me to have access to it all the time.


I used the GPT-4.5 API to write a novel, with a reasonably simple loop-based workflow. The novel was good enough that my son read the whole thing. And he has no issue quitting a book part way through if it becomes boring.


I guess I don't really understand why. I'm a writer. The joy in storytelling is telling a story. Why outsource that to a bot?


Books create joy for people other than the authors. The joy isn't confined to the writing process.


No, but knowing that a book was written by a bot would hinder my enjoyment of it to the point that I'd drop it.


I’m curious: what was the novel about?


It's a comedic adventure novel set in the Minecraft universe.

Actually I forgot there's a second one he read all the way through, for which he defined the initial concept and early plot, but then the rest of the plot and the writing were all done by GPT-4.5.

The code is kind of basic, and each chapter is written without the full text of prior chapters, but the output isn't bad.

https://gist.github.com/rahimnathwani/41e5bc475163cd5ea43822...


Fascinating. I tried doing the same years ago with a simple Markov chain model. The biggest problem back then was inconsistency. I'd love to read a chapter of the Minecraft or hard magic / sci-fi books to check out the writing.
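For reference, the Markov chain approach mentioned above can be sketched in a few lines (a toy order-1 word model; the corpus here is a placeholder):

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for a, b in zip(words, words[1:]):
        chain[a].append(b)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain from a start word, sampling a follower at each step."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

chain = build_chain("the creeper saw the player and the player ran")
print(generate(chain, "the", 8))
```

With no memory beyond the previous word, inconsistency is guaranteed, which is exactly the problem a large context window addresses.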


Email in profile.


Not having access to earlier chapters is a real handicap, but maybe workable if you aren’t too bothered by inconsistency (or if your chapter summaries are explicit enough about what is supposed to happen, I suppose).

I find the quality rapidly degrades as soon as I run out of context to fit the whole text of the novel. Even summarizing the chapters doesn’t work well.


Yeah this is true. I could have sent the entire book up until that point as context. But doing that 100 times (once per chapter) would have meant sending roughly 50x the length of the book as input tokens (going from 0% to 100% as the book progressed).

This would be fine for a cheap model, but GPT 4.5 was not cheap!

I would have liked to have fewer, longer chapters, but my (few) experiments at getting it to output more tokens didn't have much impact.
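The ~50x figure is just the sum of a growing prefix: chapter k's prompt would resend the first k-1 chapters, so over n chapters the total input is about n/2 books' worth of tokens:

```python
n = 100  # chapters, assuming roughly equal length

# Chapter k's prompt would include the fraction (k - 1) / n of the book
# already written; total input is the sum of those prefix fractions.
total_books_sent = sum((k - 1) / n for k in range(1, n + 1))
print(total_books_sent)  # 49.5 -- roughly 50x the book length
```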


Yeah, that’s what I eventually ended up doing. Quality and cost both went through the roof. To be fair, Claude is good about caching, and with a bunch of smart breakpoints, you pay only 10% for most generations.
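Those "smart breakpoints" presumably refer to Anthropic's prompt caching, where a cache_control marker on a content block makes the prefix up to that point cacheable, and cached reads are billed at a fraction of the normal input rate. A hypothetical request sketch (model name and text are placeholders):

```python
# Mark the long, stable prefix (prior chapters) as a cache breakpoint
# so subsequent calls re-read it at the discounted cached rate.
request = {
    "model": "claude-example",  # placeholder model name
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "<full text of chapters so far>",  # elided
                    "cache_control": {"type": "ephemeral"},  # breakpoint
                },
                {"type": "text", "text": "Now write the next chapter."},
            ],
        }
    ],
}
print(request["messages"][0]["content"][0]["cache_control"]["type"])  # ephemeral
```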


If everyone is as good as you, how much will your work cost?


A better question might be: "If everyone is as good as you, how much will you be worth in the marketplace?"


Well, an even better question might be: if everyone is the same, what does it take to be exceptional?

I'm firmly convinced that being able to troubleshoot code, even code generated by LLMs, and to write guidelines and tests to make sure it's functioning, is a skill held by a shrinking pool of people.

For smaller stuff, great. Everyone's the same. The second your application starts gaining responsibility and complexity, you're going to need to be able to demonstrate reproducibility and reliability of your application to stakeholders.

Like, your job increasingly will be creating interface checkpoints in the code, and then having the model generate each step of the pipeline. That's great, but you have to understand and validate what it wrote, AND have a rich set of very comprehensive tests to be able to iterate quickly.

And as mentioned, on top of that, large swaths of the field of new people have their brains completely rotted by these tools. (Certainly not all new/young people, but I've seen some real rough shit.)

If anything, I see a weird gap opening up:

- people who don't adopt these tools start falling out of the industry - they're too slow

- people who adopt these tools too early stop getting hired - they're too risky

- people who have experience in industry/troubleshooting/etc, who adopt these tools, become modern day cobol programmers - they're charging $700 an hour

The real question to me is this: does the number of people taken out of the pool by being slow or risky due to these tools outpace the reduction in jobs caused by these tools?


> I'm firmly convinced that being able to troubleshoot code, even code generated by LLMs, and to write guidelines and tests to make sure it's functioning, is a skill of a shrinking pool

Well, today only scientists can make stone tools.


I’m not sure what point you’re trying to make, but I’ve had so many junior-level interviewees and interactions where they are unable to do anything without an LLM coaching them the whole way. This is dangerous!

It’s like if I was hiring a mathematician. I’d expect them to use a calculator or CAS package but I’d also expect them to be able to do everything by hand. I wouldn’t ever waste their time by making them do that, of course.


> I’m not sure the point you’re trying to make

I was trying to say that dropping old technologies isn't always bad.

> It’s like if I was hiring a mathematician.

Do you expect candidates to memorize every theorem to date? Usually people forget things they don't actively use, but they are able to refresh their knowledge if needed. I've learned quite a lot, but no, I don't remember even the key theorems from partial differential equations (I used them in my diploma thesis). I can refresh and relearn quickly, I'm sure.

Using an LLM without understanding disqualifies the candidate; even a monkey can do that. But if the candidate deeply understands the subject and uses the LLM like a handbook for minor details... that's different.


> Do you expect candidates to memorize every theorem to date?

Completely missing the point. I expect them to have enough knowledge to briefly study the theorems and understand how to apply them. I’m not trying to quiz people, I’m trying to get things done - and done well.

And for the stuff I’m doing, it’s required that any engineer understand what they’re building and why.

> Using an LLM without understanding disqualifies the candidate; even a monkey can do that. But if the candidate deeply understands the subject and uses the LLM like a handbook for minor details... that's different

The problem is that they don’t understand the subject and rely too heavily on LLMs, completely falling apart during in-person interviews. Surface knowledge of everything and no depth.

Using LLMs isn’t inherently bad, but I’ve seen severe side effects in students and junior engineers who over-rely on them.


Approximately $200/month apparently.


It probably would be just like with developers.

A great developer + an AI = productive.

A shitty developer + an AI = still shit.

AI does not make all developers the same. And it doesn't make all marketers the same.


I wish all LLM-written marketing copy had disclaimers so I knew never to waste my time reading it.


I think Claude Sonnet 4's writing is more human-like.


Can't you still use the European site to download the EOS Webcam Utility for free? https://www.canon-europe.com/cameras/eos-webcam-utility/


You could have solved this issue. Your coding fingers turned into complaining fingers. If you really had a problem, you should have opened a PR with a license instead of demanding that OP do it on your timeline.


I'd dispute that. No gi is not easier on the knees, ankles, or shoulders IMO. The slipperiness compared to gi comes with the downside of sudden slipping movements that put your knees and shoulders at higher risk of injury and dislocation. The increased focus on leg attacks also puts your knees and ankles at higher risk. Add to that the seeming slant towards more explosive movements in no gi, and the overall risk of injury should be higher than in gi. You likely see more injuries in gi because far more people still train gi than no gi.

Note: I train gi and no gi and have been for almost 10 years. My biggest injury happened in the gi (broken hand), but I've had significantly more ankle and knee sprains and shoulder dislocations in no gi. Also, the morning after no gi feels like I got hit by a truck compared to the morning after gi.


I work at Earthly. We build a pretty popular open source build tool. I've worked for several companies that build OSS before Earthly as well.

At Earthly, a few years ago, the founder and CEO had these same concerns about big cloud providers and switched to a source available license. There was backlash, and after around a year, we switched back to open source. We've discussed things like this a lot, and believe an open source license is best for our product, our users, and our business.

The way that we differ from HashiCorp, Redis, and others that have switched to source-available licenses is that the service we offer and generate revenue from isn't just a hosted version of our OSS. It's several services that natively integrate with our OSS but are not open source. This seems like one of the only ways a company that maintains popular OSS can survive without switching licenses: build great OSS that users love, build non-OSS services that integrate with and augment your OSS (and/or open up new use cases), and charge for those services.

If the service a company sells is just a hosted version of their OSS, even if it has a bunch of non-OSS bells and whistles added on, that company is at risk of a cloud provider eating their lunch unless they switch to a non-OSS license.

