
This is similar to an old HCI design technique called Wizard of Oz by the way, where a human operator pretends to be the app that doesn’t exist yet. It’s great for discovering new features.

https://en.m.wikipedia.org/wiki/Wizard_of_Oz_experiment


This (ragebait-y? AI?) post kind of mixes things up. Kubernetes itself is fine, I think, but almost everything around it, plus the whole consulting business, is the problem. You can build a single binary, use a single-layer OCI container, and run it with a single ConfigMap and a memory quota on 30 machines just fine.

Take a look at the early Borg papers to see what problem it solves. Helm is just insane, but you can use jsonnet, which is modelled after Google's internal config system.

Only use the minimal subset, and have an application that is actually built to work fine in that subset.
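For illustration, a minimal sketch of that subset (the names, the image, and the numbers here are all made up): a single-container Deployment with a memory quota, applied from a heredoc:

    kubectl apply -f - <<'EOF'
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 30                # spread across the cluster's machines
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
          - name: myapp
            image: registry.example.com/myapp:latest  # single-layer OCI image, one static binary
            resources:
              limits:
                memory: 256Mi     # the memory quota
    EOF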


Just set this on my MiniPC running Debian, which runs Jellyfin.

    sudo nano /etc/default/grub
Look for GRUB_CMDLINE_LINUX_DEFAULT and add: i915.mitigations=off

    GRUB_CMDLINE_LINUX_DEFAULT="quiet i915.mitigations=off"
Then:

    sudo update-grub
    sudo reboot
To verify:

    cat /proc/cmdline

> It's i915.mitigations

Since you're doing the research, you tell us. Is NEO_DISABLE_MITIGATIONS (the flag mentioned in TFA) related to i915.mitigations, and if so, how?

TFA mentions that Intel ships prebuilt driver packages with this NEO_... flag set, and that Canonical and Intel programmers talked at some length about the flag.


I had to ask Gemini CLI to remind myself ;) but you can add this into settings.json:

{ "excludeTools": ["run_shell_command", "write_file"] }

but if you ask Gemini CLI to do this it'll guide you!


Have tried everything under the sun. At the moment, broadly just two winners (and both have become my daily drivers for different use cases), plus one I'm still evaluating:

1. Claude Code with Opus 4

2. Cursor with Opus 4 or Gemini 2.5 Pro (Windsurf used to be an option but Anthropic has now cut them out)

3. (Coming up; still playing around) Claude Code’s GitHub Action


Yes, that's the LGTM (Loki, Grafana, Tempo, and Mimir) stack.

First, the main issue with this stack is maintenance: managing multiple storage clusters increases complexity and resource consumption. Consolidating resources can improve utilization.

Second, differences in APIs (such as query languages) and data models across these systems increase adoption costs for monitoring applications. Grafana manages these differences, but custom applications have to handle them on their own.


In no particular order:

* experiment with multiple models, preferably free, high-quality models like Gemini 2.5. Make sure you're using the right model, usually NOT one of the "mini" varieties, even if it's marketed for coding.

* experiment with different ways of delivering necessary context. I use repomix to compile a codebase to a text file and upload that file (see the sketch after this list). I've found more integrated tooling like Cursor, Aider, or Copilot to be less effective than dumping a text file into the prompt

* use multi-step workflows like the one described in [1] to allow the llm to ask you questions to better understand the task

* similarly use a back-and-forth one-question-at-a-time conversation to have the llm draft the prompt for you

* for this prompt, I would focus less on specifying 10 results and more on uploading all necessary modules (like with repomix) and then verifying all 10 were completed. Sometimes the act of over-specifying results can corrupt the answer.

[1]: https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/
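For example, the repomix step might look like this (a hypothetical invocation; flag names may differ between versions):

    # pack the whole repo into a single plain-text file to paste into the prompt
    npx repomix --style plain --output codebase.txt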

I'm a pretty vocal AI-hater, partly because I use it day to day and am more familiar with its shortfalls - and I hate the naive zealotry so many pro-AI people bring to AI discussions. BUTTT we can also be a bit more scientific in our assessments before discarding LLMs - or else we become just like those naive pro-AI-everything zealots.


It's a real UI - the code for that is here: https://www.val.town/x/geoffreylitt/stevensDemo/code/dashboa...

A single Cloudflare Durable Object (SQLite DB + serverless compute + cron triggers) would be enough to run this project. DOs have been added to CF's free tier recently - you could probably run a couple hundred (maybe thousands of) instances of Stevens without paying a cent, aside from the Claude costs ofc

> Won’t he eventually run out of context window?

The "memories" table has a date column which is used to record the data when the information is relevant. The prompt can then be fed just information for today and the next few days - which will always be tiny.

It's possible to save "memories" that are always included in the prompt, but even those will add up to not a lot of tokens over time.
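A minimal sketch of that shape in SQLite (the table and column names here are guesses for illustration, not the actual schema):

    CREATE TABLE memories (
      id INTEGER PRIMARY KEY,
      date TEXT,             -- day the memory is relevant; NULL = always include
      content TEXT NOT NULL
    );

    -- build the prompt from the always-on entries plus the next few days
    SELECT content FROM memories
    WHERE date IS NULL
       OR date BETWEEN date('now') AND date('now', '+3 days');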

> Won’t this be expensive when using hosted solutions?

You may be under-estimating how absurdly cheap hosted LLMs are these days. Most prompts against most models cost a fraction of a single cent, even for tens of thousands of tokens. Play around with my LLM pricing calculator for an illustration of that: https://tools.simonwillison.net/llm-prices

> If one were to build this locally, can Vector DB similarity search or a hybrid combined with fulltext search be used to achieve this?

Geoffrey's design is so simple it doesn't even need search - all it does is dump in context that's been stamped with a date, and there are so few tokens there's no need for FTS or vector search. If you wanted to build something more sophisticated you could absolutely use those. SQLite has surprisingly capable FTS built in and there are extensions like https://github.com/asg017/sqlite-vec for doing things with vectors.


I believe you can still use Gemini 2.5 Pro for free via https://aistudio.google.com and their gemini-2.5-pro-exp-03-25 model ID through their API.

The free tier is "used to improve our products", the paid tier is not.
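If you want to hit the API directly, something along these lines should work (assuming the v1beta generateContent endpoint and an API key in $GEMINI_API_KEY):

    curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:generateContent?key=$GEMINI_API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{"contents": [{"parts": [{"text": "Say hello"}]}]}'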


To be clear, the difference is something along these lines:

    $ bash -ec 'echo hello && ls -la /tmp/ | grep systemd && false && echo testing'
    hello
    drwx------.   3 root   root      60 Mar 29 18:33 systemd-private-bluetooth.service-yuSMVM
    drwx------.   3 root   root      60 Mar 29 18:33 systemd-private-upower.service-YhHHP2
versus

    $ bash -euxc 'echo hello; ls -la /tmp/ | grep systemd; false; echo testing'
    + echo hello
    hello
    + ls -la /tmp/
    + grep systemd
    drwx------.   3 root   root      60 Mar 29 18:33 systemd-private-bluetooth.service-yuSMVM
    drwx------.   3 root   root      60 Mar 29 18:33 systemd-private-upower.service-YhHHP2
    + false
Docker also supports the `SHELL` instruction now, which is even better, because you can set it once at the top of the Dockerfile without having to do the whole `set -eux` dance on every line.
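For illustration, a minimal Dockerfile sketch (`SHELL` is real Docker syntax; the `RUN` line is just an example):

    # apply strict-mode bash to every subsequent RUN instruction
    SHELL ["/bin/bash", "-euxo", "pipefail", "-c"]

    # the build now fails at `false` instead of silently running `echo testing`
    RUN echo hello; false; echo testing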

There's a fork of this with some great improvements on top of the original, and it's also actively maintained: https://github.com/lexiforest/curl-impersonate

There are also Python bindings for the fork, for anyone who uses Python: https://github.com/lexiforest/curl_cffi


Hello, very cool app. I have been making apps like this using my tool here: https://domsy.io

It's pretty cool how quickly and easily I can generate little static apps like this for ad hoc use cases. I have made a weight tracker, expense tracker, prototypes for work, cards for my wife, slides for work, etc.

For example, this slide show app: https://domsy.io/share/644305ab-d36b-40a9-80e7-f0b52abaa18b

I import it into domsy.io and give the AI a text dump of everything I need; it uses the JS in that HTML to convert it to slides that I can download as a PDF.


chat.qwenlm.ai has quickly become the preferred choice for all my LLM needs. As accurate as Deepseek v3, but without the server issues.

This makes it even better!


For anyone looking for a sleep supplement: before you go down the rabbit hole of theanine, magnesium, etc., try an OTC azelastine or fluticasone nasal spray for a month.

Turns out my chronic poor-quality, restless sleep was a dust mite allergy that I should have figured out and treated a decade ago. I would wake up with a stuffy nose and very dry mouth, but didn't have too many issues during the day. I was allergic to my bed.

Been using antihistamines and a dehumidifier for several months now and sleeping better than I have in years. Given how extremely common mite allergies are, there's got to be a lot of folks with undiagnosed issues here.


The last time I used a leetcode-style interview was in 2012, and it resulted in a bad hire (who just happened to have trained on the questions we used). I've hired something like 150 developers so far, and this is what I ended up with after a few years of trial and error:

1. Use recruiters and your network: Wading through the sheer volume of applications was nasty even before COVID; I don't even want to imagine what it's like now. A good recruiter or a recommendation can save a lot of time.

2. Do either no take-home test, or one that takes at most two hours. I do discuss the solution candidates came up with, so as long as they can demonstrate they know what they did there, I don't care too much how they did it. If I do this part, it's just to establish some baseline competency.

3. Put the candidate at ease - nervous people don't interview well, which is another problem with non-trivial tasks in technical interviews. I rarely do any live coding; if I do, it's pairing, and for management roles, to e.g. probe how they manage disagreement. Developers mostly shine when not under pressure, and I try to see that side of them.

4. Talk through past and current challenges, technical and otherwise. This is by far the most powerful part of the interview IMHO. Had a bad manager? Cool, what did you do about it? I'm not looking for them to have resolved whatever issue we talk about; I'm trying to understand who they are and how they'd fit into the team.

I've been using this process for almost a decade now, and currently don't think I need to change anything about it with respect to LLMs.

I kinda wish it were more merit-based, but I haven't found a way to do that well yet. Maybe it's me, or maybe it's just not feasible. The work I tend to be involved in seems way too multi-faceted for a single standard test to seriously predict how well a candidate will do on the job. My workaround is to rely on intuition for the most part.


There is an option in feed settings:

[x] Fetch original content

But most power comes from URL rewrite rules. Here is the one I use for problematic sites:

rewrite("^(.+)$"|"https://markdown.download/$1")


As a general PSA, YouTube channels have an RSS feed that alerts you when a favourite creator releases a new video.

The form is

https://www.youtube.com/feeds/videos.xml?channel_id=UC2wdo5v...

where channel_id is the channel's opaque ID, which is buried in the page source of the "nicely named" channel:

https://www.youtube.com/@CuttingEdgeEngineering

and can be found without source diving via (say) the "Find Feeds in Current Tab" function of FeedBro, an RSS browser extension:

https://nodetics.com/feedbro/
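If you'd rather script it, something like this usually works (assuming the channel page still embeds "channelId" in its inline JSON, which could change):

    # pull the channel_id out of the page source
    curl -s 'https://www.youtube.com/@CuttingEdgeEngineering' \
      | grep -o '"channelId":"[^"]*"' | head -n 1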


Pentax cameras are much better at the UI and do not have any of this shit. They are also bulletproof and nearly indestructible, favoured by war photographers, and tend to have excellent spec sheets (if a bit of a slow autofocus).

The company went bankrupt and was bought by Ricoh, which I sincerely hope will keep the brand alive. Capitalism really does seem to prefer the nickel-and-dime approach...


Or just use https://opennebula.io/, based on KVM; it still works with VMware if needed. It's a ~20-year-old project. It has legs.

There are tons of supported solutions in this space, such as OpenStack, KubeVirt, and oVirt (and many more).

gpt-4o-mini might not be the best point of reference for what good LLMs can do with code: https://aider.chat/docs/leaderboards/#aider-polyglot-benchma...

A teeny tiny model such as a 1.5B model is really dumb, and not good at interactively generating code in a conversational way, but models in the 3B-or-less range can do a good job of suggesting tab completions.

There are larger "open" models (in the 32B - 70B range) that you can run locally that should be much, much better than gpt-4o-mini at just about everything, including writing code. For a few examples, llama3.3-70b-instruct and qwen2.5-coder-32b-instruct are pretty good. If you're really pressed for RAM, qwen2.5-coder-7b-instruct or codegemma-7b-it might be okay for some simple things.

> medium specced macbook pro

"Medium specced" doesn't mean much. How much RAM do you have? Each "B" (billion) of parameters is going to require about 1GB of RAM, as a rule of thumb. (500MB for really heavily quantized models, 2GB for un-quantized models... but 8-bit quants use 1GB, and that's usually fine.)
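Rough arithmetic under that rule of thumb (approximate, and ignoring context/KV-cache overhead):

    70B @ 4-bit  ≈ 35 GB RAM
    32B @ 8-bit  ≈ 32 GB RAM
     7B @ 8-bit  ≈  7 GB RAM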


OTel seems complicated because different observability vendors make implementing observability super easy with their proprietary SDKs, agents, and APIs. This is what OTel wants to solve, and I think the people behind it are doing a great job. Also, kudos to Grafana for adopting OpenTelemetry as a first-class citizen of their ecosystem.

I've been pushing the use of Datadog for years, but their pricing is out of control for anything between a mid-size company and a large enterprise. So, as years passed and the OpenTelemetry APIs and SDKs stabilized, it became our standard for application observability.

To be honest, the documentation could be better overall, and the onboarding docs differ per programming language, which is not ideal.

My current team is on a Node.js/TypeScript stack, and we've created a set of packages and an example Grafana stack to get started with OpenTelemetry real quick. Maybe it's useful to anyone here: https://github.com/zonneplan/open-telemetry-js
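As a rough illustration of how little bootstrap code the OTel Node SDK needs (a minimal sketch using the public @opentelemetry packages; the exporter endpoint is an assumption, not our actual setup):

    // tracing.ts - minimal OpenTelemetry Node SDK bootstrap
    import { NodeSDK } from '@opentelemetry/sdk-node';
    import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
    import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

    const sdk = new NodeSDK({
      // export traces over OTLP/HTTP, e.g. to a local collector or a Grafana stack
      traceExporter: new OTLPTraceExporter({ url: 'http://localhost:4318/v1/traces' }),
      // auto-instrument common libraries (http, express, pg, ...)
      instrumentations: [getNodeAutoInstrumentations()],
    });

    sdk.start();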


I just donated 133,7€ and will gladly do it again if further legal costs arise. Please consider also making a generous donation and posting about it in this thread.

What Newag is doing here is absolutely vile. They want to charge 20.000€ per train to "reactivate" them after they have been serviced at third-party workshops. We must not let them win and set a precedent.

I highly encourage everyone to watch the previous presentation: https://media.ccc.de/v/37c3-12142-breaking_drm_in_polish_tra...


The author has a dev log I’d recommend if you’re curious about what makes it different + general goodies on Zig/terminal emulators.

https://mitchellh.com/ghostty


I have found the following community site for generating a Ghostty config quite helpful: https://ghostty.zerebos.com/

I've seen these called "explorables" or "explorable explanations" before and I really like them. I've been collecting notes on them here: https://simonwillison.net/tags/explorables/

Here's the website that coined the term: https://explorabl.es/


Yep.

A corollary of this is the following paper:

https://web.mit.edu/nelsonr/www/Repenning=Sterman_CMR_su01_....

It basically states that people fail to see the value in reinvesting time and resources for improvement. Being idle is not a failure, but a way to think and be ready if a period of higher intensity comes. And it is healthy to sometimes have more time for a menial task.

People get so crazy about the idea of optimization, but fail to account for the severe issues that arise when time is always occupied by something, which seems to happen more and more these days...

