This is similar to an old HCI design technique called Wizard of Oz by the way, where a human operator pretends to be the app that doesn’t exist yet. It’s great for discovering new features.
This (ragebait-y/AI?) post kind of mixes things up. Kubernetes itself is fine, I think, but almost everything around it and the whole consulting business is the problem. You can build a single binary, put it in a single-layer OCI container, and run it with a single ConfigMap and a memory quota on 30 machines just fine.
Take a look at the early Borg papers and what problem Borg actually solves. Helm is just insane, but you can use Jsonnet, which is modelled after Google's internal configuration system.
Only use the minimal subset, and have an application that is actually built to work well within that subset.
Since you're doing the research, you tell us. Is NEO_DISABLE_MITIGATIONS (the flag mentioned in TFA) related to i915.mitigations, and if so, how?
TFA mentions that Intel ships prebuilt driver packages with this NEO_... flag set, and that Canonical and Intel programmers talked at some length about the flag.
Yes, that's the LGTM (Loki, Grafana, Tempo, and Mimir) stack.
First, the main issue with this stack is maintenance: managing multiple storage clusters increases complexity and resource consumption. Consolidating resources can improve utilization.
Second, differences in APIs (such as query languages) and data models across these systems increase adoption costs for monitoring applications. Grafana papers over these differences, but custom applications have to handle them themselves.
* experiment with multiple models, preferably free, high-quality models like Gemini 2.5. Make sure you're using the right model - usually NOT one of the "mini" varieties, even if it's marketed for coding.
* experiment with different ways of delivering the necessary context. I use repomix to compile a codebase into a text file and upload that file (see the sketch after this list). I've found more integrated tooling like Cursor, Aider, or Copilot to be less effective than dumping a text file into the prompt.
* use multi-step workflows like the one described in [1] to allow the LLM to ask you questions to better understand the task
* similarly, use a back-and-forth, one-question-at-a-time conversation to have the LLM draft the prompt for you
* for this prompt, I would focus less on specifying 10 results and more on uploading all the necessary modules (e.g. with repomix) and then verifying that all 10 were completed. Sometimes the act of over-specifying results can corrupt the answer.
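As an illustration of the "dump a text file into the prompt" workflow, here is a rough sketch (the repomix flags, output file name, model name, and OpenAI-compatible endpoint are assumptions for the example, not a recommendation):

```ts
import { execSync } from "node:child_process";
import { readFileSync } from "node:fs";

// Pack the repo into a single text file. repomix supports --style and -o;
// the plain-text output name here is just for this sketch.
execSync("npx repomix --style plain -o repo.txt", { stdio: "inherit" });
const codebase = readFileSync("repo.txt", "utf8");

// Dump the whole codebase into the prompt, followed by the actual task.
const task = "Refactor the date handling in src/utils into a single module.";
const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: "gpt-4o", // swap in whichever non-"mini" model you are testing
    messages: [{ role: "user", content: `${codebase}\n\n---\n\n${task}` }],
  }),
});
const data = (await response.json()) as any;
console.log(data.choices[0].message.content);
```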
I'm a pretty vocal AI-hater, partly because I use it day to day and am very familiar with its shortcomings - and I hate the naive zealotry so many pro-AI people bring to AI discussions. BUTTT we can also be a bit more scientific in our assessments before discarding LLMs - or else we become just like those naive pro-AI-everything zealots.
A single Cloudflare Durable Object (SQLite DB + serverless compute + cron triggers) would be enough to run this project. DOs were added to Cloudflare's free tier recently - you could probably run a couple hundred (maybe thousands of) instances of Stevens without paying a cent, aside from the Claude costs of course.
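A rough sketch of what that could look like with the SQLite-backed Durable Objects API (the class name, table schema, and daily alarm are made-up placeholders for illustration, not how Stevens is actually built):

```ts
import { DurableObject } from "cloudflare:workers";

// One Durable Object instance = one assistant: SQLite storage plus an alarm
// that stands in for a daily cron trigger.
export class Assistant extends DurableObject {
  constructor(ctx: DurableObjectState, env: unknown) {
    super(ctx, env);
    ctx.storage.sql.exec(
      "CREATE TABLE IF NOT EXISTS memories (id INTEGER PRIMARY KEY, date TEXT, text TEXT)"
    );
  }

  async remember(date: string | null, text: string) {
    this.ctx.storage.sql.exec(
      "INSERT INTO memories (date, text) VALUES (?, ?)", date, text
    );
    // (Re)schedule the next daily briefing.
    await this.ctx.storage.setAlarm(Date.now() + 24 * 60 * 60 * 1000);
  }

  async alarm() {
    // Build the prompt from stored memories and call the LLM here.
  }
}
```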
The "memories" table has a date column which is used to record the data when the information is relevant. The prompt can then be fed just information for today and the next few days - which will always be tiny.
It's possible to save "memories" that are always included in the prompt, but even those will add up to not a lot of tokens over time.
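To make the "just feed it the next few days" idea concrete, here is a hedged sketch of the kind of query involved (better-sqlite3 and these table/column names are my assumptions for illustration, not the actual implementation):

```ts
import Database from "better-sqlite3";

const db = new Database("stevens.db");

// Pull dateless "always include" memories plus anything dated within the next
// week; the result is small enough to paste straight into the prompt.
const rows = db
  .prepare(
    `SELECT date, text FROM memories
     WHERE date IS NULL
        OR date BETWEEN date('now') AND date('now', '+7 days')
     ORDER BY date`
  )
  .all() as { date: string | null; text: string }[];

const context = rows
  .map((r) => (r.date ? `${r.date}: ${r.text}` : r.text))
  .join("\n");
```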
> Won’t this be expensive when using hosted solutions?
You may be underestimating how absurdly cheap hosted LLMs are these days. Most prompts against most models cost a fraction of a single cent, even for tens of thousands of tokens. Play around with my LLM pricing calculator for an illustration of that: https://tools.simonwillison.net/llm-prices
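The underlying arithmetic is tiny - cost is just tokens divided by a million, times the per-million rate. A toy example with assumed, purely illustrative prices (check the calculator above for real numbers):

```ts
// Cost per prompt = tokens / 1M * price-per-million, summed for input and output.
// The rates below are placeholders, not current prices for any particular model.
const inputPricePerM = 0.15;  // USD per million input tokens (assumed)
const outputPricePerM = 0.60; // USD per million output tokens (assumed)

const cost = (20_000 / 1e6) * inputPricePerM + (1_000 / 1e6) * outputPricePerM;
console.log(cost.toFixed(4)); // ~ $0.0036 - a fraction of a cent for 20k input tokens
```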
> If one were to build this locally, can Vector DB similarity search or a hybrid combined with fulltext search be used to achieve this?
Geoffrey's design is so simple it doesn't even need search - all it does is dump in context that's been stamped with a date, and there are so few tokens there's no need for FTS or vector search. If you wanted to build something more sophisticated you could absolutely use those. SQLite has surprisingly capable FTS built in and there are extensions like https://github.com/asg017/sqlite-vec for doing things with vectors.
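For example, a minimal FTS5 sketch from Node (better-sqlite3 is my assumption here; sqlite-vec would be loaded as an extension on top of something like this):

```ts
import Database from "better-sqlite3";

const db = new Database("memories.db");

// FTS5 ships with most SQLite builds; no extension needed for full-text search.
db.exec("CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(text, date)");
db.prepare("INSERT INTO memories_fts (text, date) VALUES (?, ?)").run(
  "Dentist appointment at 3pm",
  "2025-04-12"
);

// MATCH gives ranked full-text results.
const hits = db
  .prepare("SELECT text, date FROM memories_fts WHERE memories_fts MATCH ? ORDER BY rank")
  .all("dentist");
console.log(hits);
```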
Docker also supports the `SHELL` instruction now, which is even better: you can set it once near the top of the Dockerfile (e.g. `SHELL ["/bin/bash", "-euxo", "pipefail", "-c"]`) instead of repeating the whole `set -eux` dance on every `RUN` line.
Hello, very cool app. I have been making apps like this using my tool here:
https://domsy.io
It's pretty cool how quickly and easily I can generate little static apps like this for ad hoc use cases. I have made a weight tracker, expense tracker, prototypes for work, cards for my wife, slides for work, etc.
For anyone looking for a sleep supplement: before you go down the rabbit hole of theanine, magnesium, etc., try an OTC azelastine or fluticasone nasal spray for a month.
Turns out my chronic poor-quality, restless sleep was a dust mite allergy that I should have figured out and treated a decade ago. I would wake up with a stuffy nose and a very dry mouth but didn't have too many issues during the day. I was allergic to my bed.
Been using antihistamines and a dehumidifier for several months now, and I'm sleeping better than I have in years. Given how extremely common mite allergies are, there have got to be a lot of folks with undiagnosed issues here.
The last time I used a leetcode-style interview was in 2012, and it resulted in a bad hire (who just happened to have trained on the questions we used). I've hired something like 150 developers so far; here's what I ended up with after a few years of trial and error:
1. Use recruiters and your network: wading through the sheer volume of applications was nasty even before COVID; I don't even want to imagine what it's like now. A good recruiter or a recommendation can save a lot of time.
2. Do either no take-home test, or one that takes at most two hours. I discuss the solution candidates came up with, so as long as they can demonstrate they know what they did there, I don't care too much how they did it. If I do this part, it's just to establish some baseline competency.
3. Put the candidate at ease - nervous people don't interview well, which is another problem with non-trivial tasks in technical interviews. I rarely do any live coding; if I do, it's pairing, and for management roles, e.g. to probe how they handle disagreement and such. Developers mostly shine when not under pressure, and I try to see that side of them.
4. Talk through past and current challenges, technical and otherwise. This is by far the most powerful part of the interview IMHO. Had a bad manager? Cool, what did you do about it? I'm not looking for them to have resolved whatever issue we talk about; I'm trying to understand who they are and how they'd fit into the team.
I've been using this process for almost a decade now, and currently don't think I need to change anything about it with respect to LLMs.
I kinda wish it were more merit-based, but I haven't found a way to do that well yet. Maybe it's me, or maybe it's just not feasible. The work I tend to be involved in seems way too multifaceted for a single standardized test to seriously predict how well a candidate will do on the job. My workaround is to rely on intuition for the most part.
Pentax cameras are much better at UI and do not have any of this shit. They are also bulletproof and nearly indestructible, favoured by war photographers, and tend to have excellent spec sheets (if a bit slow on autofocus).
The company went bankrupt and was bought by Ricoh, which I sincerely hope will keep the brand alive. Capitalism really does seem to prefer the nickel-and-dime approach...
A teeny tiny model, such as a 1.5B one, is really dumb and not good at interactively generating code in a conversational way, but models in the 3B-and-under range can do a good job of suggesting tab completions.
There are larger "open" models (in the 32B - 70B range) that you can run locally that should be much, much better than gpt-4o-mini at just about everything, including writing code. For a few examples, llama3.3-70b-instruct and qwen2.5-coder-32b-instruct are pretty good. If you're really pressed for RAM, qwen2.5-coder-7b-instruct or codegemma-7b-it might be okay for some simple things.
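If you want to try one of those locally, here is a minimal sketch against Ollama's HTTP chat API (assuming Ollama is installed and running on its default port, and that the model tag shown has been pulled; the tag is just an example):

```ts
// Assumes `ollama pull qwen2.5-coder:32b` has been run beforehand.
const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  body: JSON.stringify({
    model: "qwen2.5-coder:32b",
    messages: [{ role: "user", content: "Write a binary search in TypeScript." }],
    stream: false, // return one complete response instead of a token stream
  }),
});
const data = (await res.json()) as any;
console.log(data.message.content);
```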
> medium specced macbook pro
"Medium specced" doesn't mean much. How much RAM do you have? Each "B" (billion) of parameters is going to require about 1GB of RAM, as a rule of thumb. (500MB for really heavily quantized models, 2GB for un-quantized models... but 8-bit quants use 1GB, and that's usually fine.)
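That rule of thumb is just parameters times bytes per parameter; a rough sketch (ignoring context window and KV-cache overhead):

```ts
// Rough memory estimate: billions of parameters * bytes per parameter,
// where bytes per parameter is roughly quantization bits / 8.
function estimateRamGB(paramsBillions: number, quantBits: number): number {
  return paramsBillions * (quantBits / 8);
}

console.log(estimateRamGB(7, 8));   // ~7 GB   - 7B model at 8-bit
console.log(estimateRamGB(32, 4));  // ~16 GB  - 32B model heavily quantized
console.log(estimateRamGB(70, 16)); // ~140 GB - 70B model un-quantized
```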
OTel seems complicated because different observability vendors make implementing observability super easy with their proprietary SDKs, agents, and APIs. That's exactly what OTel wants to solve, and I think the people behind it are doing a great job.
Also, kudos to Grafana for adopting OpenTelemetry as a first-class citizen of their ecosystem.
I’ve been pushing the use of Datadog for years, but their pricing is out of control for anyone between a mid-size company and a large enterprise. So as the years passed and the OpenTelemetry APIs and SDKs stabilized, it became our standard for application observability.
To be honest, the documentation could be better overall, and the onboarding docs differ per programming language, which is not ideal.
My current team is on a NodeJS/TypeScript stack, and we’ve created a set of packages and an example Grafana stack to get started with OpenTelemetry quickly.
Maybe it’s useful to anyone here: https://github.com/zonneplan/open-telemetry-js
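Not from that repo, but to give a sense of how small the Node bootstrap can be, here is a minimal sketch using the official OpenTelemetry SDK packages (the service name and OTLP endpoint are placeholders):

```ts
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

// Send traces to any OTLP endpoint (a local collector, Grafana Tempo, etc.).
const sdk = new NodeSDK({
  serviceName: "example-service",
  traceExporter: new OTLPTraceExporter({ url: "http://localhost:4318/v1/traces" }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```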
I just donated 133,7€ and will gladly do it again if further legal costs arise. Please consider making a generous donation as well and posting about it in this thread.
What Newag is doing here is absolutely vile. They want to charge 20.000€ per train to “reactivate” them after they have been serviced at third-party workshops. We must not let them win and set a precedent.
I've seen these called "explorables" or "explorable explanations" before and I really like them. I've been collecting notes on them here: https://simonwillison.net/tags/explorables/
Basically, it's stating the fact that people fail to see the value in reinvesting time and resources for improvement. Being idle is not a failure but a way to think and to be ready if a period of higher intensity comes. And it is healthy to sometimes have more time for a menial task.
People get so crazy about the idea of optimization, but fail to account for severe issues that arise when time is always occupied by something, which seems to happen more and more these days...
https://en.m.wikipedia.org/wiki/Wizard_of_Oz_experiment