Hacker News | past | comments | ask | show | jobs | submit | cvg's comments

Nice. Google's soundstream already has some great quality. Some examples at 6kbps here: https://google-research.github.io/seanet/soundstream/example...


This is cool. If you're wondering about top performers:

Longest Distance: https://www.foldnfly.com/32.html#The-Bird

Longest Time Aloft: https://www.foldnfly.com/43.html#Stealth-Glider


Had to look up the technique, the Eguchi method: it uses color rather than complicated musical notation to associate with each key. Interesting how those who have synesthesia naturally have this same color–key association.


But not every synesthetic person sees the same colors for every key, right?


I have a step in my fashion work where I pause and consider what the colors I’m selecting will look and feel like to a normal person. Converting back and forth between [optimistic-seaglass-springtime] and “teal” isn’t very accurate, but I’m certainly accustomed to it. I’m going to try this technique soon now that I know about it; as I’m already color sensitive to pitch, I suspect the value will be in training that sensitivity rather than memorizing their hues.


I like this framing by Mitchell Hashimoto on prompt engineering:

https://mitchellh.com/writing/prompt-engineering-vs-blind-pr...

Most of what we see on Twitter or YouTube is blind prompting. However, it is possible to apply an engineering mindset to prompting, and that is what we should call prompt engineering. Check out the article for a much more detailed framing.

DAIR.AI also has some nice info and resources (with academic papers) about prompt engineering.


Prompt testing, especially for Q/A pairs where there are multiple right answers, has been bugging me a lot.

The article is reasonable, but it also shows a big gap in tooling: once you write more interesting prompts, the techniques there feel closer to linting and typing than to testing. They don't check the interesting parts.


> The article seems reasonable but ... closer to linting than testing ... they don't check the interesting parts

Can you elaborate a bit more on what those interesting parts are?

It could just be a limitation of computation.


We are helping our users with QA tasks involving code generation, where the answers may be JSON, executable code, or markdown discussions involving the same. We are tuning for a bunch of tools following that pattern so our users don't have to.

It's easy to make a labeled training set for grading our homework (catching regressions, ...) in the case of classifiers, and that's basically what the blog post showed.
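To make that concrete, here is a minimal sketch of the labeled-set approach for classifiers: score a prompt against a small holdout set and fail on regressions. The `call_model` stub and the sample data are hypothetical stand-ins for a real LLM call and a real labeled set.

```python
# Sketch: grade a classifier-style prompt against a labeled holdout set.
# call_model is a hypothetical stand-in for an actual LLM API call.

def call_model(prompt: str, text: str) -> str:
    # Stub for illustration only; swap in a real completion call.
    return "positive" if ("love" in text or "great" in text) else "negative"

HOLDOUT = [
    ("I love this library", "positive"),
    ("great docs, great API", "positive"),
    ("crashes on every run", "negative"),
]

def accuracy(prompt: str) -> float:
    # Fraction of holdout examples the prompted model labels correctly.
    hits = sum(call_model(prompt, text) == label for text, label in HOLDOUT)
    return hits / len(HOLDOUT)

score = accuracy("Classify the sentiment as positive or negative:")
assert score >= 0.9, f"prompt regressed: accuracy={score:.2f}"
```

This only works when there is a single right label per input, which is exactly why it breaks down for open-ended QA.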

What about for the above QA tasks? We can ask GPT-4 whether a generated A was a good answer for a Q, but that's asking it to grade itself. Likewise, in the code case, we can write unit tests for the answers. (Trick: we use the former to more quickly do the latter.) But I feel like there have to be better ways.
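The unit-test grading idea can be sketched like this: execute a model's generated code in a scratch namespace and score it by how many hand-written test cases pass. The `generated` string and the `dedupe` task are hypothetical examples; in practice the code would come from a model completion and should run in a sandbox.

```python
# Sketch: grade a code-generation answer by running unit-style cases
# against it. "generated" stands in for a real model completion.

generated = """
def dedupe(items):
    seen, out = set(), []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out
"""

def grade(code: str, cases) -> float:
    ns = {}
    exec(code, ns)  # NOTE: run untrusted model output in a sandbox in practice
    fn = ns["dedupe"]
    passed = sum(fn(args) == want for args, want in cases)
    return passed / len(cases)

CASES = [
    ([1, 1, 2], [1, 2]),
    ([3, 2, 3, 1], [3, 2, 1]),
]
score = grade(generated, CASES)
assert score == 1.0, f"generated answer failed: score={score:.2f}"
```

Unlike the self-grading GPT-4 approach, this gives an objective signal, but only for answers that are executable in the first place.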

Another issue: OpenAI continually updates models based on usage, so we have to be sure our tests are real holdout sets that never get back to them...


I don't think LLMs are going to be able to solve that. There are a number of things that are assumed to be true but may not necessarily be true, which can potentially lead to multiple possible answers (outputs) given the same inputs.

Take determinism in code, for example: it's required for computation, and it's a system property, but generalizing a test for it is really hard. Knowing whether it holds lets you make inferences about whether a system maintains its other properties, but most of this is abstracted away at lower levels. Since the full context can never be shared with an LLM for evaluation, and it can't automatically switch contexts when evaluation fails, this most likely will never be solvable by computers whenever a single input can produce two different outputs, at least from what I know about automata theory and computability.

It's generally considered a class of problems that can't be solved by Turing machines.

https://en.wikipedia.org/wiki/Theory_of_computation

https://medium.com/@tarcisioma/limits-of-computation-231bf28... (overview)

https://en.wikipedia.org/wiki/Undecidable_problem (crux of the problem)
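You can't decide determinism in general, but you can spot-check it: re-run the same input several times and flag any disagreement. This is a minimal sketch; `evaluate` is a hypothetical system under test, and a passing check is evidence, not proof.

```python
# Sketch: spot-check determinism by repeated evaluation on the same input.
# Passing N runs is evidence of determinism, never proof (undecidable in general).
import itertools

def evaluate(x: int) -> int:
    # Hypothetical deterministic system under test.
    return x * 2

def looks_deterministic(fn, x, runs: int = 5) -> bool:
    first = fn(x)
    return all(fn(x) == first for _ in range(runs - 1))

assert looks_deterministic(evaluate, 21)

# A stateful "system" gives a different output each call, so the check fails:
counter = itertools.count()
assert not looks_deterministic(lambda _: next(counter), 0)
```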


This flexibility is a big draw. I can experiment with in-memory, launch with cloud, and move to my own infra if I'm lucky enough to need that kind of scale.


Interesting read. I'm curious if anyone is familiar with the software they use to process the genetic sequencing data. I'm amazed that they can find migration patterns from this data.


A JavaScript app that automated the resolution of half of Twitter's support tickets. The logic got refactored after a few years, but it's still used at Twitter. It probably saved Twitter about $10 million a year over the last ten years.


This probiotic at Costco also has reuteri, "trunature Advanced Digestive Probiotic". $15 per 100 caps: https://www.costco.com/trunature-advanced-digestive-probioti...


Costco also has "Garden of Life" probiotics with it, too.


Likely because it's considered "Googley" within the company.


This is interesting. It looks like the app might just be parsing this schema from the webpage and adding a nice UI.

