Hacker News | k8si's comments

I'm not sure people outside of Greater Boston would care, but those of us who do live there probably find it exceedingly strange that this occurred in Brookline of all places.


This was headline news in Poland

Well, currently we have a ton of Congresspeople who are primarily motivated by their "good financial sense" (for obvious reasons e.g. this study). So, I think we could do with a few more Congresspeople with less financial sense and more genuine motivation to improve the lives of their constituents.


"Only rich kids should get to choose what they study in school, poor kids are too dumb to make their own choices"


The argument is rather that humanities degrees are a luxury item. Neither kids just starting out their adult lives nor society should be burdened with propping up departments whose value doesn't match their price tag.


Maybe this is a nitpick, but CoNLL NER is not a "challenging task". Even pre-LLM systems were getting >90 F1 on it as far back as 2016.

Also, in case people want to dig into the literature on this topic: they call their method "programmatic data curation", but I believe this approach is also known as model distillation and/or student-teacher training.


Thanks for the feedback!

We chose a set of tasks with different levels of complexity to see how this approach would scale. For LLMs, the "challenge" with NER is not the task itself but the arbitrariness of the labels in the dataset. I agree it's still much simpler than the other tasks we present (agentic RAG, agentic tool use, maze navigation).

There are definitely strong parallels to model distillation and student-teacher training, with the primary difference being that we don't simply take all the data from the larger model but rather filter the dataset based on metrics from the environment. In the "Does curation even matter?" section, we show that this generally improves the result by a good margin.
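To make the distinction concrete, here's a minimal sketch of what metric-based curation looks like versus plain distillation. All names (`curate`, `teacher`, `env`) are hypothetical stand-ins, not the actual API from the post; the point is just that environment scores gate which teacher outputs reach the student:

```python
def curate(prompts, teacher, env, threshold=0.8):
    """Keep only teacher rollouts that score well in the environment.

    Plain distillation would keep every (prompt, trajectory) pair;
    curation filters on an environment-derived metric first.
    """
    kept = []
    for prompt in prompts:
        trajectory = teacher(prompt)       # teacher model's output/rollout
        score = env(prompt, trajectory)    # e.g. task success, reward, exact match
        if score >= threshold:
            kept.append({"prompt": prompt, "completion": trajectory})
    return kept  # fine-tuning set for the student model
```

With `threshold=0.0` this degenerates to ordinary student-teacher distillation, which is one way to see the two methods as points on the same spectrum.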

We link to Vicuna, which might be the closest reference as prior art: https://lmsys.org/blog/2023-03-30-vicuna/

Thanks!


I believe many high-quality embedding models are still based on BERT, even recent ones, so I don't think it's entirely fair to characterize it as "deprecated".


Please put concrete examples right at the top of the page you're publicizing!


The specification for the Podlite markup language is written using Podlite markup itself.

https://github.com/podlite/podlite-specs/blob/main/Specifica...

An online playground is also available here: https://pod6.in/

Thanks for your interest in Podlite! Best, Alex


That playground is cool. I wonder if there are any 2-way playgrounds where the right side is also editable using a Word / Google Docs style interface (and the changes are reflected in the code-style interface on the left). I've always wanted something like that for teaching non-technical people the basics of Markdown. Bonus points if it's collaborative.


This is a wonderful and highly sought-after idea. However, it requires significant resources, so we will come back to it and implement it later. Thank you!


Is it actually more feasible now? Do LLMs actually make this problem easier to solve?

Because I have a hard time believing they can actually extract time increments and higher-level tasks from log data without a ton of pre/post-processing. But then the problem is just as much work as it was 5 years ago when you might have been using plain old BERT.


In our experience they do! The smarter the LLM, the less pre/post-processing you can get away with and still get good output. So no, you can't just throw raw log data at it, but it doesn't require nearly as much work as it did 2-3 years ago.
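As a rough illustration of how thin the post-processing layer can be: prompt the LLM to emit structured JSON time entries, then validate rather than trust the response. The field names (`task`, `minutes`) and the function here are illustrative assumptions, not any particular product's schema:

```python
import json

def parse_entries(llm_output):
    """Validate LLM-produced JSON time entries instead of trusting them blindly.

    Drops any entry that is missing required fields or has a
    non-numeric duration; a real pipeline would also log rejects.
    """
    entries = json.loads(llm_output)
    valid = []
    for e in entries:
        if {"task", "minutes"} <= e.keys() and isinstance(e["minutes"], (int, float)):
            valid.append(e)
    return valid
```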


Why? Do they not know where the data in their own system, that they built, is being sent?


I suggest going through the exercise of seeing whether this is true quantitatively. Get a business-relevant NER dataset together (not CoNLL, preferably something that your boss or customers would care about), run it against Mistral/etc, look at the P/R/F1 scores, and ask "does this solve the problem that I want to solve with NER". If the answer is 'yes', and you could do all those things without reading the book or other NLP educational sources, then yeah you're right, job's finished.
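For anyone who wants to run that exercise, the scoring step is only a few lines. This is a sketch of exact-match, entity-level P/R/F1 (the same convention the CoNLL eval script uses); the span representation `(start, end, label)` is my own assumption, and in practice you'd likely reach for an existing library like seqeval instead:

```python
def prf1(gold_docs, pred_docs):
    """Entity-level precision/recall/F1 over parallel lists of documents.

    Each document is a list of (start, end, label) spans; a prediction
    counts as correct only if it matches a gold span exactly.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold_set, pred_set = set(gold), set(pred)
        tp += len(gold_set & pred_set)   # exact-match hits
        fp += len(pred_set - gold_set)   # spurious predictions
        fn += len(gold_set - pred_set)   # missed gold entities
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

If those numbers on your own data clear the bar for the problem you actually care about, then sure, the off-the-shelf LLM is enough.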


Why do people pretend that alignment of AI is the important problem to solve, rather than alignment of the companies that run AI products with the wellbeing of humanity?


Corporate propaganda?


Cultural propaganda/cognitive training (don't venture outside the Overton Window, and that goes for all dimensions).

