Hacker News | srameshc's comments

As someone who has a very limited understanding but has tried to use BERT for classification: is BERT still relevant compared to LLMs? Asking because I hardly see any mention of BERT anymore.

Yes, they are still used

- Encoder-based models have much faster inference (they aren't auto-regressive) and are smaller. They are great for applications where speed and efficiency are key.

- Most embedding models are BERT-based (see the MTEB leaderboard), so they're widely used for retrieval.

- They are also used to filter data for pre-training decoder models. The Llama 3 authors used a quality classifier (DistilRoBERTa) to generate quality scores for documents. Something similar is done for FineWeb-Edu.


Wait, I thought GPTs were autoregressive and encoder-only models like BERT used masked tokens? You're saying BERT is auto-regressive, or am I misunderstanding?

You're right. Encoder only models like BERT aren't auto-regressive and are trained with the MLM objective. Decoder only (GPT) and encoder-decoder (T5) models are auto-regressive and are trained with the CLM and sometimes the PrefixLM objectives.
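The distinction above can be sketched as attention masks: an encoder position attends to the whole sequence, while a decoder position only attends to the past. A toy NumPy illustration (not any particular model's code):

```python
import numpy as np

def bidirectional_mask(n: int) -> np.ndarray:
    # Encoder-style (BERT, MLM objective): every position may attend
    # to every other position, past and future.
    return np.ones((n, n), dtype=bool)

def causal_mask(n: int) -> np.ndarray:
    # Decoder-style (GPT, CLM objective): position i may attend only
    # to positions <= i, which is what makes generation auto-regressive.
    return np.tril(np.ones((n, n), dtype=bool))

print(causal_mask(3).astype(int))
# [[1 0 0]
#  [1 1 0]
#  [1 1 1]]
```

PrefixLM sits in between: bidirectional attention over the prompt prefix, causal attention over the generated suffix.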

You can mask out the tokens at the end, so it's technically autoregressive.

They're still very useful on their own. But even more broadly, you can often use them in tandem with LLMs. A good example could be a classifier that's used as a "router" of sorts; could be for selecting a prompt template, directing to a specific model, or loading a LoRA or soft prompt vector to be used at inference-time.
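A rough sketch of that router idea, with TF-IDF + logistic regression standing in for a fine-tuned BERT classifier (the intents, templates, and training strings are all made up for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data; in practice this would be a labeled intent dataset
# and the model would be a small fine-tuned encoder, not TF-IDF.
texts = ["summarize this article", "translate to french",
         "condense the report", "say this in french"]
labels = ["summarize", "translate", "summarize", "translate"]

router = make_pipeline(TfidfVectorizer(), LogisticRegression())
router.fit(texts, labels)

# Hypothetical prompt templates, one per predicted intent.
TEMPLATES = {"summarize": "Summarize:\n{input}",
             "translate": "Translate to French:\n{input}"}

def route(user_input: str) -> str:
    """Classify the request, then fill the matching prompt template."""
    intent = router.predict([user_input])[0]
    return TEMPLATES[intent].format(input=user_input)
```

The same shape works for picking a target model or a LoRA adapter instead of a template: the classifier's label just becomes a lookup key.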

For many specialized tasks you can run BERTs (and simpler models in general) at scale, with lower latency, at lesser cost, with similar or even better results.

Depends what you’re trying to do. I’m writing a personal assistant app (speech to text) and want to classify the user input according to the current actions I support (or don’t). The flagship LLMs are pretty great at it if you include the classes in the prompt and they will spit out structured output every time. But, man, they are expensive and there’s the privacy aspect I’d prefer to adhere to. I’ve only got 24 GB of RAM, so I can’t run too many fancy local models and things like llama3.1:8b don’t classify very well.

So I’m trying BERT models out :)


Try some of the Qwen models. They have some that are slightly larger than 8B that will fit in your 24 GB quite nicely. They have been amazing so far.

My understanding is that BERT can still outperform LLMs for sentiment classification?

To my understanding yes. But I never found a good use-case for sentiment classification.

It seems to be used by YouTube for comment censoring / shadow-banning.

That might make sense.

I used sentiment analysis a few times in recommender systems (for digital media consumption.)
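One way that combination can look: use a sentiment score to reweight candidate items before ranking. A toy lexicon-based scorer stands in for a real sentiment model here; the weighting formula is just illustrative:

```python
# Toy word lists standing in for a trained sentiment classifier.
POS = {"great", "love", "excellent"}
NEG = {"bad", "boring", "hate"}

def sentiment(text: str) -> float:
    """Crude score in [-1, 1]: (positive hits - negative hits) / words."""
    words = text.lower().split()
    return sum((w in POS) - (w in NEG) for w in words) / max(len(words), 1)

def rerank(items):
    """items: list of (item_id, base_score, review_text).
    Boost items with positive review sentiment, demote negative ones."""
    return sorted(items,
                  key=lambda it: it[1] * (1 + sentiment(it[2])),
                  reverse=True)
```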

Also for analyzing Trump's tweets (from 2016): https://mathematicaforprediction.wordpress.com/2016/11/21/te...


They’ve drowned in the LLM noise, but they’re definitely still relevant.

- Generative outputs are not always needed, and are often actively undesirable

- BERT models are smaller and can run with lower latency and serve larger batches with lower vram requirements

- BERT models have bidirectional attention, which can improve performance in many applications

LLMs are “cheap” in the sense that they work well generically, without requiring fine tuning. Where they overlap with BERT models is mostly that they may work better in low training data environments due to better generalization capabilities.

But mostly companies like them because they don’t “require” ML engineers or data scientists on staff. For the lack of care given to evaluation that I see around LLM apps, I suspect that’s going to prove to be a faulty premise.


> - BERT models are smaller and can run with lower latency and serve larger batches with lower vram requirements

The most recent version of Wolfram Language (aka Mathematica) uses BERT models by default for embeddings.

(Say, for this function: https://reference.wolfram.com/language/ref/CreateSemanticSea... .)


Sincere question: how is this different from something like k6? > https://github.com/grafana/k6

As far as I understand, CodSpeed is a tool for continuous benchmarking, while k6 is specifically an HTTP load-testing tool.

Growing up we called it a Carrom board, which is a square board with 4 pockets in the corners. I never knew there was an American version of it in the Crokinole board.


Canadian. I haven't played Carrom, but it's my understanding it's Indian in origin and plays a bit more like a billiards variant, even going so far as to use tiny pool cues.


I think there are few different variants of this game. I played the version with the tiny pool cues as a kid, but we called it Couronne. Looking at images online it seems the main difference between Carrom and Couronne is that in Couronne you hit the pieces with a cue and that the pockets are much bigger than in Carrom.


I haven’t played in a while, but as far as I remember there aren’t any pool cues, though there are varying (house) rules on how/where the disc can be flicked


Different game, but Carrom is supposed to be great as well. I haven't played it but many of the folks I follow on BGG prefer it to Crokinole.


I played both and own a Crokinole board, but strongly prefer Carrom. It's similar but still quite different.


Generally, crokinole is a much less punishing game than carrom, if we're talking about Indian carrom boards. American carrom boards, which were really popular in after-school programs when I was growing up, have HUGE pockets relative to the Indian boards, in addition to being smaller. American carrom is like playing 8-ball, Indian carrom is like playing snooker.

I like carrom a lot, but I'm terrible at it. I'm at least a reasonable player at crokinole, and it's a lot easier to introduce others to the game without them getting too frustrated by it.


Also, a lot of American carrom boards were produced with a checkerboard on one side and a crokinole board on the other.


Crokinole is Canadian


Good to see the author's mention of routing. I had been mentally stuck on mux for a long time and didn't pay attention to the new release features. Happy that I always find things like these on HN.


Nice new feature, would actually make me want to use Go without Gin.


I am over Gin and have been for years yet everyone keeps using it because it has inertia. The docs are garbage.

Big fan of Echo and it has much better docs.

https://echo.labstack.com/


Thanks for the suggestion, will give it a try. I'm more familiar with Python than Go. I know my way around the Python ecosystem and can make informed decisions about which tool to use. Not so much with Go, so I appreciate your advice.


I had to move from Gin to echo for my personal site, the routing in Gin was refusing to serve static resources at the root path without some headache.


I've grown to prefer go-chi over Gin (or Echo), since it's just the standard library with some QoL features on top.


Chi is amazing. I love the philosophy of extending the stdlib instead of writing an alternative. I try to keep that in mind when writing my own libs or helpers now, and I'm very satisfied with the results.

For example I made a lib to write commands (like cobra or urfave/cli), but based entirely on the `flag` package: https://github.com/Thiht/go-command


> For example I made a lib to write commands (like cobra or urfave/cli), but based entirely on the `flag` package: https://github.com/Thiht/go-command

Looks nice! I'd like an easier way of setting both long and short flags for a command, i.e. --verbose and -v should do the same. Using `flag` I have to declare everything twice to achieve this.
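For what it's worth, the usual stdlib workaround is exactly that double registration: bind both names to the same variable. A minimal sketch (the `newFlags` helper is hypothetical):

```go
package main

import (
	"flag"
	"fmt"
)

// newFlags registers --verbose and -v against the same bool,
// since the flag package has no built-in short/long aliasing.
func newFlags(verbose *bool) *flag.FlagSet {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.BoolVar(verbose, "verbose", false, "enable verbose output")
	fs.BoolVar(verbose, "v", false, "enable verbose output (shorthand)")
	return fs
}

func main() {
	var verbose bool
	fs := newFlags(&verbose)
	fs.Parse([]string{"-v"})
	fmt.Println(verbose) // prints true
}
```

It works, but both spellings show up separately in the usage text, which is part of why people reach for cobra or urfave/cli.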


Nice CLI lib! I'm still looking for an Argh or Typer equivalent though.


I like it, but with the new http.ServeMux rolled out in Go 1.22, is there any use for Chi anymore?


Good question. The middleware stack it provides is nice.


This is a really great resource for someone getting started to understand what's going on. I see many people run into issues and try out different git commands without understanding the outcome.


It's a really great resource to come back to for those of us who have been using git for over a decade, too!


https://onlywei.github.io/explain-git-with-d3/ is another one that I've used to demonstrate the repository state after various operations.


I always send this to new hires who don't know git. Pretty common in mechanical engineering


We are working on something content-driven (for an ad or subscription model) with a lot of effort and time, and I am concerned about how this technology will affect all that effort and eventually the monetization ideas. But I can see how helpful this tool can be for learning new stuff.


We always talk about how scarce freshwater is, but this image representation has made it hard not to wonder how much supply we have for an ever-growing human population, its growing demand for water, and how long it will last.


Does water leave earth when used?


It's more extreme than that, it stops existing.

The comment you're replying to is about fresh water. Which becomes non-fresh when it mixes into seawater or waste or pollution. No need to leave the Earth.

Admittedly, it's probably better to talk about the cycle, since non-freshwater will be automatically converted back to freshwater via solar energy. But the rate can be slowed—eg, dump a bunch of toxic stuff in one place, it'll drain to a river, now everything from that point and downstream is no longer freshwater. Or pump up enough groundwater. Or inject toxic crap down where the groundwater lives.

We're quite good at reducing the total amount of freshwater available.


Only tiny amounts


Back in 2012 or thereabouts, I was trying Akka, a Java library, experimenting with concurrency and such. Around the same time I gave Go a try and it was much simpler and less verbose. Never looked at Java after that, and I never felt Go was verbose.


If I understand correctly, this is like SQLite, but Postgres. I love SQLite, but sometimes I need a little more. So, no more saving dates as text, and we get arrays, jsonb, etc., and all the good stuff from Postgres. Am I right?


Exactly, all your favourite PG types, plus any that come with extensions such as vectors with pgvector.

We are working on PostGIS too, which will bring the geo types to PGlite.


IIRC SQLite can exist as a flat file and can be backed up. Will this work the same? And will it allow multiple writers?


Every use case and set of expectations is different, IMO. And yes, if it's on a file system, you can always find a way to keep a snapshot. Too early for this project to deliver everything in one go.


Valid question and I am sure it doesn't.

