drclau · 2025-12-12T11:44:13 1765539853

How do you know the confidence scores are not hallucinated as well?

kiliankoe · 2025-12-12T12:02:12 1765540932

They are. The model has no inherent knowledge about its confidence levels; it just adds plausible-sounding numbers. Obviously they _can_ be plausible, but trusting these is just another level up from trusting the original output.

I read a comment here a few weeks back that LLMs always hallucinate, but we sometimes get lucky when the hallucinations match up with reality. I've been thinking about that a lot lately.

TeMPOraL · 2025-12-12T12:19:23 1765541963

> the model has no inherent knowledge about its confidence levels

Kind of. See e.g. https://openreview.net/forum?id=mbu8EEnp3a, but I think it was established already a year ago that LLMs tend to have an identifiable internal confidence signal; the challenge around the time of the DeepSeek-R1 release was to connect that signal, through training, to tool use activation, so the model does a search if it "feels unsure".

losvedir · 2025-12-12T14:48:51 1765550931

Wow, that's a really interesting paper. That's the kind of thing that makes me feel there's a lot more research to be done "around" LLMs and how they work, and that there's still a fair bit of improvement to be found.

fragmede · 2025-12-12T13:50:12 1765547412

In science, before LLMs, there's this saying: all models are wrong, some are useful. We model, say, gravity as 9.8m/s² on Earth, knowing full well that it doesn't hold true across the universe, and we're able to build things on top of that foundation. Whether that foundation is made of bricks, or is made of sand, for LLMs, is for us to decide.

xhkkffbf · 2025-12-12T16:09:42 1765555782

It doesn't hold true across the universe? I thought this was one of the more universal things like the speed of light.

procflora · 2025-12-12T17:27:15 1765560435

G, the gravitational constant, is (as far as we know) universal. I don't think this is what they meant, but the use of "across the universe" in the parent comment is confusing.

g, the net acceleration from gravity and the Earth's rotation, is what is 9.8m/s² at the surface, on average. It varies slightly with location and altitude (less than 1% for anywhere on the surface IIRC), so "it's 9.8 everywhere" is the model that's wrong but good enough a lot of the time.
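As a rough check on the altitude part of that variation, here is a minimal sketch. It assumes a spherical, non-rotating Earth with the standard value g0 = 9.80665 m/s² and a mean radius of 6371 km, so it ignores the latitude and rotation effects mentioned above. The function name `g_at_altitude` is just for illustration.

```python
G0 = 9.80665                  # standard gravity at sea level, m/s^2
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, m (assumed spherical Earth)


def g_at_altitude(h_m: float) -> float:
    """Inverse-square falloff of g with height h_m (metres) above the surface."""
    return G0 * (EARTH_RADIUS_M / (EARTH_RADIUS_M + h_m)) ** 2


for label, h in [("sea level", 0.0), ("Everest summit", 8_849.0), ("ISS orbit", 400_000.0)]:
    g = g_at_altitude(h)
    # Percentage difference relative to the "9.8 everywhere" model
    print(f"{label:>15}: g = {g:.3f} m/s^2 ({100 * (g / G0 - 1):+.2f}% vs. standard)")
```

Under these assumptions the summit of Everest comes out roughly 0.3% below the sea-level value, which is consistent with the "less than 1% on the surface" estimate.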

fragmede · 2025-12-13T01:43:28 1765590208

It doesn't even hold true on Earth! Never mind that other planets have different sizes, which changes that number; the equation also doesn't account for the atmosphere and the air resistance that comes with it. If we drop a feather that isn't crumpled up, it'll float down gently at anything but 9.8m/s². In sports, the air resistance of different balls is enough that how fast something drops is also not exactly 9.8m/s², which is why peak athlete skills often don't transfer between sports. So, as a model, ignoring air resistance is good enough a lot of the time, but sometimes it's not a good model because we do need to care about air resistance.

hackeman300 · 2025-12-12T17:26:36 1765560396

Gravity isn't 9.8m/s/s across the universe. If you're at higher or lower elevations (or outside the Earth's gravitational pull entirely), the acceleration will be different.

Their point was that the 9.8 model is good enough for most things on Earth; the model doesn't need to be perfect across the universe to be useful.

JAlexoid · 2025-12-12T21:32:48 1765575168

g (lower case) is literally the gravitational acceleration of Earth at surface level. It's universally true, as there's only one Earth in this universe.

G is the gravitational constant, which is also universally true (erm... to the best of our knowledge); g is calculated using the gravitational constant.

dfsegoat · 2025-12-12T12:01:24 1765540884

They 100% are, unless you provide a rubric, i.e. basically make the score ordinal.

"Return a score of 0.0 if ...., Return a score of 0.5 if .... , Return a score of 1.0 if ..."
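A minimal sketch of that rubric idea, assuming you build the prompt yourself and then reject anything the model returns that is not one of the allowed ordinal values. The helper names (`build_rubric_prompt`, `parse_score`) and the example criteria are hypothetical; the actual criteria are whatever fits your task, and no particular LLM API is assumed.

```python
from typing import Dict, Optional

ALLOWED_SCORES = (0.0, 0.5, 1.0)  # the ordinal levels from the rubric


def build_rubric_prompt(criteria: Dict[float, str]) -> str:
    """Turn a {score: condition} mapping into explicit rubric instructions."""
    lines = [f"Return a score of {score} if {criteria[score]}." for score in sorted(criteria)]
    lines.append("Reply with the score only.")
    return "\n".join(lines)


def parse_score(reply: str) -> Optional[float]:
    """Accept only the exact rubric levels; anything else is treated as invalid."""
    try:
        value = float(reply.strip())
    except ValueError:
        return None
    return value if value in ALLOWED_SCORES else None


# Hypothetical criteria, for illustration only
prompt = build_rubric_prompt({
    0.0: "the answer contradicts the source text",
    0.5: "the answer is only partly supported by the source text",
    1.0: "every claim in the answer is supported by the source text",
})
print(prompt)
print(parse_score("0.5"), parse_score("0.73"))
```

Rejecting off-rubric values like 0.73 keeps the output ordinal, so a made-up decimal cannot pass for a calibrated confidence.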

drclau · 2025-03-28T16:34:01 1743179641

According to Google Maps "measure distance" tool it's ~630 miles, or ~1000 km. I am very surprised it was felt so strongly at such a distance.

v3ss0n · 2025-03-28T18:38:39 1743187119

Not just felt; there were deaths too.

groby_b · 2025-03-28T16:54:51 1743180891

Not surprising. A 7.7 is absolutely massive. (In terms of energy, 10^23.35 erg. Or 5 megatons of TNT, if my math works)
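For anyone who wants to check that math: the exponent 23.35 is what the classic Gutenberg-Richter energy relation, log10(E in erg) = 11.8 + 1.5·M, gives for M = 7.7. That this is the formula used above is my assumption; the comment doesn't say. A minimal sketch of the conversion, taking 1 megaton of TNT as 4.184e15 J:

```python
import math

ERG_PER_JOULE = 1e7
JOULES_PER_MEGATON_TNT = 4.184e15  # conventional TNT equivalent


def radiated_energy_erg(magnitude: float) -> float:
    """Gutenberg-Richter energy relation: log10(E [erg]) = 11.8 + 1.5 * M."""
    return 10 ** (11.8 + 1.5 * magnitude)


def megatons_tnt(magnitude: float) -> float:
    """Convert the radiated energy to megatons of TNT."""
    return radiated_energy_erg(magnitude) / ERG_PER_JOULE / JOULES_PER_MEGATON_TNT


m = 7.7
print(f"log10(E/erg) = {math.log10(radiated_energy_erg(m)):.2f}")
print(f"TNT equivalent = {megatons_tnt(m):.1f} megatons")
```

Under those assumptions the result lands a little above 5 megatons, so the figure holds up.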

drclau · on Oct 20, 2024

Mare Tranquillitatis pit solves these problems. And there are likely many more caves that haven't been discovered yet.

https://en.wikipedia.org/wiki/Mare_Tranquillitatis_pit

drclau · on Oct 11, 2024

> Microsoft has a large team dedicated towards improving these languages constantly

… and the people working on these projects need to deliver, else their performance review won’t be good, and their financial rewards (merit increase, bonus, refresher) will be low. And here we are.

Edit: I realize I’m repeating what you said too, but I wanted to make it more clear what’s going on.

neonsunset · on Oct 11, 2024

From what I've been told, all the nice bonuses and career opportunities are in Azure and other, more business-centric areas. You go to DevDiv to work on Roslyn (C#) or .NET itself because you can do so and care about either or both first and foremost.

drclau · on Feb 14, 2024

Well, they do tell you in the UI that chats are stored for 30 days even when you disable history. And then there's a link to this:

https://help.openai.com/en/articles/7730893-data-controls-fa...

drclau · on May 31, 2023

I see a lot of complaints regarding ChatGPT 4's performance in coding tasks. My hypothesis is that Microsoft wants to launch Copilot X based on GPT-4 [0], and they can't have OpenAI's ChatGPT 4 as a strong competitor.

[0]: https://github.com/features/preview/copilot-x

drclau · on May 31, 2023

How did the encounter go? Don’t leave us hanging here!

jacquesm · on May 31, 2023

Outside of mating season, and when you don't appear threatening, typically 'just fine'. With young, or during mating season: avoid if you can. Seeing two bull moose crash into each other will give you all kinds of things to think about, such as what would happen to your car if one decided to plow into it. And they don't move slowly either; they are very agile, probably much more so than you'd give them credit for if you haven't seen them in action. They're more like swordfighters than sumo wrestlers.

edit: this is a good sample:

https://www.youtube.com/watch?v=g-7imHBlguk

vesinisa · on May 31, 2023

Most moose encounters are entirely uneventful. Moose normally avoid / don't care about humans. That's why they are actually most dangerous to drivers - a high velocity moose crash is very dangerous for any vehicle smaller than a semi or van due to the anatomical mechanics of the moose body.

goodcanadian · on May 31, 2023

I.e., you will take out the moose's legs, but the rest of the injured, still-living moose will come through the windshield to join you in the passenger compartment.

goodcanadian · on May 31, 2023

I walked away quickly and so did it. I think it was just as surprised as me.

It is certainly the closest I have been to a moose, but living in that area, I encountered them fairly often. They are usually pretty docile.

drclau · on May 28, 2023

Out of curiosity, where are you from?

FWIW, I agree with you, although I experienced the medical system only as a patient / outsider. I live in a former communist country in Eastern Europe.

drclau · on May 18, 2023

How can you create a US account? Don't they ask for a credit/debit card and/or phone number?

jimstr · on May 18, 2023

I bought a $20 US gift card on ebay and redeemed that into a new US account

drclau · on May 7, 2023

> I'm sure it's because the number of people on 5G is drastically lower than the number of people on 4G/LTE.

5G has increased capacity over 4G, so even if all 4G users switched to 5G, you would still have better service.