
> a machine learning model

Not to mention that ML models are generally not useful for assessing risk. ML nearly always focuses on a point estimate rather than on a distribution over what you believe about a value. The former is all about expectation, the latter about variance, and correctly modeling variance is far more essential to risk modeling than expectation alone.

I recall talking to a startup that was attempting to model credit risk by building a binary classifier for defaulting, and trying to figure out a way to use it to score people for credit (obviously they chose to ignore the fact that there is a huge industry with decades of experience in assessing consumer credit risk).

They focused exclusively on finding more advanced models to get better AUC, without realizing that AUC isn't what matters here. I mentioned that even the most simplistic credit model should at least estimate P(default|X) and then set the interest rate to P(default|X)/(1 - P(default|X)) to break even, and they couldn't follow this basic reasoning. It was doubly hilarious since their population's base default rate was such that the solution to this equation was higher than the legal limit they could charge for interest.
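
To make the arithmetic concrete, here's a rough back-of-the-envelope sketch (assuming, unrealistically, that a default loses the entire principal, with no recovery and no time value of money):

    # Break-even: (1 - p) * (1 + r) - 1 = 0  =>  r = p / (1 - p),
    # where p = P(default|X) and r is the interest rate charged.

    def break_even_rate(p_default):
        """Rate at which the lender's expected profit is zero."""
        assert 0.0 <= p_default < 1.0
        return p_default / (1.0 - p_default)

    for p in (0.02, 0.10, 0.25):
        print(f"P(default) = {p:.0%} -> break-even = {break_even_rate(p):.1%}")

At a 25% base default rate you'd already need over 33% interest just to break even, which is how the scenario above ends up past the legal limit.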

In the early part of the current startup/tech boom there was a focus on "disruption", the idea that new ideas could easily dominate old ways of doing things. But for many industries, such as credit/lending and real estate, you should at least understand the basic principles of how these "old ways" work before trying to disrupt them.



> Not to mention that ML models are generally not useful for assessing risk. ML nearly always focuses on a point estimate rather than on a distribution over what you believe about a value.

It is actually quite a common practice to design neural networks that output probability distributions.
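
For instance (a minimal PyTorch sketch with made-up layer sizes and toy data), a network can output the parameters of a distribution, here a Gaussian mean and log-variance trained on the negative log-likelihood:

    import torch
    import torch.nn as nn

    class GaussianHead(nn.Module):
        """Outputs a mean and log-variance per input, i.e. a
        predictive distribution rather than a single number."""
        def __init__(self, in_dim, hidden=32):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
            self.mean = nn.Linear(hidden, 1)
            self.log_var = nn.Linear(hidden, 1)

        def forward(self, x):
            h = self.body(x)
            return self.mean(h), self.log_var(h)

    model = GaussianHead(in_dim=4)
    x, y = torch.randn(64, 4), torch.randn(64, 1)
    mu, log_var = model(x)
    # Gaussian NLL (up to a constant): 0.5 * (log var + (y - mu)^2 / var)
    loss = 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()
    loss.backward()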


That distribution is still a point estimate of a multinomial's parameters, not the distribution of your certainty in that estimate itself. This is essentially a generalization of logistic regression, which will of course give the probability of a binary outcome; but to understand the variance of your prediction, you need to account for the uncertainty in your parameters themselves.

This can be done for neural networks, either through bootstrap resampling of the training data or with more formal Bayesian neural networks, but both are fairly computationally intensive and not typically done in practice.
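
The bootstrap version is at least easy to sketch (synthetic data and an arbitrary ensemble size here): refit the same model on resampled training sets and look at the spread of predictions for a single input.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    logits = X @ np.array([1.0, -2.0, 0.5])
    y = (rng.random(500) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

    x_new = np.array([[0.2, -0.4, 1.1]])   # hypothetical applicant
    preds = []
    for _ in range(200):
        idx = rng.integers(0, len(X), size=len(X))  # resample w/ replacement
        clf = LogisticRegression().fit(X[idx], y[idx])
        preds.append(clf.predict_proba(x_new)[0, 1])

    # The spread across resamples estimates the uncertainty in the
    # model's own probability estimate, not just the probability itself.
    print(f"P(default|x) ~ {np.mean(preds):.3f} +/- {np.std(preds):.3f}")

200 refits of a logistic regression is cheap; 200 refits of a large neural net is not, which is why it's rarely done in practice.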


I was going to say, that seems like an "easy" second step once you get your ML to output hard numbers -- tack on ranges and confidence intervals.



