My point was about the actual trained model.

I can't see access to data ever being accepted, for the reasons you mentioned.

However, publishing the model in a way that lets third parties fuzz it seems to allow for discovery of the worst failure modes.
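To make that concrete, here's a minimal sketch of what third-party fuzzing could look like given query access; `load_model`, `model.predict`, and the loan-template example are hypothetical stand-ins, not any real API:

    # Black-box fuzzing sketch: fill a template with every slot combination
    # and collect predictions, so auditors can spot outputs that shift when
    # only a protected attribute changes. `model.predict` is assumed.
    import itertools

    def fuzz(model, template, slots):
        out = {}
        for combo in itertools.product(*slots.values()):
            text = template.format(**dict(zip(slots.keys(), combo)))
            out[text] = model.predict(text)
        return out

    # Hypothetical usage against a published loan-scoring model:
    # model = load_model("published-model")  # stand-in loader
    # scores = fuzz(model, "{name} applied for a loan.",
    #               {"name": ["Emily", "Lakisha"]})
    # Large score gaps across otherwise-identical inputs are exactly the
    # failure modes this kind of fuzzing is meant to surface.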



Releasing the trained model has the same consequences as releasing the data. Even worse: people then don't have to spend the capital required to train it.

For example, let's say a big company trains an uber translation network. Releasing the trained model means anyone can now use it.

Maybe we could have laws about this, like patent laws, but there's still difficulty: does fine-tuning the model make it substantially different? And how do you deal with countries like China that don't respect patents (especially when we're talking about technology worth hundreds of millions or billions of dollars)?

I'm not saying we should give up. I'm saying that there isn't a simple answer here. We shouldn't expect one either! But to get a good answer we need to discuss and figure out the nuance of the situation. Because on one side we can't trust these companies too much. It's too easy to make mistakes. On the other side we can't just bankrupt them because we'll lose a huge standing in the world economy. So where's the middle ground? That's what I'm after.


Releasing a model is in no way like releasing the data. From an information-theory perspective, ML models are fundamentally compression algorithms for their data sets: they are lossy encodings.
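As a toy illustration of that lossy-compression framing (a polynomial fit standing in for a trained model; nothing here is specific to any real system):

    # The fitted "model" summarizes 200 data points in 4 coefficients;
    # the original labels cannot be read back out of it.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 200)
    y = np.sin(3 * x) + rng.normal(0, 0.1, 200)  # the "private" training data

    coeffs = np.polyfit(x, y, deg=3)   # the model: 4 floats, not 200 points
    y_hat = np.polyval(coeffs, x)

    # Nonzero residuals: the exact training labels are gone.
    print("mean reconstruction error:", np.mean((y - y_hat) ** 2))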

Furthermore, ML is a huge field, of which NLP is a prominent but small subset.

The vast majority of models used in the real world don't generalize that way, because the data sets and processes they're tied to are bespoke.

The middle ground is the black-box model (perhaps with technical safeguards against the equivalent of decompiling, or hosted as a service). It provides the ability to statistically prove bias while protecting most privileged or private information.
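For instance, here's a rough sketch of how a bias audit could work through black-box access alone; `query_model` is a hypothetical hosted-API wrapper, while the two-proportion z-test itself is standard:

    # Query the hosted model with matched applications that differ only in
    # a protected attribute, then test whether the approval-rate gap is
    # statistically significant.
    from statistics import NormalDist

    def two_proportion_z(p1, n1, p2, n2):
        p = (p1 * n1 + p2 * n2) / (n1 + n2)   # pooled approval rate
        se = (p * (1 - p) * (1 / n1 + 1 / n2)) ** 0.5
        z = (p1 - p2) / se
        return z, 2 * (1 - NormalDist().cdf(abs(z)))

    # Hypothetical audit, weights never seen:
    # a = [query_model(app) for app in group_a]  # 1 = approved
    # b = [query_model(app) for app in group_b]
    # z, pval = two_proportion_z(sum(a) / len(a), len(a),
    #                            sum(b) / len(b), len(b))
    # A small pval is statistical evidence of bias, obtained without
    # access to the weights or the training data.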


What I'm saying is that there's going to be a lot of pushback against releasing a full trained model, because anyone with that model can make similarly accurate predictions. Additionally, one can fine-tune that model and make something better. So someone spends a few million getting the original model; someone else spends a few thousand to fine-tune a better version.
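A rough sketch of why that's so cheap, using PyTorch with a stand-in backbone (the released architecture and weights file are hypothetical):

    # With the released weights in hand, freeze the expensive backbone and
    # train only a small new head, a tiny fraction of the original cost.
    import torch
    import torch.nn as nn

    backbone = nn.Sequential(nn.Linear(512, 512), nn.ReLU())  # stand-in
    # backbone.load_state_dict(torch.load("published_weights.pt"))  # hypothetical

    for p in backbone.parameters():
        p.requires_grad = False   # reuse the millions already spent training

    head = nn.Linear(512, 10)     # only this layer gets trained
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)

    x, y = torch.randn(32, 512), torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(head(backbone(x)), y)
    loss.backward()
    opt.step()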

What I'm saying is there's a huge downside to releasing the full model compared to holding it tight. Many companies don't patent things for similar reasons.



