I see a ton of people trying to justify their use of Copilot/ChatGPT (or rushing a startup to cash in on LLMs).
Maybe that conflict of interest is why there's so little talk of it being built on plagiarism and license violations of the open source code the model was trained on.
We just suffered through a couple decades of almost every company in our field selling out users' privacy, one way or another. And years of shamelessly obvious crypto/blockchain scams. So I guess it'd be surprising if our field didn't greedily snap up the next unethical opportunity.
Right, any human who has ever read open source code should also be forced to submit any code they write, to make sure they haven't mentally committed copyright violations.
I don't give two shits if whatever current expensive GPT is dumping out code 'very similar' to open source code today, and you'd be cutting off your nose to spite your face if you did. Don't assume these models will stay this expensive to run: by the time you or I could run a 'LibreGPT' on our own hardware, we'd be scared as hell to even write code with it, because any use of it could get us sued into oblivion.
Cleanroom development practice is a great point to make.
We have precedent for believing that exposure to some code (or other internals) can taint an engineer, such that they can't, say, write a sufficiently independent implementation.