Hacker News

> If you use GitHub, you feed OpenAI with your code as training data already, with GitLab you do the same for Google.

Do you have any evidence that GitHub trained Copilot on private repositories? They've been pretty clear about claiming they only used public repos.

Also, GitLab is not owned by Google AFAICT; it is a publicly traded company.



Some people have complained that Copilot output their rare code almost verbatim, so I have no reason to trust whatever GitHub/Microsoft state.


If you're going to make these kinds of accusations (the kind that, if proved true, would lead to multi-million dollar lawsuits), you should at least try to provide sources.


One example:

https://devclass.com/2022/10/17/github-copilot-under-fire-as...

Also there are multiple lawsuits already as you've probably noticed.


That is an example of GitHub allegedly violating the license of public repositories.

The claim that was made was that GitHub trained using private repositories and I have yet to see any evidence.


The caveat you missed in that case was that the person unchecked the checkbox that would allow GitHub to use files in his repository for training.


That caveat is irrelevant, as it has nothing to do with a private repository. When code is public, especially under a GPL license, it can end up in multiple repositories, not all of which may have unchecked the sharing checkbox.


Why don't you just look it up? It was on HN front page some time ago.


I tried; all I could find was one other HN comment asking about it. Admittedly I could have tried harder, but it really isn't my assertion to defend.

I think there might be some confusion here between private repositories and public repositories with restrictive licenses. There is evidence of the latter but not the former.



