
I don't think OpenAI is training on your data. At least they say they don't, and I believe that. I wouldn't be surprised if the NSA or a similar agency can get access to the data on request, though.

But DeepSeek clearly states in their terms of service that they can train on your API data or use it for other purposes, and one might assume their government can access it as well.

We need direct eval comparisons between o3-mini and DeepSeek. Or, well, they are just numbers, so we can look them up on leaderboards.



OpenAI clearly states that they train on your data https://help.openai.com/en/articles/5722486-how-your-data-is...


> By default, we do not train on any inputs or outputs from our products for business users, including ChatGPT Team, ChatGPT Enterprise, and the API. We offer API customers a way to opt-in to share data with us, such as by providing feedback in the Playground, which we then use to improve our models. Unless they explicitly opt-in, organizations are opted out of data-sharing by default.

The business bit is confusing. I guess they see the API as a business product, but either way they do not train on API data.


So for posterity, in this subthread we found that OpenAI indeed trains on user data and it isn't something that only DeepSeek does.


So for posterity, in this subthread we found that I can use OpenAI without them training on my data, whereas I cannot with DeepSeek.


What do you mean? They both say the same thing for usage through the API. You can also run DeepSeek on your own compute.


Where does DeepSeek say that about API usage? Their privacy policy says they store all data on servers in China, and their terms of use say that they can use any user data to improve their services. I can’t see anything where they say that they don’t train on API data.


> Services for businesses, such as ChatGPT Team, ChatGPT Enterprise, and our API Platform

> By default, we do not train on any inputs or outputs from our products for business users, including ChatGPT Team, ChatGPT Enterprise, and the API.

So on the API they don't train by default; for other paid subscriptions they mention that you can opt out.


> I don't think OpenAI is training on your data. At least they say they don't, and I believe that.

Like they said they were committed to being “open”?


Yes, but DeepSeek models can be accessed through the APIs of Cloudflare or GitHub, in which case no training on your data takes place.


True.


I don't trust a company that goes against its founding principles.

OpenAI is not publishing open-source models. They should rename themselves ClosedAI.


You can pay for the compute yourself and be certain that no one is recording your data with DeepSeek.



