
In the paper, DeepSeek just says they have ~800k responses that they used for the cold-start data on R1, and they are very vague about how they got them:

> To collect such data, we have explored several approaches: using few-shot prompting with a long CoT as an example, directly prompting models to generate detailed answers with reflection and verification, gathering DeepSeek-R1-Zero outputs in a readable format, and refining the results through post-processing by human annotators.


My surface-level reading of these two sections is that the 800k samples come from R1-Zero (i.e. "the above RL training") and V3:

> We curate reasoning prompts and generate reasoning trajectories by performing rejection sampling from the checkpoint from the above RL training. In the previous stage, we only included data that could be evaluated using rule-based rewards. However, in this stage, we expand the dataset by incorporating additional data, some of which use a generative reward model by feeding the ground-truth and model predictions into DeepSeek-V3 for judgment.

> For non-reasoning data, such as writing, factual QA, self-cognition, and translation, we adopt the DeepSeek-V3 pipeline and reuse portions of the SFT dataset of DeepSeek-V3. For certain non-reasoning tasks, we call DeepSeek-V3 to generate a potential chain-of-thought before answering the question by prompting.

The non-reasoning portion of the DeepSeek-V3 dataset is described as:

> For non-reasoning data, such as creative writing, role-play, and simple question answering, we utilize DeepSeek-V2.5 to generate responses and enlist human annotators to verify the accuracy and correctness of the data.

I think if we were to take them at their word on all this, it would imply there is no specific OpenAI data in their pipeline (other than perhaps their pretraining corpus containing some incidental ChatGPT outputs that are posted on the web). I guess it's unclear where they got the "reasoning prompts" and corresponding answers, so you could sneak in some OpenAI data there?
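
For intuition, the rejection-sampling step they describe could be sketched roughly like this (everything here is a stub I made up to illustrate the paper's description, not DeepSeek's actual code):

    import random

    def generate_from_rl_checkpoint(prompt: str) -> str:
        # Stub: sample one reasoning trajectory from the RL checkpoint.
        return f"<think>...</think> candidate answer for {prompt!r}"

    def rule_based_check(answer: str, ground_truth: str) -> bool:
        # Stub: rule-based reward, e.g. exact-match checking for math.
        return ground_truth in answer

    def v3_judge(answer: str, ground_truth: str) -> bool:
        # Stub: generative reward model, i.e. feed ground truth plus
        # prediction to DeepSeek-V3 for a verdict. Faked with a coin flip.
        return random.random() > 0.1

    def collect_sft_data(tasks, k=16):
        # tasks: (prompt, ground_truth, has_rule_based_reward) triples
        dataset = []
        for prompt, truth, has_rule in tasks:
            for _ in range(k):  # rejection sampling: up to k candidates
                answer = generate_from_rl_checkpoint(prompt)
                check = rule_based_check if has_rule else v3_judge
                if check(answer, truth):
                    dataset.append({"prompt": prompt, "response": answer})
                    break
        return dataset

The non-reasoning data would then come from the separate V3 pipeline rather than from this loop.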


That's what I am gathering as well. Where is OpenAI going to find substantial proof to claim that their outputs were used?

The reasoning prompts and answers for SFT from V3, you mean? No idea. For that matter, you have no idea where OpenAI got this data from either. If they open this can of worms, their own can of worms will be opened as well.


> Where is OpenAI going to find substantial proof to claim that their outputs were used?

I assume in their API logs.


Shibboleths in output data


Maybe they'll try to build up traffic to your site from those domains, then push to sell them to you, or extort you by threatening to remove the redirects?


Just feels like such an odd play lol. If they could organically generate leads/traffic that I'd be willing to get extorted over, then surely they would also have the means to start a marketing agency that I'd be willing to pay far more for?


It is also unavailable from Canada.


Same in the US.


I vaguely recall this being part of a tit-for-tat between China and anti-China hardliners in the West. There have been movements to restrict Chinese access to FOSS, because forking FOSS lowers Chinese dependence on the West, along with (ironic) accusations that the "authoritarian" Chinese are limiting access to Western tech products. I thought there was some sort of legislative or judicial outcome that came out of it, but no luck with a quick google.

-----

U.S. restriction on Chinese use of open-source microchip tech would be hard to enforce - October 13, 2023

> U.S. lawmakers are pressuring the administration of President Joseph Biden to place restrictions on RISC-V to prevent China from benefiting from the technology as it attempts to develop its semiconductor industry.

https://thechinaproject.com/2023/10/13/u-s-bar-on-chinese-us...

-----

China’s Use of Foreign Open-Source Software, and How to Counter It - April 2, 2024

> Democratic governments also need to reassess which products should not be made open-source because they’re at risk of being weaponized by malign actors.

https://chinaobservers.eu/chinas-use-of-foreign-open-source-...

-----

Whatever the US did, Europe would do. Anybody in the US or Europe working on a FOSS project with Chinese contributors that they're friendly with? Has anything happened recently?


TianYancha is a corporate data aggregation website; it has nothing to do with FOSS. Your post is such a clumsy attempt to steer the conversation into anti-Americanism/anti-Westernism. Like, really blatant lol.


And Australia.


If you read a bit further, he excludes instances like that and lists films with only a single (likely intentional) title drop.


> Okay, so those are the problems. What’s the solution?

> If you need to perform a case mapping on a string, you can use LCMapStringEx with LCMAP_LOWERCASE or LCMAP_UPPERCASE, possibly with other flags like LCMAP_LINGUISTIC_CASING. If you use the International Components for Unicode (ICU) library, you can use u_strToUpper and u_strToLower.
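
For the ICU route, here's what that looks like from Python via the PyICU bindings (assuming PyICU is installed; the Win32 LCMapStringEx path is the same idea). The classic payoff is Turkish dotted/dotless I:

    # Locale-sensitive case mapping through ICU (pip install PyICU).
    import icu

    for loc in ("en_US", "tr_TR"):
        s = icu.UnicodeString("istanbul")
        print(loc, str(s.toUpper(icu.Locale(loc))))
    # en_US ISTANBUL
    # tr_TR İSTANBUL  (Turkish maps lowercase i to dotted capital İ)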


Debasing the new currency by adding lesser metals will also devalue existing currency that is "pure": once you can't trust the value of any given coin, the value of the whole existing pool of money drops.

It's at a smaller scale, but the same effect can be seen with counterfeit currency today. Cash-heavy businesses have to absorb whatever counterfeits they accept, so they are really valuing your dollar at $0.99 if there's a chance they might have to throw it out.
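
Back-of-envelope, that $0.99 figure is just an expected value (the 1% counterfeit rate here is only to reproduce the number above):

    def expected_value(face_value: float, counterfeit_rate: float) -> float:
        # A counterfeit bill is a total loss, so each accepted dollar
        # is worth face_value * (1 - p) in expectation.
        return face_value * (1 - counterfeit_rate)

    print(expected_value(1.00, 0.01))  # 0.99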


There are ways to gauge the confidence of an LLM (token probabilities over the response, or generating multiple outputs and checking their consistency), but yeah, that's outside the LLM itself. You could feed the info back to the LLM as a status/message, I suppose.
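
Concretely, both signals can be pulled from any API that returns token logprobs; a rough sketch with the OpenAI Python SDK (model name and prompt are placeholders):

    import math
    from collections import Counter
    from openai import OpenAI

    client = OpenAI()

    def mean_token_prob(prompt: str) -> float:
        # Average per-token probability of a single response.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            logprobs=True,
        )
        lps = [t.logprob for t in resp.choices[0].logprobs.content]
        return math.exp(sum(lps) / len(lps))

    def self_consistency(prompt: str, n: int = 5) -> float:
        # Fraction of n sampled answers agreeing with the majority answer.
        answers = [
            client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": prompt}],
                temperature=1.0,
            ).choices[0].message.content.strip()
            for _ in range(n)
        ]
        _, count = Counter(answers).most_common(1)[0]
        return count / n

Low agreement across samples tends to be a better "I'm not sure" signal than raw logprobs, since the model can be confidently wrong at the token level (as the sibling comment notes).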


The idea of hooking LLMs back up to themselves, i.e. giving them token prob information somehow or even giving them control over the settings they use to prompt themselves is AWESOME and I cannot believe that no one has seriously done this yet.

I've done it in some Jupyter notebooks and the results are really neat, especially since, with a tiny bit of extra code, LLMs can be made to generate a context "timer" that they wait on before prompting themselves to respond, creating a proper conversational agent system (i.e. not the walkie-talkie systems of today).

I wrote a paper that mentioned doing things like this for having LLMs act as AI art directors: https://arxiv.org/abs/2311.03716
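
To make the timer idea concrete, a toy version of the loop (all stubs; `llm` stands in for any chat-completion call):

    import time

    def llm(history):
        # Stub chat call: returns an utterance, or "WAIT:<seconds>" if
        # the model decides to stay quiet for a while.
        return "WAIT:5"

    def wait_for_user(timeout):
        # Stub: return the user's message, or None if the timer fires.
        time.sleep(timeout)
        return None

    def converse(history, max_turns=10):
        for _ in range(max_turns):
            out = llm(history)
            if out.startswith("WAIT:"):
                msg = wait_for_user(float(out.split(":", 1)[1]))
                if msg is None:  # silence elapsed: the model prompts itself
                    history.append({"role": "system",
                                    "content": "No reply yet; you may speak."})
                else:
                    history.append({"role": "user", "content": msg})
            else:
                history.append({"role": "assistant", "content": out})
        return history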


The problem is also that the model may have very high confidence in its token probabilities and still be wrong, but I'm sure it could help in some cases.


Democracies are more than capable of passing laws that violate the rights of minorities.


So this would be a worthwhile case if a youthful physicist was bringing the complaint?

As per the article, they are arguing that seniors are particularly affected by climate change due to heat waves, which are causing deaths.

The 2022 heatwaves had hundreds of deaths attributed to climate change in Switzerland: https://lenews.ch/2023/07/08/climate-change-behind-60-percen...


The issue with people dying from heat waves has nothing to do with climate change.

The issue is that Switzerland outlawed air conditioning for private homes (still allowed in shopping centres though!)


Show me that law, because as far as I can tell, that's complete nonsense.

I know of a bunch of people who legally purchased and installed aircon in their homes.


It might vary by canton; I'm in Zürich.

https://archive.is/7g3A9

Yeah, you can buy a mobile AC unit, but none of the new builds in the past 10+ years come with a built-in (fixed) unit.


With 37,000 Palestinians marked as suspected militants, it would mean they expected up to 555,000-740,000 civilian casualties.


How did you arrive at these numbers?


Not GP but:

> Lavender listed as many as 37,000 Palestinian men

> they were permitted to kill 15 or 20 civilians during airstrikes

37,000 * 15 = 555,000
37,000 * 20 = 740,000


They claim the system has 90% accuracy, so they would have to actually kill about 10% more people than these numbers to offset the 10% error rate: between 610,500 and 814,000. The whole Gaza Strip had about 2 million people before the current siege.
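
The arithmetic in this subthread, spelled out:

    marked = 37_000
    low, high = marked * 15, marked * 20   # 555,000 and 740,000
    # Adding ~10% to offset the claimed 10% error rate:
    print(low * 1.1, high * 1.1)           # 610,500 and 814,000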

