I know we would have to change a lot from how it is set up today, but here are some ideas:
- How many times have the results been reproduced by others?
- Add another layer of blindness: the person doing the lab work is not the person crunching the results. You could even have two separate groups crunch the numbers, with all groups unknown to one another.
- Avoid p-hacking with pre-determined (and registered!) p-value thresholds (a short simulation after this list illustrates why this matters)
- Register the experiment, along with its hypothesis, before the experiment starts
- Register authors as they are added and removed, and have that history be publicly available
- All results have to be uploaded to a website before publication
- The method of calculation has to be public on a website before publication.
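Not part of the proposal above, just a minimal sketch of why pre-registered thresholds matter: under the null (no real effect), testing one pre-registered outcome keeps the false-positive rate near alpha, while measuring many outcomes and reporting whichever clears the threshold inflates it badly. The group sizes, outcome count, and the numpy/scipy simulation are illustrative assumptions, not anything from the original comment.

```python
# Illustrative sketch (assumed parameters): false-positive rates with and without
# a pre-registered outcome, when there is no real effect at all.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
ALPHA = 0.05          # hypothetically pre-registered significance threshold
N_PER_GROUP = 30      # subjects per arm
N_OUTCOMES = 20       # outcomes a p-hacker could choose among after the fact
N_SIMULATIONS = 2000

preregistered_hits = 0  # test only the single pre-registered outcome
phacked_hits = 0        # test every outcome, report the best-looking one

for _ in range(N_SIMULATIONS):
    # Null world: treatment and control are drawn from the same distribution.
    treatment = rng.normal(size=(N_OUTCOMES, N_PER_GROUP))
    control = rng.normal(size=(N_OUTCOMES, N_PER_GROUP))
    pvals = np.array([stats.ttest_ind(t, c).pvalue for t, c in zip(treatment, control)])

    preregistered_hits += pvals[0] < ALPHA   # outcome fixed in advance
    phacked_hits += pvals.min() < ALPHA      # cherry-pick the smallest p-value

print(f"False-positive rate, pre-registered outcome: {preregistered_hits / N_SIMULATIONS:.3f}")
print(f"False-positive rate, pick-the-best outcome:  {phacked_hits / N_SIMULATIONS:.3f}")
```

With 20 candidate outcomes, the pick-the-best strategy comes out "significant" roughly 60-65% of the time even though nothing is real, versus about 5% for the pre-registered test.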
So a high-quality paper is one where
- the experiment was logged in advance
- the history of authors is known
- there is a distinct separation of experimenter and cruncher
- the public can get the results themselves and run the same analysis to confirm (or not!) the results
- the experiment is repeated and confirmed by others. Even if the first experiment is bad or a fraud and the second one doesn't confirm it, a third experiment could be the tie-breaker. It would also be easier to trace whether it was the lab or the cruncher that made a mistake or committed fraud.
Many of these ideas have been tried already. Unfortunately they don't work.
1. Pre-registration. Great idea, it's often being done. Doesn't work because universities don't enforce it. They'd have to proactively check that papers match their pre-registrations and then fire people when they don't.
2. Reproducibility. Nobody gets funded to do this but even if they did, there are lots of bogus studies that can easily be reproduced. The problems are in the methods rather than the data. If you specify a logically invalid or unscientific method, then people following your instructions will calculate the same results and still be wrong.
3. Blindness/splitting work up. This is already how many papers are made and academics turn it around as a defense. Notice how in every single case where this stuff happens, the people whose names are on the paper claim the fraud was done by someone else and they had no idea. Universities invariably accept this claim without question.
4. All results have to be uploaded before publication. Did you mean raw data? Results are the primary content of the paper, so it's unclear what this would mean. Researchers in some fields heavily resist publishing raw data for various (bad) reasons, like academic culture rewarding papers but not data collection efforts, so they're incentivized to squeeze as many papers out of a dataset as possible before revealing it to competitors. In a few fields (like climatology) they often refuse to publish raw data because they're afraid of empowering people who might check their papers for errors, who they see as a sort of shadowy enemy.
5. Authorship history. Which frauds are you thinking of that this would fix?
I spent a lot of lockdown time looking at this question. You've listed five ideas; I churned through maybe 15-20. None of them work. On investigation they've usually been tried before, and academics immediately work around them. Science is littered with integrity theatre in which systems are put in place in response to some fraud, and they appear to be operating on the surface, but nothing is actually being checked or enforced.
> Overall, we have to incentivize good science.
I'm by now 95% convinced the only way to do this is for scientific research to be done commercially. No government or charitable funding of science at all. As long as science is being firehosed with money by people who don't actually use or consume the resulting papers, the incentive will always be to provide the appearance of doing science without actually doing it. It's just so much more effective for fundraising. Commercial science that's purchased by someone has a customer who will actually try to check they got what they paid for at some point, and can vote with their wallet if they didn't. Also legal protections and systems to stop fraud in markets are well developed. Notice that Elizabeth Holmes went to prison for defrauding investors. If she'd done the same thing at a university she'd have gotten away with it scot free.
> If she'd done the same thing at a university she'd have gotten away with it scot free.
But she didn't though. She was rejected from academia, which is why she turned to private capital. Her fraud worked on private investors, not government funding agencies. She couldn't even convince her professor to get behind her idea, but private funds threw millions of dollars at her.
I haven't read her bio, so would be interested to know more about her being rejected from academia. The official story is that she dropped out during her undergrad and immediately formed a company. Or are you referring to her professors telling her they didn't think her idea would work? If so then they were right, but her profs are not the people handing out grants. There's no sign she couldn't have got a grant for such a thing, given that government funding is justified by its ability to fund long-shot ideas that some say won't work, and people doing things that clearly can't work regularly get grants.