What happens when you try to publish a failure to replicate in 2015/2016 (xeniaschmalz.blogspot.com)
332 points by brianchu on June 29, 2016 | 137 comments



The paper was a meta-analysis of nine unpublished studies. While the issue of non-publication of negative results is extremely important, this should not be the test case for it.

I agree with other comments that the lack of null results in journals is a massive showstopper for epistemology and confidence in scientific work. In my experience during my PhD, academics were unwilling to submit null results; they wanted to find the actual answer and publish that instead. That leads to delays of years, leaves well-known and obvious errors in existing papers unchallenged in public, and can end with that scientist never submitting the null result at all.


> The paper was a meta-analysis of nine unpublished studies. While the issue of non-publication of negative results is extremely important, this should not be the test case for it.

I think they probably made a mistake in calling it a meta-analysis, because indeed a meta-analysis relies on the idea that the studies themselves have already been peer reviewed and as such don't need to be questioned individually. Here, however, we're dealing with eight attempts at replication by the author that have not previously been published, and as such those studies should be discussed in detail so we can assess the merits of the individual replication attempts. Any meta-analysis of those experiments is then really just the cherry on top.


> the studies themselves have already been peer reviewed and as such don't need to be questioned individually

Just to add a bit to your comment: Peer review is a community's way of saying a paper stands up to their best understanding of methods at the time. The author could have accidentally left something out, misrepresented their approach or fabricated things. Peer review would not catch that, but replication or failure to replicate would. They are two different things.

For example, I remember one experiment that could never be replicated by anyone except the author and his students. So, it got a bit heated. Ultimately, they found it was a detail of the experimental setup, not reported in the paper because the authors didn't think it was important, that allowed the experiment to work. So, it passed peer review and failed replication, but ultimately, because of the failed replications and the academic process, further information came to light and knowledge was created.

The academic process can be slow and frustrating. But, when applied properly, it is our best approach to expanding knowledge. That isn't to say it can't be improved, especially regarding failure to replicate and null-hypothesis papers. Academics know this and are working on methods around this, methods called out in these discussions.

Some of the most innovative people I've found seeking ways to improve this process have been NSF and NIH officials. So it's promising, but slow.


I think you're putting a little too much weight on peer review here. The way it is presented to non-academics often makes it sound more like a trial or inquest, where a large group of people carefully weigh the evidence.

In practice, peer review means that 1-3 people each spent 1-3 hours thinking about the paper and didn't find anything horribly wrong with it. It is more like a code review--it's good if it catches something, but not terribly surprising if some bugs slip through.

You're definitely right about the importance of continuing to discuss and replicate work after it has been published.


The level of rigor in peer review is quite variable across disciplines and even specialties. This is a cultural thing.

The code review parallel is pretty good (and it also captures the fact that code review is highly variable too).


Sure. Peer reviews in pure math are often exceedingly careful, or so I am told (not in the field).

Still, I think it's a little much to say that it's "a community's way of saying a paper stands up." At best, it's the opinion of a handful of people from that community.


I have also moved to this view of peer review (my area is quantitative biology). Given the complexity of the papers, the ambiguity of the environment in which experiments are done, the inability of most scientists to write up a clear description of their work, and the tendency to pad the significance of a paper's findings, peer reviewers at best act to find "invalidating errors" which should prevent the paper from being published because it's "obviously wrong".


That's a pretty good description of peer review. It varies, but generally the people chosen know the topic and are good representatives of the community, even if only 1-3 people.

For good venues (journals and even conferences), I think the biggest limitation is not the people, but the information contained in the paper. You can't answer all questions and show every last detail.

People are frustrated with academics, but academics are frustrated with how journalists and readers interpret their results.


I agree. Another way to describe this would be: We performed 8 experiments investigating effect X. Details of the experiments below. A hierarchical model was constructed to analyse the results jointly. This failed to find effect X.
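(As a minimal sketch of what such a joint analysis could look like, treated as a random-effects meta-analysis, which is one common hierarchical formulation: the effect sizes and standard errors below are made up, and the DerSimonian-Laird estimator is just one reasonable choice, not anything from the paper.)

    import numpy as np
    from scipy import stats

    # Hypothetical per-experiment effect estimates and standard errors
    # (made-up numbers; the real studies would supply these).
    effects = np.array([0.08, -0.03, 0.05, 0.01, -0.06, 0.04, 0.02, -0.01])
    ses = np.array([0.05, 0.06, 0.05, 0.07, 0.06, 0.05, 0.06, 0.05])

    # DerSimonian-Laird random-effects pooling: experiments are assumed to
    # share a common mean effect, with between-experiment variance tau^2.
    w_fixed = 1.0 / ses**2
    mu_fixed = np.sum(w_fixed * effects) / np.sum(w_fixed)
    Q = np.sum(w_fixed * (effects - mu_fixed)**2)
    k = len(effects)
    tau2 = max(0.0, (Q - (k - 1)) /
               (np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)))

    w = 1.0 / (ses**2 + tau2)
    mu = np.sum(w * effects) / np.sum(w)    # pooled estimate of effect X
    se_mu = np.sqrt(1.0 / np.sum(w))        # its standard error
    p = 2 * stats.norm.sf(abs(mu / se_mu))  # two-sided test of "effect X = 0"
    print(f"pooled effect = {mu:.3f} +/- {se_mu:.3f}, p = {p:.3f}")

A failure to find effect X would then show up as a pooled estimate whose confidence interval comfortably includes zero.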


I agree that the reliance on unpublished and non-peer-reviewed studies seems to be the biggest flaw, and it isn't going to be easily addressed by the author without significant work. But one route to prestigious publication might involve publishing each of the independent replication studies as an individual paper in a journal that doesn't assess for impact (like PLOS One or PeerJ), then, once each has passed peer review on its own and been published, writing the meta-analysis on those studies and trying to publish that analysis in a higher-profile journal.

It would take a hell of a lot longer, but that's the way to ensure that each individual study passes the bar for scientific rigor, and so therefore your meta-analysis actually has some teeth. And yes, that sounds like a grueling process -- and a damn expensive one if each publication in PLOS One carries a $1,500 fee, so $12,000 just for the author's 8 replication studies. Although this is certainly where PeerJ's unlimited publishing for a one-time fee of $399 would be the cheapest option. [Disclosure: shareholder of SAGE, which is an investor in PeerJ]


You wouldn't need to publish 8 papers. The paper is pretty light on details, but there is this table in the appendix: https://osf.io/e6h5d/

It looks like there are two paradigms used with different stimulus sets and subject populations. I think a longish paragraph describing each paradigm, plus shorter descriptions of the different subject pools and word lists (and any other manipulation) would be enough.


I don't see the need for publishing the constituent studies before publishing the "meta-analysis". The individual studies aren't contributing unique theoretical arguments; they are additive toward a single argument that the original results can't be replicated. It's fairly common for a single paper to incorporate multiple experiments (the paper is closer to this style than a traditional meta-analysis) which are all addressed by the same reviewers.


Good points; your comment and the other sibling comment definitely make the same valid argument. I got hung up on the phrase "meta-analysis", which does seem to be misused, or at least not used in the typical sense (which, as was noted in the post, was brought up by a reviewer as a problem). It's not a meta-analysis of existing research; it's purely new research presenting multiple replication attempts, all of which failed.


Would it be possible to take the studies and actually develop them into the analysis?

> the authors focus on a bunch of unpublished studies from a dissertation and a colleague who is not even an author of the present paper. There is no way of knowing whether these unpublished experiments meet the standards to be published in high-quality journals.”

The way to know whether the experiments meet the standards is to show them in the same paper, I guess?


That's what I would do.

I think that isn't actually an unreasonable complaint: the current paper has very few details about the paradigm used in those experiments. There is a really brief table, along with some comments embedded in R output, but what the paper really needs is something that looks like a methods section from an experimental psycholinguistics paper. Something like this:

Experiment 1: Nineteen native English speakers (demographics here) performed a lexical decision task. Three strings of letters were presented on a computer monitor (viewing conditions here). Subjects were asked to report, via a button press, whether the middle string was a valid English word (left button) or pseudoword (right button). Although they had up to 5 sec to respond, subjects were instructed to respond as quickly as possible, and the median reaction time was N.NN seconds. The original experiment manipulated (some irrelevant condition), which is not analyzed here. Instead, we calculated...

Experiment 2: As above, except the subject pool consisted of native German speakers and the stimuli were either German words or German pseudowords.

Experiment 3: Same as experiment 1, except that (irrelevant condition #2) was also manipulated instead of (irrelevant condition #1).

This might not fit in the main paper--even online-only journals can be weird about page limits--but it would be fine in an appendix.


You are probably right about the flaws, but my concern is that the bar to publish a study that reports "failure to replicate" is higher. This is at the heart of a very serious problem with science these days. Proving that research can be replicated and validated is probably more socially useful in some fields than "discovering" some new correlation. But all the academic incentives point to publishing "discovery".


At this point, I think Google Scholar should step in and just put a replications section beside every scientific publication. People should be able to quickly and easily see how many times a replication of a study has been attempted and, of those attempts, how many actually succeeded.

It's unfortunate that replications aren't taken more seriously these days, but it also doesn't help that, when there are actual replications, you have to scour the internet for them rather than having them readily available to you.


I think that's a pretty great idea, actually. If anyone here has the reach to get this onto their table, please do reach out. It would be even better if they'd use some replication metric in their ranking algorithm for papers.

The danger however is that this would just lead to only me-too studies being accepted and failed replications still being rejected.

I'd also love if they'd add (possibly Google hosted) repositories where the data/scripts etc. that belong to the paper can be uploaded and archived for all eternity.


I saw something like this a while ago... Codalab (http://codalab.org/). It's run by Percy Liang (https://cs.stanford.edu/~pliang/). It lets you create executable worksheets to go with your papers; and people can validate both the code and data used in your experiments.


http://gitxiv.com is far superior to Codalab. Source: I go to Stanford.


Looks pretty cool indeed; bookmarked. However, I believe it is focused on computer science. Nothing wrong with that, but I think the key is convincing scientists who are not exposed by default to ideas like version control, open source, and (I'd argue, given the rise of data mining) data sharing, or who are less computer savvy in general.


Interesting... It seems broken right now; but the github & arxiv integration sounds pretty cool, so I'll keep an eye on it.



The narrative would be more persuasive if it incorporated a story of how the paper evolved meaningfully in response to peer criticism. The question lingering in my mind after reading this is whether and how the paper was substantially revised (in light of reviewer feedback) between rejections. I'm sure it was (it has to have been, right?), but we don't get that feeling from the blog post. The author(s?) should have received a large amount of very good feedback between rejections from well-meaning peers in their scientific community. I don't recall reading about incorporating any of that feedback into subsequent revisions of the paper. The term "meta-analysis" probably should have been dropped after the first (pointed) rejection, for example, and the paper should probably have been broken down into two or three smaller papers rather than submitted as a 'meta-analysis' of unpublished work.

This is not to say that peer feedback wasn't taken seriously. I don't know that at all. But if the goal is to persuade a skeptical audience that academic publishing is broken, the author should articulate how they followed best practices in response to rejection letters from peer-reviewed journals. The alternative is to sound arrogant and self-defeating, which I'm sure was not the intent!


Forgive me for not being an academic, so maybe this question is moot.

Why isn't there a place that links to a given paper so that discussion about the paper can be centralized? It could also contain links to papers that link to that paper, among them would/could be the failure to replicate information, adding to the discussion. And I don't really mean a topical "this is what's new" site, I mean a historical "This is the paper, and this is what people have said about it." sort of site.

This seems like a fairly elementary idea. The only seemingly difficult bits I see are:

a) Getting (legal?) access to these papers.

b) Dealing with a large number of papers (millions?).

c) Authenticating users to keep the discussion level high.

d) Moderating the discussion in a way that doesn't piss off academia (impossible?).

e) Keeping the number of these sites (competition, if you will) low so that the discussion is not fractured between them.

It would seem like one of the "Information wants to be free" sites that host the papers that everyone shares with each other would be a great place to start something like this.


In general the way academics have a “discussion about the paper” is by writing another paper in response and citing the original.

So what you’re looking for is the citation graph, and you can find it e.g. on Google scholar. Search for the paper you are interested in, and then click the link which lists all the papers which cite that one.

You’ll probably need to be on a university campus or ask your anonymous internet friends for help to get around paywalls.


That only gives you academic critique and follow-ups. Sometimes a practitioner is pissed off to the degree that they post an opinion online. That is extremely valuable, but it never gets into the citation graph. For example... http://bramcohen.livejournal.com/20140.html?thread=214956

> I'd comment on academic papers more, but generally they're so bad that evaluating them does little more than go over epistemological problems with their methodology, and is honestly a waste of time.


Yes, we write comment papers. But the process is very very slow, so many don't bother.

I submitted a comment paper to a journal in March last year. The paper I was commenting on was from November 2014. My comment got accepted, after two rounds of peer review, in May this year. It will appear in the journal in July some time.


And that's fast by academic publishing standards. A flamewar that would take a day in an internet forum can smoulder for decades in academia, while the principals gather supporters and snipe at each other in conference sessions.


> You’ll probably need to be on a university campus or ask your anonymous internet friends for help to get around paywalls.

Sci-Hub is now everybody's internet friend.


This is exactly what ResearchGate aims to be - a social networking platform for scientists and researchers to share papers, ask and answer questions, and find collaborators.

In this interview [1] Ijad Madisch the founder & CEO of ResearchGate envisions ResearchGate to be a home for negative results.

[1] http://www.zmescience.com/other/interviews/qa-with-dr-ijad-m...


The idea is good, but the execution is LinkedIn-level horrible. Dark patterns everywhere, useless gamification which promotes low quality content, etc.


Yeah, ResearchGate has some of the most awful dark patterns/invitation spam of any social networking site. Co-authors constantly inviting you to join, convincing you that people who are not on the site are on the site, etc...


a) Maybe just have Sci-Hub and/or ArXiv links? If you assume your audience is mostly professors, then they can get past paywalls and you can just link to a paywalled source.

b) Nested subtopics, good search, and mergeable threads so you can enforce one thread per paper? Do papers have ISBN numbers, or some equivalent?

c) Most professors have verifiable .edu email addresses, but this is going to take human effort and you'd have to decide how much of the .edu world you want to allow in.

d) Start with a very specific code of conduct, accept the fact some will be pissed off, have at least one full-time paid moderator? This will be tough, maybe the hardest part.

e) Network effects will do this for you within networks (so each subfield of academia will eventually migrate to one site, but you might be able to attract e.g. most of math but biology could be on a competitor). I would worry about this least.

I think this would work best apart from a paper-hosting site. The human costs and hosting costs will become expensive, and it's easier to collect ad money if you stay legal. You might even be able to host this on a university network, as arXiv is, but this would make moderation more difficult because you'd be accused of bias towards that university.

I think PHP-BB has all of the technical features you would actually need. The moderation would be the hardest part IMO.


For (b) – yes, papers generally get DOIs.
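(As a rough sketch of what keying one discussion thread per paper on its DOI could build on: public metadata for a DOI can be fetched from the Crossref REST API. The DOI below is a placeholder to be replaced with a real one, and the field names are the ones Crossref exposes publicly, so treat this as an assumption-laden example rather than a spec.)

    import requests

    def paper_metadata(doi: str) -> dict:
        """Look up basic metadata for a paper by DOI via the public Crossref API."""
        resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
        resp.raise_for_status()
        msg = resp.json()["message"]
        return {
            "title": (msg.get("title") or [""])[0],
            "journal": (msg.get("container-title") or [""])[0],
            "year": msg.get("issued", {}).get("date-parts", [[None]])[0][0],
            "cited_by": msg.get("is-referenced-by-count"),  # citations known to Crossref
        }

    # Placeholder DOI; substitute the paper you want a thread for.
    print(paper_metadata("10.xxxx/placeholder"))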

I would love to have something like this. However I think that you may be overestimating the willingness of people to contribute high quality discussion about a paper. The people best able to offer expert critiques are the same people who have the least time to do so. The fact that the peer review system regularly manages to do so is something of a miracle, but that's because we've managed to get prestige and career advancement tied to membership on program committees and editorial boards.

I think for something like this to work you'd need to think carefully about how to compensate reviewers. And I don't think that compensating them with money will work that well, since most academics care much more about career advancement and prestige than money (otherwise we'd be in industry...)


PubPeer does something along these lines already. The bigger problem is that there's not really any incentive to participate in this, beyond blowing off some steam and a vague sense of "service to the field."


Most important reason: a lot of potential reviewers wouldn't like that model for a variety of reasons.

Reviewing, generally, isn't prestigious or well-rewarded. So potential reviewers need to be kept as happy as possible. Internet forums have a lot of frustrating problems that IRL conversations don't have, especially for people who wouldn't otherwise use an internet forum.

You have to remember that the people who enjoy participating in written conversations online -- even young people who grew up on the internet -- are still a small fraction of the population.


f) Cost of maintaining the discussion site. First, the computing cost. Second, the time and labor of the moderators. Even with for-profit journals, much of the labour is free. It's based on the pay-it-forward model: "I'll constructively review your paper for free, hoping you'll do the same for mine and others' papers." But still, every publisher has a paid managing editor to supervise the overall process.


There's a psychology journal[1] dedicated to only publishing null-hypothesis results.

[1] http://www.jasnh.com/about.html


A journal dedicated to negative results is IMO not the solution here. Academics must publish in reputable, prestigious journals in order to advance their career (a publication in a non-prestigious journal basically counts for nothing). A journal dedicated to negative results is never going to be a prestigious, impactful, or widely-disseminated venue.


But that's a self-fulfilling prophecy. If more people were publishing their null results in such journals, one of them would emerge as the most prestigious null-result journal. It seems like an acceptable solution, especially if the alternative is not publishing at all and increasing the positive-result bias.


Being the most prestigious null-result journal doesn't rise to being a prestigious journal, though.

Regardless, this is the reality today. I don't think trying to boost a null-result journal to the level of Nature or Science is a better way forwards than pressuring Nature or Science (and similar caliber journals) to publish more null result works.


As far as career advancement goes, I don't work in academia, but I would be more confident walking into an interview with a list of difficult and impactful experiments finding no effects than having even one highly cited paper in a prestigious journal that was later debunked by a couple of grad students in a null-result journal. The latter prospect should also frighten journal publishers into taking null results more seriously (especially if they published the earlier work), but only if the threat is real.


Most assessment processes in academia are purely administrative processes where they count the number of cookies that you have earned. They don't take the actual quality of the work into account.

In my country, it goes as far as having "objective" point scales like: publication in ISI JCR-indexed journals, 1 point for Q4, 2 points for Q3, 3 points for Q2, 5 points for Q1. Publication in non-indexed journal, 0.1 points.

No one ever looks into whether the paper is good or crap, or whether the author has managed to submit almost the same paper to five journals. In fact, I know cases of honest people who tried to look into that kind of thing, but they weren't allowed to because it was against the published "objective" scale.

I don't think the system is equally rotten in every country/institution; my country is probably among the most extreme places in this respect. But this kind of assessment is the most common AFAICT, as evidenced by how often the impact-factor cult is denounced by international scientific societies.


This does not reflect hiring practices at reputable US research universities.


Neither of the candidates you describe will even get a faculty interview.

Reply to below: I think it's valid. The first researcher has not demonstrated they will be leader/pioneer/inventor/theorizer/discoverer of new research findings, if that's all they've done - this is what reputable research universities are looking for. The second researcher has, well, no findings at all.


Another problem that needs solving.

Edit: To clarify, I was also assuming some typical amount of other research credentials, as well. From Al-Khwarizmi's reply it sounds like the second candidate is likely to be the only one interviewed, which rules out the possibility of hiring the first candidate, the superior experimentalist.


If they have a good research portfolio otherwise, then the first candidate's chances are unaffected, and the second candidate's chances at an interview are probably reduced.

The first candidate's replication studies won't really count for anything (maybe a slight boost). The second candidate's bogus studies will count negatively.

The process he describes is not reflective of US research university faculty hiring practices - the faculty will read your papers, they will solicit expert opinions from other leading researchers in the same research areas as you.


Chances of those grad students being correct are slim.

Since you were published in this hugely impactful prestigious journal that thousands of people have read, and are debunked by those grads in a journal that nobody reads, no-one is even aware that it happened.


Why is there no prestige associated with getting rid of rubbish?

As a programmer the best feeling in the world is deleting code, reducing the cognitive load of the overall system without sacrificing it effectiveness.

Does the academic world have the concept of technical debt?


One flaw in your analogy is that if all you do is actually just delete code (you don't replace the code), that's not usually valuable.

It is easier to publish null results than it is to publish legitimate positive results (of course easiest of all is to publish bogus positive results). It doesn't require creativity or brilliance to publish null results, just competence (competence enough to replicate).

Of course, it is still valuable, so a thoroughly analyzed and argued negative result should be publishable in prestigious venues - but this is more a reflection of the value to the community than the actual novelty/brilliance of the work.

Put another way: a university is not going to hire you if all you've published are null results. They want leaders who are going to pioneer/discover/invent/theorize new fields.


The Catholic church for centuries now has employed an advocatus Diaboli, someone whose entire job it is to examine claims of miracles and make the argument that in fact they are natural phenomena. They are priests who have been trained in science (there's a perhaps surprising number of those).

I think this is a good thing. There should be somebody whose job it is to poke holes in the castles we build.


> there's a perhaps surprising number of those

The Vatican still runs an astronomical observatory that researches trans-neptunian objects, exoplanets and so on: http://www.vaticanobservatory.va/


Similarly, an astronomer friend of mine trains Buddhist monks in physical astronomy.

These are almost certainly good things.


Yes. Those are peer reviewers. And people still publish negative result papers.


There is also the Journal of Universal Rejection: http://www.universalrejection.org/


This has made my day; thank you.


So broken. I'm not involved in academia, so the most I can contribute are upvotes here and there, and giving respect to those who push against the current.


There needs to be a journal called "Failures" that focuses exclusively on failed experiments and unreproducible results. I'm far more interested in learning why something may have gone wrong.

Maybe it's just me but I feel it would surprise many people with its popularity.


See bandrami's comment

https://news.ycombinator.com/item?id=11999413

as well as this:

> The biotech company Amgen Inc. and prominent biochemist Bruce Alberts have created a new online journal that aims to lift the curtain on often hidden results in biomedicine: failed efforts to confirm other groups’ published papers. Amgen is seeding the publication with reports on its own futile attempts to replicate three studies in diabetes and neurodegenerative disease and hopes other companies will follow suit.

> The contradictory results—along with successful confirmations—will be published by F1000Research, an open-access, online-only publisher. Its new “Preclinical Reproducibility and Robustness channel,” launched today, will allow both companies and academic scientists to share their replications so that others will be less likely to waste time following up on flawed findings, says Sasha Kamb, senior vice president for research at Amgen in Thousand Oaks, California.

http://www.sciencemag.org/news/2016/02/if-you-fail-reproduce...


I'm not sure this would work as a journal.

Who would read this? I skim Nature, Science, and some journals in my particular subfield every week/month because I am sure I will find something relevant to my research or generally interesting. With the exception of some hilarious or spectacular ones, I don't think Failures would be so engaging, so the only way I would find an article in Failures is if I specifically set out to search for it.

I might do that (though searching by technique is hard), but even if I found something, read it, and decided it was helpful, what would I do with it? For better or worse, citations are the currency of academia and academic publishing. Where would I cite a paper that specifically warned me away from a possible project? "Dear Nature, I was going to study XYZ via ABC, but [1] indicates that's a dead end."

So, with no regular readers and no citations, Failures becomes more like a database than a journal. There currently aren't any good incentives to submit things to databases, so Failures becomes a....failure.


"There has been some promising research with alternative technique X, but that has shown to be unreliable. [1] Because of this and [other factors], we have decided on technique Y."


Academia is broken. If we'd had the internet before we got around to creating academic science, our system of disseminating knowledge would look nothing like the journal/peer review system and probably a lot more like Github.


Seas of severely unreviewed work, much of it containing dangerous flaws -- distributed on apparently equal footing with carefully reviewed work?


Add a reputation system and you are good to go (github does that).


1. There are many places this could have been published without an importance review, eg PLoS ONE.

2. I think anyone interested in the replication problem needs to read this piece [1] by Peter Walter. As he put it: "It is much harder to replicate than to declare failure."

[1] http://www.ascb.org/on-reproducibility-and-clocks/


Alternately, it would have worked really well as a "Registered Report."

Cortex now offers a format where the paper is submitted in two stages. One first submits the introduction/background, methods, and proposed analysis, before collecting data. If the reviews are favorable, the paper is accepted, regardless of how the results turn out. There is a second round of reviews for the final manuscript to assess the quality of the data and ensure that the pre-determined analysis plan was followed. More here: http://cdn.elsevier.com/promis_misc/PROMIS%20pub_idt_CORTEX%...

I've had a hard time selling collaborators on it ("TWO ROUNDS OF REVIEW?! Isn't one slow and painful enough?!"), but I really like this idea. It rewards interesting hypotheses and good experimental design, but avoids the ask-a-stupid-question-get-a-stupid-answer issue with null results.


Unfortunately, the importance of the journal you publish in is decisive for whether or not you get a job in academia. It's fucked up but people like the author of the blog post are not in the position to change that.


Re: 1. As stated in the concluding paragraph, she put it on researchgate.


She's talking about point 1) when she talks about open access journals (plos one is an open access journal).


Seems to me this issue is getting to the point where it could become an existential threat to the credibility of science in general. Note how climate-change deniers have recently used these sorts of arguments to challenge the consensus - is it really so far-fetched to argue that perhaps climate scientists are as biased as researchers in areas such as medicine and linguistics?

The paywalled, blind peer-review process seems broken beyond repair. There needs to be a better, more robust method to publish every relevant study that is not utter crankery, and to get some sort of crowd-sourced consensus from researchers with credible reputations.


> is it really so far-fetched to argue that perhaps climate scientists are as biased as researchers in areas such as medicine and linguistics?

No, it isn't, unless you believe that climate scientists are a definitionally more moral group than others. The incentives remain the same, in all branches of science, so manipulation is likely the same in all branches.


Physicists have a better understanding of statistics and better data to work with than psychologists. There is a very clear causal explanation for why greenhouse gases do what they do. With that said, there is not enough discussion about the cost and effectiveness of proposed solutions to global warming - a lot of the "solutions" are more feel-good measures, which are not worth the cost for what they do. Global warming is a global problem - if the US stopped all of its contributions to climate change, the earth would keep warming up.

As a physics grad student I have seen plenty of "negative" results published, and the standard for positive results, for instance 3 or even 5 sigma, is much tighter than p < 0.05. Science is a big field; the problems in one domain do not necessarily translate into all domains. However, there are other fundamental problems with physics as a field.
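(For a sense of the gap between those thresholds, here's a quick conversion of sigma levels to p-values under a normal approximation, using the one-sided tail probability that particle physics conventionally quotes; a p < 0.05 cut corresponds to roughly 1.6-2 sigma, depending on sidedness.)

    from scipy import stats

    # Upper-tail probability of a standard normal at n sigma: the p-value
    # implied by an "n sigma" detection under the usual one-sided convention.
    for sigma in (2, 3, 5):
        p = stats.norm.sf(sigma)
        print(f"{sigma} sigma  ->  p ~ {p:.2e}")
    # approx: 2 sigma -> 2.3e-02, 3 sigma -> 1.3e-03, 5 sigma -> 2.9e-07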


The question in climate change is not direction, but magnitude and variance. While most of physics does use 3-5 sigma tests due to a large number of experiments, climate science can not do the same.

Furthermore, we have significant evidence that climate scientists are motivated to protect their work against critics - just read the climategate emails. That's not how science should be done, and goes well beyond the replicability issues discussed here.


At the risk of further derailing this thread, there was nothing wrong with the content of the climategate emails, other than poor choice of words - a poor choice of words that were all too easy to take out of context.


Someone said something along the lines of "I'm not giving you the data because you'll try to poke holes in it" as a FOIA request answer.

How is this OK?


That doesn't relate to the actual content of the emails or data. Although I agree that they dealt with the FOI requests poorly.


Really? There is 'literally nothing wrong' with stonewalling FOI requests, stacking peer review to prevent opposing scientists from publishing, or splicing temperature data onto the end of a proxy chart to hide the fact the proxy is diverging?

Ok then.


>There is a very clear causal explanation for why greenhouse gases do what they do

I'm sorry, but it doesn't work that way. We have very clear chemistry/physics on the composition of food, how it is digested, etc. But nutrition science is still an embarrassing shitshow.

We have clear physics on how neurons fire, etc. but neuroscience is still in its infancy and psychology... well, it is psychology.

Earth's climate is a large scale multi-variable control system with thousands of feedback loops. A good understanding of the physics that drive a single forcing doesn't really tell us shit I'm afraid. It is just as open to manipulation as these other fields.


This sentiment is not correct - causal explanations do matter. The statistical evidence for global warming is not particularly strong, as temperature data is very noisy. If we didn't have a causal explanation there would not be a scientific consensus behind climate change. That adding greenhouse gases to the atmosphere changes the climate is more like gravity - we have clear physics on why things fall when you drop them, and if we somehow were adding mass to the center of the earth we know what effect this would have.

More generally you are correct, we don't know exactly how much the earth will warm or if there are complicated feedback mechanisms in place that could cause this warming to speed up or reverse course. We can't even reliably say that next year will be hotter than this year (actually it probably will be cooler because this year has been unusually hot).


Indeed, but it is really only the system response we are interested in. As you admit, a clear causal explanation for how a single input forcing is increasing doesn't automatically get us to a position where we can predict the overall system response. Or predict whether a given system response will have some positive or negative second order effect.

We have a clear causal physical explanation that eating fat should make you fatter right? Are you willing to say that? Or would you qualify yourself - 'eating more fat will make you fatter in the absence of negative feedbacks (perhaps the fat reduces your appetite more than carbs or protein?) and assuming that all other inputs (exercise level etc) remain constant'.


Really, all we have is how well a model predicts the future. There's a fairly sucky model from the '80s [1]. Maybe it's totally wrong. Maybe it's just overfitting. But it's got a good track record, if underpredictive. I'm all for the idea that CO2 doesn't affect temperature; I just haven't seen any models that show that. Furthermore, I haven't seen anything with any sort of a track record. I'm pretty sure Hansen's paper is pretty much the root of all modern climate modeling. I'm totally willing to stipulate that the whole field is wildly off track, but there's no evidence of that. The only thing we have is 35-year-old models that seem to work.

I mean, more specifically, aether was flat out wrong. It still had predictive power. Someday we'll have something more refined, or perhaps completely refuted.

[1] http://www.realclimate.org/index.php/archives/2012/04/evalua...


Derailing the thread further, I don't understand why people say "aether was flat out wrong". Is it because they don't know it was just a name change, and we're now calling it "vacuum"? [1]

[1] https://en.wikipedia.org/wiki/Aether_theories

See, for example:

> The modern concept of the vacuum of space, confirmed every day by experiment, is a relativistic ether. But we do not call it this because it is taboo.


Interesting.

Compare the chart in the linked article with Hansen's own assessment that he put out in 2005 (third page): http://www.columbia.edu/~jeh1/2005/Crichton_20050927.pdf

How do you reconcile those two? A lot seems to come down to centering decisions?

Scenario A I believe is closest to the emissions path we are on.


My guess is, the pink band is the current (er, 2011) version of what the global average temperature was. Which makes the 60's even cooler (heh). As with all things, it depends on how you measure.

global mean global giss is here - http://data.giss.nasa.gov/gistemp/graphs_v3/

So anyway, I'd agree the centering of zero sure could change things, but there sure seems to be predictive power in the rates of change. Like, temperature isn't stable, it's not going down; it's going up, and it looks like it's going up at about that rate.

edit

But that sort of goes back to the original point: all we really have is models. You held up physics as an example, but if we look at something like the gravitational constant, things are really screwy. As far as I can tell, three really first-rate teams came up with three different answers - not even overlapping in the error bands. Big G is obviously helpful, but it must be more complicated than we understand right now.

I dunno. I kind of like the European model for chemical handling. Super toxic chemicals in tiny quantities aren't that big of a deal - maybe 50 people die - so it's not heavily regulated. Mildly toxic chemicals in large quantities, same deal: maybe 50 people die. Large quantities of toxins are regulated in proportion to the risk. The point is, balancing the risk against the best current understanding really seems like the best that can be done.


Sure, but a pretty high percentage of the US believes anthropogenic climate change is not real. They think it's either a misinterpretation, or an explicit hoax. Most of the Republican party denies climate change, or at least denies that human activity is playing a significant role in it.

We can't have reasonable debates on cost-effectiveness when so many people deny the fact that there is even a problem at all.


This is orthogonal to my point, which is that these incentives are present in all branches of science, and are likely impacting what is published similarly.


I'd be interested in reading any thoughts you have to share regarding other fundamental issues in the field of physics.


I can only speak to particle physics, but the main issue is that we can't study the truly interesting problems. Theories such as string theory are basically beyond the reach of experiments. Our current models work very well to describe the universe, but we haven't made that much serious fundamental progress since the 1970s. It takes decades to find new particles which we know must exist (top quark in 1995, Higgs in 2012). This will probably become even more true over the next few years as the LHC collects data at 13 TeV, although of course I could be wrong and something could be found. To study new physics you have to go up an order of magnitude in energy or luminosity, and this scales worse than linearly with cost, so it isn't feasible. Of course it is possible that new technologies emerge, but this isn't a sure thing.

The other problem with physics is that it is really hard to become a professor, and the field forces 90% of bright, dedicated and talented people to go into industry because there is a lack of jobs in physics. We really need permanent positions at labs outside of academia.


>Physicists have a better understanding of statistics and better data to work with than pyschologists.

They certainly have better data to work with in most cases, but what makes you think that physicists understand statistics better than psychologists? Is this just the physics superiority complex?


No. Physicists just take a lot more math classes (including statistics and data analysis) during their undergrad/graduate studies than biologists or psychologists do.


Physics PhDs will have taken more math courses than Psychology PhDs on average, but I am pretty skeptical that they have taken more statistics courses. I would like to see the evidence for that if you have it.


Sorry, my point was simply that the incentives discussed in the post aren't unique to the field. They play a role in all academic literature, the best way to deal with it is to be consciously aware and take advantage of what the internet has allowed in terms of non-traditional distribution.


Another point that is briefly touched upon is the availability of data. What good is a study I can't verify? I wish there was a tag that could be applied to published papers along the lines of "the original data is not available or the author is unwilling to share it, treat this very septically". If you ever feel like having a sad couple of weeks, pick 50 empirical papers and send the authors emails along the lines of "I just read your paper X, very interesting. Would you kindly provide me with access to the data so that I can Y".

Call me naive, but reproducibility is one of the hallmarks of a good scientific publication. I'm also fairly skeptical of studies where the questionnaires used are not openly available... the list goes on. (I'd also argue that closed-source software should be avoided, because "we plugged our data into the uncheckable algorithm X and got result Y, which we interpreted according to what the authors of the software say we should interpret it as" isn't great either.)


s/septically/sceptically/ (probably :-))

And preparing data for publication is lots of work [1] that unfortunately doesn't 'pay' in today's scientific culture. It even runs the risk of others 'mining' publications from your data before you have the chance to do so.

The best reply to questions you can expect is "that's a nice follow-up idea. Let's work on it together". Risk there is that the scientists may be working in some other direction that they find more promising.

[1] for example, you have 100 sets of data, but only used 78 in _this_ publication for various reasons (chemical used was applied at incorrect time, for too long, was out of date, subject was too old, had an existing condition, etc). You explain that in the publication in a small paragraph. In your dataset, you probably did it by copying the dataset and removing the ones not suitable for the task at hand. Did you keep that selection after the paper was published? Probably not. Even if you did, sending the selected data set out isn't sufficient. Others might want to see all data, so that they can verify the correctness of your selection.

Source control systems could help here, but as I said, providing data unfortunately career-wise doesn't pay at the moment for scientists.


> and get some sort of crowd-sourced consensus from researchers with credible reputations.

Would be nice, but where would the money come from? Even if you live in a country with high taxes that fund ridiculous projects, they are still often accountable to the public. Would the public want to fund new research or try to replicate or disprove other studies? The only way to make it work would be to build-in a certain percentage of funding for such a purpose in reaction to some gross oversight, so you could get the public behind it, or perhaps get some wealthy donors behind the cause, but I think you'll have a difficult time.

The better option might be having research students be required to replicate at least one study that hasn't been replicated in addition to doing their own research. That way you get some free labor!


Depends. I would like to think that "the public" would notice all those silly cancer patients showing up to clinic (don't they know that cancer is cured twice a week at every major hype factory, err, university?) but, no.

Pie in the sky it is, then! If only a field existed that attempted to quantify the expected variation from a given size and design of experiment.

That field would be of great use to major nation states. So much so, you might call it statistics. The practitioners would probably be big party poopers, always referring to weird concepts like "regression to the mean", "power", "effect size" and other unexciting yet critical details. I imagine there would be funny videos on YouTube about these people, and "real" researchers would happily ignore them while they jerk each other off in P01 study sections.

Anyways, let's keep science the way it is. There's no way the process or its results could possibly be improved!


> Anyways, let's keep science the way it is. There's no way the process or its results could possibly be improved!

Let it be known that this is sarcasm, for those who have trouble determining that.

Note: I was being completely serious about having graduate students being required to replicate studies that have not been replicated, or to disprove them. That was not sarcasm.


Graduate students are effectively required to replicate studies already. They just don't get any credit for the work when it turns out that the flashy paper didn't replicate, and the journal editors hate to look foolish, so it's exceedingly rare for failures to replicate to see a journal. So the next graduate student to work on the technique also gets to waste months or years on it.

Really a splendid system, isn't it? Sometimes (rarely) I sympathize with industry types who want grant claw-back processes. Then I remember that those are the same crooks that pushed Vioxx and 510k "equivalent" medical devices. And the sympathy evaporates, because they're even worse.


> There needs to be a better, robust method to publish every relevant study that is not utter crank, and get some sort of crowd-sourced consensus from researchers with credible reputations.

Isn't that the Internet?

I guess the problem on the Internet is that you also get every irrelevant study that is utter crank, and a crowd-sourced consensus from laypeople with no credibility whatsoever.


Peer review is boosting with three weak learners. If you think that has much credibility (after STAP, arsenic life, LaCour, etc) you clearly haven't been paying attention.

Nb. I review for various journals, but the process is far from foolproof. I do the best I can, but editors are free to override us in the interest of "impact". At least with preprints and PubMedCommons anyone with a cogent rebuttal can present it. Cell Press must hate that...


Curious question -- when you or someone else does the peer review, is it standard practice to actually double-check the math behind the paper, or audit any of the data?


No. At least not in neuroscience/psychology.

I've never seen a review request that was accompanied by raw data--you typically get an unformatted version of the manuscript, along with the tables and figures. That's it.

The reviewers can comment on anything, but they tend to be pretty conceptual. For example, one might say that the manuscript claims X, but the authors need to rule out competing hypothesis Y. Good reviewers will suggest ways to do that (e.g., by doing a specific control experiment or citing a paper that rules Y out). They might ask you to comment on why your data claims A when some previously published work indicates !A.

To the extent that statistics get reviewed (not enough), reviewers typically comment whether they think the methods are suitable for the specific application or whether their output is being interpreted correctly. However, it's exceedingly unlikely that someone will actually check whether a t-test on the data in Table #2 actually gives the p-value printed at the bottom (or whatever).
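(For what it's worth, the mechanical part of such a check is trivial once the raw data are in hand; a sketch of re-running a reported two-sample t-test on hypothetical group data:)

    from scipy import stats

    # Hypothetical raw measurements for two groups, as they might sit
    # behind a "Table #2" in a manuscript.
    group_a = [2.1, 2.4, 1.9, 2.8, 2.2, 2.6, 2.0, 2.5]
    group_b = [1.7, 1.9, 1.6, 2.1, 1.8, 2.0, 1.5, 1.9]

    # Re-run the reported test and compare against the printed statistics.
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

The point is not that the arithmetic is hard; it's that reviewers almost never receive the raw data needed to run even this two-line check.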


Agreed.

I see it as both a cultural thing as well as practical thing (perhaps what initially lead to it becoming cultural). The reason for the second part is because most analyses are not easily reproducible--they exist as a smattering of data files and analysis scripts that coalesce into a magical, undocumented pipeline that spits out statistics and figures. Reproducibility has received more attention lately but it's still an uphill battle against academics' frantic publishing schedule, lack of familiarity with their software, and general lack of incentive for others to replicate the results (for some, a reverse incentive exists).


Thanks -- that is interesting to hear.


For me, yes. For others, no idea.

I run code, I check derivations, and I usually sign my name.


Science is ultimately self-correcting, but that can take decades. Sometimes you have to wait for a distinguished asshole scientist to retire or die before it happens. For example, the Harvard geology department was the laughingstock of the world for taking so long to accept the theory of plate tectonics. But eventually it did, after rotating in a new generation of modern faculty.


Every few days I read that science is broken or losing credibility. And yet somehow the torrent of new technology, medicine and understanding of the world just keeps coming, and often makes my life better. So maybe not completely broken.

CRISPR, the Higgs boson, gravitational waves, deep networks, self-driving cars. Not broken.


It's not about technology. Bullshit science affects policies. Also you don't see what did not get discovered/invented because of people having wasted time trying to build on bad theories.

Also, nobody's saying it's completely broken.


Those are major advances that happen very, very rarely. The bulk of science is small incremental steps. It is extremely frustrating not being able to trust the results published even in high-impact journals. This is a particularly glaring problem for theorists/modelers. Each theory/model must rely on potentially hundreds of studies, and it's really hard to assign a confidence level to each of them.


CRISPR is a genuinely novel scientific advance. The other items in your list are not.

The Higgs mechanism was concocted half a century ago; gravitational waves are even older (general relativity is more than a century old). Their experimental confirmations were engineering feats.

Neural networks are from the 50s. Add fast computers and you get deep ones. Put them in cars and they learn to drive themselves.


It may take a toll on the social sciences and biology. It's not a big problem in physics or chemistry.


Oh, "Hollywood" chemistry is pretty fucking bad, but for different reasons. I miss being a real scientist sometimes.


This seems to be the original work in question: https://www.researchgate.net/publication/8098564_Reading_Acq...


There should be a failure-to-replicate journal. The standards committee should be all about rigor, so that just getting published there would be a demonstration of technique and ability, if not headlines.


Any journal that refuses, without proper reasoning, to publish a failure to replicate research it originally published should be closed down. That journal should have such a reputational black mark next to it that nobody would want to publish there, and anyone who already had should be at the door with pitchforks and torches for tarnishing the researchers' reputations.

If it was important enough to publish research saying "here's something", then it's important enough to publish properly done research showing "actually, it's probably nothing." By definition. Or it's not science, it's fking marketing, and the journal should be treated with the same scientific reverence we reserve for Pepsi-Cola advertisements from the 1990s.


I somewhat disagree. They should still have the right (and obligation) to vet the failure to replicate for quality. Without that, everyone and his dog would have publications in Nature and Science tomorrow.


I believe objections on the grounds of low quality would count as "proper reasoning."


Yes, but there is just no way to regulate journals other than by market forces - which seem to fail through information asymmetry about journal quality and maybe some monopoly-inducing positive feedback effects.

Maybe law could break the information asymmetry between sellers and buyers of the journals by adding new clauses to the copyrighted works of state-sponsored scientists...


How about if, say, Harvard, Stanford, MIT and Caltech made a pronouncement that from now on none of their academics should submit research for publication in journal X, on the grounds that journal X lacks commitment to being a scientific journal and it would diminish their reputations as academic institutions to have any association with it. Then cancelled their libraries' subscriptions to that journal in public, demanded a full refund, and generally made a fuss in the newspapers. The NYT would print the fuss.

I think that would change the dynamic of scientific journals pretty quickly. Pick one of the worst, get the evidence together, ring your opposite numbers at a couple of places with similar numbers of Nobel prize winners, and go right at the rubbish journals, swinging the reputation of being places where the world's best research is conducted as the wrecking ball.


Some of those reviews are good materials for http://shitmyreviewerssay.tumblr.com/


PLoS one specifically says they will publish "Studies reporting negative results".


Or save yourself the APC and put it on arXiv. Even Eisen (who was one of the founders) suggests this.


It's not peer reviewed, so it won't be found by researchers searching PubMed or whatever. That's the main point of publishing... so reviews will find the negative data.


PLoS one is an open access journal, which she talks about at the end.


I know. I was adding information.


Put it on arXiv or F1000 for fuck's sake. Who actually believes psychology papers anyway? The vast majority are fishing expeditions as best I can tell.

When the field starts enforcing minimal standards (as expected for, say, clinical trials, or even genetics studies nowadays) maybe someone will give a shit. Until then people like this guy who actually seek the truth will be ostracized.


As stated in the concluding paragraph, she put it on researchgate.


ugh, that site is the worst. At least f1000 has open review and ultra cheap APCs.


The government believes psychology research and actively legislates against it.


The field is pressured to adopt and enforce standards when practitioners speak out and get noticed by the establishment. Much of getting noticed, for better or worse, involves getting published in a traditional journal.


As opposed to actually doing science. This is the root of the problem. By the time I finished grad school I had been cited over 5000 times; that didn't make me a better scientist. Actually implementing clinical trials and supervising junior researchers, that did. (Oddly, so did reviewing others' work)

Traditional journals are an artifact of the past. They interfere with the proper assignment of priority for discovery and their opaque review processes give some truly terrible work the stamp of "peer review". I submit some work to them, as requested by my colleagues, but if it were all my decision I'd never again send a paper to a journal that doesn't allow preprints (or considers it "prior publication") because that process is truly ass backwards.

Fortunately NIH and NSF have started to notice the same. It's only a matter of time now.


arXiv is not a peer reviewed journal.


That is correct, and the author did make their research available online (on researchgate and OSF). The arXiv simply isn't the right preprint server for this, as its range of subjects doesn't cover the author's: "Open access to 1,160,864 e-prints in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics".

For all the hate that peer review gets on HN, it plays a significant role in science and it's important to have a quality screening for papers. Preprint servers and peer-reviewed conferences and journals work best in tandem. The way it works here is that when we submit something to a conference, we also submit a technical report to arXiv, which we update to incorporate the reviewers' feedback. The "complete" and preferred version is usually the one on arXiv (no length restriction, so you can actually explain and prove stuff). Conferences are much more than just publication venues, though: lots of collaborations start there, and interesting discussions can be had.


The entire replication chain is a statistics exercise. But whatever -- your point is sound; preprints should complement peer review. It's only Elsevier and a few others that try to prevent this. I guess we can hope that fat old professors die off and younger academics decline to edit for journals that impede progress. It works great in physics.


There should be a Nulled Science Magazine!




