
The authors of the study are all from the chemistry department at the University of Kansas. Is this really the sort of paper they should be authoring? https://www.cell.com/cell-reports-physical-science/fulltext/...

The methodology is terrible. The prompting was as simple as: "Can you produce a 300- to 400-word summary on this topic: INSERT TOPIC HERE", where some example topics are:

A surprising fossil vertebrate

Stem cells remember insults

I can't see how that prompt is going to produce anything comparable to the human text, which is based on Perspective articles in Science.
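For scale, reproducing their whole generation setup is a few lines against the OpenAI API. A minimal sketch, assuming the openai Python SDK (>= 1.0) and the gpt-3.5-turbo model id (the paper just says GPT-3.5):

    # Minimal sketch of the paper's generation setup; assumes the
    # openai Python SDK (>= 1.0) and gpt-3.5-turbo as the model id.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    topics = ["A surprising fossil vertebrate", "Stem cells remember insults"]

    for topic in topics:
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{
                "role": "user",
                "content": f"Can you produce a 300- to 400-word summary on this topic: {topic}",
            }],
        )
        print(resp.choices[0].message.content)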

And they don't report these numbers, but I can compute them from the tables:

Document false positive rate (human assigned as AI): 0%

Document false negative rate (AI assigned as human): 0%

Paragraph false positive rate (human assigned as AI): 14%

Paragraph false negative rate (AI assigned as human): 3%
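(If you want to check the arithmetic, the rates fall out of the confusion counts in their tables like this; a minimal sketch, with "positive" meaning "classified as AI":)

    # How the rates above follow from confusion counts, where
    # "positive" means "classified as AI". The actual counts come
    # from the paper's tables.
    def fpr(human_assigned_as_ai: int, human_total: int) -> float:
        """False positive rate: human text assigned as AI."""
        return human_assigned_as_ai / human_total

    def fnr(ai_assigned_as_human: int, ai_total: int) -> float:
        """False negative rate: AI text assigned as human."""
        return ai_assigned_as_human / ai_total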

In summary, though, this is a garbage-tier study, for entertainment only.




Also, their classifier just uses hand-crafted features rather than doing any sort of meaningful analysis. As in, the input to their model is a tuple of ~20 features consisting of things like whether the article contains the word "but" or the character "?".
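To make that concrete, the model's entire view of a document is something like this (a hypothetical sketch of that style of feature engineering, not their actual 20-feature list):

    # Hypothetical sketch of the style of hand-crafted features the
    # paper describes -- not their exact feature list.
    def extract_features(text: str) -> list[float]:
        words = text.lower().split()
        sentences = [s for s in text.split(".") if s.strip()]
        return [
            float("but" in words),                 # contains the word "but"
            float("?" in text),                    # contains a "?"
            float(";" in text),                    # contains a ";"
            len(words) / max(len(sentences), 1),   # mean sentence length
        ]

A vector like that tells you more about one journal's house style than about anything intrinsic to machine-generated text.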

All of this is easily defeated by using better prompts. It's 99% effective for a small set of articles from one journal (sampling not specified), and only assuming the synthetic samples are created with the minimum level of effort using GPT-3.5.

It's not that their methods are wrong; it's that they're useless for anything beyond an extremely narrow range of data. In the ML field this is called overfitting.
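As a concrete example of the "better prompts" point above, something like this (hypothetical wording, not from the paper) already targets exactly the surface features their classifier keys on:

    Write a 300- to 400-word summary of TOPIC in the style of a
    Perspective article in Science. Vary your sentence lengths, pose
    the occasional rhetorical question, and hedge claims the way a
    practicing scientist would.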


It may be outside their expertise as chemists, but it may be well within their domain as educators. If their students are turning in GPT-written papers, then it's their problem.

I don't know what kinds of papers their students write. I don't recall writing a lot of papers in chemistry class, but they may be teaching a different kind of class. I could imagine them assigning papers on "careers in chemistry" or "the history of some famous chemist" that students could cheat on with GPT.

Even if that's the case, their paper should probably be more specific to the chemistry domain. Though perhaps they're called on to teach General Science classes as well, which could be even more likely to generate essay assignments prone to cheating.


> Even if that's the case, their paper should probably be more specific to the chemistry domain.

Why?


Because it would be more directly inspired by real examples, and it would make their test set more believable. Otherwise, they're more likely to be making elementary mistakes about what constitutes a good test of machine-generated text in general.


So, in other words, perfect 100% sensitivity and specificity at the document level...

That's just not realistic. There are very few, if any, tests like this in any field.

Except, of course, in single papers with suspicious fundamentals ;)


> The authors of the study are all from the chemistry department at the University of Kansas. Is this really the sort of paper they should be authoring?

What are you saying here?


Are chemists typically good at writing AI tools? Is it relevant that they're professors interested in catching students cheating rather than adapting their teaching methods?





