I am a math professor with a scanned exam grading workflow that I hacked togethe...

Svetlitski · on Sept 18, 2021

You should consider looking into Gradescope (Gradescope.com). As a former TA, I can attest to it making grading much more pleasant and streamlined than it would be otherwise.

Evidlo · on Sept 18, 2021

> There are many evolutionary layers and no formal specification or verification;

There is a specification, but it's very complicated.

hyperpallium2 · on Sept 18, 2021

PDF uses postscript, which is Turing equivalent. It's a document format with the halting problem.

mkl · on Sept 18, 2021

PDF does not use Postscript, and is not Turing complete. Its drawing model is based on Postscript's (with additions), but its instruction set is focused on drawing, and can't do programming. Here is the instruction set, with equivalent Postscript commands where applicable: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PD.... Most parts of the Postscript language have no PDF equivalent.

Other things like JavaScript and Flash can be embedded, but they are extras on top of the document.

dunham · on Sept 18, 2021

It's based on postscript, like json is based on javascript. I don't believe there are any control flow (or even general arithmetic?) instructions in the content streams, so I don't see how it could be Turing complete. It's just a sequence of drawing and transformation commands, like SVG.

I presume the extensions to embed javascript are Turing complete, and IMO do not belong in a PDF file.

I've also heard that some of the embedded font formats have features that are Turing complete, but I don't know the details on that.

If there is a way to implement a Turing machine in PDF, outside of the fonts and javascript, I'd love to see the details. (I know somebody managed it with just the macro expansion bit of TeX.)

maxerickson · on Sept 18, 2021

Postscript files are often rendered into PDF, but PDF doesn't use postscript.

Syzygies · on Sept 18, 2021

See dunham's answer. Anyone who has seen a PDF file in text form would swear they're looking at Postscript. There's a common intersection that's identical.

Very roughly speaking (this is a semantic debate where everyone is wrong from someone else's perspective), a PDF file is a restricted subset of Postscript, with added indexes so one can render pages in the middle without having to process the code from the beginning.

The hardship in generating PDFs from scratch is getting those indexes right. It's far easier to convert a Postscript file using standard tools.

maxerickson · on Sept 19, 2021

That's fair. I was waffling about using stronger language, "pdf isn't postscript", but didn't get there. It would have been more correct.

kaba0 · on Sept 19, 2021

The same way as JSON is a subset of JS, it has radically different properties that way, so comparisons to postscript is not really meaningful.

einpoklum · on Sept 18, 2021

That is not a realistic concern for GP. The scanned exams won't even involve any Postscript.

Syzygies · on Sept 18, 2021

I do insert the score radio buttons using the "pdfmark" mechanism via Postscript.

selfhoster11 · on Sept 18, 2021

HTML uses JavaScript which is also Turing complete. Lack of the halting problem is nice, but having it is not a show stopper.

Tijdreiziger · on Sept 18, 2021

Perhaps you might be interested in the Zesje project: https://gitlab.kwant-project.org/zesje/zesje

mkl · on Sept 18, 2021

More information: https://sandbox.grading.quantumtinkerer.tudelft.nl/, https://zesje.tudelft.nl/about/

It sounds like it's still mostly a prototype?

Tijdreiziger · on Sept 19, 2021

Sorry, I should have elaborated in my initial comment.

I was briefly involved as a developer several years ago (as part of my bachelor's thesis). At that time, it was mostly beta-quality, but it was already in use by multiple professors for grading. I haven't been involved with the project since, so I'm not sure about the current status.

I think the homepage [1], which you linked to and where it mentions that it's still a prototype, is at least somewhat outdated; it has a screenshot of a very old version of the software. At least the 'support' section still looks accurate, though.

If you're interested in using it, I would advise getting in touch via the Mattermost channel or mailing list (both linked to from the homepage [1]) and asking about the current state of the project. Tell them Jamy sent you :)

[1] https://sandbox.grading.quantumtinkerer.tudelft.nl/

jhgb · on Sept 18, 2021

> Having grad students help grade paper is a consistency nightmare: It's look once, never look back.

Captcha-ize them, with several of them grading the same result, and with checking their responses against each other?

gettalong · on Sept 20, 2021

If you want to do something like this in Ruby, have a look at HexaPDF - https://hexapdf.gettalong.org/ - which provides a full-blown implementation for reading and writing PDFs and is quite mature already (n.b. I'm the author).

It is licensed AGPL+Commercial but if you just use it for yourself, this does not matter as you can use the AGPL.

strzibny · on Sept 18, 2021

Would be cool to know the differences to Ruby's HexaPDF. One is certainly the license.

senorsmile · on Sept 19, 2021

Would you be willing to list all of the command line tools you're using?