
Generating a custom PDF is just such a pain. You don't realize it until you've tried to do it.

I've seen that Anvil (a YC company) is building a product dedicated to this challenge. Check it out!



If I needed to generate PDFs, I would use ReportLab[1]. The core is open source and there's a great company behind it. The product appears easy and mature. What am I missing?

1. https://docs.reportlab.com/


I have PTSD from 20 years ago trying to convert OpenOffice documents to PDF with ReportLab. It was janky because every element had to be positioned in absolute coordinates on the virtual page.

Nowadays I'll go straight to https://weasyprint.org to produce PDFs in Python. That's what I used for https://scaleway.com invoices.
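A minimal sketch of that approach, assuming WeasyPrint is installed. The invoice fields and the helper names (`render_invoice_html`, `write_invoice_pdf`) are made up for illustration and are not the actual invoice code:

```python
from html import escape


def render_invoice_html(customer, lines):
    """Build a small HTML invoice; `lines` is a list of (label, amount_in_cents)."""
    rows = "".join(
        f"<tr><td>{escape(label)}</td><td>{cents / 100:.2f}</td></tr>"
        for label, cents in lines
    )
    total = sum(cents for _, cents in lines)
    return (
        "<html><body>"
        f"<h1>Invoice for {escape(customer)}</h1>"
        f"<table>{rows}</table>"
        f"<p>Total: {total / 100:.2f}</p>"
        "</body></html>"
    )


def write_invoice_pdf(customer, lines, path):
    # Imported lazily so the HTML part can be used without WeasyPrint installed.
    from weasyprint import HTML

    HTML(string=render_invoice_html(customer, lines)).write_pdf(path)
```

The layout lives entirely in HTML/CSS, so there are no absolute coordinates to manage.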


If you're OK with paying ~£0.05 per PDF, you can just use Adobe's PDF API: you supply a Word document containing {{ name }}-style variables plus a JSON file, and you get back a PDF with everything filled in. This way you can change the styling and layout with a word processor.


Yep, that’s a solution. But generating millions of invoices would cost way too much, I guess. It depends on your company size and what you want to achieve, though.
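Back-of-the-envelope, taking the ~£0.05 per PDF figure from the parent comment and assuming a flat price with no volume discount (that part is my assumption):

```python
PRICE_PENCE_PER_PDF = 5  # the ~£0.05 figure quoted above, as whole pence


def api_cost_gbp(pdf_count, price_pence=PRICE_PENCE_PER_PDF):
    """Total cost in pounds for `pdf_count` documents at a flat per-PDF price."""
    return pdf_count * price_pence / 100


print(api_cost_gbp(1_000_000))  # 50000.0
```

So roughly £50,000 per million invoices under that assumption.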


Yep, I should have added that to my previous answer as well: doing it at a reasonable cost at scale is not trivial!


> Generating a custom PDF is just such a pain. You don't realize it until you've tried to do it.

I never had any problem generating PDFs by using Ghostscript.


I wrote some C# code a couple of years ago to generate PDF invoices. It has been running fine ever since, generating tens of thousands of PDFs. It was pretty straightforward back then; the only annoying thing was page breaks when there were a lot of line items, but apart from that I would not consider it a "pain".

What issue have you had? What do you mean by "custom"?


The pain is that every time you want a PDF generation system, you have to do the same work. It's not "complex" at all, but it definitely takes time, whatever technology you use. In my experience, generating ten thousand PDFs at the same time put a very high load on our infrastructure and the cost was enormous. That is one of the main things I'll focus on today: scalability. It also depends on how you create your template: hard-coded with pixels, or HTML!
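One generic way to keep that kind of burst from flattening the infrastructure is to cap how many renders run at once. This is only a sketch of the idea, not what we run in production; `render_pdf` is a hypothetical stand-in for the real rendering call and the worker count is arbitrary:

```python
from concurrent.futures import ThreadPoolExecutor


def render_pdf(invoice_id):
    # Hypothetical stand-in for the real rendering call (HTML to PDF, etc.).
    return f"invoice-{invoice_id}.pdf"


def render_all(invoice_ids, max_workers=8):
    """Render every invoice, but never more than `max_workers` at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and queues the rest until a worker is free.
        return list(pool.map(render_pdf, invoice_ids))
```

Ten thousand requests then become a queue drained at a fixed rate instead of ten thousand simultaneous renders.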


You're right, generating a small number of PDFs is indeed easy per se.

What's complex is:

- Dynamic PDFs: you're trying to digitize a government process, for instance, by launching online forms with a variety of user paths for the answers and generating a PDF with the answers that still fits specific government requirements

- Doing it at scale


I just use headless chromium to "print" a generated HTML page as a PDF, works great.
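Roughly like this when driven from Python. The binary name is an assumption (it may be `chromium`, `chromium-browser` or `google-chrome` depending on the system), and the function names are just for illustration:

```python
import subprocess


def chromium_pdf_command(html_path, pdf_path, binary="chromium"):
    """Return the argv that asks headless Chromium to print a page to PDF."""
    return [
        binary,
        "--headless",
        f"--print-to-pdf={pdf_path}",
        html_path,
    ]


def html_to_pdf(html_path, pdf_path, binary="chromium"):
    # check=True raises CalledProcessError if Chromium exits with an error.
    subprocess.run(chromium_pdf_command(html_path, pdf_path, binary), check=True)
```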


My shot-scraper CLI tool has an option that can do exactly that: https://shot-scraper.datasette.io/en/stable/pdf.html

    shot-scraper pdf myfile.html -o output.pdf


AFAIK it uses wkhtmltopdf underneath. I saw a post suggesting the Chrome developers would prefer not to offer thin CLI support, presumably so that such use cases go to wkhtmltopdf directly.


It doesn't, and AFAIK they reversed that decision.


Lago actually uses Gotenberg. It's a very nice way to manage the headless Chromium state without any headache!


Did you face any implementation difficulties with your solution?


Not parent, but if you need headers and footers on each printed page you need to make use of tables.
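The trick, as far as I know, is that the print layout repeats a table's `<thead>` and `<tfoot>` on every page the table spans, so you wrap the whole document in one table. A rough sketch (the `paged_layout` helper is made up for illustration, and the arguments are assumed to be trusted HTML):

```python
def paged_layout(header, footer, body):
    """Wrap `body` in a table whose <thead>/<tfoot> act as page header/footer."""
    return (
        "<table>"
        f"<thead><tr><td>{header}</td></tr></thead>"
        f"<tfoot><tr><td>{footer}</td></tr></tfoot>"
        f"<tbody><tr><td>{body}</td></tr></tbody>"
        "</table>"
    )
```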


I recall using CUPS to print to a PDF in the past.



