
cloud8421 · on Aug 31, 2024

I’ve built a few hobby projects and have another one in the pipeline for when I have time. All of them are based on Raspberry Pi Zero kits from Pimoroni (shop.pimoroni.com). Sadly, some of the kits are out of stock and, according to their support, won’t be restocked.

I built:

- a clock with an internal task scheduler (e.g. send me an email digest every morning), based on https://shop.pimoroni.com/products/scroll-bot-pi-zero-w-proj...

- a lamp that follows the solar cycle for the given location (e.g. it turns purple at twilight), based on https://shop.pimoroni.com/products/mood-light-pi-zero-w-proj...

I’ve also got some impression (https://shop.pimoroni.com/search?q=impression&stock=true) kits that I want to program for a family photo tree kinda thing.

A summary of my experience:

- these are not complex projects; at worst they required porting some of the original Python examples provided by Pimoroni to interface with the hardware.

- deploying via ssh is fast and reliable

- I can quickly experiment by pasting a new version of a module into the device shell. I have an editor keybinding that does this by targeting a tmux pane.

- OTP constructs help structure code beyond the standard infinite loop that you have in other languages. And you can have pub-sub and state machines with just the standard library.

- it’s easy to abstract the hardware bits away using dependency injection, so that you can work on the host machine if needed.

- working with time and time zones is possible thanks to the ecosystem packages.

- any dependency with a native extension can be a problem if it’s not built with cross-compilation in mind, and if it crashes on the device, that’s where debugging becomes a bit harder (you might need a serial cable).

- if needed, I usually add a simple web UI for config/customization.

- there are great tools in the ecosystem to troubleshoot memory issues etc.

- sometimes I have WiFi issues and I suspect it’s related to power management, but I haven’t checked thoroughly yet.
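
The dependency-injection point in the list above can be sketched as follows. This is a minimal illustration written in Python rather than with the OTP tooling the comment refers to, and the names `Display`, `FakeDisplay`, and `Clock` are hypothetical; they do not come from the original projects or from Pimoroni's libraries.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Display(Protocol):
    """Anything that can show a line of text (hypothetical interface)."""

    def show(self, text: str) -> None: ...


@dataclass
class FakeDisplay:
    """In-memory stand-in used on the host machine and in tests."""

    frames: list[str] = field(default_factory=list)

    def show(self, text: str) -> None:
        # Record the text instead of driving real hardware.
        self.frames.append(text)


class Clock:
    """Application logic; it only knows about the Display interface."""

    def __init__(self, display: Display) -> None:
        self.display = display

    def tick(self, hour: int, minute: int) -> None:
        self.display.show(f"{hour:02d}:{minute:02d}")


# On the host, inject the fake; on the device, a class wrapping the
# real LED driver would be passed in instead.
display = FakeDisplay()
clock = Clock(display)
clock.tick(7, 5)
print(display.frames)
```

Because `Clock` never imports a hardware driver directly, the same logic can be exercised on a laptop and only the injected object changes on the device.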

cloud8421 · on March 18, 2022

You can try it at https://pspdfkit.com/pdf-sdk/web/ocr/, the OCR functionality is shared with our SDKs.

cloud8421 · on March 18, 2022

This is the list of supported languages: https://pspdfkit.com/api/pdf-ocr-api/#supported_languages

At the moment we don’t include Japanese and Korean, but I’ll make a note of your request.

Handwriting is definitely a different beast; it’s not supported.

stevenminhhh · on March 20, 2022

Thanks. I have been dealing with a ton of headaches in my projects, since modifying PDFs can be very problematic. Rather than stitching together multiple libraries, I would rather suggest one platform to handle everything.

Will definitely keep this in mind until it meets my requirements. Is there a mailing list I can sign up for?

cloud8421 · on March 17, 2022

We’ve done some tests in that area and while Chromium is technically able to generate tagged PDFs, which would be accessible for the most part, it’s far from perfect.

We have some work planned in that direction, but nothing close to release at this stage.

cloud8421 · on March 17, 2022

You’re touching on a few different points so I’ll try to cover everything.

- We do build on top of OSS (just not the programs you listed; see https://pspdfkit.com/legal/acknowledgements/processor-acknow... for a complete list). The layer we build is quite large though, and it would take many person-years to replicate in its entirety. It’s possible, though, that you don’t need all of that and a focused program that wraps other ones might do the trick for your use case.

- If you build a product based on our tech, you’re making a conscious decision about risk. I do think we’re gonna be in business in 10 years (we have solid revenue and last year we got backed by a large investor, Insight), and that we would version APIs and support you (not just during upgrades). Still, the reality is that it is indeed possible that we won’t be around anymore, like every other company on the planet. As a consumer, this is the reality for most of the things we buy nowadays. We do take deprecation seriously, as we sell SDKs, and I’m sure that if the company shut down you would have enough time to migrate.

- Depending on what you need to build, using our product may shortcut your development time by a large factor. It may not, if you just need to rotate pages of a PDF document and there’s a reliable OSS package that does that in your language of choice. It really depends on what you need to do.

- Even if you package everything with OSS, 10 years is long enough that it may no longer work and you may have to fork and rebuild it yourself. It’s a different type of risk, but still a risk. 10 years ago Docker had not even been released (it launched in 2013). Whether you build something on OSS or commercial software, you would wanna test things once a year to see if they still work and to keep up with security and bug fixes.

Ultimately, there are situations where the approach you described is sound: for example, I do my taxes in plain text accounting, using ledger and emacs. I generate the reporting via a couple of Ruby scripts. I do that exactly because I care about longevity: I do my taxes once a year, I don’t wanna spend time fixing the toolchain every time I have to do them. Yet every year I hit a couple of snags I have to fix, but I consider that acceptable.

cloud8421 · on March 17, 2022

What languages do you need support for?

nonameiguess · on March 17, 2022

How about none? Any service that can be passed text from an http request can be passed text via argv and called from the command line. The fact that a helper program runs on the same host rather than across a network doesn't mean you can only use it via direct function call. Imagine if the developers of pandoc didn't actually distribute pandoc and only allowed you to invoke a pandoc instance running on their servers remotely.
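
As a minimal sketch of that point, the same function can sit behind an argv-based entry point just as easily as behind an HTTP handler. The `shout` function and the `shout.py` program name are hypothetical placeholders for whatever work the service actually does.

```python
import sys


def shout(text: str) -> str:
    """Hypothetical stand-in for the real work (e.g. a document conversion)."""
    return text.upper()


def main(argv: list[str]) -> int:
    # argv[0] is the program name; everything after it is treated as input text.
    if len(argv) < 2:
        print("usage: shout.py TEXT...", file=sys.stderr)
        return 2
    print(shout(" ".join(argv[1:])))
    return 0


# A real script would call main(sys.argv); a fixed argument list is used
# here so the example behaves the same wherever it is run.
exit_code = main(["shout.py", "hello", "world"])
```

An HTTP wrapper would call `shout` with the request body in the same way, which is why the choice of transport does not have to dictate how the helper is distributed.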

cloud8421 · on March 17, 2022

If you have a sample HTML you wanna try, you can use https://pspdfkit.com/pdf-sdk/web/pdf-generation/ and paste HTML there - the generation engine is virtually the same.

cloud8421 · on March 17, 2022

Very valid concern around privacy. We don't store the documents (see https://pspdfkit.com/api/privacy/), but for people that have sensitive documents to process, we offer an on-prem product, see https://pspdfkit.com/api/documentation/deployment-options/. You can run it in your own infra and it doesn't report any telemetry to us, so information remains completely private.

cloud8421 · on March 17, 2022

We're based on PDFium, but there's a lot more going on than just that - see https://pspdfkit.com/blog/2019/contributing-to-pdfium/ for an overview.

martin_a · on March 17, 2022

So, no compliance with printing industry standards. That's a pity.

muhehe · on March 17, 2022

I don't know much about the printing industry and its compatibility requirements. Would you mind elaborating a bit on this (I occasionally produce PDF output, so I'd like to avoid basic mistakes)?

martin_a · on March 18, 2022

What you would typically be looking at is compliance with the PDF/X standards [1] at various levels, which are basically ISO standards for PDFs.

Files for print production need to have their fonts embedded, color profiles attached to (or at least tagged on) images, transparency dealt with, and lots of other things that ensure the PDF itself contains all the information necessary for successful reproduction on a printing machine of any kind.

As print production systems have evolved, the rules have become "less strict", since all (or most) systems can now handle transparency natively, for example. That was a big change with PDF/X-4; before that, you had to convert all transparencies and factor them into the underlying elements (the keyword is "transparency reduction", also known as transparency flattening).

Most PDF generators out there are not able to follow the rules of the ISO PDF/X specifications, so print shops might have a hard time handling that data because various pieces of information are missing.

That's normally no big deal for your office printer, but when you are looking at large(r) printing operations, it surely is.

[1] https://en.wikipedia.org/wiki/PDF/X

muhehe · on March 18, 2022

Thanks!

> Most PDF generators out there are not able ...

Do you know of any compatible?

martin_a · on March 21, 2022

I'm mainly working with two systems.

The first is PDFlib [1], which can easily be accessed via Java and PHP. As a web guy, I'm using PHP, obviously. There's a learning curve to it, and you have to take care of lots of stuff yourself, but the results are pretty good.

The second is the set of products from callas, mainly pdfToolbox [2] and pdfChip [3], which are kind of the de facto standard for the printing industry, at least in my Western Europe bubble.

pdfChip is based around the WebKit rendering engine, so you can work with HTML + CSS and convert your document to a PDF file. The pdfChip internals will take care of PDF/X compliance, if you want to.

pdfToolbox and pdfChip both have a steep learning curve, too, but you'll probably find that with any software that is highly specialized.

[1] https://www.pdflib.com/
[2] https://www.callassoftware.com/en/products/pdftoolbox
[3] https://www.callassoftware.com/en/products/pdfchip

cloud8421 · on March 17, 2022

No, just number of created documents.