How is Zulip worse? 700K lines of high-quality code is something to be proud of. It's backward compatible and has a very powerful threading system; Slack has nothing that compares to that.
That's a very narrow spec of how a company can fit into code. Try integrating an ERP system in a non-tech business. You will see how resistant people are to using software that streamlines their business and needs customization to fit their operating procedures, or how they resist it outright because it would make things a lot easier and make them irrelevant.
Even after a successful integration, all they would use the ERP system for is signing in, chatting, and producing invoices. The rest would still be done manually, in Excel files if you're lucky.
They are Han Chinese. Most of those triad families came from the Yunnan border a few decades ago. In Myanmar, thanks to the military junta, they can easily buy citizenship by bribing the authorities. One of them is even a Senate MP in a military-aligned party.
Sounds too much like an advertisement.
Also, we need to watch out when diving into Polars. Polars is a VC-backed open-source project with a cloud offering, which may become an open-core project, and we know how those go.
They get forked and stay open source? At least this is what happens to all the popular ones. You can't really un-open-source a project if users want to keep it open-source.
To be fair, as someone who's fought pandas for many years I agree with basically everything they said. The API design for Polars is much, much more intuitive. It's a base R to dplyr level change.
I fought with Tesseract for quite a while. It's good if high accuracy doesn't matter. For transcribing a book from clean, consistent, non-skewed data it's fine, and an LLM might even be able to clean up the output. But for legal or accounting data from hand-scanned documents, the error rate made it untenable. Even clean, scanned documents of the same category have all sorts of density and skew anomalies that get misinterpreted. You'll pull your hair out trying to account for edge cases and never get the results you need, even with numerous adjustments and model retraining on errors.
Flash 2.5 or 3 with thinking gave the best results.
Thanks. I was surprised that Tesseract recognized poorly scanned magazines, and with some Python library I was able to transcribe a two-column layout with almost no errors.
Tesseract is a cheap solution as it doesn’t touch any LLM.
For invoices, Gemini Flash is really good, for sure, and you receive “sorted” data as well. So definitely thumbs up. I use it for transcribing difficult magazine layouts.
I think that for such legally sensitive use cases, since companies don't like to share financial data with Google, it is better to use a local model.
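For the two-column case, I never named the library, so here is only a rough stdlib sketch of the general idea, not what that library actually does. It assumes you already have each recognized word with the top-left corner of its bounding box (the kind of position data an OCR engine can report), and that the gutter sits at the middle of the page. `Word`, `two_column_text`, and `line_tolerance` are hypothetical names made up for this illustration.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """One recognized word plus the top-left corner of its box, in pixels."""
    text: str
    left: int
    top: int


def two_column_text(words: list[Word], page_width: int, line_tolerance: int = 10) -> str:
    """Return text in reading order: the whole left column, then the right one.

    Assumption: the column gutter is at page_width / 2. Real scans may need
    a detected gutter instead of a fixed midpoint.
    """
    middle = page_width / 2
    columns: tuple[list[Word], list[Word]] = ([], [])
    for word in words:
        columns[0 if word.left < middle else 1].append(word)

    lines: list[str] = []
    for column in columns:
        # Walk the column top to bottom, starting a new line whenever a word
        # sits more than line_tolerance pixels below the current line's first word.
        current: list[Word] = []
        for word in sorted(column, key=lambda w: (w.top, w.left)):
            if current and word.top - current[0].top > line_tolerance:
                lines.append(" ".join(w.text for w in sorted(current, key=lambda w: w.left)))
                current = []
            current.append(word)
        if current:
            lines.append(" ".join(w.text for w in sorted(current, key=lambda w: w.left)))
    return "\n".join(lines)
```

Skewed scans break the "same line" check pretty quickly, so in practice you would deskew first or loosen `line_tolerance`.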