Hacker News | dexcs's comments

In German we call that a Luftkurort. Google it.


Interesting topic but a paywalled article.


Thanks, I didn't know them.


BTW: This is the first online store for notebooks I've seen that isn't filled with tons of JavaScript, overlays and all that stuff. That alone is a reason to buy one from them.


I just ordered a complete Raspberry Pi set to grab CAN bus data from my old Land Rover Defender and send engine metrics to a remote sink on my Synology NAS at home. Maybe incl. GPS tracking. That's my plan :)


Mailbox.org.


Analyzing the text is the problem. Not extracting. Are there any good open source libs out there?


Sure. But the tool posted here doesn't do that. It merely extracts text, and the "analysis" is a couple of regexes that are tailor-made for that particular pdf. Awk can do that much and a lot more.

If you want to extract tables from a PDF, there's Tabula[1], but it doesn't run automatically over the whole PDF - you have to make a manual rectangular selection around the table you want to extract.

1. https://github.com/tabulapdf/tabula


Indeed. Many years ago, I "ran SQL" on a couple decades of Usenet newsgroup data. Extraction and manipulation involved a bunch of grep, sed, tr and awk (and millions of tmp files). But, as with PDFs of utility bills, the regexes were very specific to the data.
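For anyone wondering what "very specific regex" looks like in practice, here is a minimal sketch in Python. The sample bill text, the field names and the patterns are all made up for illustration; they do not come from the tool posted here or from any real utility bill. It only shows why this approach stops working as soon as the layout changes.

```python
import re

# Hypothetical text, as it might come out of a PDF text extractor.
SAMPLE_BILL_TEXT = """\
ACME Power & Light
Billing period: 2019-03-01 to 2019-03-31
Total usage: 412 kWh
Amount due: $57.83
"""

# Patterns tailored to this one (invented) layout.
# A bill from another provider would need a different set.
PATTERNS = {
    "usage_kwh": re.compile(r"Total usage:\s*(\d+)\s*kWh"),
    "amount_due": re.compile(r"Amount due:\s*\$(\d+\.\d{2})"),
}


def parse_bill(text):
    """Return the captured string for each field, or None if not found."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


fields = parse_bill(SAMPLE_BILL_TEXT)
print(fields)
```

Feed it text with a different wording ("Usage total", another currency format) and every field comes back as None, which is the whole problem with calling this "analysis".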


Hey, Kshitij from Rockset here.

With Rockset you can avoid ETL when it comes to extracting and manipulating the data. Also, the main value here is that you can join this data with other data sets that are in JSON, CSV, XLS or Parquet formats using SQL to help in analysis.


Maybe you could add modules for extracting and manipulating data from popular sources. Such as the most popular social media. Also Amazon, Craigslist, Ebay, etc. And the main search engines.

There are many people who want usable data from such sources. And your service wouldn't be doing any scraping, so you'd probably be OK legally. But IANAL, so do check.


I’ve been impressed with Camelot for PDF tables.


Calico or Cilium? ;)

Compared to Cilium, I would say offhand that it's the BPF- and FQDN-based network policies.


Another killer feature is that they can apply NetworkPolicies to FQDNs and not only to IPs. That's not a new feature in 1.4, but it's outstanding nonetheless.


Whoa I didn't know that, that's huge as well -- There are some overlays that don't even come with NetworkPolicy support at all -- didn't know they supported FQDNs as well


Tell me you don't work for them...


Thank you guys, I think that's the way I'll go. An app per OS.

