Hacker News | na_ka_na's comments

Apixio | San Mateo ONSITE | Frontend, Full stack, Backend Engineers!

At Apixio we are changing the way healthcare uses data. About 80% of healthcare data is underused because it is too messy or unstructured to analyze efficiently. The healthcare industry needs technology solutions that can process this data and extract insights. We are a profitable, mid-sized (under 90 people) healthcare company. Our stack is React, Scala, Java, Python, Cassandra, Elastic, and Redis, all on AWS.

Email me at sanchay@apixio.com or find me on LinkedIn


Apixio | San Mateo ONSITE | Frontend Tech Lead, Backend Tech Lead, Director of Engineering

At Apixio we are changing the way healthcare uses data. About 80% of healthcare data is underused because it is too messy or unstructured to analyze efficiently. The healthcare industry needs technology solutions that can process this data and extract insights.

We are a profitable, mid-sized (under 90 people) healthcare company. Our stack is React, Scala, Java, Python, Cassandra, Elastic, and Redis, all on AWS.

Email me at sanchay@apixio.com or find me on LinkedIn


Apixio | San Mateo ONSITE | Frontend Tech Lead, Backend Tech Lead, Director of Engineering

At Apixio we are changing the way healthcare uses data. About 80% of healthcare data is underused because it is too messy or unstructured to analyze efficiently. The healthcare industry needs technology solutions that can process this data and extract insights.

We are a profitable, mid-sized (under 90 people) healthcare company. Our stack is React, Scala, Java, Python, Cassandra, Elastic, and Redis, all on AWS.

Email me at sanchay@apixio.com or find me on LinkedIn


Here are some thoughts:

1. Instead of grading, maybe you can use it for training or tutoring. If a student is learning to write essays, I'm assuming it's hard for them to get any feedback.

2. But then there's probably not enough money to be earned there.

One trick might be to write an independent AI to summarize the essay back and see how closely it matches the essay title. This might weed out gibberish essays with sound English sentences.
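A minimal sketch of the comparison step in that trick, in Python. It assumes the summary has already been produced by some separate model (not shown here), and it scores the match with a plain bag-of-words cosine similarity, which is only one of many possible measures. The function names and the stopword list are made up for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real one would be larger).
STOPWORDS = {"the", "of", "a", "an", "and", "to", "in", "that", "is", "were", "how", "at"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def title_match_score(title, summary):
    """Score how closely a machine-written summary matches the essay title."""
    return cosine_similarity(Counter(tokenize(title)), Counter(tokenize(summary)))
```

An essay whose summary scores near zero against its title would be a candidate for the "gibberish with sound sentences" bucket; where to put the threshold is something you would have to tune on real essays.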


Two projects on GitHub currently:

Run SQL queries over JSON / Protobuf objects: https://github.com/na-ka-na/object-query

Compare Excel sheets via command line: https://github.com/na-ka-na/ExcelCompare


Why is it not intended that in-memory state be in Cap'n Proto / Protobuf objects? What are the downsides?


The classes generated by Cap'n Proto and Protobuf are 100% public and are limited to the exact structures supported by the respective languages. That means that if you decide one day that your state needs to include, say, a queue, or if you want to encapsulate some of your state to give a cleaner API to callers, you can't, unless you go all the way and wrap everything. Inevitably, if you've been building up your internal APIs in terms of protobuf/capnp types all along, you're going to be resistant to rewriting them and will probably come up with some ugly hack instead, and over time these hacks will pile up.
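To make the "wrap everything" option concrete, here is a rough Python sketch. `JobListMessage` is a hypothetical stand-in for a generated message class (all fields public, only a list available), and `JobQueue` is a hand-written wrapper that keeps a real queue internally and converts to the message type only at the boundary. Neither name comes from protobuf or Cap'n Proto.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class JobListMessage:
    """Stand-in for a generated message class: public fields, list only."""
    pending: list = field(default_factory=list)


class JobQueue:
    """Hand-written state that hides a deque behind a small API."""

    def __init__(self):
        self._pending = deque()

    def push(self, job):
        self._pending.append(job)

    def pop(self):
        # O(1) pop from the front, which a plain list field would not give us.
        return self._pending.popleft()

    def to_message(self):
        """Copy the internal state into the serializable message type."""
        return JobListMessage(pending=list(self._pending))

    @classmethod
    def from_message(cls, msg):
        """Rebuild the internal state from a deserialized message."""
        queue = cls()
        queue._pending.extend(msg.pending)
        return queue
```

The cost is the boilerplate of `to_message` / `from_message` for every type, which is exactly the work people tend to avoid once the message types have leaked into internal APIs.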

With that said, using protobufs for internal state is not an uncommon practice and if you don't care about cleanliness and just want to pound out some code quickly, sometimes it can work well.

Cap'n Proto has an additional disadvantage here in that its zero-copy nature requires arena allocation, in order to make sure all the objects are allocated contiguously so that they can be written out all at once. This actually makes allocating memory for Cap'n Proto objects much faster than for native objects -- but you can't delete anything except by deleting the entire message. So if you have a data structure that is gradually gaining and losing sub-objects over time, in Cap'n Proto you'll see a memory leak, as the old objects aren't freed up. You can work around this by occasionally copying the entire data structure into a new message and deleting the old one -- essentially "garbage collecting". But it's rather inconvenient.
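The leak and the copy-based workaround can be illustrated with a toy bump allocator in Python. This is not Cap'n Proto's actual allocator or API, just a sketch of the idea under simplified assumptions: `Arena` and `compact` are made-up names, and "objects" are plain byte strings.

```python
class Arena:
    """Toy bump allocator: contiguous allocations, no individual frees."""

    def __init__(self):
        self.buf = bytearray()

    def alloc(self, payload):
        """Append the payload and return an (offset, size) reference to it."""
        offset = len(self.buf)
        self.buf += payload
        return (offset, len(payload))

    def read(self, ref):
        offset, size = ref
        return bytes(self.buf[offset:offset + size])


def compact(arena, live_refs):
    """Copy only the live objects into a fresh arena ("garbage collecting")."""
    fresh = Arena()
    new_refs = [fresh.alloc(arena.read(ref)) for ref in live_refs]
    return fresh, new_refs


arena = Arena()
live = []
for _ in range(100):
    # Replace the single live sub-object; the old one stays in the buffer.
    live = [arena.alloc(b"x" * 10)]

leaked_size = len(arena.buf)         # 1000 bytes held for 10 bytes of live data
arena, live = compact(arena, live)   # after the copy, only live data remains
```

Allocation is just an append, which is why it is fast, but the only way to reclaim the 990 dead bytes is to copy the survivors elsewhere and drop the old buffer.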

This is actually one reason I want to extend the Cap'n Proto C++ API to generate POCS (Plain Old C Structs) for each type, in addition to the current zero-copy readers/builders. You could use the POCS for in-memory state that you mutate over time, then you could dump it into a message when needed (requiring one copy, but it should still be faster than protobuf encoding).
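As a rough illustration of that split between mutable plain structs and a flat message, here is a Python sketch. The `Counters` type and the byte layout are invented for the example; this is not the Cap'n Proto wire format or the proposed POCS API, only the general shape of "mutate freely in memory, copy once when you need a message".

```python
import struct
from dataclasses import dataclass, field


@dataclass
class Counters:
    """Plain mutable in-memory state (playing the "POCS" role)."""
    name: str = ""
    values: list = field(default_factory=list)


def dump(state):
    """One copy: flatten the state into a contiguous byte message."""
    name = state.name.encode("utf-8")
    # Header: name length and value count as little-endian uint32s.
    header = struct.pack("<II", len(name), len(state.values))
    # Body: each value as a little-endian signed 64-bit integer.
    body = struct.pack(f"<{len(state.values)}q", *state.values)
    return header + name + body


def load(msg):
    """Rebuild the plain struct from the flat message."""
    name_len, count = struct.unpack_from("<II", msg, 0)
    name = msg[8:8 + name_len].decode("utf-8")
    values = list(struct.unpack_from(f"<{count}q", msg, 8 + name_len))
    return Counters(name, values)
```

Because the in-memory state is an ordinary object, adding and removing values over time frees memory normally; the arena-style buffer only exists for the moment of serialization.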

https://capnproto.org/roadmap.html#c-capn-proto-api-features


After integrating protobufs in my application for messaging I decided to use a separate schema for storing the current state of the program. Ie. When state changes, the protobuf is updated and written to disk. When the program restarts, the state file is loaded into memory. I have not run into any problems doing this.

Edit: Your question is addressed here: https://news.ycombinator.com/item?id=14249367
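The write-on-change, load-on-restart pattern described above might look roughly like this in Python. JSON stands in for the protobuf state schema so the example runs without generated classes, and `StateStore` is a made-up name; with protobuf you would serialize the state message in `_save` and parse it in `_load` instead.

```python
import json
import os
import tempfile


class StateStore:
    """Keeps program state in memory and mirrors every change to disk."""

    def __init__(self, path):
        self.path = path
        self.state = self._load()

    def _load(self):
        # On first start there is no state file yet.
        if not os.path.exists(self.path):
            return {}
        with open(self.path, "r", encoding="utf-8") as f:
            return json.load(f)

    def update(self, key, value):
        self.state[key] = value
        self._save()

    def _save(self):
        # Write to a temp file, then swap it in, so a crash mid-write
        # does not leave a half-written state file behind.
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.path)
```

Creating a second `StateStore` on the same path simulates a restart: it picks up whatever the first one last wrote.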


Thanks, but I don't follow how that comment addresses my question. Is it that the cost of constructing Cap'n Proto / Protobuf objects is quite a bit higher than constructing objects defined natively?


> Is it that the cost of constructing Cap'n Proto / Protobuf objects is quite a bit higher than constructing objects defined natively?

I discussed this in more detail in my reply to your first post, but just to be really clear:

No. In fact, for deeply-nested object trees, constructing a Cap'n Proto object can often be cheaper than a typical native object since it does less memory allocation. However, there are some limitations -- see my other reply.

(Constructing Protobuf objects, meanwhile, will usually be pretty much identical to POCS, since that's essentially what Protobuf objects are.)

There is a common myth that Cap'n Proto "just moves the serialization work to object-build time", but ultimately does the same amount of work. This is not true: Although you could describe Cap'n Proto as "doing the serialization at object build time", the work involved is not significantly different from building a regular in-memory object.


What language is that? I couldn't find it in the post.


C#

