I've seen CQRS pop up over and over, but never seen anyone do a real application aside from some pseudo-bank account code. Are there any bigger OSS projects that use event sourcing?
Used it. Loved it. It does have its issues though.
I was working on the backend systems for electric car charging. When real-world events happened (start-charging, stop-charging, etc.), we wrote them directly to Kafka. It was up to other services to interpret those events, e.g. "I saw a start then a stop, so I'm writing an event to say user-has-debt". Yet another service says "I see a debt, I'm going to try to fix this by charging a credit card". I guess you'd call the above the 'C' part of CQRS.
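The event-interpretation step can be sketched as a pure function folding over the log. This is a minimal illustration, not the real service: the event shapes, field names, and the kWh-based debt calculation are all assumptions.

```python
# Hypothetical event shapes; the real system consumed these from Kafka in a
# separate service and produced the derived events back onto a topic.
def derive_debts(events):
    """Pair each start-charging with the stop-charging for the same session
    and emit a user-has-debt event for the energy consumed in between."""
    open_sessions = {}
    debts = []
    for ev in events:
        if ev["type"] == "start-charging":
            open_sessions[ev["session"]] = ev
        elif ev["type"] == "stop-charging":
            start = open_sessions.pop(ev["session"], None)
            if start is not None:
                debts.append({
                    "type": "user-has-debt",
                    "user": start["user"],
                    "kwh": ev["meter_kwh"] - start["meter_kwh"],
                })
    return debts

log = [
    {"type": "start-charging", "session": "s1", "user": "u1", "meter_kwh": 10.0},
    {"type": "stop-charging", "session": "s1", "meter_kwh": 17.5},
]
debts = derive_debts(log)
```

The point of the pattern is that this logic never mutates shared state: it only reads the log and emits new facts, which downstream services (like the credit-card charger) consume in turn.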
But Kafka by itself is not great for relational queries. So we had additional services for, e.g history. The history service also listened to starts, stops, debts, credits, etc. and built up a more traditional SQL table optimised for relational queries, so a user could quickly see where they had charged before.
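A history service like that is essentially a fold from the event stream into a relational read model. Here is a sketch using stdlib `sqlite3` as a stand-in for the SQL store; the event types and columns are invented for illustration, not the real schema.

```python
import sqlite3

# Read-model sketch: a consumer folds Kafka events into a SQL table
# optimised for per-user relational queries ("where have I charged before?").
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE charge_history (user TEXT, location TEXT, kwh REAL)")

def apply_event(event):
    # Only completed sessions become history rows; other event types
    # (debts, credits, ...) would feed other tables or projections.
    if event["type"] == "session-completed":
        db.execute("INSERT INTO charge_history VALUES (?, ?, ?)",
                   (event["user"], event["location"], event["kwh"]))

for ev in [
    {"type": "session-completed", "user": "u1", "location": "lot-a", "kwh": 7.5},
    {"type": "user-has-debt", "user": "u1", "amount": 3},  # ignored here
]:
    apply_event(ev)

rows = db.execute(
    "SELECT location, kwh FROM charge_history WHERE user = ?", ("u1",)
).fetchall()
```

Because the table is derived entirely from the log, it can be dropped and rebuilt by replaying events, which is also where the slow-startup problem in (2) below comes from.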
The issues we had were:
1) Where's the REST/Kafka boundary? I.e. when should something write to the Kafka log as opposed to POSTing directly to another service? E.g. If a user sets their payment method, do we update some DB immediately, or do we write the fact that they set their payment method onto Kafka, and have another service read it?
2) Services which had to replay from the beginning of time took a while to start up, so we had to find ways to get them not to.
3) You need to be serious about versioning. Since many services read messages from Kafka, you can't just change the implementation of those messages. We explicitly versioned every event in its class name.
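Versioning events in the class name, as in (3), can be sketched like this. The class and field names are hypothetical; the idea is just that the version travels with the message, so old and new formats coexist on the same topic.

```python
# Sketch: the version is part of the event's class name, so a consumer
# dispatches on the name carried in the message itself. Hypothetical names.
class StartChargingV1:
    def __init__(self, session_id):
        self.session_id = session_id

class StartChargingV2:
    # V2 added a connector_id; old V1 messages on the topic still
    # deserialize to the V1 class, so nothing breaks on replay.
    def __init__(self, session_id, connector_id):
        self.session_id = session_id
        self.connector_id = connector_id

EVENT_TYPES = {cls.__name__: cls for cls in (StartChargingV1, StartChargingV2)}

def decode(message):
    return EVENT_TYPES[message["type"]](**message["payload"])

old = decode({"type": "StartChargingV1", "payload": {"session_id": "s1"}})
new = decode({"type": "StartChargingV2",
              "payload": {"session_id": "s2", "connector_id": 4}})
```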
> 1) Where's the REST/Kafka boundary? I.e. when should something write to the Kafka log as opposed to POSTing directly to another service? E.g. If a user sets their payment method, do we update some DB immediatley, or do we write the fact that they set their payment method onto Kafka, and have another service read it?
I believe the term that's emerging for this issue is "collapsing CQRS" and how you handle this is application-dependent (are you using plain synchronous HTTP requests? websockets?) In my case, the HTTP server has a producer that writes to a Kafka topic and a consumer that consumes the answers. The HTTP request waits until the answer appears on the answer topic.
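The produce-then-wait-for-the-answer pattern can be sketched with in-memory queues standing in for the two Kafka topics. This is a toy with a single in-flight request; a real implementation would match answers by correlation id across many concurrent requests (and would use actual Kafka producers/consumers).

```python
import queue
import threading
import uuid

command_topic = queue.Queue()  # stand-in for the request/command topic
answer_topic = queue.Queue()   # stand-in for the answer topic

def downstream_service():
    # The service on the other side of the topic: consume commands,
    # produce answers tagged with the same correlation id.
    while True:
        cmd = command_topic.get()
        if cmd is None:
            break
        answer_topic.put({"correlation_id": cmd["correlation_id"], "status": "ok"})

threading.Thread(target=downstream_service, daemon=True).start()

def handle_http_request(body, timeout=5.0):
    corr = str(uuid.uuid4())
    command_topic.put({"correlation_id": corr, **body})
    # The HTTP request blocks until its answer appears on the answer topic.
    # Non-matching answers are simply dropped in this single-request sketch.
    while True:
        ans = answer_topic.get(timeout=timeout)
        if ans["correlation_id"] == corr:
            return ans

result = handle_http_request({"action": "set_payment_method"})
```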
> 2) Services which had to replay from the beginning of time took a while to start up, so we had to find ways to get them not to.
Kafka Streams local state makes this fast, unless you need to reprocess.
> 3) You need to be serious about versioning. Since many services read messages from Kafka, you can't just change the implementation of those messages. We explicitly versioned every event in its class name.
Yes, this is tricky. In my case, I either add fields in a backwards-compatible manner, or rebase/rewrite event topics and roll them out while unwinding the previous version of the topic that may still be in use. The former is obviously the simpler option.
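The backwards-compatible option boils down to: new fields get defaults, and unknown fields are tolerated, so old and new producers can coexist on the topic. A minimal sketch, with invented event and field names:

```python
from dataclasses import dataclass

# Hypothetical event; "currency" is a field added after V1 was in production.
@dataclass
class ChargeCompleted:
    session_id: str
    amount_cents: int
    currency: str = "EUR"  # default lets pre-upgrade payloads still decode

def decode(payload):
    # Tolerate unknown keys (from newer producers) and fall back to
    # defaults for keys that older producers never sent.
    known = {k: v for k, v in payload.items()
             if k in ChargeCompleted.__dataclass_fields__}
    return ChargeCompleted(**known)

old = decode({"session_id": "s1", "amount_cents": 500})
new = decode({"session_id": "s2", "amount_cents": 700, "currency": "USD",
              "extra_field": True})
```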
My current project is totally ES/CQRS from the core to the UI. It's an email service with users, messages, client sessions and other entities. Please see my top-level comment about the joy of implementing it with Kafka Streams. The state of the art is still emerging, so it was a lot of work to figure out what tools to use and how to model things, but once I did, I couldn't go back to updating state in a table again.
I mostly stopped using it because it's so expensive and we didn't get a lot of benefit out of it. Just people asking questions that can be answered by reading the first line of the page they're on.
Actually, running integration tests against SQL RDBMS works really well. It might be a tad slow but the CI can easily handle that. We have Docker and all of that for this, and it works x-platform.
Running the RDBMS with `eatmydata` (https://www.flamingspork.com/projects/libeatmydata/) can significantly speed up tests in my experience. All of your data is gone as soon as the process exits, but that doesn't matter for tests.
You can even do SQL DDL ALTER TABLEs in a Transaction on production, do the change, run smoke tests in a nested transaction that are all rolled back when complete, and if anything fails you roll back the DDL changes and are back where you started clean.
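This pattern is easy to demonstrate with `sqlite3` from the Python stdlib, since SQLite is one of the engines with fully transactional DDL (which, as noted elsewhere in this thread, is not true of every DB server). The table and column names are just for illustration.

```python
import sqlite3

# autocommit mode so we control BEGIN/ROLLBACK explicitly
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

conn.execute("BEGIN")
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")  # DDL inside the txn
cols_during = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
# ... smoke tests would run here; pretend one of them failed:
conn.execute("ROLLBACK")
cols_after = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
# after the rollback, the schema change is gone and we're back where we started
```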
Support for DDL in transactions seems to vary quite a bit between DB servers.
In mysql (and I think also mariadb) ALTER TABLE cannot be rolled back, and also implicitly does a COMMIT on any in-progress transaction before altering the table.
In PostgreSQL you generally can do ALTER TABLE in transactions, but there are some restrictions or exceptions. If the alteration is to add a value to an enum, it cannot be done in a transaction.
MS SQL Server seems to be OK with ALTER TABLE in the midst of a transaction, according to this article [1]. Oracle seems to be similar to mysql.
Even the ones that support this well, like MS SQL Server, have restrictions on other DDL, such as creating indexes.
I think I'd rather just assume no DDL in transactions and design my schema update procedure accordingly, rather than asking whoever is designing the new schema to try to limit themselves to changes from the old that avoid whatever statements whatever DB server we are using doesn't allow.
Docker isn't really required for this. Any in-memory DB will provide good contract verification. And a simple alpha/beta/gamma setup with the end database can get you the rest of the way there.
Docker can be a bit of a better option... unless you're using the same database software for the in-memory operations as the actual deployment, or aren't using many features specific to the RDBMS you've chosen (stored procedures, triggers, virtual fields, etc.). Not that I'm a fan of triggers, and I prefer direct table access over SPs, but virtual fields are GREAT during transitional changes.
It also generally requires an abstraction between the DB variants.
Agreed. In the old world, this is what an alpha/beta/gamma environment could help test against. The in-memory DB could be used to make sure your units and overall components were correctly specified. The actual installs of the DB software to isolated schemas were where you made sure you integrated with the DB correctly.
That all said, I did not mean this to be that Docker doesn't provide some value. Just pointing out it is not a prerequisite. If that makes sense.
Is there an overview of what sets Triton apart from other PaaS (which this seems to be)? It's really hard to keep up just with what happens at Kubernetes, but it feels like there are "Cloud Application Server" (quotes intentionally) popping up everywhere and it feels like everyone is as complex as the other.
Aside from performance: I find the role based access controls far easier to use than, eg. GCP (which likes to claim I don't have access to things even when I'm using an admin account). The firewall is also really easy to use and configure (via the web UI or by defining a simple rule language and setting it up via, eg. Terraform).
My favorite thing hands down is how they handle and use machine metadata though; tags you define in the web UI (or via your infrastructure provisioning mechanism) get shoved onto the machine by zoneinit and then can be used on the machine for configuration, or can show up in the web UI after provisioning (eg. the postgres image uses this to render a "show credentials" button). The service names and what not also can automatically be shoved in DNS (for a simple form of service discovery, although you'll need to implement some form of authentication on top of that since DNS isn't secured), or certain images will automatically trust any public keys you define on your Triton account (I think the SmartOS base-64 image does this, as well as the Debian 9 image).
Unfortunately the documentation for all this is terrible.
EDIT: I left out the other major benefit over some other providers: I can run anything I want, not just supported images. Even if it's something that doesn't fit in an lx zone, they ported KVM over so you can always run, eg. netbsd or whatever in a more traditional virtualized environment. This is great for when you have a few legacy machines left over that aren't using SmartOS or one of their lx- Linux images.
Joyent's cloud (and the open-source software they use to run it!) is notable b/c they don't run containers inside VMs. Instead, they use illumos zones to secure multiple tenants running on a single host OS and bare metal server. It sounds like that offers quite an advantage in performance.
How often have Chinese researchers claimed to have done something incredible, without proper proof and with a lack of data, and it turned out to be a total scam? This really feels like another one of those. We'll see what happens, but I wouldn't hold my breath.
But the difference here is that, to the best of my knowledge, there's nothing particularly difficult about CRISPR in humans. It's routine in other mammals and had previously been done on human embryos. The thing holding everyone back was ethical concerns and the possibilities of off-target effects, side-effects of the mutation, etc. So I'd err on the side of assuming they did do it.
It's worth noting that none of the moderators or questioners at the talk expressed any skepticism that He had done it. They all took for granted He had.
It's really not that implausible a thing to do, you know. People have been editing human embryos with CRISPR since at least 2014. And He has past experience, it seems, and had networked a fair amount and seemed to know what he was doing. George Church also has said that he's seen some of the data and it looks right to him. Combine that with all the data on the slides, and while you can debate the ethics or how useful it would be or how well it actually worked, it seems increasingly plausible that he did something like what he claimed, and he's not simply faking it all.
Also the cached version: https://webcache.googleusercontent.com/search?q=cache:jZXfTY...