Hacker News
Catch Breaking Changes by Diffing API Traffic (akitasoftware.com)
100 points by jeanyang on Sept 22, 2020 | hide | past | favorite | 25 comments



A tool like this was at the core of the Twitter front-end rewrite done years ago, in the ~2011 time frame. It diffed both HTML and JSON responses, at scale, to find any issues. Amusingly, it found lots of issues in the original system where the new system was correct.

edit:

We did it via a front-end proxy that would send the request to the original system, get the response, and return it to the client, then send the same request to the test system along with the first system's response and diff the two. We had a bunch of heuristics for various element types where you could expect differences between them (timestamps, mostly).
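The diffing step described above can be sketched roughly like this: compare the two JSON bodies after normalizing away fields that are expected to differ. The field names here are illustrative assumptions, not what Twitter actually used.

```python
import json

# Fields whose values are expected to differ between the two systems
# (hypothetical names; timestamps were the main case mentioned).
IGNORED_FIELDS = {"timestamp", "created_at", "served_by"}

def normalize(value):
    """Recursively drop fields whose values are expected to differ."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()
                if k not in IGNORED_FIELDS}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return value

def responses_match(primary_body, candidate_body):
    """True if the two JSON responses agree after normalization."""
    return normalize(json.loads(primary_body)) == normalize(json.loads(candidate_body))
```

In a real proxy this comparison would run out-of-band so the client only ever sees the primary system's response.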


Open sourced as Diffy, I believe: https://github.com/opendiffy/diffy


I looked into Diffy, but the license was changed from Apache to AGPL about a year ago in what appeared to be a monetization scheme, based on the links to the support company in the README. https://github.com/opendiffy/diffy/commit/0848f9888f59aded4d...


At the time it was just hacked into our front-end traffic server. This is probably just inspired by it.


My recollection is that OpenDNS(?) did the same thing on an ongoing basis. They mirrored prod network traffic into a gamma environment to test/compare responses. I seem to recall it coming up at a DNS OARC workshop a few years back, if anyone wants to look for more details. Another system I worked on essentially ran passive packet-level functional tests on production network traffic. Pretty good technique for picking up new variations in the protocol.


Do you send all traffic? Wouldn't that require 3x the servers? Double the API servers, plus some proxy servers.


Nah. At 500k rps there was no need, sampling-wise, and as you point out it would be hugely expensive. Though the new system ran on 10x fewer servers at 10x lower latency :). At smaller scale, you could probably do all of them.


I was at Twilio back in the day when we launched Shadow - https://github.com/twilio/shadow - and it's gotten zero attention since.

I'm glad to see others taking this concept forward and doing great things with it.


Pact is a pretty big player in contract testing already. What are your talking points for someone looking at the differences between Akita and Pact?


Hi there, I'm Jean, the founder and CEO of Akita. Thanks for the great question! We designed our tool very much in the spirit of tools like Pact, but to support much more automation. Pact is great if you 1) know your API and 2) know the contracts you want to test your API against. Akita is able to:

* Automatically infer your API and automatically generate a spec
* Automatically infer properties of your API that it observes (implicit API contracts), so you can diff/test against those later
* (Coming soon) Automatically generate tests for your API against those contracts
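The inference-then-diff idea above can be sketched in miniature: derive a minimal "spec" (field name to type) from observed responses, then diff two specs to surface removed or retyped fields. This is an illustrative toy, not Akita's actual implementation.

```python
def infer_spec(observed_responses):
    """Infer a field -> type-name mapping from observed JSON objects."""
    spec = {}
    for resp in observed_responses:
        for field, value in resp.items():
            spec[field] = type(value).__name__
    return spec

def diff_specs(old, new):
    """Report fields that were removed or changed type between two specs."""
    issues = []
    for field, old_type in old.items():
        if field not in new:
            issues.append(f"removed: {field}")
        elif new[field] != old_type:
            issues.append(f"type changed: {field} ({old_type} -> {new[field]})")
    return issues
```

Running the diff over specs inferred from traffic before and after a deploy is what turns observed behavior into a testable contract.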


I’ve been meaning to give Pact a spin, any success or disaster stories from folks around here?


This seems like a cool tool, but if you're starting from scratch you can avoid breaking changes like this by using codegen tools that generate client code guaranteed to conform to what an API provides... and if it's a typed language, you can instantly spot these kinds of backwards-compatibility issues when you regenerate the client code.

We use GraphQL + Apollo codegen + TypeScript + CI tests and haven't had a breaking API change like this in three years.


I don’t really get this. If there’s a contract that crosses service boundaries, surely the system that is being changed needs to have encoded that somewhere and not simply leave it to being a side-effect of other code.

In our API for example we have an explicit set of (what we call) presentable attributes, an explicit mapping between our internal naming of properties to the external attributes that are presented publicly.

This allows us to maintain the external contract whilst the internal structure may change wildly. Furthermore, with a generally simple set of unit tests, we can catch scenarios that might be cause for failures before any such changes go to production.
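A minimal sketch of the "presentable attributes" pattern described above: an explicit internal-to-external mapping, so internal renames can't leak into the public contract, plus the kind of simple unit test mentioned. All names here are illustrative.

```python
# Explicit internal name -> external (presentable) attribute mapping.
PRESENTABLE = {
    "user_uuid": "id",
    "display_name": "name",
}

def present(record):
    """Project an internal record onto the public contract."""
    return {external: record[internal]
            for internal, external in PRESENTABLE.items()}

def test_contract_shape():
    # Internal fields can be added or renamed behind the mapping;
    # the externally presented attribute set must stay fixed.
    out = present({"user_uuid": "u-1", "display_name": "Ada", "secret": 1})
    assert set(out) == {"id", "name"}
```

The mapping is the single place the external contract is encoded, so any internal restructuring that breaks it fails the test rather than a downstream consumer.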

Adding this extra layer of diffing feels like unnecessary complexity and if you end up using a proxy to catch responses, potentially even an extra source of data leakage and exploitation.


Hi @simonhamp, founder/CEO of Akita here. :) It sounds like you have good internal discipline about what could potentially break your API, and that's great. The point you bring up (that surely the code shouldn't be the only documentation of an implicit contract) is exactly what we've been observing: in practice, the code often is the only documentation. The example we gave (a service/service dependency) was an extreme case of this, though we have seen it happen. More subtle examples would be changing an error code, or changing from one specific kind of string to another (for instance, between two datetime formats). Propagating these changes to dependencies, especially in larger systems, can cause some real headaches. Would love to talk more to understand how you've gotten around some of these with good process!
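The datetime-format case above is worth making concrete: the field is still a string either way, so a type check passes, but the inferred format differs. A hypothetical sketch of catching that (the format classifier is illustrative, not Akita's):

```python
import re

# Two common datetime string shapes; a type check sees both as "str".
ISO_8601 = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")
RFC_2822_DAY = re.compile(r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun),")

def string_format(value):
    """Classify a string value's format, not just its type."""
    if ISO_8601.match(value):
        return "iso-8601"
    if RFC_2822_DAY.match(value):
        return "rfc-2822"
    return "plain"

def format_change(old_value, new_value):
    """Both values are strings, yet the inferred format can still differ."""
    old_fmt, new_fmt = string_format(old_value), string_format(new_value)
    return None if old_fmt == new_fmt else f"{old_fmt} -> {new_fmt}"
```

A contract check that only compares types would miss this; one that compares inferred formats flags it.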

Shopify has this great blog post about their solution for change management in their external APIs that also makes the point that change impact analysis is not easy: https://engineering.shopify.com/blogs/engineering/shopify-ma...

Also, quick point: we don't proxy for precisely the reason you brought up. Akita works without sending any user data back to our servers, just the metadata.


I read the title too quickly and missed “API” and thought immediately, is someone really parsing traffic feeds in real-time and doing some sort of diff on the frames/cars to better understand traffic flow and accidents. What a weird lucid moment of nonsense.


Same here, which is why I clicked - I thought, "whoa, car traffic changes? This seems pretty wild."


This looks great, can't wait to try it out!


So, I understand the idea of what you're saying (just intuitively) but I don't actually see how Akita provides this from the blog post. Aki said that `get_folder_metadata` wasn't used, and the screenshot of Akita seems to confirm that she is indeed removing it... Where does the bug get caught?


Great question! (Jean, founder and CEO of Akita here.) We elided some details to keep the blog post simple. Here is the longer answer. Akita catches dependencies across your system, so there are two ways you could set up Akita to catch this bug:

1. If the clients depending on the removed property are also explicitly tested by Akita, the regression test would flag the removed property as a change to take notice of.

2. If you install Akita to run in staging/production where it can pick up the dependencies on the service, the regression test will be able to detect that the removed property was used in a previous run and is now gone.


This looks awesome. I like that it doesn't require a proxy to generate the spec. Does this rely on code introspection, or does it somehow listen for traffic without proxying?


Thank you!! (Jean, founder and CEO of Akita here. :)) No code introspection necessary! Akita asks for access to listen to the traffic without proxying.


That's a nice approach. I've been using a similar approach toward a somewhat similar end goal, although it's a personal project for now. Would love to chat sometime. :)

What APIs do you support at the moment, HTTP only or binary as well?


We are focused on HTTP APIs at the moment. Happy to chat. :)


I'd like to get an idea of what the pricing will be before investing in trying the beta. Is GraphQL on the roadmap?


Hi, I'm Jean, founder and CEO at Akita. Thanks for the questions. :) Yes, GraphQL support is on the roadmap! We're in a private beta; the product will be free during that period, and we will provide a discount to those users once we move to a public beta with pricing. There will always be a freemium model where some amount of spec generation and analysis will be available at no cost. We see tools like Honeycomb and Lightstep as similar in pricing.



