If you are testing a microservices "ball of mud", you can (and probably should) set up a testing environment and do your integration tests right there, against real dependencies. The tool seems nice for simple dependencies and local testing, but I fail to see it as a game changer.
You mention this as an afterthought, but that's the critical feature. Giving developers the ability to run integration tests locally is a massive win in a "ball of mud" environment. There are other ways to accomplish this, but the test-infrastructure-as-test-code approach is a powerful and conceptually elegant abstraction, especially when used to design testcontainers for your own services that can be imported as packages into dependent services.
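For what it's worth, a minimal sketch of that pattern: a reusable container class for one of your own services that dependent services pull in as a test dependency. The class name, image name, port and health endpoint are all made up for illustration.

    import org.testcontainers.containers.GenericContainer;
    import org.testcontainers.containers.wait.strategy.Wait;
    import org.testcontainers.utility.DockerImageName;

    // Hypothetical reusable test container for an "orders" service, published
    // as a small test-support artifact that dependent services import.
    public class OrdersServiceContainer extends GenericContainer<OrdersServiceContainer> {

        public OrdersServiceContainer(String imageTag) {
            super(DockerImageName.parse("registry.example.com/orders-service:" + imageTag));
            withExposedPorts(8080);                                 // assumed service port
            waitingFor(Wait.forHttp("/health").forStatusCode(200)); // assumed health endpoint
        }

        public String baseUrl() {
            return "http://" + getHost() + ":" + getMappedPort(8080);
        }
    }

A dependent service's test suite can then just do new OrdersServiceContainer("1.2.3").start() and point its client at baseUrl().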
For example, we have pure unit tests, but also some tests that boot up Postgres, test the db migration, and give you a db to play with for your specific “unit” test case.
No need for a complete environment with Kafka etc. It provides a cost-effective stepping stone to what you describe.
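As a rough sketch of that kind of test (assuming JUnit 5 and Flyway for the migrations; swap in whatever migration tool you actually use):

    import org.flywaydb.core.Flyway;
    import org.junit.jupiter.api.Test;
    import org.testcontainers.containers.PostgreSQLContainer;
    import org.testcontainers.junit.jupiter.Container;
    import org.testcontainers.junit.jupiter.Testcontainers;
    import org.testcontainers.utility.DockerImageName;

    @Testcontainers
    class MigrationTest {

        // Throwaway Postgres started before the tests and removed afterwards.
        @Container
        static PostgreSQLContainer<?> postgres =
            new PostgreSQLContainer<>(DockerImageName.parse("postgres:16-alpine"));

        @Test
        void migrationsApplyCleanly() {
            // Run the schema migrations against the container...
            Flyway.configure()
                  .dataSource(postgres.getJdbcUrl(), postgres.getUsername(), postgres.getPassword())
                  .load()
                  .migrate();
            // ...and the same JDBC URL is now a real database the test case can use.
        }
    }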
What would be nice is if Testcontainers could create a complete environment on the test machine and then delete it again.
Still, a deploy with some smoke tests on a real env is nice.
It's not really a middle ground if you're not testing your service under the same conditions as in the production environment.
If you're not testing the integration with Kafka and the producer, your service is still lacking integration tests.
Testing classes in isolation with Testcontainers is fine, but I've observed that with a microservice architecture the line between E2E tests and integration tests is blurred.
Microservices can and should be tested from the client perspective.
In my last project we used https://java.testcontainers.org/modules/kafka/ to start a small Kafka container. It's not exactly like a production installation, but it goes a long way.
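Roughly, that module gives you something like the following (a sketch: the image tag and topic name are arbitrary, and a plain producer is used here just to show the wiring to the container):

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;
    import org.testcontainers.containers.KafkaContainer;
    import org.testcontainers.utility.DockerImageName;

    class KafkaContainerSketch {
        public static void main(String[] args) {
            try (KafkaContainer kafka =
                     new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.4.0"))) {
                kafka.start();

                // Point the client under test at the container instead of a real cluster.
                Properties props = new Properties();
                props.put("bootstrap.servers", kafka.getBootstrapServers());
                props.put("key.serializer", StringSerializer.class.getName());
                props.put("value.serializer", StringSerializer.class.getName());

                try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                    producer.send(new ProducerRecord<>("orders", "key", "value"));
                    producer.flush();
                }
            }
        }
    }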
I agree with this. At work we use both approaches but at different levels of the test pyramid.
To test integration with one dependency at the class level, we use Testcontainers.
But to test the integration of the whole microservice with other microservices + dependencies, we use a test environment and some test code.
It's a bit like an E2E test for an API.
If I had to choose between the two, I would argue that the test environment is more useful, as it can test the service contract fully, unlike lower-level testing, which requires a lot of mocking.
Yeah, I prefer setting up docker-compose.yml myself, so I can start up services once and also do manual testing.
The only thing I would maybe use Testcontainers for is to deploy my own service into Docker as part of an integration test, so I can test a more realistic deployment scenario instead of running it locally outside Docker.
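Something along these lines should work for that: build the image from the service's own Dockerfile and run the tests against the containerised deployment. The Dockerfile location, port and health path are assumptions here.

    import java.nio.file.Paths;

    import org.testcontainers.containers.GenericContainer;
    import org.testcontainers.containers.wait.strategy.Wait;
    import org.testcontainers.images.builder.ImageFromDockerfile;

    class SelfDeploySketch {
        public static void main(String[] args) {
            try (GenericContainer<?> service = new GenericContainer<>(
                        new ImageFromDockerfile()
                            .withFileFromPath(".", Paths.get("."))   // ship the project dir as build context
                            .withDockerfilePath("Dockerfile"))       // Dockerfile inside that context
                    .withExposedPorts(8080)                          // assumed service port
                    .waitingFor(Wait.forHttp("/health"))) {          // assumed health endpoint
                service.start();

                String baseUrl = "http://" + service.getHost() + ":" + service.getMappedPort(8080);
                // ...run the integration tests against baseUrl instead of a locally started process...
            }
        }
    }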
I very strongly disagree. Having "integration" tests is super powerful: you should be able to test against your interfaces/contracts and not have to spin up an entire environment with all of your hundreds of microservices, building "integrated" tests that are then dependent on the current correctness of the other microservices.
I advocate for not having any integrated environment for automated testing at all. The aim should be to be able to run all tests locally and get a quicker feedback loop.
I'm always a bit confused about the CPU limit (for the pod); some guides (and tools) advise always setting one, but this one [0] doesn't.
The ops people I've worked with almost always want to lower that limit, and I have to insist on raising it (there's no way they'd disable it).
Is there an ultimate best practice for that?
CPU limits are harmful if they strand resources that could otherwise have been used. I usually skip them for batch deployments and use them for latency-sensitive services. This doesn't seem like a security topic, though.
They are actually even worse for latency-sensitive workloads, because CFS with its 100ms default period will cause crap tail latency (especially for multithreaded processes such as most Go programs).
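If you want to see whether that throttling is actually biting, the cgroup cpu.stat counters are the place to look. A quick sketch (paths differ between cgroup v1 and v2):

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    class ThrottleCheck {
        public static void main(String[] args) throws Exception {
            List<Path> candidates = List.of(
                    Path.of("/sys/fs/cgroup/cpu.stat"),        // cgroup v2
                    Path.of("/sys/fs/cgroup/cpu/cpu.stat"));   // cgroup v1
            for (Path p : candidates) {
                if (Files.exists(p)) {
                    // nr_throttled / throttled_usec (v2) or throttled_time (v1)
                    // growing steadily means the CPU limit is being hit.
                    Files.readAllLines(p).forEach(System.out::println);
                    return;
                }
            }
            System.out.println("no cpu.stat found (not running under cgroups?)");
        }
    }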
Interesting, that's my impression too. I understand that a CPU limit will artificially throttle the CPU even when it's not necessarily needed, wasting CPU cycles I could otherwise use.
(Java programs in my case, but I imagine it's comparable to Go ones.)
Do you recommend disabling CPU limits in the general case?
We don't set them anywhere in prod and generally haven't had any issues. We always set CPU requests and alert if those are exceeded for prolonged periods, and we always set memory request = limit.
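For concreteness, the resources stanza for that policy would look something like this (the numbers are arbitrary placeholders, not a recommendation):

    # CPU request but deliberately no CPU limit; memory request == memory limit.
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        memory: "512Mi"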
I think this is backwards. How are you planning on “sticking to it” when you're serving unpredictable user traffic? If requests are set appropriately everywhere, then it won't really starve batch, as the kernel would just scale everything down to its respective cpu.shares weight when the CPU is fully saturated. This would let you weather spiky load with minimal latency impact and minimize spend.
It's weird that you're apparently a Borg user from Google, according to other discussions we've exchanged, yet you question the value of hard-capping for latency-sensitive processes.
Borg SRE, even ;) (former), and yes, I do question them. For one, Borg isn't using a 100ms CFS period, and it wasn't even standard CFS if I recall correctly, so yes, I do question that outside of the limited Borg use case.