At a certain point a bias in the prng just has to become significant? This will be a function of the experiment. So I don’t think it’s possible to talk about a general lack of “practical impact” without specifying a particular experiment. Thinking abstractly - where an “experiment” is a deterministic function that takes the output of a prng and returns a result - an experiment that can be represented by a constant function will be immune to bias, while one which returns the nth bit of the random number will be susceptible to bias.
> At a certain point a bias in the prng just has to become significant?
Sure, at a point. I'm not disputing that. I'm asking for a concrete bound. When the state space is >= 2^64 (you're extremely unlikely to inadvertently stumble into a modern PRNG with a seed smaller than that) how large does the sample set need to be and how many experiment replications are required to reach that point?
Essentially what I'm asking is, how many independent sets of N numbers must I draw from a biased deck, where the bias takes the form of a uniformly random subset of the whole, before the bias is detectable to some threshold? I think that when N is "human" sized and the deck is 2^64 or larger that the number of required replications will be unrealistically large.
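For what it's worth, here's a hedged back-of-envelope for the crudest possible notion of "detectable": how many replications before you'd even expect to draw the same seed (and therefore the same sequence of N numbers) twice. This is my own framing of the question, not a full statistical treatment of subtler biases.

```java
public class BirthdayBound {
    public static void main(String[] args) {
        // Assume each replication effectively samples one of 2^64 seeds.
        double d = Math.pow(2, 64);
        // Collision probability among k samples is roughly 1 - exp(-k^2 / (2d)).
        // Solve for the k that gives a 50% chance of any repeat:
        double k = Math.sqrt(2 * d * Math.log(2));
        System.out.printf("replications for a 50%% chance of a repeated seed: %.2e%n", k);
        // ~5e9 replications, and that's only for the crudest possible signal
        // (an exact repeat); subtler statistical biases need far more data.
    }
}
```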
Structural subtyping yes, nominal subtyping is a bit pricklier.
As a developer I personally prefer structural subtyping, but structural subtyping is harder for a compiler to optimize for runtime performance.
Nominal subtype hierarchies allow members to be laid out linearly, so a member access becomes just an offset, whereas a structural system always has the "diamond problem" to solve (it's hidden from users, so not a "problem" as such, but it will still haunt compiler/runtime developers).
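Rough illustration of what I mean, using Java-ish classes purely as a sketch (actual object layout is up to the runtime):

```java
// With nominal subtyping, a subclass's fields extend the base class layout,
// so a field inherited from the base sits in the same place for every subtype
// and an access can compile down to a constant offset.
class Animal { int age; }                        // [header][age]
class Dog extends Animal { int barkVolume; }     // [header][age][barkVolume]

// With structural subtyping, "anything that has an `age` field" may place that
// field differently in every concrete type, so the compiler/runtime needs some
// extra lookup (a vtable/dictionary/hidden class) instead of a fixed offset.
```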
Now the kicker, though: in practice, nominal subtype polymorphism has other performance issues on _modern_ computers, since it creates variable-sized objects that cannot be packed linearly the way monomorphic structures can.
In the 90s, when languages settled on nominal typing, memory speed wasn't really a huge issue, but today we know that we should rather compose data to achieve data-polymorphic effects, and individual types should be geared toward packing.
Thus, most performance benefits of a nominal type system over a structural one don't help much in real-life code, and maintenance-wise we would probably have been better off using structural types (iirc Go went there, and interfaces in Java/C# achieve mostly the same effect in practice).
I've been implementing row polymorphism in my fork of Elm in order to support proper sum and subtraction operations on unions, and it's far from trivial.
Example use case: an Effect may fail with 2 types, but you have actually handled/caught one of them, so you want to remove it.
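A loose Java analogy for the shape of that operation (not how the Elm fork does it, just an illustration): checked exceptions behave like a union of failure types that shrinks once you handle one of them.

```java
import java.io.IOException;
import java.sql.SQLException;

class EffectAnalogy {
    // "may fail with 2 types"
    static void op() throws IOException, SQLException {
        // ...
    }

    // IOException has been handled/caught, so it is subtracted from the set of
    // failures the caller still has to deal with
    static void handled() throws SQLException {
        try {
            op();
        } catch (IOException e) {
            // recovered here
        }
    }
}
```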
Elm-like HM systems handle row polymorphism fine, but, as you say, mostly over records.
I'm not an expert in all of this, started studying this recently, so take my words with a grain of salt.
Not sure about your opening thesis. The vast majority of employees work for privately held businesses, and from personal experience of working in such companies, "management" by OKRs and the like is common in companies that are not listed on stock markets as well.
There'll inevitably be cargo-culting driven by MBA curricula and "they're making a lot of money, let's do what they did" thinking, without examining the specifics of the situation to distinguish luck from judgement.
That’s the seductive power of mocking - you get a test up and running quickly. The benefit to the initial test writer is significant.
The cost is the pain - sometimes nightmarish - for other contributors to the code base since tests depending on mocking are far more brittle.
Someone changes code to check if the ResultSet is empty before further processing and a large number of your mock based tests break as the original test author will only have mocked enough of the class to support the current implementation.
Working on a 10+ year old code base, making a small simple safe change and then seeing a bunch of unit tests fail, my reaction is always “please let the failing tests not rely on mocks”.
> Someone changes code to check if the ResultSet is empty before further processing and a large number of your mock based tests break as the original test author will only have mocked enough of the class to support the current implementation.
So this change no longer allows an empty result set, something that was allowed previously. Isn't that the sort of breaking change you want your regression tests to catch?
I used ResultSet because the comment above mentioned it. A clearer example of what I’m talking about might be say you replace “x.size() > 0” with “!x.isEmpty()” when x is a mocked instance of class X.
If tests (authored by someone else) break, I now have to figure out whether the breakage is due to the fact that not enough behavior was mocked or whether I have inadvertently broken something. Maybe it’s actually important that code avoid using “isEmpty”? Or do I just mock the isEmpty call and hope for the best? What if the existing mocked behavior for size() is non-trivial?
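A minimal sketch of how that plays out with Mockito (the interface X and its test are stand-ins I made up for illustration):

```java
import static org.mockito.Mockito.*;

interface X {
    int size();
    boolean isEmpty();
}

class SomeTest {
    void brittleMock() {
        X x = mock(X.class);
        when(x.size()).thenReturn(0);   // the original author stubbed "empty" this way

        // Old code under test:   if (x.size() > 0) { ... }   -> branch skipped, test green
        // Refactored code:       if (!x.isEmpty()) { ... }   -> isEmpty() was never stubbed,
        // so Mockito's default answer returns false and the branch now runs;
        // the test fails for reasons unrelated to any real regression.
    }
}
```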
Typically you’re not dealing with something as obvious.
What is the alternative? If you write a complete implementation of an interface for test purposes, can you actually be certain that your version of x.isEmpty() behaves as the actual method? If it has not been used before, can you trust that a green test is valid without manually checking it?
When I use mocking, I try to always use real objects as return values. So if I mock a repository method, like userRepository.search(...), I would return an actual list and not a mocked object. This has worked well for me. If I actually need to test the db query itself, I use a real db.
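Roughly what that looks like (UserRepository and User are stand-ins for the real types):

```java
import static org.mockito.Mockito.*;
import java.util.List;

record User(String name) {}

interface UserRepository {
    List<User> search(String query);
}

class UserServiceTest {
    void searchReturnsRealObjects() {
        UserRepository userRepository = mock(UserRepository.class);
        // the method is mocked, but the value it returns is a real List of
        // real Users rather than another mock
        when(userRepository.search("smith"))
                .thenReturn(List.of(new User("Ada Smith")));

        // the code under test can call size(), isEmpty(), stream(), ... on the
        // result with no further stubbing needed
    }
}
```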
For example, one alternative is to let my IDE implement the interface (I don’t have to “write” a complete implementation), where the default implementations throw “not yet implemented” type exceptions - which clearly indicate that the omitted behavior is not a deliberate part of the test.
Any “mocked” behavior involves writing normal debuggable idiomatic Java code - no need to learn or use a weird DSL to express the behavior of a method body. And it’s far easier to diagnose what’s going on or expected while running the test - instead of the backwards mock approach where failures are typically reported in a non-local manner (test completes and you get unexpected invocation or missing invocation error - where or what should have made the invocation?).
My test implementation can evolve naturally - it’s all normal debuggable idiomatic Java.
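For concreteness, a sketch of that kind of hand-rolled fake (again with made-up UserRepository/User types; the real interface would come from the code base and the IDE would generate the stubs):

```java
import java.util.ArrayList;
import java.util.List;

record User(String name) {}

interface UserRepository {
    List<User> search(String query);
    void save(User user);
    void delete(User user);
}

// Plain, debuggable Java: only the behavior the tests actually rely on is
// implemented; everything else fails loudly and locally.
class FakeUserRepository implements UserRepository {
    private final List<User> users = new ArrayList<>();

    @Override
    public List<User> search(String query) {
        return users.stream().filter(u -> u.name().contains(query)).toList();
    }

    @Override
    public void save(User user) {
        users.add(user);
    }

    @Override
    public void delete(User user) {
        // generated default: makes it obvious this behavior is not part of the test
        throw new UnsupportedOperationException("not yet implemented");
    }
}
```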
It doesn't have to be a breaking change -- an empty result set could still be allowed. It could simply be a perf improvement that avoids calling an expensive function with an empty result set, when it is known that the function is a no-op in this case.
If it's not a breaking change, why would a unit test fail as a result, whether or not using mocks/fakes for the code not under test? Unit tests should test the contract of a unit of code. Testing implementation details is better handled with assertions, right?
If the code being mocked changes its invariants, the code under test that depends on them needs to be carefully re-examined. A failing unit test will alert one to that situation.
(I'm not being snarky, I don't understand your point and I want to.)
The problem occurs when the mock is incomplete. Suppose:
1. Initially codeUnderTest() calls a dependency's dep.getFoos() method, which returns a list of Foos. This method is expensive, even if there are no Foos to return.
2. Calling the real dep.getFoos() is awkward, so we mock it for tests.
3. Someone changes codeUnderTest() to first call dep.getNumberOfFoos(), which is always quick, and subsequently call dep.getFoos() only if the first method's return value is nonzero. This speeds up the common case in which there are no Foos to process.
4. The test breaks because dep.getNumberOfFoos() has not been mocked.
You could argue that the original test creator should have defensively also mocked dep.getNumberOfFoos() -- but this quickly becomes an argument that the complete functionality of dep should be mocked.
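A sketch of step 4 with Mockito (Dep and Foo are the hypothetical types from the list above):

```java
import static org.mockito.Mockito.*;
import java.util.List;

record Foo() {}

interface Dep {
    int getNumberOfFoos();
    List<Foo> getFoos();
}

class CodeUnderTestTest {
    void staleMock() {
        Dep dep = mock(Dep.class);
        // all the original test needed, because getFoos() was all that was called
        when(dep.getFoos()).thenReturn(List.of(new Foo()));

        // After the refactor, codeUnderTest() calls dep.getNumberOfFoos() first.
        // It was never stubbed, so Mockito's default answer returns 0, getFoos()
        // is never reached, and the test fails: either on its assertions or,
        // under strict stubbing, with an UnnecessaryStubbingException for the
        // now-unused getFoos() stub.
    }
}
```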
Jumping ahead to the comments below: obviously, I mentioned `java.sql.ResultSet` only as an example of an extremely large interface. But if someone from outside the Java world starts building theories based on what the example leaves unsaid, they could, for instance, assume that such brittle tests are simply poorly written, or that they fail to mitigate Mockito's default behavior.
In my view, one of the biggest mistakes when working with Mockito is relying on answers that return default values even when a method call has not been explicitly stubbed, treating this as some kind of "default implementation". Instead, I prefer to explicitly forbid such behavior by throwing an `AssertionError` from the default answer. Then, if we really take "one method" literally, I explicitly state that `next()` must return `false`, clearly declaring my intent: the tests are written against exactly this described behavior, which in practice most often boils down to a fluent-style list of explicitly expected interactions. Recording interactions is also critically important.
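A sketch of that setup (using Mockito's `mock(Class, Answer)` overload; `doReturn`/`when` is used so the throwing default answer isn't triggered during stubbing):

```java
import static org.mockito.Mockito.*;
import java.sql.ResultSet;

class EmptyResultSetTest {
    void emptyResultSet() throws Exception {
        // any interaction the test did not describe fails immediately,
        // instead of silently returning a default value
        ResultSet rs = mock(ResultSet.class, invocation -> {
            throw new AssertionError("unexpected interaction: " + invocation);
        });

        // the single interaction this test declares: the result set is empty
        doReturn(false).when(rs).next();

        // ... run the code under test against rs ...
    }
}
```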
How many methods does `ResultSet` have today? 150? 200? As a Mockito user, I don't care.
Drives me crazy as I travel a lot. Even though I'm logged into Google, it's been tracking me for years, and it's still targeting me with obviously personalized ads, it will assume I've suddenly acquired temporary proficiency in French, German, Italian, Spanish, etc., stubbornly ignoring the language I used for my query, and the only language I've ever used in all my interactions with Google over the years. What the fuck would be so hard about just NOT using braindead geolocation (which doesn't even work in countries like Switzerland) when you know for certain who I am? You track all this information about me and use it to generate ad revenue but refuse to use my language preference?
Patrick Boyle eventually comes to a similar conclusion in his video about global population decline - https://youtu.be/ispyUPqqL1c?si=7jUgVBkOvLHluPAR - but includes lots of graphs and other interesting factlets.
* warning for Americans: not suitable for those offended by sarcasm
I don’t see the relevance to the topic. I could preface your list with something like “The monkey wrench is not the best tool for the following situations:”. It’s kinda vacuously true in a meaningless way but without expansion adds nothing to a discussion about the relative merits of monkey wrenches versus other similar tools like pliers or vice grips.
> Most local hackerspaces I visited are basically green and leftist queer safe spaces where adults run around with stuffed animals.
So what? You’re not being asked or expected to feel empathy - just show tolerance. Which is the easiest virtue to develop - just ignore behavior which doesn’t threaten you.
If someone is doing their own thing - wearing a MAGA hat, a rainbow t-shirt or carrying a fluffy toy - it doesn’t bother me in the slightest. Why does it bother you unless they’re getting in your face?
What is "so what" supposed to mean that isn't covered already by the exact sentence afterwards? Why even come at me with "tolerance" when that's exactly what I haven't been receiving, as I laid out in the paragraph you selectivity quoted. What's your point exactly?
> Important note: the strange magic requires the official git completion script.
I dislike the official git completion script because it’s so slow when working with large repos. On macOS - because of vagaries of the way its file system works - with large repos it’s effectively unusable. Hint to completion script writers: if your “smart” completion ever takes seconds, it’s worse than useless and far worse than “dumb” filename completion.