This isn't exactly a "statistics" problem so much as a data science problem, but... Personality tests are usually framed as psychological instruments, but the core problem is statistical. They assume responses are noisy observations of a stable latent variable, when in practice they're samples from a context-dependent decision process. Respondents condition on incentives, infer what's being rewarded, and optimize accordingly, so the data-generating process shifts with each setting. From a statistical perspective this breaks identifiability: multiple latent personalities can produce the same observed responses, and the same personality can produce different responses under different incentive structures. Because those incentives are unobserved or only weakly observed, the model is fundamentally misspecified. More samples don't converge on truth; they converge on a biased estimate of a strategic equilibrium rather than the latent trait itself.
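The identifiability failure is easy to see in a toy model. This is a hypothetical sketch, not anything from actual psychometrics: assume a respondent's answer is their latent trait pulled partway toward whatever they perceive as rewarded. The `respond` function and the 0.5 shift are illustrative assumptions.

```python
# Toy model (hypothetical): a response is the latent trait shifted
# partway toward the answer the respondent believes is rewarded.
def respond(latent_trait: float, perceived_reward: float) -> float:
    # 0.5 is an arbitrary "strategic pull" strength for illustration
    return latent_trait + 0.5 * (perceived_reward - latent_trait)

# Same latent trait, different incentive contexts -> different data.
honest = respond(2.0, perceived_reward=2.0)  # no pressure: answers 2.0
hiring = respond(2.0, perceived_reward=4.0)  # "be a collaborator": answers 3.0

# Different latent traits, different incentives -> identical observed data.
a = respond(2.0, perceived_reward=4.0)  # 3.0
b = respond(4.0, perceived_reward=2.0)  # 3.0

print(honest, hiring, a, b)  # 2.0 3.0 3.0 3.0
```

Since the incentive term is unobserved, the pairs (trait=2, reward=4) and (trait=4, reward=2) generate the same response, so no amount of data distinguishes them.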
On paper, I'm a real collaborator, because that is what I think the test should reveal about me. In reality, I just don't wanna deal with collaboration and I'd rather work alone. Can this be measured? Probably not, but it does present an odd paradox for data scientists to solve.
I must be overly idle during this storm and reading HN too much, because this is the second day in a row that I've opened HN and seen you breaking the guidelines to say "Buy my book!", which, as I said yesterday, is not appreciated here.
The actual rule-breaking today is your posting of AI-generated content. 'Tis not allowed.
So let me just pre-empt tomorrow's chastisement by giving general advice: Don't use HN as a marketing channel to sell your stuff. Instead, engage with the community, discuss interesting things, and when a topic comes up where your work is truly relevant, feel free to post a link. Once in a while. You'll be welcomed if you do so, not so much if you spam your book at us.
Agreed. I don't think AI adoption should be strictly driven by people (executives) who are too far removed from the development process to have a clear understanding of how effective the tools are or of their failure modes. Some of the consternation might be coming from the people working on the code directly, which is probably a healthy feedback loop that should not be ignored.
I am a professional LLM code tester. I cannot say much more due to my NDA, but if you are very concerned, I would run it in a Docker container at the very least. That said, I highly suggest reviewing the code carefully before running it. Even a quick look at, say, the Python imports might tell you what the code can potentially do: seeing it import the os module is a hint. I've run advanced code agents in VS Code with their unrestricted-access settings enabled, but inside a Docker container. At first it was scary; then I started using that time to refill my coffee. My computer is still running fine.
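That import-scanning step can be automated before you run anything. A minimal sketch using Python's standard `ast` module; the `suspicious` set is my own illustrative choice, not an authoritative list of dangerous modules:

```python
import ast

def imported_modules(source: str) -> set:
    """Collect the top-level module names a piece of Python code imports."""
    tree = ast.parse(source)
    mods = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module.split(".")[0])
    return mods

# Modules worth a closer look before running untrusted code (illustrative set)
suspicious = {"os", "subprocess", "socket", "shutil"}

code = "import os\nfrom subprocess import run\nimport json\n"
found = imported_modules(code)
print(found & suspicious)  # {'os', 'subprocess'}
```

This only catches static imports, of course; anything using `__import__` or `importlib` at runtime slips past it, which is another reason to keep the container around.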
Most web apps would be just as good, if not better, if they ran on my machine like native apps. I use several different browsers for several different purposes, and they are almost always near the top of the list in terms of memory use. I constantly clear my cache, but doing that often is a chore and sometimes I lose useful web-app state.
With AI agents, I do think apps that manipulate the DOM so heavily that they obscure access to their data will hinder agent discovery of that data. If an agent is looking at your website, it cannot see what you do not show it. That makes your site less visible to agents and reduces the likelihood that they reference it in their responses. Pure HTML sites are very kind to agents and let them generate better answers with attribution. Think of this as AI optimization for your website. Some regression to pure HTML will be the norm in the near future. Bottom line: build apps that run on machines and use APIs to feed them, but build pages for the web with the intent of having AI agents discover them and have a good shot at using the data.
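The difference is stark if you compare what a non-JS-executing fetcher extracts from a static page versus a typical SPA shell. A minimal sketch using Python's standard `html.parser`; the two HTML strings are made-up examples:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> contents, the way a
    simple non-rendering crawler or agent fetcher might."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if not self._in_script and data.strip():
            self.chunks.append(data.strip())

def visible_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

# A plain HTML page: the data is right there in the markup.
plain = "<html><body><h1>Widget specs</h1><p>Weight: 2kg</p></body></html>"
# A typical SPA shell: content is rendered client-side, invisible here.
spa = "<html><body><div id='root'></div><script>render()</script></body></html>"

print(visible_text(plain))  # Widget specs Weight: 2kg
print(visible_text(spa))    # (empty: nothing for the agent to index)
```

An agent that doesn't execute JavaScript gets everything from the first page and nothing from the second, which is the whole argument for keeping data in the markup.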