I found that AI models really have a hard time with native Android; they kept reverting to the latest libraries, which don't have broad support. I had more success with Flutter, but it still needs a decent amount of course correction.
You will still need clear benchmarks as the reward for RL. With chess, the rules are simple, but you may not have a clear loss function for a complicated architectural challenge.
That isn't true in the case of Israel. Almost half of the current government didn't complete the required army service. Some were rejected by the army for their extremist views, which in the past would also have barred them from elected office.
My experience of coding with LLMs is that the only thing they're really good at is generating boilerplate that they have more or less seen before (essentially a library, even if it is somewhat adapted). However, they are incapable of the creative thinking that developers regularly need to engage in when architecting a solution for their use case.
My experience is the opposite. When I started using Copilot I thought it would only be good at standard boilerplate, but I'm constantly surprised by how well it understands my completely convoluted legacy architecture, which I barely understand myself even though I'm the only contributor.
Understanding existing code is in its wheelhouse (provided the infrastructure feeding the existing code to the prompt is working well), but I believe that if you examine the totality of work a human programmer is involved in, an LLM is woefully behind in many areas (gathering proper requirements, potentially iterating or pushing back on requirements, architecting a solution on a macro level, and other gaps an LLM cannot fill).
I've experienced the parent's problem too: it gets "stuck", and that is a limitation of its learning loop. Humans are always asking why they got stuck and how to get unstuck; LLMs just power through without understanding what "stuck" is.
For explaining an existing corpus or algorithm, it does a fantastic job.
So we will likely see significant downward pressure on wages in "agency/B2B enterprise" shops.
Yes, this is all about risk tolerance. If a component on a website doesn't function as expected, it rarely kills people. Flight engineering should have the lowest risk tolerance possible. This is expensive, but necessary.