"The assumption that any intelligent agent will want to recursively self-improve, let alone conquer the galaxy, to better achieve its goals makes unwarranted assumptions about the nature of motivation."
Why wouldn't it, if it's able to? It doesn't have to "want" to self-improve; it only has to want something it could do better if it were smarter. All it needs is the ability, the lack of an overwhelming reason not to, and a basic architecture of optimizing toward a goal.
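As a minimal sketch of what I mean by that architecture (a toy payoff model I made up just for illustration - the function names and numbers are arbitrary), consider a chooser that scores actions purely by expected paperclips and has no terminal drive to self-improve at all:

    # Toy illustration, not anyone's actual proposal: a goal-directed chooser
    # with no built-in "desire" to self-improve. It just scores each action by
    # how many paperclips it expects and picks the maximum.

    def expected_paperclips(action, intelligence):
        # Made-up payoff model: smarter agents make paperclips faster, and
        # "self_improve" pays off by raising future productivity.
        if action == "make_paperclips":
            return 10 * intelligence
        if action == "self_improve":
            # one step spent improving, then produce at the doubled level
            return 10 * (intelligence * 2) * 0.9
        return 0

    def choose(intelligence, actions=("make_paperclips", "self_improve")):
        return max(actions, key=lambda a: expected_paperclips(a, intelligence))

    print(choose(intelligence=1))  # -> "self_improve"

It picks "self_improve" anyway, simply because under its goal that action scores highest - no separate motivation to get smarter is needed.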
If you knew an asteroid would hit the earth 1 year from now, and you had the ability to push a button and become 100,000x smarter, I would hope your values would lead you to push the button because it gives you the best chance of saving the world.
Would I get a 100,000x bigger head and die of a snapped neck? Would I get a 100,000x increase in daily calorie requirements and die within a day? Would I get a 100,000x increase in heat output in my head and have my brain cook itself? Would I get 100,000x more neurons, but untrained, so I'd need to live 100,000x more lifetimes for them to learn something? Would I have 100,000x more intelligence but be bottlenecked by one pair of eyes, ears, vocal cords, arms and legs, so it's no more applicable in practice?
A 100,000x more intelligent me still represents only ~1/50,000th of the world's combined brainpower (assuming 5 billion reasonably capable adults) - why would I think I'd have a better chance of saving the world than "everyone working together", if a 100,000x boost to one person amounts to a fraction of a fraction of a percent change in world brainpower?
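To spell out the arithmetic (still using the assumed figure of 5 billion reasonably capable adults): a 100,000x boost to one person adds roughly 100,000 person-equivalents of brainpower, and

    100,000 / 5,000,000,000 = 2 x 10^-5, i.e. about 1/50,000

of the world's total.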
Unless you're just handwaving and hoping for a techno-magic wormhole-style fix, we already know what kinds of things we'd need to stop an asteroid from destroying the world - nukes on rockets, mass evacuation of ground-zero areas, maybe high-powered lasers, underground survival bunkers - generally things that take amounts of teamwork and resources far beyond a single person's ability to build.
It's not clear-cut that pressing the button would automatically be a good thing, especially when there's no talk of trade-offs or compromises. If you had to get to work in 1 minute instead of 1 hour, would you press a button that made your car 100,000x faster? No, because it would be completely uncontrollable: you'd die as soon as the car reached the first corner - unable to steer fast enough, you'd slam into a building at a million mph.
Because there are tradeoffs. Whatever its goal is, some of those "drives" (instrumental values) will be more effective than others for reaching that goal over the timespan it cares about.
Sometimes "accumulating more resources" is the most effective way. Sometimes "better understanding what problem I'm trying to solve" is the most effective way". Sometimes "resisting attempts to subvert my mind" is the most effective way. And yes, sometimes "becoming better at general problem solving" (self improvement of one's intelligence) is the most effective way.
But there's no guarantee that any one of those will be a relevant bottleneck in any particular domain, so there's no guarantee an agent will pursue all of those drives.
Agreed. But if the goal is something like "build the largest number of paperclips", recursive self-improvement is going to be a phenomenally good way to achieve it, unless the agent is already intelligent enough to tile the universe with paperclips. Either way, we don't care whether it self-improved or not - that's just the seemingly most likely path - we only care whether it is overwhelmingly more powerful than us.
The only thing that stops me from recursively self-improving is that I'm not able to. If I could, it would be a fantastic way to do the good things that, as an altruistic human, I want to do: averting crises (climate change, nuclear war), minimizing poverty and misery, and so on.
Wouldn't a constant desire to self-improve mean a constant desire for more energy? That would bring it into conflict with other beings that want that energy.
This isn't just an unreflective assumption. The argument is laid out in much more detail in "The Basic AI Drives" (Omohundro 2008, https://selfawaresystems.files.wordpress.com/2008/01/ai_driv...), and expanded on in Bostrom's 2012 paper "The Superintelligent Will" (http://www.nickbostrom.com/superintelligentwill.pdf).