Have you actually done any web scraping at scale? The problem is never the web automation. It's getting past IP blacklists, rate limits, CAPTCHAs, etc., and a hosted service can provide solutions for those:
> Proxies included..., Auto Captcha Solving, Advanced Stealth Mode
Other than that, like everything else, a hosted service is always an option, and it doesn't conflict with your being able to host the service yourself; the two are just for different sets of constraints.
I have, and I've solved a lot of those problems. Yes, it requires additional plugins and services, but I prefer to own the solution (a must-have for my use case, though for someone with lower stakes a hosted solution is perhaps preferable to doing the engineering/research yourself).