In that they're both trying to be crawling frameworks; and sure, Scrapy supports browser-rendered (JavaScript) crawling via Splash, it's just not something I've needed or advocate for
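For the curious, it's only a few lines when you do need it. A rough sketch (assumes a running Splash instance and the scrapy-splash middlewares enabled in settings):

```python
import scrapy
from scrapy_splash import SplashRequest

class RenderedSpider(scrapy.Spider):
    name = "rendered"

    def start_requests(self):
        # wait= gives the page's JavaScript time to run before the
        # rendered HTML is handed back to the callback
        yield SplashRequest("https://example.com", self.parse, args={"wait": 1.0})

    def parse(self, response):
        yield {"title": response.css("title::text").get()}
```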
Scrapy also has a long lineage of extensions, which Crawlee may gain as it grows in popularity, but I didn't see any obvious way of decoupling its parts -- for example, if one wanted to plug a new storage engine into Crawlee (https://crawlee.dev/docs/guides/result-storage) -- whereas that delineation is very strong in Scrapy for all its moving parts
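For contrast, here's roughly what that decoupling looks like on the Scrapy side: a storage backend is just an item pipeline wired in via settings, so swapping engines never touches spider code. A sketch with illustrative names (not from any real project):

```python
import json
import sqlite3

class SqlitePipeline:
    def open_spider(self, spider):
        self.db = sqlite3.connect("items.db")
        self.db.execute("CREATE TABLE IF NOT EXISTS items (data TEXT)")

    def process_item(self, item, spider):
        # every yielded item flows through here, regardless of which spider produced it
        self.db.execute("INSERT INTO items VALUES (?)", (json.dumps(dict(item)),))
        self.db.commit()
        return item

    def close_spider(self, spider):
        self.db.close()

# settings.py: ITEM_PIPELINES = {"myproject.pipelines.SqlitePipeline": 300}
```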
Also, Parsel (the selector library powering Scrapy) is A++, in that it allows expressing one's intent via XPath, CSS selectors, and regex matches in a fluent API; I'm sure this Node.js framework allows doing something similar since it seems to be all-in on the DOM, but it for sure will not be `response.xpath("//whatever").css("#some-id").re("firstName: (.+)")` (note that `.re()` already returns the matched strings, so no trailing `.extract()` needed)
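For anyone who hasn't used it, a tiny standalone example with made-up HTML, just to show the chaining:

```python
from parsel import Selector

html = '<div id="some-id">firstName: Ada</div>'
sel = Selector(text=html)
# .xpath() and .css() chain freely; .re() runs a regex over the matched
# nodes and returns the captured groups as a list of strings
names = sel.xpath("//div").css("#some-id").re(r"firstName: (\w+)")
print(names)  # ['Ada']
```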
Further, as I mentioned -- and as someone pointed out elsewhere in this submission -- Scrapy can serialize its requests to disk, and it makes testing spider methods super easy since they're well-defined callback methods. If you have the HTML from a prior run and need to reproduce a bad outcome, testing just `parse_details_page` is painless. It may well be possible to test Crawlee code, too, but I didn't see anything mentioned about it
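Concretely, something like this (the spider and fixture path are hypothetical, but replaying saved HTML through `HtmlResponse` is a common Scrapy testing pattern):

```python
import scrapy
from scrapy.http import HtmlResponse, Request

class DetailsSpider(scrapy.Spider):
    name = "details"  # hypothetical spider; only the callback matters here

    def parse_details_page(self, response):
        yield {"title": response.css("h1::text").get()}

def test_parse_details_page():
    # replay HTML captured on a prior run against just the one callback
    with open("fixtures/details_page.html", "rb") as f:
        body = f.read()
    url = "https://example.com/details/1"
    response = HtmlResponse(url=url, body=body, encoding="utf-8",
                            request=Request(url=url))
    items = list(DetailsSpider().parse_details_page(response))
    assert items[0]["title"] is not None
```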
I don't know anything about this other than the announcement here and on reddit, so you'll likely want to post your question as a top comment so Jan can see it, or open a GH issue so they can help you evaluate