Thanks for that. Now that I've had a chance to read through it, a question:
The examples seem to be implemented in pure Rust. No one is going to port their Spark jobs to Rust in the short term. Have you evaluated perf with Python etc.?
If you're still seeing significant speedups, you might want to bottle this up and seek VC because a managed service along the lines of 'Databricks but 10x faster' would certainly get traction.
It is in a very initial POC stage and distributed mode is pretty basic, but it is moving faster than I expected. Python integration is definitely one of the primary objectives, as I suspect that no one is going to learn Rust for this, although I feel that it is not that hard. In fact, it can have a better integration story with Python than Spark, as Rust has good C interop. Regarding performance, yeah, it is pretty good from what I have seen for CPU-intensive tasks, and once blockmanager is implemented with compression and other optimizations as in Spark, shuffle tasks will also improve. There are more unnecessary allocations here than I would prefer, just to keep it in safe Rust as much as possible, and there are still plenty of optimizations possible here. I am doing this in my free time only. I feel that it is too early to compare with Spark given how many features Spark has. Maybe in a couple of months, after it matures a bit and if there is enough traction for this, we can look for sponsors.
This is an economy where content competes for clicks, not clicks competing for content. The author of that content wants me to see it; Medium doesn't want me to see it. I don't care enough to try to circumvent their arrangement.
Given the number of votes on my root comment, it seems neither do most people.
You can also use Reader Mode on Safari, which not only avoids the modals and popups but gets rid of the top and bottom bars as well. Long-click on the Reader Mode button and you can set it to always use it on medium.com.
I haven't actually monetized it. I think without Medium distribution, it will be limited to my followers. That is the only reason I switched on distribution. I have decided to just use GitHub for my future blog.
Stop hosting your content on a platform that holds it hostage so that it can make money off it without giving anything back to you.