It's pretty much letting people with GPUs become a sort of 'mini-cloud provider'. There's no job queue or fancy distributed computing setup. We just let you SSH into a container on someone's computer, and pay for the time used.
I was doing a lot of fun deep learning projects on the side & would often Venmo my friend who was mining Ethereum to use his GPU to train my models. He made more and I paid less than AWS spot instances or Paperspace.
This is just a fun side project hoping to let people who want to train their deep learning models do it cheaply (on other people's computers!)
I love the way this idea has evolved across threads and stimulated a great discussion! As it attracts attention, the limits of scaling it become clear, as does the need for well-balanced network incentives. This is one of those problems that actually does benefit from a blockchain token, and a few implementations are just emerging. I imagine a successful version of this concept will include homomorphic encryption or zero-knowledge proofs, in order to prove that unique processing took place. Value-added services seem like a natural fit as well. Check out OpenMined (https://github.com/OpenMined/Docs), Cardstack (particularly the recent post "The Tally Protocol: Scaling Ethereum With Untapped GPU Power" by @christse, https://medium.com/cardstack/the-tally-protocol-scaling-ethe...), Ocean Protocol, and, as others mentioned, SONM, iExec, Golem, and all the BOINC tokens (Pascal Coin). I look forward to this whole niche maturing.
> It's pretty much letting people with GPUs become a sort of 'mini-cloud provider'. There's no job queue or fancy distributed computing setup. We just let you SSH into a container on someone's computer, and pay for the time used.
I had the same idea a few days ago, but in my head the process would be wrapped up as a "cryptocurrency" where AI researchers pay real money and the "proof of work" is useful, real work. I ran into two issues regarding trust: the first is how do you verify that the hardware owner is running the real job and not NOOP'ing and sending back false results? The second is how do you protect the hardware from malicious jobs? GPUs have DMA access; how do you stop task submitters from rooting your box and recruiting it into an AI botnet (for free)? I ended up dismissing the idea, but if you could work out these two issues, there's money to be made...
> the first is how do you verify that the hardware owner is running the real job and not NOOP'ing and sending false results?
Consensus. Have _n_ nodes perform the same work (if it’s deterministic), and only accept (and pay) if all the nodes match - or at least the nodes that were part of the majority
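A minimal sketch of that accept-if-majority-agrees rule. All names here are made up for illustration, and real systems (BOINC, for example) layer much more policy on top of this:

```python
from collections import Counter

def settle(results):
    """Pay only the nodes whose result matches the majority.

    `results` maps a node id to the (hashable) result that node returned
    for one deterministic job. Returns (accepted_result, paid_nodes), or
    (None, []) when no strict majority of nodes agree.
    """
    tally = Counter(results.values())
    result, votes = tally.most_common(1)[0]
    if votes <= len(results) // 2:  # require a strict majority
        return None, []
    paid = [node for node, r in results.items() if r == result]
    return result, paid

# two honest nodes outvote one cheater; only the honest ones get paid
print(settle({"a": 42, "b": 42, "c": 7}))  # (42, ['a', 'b'])
```

The strict-majority check matters: with only two nodes and a disagreement, there is no way to tell who cheated, so nobody gets paid.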
I don’t think this would be considerably different from SETI or folding@home, which have been going on for around twenty years.
For my senior project in college, one of our ideas was a distributed render farm that operated like what we're talking about. There were some additional issues there (transferring assets, preventing node owners from extracting the output [say a studio was "letting" fans donate compute time to render a feature film], etc.).
If you can pay a single fully trusted node to do the calculation once, then the n untrusted nodes redundantly calculating the same result must be cumulatively cheaper than that one trusted node for there to be an economic incentive, no?
My assumption is that you'd have to put your faith in a low number of untrusted nodes for that to end up cheaper.
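A quick back-of-envelope with entirely made-up prices shows where that break-even sits:

```python
# Break-even check for redundant untrusted compute (prices are assumed,
# purely for illustration): n redundant copies only make economic sense
# while n * untrusted_rate stays below the single trusted node's rate.
trusted_rate = 3.00    # $/hr for one trusted cloud GPU (assumed)
untrusted_rate = 0.40  # $/hr for one untrusted peer GPU (assumed)

for n in (2, 3, 5, 8):
    redundant = n * untrusted_rate
    verdict = "cheaper" if redundant < trusted_rate else "more expensive"
    print(f"{n} copies: ${redundant:.2f}/hr vs ${trusted_rate:.2f}/hr -> {verdict}")
```

Under these assumed numbers, even 5-way redundancy undercuts the trusted node, but 8-way does not, which matches the intuition that the replication factor has to stay small.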
The cases of Folding@home and SETI are quite different, because there are institutions with an interest in funding those programs, in part because their goal is a public cause. The same clearly doesn't apply to micro-tasks, if you will.
But I can imagine cases in which you can accept bad actors giving bunk results for some percentage of the calculations you run. As long as you're rotating nodes often enough (provided they belong to distinct actors), I imagine it could work out to be economically more feasible to work around that bad data than to directly hire fully trusted compute power.
> Consensus. Have _n_ nodes perform the same work (if it’s deterministic), and only accept (and pay) if all the nodes match - or at least the nodes that were part of the majority
It would work well for problems that are computationally hard to solve, but easy to verify solutions for. Unfortunately, such problems are ubiquitous in cryptocurrency, but rare in machine learning.
Well, not really. Training a model is one such task: it's hard to train a network, but easy to verify that it performs well (training vs. inference).
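As a toy illustration of that asymmetry: verifying a returned model is a single cheap pass over held-out data, versus the many passes training took. The `model` callable, data, and threshold here are purely illustrative:

```python
def verify(model, val_set, threshold=0.9):
    """Cheap check of a claimed training result: score the returned
    model on a held-out set with one pass, accept if it clears a bar.
    `model` is any callable mapping an input to a predicted label.
    """
    correct = sum(1 for x, y in val_set if model(x) == y)
    return correct / len(val_set) >= threshold

# toy example: a "model" that gets 3 of 4 held-out points right
val = [(1, 1), (2, 4), (3, 9), (4, 15)]
square = lambda x: x * x
print(verify(square, val, threshold=0.7))  # 3/4 = 75% >= 70% -> True
```

The catch, as the next comment points out, is that cheap verification alone doesn't make training a usable proof of work.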
The real problem here, I believe (and I've seen this idea pop up several times on Hacker News), is that almost no machine learning tasks are progress-free.
If the cryptocurrency is just paid out to whoever solves the task first in a problem that isn't progress-free, then the person with the fastest GPU would mine all the coins and nobody else could participate. One of the key ideas behind proof of work is that if two people have the same compute and person A has a head start, then as long as A hasn't succeeded by the time B starts, they both have the same probability of mining a block.
People seem to be just jumping on the crypto bandwagon and trying to come up with "useful" proof of work, but it's a pretty difficult task.
Consensus for every computation would be at least 2x as expensive, but you may be able to achieve something like it by randomly assigning 10% of the calculations to be double-checked, and double-checking more (all?) of a node's computations if it ever returns an inconsistent result.
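A sketch of that spot-check policy. The function name, the `(job_id, node_id)` representation, and the 10% rate are assumptions for illustration:

```python
import random

def pick_audits(jobs, flagged=(), rate=0.10, rng=random):
    """Select which jobs to recompute on a trusted machine.

    `jobs` is a list of (job_id, node_id) pairs. Every job from a node
    that previously returned an inconsistent result (`flagged`) gets
    rechecked; the rest are sampled at roughly `rate` (e.g. 10%).
    """
    audits = []
    for job_id, node in jobs:
        if node in flagged or rng.random() < rate:
            audits.append((job_id, node))
    return audits

# a flagged node gets all of its work rechecked, regardless of the rate
print(pick_audits([(1, "bad"), (2, "good")], flagged={"bad"}, rate=0.0))
```

The nice property is that the expected overhead stays near 10% for honest nodes, while a single caught inconsistency escalates a node to full scrutiny, which is roughly how BOINC's adaptive replication treats new or suspect participants.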
BOINC has quite a sophisticated system, but it's a long time since I looked at the details. I believe new participants are subject to greater scrutiny.
> Consensus. Have _n_ nodes perform the same work (if it’s deterministic), and only accept (and pay) if all the nodes match - or at least the nodes that were part of the majority
I’m under the impression that proof of work that verifies the authenticity of transactions on a blockchain must depend on those specific transactions as its input. If the work has other uses unrelated to securing those specific transactions, then the fact that you performed the work says nothing about their authenticity.
> how do you stop task submitters from rooting your box and recruiting it into an AI botnet (for free)?
The only real way to do this is to run the job in a VM with a GPU and CPU+motherboard that support passthrough (read: not consumer Nvidia GPUs; your CPU and board must support an IOMMU, and your card cannot freak out when being reset after initialization).
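On Linux you can get a rough sense of whether passthrough is even viable by checking how devices fall into IOMMU groups (a device can generally only be passed through cleanly if its group is isolated). A sketch that just reads `/sys`; the path is Linux-specific, and an empty result usually means the IOMMU is disabled or unsupported:

```python
from pathlib import Path

def iommu_groups():
    """Map each IOMMU group number to the PCI addresses inside it.

    Reads /sys/kernel/iommu_groups (Linux only). An empty dict usually
    means the IOMMU is off in firmware or missing kernel parameters.
    """
    groups = {}
    root = Path("/sys/kernel/iommu_groups")
    if root.is_dir():
        for grp in root.iterdir():
            devices = grp / "devices"
            groups[grp.name] = sorted(d.name for d in devices.iterdir())
    return groups

for grp, devices in sorted(iommu_groups().items()):
    print(f"group {grp}: {', '.join(devices)}")
```

If the GPU (and its audio function) share a group with other devices, passthrough typically requires passing the whole group through, which is exactly the consumer-hardware pain the comment alludes to.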
Saw a panel with the Golem people just last night, and sure enough this question came up. The short answer is that they don't have a solution yet, and IMO their thinking was no more advanced than what I'm seeing in this thread.
How is their problem substantially different from this project's? Apart, of course, from the overhead and complexity caused by trying to force what should probably just be a centralised service onto a blockchain.
Don't get me wrong, I think it's a great idea. I just don't see why it needs a blockchain and all the associated trustless infrastructure. Even nicehash doesn't bother with all that.
I suppose the approach by the Golem team is substantially different because of the ideology associated with it.
What you see as “trying to force what should probably just be a centralized service,” I see as “innovating a new approach to powering decentralized architecture.”
I’m not saying you’re wrong. It would be easier to solve the problem using existing toolsets and more mature protocols. Yet I’m pretty sure that the Golem team is doing something right. So there’s that. Maybe this isn’t a zero-sum thing.
> I’m pretty sure that the Golem team is doing something right
I'm not at all convinced that the golem team have any particular insight to solve this obvious and common problem that everyone else doesn't have. And frankly I think that the overhead of running unnecessary infrastructure will render them price-uncompetitive to any reasonable centralised provider. In short, I predict they will fail.
But eh, they raised USD$8m and I didn't, so what do I know.
The Golem team doesn't necessarily need to solve every problem. Being built on top of the Ethereum Network is advantageous. If they make an appealing, open platform with potential, maybe other developers will pick up the ball and run with it to power their own ends.
> I guess that's why we're sitting in different camps.
Indeed, doesn't mean I don't want to hear the other side's point of view though!
Open source is not going to save them. They have one main problem - how to tell if people did the work they claim they did? If centralised, they can "test" new users or perhaps periodically check up on long term users by secretly allocating duplicate work and verifying its content. How can you do that in public? The blockchain is actually working against them.
And who really needs a cryptographically secure attestation that on March 25th, 2018, user XYZ completed ML shard 456.7? This is a level of audit logging appropriate for a bank and basically nothing else. All you need is availability accounting of some sort. It's not rocket science. I couldn't write the client for this app, but I sure as hell could write the back end, and I wouldn't even think of using a blockchain. Make no mistake, their choice of technologies is for buzzword compliance, not technical necessity - a very bad sign.
There is also no need for the GNT. It solves no problem and users could just as easily be compensated in ETH or anything else. Sure, it's a funding mechanism, fine. We still haven't figured out how ICOs should even work.
Despite all the rigmarole, they have a product they need to sell like any other startup: rent us your GPU/CPU for $x/hr. Because of their overhead, I predict they will easily be outcompeted by centralised providers. People are not going to use golem over another, better-paying alternative just from the goodness of their hearts. And I cannot see any way how golem can be structurally more efficient than a centralised solution.
All said, I'm not as optimistic as you. Not like I want them to fail though, good luck to them!
Well I sure do appreciate you going out of your way to explain your perspective to me.
Just a couple more responses:
1. The use of blockchain is not for buzzword compliance. Julian (CEO) is a longtime Ethereum supporter/developer. This project has really been in development since 2014 or so, long before blockchain was "buzzy." So the use of blockchain here is not for grabbing cash. They actually think it's a better (perhaps harder) way forward.
2. Not only does the token allow investors to directly invest in the project, it also allows developers to "print" tokens that can be locked behind smart contracts. That way developers can be rewarded for reaching project goals with bowls of their own dog food. Not bad to eat when it's pretty much "real" money.
3. The decentralized and distributed nature of the project will allow the Golem Network to achieve goals and execute code that no centralized competitor could achieve/run. I'll leave it as a thought exercise for you to speculate what those goals/codes might look like.
Thanks for the engagement. It's great to test my beliefs through debate. Time will be the true arbiter here, though. Best wishes.
Instead of having the other party SSH into a VM on the user's machine, potentially exposing most of the user's codebase, have you considered spinning up temporary containers on your back end and having contributors install something like remote CUDA or remote OpenCL? That way only the GPU kernels are transferred to the contributor, whose client software polls a network queue to see which kernel should be run and where the results should be sent.
Good idea from the perspective of not exposing the codebase. However, technologies such as remote CUDA/OpenCL, which rely on remote execution of compute kernels, generally require high-bandwidth, low-latency connectivity. This is especially true for deep learning / AI workloads, though not necessarily for other applications that have a higher compute-to-data-transfer/synchronization ratio. The latency of a typical internet connection will likely stall the GPUs on a remote system, yielding little compute benefit.
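A rough utilization estimate (all numbers assumed, for illustration only) shows why per-kernel round trips over the internet hurt so much:

```python
# If the host must wait one network round trip per kernel launch, the
# GPU's busy fraction is roughly kernel_time / (kernel_time + rtt).
kernel_ms = 2.0    # assumed runtime of a small training kernel
lan_rtt_ms = 0.1   # assumed round trip on a local machine/LAN
wan_rtt_ms = 50.0  # assumed round trip over a typical internet link

for name, rtt in (("local", lan_rtt_ms), ("internet", wan_rtt_ms)):
    util = kernel_ms / (kernel_ms + rtt)
    print(f"{name}: GPU busy ~{util:.0%} of the time")
```

With these made-up numbers the GPU stays roughly 95% busy locally but only about 4% busy over the internet, which is why remote-kernel schemes need aggressive batching or pipelining of launches to be worthwhile.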
I think this is a great business idea with a lot of potential. If you can address the reliability, scalability, and trust problems, you might have a really large business opportunity here. I'm really impressed with your pragmatic and simple (for the user) implementation; I think you did a great job identifying a minimal set of useful features and building it to validate the idea. Congratulations! How long have you been working on this, if I may ask?
I somehow missed the paragraph where you talked about building the platform and the link to the service. This looks awesome, then.
As somebody who's spent a silly amount of money on EC2 spot instances to train models, I would certainly overlook the odd dodgy result for access to those GPUs at those prices.
I just hope you find a way so that the ingenious but disreputable people that seem to come when money's to be gained don't ruin it for everyone. However, I wish you every success.
I imagine you could do some kind of hardware fingerprinting, but there's nothing stopping a really bad actor from modifying the kernel to pretend to have a GPU and return NaN on allocation. I suppose I'm descending into absurd levels of distrust over scenarios that may never happen.
I also foresee annoying customers who say they only get back NaNs when it's really down to instabilities in their own training, flooding whatever bad-actor reporting you have.
I don't believe either is actually terminal, given the right incentives.
Nice work! I wouldn't let this minor objection keep you down. You can always spot check a computation on a trusted system (e.g. your own) and update your trust accordingly.
Thank ya! Plus, the way it's set up right now, you don't pay for anything until you're done and satisfied with a session! I just want both parties to be happy with the GPU compute transaction :)
This looks awesome, I just submitted a hosting application. I only have a single GTX 1060 on a Ryzen board, but I only use it 3-4 hours per day and I'm good with its downtime being used for passive income. Hopefully someone will find it useful.
One question: I noticed you only pay in crypto right now. Do you plan to offer USD or other fiat currencies in the future? Crypto isn't a problem for me (I don't mine it myself, but I wouldn't be opposed to carrying a passively obtained *coin balance and watching it appreciate over time); just curious.
Anyway, I think you have the makings of a nifty project here. Good luck!
I was thinking about paying out in fiat, but crypto is so much easier because of no fees, instant transactions, and not having to deal with various currencies.
While something like Stripe Connect may be useful, the fees are unreasonable for smaller transactions. A quick hack to cash out to crypto is to use your Coinbase wallet address as the payout address, and just sell off the crypto the moment it hits your wallet.