1. How does this system deal with the "data withholding" problem? In other words, when people provide "storage power" their data will be repeatedly sampled to make sure it is available... but when an entity claims that samples aren't being provided as required by the protocol, how does the system determine that that person wasn't lying, if the sampled data is still provided correctly in a followup request? If the answer is "through arbitration", what prevents the arbitration system from being DDoSed?
2. The "verified clients" are certified by "a decentralized network of verifiers". How does this system prevent a Sybil attack, i.e., how does it prevent verifiers from repeatedly verifying themselves using multiple accounts?
3. I notice this system doesn't mention the use of erasure coding, which is a common feature of similar schemes by other projects. Why is it that erasure coding isn't necessary in this system? In other words, if data is randomly sampled, how does a client make sure 0.001% of the data isn't missing if only 99.999% or less of the data has been sampled so far?
4. The Filecoin organization has a ton of funds due to their successful ICO. This makes it hard for users of the Filecoin network to know if it is truly scalable (since the Filecoin org could just run a bunch of anonymous server farms with their funds that provide free storage to paper over flaws in the cryptoeconomic incentives). How can a user of Filecoin get some assurance that the files they are storing aren't just sitting on a server run by the Filecoin organization & are truly running on a decentralized system functioning through the specified cryptoeconomic mechanism?
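To put a number on the worry in question 3, here is a minimal sketch of the sampling arithmetic, assuming samples are independent and uniformly random over the data (an assumption for illustration, not a description of how Filecoin actually checks storage). The function names are hypothetical. If a fraction f of the data is missing, n samples all miss the gap with probability (1 - f)^n, so catching a 0.001% gap with 99% confidence takes roughly 460,000 samples.

```python
import math


def miss_probability(missing_fraction: float, samples: int) -> float:
    """Chance that `samples` independent uniform samples all land on present data."""
    return (1.0 - missing_fraction) ** samples


def samples_needed(missing_fraction: float, confidence: float) -> int:
    """Smallest sample count that hits the missing part with probability >= confidence."""
    # Solve (1 - f)^n <= 1 - confidence for n; log1p keeps precision for tiny f.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-missing_fraction))


if __name__ == "__main__":
    f = 0.00001  # 0.001% of the data is missing
    n = samples_needed(f, 0.99)
    print(f"samples needed: {n}, residual miss probability: {miss_probability(f, n):.6f}")
```

This is why pure sampling schemes tend to lean on erasure coding: with coding, a provider has to drop a large fraction of the encoded data before anything is actually lost, and a large fraction is cheap to detect.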
To answer 1.), the data is not sampled to prove it's there. Instead, the miner must provide a continual cryptographic proof (called proof of replication) to prove he has stored the data over a given period of time.
https://filecoin.io/blog/filecoin-proof-system/
Hey, great questions! I'm on the Filecoin team (but not the proofs/cryptography team, where the root answers to some of your questions lie). Let me try a first pass and tap out for someone deeper on the cryptography side if needed.
> How does this system deal with the "data withholding" problem? In other words, when people provide "storage power" their data will be repeatedly sampled to make sure it is available... but when an entity claims that samples aren't being provided as required by the protocol, how does the system determine that that person wasn't lying, if the sampled data is still provided correctly in a followup request?
Filecoin sort of splits this problem into two parts – "data withholding" from Filecoin's proof-of-spacetime consensus mechanism (a "storage fault" in Filecoin terminology, yes I know there's a lot of new terminology here!), and "data withholding" from a client that's requesting stored data.
Storage miners are required to prove to the network itself, not to any specific challenger entity, that they're storing files. Each storage miner is (basically) randomly challenged once per [short interval] to provide a compressed cryptographic proof in response to a challenge. The proof conclusively confirms that, during that period, the miner was storing the data they'd previously promised to store. You can ctrl-f "if a miner goes offline" in the linked post for a surface-level description of how the network deals with storage faults. Ditching the data and recovering it later is economically irrational for pretty involved reasons – basically, recovery is more expensive than just storing the data over the short-ish intervals during which faults are recoverable.
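For readers who want a feel for "prove you hold the data in response to a random challenge", here is a toy sketch using a plain Merkle tree. To be clear, this is not Filecoin's proof-of-replication or proof-of-spacetime construction (those are far more involved and compress the proof); it only illustrates the general challenge/response shape. All names are hypothetical, and the sketch assumes a power-of-two number of chunks.

```python
import hashlib
import secrets


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return all tree levels: level 0 is the chunk hashes, the last level is [root]."""
    level = [h(chunk) for chunk in chunks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(chunks, levels, index):
    """Prover side: return the challenged chunk plus the sibling hashes up to the root."""
    path = []
    i = index
    for level in levels[:-1]:
        path.append(level[i ^ 1])  # sibling of the current node
        i //= 2
    return chunks[index], path


def verify(root, index, chunk, path):
    """Verifier side: recompute the root from the chunk and path; needs only the root."""
    node = h(chunk)
    i = index
    for sibling in path:
        node = h(node + sibling) if i % 2 == 0 else h(sibling + node)
        i //= 2
    return node == root


if __name__ == "__main__":
    chunks = [bytes([i]) * 32 for i in range(8)]  # stand-in for stored data
    levels = build_tree(chunks)
    root = levels[-1][0]                          # the only thing the verifier keeps
    challenge = secrets.randbelow(len(chunks))    # unpredictable chunk index
    chunk, path = prove(chunks, levels, challenge)
    print("proof accepted:", verify(root, challenge, chunk, path))
```

The point of the sketch is that the verifier only needs the small root commitment, and a prover who has thrown the chunk away cannot produce a chunk that hashes up to that root.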
When it comes to "withholding data" from clients – retrieval on Filecoin is just a market-based system for bandwidth. The solution to holding data "hostage," i.e. refusing to serve it at reasonable prices, is to store a few replicated copies (just like centralized storage services do for you today behind the scenes). There's really no upside to miners refusing to profitably serve you a file when they know or suspect you can get it from another source.
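A back-of-the-envelope sketch of why a few replicas blunt the "hostage" problem, assuming each miner independently refuses to serve with the same probability (a simplifying assumption, and the function name is hypothetical):

```python
def all_copies_withheld(p_withhold: float, copies: int) -> float:
    """Probability that every one of `copies` independent miners refuses to serve."""
    if not 0.0 <= p_withhold <= 1.0 or copies < 1:
        raise ValueError("need 0 <= p_withhold <= 1 and copies >= 1")
    return p_withhold ** copies


if __name__ == "__main__":
    for copies in (1, 2, 3):
        print(copies, all_copies_withheld(0.1, copies))
```

Under that assumption the chance of being locked out shrinks geometrically with each extra copy, which is also why a single miner has little leverage when asking for an unreasonable retrieval price.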
> The "verified clients" are certified by "a decentralized network of verifiers". How does this system prevent a Sybil attack, i.e., how does it prevent verifiers from repeatedly verifying themselves using multiple accounts?
The short answer here – with apologies for the brevity; details forthcoming – is that verified data isn't meant to be scarce, and some degree of over-verification is expected. There will be a decentralized group of folks responsible for (quite permissively) verifying and renewing clients for fixed amounts of data, and declining to renew allocations for clients who seem to be abusively verifying data. We're optimistic that this will dramatically decrease the rate at which "fake" data is stored and (most importantly) ensure that there's always storage available for client data.
> Why is it that erasure coding isn't necessary in this system?
Basically: cool, novel cryptography! In particular, this is where proofs-of-replication and proofs-of-spacetime kick in. Check out this podcast with Juan to learn much more: https://filecoin.io/blog/filecoin-proof-system/
(Also – if you like erasure coding, it is totally compatible with Filecoin whether you're a miner or a client! I would be surprised if this feature isn't developed by the community in Filecoin's early days.)
> How can a user of filecoin get some assurance that the files they are storing aren't just sitting on a server run by the filecoin organization?
Really fair question. First and foremost, as a client, you get to choose your storage miner if you want to. You then have to solve another problem, of course, which is how to map a Filecoin peer ID to a real-world actor (or prove that it's not being run by Filecoin, or whatever). This is solvable in a bunch of different ways, which I won't get into here, but the high-level takeaway is that you're not just throwing your data at an undifferentiated storage interface with obscure inner workings.
More fundamentally – Filecoin is part of a huge ecosystem of open source projects. Transparency is a key value – highlighting the success of the community, including the many decentralized storage miners participating in Filecoin, is really important to us and the only way the network can succeed. You can hop on our Slack any time (https://join.slack.com/t/filecoinproject/shared_invite/zt-dj...) to chat with the many folks already building on Filecoin. If you have other ideas on how we can establish that there are lots of groups operating on the network, not just us, let us know and I'll see what we can do :)
Well even if people use ipfs.io or infura gateways, the content served via IPFS is not being "served" from those central servers, just cached, so it's not exactly the same situation.
This really looks like a solution in search of a problem. The cloud storage market is already pretty efficient and we are expected to believe that adding a bunch of inefficiency and complexity will result in a better product?
This system is going after the dream where everyone can earn extra money by letting other people use spare space on their hard drives. In theory, I think it could offer "cloud storage" at orders of magnitude lower cost than other offerings. This would certainly be desirable.
The problem is, people have been trying to build this sort of system for many years, so any proposed system is likely to fail when put into operation & therefore has an enormous burden to prove it has overcome the many fatal flaws that existed in previous systems.
It doesn't have to be making money, it can just be saving it. The idea behind many of these protocols is that people with bursty traffic should be able to exchange bandwidth over time for peak heavy traffic during exceptional events.
Be that running an obscure website that gets flash traffic, or pushing out backups you hope you never need, until you need them all at once.
So the question is does your cloud provider end up charging you for your average traffic per year or your peak traffic, and that depends on base rates and overage rates. Even if protocols like this just exert downward pressure on those rate tables, we all win.
The problem is likely an insurmountable one - If you can offer storage to someone on your spare capacity in such a way that you make any money at all from it, it's likely that the cloud vendors with their economy of scale and cheap electricity can undercut you and capture the market.
Unfortunately for the dream, centralisation offers efficiencies and reliability that are hard or impossible to match.
Yep, we fell for the "good products sell themselves" meme. :/ As engineers, we felt uncomfortable promising the moon like so many other ICOs did (and still do). Blockchain-wise, the system we built was actually very conservative: aside from the file contract payment channels, our consensus design doesn't stray far from Bitcoin. As it turns out, the blockchain is the easy part! The hard part is, once you've got your network of hosts, how do you turn that into a truly great product? Sia is crazy cheap, it streams video at 1Gbps, it's got plenty of 9s, but those metrics don't mean anything in a vacuum. We've finally hit upon a more user-friendly product in Skynet, and we're doing a lot more PR this time, but we've still got a long way to go.
I really wish you well. Makes me consider actually trying Sia for our backups, which would make this the first time I or anyone else I know has actually used a block chain product in production... or for much of anything for that matter.
The block chain world is absolutely overrun with over-funded over-specced over-engineered boil the ocean projects that will never ship anything usable, solutions in search of problems, stuff that just plain doesn't work for fundamental math/algorithmic reasons, and of course a ton of outright scams. All of these are loud and lean heavily on the hype because they have nothing else. Just keep working and you will emerge once the dust settles. Not saying forget about marketing, but focus more on product.
I don't really want to hijack this discussion by discussing merits (or lack thereof) of other cryptocurrencies. Maybe post a link on HN for the cryptoeconomics of this other project and then we can discuss it there?
You opened the door to this avenue of discussion by implying or suggesting by omission or dismissal that no successful implementation currently exists. The existence and purported demonstration of work of Sia is not yet proven. I’ve never heard of Sia before now, and I will research it now that I know that it’s a thing to study. But I digress.
It’s contrary to parliamentary procedure and rules of debate to make a statement in order to back up your own point and not admit counterexamples and discussion of said statement and its implications as well as its intent and factual basis in the current reality we all share, regardless of which parts of it are presently up for debate.
You would have us not be informed about relevant contextual information regarding a point you yourself brought up, just so we don’t derail a discussion you yourself are replying to? We’re all valid commenters with valid comments provided we have something relevant to say.
It's meant for private encrypted backups instead of public distributed files, so it is less complicated. The users just pay the people who store it through a contract.
I have many personal files and want to make sure they are accessible for the rest of my life. Please tell me a platform that can provide this solution and why I can trust it?
Depending on the size of your files, burn a bunch of duplicate DVDs or BDs. They offer the best tradeoff between cold-storage lifetime and equipment/money investment. Period. Store those disks in a fireproof bag or box and they're pretty much there for a lifetime.
This is definitely true; most cheaper discs degrade after a few years.
The issue you're forgetting to mention is that physical DVD drives won't be around for more than a few more years, so you then need to make sure you have one in working order on a system that can use it.
For people who don't pay too much attention to storage: AWS S3 is a technical wonder, bringing down the price of storage to practically unbeatable levels. It's really a jewel in the crown of the AWS service portfolio. In terms of the amount you get metered down to the penny, it may be the single most impressive thing on AWS, honestly.
If it costs me $1 to store my stuff per month on S3, and you reduce that... so what? It's so cheap it's not going to help my wallet much. This isn't like going from $100 to $2- or even $50 to $20. It's going from like $10/year to... I dunno, $4. I might as well stay on AWS.
That leaves the enterprise market, which, naturally, loves S3 100x more than the random individual, because S3 is a solid enterprise choice, and will always have the enterprise advantage, by a crushing margin.
AWS is itself hard to compete with, but of all the services you could compete with on AWS, S3 is probably the worst. So you're going up against the worst of the worst, here. I can't say it's impossible, but it's like an extra double hard market to compete in, in an already tough market.
just because something exists in some form today and is decentralized doesn't mean it will stand the test of time. the internet is a graveyard of abandoned and failed projects and protocols.
Arweave has developed a new type of blockchain based on Moore’s Law of the declining cost of data storage. Users pay upfront (one time payment) for a hundred years of storage at less than a cent per megabyte, and the interest that accrues will cover the dwindling storage cost forever. More than one million pieces of data are now stored on the permaweb...
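The "pay once, store forever" claim rests on a geometric series: if the yearly cost of storing a byte keeps falling by a fixed fraction, the total cost over all future years is finite. Here is a minimal sketch of that arithmetic. The decline rate and first-year cost below are made-up illustrative numbers, not Arweave's actual pricing parameters, and the sketch ignores the interest on the endowment mentioned above.

```python
def perpetual_storage_cost(first_year_cost: float, annual_decline: float) -> float:
    """Total cost of storing forever when the yearly cost shrinks by `annual_decline`.

    Sum over t >= 0 of first_year_cost * (1 - annual_decline)^t
    equals first_year_cost / annual_decline.
    """
    if not 0.0 < annual_decline <= 1.0:
        raise ValueError("annual_decline must be in (0, 1]")
    return first_year_cost / annual_decline


def storage_cost_over(years: int, first_year_cost: float, annual_decline: float) -> float:
    """Same model, summed explicitly over a finite number of years."""
    return sum(first_year_cost * (1.0 - annual_decline) ** t for t in range(years))


if __name__ == "__main__":
    first_year, decline = 1.0, 0.30  # hypothetical: 1 unit in year one, 30% cheaper each year
    print("100 years:", storage_cost_over(100, first_year, decline))
    print("forever:  ", perpetual_storage_cost(first_year, decline))
```

Under a steady decline, a hundred years and forever cost almost the same. The whole scheme therefore hinges on whether storage costs really keep falling at the assumed rate.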
Can someone shed light on the relation between Filecoin and IPFS? My initial thinking was that Filecoin is the incentive layer to pin IPFS content but I see no mention of IPFS on the provided link. Did Filecoin and IPFS separate or did Filecoin generalize their "offering"?
There's no direct relationship between the two currently. They share a lot of underlying technology but Filecoin doesn't directly incentivize IPFS, and IPFS itself isn't directly used by Filecoin either.
The basic reason is that the protocol to request a file from Filecoin is a lot more complicated than simply fetching a URL, so IPFS can't simply sit atop it, at least not without other changes.
It’s not real in the sense that it can’t be mined yet: from what I could figure out on the forum and the GitHub, the network isn’t functional on devnet until more work is done on the specs. I was directed to the page below to learn more and to refer to their GitHub specs repo to participate.