Sure, you CAN verify Turing machines, but verifying languages that AREN'T Turing complete is substantially simpler. So why not make your verification work easier by using a total language? The only real argument against that would be "we can't do what we want in a total language".
I don't buy the argument that what people want to do with smart contracts requires Turing-completeness. In fact, as time goes on I become more and more convinced that Turing completeness is highly overrated. Almost all code wants to do finite and/or structurally inductive operations. The only cases that don't fall into that are things like request loops, where you want to "infinitely" wait, process a request, then loop back to waiting. But even those can be done in total languages via co-induction, so exactly why do we want to make our analysis/verification work more difficult by using a non-total language for things like Ethereum?
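To make the co-induction point concrete, here's a sketch of the "request loop" written corecursively. Haskell is not a total language, but the same structure passes a totality checker (e.g. Idris with `%default total`) because it is *productive*: every step emits a response before recursing. All names here are illustrative.

```haskell
-- Codata-style server: each request yields a response and the next server state.
data Server req resp = Server (req -> (resp, Server req resp))

-- A running-total server. The corecursive call is guarded by the
-- constructor, which is what makes this acceptable to a totality checker.
counter :: Int -> Server Int Int
counter n = Server (\delta -> (n + delta, counter (n + delta)))

-- The consumer side is ordinary (inductive) recursion on a finite list.
runRequests :: Server req resp -> [req] -> [resp]
runRequests _ [] = []
runRequests (Server step) (r:rs) =
  let (resp, next) = step r in resp : runRequests next rs

main :: IO ()
main = print (runRequests (counter 0) [1, 2, 3])  -- [1,3,6]
```

The server can serve "forever", yet nothing here requires general recursion: the infinity lives in the codata, not in an unbounded loop.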
The real answer is probably that total languages are obscure and the Ethereum inventors didn't know about them so they chose a simple and ordinary stack machine.
But those total systems are, of course, subsets of Ethereum, which means that if you write your program in an obviously finite way, you can use the same inductive proofs you would use for a total language.
Turing completeness makes it hard to prove properties automatically about arbitrary programs, but you can still prove properties about specific programs.
I'm pretty sure that MOST research on program verification has been done in the context of Turing complete systems, from the work of Floyd and Hoare, to the symbolic execution of Deutsch and King, to temporal logic, TLA+, etc.
> The real answer is probably that total languages are obscure and the Ethereum inventors didn't know about them so they chose a simple and ordinary stack machine.
I'm not an expert on Ethereum, but even if they did pick a total language, how would you deal with bounding the CPU cost of complex contracts? Even if you could formally verify a loop would eventually terminate, wouldn't long running loops or those with expensive computations have bad impacts on the network? Their "gas" system is their way of dealing with that and perhaps you would still need something similar even with a total language.
I can't think of program verification research where both termination and CPU cost are verified. I can think of some papers that deal with identifying big-O growth rates, though.
Either way, smart contracts sound like one of the top tier areas where you want strong static typing that has a route to be formally verified.
The gas limit is one reason I think it's weird that people insist on calling the EVM "Turing complete", since one of the most prominent features of the system is that every program is guaranteed to terminate in a finite and low number of steps (via the gas mechanism). Turing complete programs are supposed to be problematic because of the halting problem, but in the EVM, the halting problem is trivial: every program halts, period.
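The reason the halting problem becomes trivial is visible in the interpreter's shape: its only recursion decreases a finite gas counter. A toy illustration (not the real EVM; instruction set and costs are made up):

```haskell
-- A minimal gas-metered stack machine. Every step costs 1 gas, so the
-- interpreter terminates on every program, even ones with back-jumps.
data Instr = Push Int | Add | Jump Int | Halt

data Result = Done [Int] | OutOfGas deriving (Eq, Show)

run :: Int -> Int -> [Instr] -> [Int] -> Result
run gas pc prog stack
  | gas <= 0          = OutOfGas          -- the gas counter only decreases
  | pc >= length prog = Done stack
  | otherwise = case prog !! pc of
      Halt   -> Done stack
      Push n -> run (gas - 1) (pc + 1) prog (n : stack)
      Add    -> case stack of
                  (a:b:rest) -> run (gas - 1) (pc + 1) prog (a + b : rest)
                  _          -> Done stack  -- underflow: just stop, for the sketch
      Jump t -> run (gas - 1) t prog stack

main :: IO ()
main = do
  print (run 100 0 [Push 2, Push 3, Add, Halt] [])  -- Done [5]
  print (run 100 0 [Jump 0] [])                     -- OutOfGas
```

The second program would loop forever on a classical machine; with metering it simply exhausts its gas, which is why "does it halt?" stops being the interesting question and "what does it cost?" takes its place.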
> The gas limit is one reason I think it's weird that people insist on calling the EVM "Turing complete", since one of the most prominent features of the system is that every program is guaranteed to terminate in a finite and low number of steps (via the gas mechanism). Turing complete programs are supposed to be problematic because of the halting problem, but in the EVM, the halting problem is trivial: every program halts, period.
Exactly. Knowing that the contract code will eventually halt is nice, but knowing that it will halt at a reasonable CPU cost seems much more important in this case (which is really interesting).
From what I've seen from example contracts, it doesn't seem like the chance of infinite loops is that high. Personally, I find that in mainstream languages, if you accidentally write an infinite loop you find out very quickly on the first few runs and code that could be hiding an infinite loop stands out. The vast majority of loops just iterate from the start of a collection to the end.
The contract coders I work with try as much as possible to avoid loops altogether, because even if they're correct, they'll still have a linear cost, and that's usually not what you want. Almost all contract operations should be O(1). Sometimes we've made fairly large system design changes to avoid a loop in a function.
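One common shape of the "redesign to avoid a loop" idea is the pull-payment pattern: instead of one function looping over every recipient (linear cost, paid by whoever triggers it), record what each address is owed and let each recipient withdraw individually. A hedged model of that, with illustrative names:

```haskell
import qualified Data.Map.Strict as M

type Addr  = String
type State = M.Map Addr Int  -- amount owed to each address

-- Constant-shaped operation: credit a single recipient.
credit :: Addr -> Int -> State -> State
credit a amt = M.insertWith (+) a amt

-- Constant-shaped operation: one recipient withdraws their own balance.
withdraw :: Addr -> State -> (Int, State)
withdraw a st = (M.findWithDefault 0 a st, M.delete a st)

main :: IO ()
main = do
  let st          = credit "bob" 5 (credit "alice" 10 M.empty)
      (paid, st') = withdraw "alice" st
  print paid            -- 10
  print (M.toList st')  -- [("bob",5)]
</imports>
```

The loop hasn't disappeared, it has been distributed: each recipient pays for their own withdrawal, so no single contract entry point does O(n) work.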
Interesting. So what's your view about the most practical language for writing contracts in then?
From what I've read, having the sender set gas limits seems really awkward, especially when the limit is exceeded. Are there practical alternatives? Even with O(1) operations you'd need to limit the computation time.
I don't have a fully formed view yet, but I think Solidity is a pretty good prototyping language, disregarding all the marketing hype that it's an "easy" language. While developing, you'd also be working on a specification of exactly how the contract shall behave, and trying very hard to simplify ruthlessly.
The Parity multisig contract is very complex and I wouldn't and didn't trust any funds with it. I hope that the newly deployed fixed multisigs are bug-free but how would you know?
Right now I'm working on tools for static analysis of bytecode and using such tools to verify the correctness of contracts written by hand in assembly, which I think makes a great deal of sense for the simple contracts that we need as utilities (multisigs, tokens, etc).
> The Parity multisig contract is very complex and I wouldn't and didn't trust any funds with it. I hope that the newly deployed fixed multisigs are bug-free but how would you know?
Yeah... any ideas why they didn't go with a language that had an easier route to formal verification? It seems like an ideal application, and the contract specifications seem like they would be fairly straightforward to prove (compared to what you'd see in most journal papers, for instance). I'm guessing the designers just weren't aware of theorem provers, but I'm also puzzled why they didn't go with a language with strong static typing where private access and immutability are the defaults, and avoid anything like the mechanism they have for picking default handling functions when a message isn't understood. Great experiment to watch, though!
Dependent or even refinement type systems are where a great deal of research into proof-carrying code is done (for refinement types, since the systems are restricted to decidable logics, the proofs are automatic but more limited).
What proof systems are you thinking about? Strong static types seem the most natural way to introduce formal verification into mainstream programming to me.
(Not the parent here.) ACL2 is a prominent example of a proof assistant without a strong static type system; its object language is a pure subset of Common Lisp. The lack of static types is one of the reasons I find proving in ACL2 to be painful, but others have done amazing stuff with it.
I don't feel there's much difference between e.g. the Coq way where types are used to bake specifications + proofs into the program and the Isabelle way where you write a program with less complex types and prove properties about the program after the fact.
General mathematical proofs? If you mean paper proofs they wouldn't be considered formal as they don't go down to the axiom level.
Corecursion and codata are cool, and I would trust Solidity vastly more if it were a "total" functional language with no mutable state and aggressive type checking. Or if it were something more like https://www.idris-lang.org/, where I had abundant tools to rigorously prove that nobody could steal my money.
Basically, if I'm going to spend even 5 minutes playing with smart contracts, I want to feel like the language designers were incredibly paranoid and aware of just how good modern languages can be at provability and security.
> If you're competent with proofs in Idris, you can write your EVM programs as an Idris DSL.
Yeah, all my formal proof experience is with Coq, which is just a bit too idiosyncratic and difficult for this sort of thing. But I seem to remember that—especially if you're willing to certify specific results, and not necessarily the program in general—you can make your trusted computing base very small indeed.
I mean, if the CompCert C compiler can be as thoroughly verified as it is (http://compcert.inria.fr/compcert-C.html), there's no reason why it shouldn't be possible to verify smart contracts.
Jack Petterson and Robert Edström did a master's thesis about retargeting the Idris compiler to emit EVM code, with a custom effect type to express smart contract effects. They ended up modestly skeptical of functional programming as a paradigm for EVM, instead looking toward process calculi. https://publications.lib.chalmers.se/records/fulltext/234939...
See especially his formalization of the EVM as Hoare triples in Lem, which is usable from Isabelle: https://github.com/pirapira/eth-isabelle -- very cool stuff, although I have to say that the ostensible complexity and difficulty of his proof verifying an utterly simple wallet contract makes me a bit skeptical of this approach to smart contract correctness: https://github.com/pirapira/eth-isabelle/blob/master/example...
> I mean, if the CompCert C compiler can be as thoroughly verified as it is (http://compcert.inria.fr/compcert-C.html), there's no reason why it shouldn't be possible to verify smart contracts.
Hmm, so from the smart contract examples I've seen, I would have thought writing them in something like Coq would be quite feasible. The code examples I've seen tend to have only simple loops if any, basic conditions, and the code is short (e.g. compared to something like web development or mobile apps). Writing anything inside a theorem prover is very challenging, but it seems a compelling application.
If you were implementing a smart contract in something like Coq, I'd be interested to read about what kind of specifications you would typically want to verify. I guess properties like "the owner of this wallet cannot be changed" and "the owner of this wallet can only accept money but not give it away" would be good ones.
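Properties like those can be stated as predicates over a state-transition model of the contract. A toy sketch, with all names illustrative (a Coq development would state these as theorems over the step function rather than checking them on sample inputs):

```haskell
type Addr = String

data Msg = Deposit Int | Withdraw Addr Int

data Wallet = Wallet { owner :: Addr, balance :: Int } deriving (Eq, Show)

-- One message-handling step of the toy wallet.
step :: Addr -> Msg -> Wallet -> Wallet
step _ (Deposit n) w = w { balance = balance w + n }
step sender (Withdraw _ n) w
  | sender == owner w && n <= balance w = w { balance = balance w - n }
  | otherwise                           = w  -- reject

-- "The owner of this wallet cannot be changed": no message touches it.
prop_ownerFixed :: Addr -> Msg -> Wallet -> Bool
prop_ownerFixed s m w = owner (step s m w) == owner w

-- "Only the owner can take money out": a balance decrease implies the
-- sender was the owner.
prop_onlyOwnerSpends :: Addr -> Msg -> Wallet -> Bool
prop_onlyOwnerSpends s m w = balance (step s m w) >= balance w || s == owner w

main :: IO ()
main = do
  let w    = Wallet "alice" 100
      msgs = [Deposit 5, Withdraw "alice" 30, Withdraw "mallory" 30]
  print (and [prop_ownerFixed s m w | s <- ["alice", "mallory"], m <- msgs])
  print (and [prop_onlyOwnerSpends s m w | s <- ["alice", "mallory"], m <- msgs])
```

In a prover, the payoff is that these hold for *all* senders, messages, and states by induction over traces, not just the sampled ones here.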