What's a good small project + stack for someone to start with formal methods and...

agentultra · on March 6, 2019

For an introduction and motivation to TLA+ I recommend Hillel Wayne's talk Everything About Distributed Systems is Terrible [0] and basically anything he has written on his blog about the TLA+.

I think following along with the TLA+ video course is really thorough and digestible once you've decided that you want to learn more [1].

You'll find the TLA+ Hyperbook is a useful reference and guide as you start specifying your own systems [2]. It has a number of examples and projects to work on.

Eventually if you want to go deep after you've decided it's worth it for yourself, get into Specifying Systems the text book and bible of TLA+ [3].

Program Synthesis has a lot of research in it and is starting to show promising results. I'd say that while the research has been going on for a long time it's still rather new to industrial use... for a motivating talk and explanation on what program synthesis is try this one: https://www.youtube.com/watch?v=HnOix9TFy1A

[0] https://www.hillelwayne.com/talks/distributed-systems-tlaplu...

[1] https://lamport.azurewebsites.net/video/videos.html

[2] https://lamport.azurewebsites.net/tla/hyperbook.html

[3] https://lamport.azurewebsites.net/tla/book.html

deegles · on March 6, 2019

Thanks!

jerf · on March 6, 2019

I'd actually suggest rather becoming fluent in modern Haskell.

Formal methods are great in their own way but for a lot of programming tasks they overshoot the mark. For those that do not, by all means, bring them in. But right now, a practical engineer can't ignore just how klunky they are and how badly they scale up. One may gaze longingly at them, and dream fondly of the future date when they are more practical, but they aren't there yet.

There is a reason the Haskell community is always banging on about monads and monoids and applicative functors and commutivity and associativity and identifying exactly how little power you need to accomplish a given task, and it's the same reason I was able to say in another comment thread that Haskell is probably the only current semi-popular language that could deploy the ideas in this paper as a library, and that library would actually be correct. (There are some languages that could deploy the library and it would be a cultural fit, like OCaml or Clojure, but you'd still be able to trivially use it incorrectly.)

Plus even if you still want to get into formal methods and program synthesis, "learn Haskell first" is still a fairly sensible stepping stone. It probably isn't strictly mathematically true, but formal methods are going to be more-or-less a superset of what modern Haskell teaches you anyhow. Both the Haskell language and the Haskell tooling will be easier, and pragmatic enough you can do a real project like "rewrite my website in it" or other pragmatic, attainable projects that the formal methods languages can't do yet.

mjb · on March 6, 2019

Many formal methods have a much simpler mathematical foundation than Haskell. For example, for TLA+ the background includes mostly (basic) set theory, basic logic, and some additional bits of temporal logic.

Some methods do require a lot more math, or different math, but I'd say your statement is false for the majority of popular formal methods.

senderista · on March 6, 2019

Indeed, I heard Lamport say at a talk (in response to a question) that he saw little or no applications of functional programming to the distributed systems problems he was interested in.

nickpsecurity · on March 6, 2019

Formal verification is really hard because you have to use mathematical proof to show that a piece of software does what it should on all inputs and paths. You have to formalize the software, the desired goals, and prove a connection. Typically, you have to do this for software written in languages that aren't designed for verification. That makes it even harder.

This led me to recommend people start with so-called lightweight, formal methods that require less upfront learning with more automation. You'll see the value of formal methods, have more fun, and maybe use it on realistic problems. The best one to start with is Design-by-Contract invented by Meyer in the 1980's based on principles from 1960's. Contracts are simply precise, declarative descriptions of a module expects from callers, does during execution, and returns to the caller. Here's an overview even a project manager can find useful:

https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-b...

A number of us independently arrived at a specific use for contracts that's high ROI. You use contracts to specify intent, then automatically generate tests from them (replaces unit testing), and finally they can become runtime checks for automated testing (esp fuzzing). Contract-based tests or fuzzing with contract-based checks cause each failure to take you right to the source of the problem. Note that the test generation part has many names to Google: specification- or spec-based, model-based, and recently property-based testing. Hillel Wayne has great posts on contracts and test generation:

https://hillelwayne.com/post/contracts/

https://hillelwayne.com/post/pbt-contracts/

I recommend reading his whole blog. Anyway, the next tool is Alloy which is super-simple but with better tooling than most academic works. Hillel has examples on his blog. Davidk01 on Lobste.rs has been using it for a lot of things. He said, given its relational, he just uses it like he uses SQL:

https://lobste.rs/s/ghhcus/test_generation_from_bounded_alge...

Finally, one that will help you with protocols and concurrency is TLA+. Hillel has learntla.com up to teach you for free on top of blog examples. After a lot more time with it, he got to publish a book through Apress. I recommend it:

https://www.betterworldbooks.com/product/detail/Practical-TL...

If you want a taste of proof, the SPARK Ada language was designed to make it easier. It's used in industry. Basically uses contract-like specs and supporting functions to feed "verification conditions" into automated solvers. You only have to prove what they can't but can always use runtime checks on those. There's a nice book teaching you how to use it plus Altran/Praxis' Correct-by-Construction method for low-defect software.

https://www.betterworldbooks.com/product/detail/Building-Hig...

There's also Cleanroom: a 1980's method that combines semi-formal specs, simple use of language, hierarchical composition, and usage-based testing to cost-effectively get high quality software. Stavely has best intro to it:

http://infohost.nmt.edu/~al/cseet-paper.html

https://www.betterworldbooks.com/product/detail/Toward-Zero-...

Note: Linking to Better World Books since they're a public-benefit type of company (see History). Amazon ain't. ;)

agentultra · on March 7, 2019

Nice list!

For proofs I've been finding the leap into dependent-type theory to be very productive as many provers in this realm also double as practical programming languages allowing the possibility of shipping proof-carrying code.

My two favorites in this area are Lean [0] and Agda [1], the latter of which recently received a nice introductory text from Phil Wadler and Wen Kokke [2].

[0] https://leanprover.github.io/ [1] https://wiki.portal.chalmers.se/agda/pmwiki.php [2] https://plfa.github.io/