Exactly. Even though I thought the author was bringing up an important point, I couldn't take them seriously due to the repeated finger-pointing and the conspiracy-theorist attitude.
Ironically, this is exactly the point the article criticises: people's preference for pseudoscientifically presented factoids of arbitrary provenance over more fruitful, curated discussions in more approachable, honest language.
Sure, let's totally ignore that and instead turn to places where the same topic is discussed in a calm, scientific, factual manner, that does not involve conspiracies and finger pointing. BTW, where is that? Could you point me to such places that are discussing Sweden and Israel data right now?
I was living in Sweden at the beginning of the pandemic. I suspected that I had COVID multiple times and tried to get a test. The response I got from the Swedish healthcare system was: “it doesn't matter what you are infected with, just stay home”.
Even though the author does have a point, they overlook the fact that it's been much harder to get COVID tests in Sweden than in the countries it is compared to in the blog post. Are these numbers normalised with respect to the number of tests performed? What about the number of tests on symptomatic vs. asymptomatic people?
Sweden's lack of action against the pandemic was not limited to the absence of closures, mask mandates, and so on. As far as I know, they never set up a proper test-and-trace system (please correct me if I'm wrong, I'm not sure about this) as Israel or the UK did. Of course their case numbers will be smaller.
Population density is not a measure of urbanization.
A country with a single city of 1M and outside it a desert of 1M sq km has a population density of around 1 person per square kilometer.
A country with a single city of 1M and outside it a desert of 10M sq km has a density of around 0.1 person per square km.
A 1 sq km city state with 1M people has 1M people per sq km!
All three of these countries are 100% urbanized (no one lives in the countryside) and have the same population, yet their population densities vary by seven orders of magnitude.
Sweden has a high level of urbanization and a low population density. Or put another way: most people live in few places BUT the areas where people don’t live are large.
He's saying that Sweden is larger but that you can expect the same population density in places where people live, i.e. cities, so just raw national population density is not a good metric.
No, population density is how many people live in a certain area. They can live spread out over the entire area (rural) or concentrated in cities and similar (urban).
Yes. So it's a measure that makes sense for a small region (e.g. a city). A very dense city with lots of people on commuter trains (Tokyo) vs. a very sprawled city with lots of people in cars (Houston) will behave very differently in a pandemic.
But in both those cases, it doesn't matter how much unpopulated land there is outside the cities! That is - the population density of a whole country is a completely useless and irrelevant measure when it comes to the pandemic. Urbanization is also a blunt measurement, but at least it's better than density.
In the end, it's all about human contact and movement patterns. Here you can compare a city like Stockholm, which has a very large service economy with people working from home during the pandemic (good), with a city of the same size and density in a poorer country where a larger part of the economy depends on manufacturing or services requiring direct human interaction. It's going to fare worse than Stockholm.
The problem with Stockholm, put very simply, was that a large chunk of the population lives in cramped conditions without private cars: immigrants in a few suburbs. They drive taxis or buses or work in other jobs involving human contact. They also have disproportionately more comorbidities. Sadly, they are also overrepresented in elderly care work, so the group most severely affected also works closely with the most at-risk group. Finally, to add insult to injury, the elderly care system has been progressively dismantled, leading to staff shortages and poor working conditions.
So the picture of the early stage of the pandemic in Stockholm is basically this: an elderly care worker lives with a large extended family in cramped conditions. He or she has to take public transport to work at several different care homes, sometimes within a single day, despite the fact that there are active Covid outbreaks in some of the care homes and not in others. When there are outbreaks in the care homes, there is no way to isolate the sick from the other residents.
What matters is how many people you can transmit the disease to. If population density is high, then you cross paths with a larger number of people every day than you would in a less densely populated area.
Population density is meaningful only locally... so a country such as Australia can have a very low population density nationally, but a high one in its large cities.
Population densities of both Jerusalem and Stockholm are high enough for Covid to spread fast, and forests and deserts are empty enough to make it hard for Covid to spread. Averages are meaningless here, because even statisticians can drown in a river with an average depth of 0.5 m.
At the start of the pandemic there was little-to-no testing in Sweden, but since autumn 2020 it has expanded and scaled significantly. We have drive-thru tests as well as home-delivered tests, free of charge. I used them a number of times, and I could usually book a test 30 minutes ahead. Non-PCR tests, i.e. antigen tests, are available in all pharmacies, and most people have stocked up on those as well.
Regarding contact tracing - I'm not aware of anything, but Telia does check aggregate people movement across different regions, and report findings to the health authorities, see https://business.teliacompany.com/crowd-insights
And? I get the test argument when you look at the number of cases (government policies vary so much on that topic that it's useless to compare countries), but the article brings up the number of deaths a lot.
Why does it matter if tests are readily available or not when you compare death rates? At the end of the day, if the policy is to say "assume that everybody is sick and just be careful/sensible" I don't see how more tests would change anything.
I guess it helps to know where/when you can stop assuming everyone else is a seething virus cauldron.
Or from more testing...
Peace of mind for the testee, social liability coverage for yourself or your employer, knowing whether infections are increasing or not, whether vaccines are blocking transmission or not, and the fact that the death rate lags by a few weeks for anyone hoping to see a quick change from a newly introduced (or scrapped) policy.
When did you try to get tested? I booked a test in June 2020 and got it done the same day. Same thing this spring, I started feeling symptoms around lunch and had no problem getting a test done in the afternoon.
No, not really. The wikipedia page of Maude has a good explanation of what it is like: https://en.wikipedia.org/wiki/Maude_system. You can also see the Maude book if you're interested [1].
I'm a bit out of my league here, but how does Maude compare to say Idris?
I remember dabbling with Pure, which used term rewriting, and lent itself to expressing mathematical equations in a familiar format, but it was not a prover.
I downloaded the PDF for Maude, and I will certainly take a look in the meantime. My interests are solely for learning, and I thought using something like Idris to work my way through Sussman's Structure and Interpretation of Classical Mechanics, might drive it home for me in ways that my physics classes didn't.
It makes me sad how people think imperative programming is a "natural" way to think. I have known lots of people who resist using functional languages for serious projects, marginalizing the "functional" way of thinking as though thinking in terms of "labeled boxes with data in them" (Von Neumann architecture) is some sort of more normal way to think.
Before computers people modeled the real world using a Haskell-like pseudocode called "mathematics".
I find both mental models useful. For example, "accumulate all of the monoids in this collection" (https://hackage.haskell.org/package/base-4.9.0.0/docs/Data-F...) makes total sense as a functional operation. You have a collection and some pure operations you want to apply to combine it all into one thing.
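For a concrete flavor, here is a minimal Haskell sketch of that first mental model, using nothing beyond base (Sum is the standard wrapper that makes numbers a monoid under addition):

    import Data.Foldable (fold)
    import Data.Monoid (Sum (..))

    -- fold combines every element of a collection with the monoid's <>,
    -- with no loop counter or accumulator variable in sight.
    main :: IO ()
    main = do
      print (fold ["func", "tional"])                   -- "functional"
      print (getSum (fold (map Sum [1, 2, 3 :: Int])))  -- 6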
But then, sometimes you just want to iterate across a list of directories, read some files, and put some results in lists. Tail-recursion here feels dumb. Like, dammit, I'm the human and you're the computer, I know you can turn a sequence of instructions into a tail recursive loop for me, so don't make me think.
Haskell, due to its effectful/monadic vs. pure syntax distinction, isn't terribly great here. Ideally I'd write all of my code in an "imperative style" all the time, but a row-typed effect inference system would determine which functions were pure, or in IO, or async, or in ST, or threw exceptions, etc. Koka [1] is an interesting language that explores some of these ideas.
Some other projects that are doing cool things in this space are Idris[1] and F-Star[2] (the latter is also a Microsoft Research project). I am particularly impressed by Idris's algebraic effect approach, since it gives you a general effect system that combines the ability to easily extend that monad transformer approaches provide with the ease of use and inference of F*'s effect system.
I think the theory behind algebraic effects is a key step in making it easy to develop code with more stringent and customized safety properties than is available in languages like Rust.
> Ideally I'd write all of my code in an "imperative style" all the time, but a row-typed effect inference system would determine which functions were pure, or in IO, or async, or in ST, or threw exceptions
Yes, this would be a really nice approach (and it's not one that Haskell lovers would be upset by, BTW).
I would be upset if it meant no visible distinction between effectful and non-effectful calls. When refactoring it's really important to know when the order of calls is significant vs when it's incidental.
Agreed. I just recently took some recursive F# code from a photobooth (display countdown, async sleep, take photo, download from camera, recur with a new byte[] list, then reverse the list) and rewrote it as an imperative loop appending to a normal List<byte[]>, and the code is much more intuitive.
Some things are more intuitive as a list of things to do and changes to make.
Yes, they are, but I think "for loops" and "mutable lists" à la Java's ArrayList and Vector or C#'s List<T> don't exist even in IO (I'm not a Haskell expert). That kind of thing has to be done with recursion. They exist in F# but are not idiomatic. However, sometimes they're more intuitive (to me at least).
forM is a for-each loop over anything iterable (Traversable in Haskell) that will execute a given monadic action on each element in the collection. A more traditional for loop is the same thing over a range of integers or whatever else you feel like:

    forM_ [1..5] print
Mutable collections exist in various forms depending on what kind of mutability you want. For example, there's transactional mutability that gives you mutable data structures inside an STM but forbids IO. And there's IO mutability that removes all the bounds but restricts you to the IO Monad. TVars, MVars, and the various kinds of Refs are what you're looking for here.
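To make that concrete, here's a minimal sketch using only base: a for-each loop over a range combined with an IO-mutable reference, which is about as close to the ArrayList-append pattern as plain IO gets:

    import Control.Monad (forM_)
    import Data.IORef (modifyIORef', newIORef, readIORef)

    -- An imperative-style accumulation loop living inside IO.
    main :: IO ()
    main = do
      acc <- newIORef ([] :: [Int])
      forM_ [1 .. 5] $ \i ->
        modifyIORef' acc (i :)             -- destructive update of the reference
      readIORef acc >>= print . reverse    -- [1,2,3,4,5]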
I suspect he didn't keep his STM atomically blocks small. The docs and associated papers and book materials all make it very, very clear that you want to keep STM transactions small and try to defer computation into thunks that are only evaluated when the transaction succeeds.
Compare the cost of optimally, expertly built STM-using code to, say, just assigning some byte values in a *char array. Instruction count? Cache locality? Allocations?!?
EDIT: Oh, and isn't STM typically just a wrapper over immutable stuff anyways? So he said "immutable copying is too slow for me", and you said "here, use this instead, it has all the copy overhead plus some additional overhead".. that's a kick in the teeth.
> Compare the cost of optimally, expertly built STM-using code to, say, just assigning some byte values in a *char array
This is comparing the wrong tasks. Apples and oranges.
STM is usable for this particular task, but will almost assuredly be far slower. If performing the exact same task in Haskell, you could use IO and mutable memory for comparison.
Where STM really shines, however, is in highly-concurrent environments. There's no great "C comparison" for hundreds of concurrent operations locklessly reading/writing to/from the same memory. STM makes this safe and performant without a change in semantics or additional mental overhead from the "non-concurrent" case.
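A rough sketch of that concurrent case (this assumes the stm package; the thread and iteration counts are arbitrary):

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.STM (atomically, modifyTVar', newTVarIO, readTVarIO)
    import Control.Monad (forM_, replicateM_)

    -- 100 threads hammer one shared counter; atomically retries conflicting
    -- transactions, so there are no locks to manage and no lost updates.
    main :: IO ()
    main = do
      counter <- newTVarIO (0 :: Int)
      forM_ [1 .. 100 :: Int] $ \_ ->
        forkIO (replicateM_ 1000 (atomically (modifyTVar' counter (+ 1))))
      threadDelay 1000000               -- crude wait for the workers; fine for a sketch
      readTVarIO counter >>= print      -- 100000 once all threads have finished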
I'd limit your case to "highly-concurrent environments where you want lots of shared, mutable state". There are always cases for some shared or global thing that you're transacting with, but I think 99% of code in highly-concurrent environments should be written with message-passing and limited scope at any point in the program.
... STM in GHC is definitely mutable. Before commit it has an op-log representation, but it's most assuredly doing very imperative things like fine-grained locking and writes to the same reference :)
Though maybe I stand corrected. I revisited that. Here's what I had initially:
(include the following stubs to compile):
let upload _ = async.Return()
let takePhoto i = async.Return [|0uy|]
// i stands for "photo i of N", a required parameter for our workflow
---
let runSequence n =
    async {
        let rec runSequenceRec (framesSoFar: byte[] list) =
            async {
                if framesSoFar.Length = n then
                    return List.rev framesSoFar
                else
                    let! bytes = takePhoto framesSoFar.Length
                    do! upload bytes
                    return! runSequenceRec (bytes::framesSoFar)
            }
        return! runSequenceRec []
    }
Nasty dual-level async calls, list reversal, recursion, if/else, an initializing []. A lot of visual junk that takes away from what the code is actually doing. I rewrote it imperatively to this:
let runSequence' n =
    async {
        let frames = System.Collections.Generic.List<byte[]>()
        for i in 0 .. n - 1 do
            let! bytes = takePhoto i
            do! upload bytes
            frames.Add bytes
        return frames
    }
Much cleaner, you can see exactly what's going on. I was fine with that, and that's where I left it until now.
However, reading some of the Haskell comments, I thought there might be something better. So I tried replicating Haskell's forM:
let forNAsync f n =
    let rec runOnce (soFar: _ list) =
        async {
            if soFar.Length = n then
                return List.rev soFar
            else
                let! result = f soFar.Length
                return! runOnce (result::soFar)
        }
    runOnce []
Okay, not pretty but it's a library function so we should never have to look at it again. With that in hand though, we can define:
let runOneShot i =
    async {
        let! bytes = takePhoto i
        do! upload bytes
        return bytes
    }

let runSequence'' = forNAsync runOneShot
This is terrific. Shorter and cleaner than the imperative method, preserves immutability throughout, and composes easily.
If you were to try to compose `runOneShot` from the imperative method, you'd end up with this, because of the frames variable:
let runSequence''' n =
    async {
        let frames = System.Collections.Generic.List<byte[]>()
        for i in 0 .. n - 1 do
            let! bytes = runOneShot i
            frames.Add bytes
        return frames
    }
Or if you wanted to try to rework the `runOneShot` to include the frames variable, it wouldn't be much easier:
let runOneShot'''' i (frames: System.Collections.Generic.List<byte[]>) =
    async {
        let! bytes = takePhoto i
        do! upload bytes
        frames.Add bytes
    }

let runSequence'''' n =
    async {
        let frames = System.Collections.Generic.List<byte[]>()
        for i in 0 .. n - 1 do
            do! runOneShot'''' i frames
        return frames
    }
So, maybe things that seem imperative do work better functionally. You've just got to create some combinators that let you hook them up.
Additionally, in languages with good, convenient support for functional programming like Haskell and Scala, I find that I tend to write lots of short functions, often making use of higher-order functions to decompose code into really simple, and potentially reusable, pieces. And like you say, this really helps to show intent. The same sort of thing, in fact, that object-oriented programmers try to do with design patterns such as the Strategy pattern - but with fewer lines of code.
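As a small illustration of that point (plain Haskell, base only), the "strategy" is just the comparison function you pass in:

    import Data.List (sortBy)
    import Data.Ord (Down (..), comparing)

    -- sortBy is the Strategy pattern in a single argument: swap the
    -- comparator, swap the behaviour; no interface hierarchy required.
    main :: IO ()
    main = do
      print (sortBy (comparing id) [3, 1, 2 :: Int])    -- [1,2,3]
      print (sortBy (comparing Down) [3, 1, 2 :: Int])  -- [3,2,1]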
I'm not sure I agree. Functional Reactive Programming is a declarative way of doing user interaction, but it's still a computer science research topic (Elm used to claim to be FRP, but has now decided to move away from its old allegedly FRP-like style). I need to play around with the latest FRP research libraries some time!
In addition there is also machines/pipes/conduits, which all mix imperative I/O with declarative style code in an interesting and new way.
FRP is awesome and is an example of how something previously thought of as an inherently imperative problem can be done much more easily in a declarative style; we completely agree on this!
What I was talking about is: almost all lower-level abstractions are designed with the view of computer as a state machine. If we are solving a problem at an abstraction low enough to make us deal with this analogy, it is easier not to have a declarative layer in between.
machines/pipes/conduits are all very interesting and I was going to mention them too, but I still don't think they solve this problem in the way OP was referring to.
Before computers, far more people did bookkeeping or followed recipes or read sheet music than did abstract mathematics. I think one can reasonably say that lists of instructions that change mutable variables/objects is one of the most natural ways people can operate. Anyone learning a task essentially does this in the real world. It may indeed be terrible when done at a large scale. But there are a lot of people who can naturally understand these.
Immutable data may be a great idea. But is it the first that occurs to anyone? I have an MA in mathematics and the idea of mutable variables never seemed strange in the process of learning to program.
But even more, for a lot of people, even algebra isn't a natural way of thinking. What's natural is a series of instructions about how to perform arithmetic. Learning what an abstract variable means is a hurdle for a lot of students.
The thing is that "natural", in this context, just means "what occurs to you first, all things being equal".
You may have a good and correct argument that learning Haskell is the right way to do things, that the costs of learning are more than exceeded by the benefits.
But I don't think you can argue people into having functional approaches be the first thing that occurs to them.
I think it's impossible to disentangle the reality of common programming languages and how they work from what is inherently more "natural". The bottom line is programming is not natural, not in any way shape or form. It's hard to learn imperative programming, it's hard to learn functional programming, and it's really hard to learn object-oriented programming (well). If all languages were functional I don't think that would really be harder than learning BASIC, people would just accept that is what programming is and get on with their lives.
I acknowledge that imperative programming is more analogous to real world tasks, but then as soon as you try to write your first program you run headlong into the fact that programming is nothing like other tasks you have done before. The tricks you have to learn to make imperative programming work are not really any easier than the tricks for other paradigms, and in fact you can mix and match from various paradigms to great effect (see: ruby).
"...it's really hard to learn object-oriented programming (well)."
You have no idea how desperately I wish more people understood that. I could swear that every introductory programming book produced in the last years was written by someone who either completely forgot how hard this material was to originally master, or who was just some freak of nature who intuitively grasped it.
Please, please, please, give your students a solid grounding in procedural (or even functional!) programming before teaching OOP. You can teach them about objects and how to use them, but please stop making people write entire classes before they have at minimum a month of coding exercises behind them! (Universities that teach CS101 in Java, I'm looking at you.)
Anyway, I strongly agree with you and reached for the upvote button, and ended up downvoting you instead. Stupid touchscreens.
> but then as soon as you try to write your first program you run headlong into the fact that programming is nothing like other tasks you have done before. The tricks you have to learn to make imperative programming work are not really any easier than the tricks for other paradigms.
There you go again, equating good and natural.
It seems like even functional decomposition can be unnatural for people, given how common "one big loop" mega-functions are as the solution produced by a lot of contractors.
Certainly, the problem with ball-of-mud and simple imperative programming styles is that they get harder as you go along.
Indeed, programming methodologies are more or less unnatural ways to program that allow programmers to be more productive and programs to scale in compensation for that unnaturalness.
Consider that mathematicians among the general public, pre-computers, were actually far more rare than functional programmers among programmers. Mathematics is also an unnatural construction. I grew up having a great deal of difficulty with handwriting before computers appeared, and I learned to love math shortcuts for the writing they saved me. But I also learned most people are more mentally lazy than physically lazy and on average would rather do more work than learn a new thing.
If you view it in terms of psychological development, understanding concrete imperative steps happens before children are capable of understanding abstract concepts like mathematics. The works of Seymour Papert describe how these findings relate to education.
That said, people who connect this to software in a professional context are doing a disservice since hopefully software professionals have progressed in their intellectual development beyond their teens and are capable of abstract mathematical thought.
Concrete imperative steps are extremely mathematical -- even elegantly so -- and most of software verification is predicated on this idea. Just like you can describe physics pretty precisely with simple math, you can do it with computation. The math of PFP is one particular approximation, chosen for its supposed utility as the basis for a language. It is more linguistic math than descriptive math. I also hope people adopt abstract mathematical thought when reasoning about their programs, but that has little to do with the kind of math they choose to model their programs' source code with.
Sure, you can model imperative steps in a variety of ways using mathematics, but for 99% of mathematics there is a far more direct link to pure functions. In how many branches of mathematics do you see a function that is anything but a mapping between sets (or types, or objects in a category)? In terms of using the same kind of processes and intuitions, pure functional programming is much closer to mathematics as done by humans than imperative programming. Even theoretical computer science looks this way outside of particular algorithms being studied.
Not that this is some sort of indictment of imperative programming; it's a much better fit for a lot of different situations. But if you're trying to model something in abstract mathematics, pure or pure-ish functional programming is likely to be a better solution. For example, look at the various proof assistants like Coq. Many of them are written in functional languages, and if you look at the code you can see why: the manipulation of formulas and proofs fits very well with an immutable, functional approach.
> but for 99% of mathematics there is a far more direct link to pure functions.
But state transitions are pure! Deterministic state machines are nothing more than mathematical functions from one algorithm state to the next, and nondeterministic ones are binary relations! (see this for an overview: http://research.microsoft.com/en-us/um/people/lamport/pubs/s...) Those can then be mathematically manipulated, substituted and what-have-you elegantly and simply. This is how formal verification has been reasoning about programs since the seventies (early on the formalization was a bit more cumbersome, but the concept was as simple).
> For example, look at the various proof assistants like Coq.
For example, look at the specification language (and proof assistant) TLA+. It is far simpler than Coq, just as pure, and requires no more than highschool math.
Coq and other PFP formalizations are built on the core idea of denotational semantics, an approximation which at times only complicates the very simple math of computation, requiring exotic math like category theory.
> the manipulation of formulas and proofs fits very well with an immutable, functional approach.
The manipulation of formulas and proofs fits very well with the imperative approach, only no category theory is required, and core concepts such as simulation/trace-inclusion arise naturally without need for complex embeddings. "PFP math" is what arises after you've decided to use denotational semantics. The most important thing is not to confuse language -- for which PFP has an elegant albeit arcane mathematical formalization -- with the very simple mathematical concepts underlying computation.
PFP is an attempt to apply one specific mathematical formulation to programming languages (other similar, though perhaps less successful attempts are the various process calculi). But it is by no means the natural mathematics of computation. Conflating language with algorithms is a form of deconstructionism: an interesting philosophical concept, but by no means a simple one, and perhaps not the most appropriate one for a science. Short of that, we have the basic, simple core math, and on top of it languages, some espousing various mathematical formalizations, related to the core math in interesting ways, and some less mathematical. But the math of computation isn't created by the language!
> But state transitions are pure! Deterministic state machines are nothing more than (obviously, "pure") functions from one program state to the next, and nondeterministic ones are binary relations!
Okay, yes, but 99% of mathematics isn't deterministic state machines. So again, pure functions can model a lot of things, but deterministic state machines are foreign to the way people think about mathematics in most instances.
> For example, look at the pure specification language (and proof assistant) TLA+. It is far simpler than Coq, just as pure, and requires no more than highschool math.
Okay, that's completely fine as an example. I just mentioned Coq because it's one of the better known ones.
> Coq and other PFP formalizations are built on the core idea of denotational semantics, an approximation which at times only complicates the very simple math of computation, requiring exotic math like category theory.
I'm not sure where CiC (or any of the other recent type theories, including HoTT) requires category theory (which is hardly exotic in 2016). People who are doing meta-mathematics on these systems are interested in categorical models and such, but implementing and using these systems requires no category theory.
Categories are not fundamental objects in these theories; you have to build them just like you do in material set theories, although some parts of that process are easier (especially in HoTT). But that is the reverse of the link that you're claiming.
> The manipulation of formulas and proofs fits very well with the imperative approach, only no category theory is required, and core concepts such as simulation/trace-inclusion arise naturally without need for complex embeddings.
Again, no category theory is required (though I'm still not sure why this is a bad thing?) to develop a prover like Coq or TLA+. If you're bringing simulation and trace-inclusion into this, then you're just saying the stateful, imperative approach is well adapted to working with stateful, imperative systems. I agree, but how exactly does that equate to that approach having any benefit whatsoever for formalizing the rest of mathematics?
> Okay, yes, but 99% of mathematics isn't deterministic state machines. So again, pure functions can model a lot of things, but deterministic state machines are foreign to the way people think about mathematics in most instances.
Sorry, I wasn't clear. 99% of mathematics isn't Schrödinger's equation either, but Schrödinger's equation is still relatively simple math. State machines are simple math, but math isn't state machines. State machines are the concept that underlies computation, and simple math is used to reason about them.
> I just mentioned Coq because it's one of the better known ones.
... and one of the least used ones, at least where software is concerned. There's a reason for that: it's very hard (let alone for engineers) to reason about computer programs in Coq; it's much easier (and done by engineers) to reason about computer programs in TLA+ (or SPIN or B-Method or Alloy).
> which is hardly exotic in 2016
I think it's safe to say that most mathematicians in 2016 -- let alone software engineers -- are pretty unfamiliar with category theory, and have hardly heard of type theory. Engineers, however, already have nearly all the math they need to reason about programs and algorithms in the common mathematical way (they just may not know it, which is why I so obnoxiously bring it up whenever I can, to offset the notion, way overrepresented here on HN, that "PFP is the way to mathematically reason about programs". It is just one way, not even the most common one, and certainly not the easiest one).
> but implementing and using these systems requires no category theory.
Right, but we're talking about foundational principles of computation. Those systems are predicated on denotational semantics, which is a formalization that identifies a computation with the function it computes (yes, some of those systems also have definitional equality, but still, denotational semantics is the core principle), rather than view the computation as built up from functions (in fact, this is precisely what monads do and why they're needed, as the basic denotational semantics fails to capture many important computations). This formalization isn't any better or worse (each could be defined in terms of the other), but it is more complicated, and is unnecessary to mathematically reason about programs. It does require CT concepts like monads to precisely denote certain computations.
> If you're bringing simulation and trace-inclusion into this, then you're just saying the stateful, imperative approach is well adapted to working with stateful, imperative systems.
There are no "imperative systems". Imperative/functional is a feature of the language used to describe a computation, not the computation itself (although, colloquially we say functional/imperative algorithms to refer to those algorithms that commonly arise when using the different linguistic approaches). The algorithm is always a state machine (assuming no textual deconstructionism) -- whether expressed in a language like Haskell or in a language like BASIC -- and that algorithm can be reasoned about with pretty basic math. And I am not talking about a "stateful" approach, but a basic mathematical approach based on state machines (a non-stateful pure functional program also ultimately defines a state machine).
> I agree, but how exactly does that equate to that approach having any benefit whatsoever for formalizing the rest of mathematics?
Oh, I wasn't talking about a new way to formalize the foundation of mathematics (which, I've been told, is the goal of type theory), nor do I think that a new foundation for math is required to mathematically reason about computation (just as it isn't necessary to reason about physics). I just pointed out that algorithms have a very elegant mathematical formulation in "simple" math, which is unrelated to PFP. This formulation serves as the basis for most formal reasoning of computer programs.
> I think it's safe to say that most mathematicians in 2016 -- let alone software engineers -- are pretty unfamiliar with category theory
I'm sure that's true for software engineers, but my experience is that category theory has permeated most fields to a significant degree. Many recent graduate-level texts on fields of mathematics from topology to differential geometry to algebra incorporate at least basic category theory like functors and natural transformations. It's even more common at the research level. And I say all of this not having actually met a single actual category theorist, only those in other fields who used at least some of it.
> Those systems are predicated on denotational semantics, which is a formalization that identifies a computation with the function it computes ... rather than view the computation as built up from functions
You know there's operational semantics too, right? Operational semantics typically describes the behaviour of your program as a state machine, especially the small-step variety.
Of course. But I was talking about FP there, and, I think, operational semantics must be embedded in FP. In the state machine view, operational semantics is simply a refinement SM of the denotational semantics (which could also be specified as a nondeterministic SM).
I was being vague and imprecise. Obviously, any semantics is semantics of a language, and every language has its own perfectly fine operational semantics. I think that FP is a non-optimal first-choice for operational reasoning; let's call it that.
But while I have your attention, let me try to put a finger on two concrete problem areas in typed-PFP reasoning. Now, I'm not saying that this isn't useful for a programming language; I am saying that it is a poor choice for reasoning about programs. I think that FP has two discrete "jumps" between modes that are simply unnecessary for reasoning and do nothing but complicate matters (again, they may be useful for other reasons):
The first is the jump between "plain" functional code and monadic code. Consider an algorithm that sorts a list of numbers using mergesort. The program could be written to use simple recursion, or it could be written using a monad, with the monadic function encoding just a single step. Those two programs are very different, but they encode the very same algorithm! The first may use a more denotational reasoning, and the second a more operational one. In TLA, there is no such jarring break. You can specify an algorithm anywhere you want on a refinement spectrum. Each step may be just moving a single number in memory, or splitting the array, all the way up to sorting in one step. The relationship between all these different refinement levels is a simple implication:
R1 => R2 => R3
where R3 may be completely denotational and not even algorithmic as in:
done => result = Sorted(input)
(assuming some operator Sorted).
The second discrete jump is between the program itself and the types. I was just talking to an Agda-using type theorist today, and we noted how a type is really a nondeterministic program specifying all possible deterministic programs that can yield it. This is a refinement relation. Yet, in FP, types are distinct from programs (even in languages where they use the same syntax). In TLA the relationship between a "type", i.e. a proposition about a program and the program is, you guessed it, a simple refinement, i.e. simple logical implication (he figures that intermediate refinement steps are analogous to a "program with holes" in Agda). So, the following is a program that returns the maximal element in a list:
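    A2 ≜ done = FALSE ∧ [](done' = TRUE ∧ max' = Max(input))

(assuming some operator Max; a deliberately coarse, one-step sketch)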
but it is also the type (assuming dependent types) of all programs that find the maximum element in a list, say (details omitted):
A1 ≜ done = FALSE ∧ max = {} ∧ i = 0
     ∧ [](IF i = Len(input) THEN done' = TRUE
          ELSE (input[i] > max => max' = input[i])
               ∧ i' = i + 1)
Then, A1 => A2, because A1 is a refinement of A2.
So two very central concepts in typed-PFP, namely monads and types, are artificial constructs that essentially just mean refinements. Not only is refinement a single concept, it is a far simpler concept to learn than either monads or types. In fact, once you learn the single idea of an "action" in TLA, which is how state machines are encoded as logic (it is not trivial for beginners, but relatively easy to pick up), refinement is plain old familiar logical implication.
So I've just removed two complicated concepts and replaced them with a simple one that requires little more than highschool math, all without losing an iota of expressivity or proving power.
That depends what it is that you want to specify: a function or a computation (i.e., a machine). In TLA+ as in "reality" computations are not functions, and functions are not computations.
We'll start with a function. It's basically `[a -> [a -> a]]` or `[a × a -> a]`. More completely (to show that `a` is abstract):
CONSTANT a, f
ASSUME f ∈ [a -> [a -> a]]
`CONSTANT` means that a can be anything (formally, any set)
If you want to specify a computation, that, when terminates[1] returns a result in `a`, the "type" would be `[](done => result ∈ a)` (where [] stands for the box operator, signifying "always"), or more fully:
CONSTANT a, Input
ASSUME Input ∈ a × a
VARIABLES done, result
f ≜ ... \* the specification of the computation
THEOREM f => [](done => result ∈ a)
So the first type is purely static. A set membership. The second has a modality, which says that whenever (i.e., at any step) `done` is true, then `result` must be in `a`.
------
[1]: Let's say that we define termination as setting the variable `done` to TRUE. BTW, in TLA as in other similar formalisms, a terminating computation is a special case of a non-terminating one, one that at some finite time stutters forever, i.e., doesn't change state, or, formally <>[](x' = x), or the sugared <>[](UNCHANGED x) namely eventually x is always the same (<> stands for diamond and [] for box; for some reason HN doesn't display those characters)
Reduce(Op(_, _), x, seq) ==
    LET RECURSIVE Helper(_, _)
        Helper(i, y) ==
            IF i > Len(seq) THEN y
            ELSE Helper(i + 1, Op(y, seq[i]))
    IN Helper(1, x)
A couple of things. First, the need for the helper is just a limitation of the language, which doesn't allow recursive operators to take operators as arguments. Second, I used indexing, but you can use the Head and Tail operators. It's just that I use this often, and I care about the model-checking performance. Finally, Reduce is an operator, not a function (it is not a function in the theory). Hence, its domain (for all arguments that aren't operators) is "all sets", something that would be illegal for a "proper" function (Russell's paradox). Operators, like functions, can be higher order, but they're not "first-class", i.e., they can't be considered "data". Functions require a proper domain (a set).
> Do you mean you can't pass an operator to a function
Right. Functions are objects in the theory, and must have a domain which is a set. Operators are outside the theory, and aren't in any sets. You also can't return an operator as a result of an operator.
A function can, of course, return a function, but functions can't be polymorphic within a single specification. So they must have a particular (though perhaps unspecified) set as their domain. Why? Because there is no such thing as polymorphic functions in math. Polymorphism is a feature of a certain language or "calculus", and TLA+ is about algorithms, not about a specific linguistic representation of them. "Polymorphism" of the kind I've shown does make sense, because you determine the level of the specification, and you can say that you want to reason about the use of a function (or a computation) without assuming anything more specific than your explicit assumptions about a certain set. But that is not to say that you can’t have an operator that “creates” functions generically. E.g.:
Foo(a) ≜ [x ∈ a × a ↦ x[1]]   or   Foo(a) ≜ LET f[x ∈ a × a] ≜ x[1] IN f
Which you can then use like so: Foo(Nat)[1, 2]. Foo must be an operator because its argument `a` ranges over all sets, so it doesn't have a proper domain (Russell, etc.)
> or make it the argument of a computation
Ah, that actually you can do, and it's quite common as it's very useful. A constant can be an operator (a variable can't because a variable is state, and state must be an object in the theory). For example, it's useful for taking a relation as input (although a relation can also be defined as a set of pairs, so it’s an object in the theory). If in the polymorphic example you want to say that the set `a` is partially ordered (because you're specifying a sorting algorithm), you can write[1]:
CONSTANTS a, _ ⊑ _
ASSUME ∀x, y, z ∈ a :
    ∧ x ⊑ x
    ∧ x ⊑ y ∧ y ⊑ x => x = y
    ∧ x ⊑ y ∧ y ⊑ z => x ⊑ z
All of this, BTW, is actually the "+" in TLA+ (which is a bit technical, and trips me up occasionally), and not where the core concepts lie, which is the TLA part. Lamport would say you’re focusing on the boring algebraic stuff, rather than the interesting algorithmic stuff… I guess you could come up with another TLA language, with a different "+", but the choice of this particular "+" (i.e, set theory) was made after experimentation and work with engineers back in the '90s (Lamport says that early versions were typed). There is actually a type inferencer for TLA+, which is used when proofs are passed to the Isabelle backend (or intended to be used, I don't know; the proof system is under constant development at INRIA). So you get rich mathematical reasoning using simple math — no monads or dependent types (nor any types) necessary. Those things can be useful (for automatic code extraction, maybe, or for deep mathematical theories of languages — not algorithms), but for mathematically reasoning about programs, this is really all you need. And it works well, in the industry, for programs as small as a sorting routine or as large as a complex distributed database.
[1]: The aligned conjunctions are a syntactic convenience to avoid parentheses.
Constants are external inputs to a specification. All you know about them is what you assert in the assumptions. When you use the proof system, the assumptions are your axioms and if you don't have any, then it's really polymorphic. If you use the model checker, you'll need to supply the checker with a concrete -- and finite[1] -- set for all constants.
[1]: The one supplied with TLA+, called TLC, is powerful enough to model-check an expressive language like TLA+, but its algorithm is rather primitive; it's an "old-school" explicit-state model checker. A more modern, state-of-the-art model checker is under research: http://forsyte.at/research/apalache/. BTW, modern model checkers are amazing. Some can even check infinite models.
You always want to decompose your model to help reasoning about it though, no? Even if you're modeling as a state machine, rather than a single big state machine you'd want to separate it into small orthogonal state machines as much as possible.
I see the functional approach as taking that one step further: separate the large proportion of the program that doesn't depend on state at all (i.e. that's conceptually just a great big lookup table - which is the mathematical definition of a function) from the operations that fundamentally interact with state. I find anything involving state machines horrible to reason about, so I'd prefer to minimize the amount of the program where I have to think about them at all.
> Even if you're modeling as a state machine, rather than a single big state machine you'd want to separate it into small orthogonal state machines as much as possible.
Your notion of state is too specific for the theoretical meaning. Which function is executing in an FP computation is also state, as are the values of its arguments. Every software system is a single, possibly nondeterministic state machine. The decomposition you speak of is merely a decomposition of the state transition into various chunks (formulas). You can think of an abstract state machine as a single (pure) function -- just like a simple FP program -- except not a function from the input to the output, but a function from one state to the next (kinda like a state monad's monadic function, but expressed more naturally). One last complication is that it isn't quite a function but a relation, as you want to express the fact that your state machine may do one of several things in the next state, e.g. to describe different interleavings of threads, or even to model the user generating input.
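A toy sketch of that view, in Haskell only because it's compact (the names here are illustrative, not any particular library):

    -- A deterministic machine is a pure step function from state to state;
    -- a nondeterministic one relates each state to its possible successors.
    type Det s    = s -> s
    type Nondet s = s -> [s]

    -- An execution is just iteration of the step function:
    run :: Det s -> s -> [s]
    run step s = s : run step (step s)

    -- e.g. a trivial counter machine
    counter :: Det Int
    counter = (+ 1)

    main :: IO ()
    main = print (take 5 (run counter 0))  -- [0,1,2,3,4]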
Another thing you want to do is refinement and abstraction, i.e. specify your algorithm at different levels of detail (machines with more or fewer states) and show that the properties you want are preserved in the abstraction. Of course, you won't do that for something as simple as a sorting algorithm, but you also want to reason about large, complex things, like an OS kernel or a distributed database.
So TLA simplifies things further by saying that the whole world is a single state machine, and your specifications are restricted views of that "world machine". This allows you to specify a clock with minute and second hands, and another specification of a clock with just a minute hand, and then say that both are views of the same clock, with the first being just a more refined description of it than the second (this is a problem with multiple state machines, as one takes a step every second, and the other only every minute).
> I see the functional approach as taking that one step further: separate the large proportion of the program that doesn't depend on state at all (i.e. that's conceptually just a great big lookup table - which is the mathematical definition of a function) from the operations that fundamentally interact with state.
Again, what you consider state and the "abstract state" in abstract state machines are not quite the same. There is no such thing as a program that doesn't depend on state. Wherever there's a program, there's state. If you implement an algorithm that finds the maximum element in a list of numbers by a simple traversal in a PFP language and in an impure imperative language, the result would look very different, but the algorithm is the same, hence the state and the state transitions are identical.
That's the whole point in thinking of algorithms, not of code. I'd guess that this is what you do anyway -- regardless of the language. You don't necessarily always reach the same level -- e.g. once your complex distributed blockchain needs to find the maximal number in a list, you may go down to the index level in an imperative language, yet stop at the fold level in FP, and that's fine (you decide what an abstract machine can do at each step) -- but ultimately, at some point, you always think of your algorithm in the abstract -- you see it running in your mind -- rather than in linguistic terms, and that is the interesting level to reason about it. Forget about FSMs. What you imagine is pretty much the abstract state machine you can reason about mathematically. A mathematical state machine is simply a description of your program's steps (and every program is composed of steps) at any level of detail.
> That's the whole point in thinking of algorithms, not of code. I'd guess that this is what you do anyway -- regardless of the language. You don't necessarily always reach the same level -- e.g. once your complex distributed blockchain needs to find the maximal number in a list, you may go down to the index level in an imperative language, yet stop at the fold level in FP, and that's fine (you decide what an abstract machine can do at each step) -- but ultimately, at some point, you always think of your algorithm in the abstract -- you see it running in your mind -- rather than in linguistic terms, and that is the interesting level to reason about it. Forget about FSMs. What you imagine is pretty much the abstract state machine you can reason about mathematically.
This is a fascinating view, but no, I really don't. I think of the function that finds the maximal number in a list as a mathematical function (that is, "really" a set of pairs - a giant lookup table, that the implementing computer will have some tricks for storing in a finite amount of memory but those tricks are implementation details). I think of a function composed of two functions (when I think about it at all) as a bigger table (like matrix multiplication) - not as anything stateful, and not as anything involving steps. Like, if I think about 5 + (2 * 3), I think of that as 5 + 6 or 11, and sure you can model that as a succession of states if you want but I don't find that a helpful/valuable way to think about it. It seems like you want to think of the process of evaluation as first-class, when as far as I'm concerned it's an implementation detail; I see my program as more like a compressed representation of a lookup table than a set of operations to be performed.
> I think of the function that finds the maximal number in a list as a mathematical function
My point is not how you would classify the object which is the program, but how you would go about programming it. You can think of it as a lookup table all you want, but you will define your algorithm as some state machine. Maybe looking for the maximum value is too simple, as the use of a combinator is too trivial, but think of a program that sorts a list. You can imagine the program to be a function if you like, but when you design the algorithm, there is no way, no how, that you don't design it as a series of steps.
> not as anything stateful
Again, don't conflate what programmers normally think of as "stateful" with the concept of a state in an abstract state machine.
> and sure you can model that as a succession of states if you want but I don't find that a helpful/valuable way to think about it.
I don't find it helpful, either, so I don't do it (more on that later). Whether you want to say that 5 + (2 * 3) in TLA is one step or several is completely up to you. TLA is not a particular state machine model like lambda calculus or a Turing machine, but an abstract state machine.
> It seems like you want to think of the process of evaluation as first-class, when as far as I'm concerned it's an implementation detail
You are -- and I don't blame you, it's hard to shake away -- thinking in terms of a language, which has a given syntax and a given semantics and "implementation". When reasoning about algorithms mathematically, there are no "implementation" details, only the level of the algorithm you care to specify. In TLA+ you can define higher-order recursive operators like "reduce" that does a reduction in one step -- or zero-time if you like -- or you can choose to do so over multiple steps. It is up to you. There is no language, only an algorithm, so no "implementation", only the level of detail that interests you in order to reason about the algorithm. Personally, for simple folds I just use the reduce operator in "zero-time", because I really don't care about how it's done in the context of my distributed data store. But if you wanted to reason about, say, performance of a left-fold or a right-fold, you'll want to reason about how they do it, and how any computation is done is with a state machine.
> You can think of it as a lookup table all you want, but you will define your algorithm as some state machine. Maybe looking for the maximum value is too simple, as the use of a combinator is too trivial, but think of a program that sorts a list. You can imagine the program to be a function if you like, but when you design the algorithm, there is no way, no how, that you don't design it as a series of steps.
Again, disagree. I think of it more as defining what a sorted list is to the computer. (An extreme example would be doing it in Prolog or the like, where you just tell the computer what it can do and what you want the result to look like). It's very natural to write mergesort or bubblesort thinking of it a lookup table - "empty list goes to this, one-element list goes to this, more-than-one-element list goes to... hmm, this".
Now that's not an approach that's ever going to yield quicksort, but if we're concerned with correctness and not so much about performance then it's a very usable and natural approach.
> You are -- and I don't blame you, it's hard to shake away -- thinking in terms of a language, which has a given syntax and a given semantics and "implementation". When reasoning about algorithms mathematically, there are no "implementation" details, only the level of the algorithm you care to specify. In TLA+ you can define higher-order recursive operators like "reduce" that does a reduction in one step -- or zero-time if you like -- or you can choose to do so over multiple steps. It is up to you. There is no language, only an algorithm, so no "implementation", only the level of detail that interests you in order to reason about the algorithm.
I think you're begging the question. If you want to reason about an algorithm - a sequence of steps - then of course you want to reason about it as a sequence of steps. What I'm saying is it's reasonable and valuable, under many circumstances, to just reason about a function as a function. If we were doing mathematics - actual pure mathematics - it would be entirely normal to completely elide the distinction between 5 + (2 * 3) and 11, or between the derivative of f(x) = x^2 and g(x) = 2x. There are of course cases where the performance is important and we need to include those details - but there are many cases where it isn't.
> if you wanted to reason about, say, performance of a left-fold or a right-fold, you'll want to reason about how they do it, and how any computation is done is with a state machine.
Sure. (I think/hope we will eventually be able to find a better representation than thinking about it as a state machine directly - but I completely agree that the functional approach simply doesn't have any way of answering this kind of question at the moment).
> Again, disagree. I think of it more as defining what a sorted list is to the computer. (An extreme example would be doing it in Prolog or the like, where you just tell the computer what it can do and what you want the result to look like).
Exactly, but the point is defining at what level. Take your Prolog example. This is how you can define what a sorted list is in TLA+:
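    IsSorted(s) ≜ ∀i, j ∈ 1..Len(s) : i < j => s[i] ≤ s[j]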
So this is a definition, but it's not an algorithm. Yet, in TLA+ it could describe a state machine of sorts: the nondeterministic state machine that somehow yields a sorted list. Any particular sorting algorithm is a refinement of that machine. Of course, the refinement does not need to happen in one step. You can describe a mergesort as a machine that somehow merges, which is a refinement of the magical sorting machine. Any machine that merges, say, using a work area in a separate array, is then a refinement of that. You decide when to stop.
If there's a difference in how you "define" a sorted list via bubble-sort or mergesort in your lookup-table image, that difference is an abstract state machine. For example, in the mergesort case your composition "arrows" are merges. You have just described a (nondeterministic) state machine. Otherwise, if your lookup-table algorithm doesn't have any steps, it must simply map the unsorted array to the sorted array and that's that. If there are any intermediate mappings, those are your transitions.
The state machines are nondeterministic, so you don't need to say "do this then do that". It's more like "this can become that through some transformation". Every transformation arrow is a state transition. You don't need to say in which order they are followed.
Here is a certain refinement level of describing mergesort:
∃x, y ∈ sortedSubarrays : x ≠ y ∧ x[2] + 1 = y[1]
    ∧ array' = Merge(array, x, y)
    ∧ sortedSubarrays' = (sortedSubarrays \ {x, y}) ∪ {<<x[1], y[2]>>}
That's it; that's the state machine for mergesort (minus the initial condition). It is completely declarative, and it basically says "any two adjacent sorted subarrays may be merged". Note that I didn't specify anything about the order in which subarrays are merged; this machine is nondeterministic. Even that may be too specific, because I specified that only one pair of subarrays is picked at each step. I could have said that any number of pairs is picked, and then my specification would be a refinement of that.
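For completeness, one possible initial condition under the same representation (sortedSubarrays holds <<lo, hi>> index pairs; input, the array to be sorted, is an assumed constant; every one-element subarray starts out trivially sorted):

    Init ≜ ∧ array = input
           ∧ sortedSubarrays = { <<i, i>> : i ∈ 1..Len(array) }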
You can define an invariant:
    Inv ≜ [](∀ x ∈ sortedSubarrays : IsSorted(x))
The box operator [] means "always". Then you can check that
    MergeSort => Inv
to assert that the sorted subarrays are indeed all sorted.
> I think/hope we will eventually be able to find a better representation than thinking about it as a state machine directly
There are two requirements: first, the formulation must be able to describe anything we consider an algorithm. Second, the formulation must allow refinement, i.e. the ability to say that a mergesort is an instance of a nondeterministic magical sorting algorithm, and that a sequential or a parallel mergesort are both instances of some nondeterministic mergesort.
> If there's a difference in how you "define" a sorted list via bubble-sort or mergesort in your lookup-table image, that difference is an abstract state machine.
Perhaps. But very often that's precisely the kind of difference I want to elide, since the two are equivalent for the purposes of verifying correctness of a wider program that uses one or the other.
I find it a lot more difficult to think about a state machine than about a function. What is it I'm supposed to be gaining (in terms of verifying correctness) by doing so? (I might accept that certain aspects of program behaviour can only be modeled that way - but certainly it's usually possible to model a large proportion of a given program's behaviour as pure functions, and I find pure functions enough easier to think about that I would prefer to divide my program into those pieces, so that I can think mostly about functions and only a little about states).
I think we're talking at two different levels here. I am not talking about using a sorting routine in a program, but about writing a sorting routine. You want to know that your sorting routine actually sorts. Your routine is a state machine; its black-box description is a function. You want to prove that your state machine implements the function. Our goal is to verify that mergesort actually sorts -- not that the air traffic control program using mergesort works (we could, but then the ATC program would be the state machine, and sorting would be just a function).
> Perhaps. But very often that's precisely the kind of difference I want to elide, since the two are equivalent for the purposes of verifying correctness of a wider program that uses one or the other.
This means that both are refinements of the "magical sorting machine". You can use "magical sorting" in another TLA+ spec, or, as in this running example, prove that mergesort is a refinement of "magical sorting". If what you want to specify isn't a mergesort program but an ATC program that uses sorting, of course you'd use "magical sorting" in TLA+. You specify the "what" of the details you don't care about, and the "how" of the details you do.
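To make "magical sorting" concrete, here is a sketch of what that machine might look like, reusing the assumed Perms operator from above (schematic; in practice the implication below is proved under a refinement mapping that hides mergesort's intermediate states):

    MagicalNext ≜ array' ∈ { t ∈ Perms(array) : ∀ i ∈ 1..Len(t)-1 : t[i] ≤ t[i+1] }
    MagicalSort ≜ (array = input) ∧ [][MagicalNext]_array

    THEOREM MergeSort => MagicalSort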
> I find it a lot more difficult to think about a state machine than about a function.
Sure, but again, when you want to verify an algorithm, it may use lots of black boxes, but the algorithm you are actually verifying is a white box. That algorithm is not a function; if it were, then you're not verifying the how, just stating the what. The how of whatever it is that you actually want to verify (not the black boxes you're using) is a state machine. Other ways of thinking about it include, say, process calculi, lambda calculus, etc., but at the point of verifying it, considering it a function is the one thing you cannot do, because that would be assuming what you're trying to prove.
> Sure, but again, when you want to verify an algorithm, it may use lots of black boxes, but the algorithm you are actually verifying is a white box. That algorithm is not a function; if it were, then you're not verifying the how, just stating the what. The how of whatever it is that you actually want to verify (not the black boxes you're using) is a state machine. Other ways of thinking about it include, say, process calculi, lambda calculus, etc., but at the point of verifying it, considering it a function is the one thing you cannot do, because that would be assuming what you're trying to prove.
Well, a function is necessarily a function (if we enforce purity and totality at the language level). Presumably what we want to verify is that its output has certain properties, or that certain relations between input and output hold. But that seems like very much the kind of thing we can approach in the same way we'd analyse a mathematical function.
Let me also address this with some snarky quotes from Lamport[1] (and just add that, having tried what he suggests on real-world programs, although it may sound complicated, it is actually simple):
> Computer scientists collectively suffer from what I call the Whorfian syndrome — the confusion of language with reality
> [R]epresenting a program in even the simplest language as a state machine may be impossible for a computer scientist suffering from the Whorfian syndrome. Languages for describing computing devices often do not make explicit all components of the state. For example, simple programming languages provide no way to refer to the call stack, which is an important part of the state. For one afflicted by the Whorfian syndrome, a state component that has no name doesn’t exist. It is impossible to represent a program with procedures as a state machine if all mention of the call stack is forbidden. Whorfian-syndrome induced restrictions that make it impossible to represent a program as a state machine also lead to incompleteness in methods for reasoning about programs.
> To describe program X as a state machine, we must introduce a variable to represent the control state—part of the state not described by program variables, so to victims of the Whorfian syndrome it doesn’t exist. Let’s call that variable pc.
> Quite a number of formalisms have been proposed for specifying and verifying protocols such as Y. The ones that work in practice essentially describe a protocol as a state machine. Many of these formalisms are said to be mathematical, having words like algebra and calculus in their names. Because a proof that a protocol satisfies a specification is easily turned into a derivation of the protocol from the specification, it should be simple to derive Y from X in any of those formalisms. (A practical formalism will have no trouble handling such a simple example.) But in how many of them can this derivation be performed by substituting for pc in the actual specification of X? The answer is: very, very few.
> Despite what those who suffer from the Whorfian syndrome may believe, calling something mathematical does not confer upon it the power and simplicity of ordinary mathematics.
Obviously, we all suffer from the Whorfian syndrome, but I think it's worthwhile to try and shake it off. Luckily, doing it doesn't require any study -- just some thinking.
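To make the pc point concrete, here is how a small summing loop -- something like "while i < N { s := s + i; i := i + 1 }" -- might be written as a state machine (a sketch; N is an assumed constant, and the unnamed control state of the source program becomes the explicit variable pc):

    Init ≜ s = 0 ∧ i = 0 ∧ pc = "test"
    Next ≜ ∨ ∧ pc = "test" ∧ i < N ∧ pc' = "body" ∧ UNCHANGED <<s, i>>
           ∨ ∧ pc = "test" ∧ i ≥ N ∧ pc' = "done" ∧ UNCHANGED <<s, i>>
           ∨ ∧ pc = "body" ∧ s' = s + i ∧ i' = i + 1 ∧ pc' = "test"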
I feel like you have confused quite a few concepts here.
First, type theory based proof assistants (like Coq) and model checkers are very different tools with very different guarantees. Most model checkers are used for proving properties about finite models and cannot reason about infinite structures.
A type theory with inductive types gives one a mechanism for constructing inductive structures, and the ability to write proofs about them.
I am not sure how you have equated denotational semantics and type theory, but there is no inherent connection; denotational semantics is just one way to give a language a semantics. One can use it to describe the computational behavior of a type theory, but it is not fundamental.
Category theory and type theory have a cool isomorphism between them, but otherwise can be completely siloed: you can be quite proficient in type theory and never touch or understand any category theory at all.
On the subject of TLA+, Leslie Lamport loves to talk about "ordinary mathematics", but he just chose a different foundational mathematics to build his tool with; type theory is an alternative formulation in this regard, one that is superior in some ways, since the language for proof and programming is one and the same and does not require a strange stratification or layering of specification and implementation languages.
Another issue with many of these model-based tools is the so-called "formality gap": building a clean model of your program and then proving properties is nice, but without a connection to your implementation the exercise has questionable value. Sure, with distributed systems for example, writing out a model of your protocol can help find design bugs, but it will not stop you from incorrectly implementing said protocol. In distributed systems, even with testing, finding safety violations in practice is hard, and many of them can occur silently.
Proof assistants like Coq make doing this easier since your implementation and proof live side by side, and you can reason directly about your implementation instead of a model. If you don't like dependently typed functional languages, you can check out tools like Dafny, which provide a similar work style but with more automation and imperative programming constructs.
> This formulation serves as the basis for most formal reasoning of computer programs.
On this statement I'm not sure what community you come from, but much of the work going on in the research community uses things like SMT (which exposes SAT plus a flavor of first-order logic), an HOL-based system like Isabelle, or type theory; very few people use tools like set theory to reason about programs.
> Engineers, however, already have nearly all the math they need to reason about programs and algorithms in the common mathematical way (they just may not know it, which is why I so obnoxiously bring it up whenever I can, to offset the notion, way overrepresented here on HN, that "PFP is the way to mathematically reason about programs").
Finally, this statement is just plain not true; abstractly, it's easy to hand-wave on paper about the correctness of your algorithm. I encourage you to show me the average engineer who can pick up a program and prove non-trivial properties about its implementation, even on paper. I wager even proving the implementation of merge sort correct would prove too much. I've spent the last year implementing real, low-level systems using type theory, and this stuff is hard; if you can show me a silver bullet I would be ecstatic, but any approach with as much power and flexibility is at least as hard to use.
It's not that "PFP" (and it's not PFP; it's type theory that makes this possible) is the "right way" to reason about programs, but that it makes it possible to reason about programs. For example, how do you prove a loop invariant in Python? How would you even start? I know of a few ways to do this, but most provide a weaker guarantee than the type theory version would, and require a large trusted computing base.
> > This formulation serves as the basis for most formal reasoning of computer programs.
> On this statement I'm not sure what community you come from, but much of the work going on in the research community uses things like SMT (which exposes SAT plus a flavor of first-order logic), an HOL-based system like Isabelle, or type theory; very few people use tools like set theory to reason about programs.
Pron is an engineer and he cares about what's easy for engineers to use. He's uninterested in research.
Well, interested intellectually, but yes, I'm an engineer and I judge the utility of tools by their applicability to the work of engineers. That a tool like Coq could theoretically be used to fully reason about and verify a large program is of little interest to me, as no one so far has been able to do that, let alone enable engineers to do that.
BTW, I'm not too sure what the point on SMT was. SMT just uses SAT for whatever theory is needed (and possible). TLA+ uses SMT (as well as Isabelle) extensively as automated tactics in deductive proofs (in set theory), which I have only just started to use. SMT solvers (and certainly SAT solvers) are not tied in any way to one mathematical formulation or another, just as garbage collection isn't. In fact, SAT solvers are commonly used today in bounded temporal-logic model checking.
No. It's medium-sized, it required a world expert and a lot of effort, and even he had to skip the termination proofs because they proved too hard/time-consuming, so he just put in a counter that throws a runtime exception if it runs out.
> building a clean model of your program and then proving properties is nice, but without a connection to your implementation the exercise has questionable value
This is one thing that has always confused me about TLA+ since pron introduced me to it. Maybe translation of specification into implementation is always the easy part, though ...?
> Maybe translation of specification into implementation is always the easy part, though ...?
Not only is it the easy part, but we're talking about reasoning about programs. If reasoning requires end-to-end certification, you're working too hard.
Even within TLA+, you don't work at one refinement level. The lowest level is just too low, and therefore too hard for some proofs. You create a couple of sound refinements -- simpler state machines that capture the relevant essence -- and verify the behavior of yours. It's simpler than it sounds (refinement is just plain old logical implication, =>, in TLA) but it does take effort to reason about -- let alone prove -- a complex algorithm regardless of the formalism you use. FP doesn't make it easier, and it does require more difficult math.
> First, type theory based proof assistants (like Coq) and model checkers are very different tools with very different guarantees.
I wasn't talking about model checkers but about mathematical specification languages. Some of them have proof assistants to help prove them, some have model checkers to help prove them, and some (like TLA+) have both. But the point is the formulation, not the proof tool.
> I am not sure how you have equated denotational semantics and type theory
No, I said FP is based on denotational semantics, and that reasoning about (typed) FP programs requires some type theory.
> but he just chose a different foundational mathematics to build his tool with
(When I speak of the math I'd rather refer to the logic, TLA, than the tool, TLA+, but that doesn't matter.) Obviously there is no objective, external way to classify mathematical theories as easy or hard. But that tool requires little more than high-school math, and it's been shown to be readily grasped by engineers with little training and almost no support. I think this qualifies as objective evidence -- if not proof -- that it is, indeed, simpler, and it is the main reason why I encourage engineers to learn it first, and only later learn "FP math" if they wish.
> Proof assistants like Coq make doing this easier since your implementation and proof live side by side, and you can reason directly about your implementation instead of a model.
That "easier" bit is theoretical. AFAIK, there has been only one non-trivial real-world program written in Coq, it was written by a world-expert, it took a lot of effort in spite of being quite small, and even he had difficulties, so he skipped on the termination proofs.
> very few people use tools like set theory to reason about programs.
Don't use set theory if you don't want to -- though it is easier by the measure I gave above -- as that's just the "static" part of the program. I'm talking about the dynamic part and temporal logic(s). Temporal logics are far more common when reasoning about programs in industry than any FP approach, as are other approaches that work on Kripke structures, such as abstract interpretation.
> I encourage you to show me the average engineer who can pick up a program and prove non-trivial properties about its implementation, even on paper.
I'm one.
> I wager even proving the implementation of merge sort correct would prove too much.
I wager that I can take any college-graduate developer, teach them TLA+ for less than a week, and then they'd prove merge-sort all by themselves.
> I've spent the last year implementing real, low-level systems using type theory, and this stuff is hard
It is. But I've spent the past few months learning and then using TLA+ to specify and verify a >50KLOC, very complex distributed data structure, and Amazon engineers use TLA+ to reason about much larger AWS services every day. It's not end-to-end certified development, but reasoning and certified proof of implementation are two different things. State-machine reasoning is just easier, and it doesn't require the use of a special language. You can apply it to Python, to Java or to Haskell.
> but that it makes it possible to reason about programs
State machine and temporal logic approaches have made it possible to reason about programs so much so that thousands of engineers reason about thousands of safety-critical programs with them every year.
> For example, how do you prove a loop invariant in Python?
Python is not a mathematical formulation. But proving a loop invariant in TLA+ is trivial. Sure, there may be a bug in the translation to Python, but we're talking about reasoning, not end-to-end certification, which is beyond the reach -- or the needs -- of 99.99% of the industry, and will probably stay there for the foreseeable future. The easiest way to reason about a Python program is to learn about state machines, and, if you want, use a tool like TLA+ to help you and check your work.
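For instance, for the little summing loop sketched earlier (a sketch; s, i, and pc are that loop's variables, and the invariant is phrased with multiplication to avoid division):

    Spec ≜ Init ∧ [][Next]_<<s, i, pc>>
    Inv  ≜ 2 * s = i * (i - 1)

    THEOREM Spec => []Inv

The proof obligations are just that Init implies Inv and that every Next step preserves it (together with the usual typing invariant).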
> > I encourage you to show me the average engineer who can pick up a program and prove non-trivial properties about its implementation, even on paper.
> I'm one.
This doesn't seem right. TLA helps you prove properties of a program's specification, not its implementation. Or are you saying you can somehow prove properties of the implementation too?
> I wager that I can take any college-graduate developer, teach them TLA+ for less than a week, and then they'd prove merge-sort all by themselves.
Sure, but prove properties of its specification, not its implementation. Unless I'm missing something ...
> State machine and temporal logic approaches have made it possible to reason about programs so much so that thousands of engineers reason about thousands of safety-critical programs with them every year.
Again, surely the specification of programs not their implementation?
> > For example, how do you prove a loop invariant in Python?
> ... Sure, there may be a bug in the translation to Python
Absolutely. That's the whole point. Translating it to Python is going to be hard and full of mistakes.
> but we're talking about reasoning, not end-to-end certification, which is beyond the reach -- or the needs -- of 99.99% of the industry
If the implementation part truly is (relatively) trivial, then this is astonishingly eye-opening to me. In fact, I'd say it captures the entire essence of our disagreements over the past several months.
> The easiest way to reason about a Python program is to learn about state machines, and, if you want, use a tool like TLA+ to help you and check your work.
No, that's the "easiest way" of reasoning about an algorithm that you might try to implement in Python.
> TLA helps you prove properties of a program's specification, not its implementation. Or are you saying you can somehow prove properties of the implementation too?
What do you mean by "implementation"? Code that would get compiled to machine code? Then no. But if you mean an algorithm specified to as low a level as that provided by your choice of PL (be it Haskell or assembly), which can then be trivially translated to code, then absolutely yes. In fact, research groups have built tools that automatically translate C or Java to TLA+, preserving all semantics (although I'm sure the result is ugly and probably not human-readable, it could be used to test refinement).
Usually, though, you really don't want to do that because that's just too much effort, and you'd rather stop at whatever reasonable level "above" the code you think is sufficient to give you confidence in your program, and then the translation may not be trivial for a machine, but straightforward for a human.
> Translating it to Python is going to be hard and full of mistakes.
Not hard. You can specify the program in TLA+ at "Python level" if you like; usually it's not worth the effort. Now, the key is this: full of mistakes -- sure, but what kind of mistakes? Those would be mistakes that are easy for your development pipeline to catch -- tests, types, whatever -- and cheap to fix. The hard, expensive bugs don't exist at this level but at a level above it (some hard-to-catch bugs may exist at a level below it -- the compiler, the OS, the hardware -- but no language alone would help you there).
But I can ask you the opposite question: has there ever been a language that can be compiled to machine code, yet feasibly allows you to reason about programs as easily and powerfully as TLA+ does? The answer to that is a clear no. Programming languages with that kind of power have, at least to date, required immense effort (with the assistance of experts), so much so that only relatively small programs have ever been written in them, and at great expense.
So it's not really as if PFP is a viable alternative to TLA and similar approaches for reasoning. Either the reasoning is far too weak (yet good enough for many purposes, just not for uncovering algorithmic bugs) or the effort is far too great. Currently, there is no affordable way to reason with PFP at all.
> No, that's the "easiest way" of reasoning about an algorithm that you might try to implement in Python.
You specify until the translation is straightforward. If you think there could be subtle bugs in the translation, you specify further. I'm in the process of translating a large TLA+ specification to Java. There's a lot of detail to fill in that I chose not to specify, but it's just grunt work at this point. Obviously, if a program is so simple that you can fully reason about it in your head with no fear of subtle design bugs, you don't need to specify anything at all...
Well, the classic here is his book Mindstorms, where he talks about how to construct learning environments for children that take into account that bridge between concrete and abstract thought and turn it into an asset instead of a liability.
If you want more on the topic that's not from the hands-on perspective of an educator, you might have to go further back to Piaget's papers, but unfortunately I haven't had the opportunity to dive into these much.
Well, you don't get to decide what people find natural; they do.
And deep below, computers work under a model that's anything but functional (assembly language is basically a big mutable state machine).
There is value in understanding both imperative and functional programming and applying each of them wherever they are the best fit and there are some real world situations where sticking to immutability is a terrible choice (e.g. neural networks).
Programming is about making a piece of hardware do what you want to do. Many people seem to forget that, or treat it as an afterthought. "Oh yeah, it's slow. Hopefully the compiler will do its magic!". Except it does not, because it cannot rewrite your code to match what the hardware wants.
This is why we now live in a world where many programmers don't think there is a problem with running web servers on programming languages that slow down computation by orders of magnitude, in addition to being very cache and memory inefficient. We can scale! Yeah sure, use 50 servers where one could have done it. We would also use Saturn V to launch small individual LEO satellites, you know. Why not? It works!
Taking a functional approach is fine for a lot of tasks, and when you can do it without cryptic or inefficient code, you should do it. But when imperative is easier, use that instead.
You seem to have completely misunderstood my point.
> Well, you don't get to decide what people find natural; they do.
I am not contending that functional/declarative is a more natural way to think; I am saying that there is nothing inherently more natural about imperative programming. Since the latter comes from imitating the machine architecture higher up in the abstraction hierarchy, we should consider for a moment whether this is really desirable.
> And deep below, computers work under a model that's anything but functional (assembly language is basically a big mutable state machine).
This is entirely irrelevant. Computers are highly stateful, but the languages we design don't have to be stateful because of this. They have to meet the criterion of easily expressing the mental constructions we designate, which I would definitely say is much better handled with statically typed functional programming languages.
> There is value in understanding both imperative and functional programming and applying each of them wherever they are the best fit...
We completely agree on this! Not all problems are best solved with functional languages (e.g., things actually involving the machine).
> there are some real world situations where sticking to immutability is a terrible choice (e.g. neural networks).
I don't see why immutability is a terrible choice for neural networks. Could you elaborate on this? It might be that almost all neural network implementations have been imperative, so you have come to accept that as more normal.
Typical neural networks have multiple layers of hundreds of thousands of nodes, which then get updated very frequently during the learning phase.
It would be suicidal for these updates to be immutable (e.g. returning brand new neurons with the updated value instead of updating the neuron that's already in the graph).
I have a cup of coffee; if I drink a little, I don't think a new cup of coffee with less coffee has been created, just that the old one is different. There are times when imperative thinking seems more natural.
> There are times when imperative thinking seems more natural
This is certainly true. But there are also times when there is nothing natural about it at all. Consider this scenario:
I give you a cup of coffee. You set it on your desk to cool. Then when you go to take a sip, the cup is empty! You see, I gave you the same cup I was drinking from.
A race condition I guess. At least we didn't try to drink at the same time.
For the record, I usually write functional code. Let me give another example instead of coffee. If I have a differential equation, "dx/dt = f(x,t)", I think of the solution as a function "x(t)", but if I try to solve it numerically, or even just reason about it, I visualize how "x" changes with time. Is this the reason why numerical code is usually imperative? Maybe it's just performance, but I really think that sometimes there is a cognitive burden in trying to reason fully functionally (I can visualize the parabolic trajectory of a ball, but I cannot visualize the 6-dimensional trajectory of an airplane doing a maneuver without seeing it "change with time").
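That "changing with time" picture is itself a little state machine; here is a sketch of one Euler step as a transition, in the notation used elsewhere in this thread, with f, dt, and the horizon T assumed given:

    Next ≜ ∧ t < T
           ∧ x' = x + f(x, t) * dt
           ∧ t' = t + dt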
I see your point. Learning how to reason effectively in higher and higher levels of abstraction can be daunting; however, it can pay off very well. Ask a mathematician about the set of interesting problems that can be visualized and they will tell you it has measure zero.
I think in a block-time model of the world. The coffee cup now and the coffee cup a minute ago are different things; my memory of the coffee cup a minute ago doesn't suddenly have less coffee in it.
The recipe analogy is certainly very natural, but it does not extend once we get into things like assignments and references, which are a crucial part of von Neumann languages (which are really the thing I am criticizing).
Tactic languages in theorem provers like Coq are stateful too: you have a proof state and you keep giving instructions for it to change. I wouldn't say that this is an unnatural way to think; for that subtask of the problem, we can view the proof as something with state without any problems.
"Mathematical modeling" sounds too sophisticated to be natural for humans. Let me illustrate what I meant; the kind of naturality I was talking about is this:
sorted(xs): a permutation xs' of xs such that, as we traverse xs', each element is greater than or equal to the previous one.
This is a perfectly fine sorting algorithm. It contains all the information we need to sort a list of comparable things, and it is very unlikely to contain an error; it is the _definition_ of sorting.
There are legitimate performance concerns about this, but ignoring all performance issues, this is still more natural than listing out the steps needed to carry out the task of sorting. I don't think it is possible to come up with a simpler recipe-like algorithm for this task.
There are recipe-like tasks in programming: ones that don't involve modeling the world; for those the functional paradigm would not be a good one. But these are rare if you are doing programming the right way.
You are confusing algorithms (Operational Semantics) with definitions (Denotational Semantics). One of the key differences between mathematics and computing is that mathematics is far less concerned with constructing results, and far more concerned with reasoning about the properties of objects. Mathematics sits at a level of abstraction unsuited to physical hardware.
In Haskell, this is evident in the complex and difficult-to-reason-about graph-reduction engine that lies hidden in the runtime. Denotational semantics => Haskell math, yay. Operational semantics => Haskell's downfall with real-world programmers.
I don't see how either operational or denotational semantics is relevant as I was not talking about semantics of any kind.
I don't think I am confusing anything; you are reacting exactly as I would expect from someone who does not get the point I'm making: we don't necessarily have to program with steps of instructions. If we have a good solver for our language, we can very efficiently execute definitions like my example. I agree that this presents the challenge of efficiently executing such high-level definitions, but that is irrelevant to my point.
> Mathematics sits at a level of abstraction unsuited to physical hardware.
Precisely what I am talking about. Mathematics sits at a level of abstraction convenient for humans not machines.
Have you ever programmed in Prolog? I strongly suggest you try it to get a feeling of why programs are not inherently lists of instructions and how computation can be viewed as deduction.
But in a way, imperative is more natural, because it captures the notion of computation more precisely. Haskell -- like other pure FP languages -- is built around the approximation of denotational semantics, which does have a bit of a mismatch with computation (not to say it isn't extremely useful much of the time). Anyway, mathematical thinking about programs didn't start with PFP, nor is PFP the most common way of formalizing programs. See:
> I believe that the best way to get better programs is to teach programmers how to think better. Thinking is not the ability to manipulate language; it’s the ability to manipulate concepts. Computer science should be about concepts, not languages. But how does one teach concepts without getting distracted by the language in which those concepts are expressed? My answer is to use the same language as every other branch of science and engineering—namely, mathematics.
It makes me sad that some PFP enthusiasts enjoy the mathematical aspects of it -- as they should -- yet are unfamiliar with the more "classical" mathematical thinking. I think it's important to grasp the more precise mathematics first, and only then choose which approximations you'd make in the language of your choice. Otherwise you get what Lamport calls "Whorfian syndrome — the confusion of language with reality".
Dijkstra, in his "On the cruelty of really teaching computing science", makes a very similar point: that people should learn how to reason about programs before learning to program.
This is why dependent types are a great framework to program in. If a program is worth writing, it should at least contain everything the programmer thinks about the program, including the reasoning of why it is the program he wants to write!
Right. In computational semantics there are two schools. The "school of Dijkstra" (now championed most vocally by Lamport) -- which has largely taken hold in the field of formal verification -- and the "school of Milner" (Backus?) -- which has largely taken hold in the field of programming language theory. The former reasons in concepts and abstract structures (computations, Kripke structures), and the latter reasons in languages ("calculi").
The interesting philosophical question is this: can programs be said to exist as concepts independent of the language in which they are coded (in which case the language is an artificial, useful construct) or not (in which case the concept is an artificial useful construct)?
Whatever your viewpoint, the "conceptual" state machine math is a lot simpler than the linguistic math offered by PFP.
> Haskell -- like other pure FP languages -- is built around the approximation of denotational semantics,
Interesting, do you have any references for this? I thought that the primary reason for purity was to enable equational reasoning, but I have no sources for this. Also, AFAIK, there are no formal semantics for Haskell?
> Let me emphasize: you write the test cases for your program, and then you write your program. You are writing code to test something that doesn't even exist yet. I am not rightly able to apprehend the kind of confusion of ideas that could provoke such a method.
I don't understand this. What is so absurd about specifying facts about the program you will write? When we have tools that can prove facts, we will write formal specifications instead of random sampling. But still, testing is a way of statistically specifying facts about your program, and it seems sensible to write those facts down before the program.
Because, like the author says, there is no substitute for competence. If you can't write good code, then your assumptions about the future design will be wrong and/or bad, and your tests will be just as poorly written. In other words, TDD (anecdotally) doesn't improve your code but only introduces extra levels of complexity, which in the hands of an incompetent developer becomes an even bigger problem than if there were no TDD.
If you can't write good code, then you will not write good code.
But there is another common case where unit tests (written before or after) also come in tremendously handy: if you can't write good, mistake-free code 100% of the time.
Which describes all programmers who have ever existed.
Can you elaborate on that a bit or give an example maybe? It seems to me that mathematics does pay attention to what things are, since mathematicians often start their arguments with getting themselves and their readers to agree on rigorous definitions of mathematical structures, before they do anything with those.
It's basically duck-typing. If you can do arithmetic on it, it's a number. If you can do matrix operations, it's a matrix. If it satisfies the axioms of X, it's an X.
This is just something I thought of, and I am not sure if it makes sense, but mathematicians explore objects in much the same way particle physicists do: physicists bounce particles off each other and see what happens in order to understand how those particles work.
Similarly, mathematicians study objects by looking at how they act on other objects. So if it turns out that two differently defined objects have the same actions on other objects, we call them isomorphic and don't really distinguish between them.
So for instance, we say that the set of rigid motions that preserve a triangle and its orientation is the same as the set of permutations of the roots of, say, x^3 - 3x + 1, even though the two sets are absolutely not defined in the same way.
I think he means that mathematical objects are pure structure without substance. They are defined only in relation to other objects and there is no deeper meaning.
The thing about Nix is that it's the first project, as far as I know, that tries to solve the problem of package management with a fundamentally different approach. Most of the solutions to date were just okay: they worked with a tolerable degree of failure.
A solution that works well enough is often a serious obstacle to developing a solution that works really well. E.g., Unix works just well enough to preclude the adoption of a new paradigm in OSes, such as Plan 9.
Treating packages like immutable values sounds like a genuinely new idea. So what I think is that people don't go "this seems better than my current development toolchain"; instead, they think that the idea that underlies Nix might result in disproportionately more reliable tools in package management. That's worth fussing about.
Disclaimer: I am not associated with Nix in any way.
The first but not the last: there is also Guix, the package manager, and GuixSD, the system distribution, which is written in Scheme, as are all the package definitions. For Lispers it is heaven.
I am assuredly a "lisper" and I really don't want to use anything based on Nix. I really dislike how statically linked executables are not portable on Nix. I'm less upset w.r.t. dynamic library management.
> We need your permission to do things like hosting Your Stuff, backing it up, and sharing it when you ask us to. Our Services also provide you with features like photo thumbnails, document previews, email organization, easy sorting, editing, sharing and searching. These and other features may require our systems to access, store and scan Your Stuff. You give us permission to do those things, and this permission extends to trusted third parties we work with.
However congenial this sounds, they reserve the right to look at your stuff -- worth noting.