Languages talk about being multi-paradigm as if it's a good thing, but multi-paradigm means you can always do the bad thing if you feel you really need to, and programmers are extremely bad at doing sort of the the time-scale integration of the cost of doing something that they know is negative. I mean everyone will know...it's like "this global flag: this is not a good thing, this a bad thing - but it's only a little bad thing" and they don't think about how, you know, the next five years how many times that little bad thing is going to effect things. So brutal purity: you have no choice.
In particular, I like the bit about integrating your technical debt function with respect to time, over the time you have to live with the debt, and how bad programmers are at thinking in those terms. We tend to think about how much technical debt we have at a given fixed point in time, but the area under the curve is what bites you.
Reminds me of something a former boss used to say. "Pay me now, or pay me later; either way, there has to be a payment". It sounds like Carmack is saying that brutal purity always forces the "pay me now" approach since it's harder to write this code, but it is worth it in the long run.
That was more or less my take away as well, it forces you to deal with most of your technical debt up front and not let it pile up to some indefinite point in the future.
Essentially brutal purity is a really strong anti-kruft coating. Since kruft can't stick to your code, it tends to stay nice and shiny well into the future, unlike the dirty impure code elsewhere that builds up a nice thick layer of kruft over time and eventually collapses under that weight (or else some poor soul ends up spending a bunch of time either scrapping all the kruft off, or just throws it out and builds a new one).
While i can see his point, I'm not entirely sure this is the case. A text-book example of a multi-paradigm language is Scala. And yet with Scala, the more I've learned about the functional side of things, the more tempting it is just to avoid any imperative or object-oriented idioms or patterns and just go functional all the way. it's almost as if the designers of scala knew if they could convert java programmers over, the temptation to do bad things would go away if only because functional is so much more elegant that it doesn't make sense not to use it as much as possible.
My experience with Scala was the opposite: it has great support for OO and imperative programming. The functional way of doing things often wasn't better, while on the other hand, traits are crazy flexible. I've seen people go crazy with functional code in Scala that is barely readable and impossible to debug; the JVM debugger is still optimize for imperative programs, while no one has really figured out a good debugger for functional programming yet.
OO/imperative programming isn't all "bad." Being functional doesn't really ensure that the code is "good."
There are many programming languages, specifically structured, to prevent programmers from making a certain subset of bad programs. This is incredibly useful.
Yeah, but these languages tend to be not a good fit for inherently complex problems. When one tries to express complex ideas in those languages, you end up code that's either absurdly verbose, or code that experts in that language's idioms would call "bad".
One can write simple, structured code for simple problems in any language. But complex problems require languages that don't attempt to babysit programmers.
Some languages like Haskell put you in a purity (chastity?) belt; if you can code your solution there, you can code it anywhere. The problem is knowing your problem well enough to code it in Haskell...while a lot of time we are writing code to understand our problem (prototyping, experimentation). Some people live in a world where they understand their problems before writing code, and all their code is expected to immediately hit production.
Scala is quite flexible and can accept good and bad code, which is good. I was only calling out FP abuses and its promotion as some sort of holy grail that is so much better than objects. If you compare Scala FP to Java OOP, ya, Scala FP is better, but if you compare Scala FP to Scala OOP, it's a much fairer battle and I put my faith in the latter (with the caveat that I can always use Scala FP when it makes sense to do so!).
So what you are saying is that Haskell forces you to understand the problem for which you are implementing a solution, and this is bad because often that is too hard.
Instead one should be able to just hack away until the code kind of does what they want (even though they can't prove anything about that piece of code) because that is the only way to really manage inherently complex problems?
The older I get the less I find writing code an attractive method for understanding a problem. I also find utilizing something like a type system to enforce invariants in code liberating - and actually conducive to solving problems.
Well, it helps to be able to inspect your code at various stages to isolate which part of it isn't working right. In the pure functional paradigm, my programs are typically the composition of numerous functions. Now what if my program isn't doing what it should? (Contra FP advocates, it's possible for this to happen even if your program successful compiles.)
You could have it output the value after each function is applied, but that would either break purity (by having I/O) or, per the GP's point, be tedious to write. At this point, the Haskell community made the decision that in debugging, "screw purity", and the output is effectively untyped.
Certainly, you can use tests of expected invariants (eg QuickCheck), but that just tells you that the whole thing fails.
That is, I think, also the GP's point: that the same things that make your final code good in Haskell, also make it hard to write.
(No to say I don't like haskell; this is just a peeve,)
Long answer: You can, but Haskell is very serious about keeping pure code separate from impure code. Any function which performs I/O is impure, so its output must be an IO Thing instead of just a Thing. This means anything that uses its output must accept an IO Thing instead of an ordinary Thing, and so on. Of course, the way I/O works means that a 'Type1 -> Type2' function into can be transformed into an 'IO Type1 -> IO Type2' or a 'Type1 -> IO Type2' function (this applies to many things other than I/O -- consult one of the many monad tutorials for more on that). Haskell even includes some special syntax for invoking these transformations that makes things look rather like an imperative language, but adding in a debug print still requires you to rewrite a lot more code than you would have to in another language. Alternatively, you could use unsafePerformIO (whose type is 'IO a -> a') to hide from the type system the fact that you've done some I/O, but you may run into trouble with the printing not behaving as you expect (due to Haskell's lazy evaluation).
Your long answer is correct. I think I disagree with your short answer.
You can, and this is not just a technicality - it can be useful in debugging.
It's certainly true that laziness means that when things print will be unintuitive if your intuition is trained on strict languages, so any short answer conveying as little information as just "yes" or "no" is liable to be confusing, though...
The concept is that affecting stdout isn't "just" - it requires a clear definition when (and if) that function will be executed, and in Haskell you don't do that if you can avoid that - it allows for lazy execution, actually running the function less (or 0) times depending on how it's result is used.
I.e., if your program needs top 3 results in a race, then you can safely use a function that returns all the results, but since you afterwards use only first three then the remainder aren't calculated, that code most likely isn't run and the order of any "embedded" print statements would be undefined.
That being said, standard Debug.Trace module allows simple adding of debug prints anywhere.
Remember that Haskell is lazy. In broad strokes, the way you control when things happen is to sequence them by combining them (directly or indirectly) into the massive IO action that becomes main.
If you don't care when it happens, you can use unsafePerformIO - which is just fine for debugging, though there are often better approaches.
You can. You use a function called `trace` in module Debug.Trace in the standard library.
For instance, if you defined
f x y = g (x + y)
then you could instead write
f x y = trace "calling f" (g (x + y))
to insert a debug statement.
(dllthomas linked to the library but didn't point out how it works, and unrealistically expected people to like, follow the link or something. argv_empty is correct that side effects cannot be obtained without either wrapping functions in the IO monad or calling unsafePerformIO. Debug.Trace.trace calls unsafePerformIO behind the scenes.)
As mentioned, it'll only print when the wrapped value is actually evaluated, which won't always correspond to it's position very well.
However, it's very much worth noting: if you're in IO then you can just sprinkle debug statements. If you're not in IO, then your function isn't effect-full and you can run it (or pieces of it) safely with whatever args you care about (also, quickcheck is amazing).
That doesn't help with debugging pure functions. Many times I've been in a position where something is going wrong in the composition of several functions and I want to know where the error is in all of that. Normally, I'd just append a print statement to each of the functions, but in haskell I'd have to rewrite the program to allow output at each stage or else break purity.
I'm not following your first sentence. What doesn't help with debugging pure functions? Debug.Trace is quite precisely for the situation you describe.
Another option would be to pull the file in question up in ghci and play directly with the components of the function - in pure code that should be safe in a way that it won't be in effect-full code.
I also dispute the assertion that QuickCheck doesn't help - if you're getting weird behavior, then hopefully you can characterize that weird behavior, and get some example failed values out of QuickCheck - and getting a picture of what values the function is failing for can absolutely be useful.
The computer is a tool I use to help me solve problems. Part of the process of solving problems, for me at least, is to write code that executes, and spend some time in the debugger to see if my assumptions were right. The computer is a tool for solving problems, dammit! It is ridiculous to think that you should solve the problem using pencil and paper (b/c computers no good for problem solving!?!), with various mathematical proofs (b/c we can and should prove everything?!?) and the computer is just an annoying end point for getting solutions out.
Programming is not just about implementing solutions, but also about finding solutions.
I actually agree with you. My point was that types are a much better tool than hacking code and debugging. Once the types are embraced and used as a tool, instead of thought of as a straitjacket that is where the real power lies.
Types, especially with some good inference, don't necessarily get in the way of exploration, but purity most certainly does. Not being able to take shortcuts and being expected to have a pre-well elegant understanding of your problem can be a big blocker.
But types are fairly conservative, dynamic languages still provide quicker turn around times even if semantic feedback can't be provided. One of my research projects is focusing on getting the benefits of both.
Some people like to think in terms of 'proofs', it seems reasonable to me that they would think of a compiler as a tool for 'solving problems' in those terms. But most of us seem to be more empirical.
Interestingly the best reverse engineers I've ever met, who are total descriptivists with a debugger, are also some of the most pedantic prescriptivists when it comes to writing actual code. It comes full circle, I think because they understand the reasoning behind the API contracts better than the API documentation tells.
I think what is being said is more, one of the things computers are best at, much better than humans, is simulation--following rigidly and repeatedly the consequences of a set of rules, through cause and effect, to see the eventual behaviors of a set of rules.
When constructing a set of rules in the first place (or when first formalizing an existing, but implicit, set of rules), the human can do the requirements gathering, design, bug-finding, etc. But humans are terrible at simulation--they can't foresee what each and every consequence a particular rule might have. When constructing a rule-set, a human's productivity is enhanced by the computer taking over the "simulation" part, to explore the consequences for them, where the human can then change the rules in response.
Type systems of all kinds, though, basically assume that the rules defining the code are invariant. They don't help with simulation at all.
I disagree with that. Used correctly types are exactly a lightweight simulation layer. Values are not always what you need to know about and, again used correctly, types can reveal a lot of interesting emergent dtructure of your code.
Types prove that something cannot happen in the formal system the program represents. In the real world, though, radiation can flip memory bits into impossible positions, drive blocks can fail and spit random garbage, IPC messages can get dropped on their way to the receiver, etc.
Sadly, even when your system is pure and elegant and you've proven it all out, you still need runtime asserts (and more advanced things like quorum voting on the results of computations, with the ability to fail nodes experiencing problems) before you can be "sure." Ada wasn't enough for NASA; they needed redundant processors too.
CS is weird as a discipline: we get all the tools of mathematics (digital logic, proof-verification), and all the tools of engineering (rate-of-failure calculations, high-assurance systems)... and then it turns out that the problem-space (ensuring perfect automation over trillions of repetitions of a task) is so difficult that it requires both! Arguing types vs. asserts is a "rabbit-season, duck-season" debate. Anyone who isn't using both a type system and runtime checks, is working in the dark.
However, I can and should probably point out that useful runtime checks can be derived from static properties of your code--and a Sufficiently-Smart Compiler[1] would automatically insert them as it compiled your code.
---
[1] not all that rare these days, GHC's stream fusion goes way beyond "sufficiently smart"
Yeah I use strong typing, assert()s, and 'if (!x) throw ...' everywhere. Nevertheless, I figure if the biggest risk to my code comes from cosmic rays, then I'm doing pretty good.
Scala was actually the first language in which I've truly appreciated object-oriented programming. It's really quite simple when you get it: Objects Are Modules. You do not want or need an object for everything (looking at you, Java!), but when you want to abstract away some combination of behavior and data, you do in fact want OOP.
I'm still waiting for his in depth article about his experiences while porting Wolfenstein to Haskell - mainly to see if it's worth learning Haskell. Beacause as he said in the video: Most examples in books (I guess that counts for evangelization blogs too) are toy examples. And I'd like to know how viable the language is for bigger systems.
This is highly anecdotal, but I've built and been part of very practical/non-theoretical, large Haskell projects (100k+ lines, which is a lot for Haskell). The only big complaint I have is that it's somewhat hard to do loose coupling, i.e. for something somewhere to reference a type without either redeclaring the type (when that's possible), or having a huge, centralized Types.hs that declares all the types that are used in different places (to avoid cyclic imports.) (Contrast with e.g. Go or ML where you have interfaces/modules without 'implements'.)
This isn't unique to Haskell by any means, but it's the only real complaint I have about Haskell as a language for non-toy projects. The benefits definitely make it my go-to language. It's hard to list them all, but by far the nicest feeling is the correctness: when your code compiles, 60% of the time your program works every time. (Not to imply that tests aren't necessary--QuickCheck is great for that.) It's an otherworldly feeling to write a program not in terms of what to do, but what kinds of filters you want to put on something, and have it just work (and either stay working, or break future compiles if something's changed!) after compiling 10 lines of code, when you would have written at least 50-70 and had to debug it in almost any other language.
Edit: I'll add another complaint: Haskell is like C++ in that it's incredibly easy for a codebase to become completely unmanageable if your team doesn't have a common style/discipline. Go is a nicer language for "average"/"enterprise" teamwork, I think, since it almost forces you to write programs in a way everyone will understand. If you're in a team with good programmers that you trust not to abuse the language, this is a non-issue.
Edit: Okay, another one: If you change your Types.hs, the recompilation can take a long time in a large codebase, similar to C++. But GHC/Cabal keep getting faster.
Have you tried .hs-boot files for separate compilation with circular dependencies? It’s not bad, but it does lead to some extra crapwork when changing interfaces, much like with C++ headers.
I haven't, and it does look pretty cumbersome. So far a couple of Types.hs, i.e. for functionally separate components, have worked out fine, but I'll check it out if that becomes unwieldy.
Do you mind telling me what field is your large Haskel project in? I want to get a feeling of where Haskel is in the commercial world. Is it a finance industry product?
No, this is net tech. Crawlers, parsers, protocol implementations, web servers, cryptography. Sorry, I'm not allowed to disclose much more. Suffice it to say it's not specialized/overly mathy/abstract, and it's very similar to the kind of software the majority of HN users write.
I do hope I can advertise it at some point, as the "Haskell in Industry" page is sorely lacking non-finance companies, and I'm sure there are a lot of companies out there using it.
My biggest project probably takes about 5-7 minutes to compile in dev mode (no optimizations), and about 15-20 minutes with -fllvm and -O2 on a not-super-beefy i7 laptop (I haven't really timed it recently.)
It's not noticeable in general, since only the Types.hs affect many different files, and they're rarely changed. Also, 99% of the compiles are partial, i.e. only the files that have changed (or dependencies of them) are recompiled. Partial compiles don't feel slower than ones on my smaller Haskell projects.
Is it annoying compared to e.g. Go? Definitely. But it isn't ruining my life. And there's a lot of optimization, fusion particularly, that goes on in the background.
You frequently mention nice things about Go in comparison to Haskell. I'm learning Go at the moment, and must admit that I find myself struggling to stay motivated. Given your background in FP (and apparent enjoyment), what do you find appealing about Go?
I like both Go and Haskell. They are wildly different languages, of course (even though some things are at least a little bit similar, like typeclasses and interfaces, cabal and go get, type signatures/inference, 'forkIO' and 'go', etc.)
If it's just me making something, 99% of the time I'll pick Haskell, unless I know everything that I'm going to do is mutate a hash table or array, in which case I use Go. (Not that that's not doable in Haskell, it's just not as intuitive/easy to do efficiently. Keep in mind that I said "if it's the only thing"--Haskell's downsides in this area aren't significant if you're also doing other things, and especially so if just some of those things are pure/don't have side effects.)
The reason why I say "if it's just me" is that Go is much nicer to use in normal teams. To me, it's a Java/Python/Ruby/JS competitor. Let's face it, a lot of enterprise teams aren't as disciplined or as good at/interested in programming as they could be. For this, Go is perfect. You could read that as "Go is for average programmers", and that is true--in a good way. It's extremely easy to pick up for new members of the team (Haskell is very hard, I have to admit!), everybody can collaborate without asking a lot of questions, because of 'go fmt' there's never any indent wars, and it's a snap to compile and deploy binaries.
If I have to do anything like map/reduce/filter, really any kind of operation on a set, I hate using Go. But it's not Go that I hate; it's most imperative languages. I don't want to specify how to do all of that, much less repeat how to do it. (In Haskell you can actually run into performance problems pretty easily because you've gone overboard with filtering sets in ways that would have sounded alarm bells in imperative languages.) Granted, languages with generics are better for this, and to me it's the major thing Go has to gain from generics--but as you implied, I just enjoy the "freedom to think about important things" that you get when you're thinking about results and not operational steps. It's unfortunate that it's so hard to know what this feels like without actually picking up a functional language.
When friends ask me which language to look at, I say "get to know Python, then learn Go" nowadays, mainly because pointers and pass-by-value can be a little hard to understand. Go is a great Python replacement. Web applications and APIs/backends I've particularly enjoyed implementing in it. Haskell is for when you've spent so much time coding in imperative languages that it's all boring, and you want to step into an interesting, but extremely frustrating (at first) new world.
A lot of people say e.g. Haskell would be as easy to pick up if it was your first language. I don't think that's true, if only because of the amount of syntax you have to learn--Scheme is probably better--but I do wish I could go back and try.
I think it would be fair to say that Wolfenstein would qualify as a relatively small system in Haskell. Haskell's own compiler, GHC is a highly complex system and one of the most sophisticated compilers in existence. The source is free, so it's a great way to see what Haskell looks like when used for large applications in the hands of experts.
Haskell is very well suited to large applications in my personal experience. My feeling is that it is actually worse for small-scale applications where you don't need the type system guarantees it provides.
> The source is free, so it's a great way to see what Haskell looks like when used for large applications in the hands of experts.
I tend to disagree with this. Yes, GHC is a large application and most certainly developed by experts, but it's also a project with a very long history, whilst Haskell (language, extension, conventions,...) have changed over the years, and this shows in several places.
Next to that, GHC code isn't very idiomatic at times (e.g. for performance reasons, or because it can't depend on too many external high-level libraries).
Since GHC is bootstrapped by a lower stage compiler I'd imagine there are some limitations to what kinds of language features at least parts of it can use as well. I haven't looked into all the details, but last time I did look into building GHC from scratch I remember there being an initial step where you built an initial minimal Haskell compiler that was written in I think C--, and then used that to build the rest of GHC.
I've been using the xmonad window manager, which is roughly 1000 lines. While not a vastly complex system, it's non-trivial, especially solving things like gui interaction in a side-effect free world. I believe they had to use some sort of zipper structure for the mouse detection to overcome things that are easy in other languages.
I wouldn't say zippers are really "overcoming" anything, they're very nice and elegant data structures, particularly because any data type has an associated zipper form.
Go seems to have been used in lots of networking-related scenarios, but I have yet to see either a major graphical application or even well-known open source game written in it yet.
I would love to see what the architecture of a 3D renderer looks like in Go, as an example.
Like you, I also share the same interest with Haskell.
Laugh as you might, but this is the interface of Datomic, the database system by Rich Hickey, the creator of Clojure.
To perform a query, you connect to a database and request the present state of the database. This is an immutable representation of the database at that point in time, and you can query it however you like, or even hold on to it forever. Of course, the database isn't actually fully copied.
No, it's far more elegant than that. Datomic makes the entire database an immutable, persistent data structure. This means that anyone can grab the database as a value -- at any point in time, past or present -- and never have it change behind their back. There are no read transactions whatsoever and clients have the power to conduct their queries offline. This lets you easily scale read operations without limit. The only bottleneck in the system is a dedicated server known as the transactor which handles the coordination of all updates.
Looks interesting. The docs make it sound like a temporal triple store which you can (lazily) make replicas of.
How is consistency handling performed? Does it rely on knowing my snapshot version and checking for write conflicts (a la oracle SERIALIZED level), or is it based on explicit locking?
There are no snapshots (or you could say that everything is a snapshot); you can get a value of the database from any moment in time, stretching all the way back to the creation of the database. All transactions are performed against a stable view of the database (its value at a moment in time) and either succeed or fail entirely (they are atomic). Once a transaction is validated and sent to storage the new data can be communicated to all clients and cached indefinitely; it will never change.
Datomic has no notion of mutation/destruction. Data is stored in the form of facts which are asserted or retracted. Retracted data can still be retrieved from an earlier moment in time, so it isn't really gone. In many ways, it has a lot in common with DVCSs such as git (though it has only a single authority with commit access: the transactor).
I would rather say that everything is a snapshot - presumably I am working at a particular timestamp level - i.e. I won't get some facts that were valid at Timestamp A and some that were valid at Timestamp B.
My question is if/how you do conditional updates. Say I'm storing some particular concept C as a collection of facts, and I want to update fact C-1 to '15' if fact C-2 is '0'. In an RDBMS I might select for update fact C-2 to make sure it didn't concurrently change underneath me while I'm making the change - is there an equivalent in Datomic? I understand that under normal circumstances reads are completely decoupled from writes, but what if I want to make a write if and only if a certain state holds?
Datomic lets you write a function and send it to the transactor. This is a pure function which runs against the database value (specified) and returns a new database value which becomes the result of the transaction (assuming it succeeds). Because Datomic is also a library in your application, you can run this function on the client side as a dry run (against a value of the database) before sending it to the transactor.
I don't really get the joke. Generally functional languages will share more data than imperative languages, and therefore will copy less data. In an imperative language, it makes sense to copy whatever data you are working on, so your modifications don't inadvertently affect other parts of the program. In Haskell it doesn't.
He's just saying that, in pure functional code, "updating" a data structure means creating a new copy of it with the desired changes instead of actually changing the structure in place. Of course, since the structures are immutable, large parts of them can often be reused. That's not how it's presented to the language user, though.
No, typically functional languages end up copying more, not less, since typical algorithms will e.g. return a fresh copy of the modified structure - the purists detest mutation. The benefits of data sharing, are, imo, overstated outside of a few niche, datastore-y type areas.
At the level of source code logic it appears to be copying and returning a new data structure. That doesn't mean the compiled, optimized code is performing all the copies that the the program appears to be doing.
Pure FP = don't mutate variables in the program. Of course the actual implementation can reuse memory blocks or else pure FPers would have to keep buying new memory!
x is a bound variable. It doesn't 'vary' in the sense that the value it refers to can be mutated, as in a imperative language, but it varies between calls to the function f.
It is rebound. It is improper to call it a "variable" because in any given lifetime, it's value cannot vary. That's the point. It is possible to create a new binding with the same name, but any code holding on to the old instance keeps the old value. Subtle but important difference.
"Varying, in the context of mathematical variables, does not mean change in the course of time, but rather dependence on the context in which the variable is used."
While I agree with your philosophy of vocabulary here, I've read the Harper article too, it's valuable to state directly that everyone knows there's the _(Programming) article too and is simply disagreeing that its a "good" way to use the term. Anyone who isn't familiar with the distinction in terminology ought to consider what things a programming variable represents that a mathematics variable does not..
"The identifier in computer source code can be bound to a value during run time, and the value of the variable may thus change during the course of program execution. Variables in programming may not directly correspond to the concept of variables in mathematics."
It's wrong that it's "improper" to use the term "variable" for Haskell variables, since it is an established usage. This usage of "variable" occurs in the very first paragraph of Church's first paper on lambda calculus from 1931 in the motivation for the whole project!
That's a rather strange assertion, if you mean "variables" and "values." Math distinguishes between variables and values. Variables have different values. Values can be bound to variables. Sometimes we do math with mostly variables and few values -- that's called algebra.
When combined with lexically scoped closures, if you try to shadow your value bindings in a functional language and expect it to work the same way as mutating a variable does in an imperative language, you're in for a surprise.
Sure, you have to restrict the scope in which rebinding is legal, but I don't see how it's fundamentally any different than the use of a phi-node in SSA form.
val it = 1 : int // a is still bound to 1 as far as f() is concerned
Now in Javascript:
> var a = 1;
undefined
> function f(){return a};
undefined
> f();
1
> a = 2;
2
> f();
2
>
If you bind a val, then you define a function that references that val, then you later shadow the val binding, then call the function again, the function still sees the earlier val binding, because it's a closure of the environment at the point where the function was defined, not at the point where it was called. This is unlike variable assignment in imperative languages.
Functional languages generally encourage the use of persistent data structures. Algorithms that modify persistent data structures do not generally return 'fresh copies'. If you use the proper data structures in Haskell or Clojure your 'typical algorithms' will not return fresh copies.
That being said, whether data sharing is a good thing or not depends on the situation. For example, copying to a cache can expose more memory parallelism in an application. I feel like you are taking this opportunity to take a jab at FP here ("the purists detest mutation"). While FP is my preferred paradigm of development, I wasn't trying to push it on anyone--I was merely making a technical point.
You might be surprised just how much churn there is if you look at the source of, for instance, Haskell's Data.Map, which in the name of immutability is implemented as a balanced binary-tree representation, rather than the more common hash table. Many operations that are amortized O(1) with a hash table are O(log N) with a balanced B-tree, since for instance after an update a all the sub-trees around the update site will be made afresh, since the balancing will change.
The balanced binary tree of Data.Map is nowhere near state of the art in terms of persistent data structures. Clojure (and Haskell's unordered-containers library) use Hash Array Mapped Tries (HAMTs) which have a much higher branching factor (32-ary) than binary trees. In practice, this is a neglible difference from constant-time.
I'm sure there is a difference, I just don't care. They're still damn fast and the semantics of immutability would be worth quite a large speed difference to me.
If you do happen to be doing something where the difference is truly going to matter, or you're writing a library anticipating needing that kind of performance, using mutability internally in your functions isn't discouraged as long as they remain referentially transparent: http://clojure.org/transients
That's a good point. People don't understand that functional purity in Haskell, while a nice goal, isn't a law, you can break it if you have a good reason to. IO is sort of the canonical example, it isn't there to prevent you from doing IO, it's there to act as a giant warning sign that either this function, or one of the functions it uses isn't pure.
Likewise mutable data structures while not the default, can be implemented in Haskell, they just have some caveats attached and shouldn't be used lightly. In other words, all things being equal, use the immutable data structure, but if you have a good reason to, you can use a mutable data structure instead, just be sure you know what you're getting into.
The problem is that "all things being equal" never happens because immutable data structures have worse performance if not used in a way that benefits from persistence. IMO, that makes them a poor default.
I'd rather say that 'performance+ convenience/safety-' is a poor default - minute performance improvements matter only in the bottlenecks of your code, so it makes sense to use the varsatile though slightly slower implementations as the default, and have the 100%-speed option require some explicit code as they are needed only in a minority/exception segments of your program.
:) You could say that "all things being equal" never happens because mutable data structures have worse persistence if they're not used in a way that their performance matters. But also, different defaults make sense for different problem domains.
Pretty much everything except performance benefits from immutability and persistence, which is why using anything else as the default seems insane to me.
Clojure's data structures are implemented using trees that have a 32-way branching factor, so each node has 32 children, instead of the 2 you'd see in a binary search tree. A lot of things you might expect to be O(1), like checking for membership in a set, or getting the nth element of a vector, actually take O(log_base_32(n)).
Ah, I see. That doesn't matter in big-O notation, though, because log_32(n) = log_2(n) / log_2(32), that is the difference between two bases is a constant factor.
I was a bit confused because O(log_32(n)) = O(log(n)) = log(32)+log(n) = O(log(32n)), so no matter how I parsed it I would just get O(log(n)) :)
Also, C++, which is the king of mutability, also implements map as a tree instead of a hash table, probably trying to avoid the worst-case linear search.
The standards committee could not come to a complete agreement on hash tables and therefore it was not a part of C++03. Hash tables were in consideration along with the tree based map. However, maps made it in time but hash tables didn't. It was not one versus the other. They wanted to have both and only one was ready on time.
Eventually by the time they came to an agreement on hash tables (C++11), a lot of vendors had their own versions of it as non-standard additions to the library. To avoid conflicts with these existing hash tables, they named it "unordered_map".
The C++ standard places a number of requirements (ordered, stable iterators/references, logarithmic operations) on std::map which really only lines up with balanced binary trees. As another poster pointed out, C++11 introduced std::unordered_map (A hash table).
Of course, there's Data.HashMap.HashMap[1], which is just a Data.IntMap.{Lazy,Strict}.IntMap[2], whose "implementation is based on big-endian patricia trees".
> Many operations have a worst-case complexity of O(min(n,W)). This means that the operation can become linear in the number of elements with a maximum of W -- the number of bits in an Int (32 or 64).
While I agree that a functional implementation of something usually copies more, it's a little disingenuous to say that most algorithms return a "fresh copy" of a structure. Most of the collections in most of the functional runtimes use clever tricks (hash array mapped tries, hash consing, etc.) to avoid copying the whole structure.
In the face of these techniques (and others, e.g. re-structuring things to use zippers and being clever manually or whatever), I suspect the actual amount of redundant data might be quite a bit less than you might think.
What kind of evidence do you have for this claim? The amount of copying in a functional or imperative program depends on how they're written, how they're optimized, and how they've compiled or evaluated.
For example, in programs with mutable state, it's not uncommon for modules to return deep copies of objects to avoid mutation-related bugs. Java's String objects are immutable precisely to avoid having to do this.
> Generally functional languages will share more data than imperative languages, and therefore will copy less data.
It's the opposite: functional programs prefer immutable data and the only way to achieve this is to copy a lot more data than you would if you were just overwriting existing objects.
It's not so black and white, if you don't actually need multiple versions of a data structure, then you are strictly copying more. There are scenarios where you do want versioning, of course, but this is why its good to have choices.
AFAICT he doesn't reach any particularly satisfying conclusions and just speculates that Haskell is an interesting avenue of research that may bear fruit for game development.
It doesn't seem like he has put the same amount of effort in experimenting with Lisp. He doesn't mention any attempt to port Wolfenstein over to Common Lisp. Instead he seems content speculating from the same position many Lisp doubters have after reading a few books and working on some exercises (which is ironic considering his impetus for the Haskell project). I hope he gave Lisp the same treatment as Haskell before he drew any conclusions but it doesn't seem like he has from this speech.
Lisp for game development could be an interesting avenue (and has precedent in AAA console development). The dynamic vs. static argument isn't the interesting feature. Personally I think the symbolic model of computation is far more compelling. I've read posts by programmers who've written a high-level language for writing financial trading algorithms in Common Lisp that compile down to optimized VHDL for running on FPGAs. Sure you don't have a static analyzer to tell you you've done something wrong before you run your programs but I've rarely seen that becoming an issue in practice at that level. There are plenty of Common Lisp libraries that have been around for a long time that don't require much maintenance which makes me wonder where this belief that dynamic languages don't produce solid, maintainable code comes from.
In my rather limited experience I find the over-specification required by statically typed language to be a impedance to writing robust, compose-able software (at least it's much more difficult and tends to lead to Greenspunning if you try to go that route).
Either way... a very interesting talk and it's cool to hear that he's experimenting with this stuff. Carmack is in a rare position to have such a breadth of experience and deep technical knowledge that even just messing around with this stuff might make waves throughout the industry.
If you'll go read his opinions, you'll see that he has been a strong advocate for static analysis. His position is very understandable since he spent a lot of time working on complex pieces of C/C++ code, encountering a lot of subtle accidental errors that could have been avoided with better tools for static analysis or with a better language. He also worked mostly on client side software, where the fail fast / fail hard strategy of projects built in dynamic languages doesn't work so well.
Lisp vs ML vs others is a very subjective issue since languages involve different kinds of trade-offs, so optimal choices depend on the project or on personal style.
Also, this is John Carmack. For me he was a God 10 years ago and he's still one of the best and most practical developers we have today. Seeing him talk about Haskell is amazing.
BTW, I've been a big fan of Scala lately, and I don't see myself using much Haskell precisely because its tools for modularity seem limited. In Scala you can use OOP to build abstract modules, with the much hyped Cake pattern being based on that. Dynamic languages are naturally more modular, however I still prefer static languages.
Haskell's tools for modularity cut in a dimension you're not used to, and if you're not experienced with it, won't see. Percentage-wise in the last 15 years I haven't spent that much time in Haskell, but it's Haskell's tools I'm missing more often in imperative languages than the other way around. (And note the "more often"; it's not that I never miss more conventional tools in Haskell, it's just that I spend more time missing Haskell tools in other languages.)
I would think it is because of how "modules" can effectively be "bolted" on top of existing things quite easily.
Consider putting a new "module" on a bicycle. If you were doing it statically, it would have to have the screws in the proper place to be able to attach as it needs. Dynamically, however, you just use a zip tie to hold your piece onto the bike.
To be sure, if you buy a nice bike, many of the common attachments have "static" points where things can be added. If you want to place a holder for a phone, however, you are much more likely to do something that is much more adaptable at fitting things on.
Yeah, Carmack has weight due to his history of being incredibly technical and writing lots of incredibly useful code. Also, he has a huge profile in the industry.
He's been a heavyweight proponent of static typing for a while - 2-3 years IIRC - and so it doesn't totally surprise me that Haskell is up his alley. It's particularly notable that he prefers it as a technical lead as a low-pass filter on bugs.
Thank you so much for that second link! I've been interested recently in how dependently typed languages can be made more natural for general-purpose programming, and this looks like a really good read. It gives me hope that it won't be too long before there's a dependently typed language comparable to Haskell in terms of community and ecosystem - in my (limited) experience, Agda and Idris are awesome for theorem proving and for small programs, but they're both a ways off from being usable for large, complex programs, and I feel like a big part of what they're lacking is just well-documented libraries for common operations.
When Carmack speaks, I do listen. He's a very good programmer and has the level of experience and technical expertise few will get to enjoy. I think it's awesome to see how far he has come from being a C purist, to C++, and now nice functional languages like Haskell.
Already mentioned in previous threads, but Tim Sweeney (Epic Games), who made a long presentations about the benefits of FP around 2006, is presenting a paper about a formalism related to dependent typing http://lambda-the-ultimate.org/node/4791
I can't watch his talk right now, but at QuakeCon he recently discussed Scheme, and the potential for embedding it in a game. I think he also pondered what Quake would have been if QuakeC was based on Scheme instead.
But my basic issue with talking about Carmack with respect to programming languages is that he never struck me as an expert on languages. Or even programming for that matter. The source code to Quake was on the messy side of things. His rationale for switching to C++ always seemed a little on the thin side of things, and seemed more like industry pressure than anything. He's great on concepts, on graphics, and keeping up with OpenGL and whatever NVidia/ATI are doing. But he's just now coming around to Scheme and Haskell. I don't really think his perspective on languages carries that much weight to be honest. But I'll listen to him talk about it anytime.
As an intermediate haskeller (I use it for my business every day) his criticisms largely mirror mine. I found the quake codebase to be great in the 90s. I have also found him more than most other developers to admit when thing didn't work out, specifically that doom 3 was c++ in name only and he didn't fully understand it. A scheme based language would have been better for most modders than quakeC.
That's one of the nice things about the Haskell community. The experts tend to be much, much better than in other places. There's much more depth there than memorization of quirks and ability to hold large numbers of arbitrary, disconnected pieces in your head all at once.
So, while becoming an expert haskeller is enviable, intermediate still suggests an ability to tackle a
Fantastic array of problems.
Indeed - it's definitely one of the best places to be if you like to be surrounded by incredibly talented, but also very humble folks.
In fact I'd go so far as to say that the pool of wicked smart people who would love to build stuff in Haskell represents a significant opportunity to start ups right now. How have you found hiring for your company?
Well while I can produce large swaths of elegant code when I need crazy instances I usually call upon my colleague for his invaluable assistance. I don't claim to be an expert in the language. Many people know much more about it than me.
I've been trying to come up with a general way to rank Haskell experience, and so far I've got:
Beginner: Can read the basic syntax, knows about E.G. Monads and folds
Intermediate: Understands and can use the more advanced classes like Applicative, Arrow, lenses, and how to use things like fix
Advanced: ??? Makes new powerful classes/libraries like Pipe, Conduit, or Netwire? Can easily translate a problem domain into a concise and elegant type representation.
Maybe there needs to be some more layers in there, I don't know... also the advanced level feels weak to me. I'm still somewhere between beginner and intermediate myself.
Well I don't claim to understand the mathematical basis of Monads, and I don't use arrows and just started using applicative more often. Lenses rock and I have used those for a long time. I do build my own Monad Transformers so maybe I am still a beginning with intermediate tendencies. :-)
What I meant by that is that he doesn't use what most people qualify as "best practices" today. At least per the language he uses. If you look at his (or id Software's, rather) C++, it doesn't follow anything resembling standards in that area. They don't use design patterns, there are straight C (stdio, for example) calls in C++, etc.
I'm not saying that's bad. But it's obvious he doesn't keep on top of trends that well. Or languages, for that matter.
An outside perspective is valuable in its own right, and especially when it comes from a widely respected programmer. I think it's valuable to have both long time expert users, intermediate users and relatively new users chime in on issues: the experts often lack a certain perspective, since it was so long ago that they were novices themselves. And of course the intermediate and novice users lack experience. Together they can hopefully complement each other.
I think that's part of why people are so interested in his opinion in this case. The games programming industry has some very unique challenges, and the focus on squeezing every last drop of performance out of the stuff you write tends to make it a pretty brutal environment for programming languages. From his background as a voice of authority in the games programming industry it's interesting to hear his feelings on the merits/drawbacks of something like Haskell specifically from that perspective.
The fact that he's not a long time user, or particularly a fan of functional programming, but a relative newcomer to it just adds more weight as he isn't some known fanboy that's going to praise FP no matter what. He's got extensive background in C, and at worst, average C++ skills if not considerably better than that, so he makes a fairly good case study of how practical it is for a non-functional programmer (but someone who's really experienced with imperative/OO programming) to pick up something like Haskell and actually do something useful with it.
It's also good to listen to the things he doesn't like about Haskell, E.G. debugging, and see if anything can be done to improve that.
Languages talk about being multi-paradigm as if it's a good thing, but multi-paradigm means you can always do the bad thing if you feel you really need to, and programmers are extremely bad at doing sort of the the time-scale integration of the cost of doing something that they know is negative. I mean everyone will know...it's like "this global flag: this is not a good thing, this a bad thing - but it's only a little bad thing" and they don't think about how, you know, the next five years how many times that little bad thing is going to effect things. So brutal purity: you have no choice.
In particular, I like the bit about integrating your technical debt function with respect to time, over the time you have to live with the debt, and how bad programmers are at thinking in those terms. We tend to think about how much technical debt we have at a given fixed point in time, but the area under the curve is what bites you.