Maybe it's just me, but this looks like a bunch of incomprehensible gibberish code. I'm sure I could understand it if I spent many hours poring over it, but why would you ever do that? When written out in Python (or even in JS!), the whole thing looks so much simpler and more closely resembles the underlying math. This is especially true if you use a package like NumPy. I know he said he didn't want to use any libraries, but if the underlying primitives in the problem are vectors/matrices, then it seems like you are reinventing the wheel in a very substandard way that doesn't aid in understanding in any way and results in something that isn't beautiful, isn't high performance, and is confusing for someone to read, even if the person is familiar with the subject matter!
> but if the underlying primitives in the problem are vectors/matrices, then it seems like you are reinventing the wheel in a very substandard way that doesn't aid in understanding in any way and results in something that isn't beautiful, isn't high performance, and is confusing for someone to read
You mean like...both of the languages you listed?
There's an obviously superior, faster, simpler language when working with vectors (APL), but people are obsessed with new languages.
If you really think it can be done in Python better than in Haskell, why not demo it in Python? You'll get internet points, and if you're right, you'll have something to show for it.
You should be able to wget the file and run it (Python 3) from start to finish without any set-up and get ~88% accuracy on the test set.
It uses all the data (not one-sixth like in the blog posts) and does 200 iterations by default, so here's the loss plot on the training set if you want to skip all the fun: https://i.imgur.com/F57zmXV.png
This will probably shock you not at all, but I find this way easier to read than Python code. You can’t really compare with Numpy code since the author deliberately avoids loading libraries.
Of course, I’m more familiar with Haskell. The fact you’re more familiar with imperative languages isn’t the argument for readability you think it is.
If someone told me they wrote a neural net in Haskell, my first reaction would be "wow, that's cool!"; after all, I don't think I could do that. I wouldn't care if the code was gibberish.
I've never found a functional programming language that's easy to read.
The best way to understand functional programming without actually learning it is to think about an imperative language like Ruby, one that everyone says is easy but that is actually hard. The progression is:
programming you know -> experience you don't have yet -> code everyone uses
Ruby makes the jump to the last step without explaining the middle one. That's what the whole convention over configuration part is about.
Functional programming languages do the exact same thing. All of those arrows and symbols and syntactic sugar transpile directly to Lisp. They're basically shorthand. Unfortunately, I've never seen tutorials talk about that translation.
To get the middle part, it's probably best to start with the low-hanging fruit. Probably learn something like spreadsheets, then Scheme, then ClojureScript, then F#. I never made it as far as Haskell or Scala.
I always get lost somewhere around monads and impurity, and all FP languages fall down at that point in similar ways. You either treat mutable variables as imaginary numbers that aren't examined until they must be (lazily), or you throw the rules out the window and let variables be reassigned or rebound to themselves with new values, which breaks the whole point of using FP in the first place. It's pretty much an open problem, and the failure to solve it satisfactorily is why no FP language has caught on in the mainstream yet, IMHO.
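To make that concrete, here's roughly what those two approaches look like in Haskell (my own minimal sketch, assuming the mtl package for Control.Monad.State): the first threads the "mutable" counter through a State monad as an ordinary value, the second uses a genuinely mutable IORef in IO.

import Control.Monad.State (State, evalState, get, put)
import Data.IORef (newIORef, modifyIORef', readIORef)

-- "Pure" flavour: the counter is an ordinary value threaded through
-- every step by the State monad; nothing is overwritten in place.
countPositives :: [Int] -> Int
countPositives xs = evalState (mapM_ step xs >> get) 0
  where
    step x = do
      n <- get
      put (n + fromEnum (x > 0))

-- Escape-hatch flavour: an honestly mutable cell, updated in IO.
countPositivesIO :: [Int] -> IO Int
countPositivesIO xs = do
  ref <- newIORef 0
  mapM_ (\x -> modifyIORef' ref (+ fromEnum (x > 0))) xs
  readIORef ref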
I love Haskell as much as the next PL nerd, but the community has a real code-golf problem. An example from the blog post:
deltas :: [Float] -> [Float] -> [([Float], [[Float]])] -> ([[Float]], [[Float]])
deltas xv yv layers = let
  (avs@(av:_), zv:zvs) = revaz xv layers
  delta0 = zipWith (*) (zipWith dCost av yv) (relu' <$> zv)
  in (reverse avs, f (transpose . snd <$> reverse layers) zvs [delta0]) where
    f _ [] dvs = dvs
    f (wm:wms) (zv:zvs) dvs@(dv:_) = f wms zvs $ (:dvs) $
      zipWith (*) [(sum $ zipWith (*) row dv) | row <- wm] (relu' <$> zv)
...
descend av dv = zipWith (-) av ((eta *) <$> dv)
learn :: [Float] -> [Float] -> [([Float], [[Float]])] -> [([Float], [[Float]])]
learn xv yv layers = let (avs, dvs) = deltas xv yv layers
  in zip (zipWith descend (fst <$> layers) dvs) $
    zipWith3 (\wvs av dv -> zipWith (\wv d -> descend wv ((d*) <$> av)) wvs dv)
      (snd <$> layers) avs dvs
Writing this in 2-3x as many lines, with clear variable names for some intermediate expressions, would make it so much clearer. Haskell has a nasty reputation for "you have to study the shit out of it to make heads or tails of the code", and I'm pretty certain that 90% of that comes from how terse Haskellers try to make their code.
Just add intermediate expressions and annotate their types, maybe even with some type synonyms for intermediate types, because code is for humans.
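For instance, here's one way the learn/descend part of the snippet above might read with type synonyms and named intermediates (my sketch, not the author's code; it assumes the deltas function and the eta learning rate defined in the post):

type Biases  = [Float]
type Weights = [[Float]]   -- one row of input weights per output neuron
type Layer   = (Biases, Weights)

-- Move a vector a small step against its gradient.
descend :: [Float] -> [Float] -> [Float]
descend v gradient = zipWith (-) v ((eta *) <$> gradient)

learn :: [Float] -> [Float] -> [Layer] -> [Layer]
learn input target layers = zip newBiases newWeights
  where
    (activations, errors) = deltas input target layers
    newBiases  = zipWith descend (map fst layers) errors
    newWeights = zipWith3 updateLayer (map snd layers) activations errors
    -- For one layer, nudge each neuron's weight row by (its error * the activations).
    updateLayer rows activation err =
      zipWith (\row e -> descend row ((e *) <$> activation)) rows err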
All this building of NNs in Haskell, OCaml, or another language solves an extremely simple problem, which is fine if you just want to have some fun. But if the proponents were really serious, they would put out a detailed tutorial solving a complex task (e.g. building a state-of-the-art language model), showing how easy or difficult the process is and how a typed functional language helps in debugging the model, which is supposed to be the biggest selling point of these languages. That would also show whether it is viable for a real-life practitioner to invest time in learning these languages/frameworks.
People have done this, but with Hasktorch, which provides Haskell bindings to the PyTorch C++ libraries. The static typing definitely helps when you want to ensure that the input/output shapes of a layer are consistent.
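This isn't Hasktorch's actual API, but to illustrate the idea, here's a toy sketch of type-level shape checking in plain GHC: the layer sizes are phantom type parameters, so stacking layers with mismatched widths fails to compile instead of failing at run time.

{-# LANGUAGE DataKinds, KindSignatures #-}

import Data.List (transpose)
import GHC.TypeLits (Nat)

-- A layer's weight matrix, tagged with its input and output widths
-- (o rows of i weights each; the sizes live only in the types here).
newtype Layer (i :: Nat) (o :: Nat) = Layer [[Float]]

-- Composing two layers only type-checks when the hidden sizes line up:
-- stack :: Layer 784 30 -> Layer 30 10 -> Layer 784 10 is fine,
-- but swapping the arguments is rejected by the compiler.
stack :: Layer i h -> Layer h o -> Layer i o
stack (Layer w1) (Layer w2) =
  Layer [ [ sum (zipWith (*) row col) | col <- transpose w1 ] | row <- w2 ]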
Learning the basics has its place. People have to start somewhere. A tutorial for more complex, real world problems is a totally different thing. Both are interesting in their own ways.
If you have no background, why would you want to learn the basics in a language no one uses rather than the dominant language/framework of the specific domain? That seems like an extra cognitive load to bear.
I am learning the basics of neural nets using Haskell because it’s far easier for me to reason precisely and mathematically with Haskell than with a referentially opaque and imprecise language such as Python.
Because it only takes one user to make a language more popular. For example, Grammarly's first developer used Lisp as his main language, and their linguistics code is still written in a Lisp-family language.
As a Haskell developer, I'd be more comfortable reading Haddock-generated documentation for the types of the symbols involved; unfortunately, Haskell doesn't have many good-quality libraries in the ML field yet.
I have written quite a lot of Haskell in my life, but never a line of Python. For me this tutorial makes sense: it shows the underlying maths. So I don't really understand your negative attitude.
I have been casually playing around with Swift for TensorFlow since it was announced. It shows promise, but I question the long-term support and development.
The Julia language with the Flux deep learning library is another very interesting but not mainstream path to take.
I hope there is no long-term support problem, but I don't know. I am sort of addicted to Lisp languages, but I admit there is a lot to like about Swift, and TensorFlow being turtles all the way down in Swift is a great idea.
I think that Hasktorch is a very cool project but it is not turtles all the way down: it is a Haskell API on top of PyTorch.
The Haskell bindings for TensorFlow are a little bit difficult for me to work with. When Hasktorch gets more mature and stable, it will hopefully be easier to use than the TensorFlow bindings.
Nice. I really recommend that practitioners implement simple neural architectures from scratch for learning, but use TensorFlow, PyTorch, MXNet, etc. for production and serious research.
New frameworks like Swift for TensorFlow and Julia's Flux are a little easier to understand if you read the code, but it's still complex stuff.
Played with this for ~5 minutes and it's insanely bad (maybe it "guessed" right on 3 of 15 tries?), i.e. only slightly better than random guessing, even with very clear handwriting :(
Are you talking about the "Sample" button, or drawing the digits yourself? It seems to have a very good accuracy on the samples, but gets a lot of my hand-drawn digits wrong.
Seems like a classic case of overfitting to be honest.