
As others have said, F#'s most interesting language features are computation expressions, active patterns, units of measure, and type providers. The library, platform, and ecosystem benefits are gravy. Though subjective, the syntax is clean too, sitting somewhere between an ML and Python.

Something that no one has mentioned yet is that F# is now among the fastest functional-first programming languages. At least according to benchmarks (take with a grain of salt) like [1] and https://www.techempower.com/benchmarks/

[1] https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


I did not downvote (and you're clear now) but your post is not a relevant argument. The determinism that the Strong Free Will Theorem (SFWT) argues against is that of certain hidden variable theories of quantum mechanics. It states that if the humans are free to choose particular configurations for an experiment measuring this or that spin, then, bounded by relativity and experimentally verified aspects of quantum mechanics, the behavior of the particles cannot be a function of the past history of the universe. The main characters are the particles; people are incidental.

> "Our argument combines the well-known consequence of relativity theory, that the time order of space-like separated events is not absolute, with the EPR paradox discovered by Einstein, Podolsky, and Rosen in 1935, and the Kochen-Specker Paradox of 1967"

So as far as I can tell, it takes for granted the humans' ability to choose the configurations freely. That assumption, though suspect in and of itself, doesn't matter much to their argument, because it's not really an argument for free will; it's a discussion of how inherent non-determinism is to quantum mechanics.

> "To be precise, we mean that the choice an experimenter makes is not a function of the past."

> "We have supposed that the experimenters’ choices of directions from the Peres configuration are totally free and independent."

> "It is the experimenters’ free will that allows the free and independent choices of x, y, z, and w ."

It is actually, if anything, in favor of no distinction between humans and computers (more precisely, it is not dependent on humans, only on a "free chooser"): they argue that although the humans can be replaced by pseudo-random number generators, the generators still need to be chosen by something with "free choice", so as to escape objections from pedants that the PRNG's path was set at the beginning of time.

> The humans who choose x, y, z, and w may of course be replaced by a computer program containing a pseudo-random number generator.

> "However, as we remark in [1], free will would still be needed to choose the random number generator, since a determined determinist could maintain that this choice was fixed from the dawn of time."

There is nothing whatsoever in the paper that stops an AI from having whatever ability to choose freely humans have. The way you're using determinism is more akin to precision and reliability—the human brain has tolerances, but it too requires some amount of reliability to function correctly, even if not as much as computers do. In performing its tasks, though the brain is tolerant of noise and stochasticity, it still requires that those tasks happen in a very specific way. Besides, the paper is not an argument for randomness or stochasticity.

> "In the present state of knowledge, it is certainly beyond our capabilities to understand the connection between the free decisions of particles and humans, but the free will of neither of these is accounted for by mere randomness."


If an AI is an algorithm, then it will be unable to produce "answers" to what we observe. That is the relevance. One would need to exhibit a counterexample to the theorem in order to ignore it.

>There is nothing whatsoever in the paper that stops an AI from having whatever ability to choose freely humans have.

There is if an AI is dependent on deterministic methods. I agree that AI is not a well-defined term, but all proposals I have seen are algorithms, which are entirely deterministic. This is entirely at odds with the human conception of free choice. An algorithm will always produce the same choice given the same input. Any other behavior is an error.

The SFWT says that observations can be made that cannot be replicated through deterministic means, which would seem (I agree there is a very slight leap in logic here) to rule out any AI from ever being able to understand at least some aspects of our reality (and also reveals them to be simple, logical machines, with no choice).


Algorithms are not by definition deterministic, which seems to be one of your key points. Probabilistic algorithms exist. They may or may not be used in machine learning currently, but they do exist.
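
For instance, here is a minimal Monte Carlo sketch (nothing from the thread, just a textbook example):

    import random

    def estimate_pi(n_samples=1_000_000):
        # Throw darts at the unit square; the fraction landing inside the
        # quarter circle approximates pi/4.
        inside = 0
        for _ in range(n_samples):
            x, y = random.random(), random.random()
            if x * x + y * y <= 1.0:
                inside += 1
        return 4 * inside / n_samples

    print(estimate_pi())   # different runs give slightly different answers
    print(estimate_pi())

Whether random.random() counts as "truly" random (it is a PRNG unless seeded from a hardware source such as os.urandom) is, of course, exactly the dispute below.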


Can you provide an example? All probabilistic algorithms I have seen rely on a pseudo-random generator or on an external source of numbers. I have argued elsewhere in these comments that both cases may be considered deterministic.


Wow, your idea is heading in remarkably the same general direction as the sparse distributed memory described at https://en.wikipedia.org/wiki/Sparse_distributed_memory. Excellent!


Oh, surprising, I'd never heard of this before. I'm definitely going to dig into the concepts a bit more. Thanks for the link!


One can view RNNs as a sort of generalization of Markov chains. RNNs have the advantages of memory and context tracking, and they are not limited to learning patterns of some specific length. RNNs can apply these advantages to learn the subtleties of grammar, balanced parentheses, the proper use of punctuation and other things that a Markov chain might never learn (and certainly not memory-efficiently). For any given piece of text, RNNs can be said to have gotten closer to understanding what was consumed.
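
For contrast, this is the kind of bare-bones order-k character Markov chain being compared against (the corpus filename is just a stand-in):

    import random
    from collections import defaultdict

    def train(text, k=4):
        model = defaultdict(list)
        for i in range(len(text) - k):
            model[text[i:i + k]].append(text[i + k])   # context -> next chars seen
        return model

    def generate(model, k=4, length=300):
        context = random.choice(list(model))
        out = context
        for _ in range(length):
            out += random.choice(model.get(context, [' ']))
            context = out[-k:]                          # only the last k chars matter
        return out

    print(generate(train(open('corpus.txt').read())))   # 'corpus.txt' is hypothetical

It never looks back more than k characters, which is exactly the limitation the RNN pays so much to remove.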

The other question is: are those difficult-to-learn things truly worth the cost of training and running an RNN? If a fast and simple Markov chain serves, as is likely the case in practical settings, then it is better to go with the Markov chain. The RNN will still make obvious mistakes, all while correctly using subtle rules that trouble even humans. Unfortunately, this combination is exactly the kind of thing that will leave observers less than impressed: "Yes, I know it rambles insensibly, but look, it uses punctuation far better than your average forum dweller!" Alas, anyone who has gone through the trouble of making a Gouraud-shaded triangle spin in Mode X and proudly shown it to their childhood friends can explain just what sort of reaction to expect.

Eh, so, the moral here is pay attention to cost effectiveness and don't make things any more complicated than they need to be.

Yoav Goldberg treats much the same thing as this blog post but with far more detail and attention to subtlety here: http://nbviewer.jupyter.org/gist/yoavg/d76121dfde2618422139


Viewing RNNs as a generalisation of Markov chains is a bit confusing, because what you're calling a Markov chain isn't really a Markov chain in its most general form.

The one characteristic a Markov chain must have is that the transition probabilities are completely determined by its current state. This property is true for both RNNs and what you call Markov chains. The main difference is that the state space for RNNs is a lot bigger and better at describing the current context (it was designed to be).


Formally, there is no limit to the number of states in a Markov chain.

So in this sense, an RNN is actually a kind of hidden Markov chain - one with more structure added to it. The structure might make an RNN better than an HMM, but it doesn't make it more general; it makes it more specific.


It's not a report, it's an almost 400-page book. He doesn't compare arbitrary energy figures; instead he looks at the daily power budget each energy source could provide per person under reasonably generous conditions. The very generous 20 kWh/d of power per person is the key point. 6 m/s is already high, with few places reaching such speeds consistently, and almost nowhere reaches double that, so you can take 17 W/m^2 as an optimistic figure for wind. http://web.stanford.edu/group/efmh/winds/global_winds.html

In chapter 25, he acknowledges that the cost of photovoltaics will fall, but he does not see it falling on a timeline useful for getting everything deployed by a ~2050 deadline. Economically speaking, carpeting deserts with concentrating collectors will be the cheaper of the solar options. The book is careful about doing all the math, citing all its sources and carefully explaining the scenarios it models. It is a very good book[+].

But cost is not the only issue—even as prices fall, there is still the problem of land area. Efficiencies aren't going to pass 30% (without going to much more expensive materials), and for mass production we can halve that; as cheap as panels may someday become, places with high population densities (on top of seasonal variation and not being near the equator) are going to have trouble meeting their needs. Especially if they don't want to give up their curling irons, hair and clothes dryers, toasters, and electric stoves and kettles. But panels and turbines aren't the whole picture.

Already today, panels make up only a fraction of the cost of solar. You ideally want an MPPT charge controller. You might need voltage regulators, and you'll need a rack for the panels and batteries, an appropriately sized inverter, wiring and installation. As for batteries—to save more money long term, you want to oversize them so you rarely hit a low depth of discharge. But more batteries mean more panels. You also want enough batteries to wait out ~4 days of low light (speaking from experience, on cloudy days you can go the entire day at ~13% of typical amp output). Even those at the equator will only get ~6 good hours of sunlight (~8 hours for an appreciable amount), so even in the best case, 12 hours of storage per person is not going to cut it. Solar is great but it's no panacea. And the math doesn't work out for chemical energy storage. Molten salt storage and compressed air look more logical at the grid level, but even they won't be sufficient.
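
To make the sizing arithmetic concrete, here is a rough back-of-envelope sketch; all the numbers are illustrative assumptions, not figures from the book:

    # Hypothetical off-grid household averaging 1 kW of draw.
    daily_load_kwh = 1.0 * 24                 # 24 kWh/day

    days_of_autonomy = 4                      # ride out ~4 days of low light
    max_depth_of_discharge = 0.5              # oversize so you rarely dip below 50%
    battery_bank_kwh = daily_load_kwh * days_of_autonomy / max_depth_of_discharge
    print(f"battery bank: {battery_bank_kwh:.0f} kWh")      # 192 kWh

    good_sun_hours = 6                        # ~6 good hours/day near the equator
    system_losses = 0.75                      # wiring, charge controller, inverter, dirt
    panel_kw = daily_load_kwh / (good_sun_hours * system_losses)
    print(f"panel array: {panel_kw:.1f} kW peak")            # ~5.3 kW

The battery bank, not the panels, is what dominates once you insist on riding through bad weather.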

That said, Mr Thiel is also incorrect to place nuclear in opposition to renewables. Renewables will be in addition to nuclear [-]. We should also be looking into more DC appliances, more HVDC (and working out circuit breakers for it), optimal manufacturing layouts such that 'waste output' can be redirected to where it is needed, more energy-efficient devices, energy-routing algorithms (and a global grid of superconducting HVDC while we're at it—seems far-fetched but still at a much higher technology readiness level than fusion), better city planning, climate control with geothermal heat pumps, more material reclamation and recycling, nuclear waste as fuel, carbon capture, extracting CO2 from the ocean for fuel, and a cultural move away from an over-consuming, disposable society.

[+] I am biased in that I already knew the author from one of the best free books on information theory and machine learning. Anyone interested in the link between learning, energy and thermodynamics should see this book as a starting point. http://www.inference.phy.cam.ac.uk/itprnn/book.pdf

[-] Ch. 24 of sewtha.pdf goes into numeric, data-backed detail on why most build-out, waste and cost arguments against nuclear are weak. Personally, I think we have at best a couple hundred more years in which we can all be justifiably irrationally paranoid about nuclear. We should have DNA repair down by then.


Ah, I was confused for a second—I'd thought Markov was a library, but you meant the Markov assumption—the topics are actually only loosely related and largely orthogonal. Your excellent-looking library deals with reinforcement learning agents that model environment/agent interactions as a (partially observable) Markov decision process, whereas the Alchemy library combines first-order logic with network representations of probability distributions (ones satisfying certain Markov properties) to perform inference.
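
If it helps, here is a tiny, entirely made-up MDP solved by value iteration, just to show what "modelling interactions as an MDP" means; the states, transitions and rewards are arbitrary:

    import numpy as np

    # P[a][s, t] = probability of moving from state s to t under action a.
    P = np.array([
        [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]],   # action 0
        [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],   # action 1
    ])
    R = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])          # R[s, a]
    gamma = 0.95

    V = np.zeros(3)
    for _ in range(500):                                # iterate to a fixed point
        Q = R + gamma * np.einsum('ast,t->sa', P, V)    # expected return per (s, a)
        V = Q.max(axis=1)

    print(V, Q.argmax(axis=1))    # state values and the greedy policy

In a POMDP the agent would additionally have to maintain a belief over which state it is in, since it never observes s directly.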

More pertinent to your post, Sutton's working on an updated RL book here: http://people.inf.elte.hu/lorincz/Files/RL_2006/SuttonBook.p...

If you have the time, Chapter 15 (pdf pg 273) of the above link is a fascinating read. In particular, TD-Gammon had already achieved impressive results using neural networks in the early 90s, reaching world-class levels in backgammon with zero specialized knowledge.


This is a medical case study, however, not a journalistic piece, and you two are being at least a little bit unfair, I think. Since seizures and sudoku are not a common combination, upon seeing the title I assumed it was something conditional. This sort of Crucifix Glitch—ahem, environmental epilepsy—is also very uncommon and usually genetic, which makes this all the more interesting.

Here, it seems inhibitory circuits in a section of the right parietal lobe were damaged; without damping, as with any feedback system, things quickly go out of whack. What's interesting is that in this patient, the only activity that seems to generate a pattern resulting in such over-excitation is playing sudoku. But surely that's not the only visuospatial task he partakes in, so why? All we're left with is: "Our patient stopped solving sudoku puzzles and has been seizure free for more than 5 years".


I agree that this wasn't done by the computer (did computers uncover the Higgs boson?), but I also do not believe humans can take most of the credit: this was the result of a man-machine system team-up—trying to disentangle credit assignment is not a worthwhile activity. Roughly, and from a quick reading of a paper thickly frosted with jargon I am unfamiliar with, the method works by searching for stable clusters in a reduced-dimensionality space of the variables and turning them into networks—which highlight key relationships—for visualization.

Humans are there to explore the visualizations, interpret the network structures and make sense of the clusters and variables. The machines are intelligent too; they do the heavy work of comparing large numbers of points in a high-dimensional space, factorizing, and searching for a way to express the data that makes it easier to uncover promising research directions and hypotheses.
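
Not the paper's actual pipeline, but the general "reduce dimensions, find stable clusters, link them into a network for humans to explore" idea looks roughly like this (fake data, off-the-shelf tools):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 40))            # 500 samples x 40 variables (made up)

    Z = PCA(n_components=2).fit_transform(X)  # project to a low-dimensional space
    labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(Z)

    # Link clusters whose centroids sit within an (arbitrary) distance; the
    # resulting graph is the object a human analyst would explore and interpret.
    centroids = np.array([Z[labels == k].mean(axis=0) for k in range(8)])
    edges = [(i, j) for i in range(8) for j in range(i + 1, 8)
             if np.linalg.norm(centroids[i] - centroids[j]) < 2.0]
    print(edges)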

Scanning this, it seems the most valuable contributions are their network visualization and exploratory tools. I think they should be proud of those and see no need to stretch so mightily to connect this to stronger AI. As Vinge notes, "I am suggesting that we recognize that in network and interface research there is something as profound (and potentially wild) as Artificial Intelligence."

http://www.nature.com/ncomms/2015/151014/ncomms9581/full/nco...


>I agree that this wasn't done by the computer (did computers uncover the Higgs boson?), but I also do not believe humans can take most of the credit: this was the result of a man-machine system team-up

You realize that they're using software made by a team of mathematicians and software developers, right? If you want to give credit to the software, give credit to the people who wrote the code and discovered the mathematics. This isn't any different than how physicists would use Mathematica.


All the things you mentioned (plus e.g. Bayesian networks and restricted Boltzmann machines) are examples of graphical models. You can roughly think of (linear chain) CRFs as being to HMMs what logistic regression is to naive Bayes. HMMs and naive Bayes learn a joint probability distribution over the data, while logistic regression and CRFs fit conditional probabilities.

If none of that makes sense then, basically: in general and with more data, the CRF (or any discriminative classifier) will tend to make better predictions, because it doesn't try to directly model complicated things that don't really matter for prediction anyway. Because of this, it can use richer features without having to worry about how such-and-such relates to this or that. All of this ends up making discriminative classifiers more robust when model assumptions are violated, because they don't sacrifice as much to remain tractable (or rather, the trade-off they make tends not to matter as much when prediction accuracy is your main concern).
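
The joint-vs-conditional contrast is easy to see with the simpler pair from the analogy, naive Bayes vs logistic regression (toy, synthetic data; nothing here involves CRFs directly):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Generative: models p(x, y), then predicts via Bayes' rule.
    nb = GaussianNB().fit(X_tr, y_tr)
    # Discriminative: models p(y | x) directly.
    lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    print("naive Bayes accuracy:        ", nb.score(X_te, y_te))
    print("logistic regression accuracy:", lr.score(X_te, y_te))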

So in short, you use an HMM instead of a Markov chain when the sequence you're trying to predict is not visible. Say you want to predict parts of speech but only have access to words: you'll use the relationship between the visible sequence of words and the hidden sequence of part-of-speech labels. You use CRFs instead of HMMs because they tend to make better predictors while remaining tractable. The downside is that discriminative classifiers will not necessarily learn the most meaningful decision boundaries; this starts to matter when you want to move beyond just prediction.


Time reversibility exists in quantum mechanics because observables are self-adjoint operators and closed systems evolve unitarily. In simpler terms, you can think of it as the requirement that the maps preserve distances and are easily invertible. We need this so that the information describing a system (which we can still talk about in terms of traces) remains invariant with time. In the classical sense, the corresponding violation leads to probabilities not summing to 1! We clearly can't have information shrink, and for pure systems, dropping distance-preserving maps leads to a really awesome universe (I believe this also ends up highly recommending L2). We literally go from a universe that is almost certainly near the bottom end of the Slow Zone to the Upper Beyond (https://en.wikipedia.org/wiki/A_Fire_Upon_the_Deep#Setting). We gain non-locality, causality violations and powerful computational abilities.
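
A tiny numerical illustration of the "probabilities stop summing to 1" point (just numpy, nothing specific to the article):

    import numpy as np

    psi = np.array([1.0, 1.0j]) / np.sqrt(2)          # a normalized qubit state

    theta = 0.3
    U = np.array([[np.cos(theta), -np.sin(theta)],    # a unitary (rotation) map
                  [np.sin(theta),  np.cos(theta)]])
    M = np.array([[0.9, 0.0],                         # an arbitrary non-unitary map
                  [0.0, 0.5]])

    print(np.linalg.norm(psi) ** 2)       # 1.0
    print(np.linalg.norm(U @ psi) ** 2)   # still 1.0: distances (information) preserved
    print(np.linalg.norm(M @ psi) ** 2)   # ~0.53: total probability has "shrunk"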

In practice, our confusion about a system does increase with time as it becomes ever more correlated with its surroundings, losing distinguishability, aka decoherence.

