Windows 8 (Metro) used semantic zoom. It's been a while, but I do remember that one of the apps that used it very nicely was Photos. A search for "windows metro semantic zoom" comes up with lots of articles about semantic-zoom-aware GridView controls etc.
Why isn't it commonplace? I think that touchscreen laptops are still too much a minority, and keyboard + mouse + monitor are too entrenched, for anyone to seriously attempt it again for a while. (A shame -- I'm one of the few who really liked the Windows 8 Metro interface.) I think that phones are too small for it to really work well. I don't know why it's not more popular on tablets.
That xkcd comic highlights the problem with observational (as opposed to controlled) studies. TFA is about A/B testing, i.e. controlled studies. It’s the fact that you (the investigator) are controlling the treatment assignment that allows you to draw causal conclusions. What you happen to believe about the mechanism of action doesn’t matter, at least as far as the outcome of this particular experiment is concerned. Of course, your conjectured mechanism of action is likely to matter for what you decide to investigate next.
Also, frequentism / Bayesianism is orthogonal to causal / correlational interpretations.
I think what kevinwang is getting at is that if you A/B test a static version A against enough versions of B (or repeat the same test often enough), at some point you will get statistically significant results purely by chance.
Having a control doesn't mean you can't fall victim to this.
AB tests are still vulnerable to p-hacking-esque things (though usually unintentional). Run enough of them and your p value is gonna come up by chance sometimes.
Observational studies are particularly prone because you can slice and dice the world into near-infinite combinations of observations, but people often do this with A/B tests too: the shotgun approach, testing a bunch of variants until something works. But if you'd run each of those tests at a different significance level, or for twice as long, or half as long, you could very well see the "working" one fail and a "failing" one work.
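To make that concrete, here's a small pure-Python simulation (my own illustration, not from TFA): run 100 "A/B" tests where B is literally identical to A, so every statistically significant result is a false positive.

```python
import math
import random

random.seed(0)

def two_prop_pvalue(a_conv, a_n, b_conv, b_n):
    # Two-sided z-test for a difference in conversion rates.
    p_pool = (a_conv + b_conv) / (a_n + b_n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / a_n + 1 / b_n))
    if se == 0:
        return 1.0
    z = (b_conv / b_n - a_conv / a_n) / se
    # Standard normal tail probability via erf.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# 100 A/A tests: both arms have the same true 5% conversion rate,
# so any p < 0.05 is a win by pure chance.
false_positives = 0
for _ in range(100):
    a = sum(random.random() < 0.05 for _ in range(5000))
    b = sum(random.random() < 0.05 for _ in range(5000))
    if two_prop_pvalue(a, 5000, b, 5000) < 0.05:
        false_positives += 1

print(false_positives)  # on average about 5 of the 100 come up "significant"
```

Run enough experiments at the 5% level and roughly 1 in 20 null effects will look like a winner, which is exactly the shotgun-approach failure mode.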
I'm trying something similar with an introductory Algorithms class.
After we go through Breadth First Search, there's a practical assignment where students are asked to modify the algorithm to return _all_ shortest paths. Then I ask ChatGPT for its solution, and students try to spot its mistakes.
Later, after going through the proof of correctness of Dijkstra's algorithm, I ask ChatGPT for a proof of correctness of its all-shortest-paths algorithm, and again students try to spot what's wrong in the proof. I want students to learn to tell the difference between a bullshit proof and a real proof; in the past I've given them bullshit proofs from real students in exams, but ChatGPT makes the point more nicely.
Finally, students are asked to figure out prompts that will make ChatGPT give a correct algorithm and proof. I haven't managed this myself! I'm looking forward to seeing what students manage.
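For what it's worth, here's one way the modification can be sketched (a sketch, not the only solution): record every shortest-path predecessor during the BFS instead of just one, then backtrack through the resulting predecessor DAG.

```python
from collections import deque

def all_shortest_paths(adj, s, t):
    # BFS that keeps ALL shortest-path predecessors, not just one.
    # adj: dict mapping node -> iterable of neighbours (unweighted graph).
    dist = {s: 0}
    parents = {s: []}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in dist:               # first (hence shortest) visit
                dist[v] = dist[u] + 1
                parents[v] = [u]
                q.append(v)
            elif dist[v] == dist[u] + 1:    # another equally short route
                parents[v].append(u)
    if t not in dist:
        return []
    # Walk the predecessor DAG backwards from t to enumerate every path.
    # (Beware: there can be exponentially many shortest paths.)
    paths = []
    def backtrack(v, suffix):
        if v == s:
            paths.append([s] + suffix)
        else:
            for p in parents[v]:
                backtrack(p, [v] + suffix)
    backtrack(t, [])
    return paths
```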
> Finally, students are asked to figure out prompts that will make ChatGPT give a correct algorithm and proof. I haven't managed this myself! I'm looking forward to seeing what students manage.
Isn't there a probabilistic nature to ChatGPT replies? So even if a student finds a response that gives a correct proof, that doesn't mean it'll work every time. Or am I wrong here?
You're right, ChatGPT is probabilistic. None of this is graded by the way -- it's all just for fun and bragging rights.
I've asked students to share their full dialog, both prompts and replies, so the whole class gets to see; and I'll invite one or two to talk through their attempts. This is all just a trick to make students engage with "how do you spot bugs in a proof?", hopefully more than they would from just reading CLRS! Often, students engage well when they're hearing the material from other students.
Aside from a temperature of 0, which always yields the same completion, the prompt details and translation examples (aka in-context learning / few-shot prompting) can force very reliable results -- say, 8/10 times -- meaning that a sample-and-vote scheme gives consistent results even when the temperature is non-zero.
Edit: I was not in any way rude nor saying anything incorrect.
If you want to see how to do what I’m talking about, here’s an almost finished article describing the above:
I tried... I pointed out a problem and asked ChatGPT to fix it, unsuccessfully. I asked it for a proof of correctness, then pointed out a problem in its proof and asked ChatGPT to fix it, again unsuccessfully. (It's all in the notes I linked to.) Perhaps I'm just crummy at prompt engineering; or perhaps this is one of those questions where the only way to engineer a successful prompt is to know the answer yourself beforehand.
I've also had this issue multiple times where ChatGPT provides a flawed answer, is able to identify the flaw when asked but "corrects" it in such a way that the original answer is not changed. I've tried this for code it wrote, for comments on my code and for summaries of texts that I provided.
I can’t tell if people just don’t understand how ChatGPT works or if there is another reason they are eager to dehumanize themselves and the rest of us along with them.
I am aware no learning is going on live during the discussion with ChatGPT, nor are the mechanisms that lead to the similar outcome even remotely similar.
I also don't think humans are less human just because machines started making mistakes similar to human ones.
But I do see this similarity as a reminder that machines are becoming more human in an accelerating way.
> in the past I've given them bullshit proofs from real students in exams
I'd be so honored to be one of these proofs. "Your wrongness is an elegant balance of instructive error and subtle misunderstanding. Can I save it for posterity? (c-)"
I’m in the same boat as you — sensitive teeth, but can’t not rinse. What works for me is to treat the toothpaste as a “post-rinse medicated rub”: squeeze out a bit more, and rub it over the most sensitive teeth.
> Please, please: learn some probability via measure theory. You’ll start reading machine learning papers wondering how people ever express themselves precisely without it. The entire field seems to be predicated around writing things like x ~ p_\theta(x|z=q_\phi(x)) as if that’s somehow meaningful notation.
Hear hear! How did ML get saddled with such awful notation?
How many papers with awful notation are actually the reverse engineering of some (barely) working code, cobbled together from random libraries and coefficients?
Probability estimates are not the same thing as uncertainty.
Consider tossing a coin. If I see 2 heads and 2 tails, I might report "the probability of heads is 50%". If you see 2000 heads and 2000 tails you'd also report the SAME probability estimate -- but you'd be more certain than me.
Neural networks give probability estimates. Bayesian methods (and also frequentist methods) give us probability estimates AND uncertainty.
The literature on neural network calibration seems to me to have missed this distinction.
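To make the coin example concrete, here's a sketch using a uniform Beta prior (my choice of prior, purely for illustration): both observers report the same point estimate, but with very different posterior spreads.

```python
import math

def beta_posterior(heads, tails):
    # Posterior over the heads-probability under a uniform Beta(1,1) prior.
    a, b = 1 + heads, 1 + tails
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

m1, sd1 = beta_posterior(2, 2)        # my 4 tosses
m2, sd2 = beta_posterior(2000, 2000)  # your 4000 tosses
print(m1, m2)    # both point estimates are 0.5
print(sd1, sd2)  # but my posterior is far wider (~0.19 vs ~0.008)
```

Same probability estimate, very different (epistemic) uncertainty -- which is exactly the distinction a bare probability output can't express.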
It's common for a network to output a distribution, so that the output is both the mean and the variance rather than just the mean, as you pointed out. For example, check out variational autoencoders.
In my example of predicting a coin toss, the naive output is a probability distribution: it's "Prob(heads)=0.5, Prob(tails)=0.5". This is the distribution that will be produced both by the person who sees 2 heads and 2 tails, and by the person who sees 2000 heads and 2000 tails.
Bayesians use the terms 'aleatoric' and 'epistemic' uncertainty. Aleatoric uncertainty is the part of uncertainty that says "I don't know the outcome, and I wouldn't know it even if I knew the exact model parameters", and epistemic uncertainty says "I don't even know the model".
Your example (outputting a mean and variance) is reporting a probability distribution, and it captures aleatoric uncertainty. When Bayesians talk about uncertainty or confidence, they're referring to model uncertainty -- how confident are you about the mean and the variance that you're reporting?
Right, the claim was that "Neural networks give probability estimates. Bayesian methods give us probability estimates AND uncertainty" which presents a false dichotomy. I think we agree.
Ah yes, got you. It is a false dichotomy because it neglects that there's such a thing as Bayesian neural networks. Also, taking ensembles of ordinary neural networks with random initializations approximates Bayesian inference in a sense, and this is relatively well known, I think.
Indeed, there are Bayesian neural networks and there are non-Bayesian neural networks, and I shouldn't have implied that all neural networks are non-Bayesian.
I'm just trying to point out that there is a dichotomy between the Bayesian and the non-Bayesian, and that the standard neural network models are non-Bayesian, and that we need Bayesianism (or something like it) to talk about (epistemic) uncertainty.
Standard neural networks are non-Bayesian, because they do not treat the neural network parameters as random variables. This includes most of the examples that have been mentioned in this thread: classifiers (which output a probability distribution over labels), networks that estimate mean and variance, and VAEs (which use Bayes's rule for the latent variable but not for the model parameters). These networks all deal with probability distributions, but that's not enough for us to call them Bayesian.
Bayesian neural networks are easy, in principle -- if we treat the edge weights of a neural network as having a distribution, then the entire neural network is Bayesian. And as you say these can be approximated, e.g. by using dropout at inference time [0], or by careful use of ensemble methods [1].
Quote: "Deep learning tools have gained tremendous attention in applied machine learning. However such tools for regression and classification do not capture model uncertainty."
Quote: "Ensembling NNs provides an easily implementable, scalable method for uncertainty quantification, however, it has been criticised for not being Bayesian."
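As a toy illustration of the dropout-at-inference-time idea from [0] (made-up weights, pure Python; a real version would keep dropout active in a trained network): run many stochastic forward passes and read the spread of the predictions as an uncertainty estimate.

```python
import math
import random

random.seed(1)

# A tiny fixed 2-layer network: y = w2 . relu(W1 x).
# The weights are arbitrary illustrative values, not trained.
W1 = [[0.9, -0.4], [0.2, 0.7], [-0.5, 0.3]]
w2 = [0.8, -0.6, 0.4]

def forward(x, p_drop=0.5):
    # Inverted dropout: keep each hidden unit with prob 1-p_drop,
    # rescaling so the expected activation is unchanged.
    h = []
    for row in W1:
        pre = sum(w * xi for w, xi in zip(row, x))
        act = max(0.0, pre)
        keep = random.random() >= p_drop
        h.append(act / (1 - p_drop) if keep else 0.0)
    return sum(w * hi for w, hi in zip(w2, h))

# Monte Carlo dropout: many stochastic forward passes on the same input.
samples = [forward([1.0, 2.0]) for _ in range(1000)]
mean = sum(samples) / len(samples)
sd = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
# `mean` is the prediction; `sd` is a (rough) epistemic uncertainty estimate.
```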
Yeah -- in my experience I haven't needed as many networks in the ensemble as I first assumed. This paper [1] suggests 5-10, but in practice I've often found that just 3 is sufficient.
There might be a TCP flow using the shared link, in which case fairness means that the MPTCP sender should only send the same total traffic as would a single TCP flow. Or there might be a TCP flow using link1, and a second TCP flow using link2, and the shared link might be uncongested, in which case MPTCP could reasonably behave like two TCP flows. Or maybe there's a single path TCP flow using (link1 -> shared), or one using (link1 -> link2), or ...
The MPTCP congestion controller was designed so that in all of these cases it's no greedier than if it were a single TCP flow. And it has to do this without knowing the network topology or what other flows are using the network.
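For the curious, here's a sketch of the coupled increase rule (the Linked Increases Algorithm from RFC 6356) in packet units -- my paraphrase of the RFC, not production code:

```python
def lia_alpha(subflows):
    # Aggressiveness parameter from RFC 6356 (Linked Increases Algorithm).
    # subflows: list of (cwnd, rtt) pairs, cwnd in packets, rtt in seconds.
    total = sum(c for c, _ in subflows)
    best = max(c / (r * r) for c, r in subflows)
    denom = sum(c / r for c, r in subflows) ** 2
    return total * best / denom

def lia_increase(subflows, i):
    # Per-ACK cwnd increase for subflow i, capped so the whole MPTCP
    # connection is no more aggressive than one TCP flow on its best path.
    total = sum(c for c, _ in subflows)
    cwnd_i = subflows[i][0]
    return min(lia_alpha(subflows) / total, 1.0 / cwnd_i)
```

With a single subflow this degenerates to the standard 1/cwnd Reno increase, which is the sanity check the design demands: add paths and the coupled increase shrinks, so the aggregate stays TCP-friendly on shared bottlenecks.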
Suppose the sender has a single interface, and it's the receiver that has multiple interfaces. With MPTCP, the sender will learn that there are multiple paths available, and it'll balance its load adaptively over those paths.
I don't know enough about bonded interfaces, but I don't see how they'd help in this scenario. This scenario is natural with mobile devices, which is probably why Apple uses MPTCP.