Very interesting read--the immediate topic and the intertwined mental health discussion.
We generally approach science research as separate from the people making it (except for questions of reputation). There is a Scientific Method and if one follows it, Science results.
Of course, hundreds of small decisions are made by scientists all the time, each potentially introducing a bias (I find the catalog of cognitive biases alone impressive). Even the decision to pursue a particular hypothesis betrays a bias (albeit unavoidable if we want to rationally make use of past learning and accomplish anything).
But I hadn't considered mental health as a key factor--such things are quite rarely discussed (nor do they have a standard reporting mechanism in scientific papers). Apart from the general stigma around mental health, I suspect it's also because it undermines the image of the scientist as a dispassionate seeker of truth. How much more interesting, nuanced and complex would our results be if that changed?
I'd be interested to see the results of applying the test repeatedly at n-week intervals throughout the course.
One interpretation of the original finding that everyone who passes the test passes the course, but not everyone who fails the test fails the course, is that some people who did not have a consistent mental model before the course develop one during it (and nobody who has developed such a model loses it). That should show up in a longitudinal test: you would expect that as people develop a model, they switch from the failing to passing groups.
This would be consistent with the sequential learning model. If you did a basic class, and didn't have a model by the end of it, then when you do a more advanced class, you have even less of a chance of learning anything from it, and less of a chance of developing a model during it.
I wonder if a way of structuring a course would be to start with a sort of mental model bootcamp, where the teaching was aimed specifically at developing a model, and where nobody would progress to the rest of the course without doing so. That way, students who take longer to get ready are not confronted with later stages of the course that they will not be able to make use of, and have a progressively greater share of the teaching resources in the first stage to help them do so.
> [...] asked some statistician colleagues if they could help us recover more information from his data.
It's a shame more organisations don't have access to statistician helpers to ensure that they are being accurate and honest when seeking, interpreting, and presenting data. Perhaps this is another consequence of the dominance of Excel: people have collections of numbers, and you can pummel them into a spreadsheet and produce some nice charts and graphs, but that encourages over-interpreting the data.
> After a lot of work, the answers were, by and large, that we couldn’t see any such differences in our data.
This is surprising to me. I remember reading the blogs around the time and it seemed like a sensible claim. I can't remember anyone digging into the data and pointing out flaws. Did they?
I think I believed it because I feel "unteachable".
EDIT: I freaking love this paper because of its discussion of a mistake made during a phase of mental ill health, and the recovery journey afterwards.
> This is surprising to me. I remember reading the blogs around the time and it seemed like a sensible claim. I can't remember anyone digging into the data and pointing out flaws. Did they?
I'm not sure that was possible. I haven't re-read the original 2006 paper, but it sounds like the claims in the 2006 paper may simply have been false:
> I did a number of very silly things whilst on the SSRI and some more in the immediate aftermath, amongst them writing “The camel has two humps”. I’m fairly sure that I believed, at the time, that there were people who couldn’t learn to program and that Dehnadi had proved it. Perhaps I wanted to believe it because it would explain why I’d so often failed to teach them. The paper doesn’t exactly make that claim, but it comes pretty close. It was an absurd claim because I didn’t have the extraordinary evidence needed to support it. I no longer believe it’s true. I also claimed, in an email to PPIG, that Dehnadi had discovered a “100% accurate” aptitude test (that claim is quoted in (Caspersen et al., 2007)). It’s notable evidence of my level of derangement: it was a palpably false claim, as Dehnadi’s data at the time showed.
> 1.5.1.2 For people with moderate or severe depression, provide a combination of antidepressant medication and a high-intensity psychological intervention (CBT or IPT).
So, for moderate or severe depression, the standard initial treatment is an SSRI and therapy.
For less severe depression, though, the guidance is to start with non-pharmaceutical options, and only move to drugs if those don't work.
I'm not saying that 2014 treatment is magical. But there are some important differences:
There's now a recognition of "subthreshold depressive symptoms" - which are troubling and unpleasant but which either would have been missed in the past or would have been treated solely with medication.
Other stuff is much more important now. "A wide range of biological, psychological and social factors, which are not captured well by current diagnostic systems, have a significant impact on the course of depression and the response to treatment."
We're using DSM-IV, not ICD-10, which "also makes it less likely that a diagnosis of depression will be based solely on symptom counting".
To get the therapy in the UK the person would self-refer to an IAPT (Improving Access to Psychological Therapies) style course. That would carry some kind of assessment of need, and so the person would have another check (the first being the GP) on whether they need specialist secondary care.
The important stuff here for the OP is much more concentration on therapy not just medication; and much more concentration on how the person is coping with life not just counting symptoms. Of course, some places do this much better than others.
That's the list of models in various combinations.
C0 means C[0]. If you are consistent with a single model in your answers on more than 80% of the test then you are C[0]. If, say, you split between model 1 and 2 and in total they were 80% of your answers, then you'd be C[1]. If you were split between 2 and 3 then you'd be C[2], since that's the first time those are grouped together.
The test doesn't alert() you to what model you are. You have to set a breakpoint and poke around the code. When I took it I was consistent across all 12 questions with model 2.
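For what it's worth, here is roughly how I read that rule, as a sketch in JavaScript. This is not the test's own code, and the model numbers and pairings below are placeholders for the actual list of combinations:

function consistencyGroup(answers) {
  // answers: for each of the 12 questions, the model its answer matched, e.g. [2, 2, 2, ...]
  var threshold = 0.8 * answers.length;
  var counts = {};
  answers.forEach(function (m) { counts[m] = (counts[m] || 0) + 1; });

  // C[0]: more than 80% of the answers agree with one single model.
  var best = Object.keys(counts).reduce(function (a, b) {
    return counts[a] >= counts[b] ? a : b;
  });
  if (counts[best] > threshold) return 'C0';

  // Otherwise walk the list of model combinations in order and report the
  // first one that covers the threshold (the pairs here are placeholders).
  var combinations = [[1, 2], [2, 3] /* , ... */];
  for (var i = 0; i < combinations.length; i++) {
    var covered = combinations[i].reduce(function (sum, m) {
      return sum + (counts[m] || 0);
    }, 0);
    if (covered >= threshold) return 'C' + (i + 1);
  }
  return null;  // too inconsistent to place in any group
}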
It's kind of weird that one of the most logical answers to the first question is not in the answer sheets. I bet that if it were there, 100% of non-programmers would tick it, and they wouldn't be wrong.
The question:
a = 10;
b = 20;
a = b;
The logical answer of course being that the computer should throw an error or return false, because a does not equal b. 14 years of schooling should have hammered that in quite thoroughly.
If you really want to test non-programmers' native skill for working with computers, you should at least briefly explain how the computer will read these statements, i.e. that the computer interprets the statements sequentially and reads the '=' symbol as 'becomes', not as 'equals'.
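For what it's worth, here is what that sequential reading gives in plain JavaScript (the console.log is only there to show the final values):

var a = 10;  // a becomes 10
var b = 20;  // b becomes 20
a = b;       // a becomes the current value of b
console.log(a, b);  // prints "20 20": no error, and nothing evaluates to false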
Responding with "false" as the result seems logically incoherent, as it assumes that the first two lines of the three-line program are "true" when there is no reason to assume that is the case.
Without any previous understanding of what computer programming is, what it does, or how it works, and relying solely on elementary mathematics, is there a particular reason that one would assume that the first two lines are directives and the third line is what we are being asked to validate? I am too far down the rabbit hole to intuitively know if that is the case; can someone else suggest whether this is a plausible conjecture?
It's been maybe 15 or so years, so I'm similarly pretty far down the rabbit hole, but I definitely remember having a lot of trouble with:
x = x + 1
At the time, it seemed patently obvious that it was a false statement, because there is no single value of x for which this is true.
If the situations are analogous, my guess would be that you would assume that each of these statements is an assertion, and that at least one of them must be false. Intuitively, I'd guess that it's the last one that people would assume would be false, because as you're reading from top to bottom, you've already "accepted" the first two.
When teaching JavaScript to kids around 10 years old, I tend to use x += 1 over x = x + 1.
It seems to be much easier for people to attach the new construct += to a new idea. I don't bring in the x = x + 1 form until they have had plenty of practice assigning other expressions to x. Kids don't seem to have any problem with x = y + 1; they just need a little time for that idea to set properly before they start mixing things up.
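For reference, the progression I mean, in plain JavaScript:

var y = 5;
var x = y + 1;  // 6: assigning an expression to x, no self-reference yet
x += 1;         // 7: "add 1 to x" as its own construct
x = x + 1;      // 8: same effect, but now '=' visibly means "becomes"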
If you take it to mean "The cardinality of an infinite set", "X + 1" to mean "The set X with one more element added to it", and "X = Y" to mean "X and Y have the same cardinality", then "X = X + 1" is entirely true.
Mathematics, like programming, is ultimately founded on definitions.
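To spell out one instance of that reading, with X = |\mathbb{N}|:

X + 1 = |\mathbb{N} \cup \{\star\}| = \aleph_0 = |\mathbb{N}| = X

witnessed by the bijection that sends \star to 0 and each n to n + 1.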
> is there a particular reason that one would assume that the first two lines are directives and the third line is what we are being asked to validate?
A much less narrow assumption is required to reach answers of "false". Think of each example as a system of simultaneous equations.
Since this test was seemingly designed relying on the idea of destructive updates, none of the given examples have satisfying assignments. But of course standout easy-pick answers of "no solution" would ruin the test. I'd really like to see a similar study of psychologies that took into account different programming bases. Perhaps such a test would even be a good way to sort students into separate intro classes that used a language suited to their preexisting mental model.
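A minimal sketch of that reading in JavaScript (a toy check I made up, not anything from the actual test):

var constraints = [
  function (env) { return env.a === 10; },   // a = 10
  function (env) { return env.b === 20; },   // b = 20
  function (env) { return env.a === env.b; } // a = b
];
function satisfies(env) {
  return constraints.every(function (c) { return c(env); });
}
// The first two constraints force a = 10 and b = 20, while the third needs
// a = b, so no assignment of values satisfies all three at once:
console.log(satisfies({ a: 10, b: 20 })); // false
console.log(satisfies({ a: 20, b: 20 })); // false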
Several older languages, including Pascal and the Algol family, use the := operator for assignment, on the grounds that assignment is a fundamentally asymmetric operation. In the ML family of languages, there are immutable definitions and mutable reference cells, with different operators for each:
(* multiple bindings, so the inner x shadows the
* outer x---indeed, this code would give you a
* warning about the unused outer x *)
let x = 5 in
let x = 6 in x
(* y is a mutable reference cell being modified *)
let y = ref 5 in
y := 6; !y  (* dereference y to get its current value, 6 *)
Haskell makes a distinction between name-binding and name-binding-with-possible-side-effects, but still reserves = to signify an immutable definition rather than assignment:
-- this defines x to be one more than itself---which
-- causes an infinite loop of additions when x is used
let x = x + 1 in x
-- a different assignment operator is used in a context
-- in which side-effects might occur, and can be
-- interspersed with non-side-effectful bindings:
do { x <- readLn :: IO Int  -- does IO (reads an Int from stdin)
   ; let y = 1              -- doesn't do IO
   ; return (x + y)
   }
> "The logical answer of course being that the computer should throw an error or return false"
But that's not the question that was asked. The question is "what are the final values of a and b?". Someone adopting your interpretation would say "a=10, b=20". Otherwise, why else would you be claiming "a does not equal b"?
> "If you really want to test non-programmers native skill for working with computer, you should at least briefly explain how the computer will read this statements. i.e. the computer interprets the statements sequentially, and reads the '=' symbol as 'becomes', not as 'equals'."
That would somewhat defeat the point of the test, which is to gauge what mental models (if any) people have before they've been told anything about programming.
That article didn't seem very coherent. E.g. on the original test he says:
"His test is not a very good predictor: most of those who
appear to use a model in the pre-course test pass the end-of-course exam but so do many of those who do
not."
How many pass, and how many do not? Those are the numbers needed to evaluate a predictive test.
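To make that concrete with numbers invented purely for illustration (they are not from the paper or the data):

// Suppose 30 students "pass" the pre-test and 30 "fail" it.
var passTestPassCourse = 30;  // everyone who passes the test passes the course
var passTestFailCourse = 0;
var failTestPassCourse = 18;  // but many who fail the test pass the course too
var failTestFailCourse = 12;

// How often a test "pass" predicts a course pass:
var ppv = passTestPassCourse / (passTestPassCourse + passTestFailCourse);  // 1.0
// How often students pass the course regardless of the test:
var baseRate = (passTestPassCourse + failTestPassCourse) / 60;             // 0.8
// Without both counts you can't tell how much the test adds over the base rate.
console.log(ppv, baseRate);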
Reading between the lines, I would guess that he got in trouble because his original article wasn't politically correct. One line in the retraction reads "We hadn’t shown that nature trumps nurture. Just a phenomenon and a prediction." I would guess he came under political pressure, and felt pressured to write this article in order to avoid further problems.
... comes the following quote from the author of the camel-humps paper:
"I presented our latest results about 18 months ago at a PPIG workshop/conference in the UK. I felt it was helpful, since the claims I made had provoked hostility to the work, to retract those claims verbally. It had a dramatic effect, to the good. But I found (how I know is the confidential bit) that there are people who didn’t hear that retraction, and who are still hostile; and that hostility is doing harm. So I decided to retract more publicly.
"Interestingly, one person who I would have counted previously as hostile heard (indirectly) of the verbal retraction, and this summer was more than supportive. Research inspired by our work is going forward. So the retraction was worthwhile."
So, confirmed: there was an angry backlash, and this retraction is an attempt to calm people down.
You're right that this article is a bit hard to read in places. But as for your "reading between the lines", give me some reason to believe you're not just projecting your pre-existing assumptions onto this article. What did he actually say that let you read between the lines?
He has no real criticism of his original article. What he says about it is incoherent, not merely hard to read, as I point out in my comment.
And he makes reference to nature vs nurture, which is completely irrelevant to the actual issue, and only makes sense in the context of a political apology.
Where does he say that specifically? He refers to statisticians helping to do further analysis on the data, but I don't see where the statisticians refuted the main claim (that the test is predictive of success in the course).