Running your intelligence study in the college's psychology building, paying participants in course credits?
Calling Raven's Progressive Matrices a measure of "reasoning about novel problems" when it's far from novel in psychology, and most of your test subjects are psychology students?
I bet computer science majors have more of an "unfair" advantage than psych majors. It was hard not to notice how many of the patterns match bit manipulation rules such as NOT and XOR.
By the way, I think they mean that Raven's Matrices test (somehow) for an ability to reason about problems that are new to the test subject, not that the test itself is new.
The spatial reasoning involved in reading code does not happen in the dimensions of the literal text, or at least not only in these. It happens in how we interpret the code and build relations in our minds while doing so. So I think the problem is not the spatial reasoning of what we literally see per se, but whether the specific representation actually helps with something.

I like visual representations for the explanatory value they can offer, but if one tries to work rigorously on a kind of spatial algebra of these, then this explanatory power can be lost after some point of complexity. I guess there may be contexts where a visual language works well, but in the contexts I have encountered I have not found them helpful. If anything, the more complex a problem is, the more cluttered the visual language form ends up being, and it feels like it overloads my visual memory. I do not think it is a geometric feature or advantage per se, but about how the brains of some people work. I like visual representations and I am in general a quite visual thinker, but I do not want to see all these minuscule details in there; I want them to represent what I want to understand. Text, on the other hand, serves better as a form of (human-related) compression of information, imo, which makes it better for working on those details.
> If anything, the more complex a problem is, the more cluttered the visual language form ends up being, and it feels like it overloads my visual memory
I feel like you are more concerned about the implementation than the idea itself. For me it's the opposite - I find it's easier to understand small pieces of text, but making sense of hundreds of 1k-line files is super hard.
Visual programming in my understanding should allow us to "zoom" in and out on any level and have a digestible overview of the system.
Here is an example of a visual-first platform that I know is used for large industrial systems; it allows viewing different flows separately and zooming into the details of any specific piece of logic. I think it's a good example of what visual programming can be: https://youtu.be/CTZeKQ1ypPI?si=DX3bQSiDLew5wvqF&t=953
I do not think what they say is that it is hard to visualise it, but that it does not offer much utility to do so. A "for" loop like that is not that complicated to understand, and visualising it externally does not offer much. The examples the article gives are about more abstract and general overviews of higher level aspects of a codebase or system. Or to explain some concept that may be less intuitive or complicated. In general less about trying to be formal and rigorous, and more about being explanatory and auxiliary to the code itself.
Thanks for putting in all this work and sharing it in such detail! Data extraction/structuring of data is the only serious application of LLMs I have actually engaged with in real work and found useful. I had to extract data from experience sampling reports which I could not share online, so chatgpt etc. was out of the question. There were sentences describing onsets and offsets of events and descriptions of what went on. I ran models through llama.cpp to turn these into csv format with 4 columns (onset, offset, description, plus one for whether a specific condition was met in that event or not, which had to be interpreted from the description). Giving some examples in the prompt of how I wanted it all structured was enough for many different models to do it right. Mixtral 8x7b was my favourite because it ran the fastest at that quality level on my laptop.
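Not my exact prompt, but schematically the few-shot approach looked something like this with llama-cpp-python (the model file name, the example report and the column format below are just placeholders):

    # rough sketch: few-shot prompt a local GGUF model to emit one CSV line per report
    from llama_cpp import Llama

    llm = Llama(model_path="mixtral-8x7b-instruct.Q4_K_M.gguf", n_ctx=4096)

    prompt = (
        "Convert each report into one CSV line: onset,offset,description,condition_met\n"
        "Report: Felt anxious from about 9:10 until 9:40 while commuting.\n"
        "CSV: 09:10,09:40,anxious while commuting,no\n"
        "Report: <new report text here>\n"
        "CSV:"
    )

    out = llm(prompt, max_tokens=64, stop=["\n"])
    print(out["choices"][0]["text"].strip())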
I am pretty sure that a finetuned smaller model would be better and faster for this task. It would be great to start finetuning and sharing such smaller models: they do not really have to be better than the commercial LLMs that run online, as long as they are at least not worse. They are already much faster and cheaper, which is a big advantage for this purpose. There is already a need for these tasks to run offline when one cannot share the data with openai and the like. Higher speed and lower cost also allow for more experimentation with more specific finetuning and prompts, with less care about the token lengths of prompts and cost. This is an application where smaller, locally run, finetunable models can shine.
> Data extraction/structuring data is the only serious application of LLMs
I fully agree. I realized this early on when experimenting with GPT-3 for web data extraction. After posting the first prototype on Reddit and HN, we started seeing a lot of demand for automating rule-based web scraping stacks (lots of maintenance, hard to scale). This eventually led to the creation of our startup (https://kadoa.com) focused on automating this "boring and hard" problem.
It's in such relatively unexciting use cases that AI adds the most value.
AI won't eliminate our jobs, but it will automate tedious, repetitive work such as web scraping, form filling, and data entry.
Well, I talked precisely about things I have engaged with professionally. Obviously this cannot cover everything one may do, eg I do not build chatbots for customer service or stuff like that, thus I obviously cannot speak for all possible applications of LLMs and how useful they may be. I am pretty sure there will be useful applications in fields I am not and will not be engaged in, as nobody engages with everything. However, some other things that I have tried (eg copilots, summarising scientific articles) imo create much more hype than real value. They can be a bit useful if you know what to actually use them for and what their limits are, but nowhere close to the hype they generate, and I find myself just googling again tbh. They are absolutely horrible especially with more niche subjects and areas. On the other hand, data extraction and structuring has quite universal application, has already demonstrated usefulness and potential, and seems a quite realistic, down-to-earth application that I am happy to see other people and startups working on. Not as fancy, and harder to build hype upon, but very useful regardless.
Thanks! Yes, one 'next step' that I'd like to take (probably around the work on deployment / inference that I'm turning to now) will be to see just how small I can get the model. Spacy have been pushing this kind of workflow (models on the order of tens of MB) for years and it's nice that there's a bit more attention on it. As you say, ideally I'd want lots of these tiny models that are super specialists at what they do, small in size and speedy at inference time. As I hinted towards the end of the post, however, keeping all of that updated starts to get unwieldy at a certain point if you don't set it all up in the right way.
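To give a rough idea of the kind of workflow I mean (en_core_web_sm here is just a stand-in for whatever tiny task-specific pipeline you'd actually train):

    # a tiny (~tens of MB) specialist model doing one narrow job quickly on CPU
    import spacy

    nlp = spacy.load("en_core_web_sm")  # stand-in for a small task-specific model
    doc = nlp("The meeting ran from 9:10 to 9:40 in Berlin.")
    print([(ent.text, ent.label_) for ent in doc.ents])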
I work in human developmental research and have never heard or read anybody make such claims that you consider "bog standard child development science", and some of what you say is definitely not supported by the current understanding of human development.
For example
> child language development milestones that are waymarked by age down to the month
is totally false. It is well known that developmental milestones are acquired by children at different times and even in different orders and sequences. This "down to the month" is pure nonsense for most of the milestones.
Young children are better served by being guided by their own curiosity, interest and exploration drives, which parents feed with variable inputs and build upon, rather than by anxious parents feeding them with whatever terabytes of exploitation-intended information they think is gonna "serve to maximize IQ".
Yes, reading to kids in certain ways (using numbers/spatial relationships/theory of mind stuff/interactively) has been found in some studies to correlate with certain outcomes but there is nothing to suggest a totally linear relationship such that talking to a kid 24/7 since the womb is gonna produce the next Einstein.
I'm a clinician who works in child development pathology, which is why I commented. You have zero idea what you are talking about, your unprofessional hyperbole and willful misconstrual of clear statements aside.
>"have never heard or read anybody make such claims that you consider "bog standard child development science"
Reconsider your profession.
>This "down to the month" is pure non-sense for most of the milestones
"Most", you say? Which is it? Are some milestones down to the month and some not? Or are all not down to the month?
"Down to the month" is shorthand for a normative developmental window. So "down to a limited window of months, which together comprise a large percentage of the child's age in months".
Is that better meaningful communication on a public board? Some would instead say "too long and unnecessary". Avoiding such communication is a core communication skill. Or perhaps you are too, say, "bothered" to be able to allow for meaningful communication devices in public. Get a grip.
Those limited, often overlapping, windows assist in defining service eligibility. Overlapping means that differently ordered milestones are possible. Your "different times" of milestone acquisition are confined to those same narrow normative windows. Per both the standardized clinical knowledge body and each individual State that adopts those standards. As an expert, you'd know that all normed measurements are subject to statistical deviations in development but sure as heck are also limited by the same. This is super basic. Also, it is generally unnecessary to explain normed scores, normed milestones, and basic statistical concepts on HN. But thanks for the variously pedantic and wrong corrections.
>Young children are better served by being guided by their own curiosity, interest and exploration drives, which parents feed with variable inputs and build upon,
"Variable inputs" is too vague as to be meaningless as an argument. "Building upon" generally refers to scaffolding, which is a single tool used in certain circumstances or when otherwise appropriate. Beyond that, who said that implied basic interaction / play techniques are excluded?
I'm speaking of total language exposure over years, from a host of sources.
"better served" is unprofessional and unclear language. If you mean to say that child curiosity and interest should replace talking and reading to them as much as possible, then you are dead wrong and should reconsider your line of work. However, the reality is that these are not mutually exclusive activities and so at the least we can conclude that your correction is unnecessary (and weird). Many if not most children crave more such interaction than they receive.
>rather than by anxious parents
Who said anything about anxiety? Speaking and reading to children is a natural activity. Some parents do it more than others. Some parents neglect it more than is healthy for a child's development. The constant being that more of it is better for development. If you are trying to argue that point, I'll move on from "find a new line of work" to "I don't believe you in your statement about your line of work".
>feeding them with whatever terabytes of exploitation-
"Terabytes of exploitation"? What? You don't work in child development research, at least not in any meaningful way.
>intended information they think is gonna "serve to maximize IQ".
It's not debatable that maximizing language input, from an early age, maximizes developmental potential. That doesn't mean, nor did I imply, that parents need to do anything but the best that they can in providing children with language stimulation. That means being at least superficially aware of and respecting the relevant developmental science, and trying to meet the best standard that they can in light of it. You seem to have an issue with reading extremes into things, which in turn leads to your use of truly bizarre language.
> I'm a clinician who works in child development pathology, which is why I commented.
sounds like you take research at its word instead of understanding the fundamental ideas and concepts that are being explored... classic white coat thinks the book is right and everyone else is wrong
I do not think that such a conversation is being done in a productive way (especially when phrases are cut in half to make them appear to make no sense), but I will try to get a couple of points across:
- I interpreted the comment in the context of the answer to a specific article/interview. In the linked content, for example, there is a video of a 2.5yo doing a "flashcard class". While I do not think there is anything inherently harmful in that, it is not the way 2.5yos learn about the world, and even if it is not harmful it is not necessary for 2.5yos to sit on a chair doing a class in order to learn about the world. Their curiosity and own exploration drive are enough to pull them into learning, and this is what I mean by what parents should feed: see what their kids are most curious about and interested in and feed them inputs in that direction. The comment you replied to was referring to this article, and I interpreted your answer in that context. If I misinterpreted anything, I can only see the context that is shared here, not what is in your mind.
- To reiterate and clarify the context a bit more: "Speaking and reading to children is a natural activity" is _not_ what OP was about. What OP was about is applying a specific strategy to kids at 2+ years, ie having them learn to read using a specific exploitation-based approach. If that is all you meant by your previous comment, then you may want to reread the comment you replied to from that perspective. Nobody here is saying "let the kids do what they want and do not care about interacting with them much/talking around them", as you seem to suggest. When I say parents should build upon kids' own curiosity and exploration drive, I mean seeing what sort of inputs their kids become more curious and interested in at a certain time and feeding them inputs like that. When a kid starts being interested in sounds and music, feed them with sounds and music and sound/music-related books and toys. There is no handbook that is gonna say exactly which month and day this should happen for a specific kid.
- I may indeed be missing a lot of knowledge, but I still find setting goals of "maximising language exposure" and "maximising IQ" weird and unclear. No, I have never read or heard of this way of approaching development and learning. Parents doing their best and being mindful of the importance of language exposure is different from "maximising" anything. Maximising with respect to which parameters? Even if we define this as an optimisation problem, any complex optimisation problem like this is a tradeoff between different parameters and outcomes. What happens to the other parameters and outcomes when you optimise on just one?
- If "you do not speak the jargon" is what you prefer to focus, just say that and any more discussion will not be needed.
> How many homework questions did your entire calc 1 class have? I'm guessing less than 100 and (hopefully) you successfully learned differential calculus.
Not just that: people learn mathematics mainly by _thinking over and solving problems_, not by memorising solutions to problems. During my mathematics education I had to practice solving a lot of problems dissimilar to what I had seen before. Even in the theory part, a lot of it was actually about filling in details in proofs and arguments, and reformulating challenging steps (in words or drawings). My notes on top of a mathematical textbook are much more than the text itself.
People think that knowledge lies in the texts themselves; it does not, it lies in what these texts relate to and the processes that they are part of, a lot of which are out in the real world and in our interactions. The original article is spot on that there is no AGI pathway in the current research direction. But there are huge incentives for ignoring this.
> Not just that: people learn mathematics mainly by _thinking over and solving problems_, not by memorising solutions to problems.
I think it's more accurate to say that they learn math by memorizing a sequence of steps that result in a correct solution, typically by following along with some examples. Hopefully they also remember why each step contributes to the answer as this aids recall and generalization.
The practice of solving problems that you describe is to ingrain/memorize those steps so you don't forget how to apply the procedure correctly. This is just standard training. Understanding the motivation of each step helps with that memorization, and also allows you to apply that step in novel problems.
> The original article is spot on that there is no AGI pathway in the current research direction.
I think you're wrong. The research on grokking shows that LLMs transition from memorization to generalized circuits for problem solving if trained enough, and parametric memory generalizes their operation to many more tasks.
They have now been able to achieve near perfect accuracy on comparison tasks, where GPT-4 is barely in the double digit success rate.
Composition tasks are still challenging, but parametric memory is a big step in the right direction for that too. Accurate comparative and compositional reasoning sound tantalizingly close to AGI.
> The practice of solving problems that you describe is to ingrain/memorize those steps so you don't forget how to apply the procedure correctly
Simply memorizing sequences of steps is not how mathematics learning works, otherwise we would not see so much variation in outcomes. Me and Terence Tao on the same exact math training data would not yield two mathematicians of similar skill.
While it's true that memorization of properties, structure, operations and of what should be applied when and where is involved, there is a much deeper component: knowing how these all relate to each other, grasping their fundamental meaning and structure. Some people seem to be wired to be better at thinking about and picking out these subtle mathematical relations using just the description or based on only a few examples (or to be able to at all, where everyone else struggles).
> I think you're wrong. The research on grokking shows that LLMs transition from memorization to generalized circuits
It's worth noting that for composition, key to abstract reasoning, LLMs failed to generalize to out of domain examples on simple synthetic data.
> The levels of generalization also vary across reasoning types: when faced with out-of-distribution examples, transformers fail to systematically generalize for composition but succeed for comparison.
> Simply memorizing sequences of steps is not how mathematics learning works, otherwise we would not see so much variation in outcomes
Everyone starts by memorizing how to do basic arithmetic on numbers, their multiplication tables and fractions. Only some then advance to understanding why those operations must work as they do.
> It's worth noting that for composition, key to abstract reasoning, LLMs failed to generalize to out of domain examples on simple synthetic data.
Yes, I acknowledged that when I said "Composition tasks are still challenging". Comparisons and composition are both key to abstract reasoning. Clearly parametric memory and grokking have shown a fairly dramatic improvement in comparative reasoning with only a small tweak.
There is no evidence to suggest that compositional reasoning would not also fall to yet another small tweak. Maybe it will require something more dramatic, but I wouldn't bet on it. This pattern of thinking humans are special does not have a good track record. Therefore, I find the original claim I was responding to ("there is no AGI pathway in the current research direction") completely unpersuasive.
I started by understanding. I could multiply by repeated addition (each addition counted one at a time with the aid of fingers) before I had the 10x10 addition table memorized. I learned university level calculus before I had more than half of the 10x10 multiplication table memorized, and even that was from daily use, not from deliberate memorization. There wasn't a day in my life where I could recite the full table.
Maybe schools teach by memorization, but my mom taught me by explaining what it means, and I highly recommend this approach (and am a proof by example that humans can learn this way).
> I started by understanding. I could multiply by repeated addition
How did you learn what the symbols for numbers mean and how addition works? Did you literally just see "1 + 3 = 4" one day and intuit the meaning of all of those symbols? Was it entirely obvious to you from the get-go that "addition" was the same as counting using your fingers which was also the same as counting apples which was also the same as these little squiggles on paper?
There's no escaping the fact that there's memorization happening at some level because that's the only way to establish a common language.
There's a difference between memorizing meanings of words (addition is same as counting this and then the other thing, "3" means three things) and memorizing methods (table of single digit addition/multiplication to do them faster in your head). You were arguing the second, I'm a counterexample. I agree about the first, everyone learns language by memorization (some rote, some by use), but language is not math.
> You were arguing the second, I'm a counterexample.
I still don't think you are. Since we agree that you memorized numbers and how they are sequential, and that counting is moving "up" in the sequence, addition as counting is still memorizing a procedure based on this, not just memorizing a name: to add any two numbers, count down on one as you count up on the other until the first number reaches zero, and the number that counted up is the sum. I'm curious how you think you learned addition without memorizing this procedure (or one equivalent to it).
Then you memorized the procedure for multiplication: given any two numbers, count down on one and add the other to itself until the counted down number reaches one. This is still a procedure that you memorized under the label "multiplication".
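To make that concrete, here is a toy spelling-out of my own of those two memorized procedures (the function names are mine, nothing you wrote):

    # addition as counting: move one number down while moving the other up
    def add(a, b):
        while a > 0:
            a -= 1   # count down on one...
            b += 1   # ...as you count up on the other
        return b

    # multiplication as repeated addition: add b to a running total, a times
    def multiply(a, b):
        total = 0
        while a > 0:
            total = add(total, b)
            a -= 1
        return total

    print(add(2, 3))       # 5
    print(multiply(4, 3))  # 12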
This is exactly the kind of procedure that I initially described. Someone taught you a correct procedure for achieving some goal and gave you a name for it, and "learning math" consists of memorizing such correct procedures (valid moves in the game of math if you will). These moves get progressively more sophisticated as the math gets more advanced, but it's the same basic process.
They "make sense" to you, and you call it "understanding", because they are built on a deep foundation that ultimately grounds out in counting, but it's still memorizing procedures up and down the stack. You're just memorizing the "minimum" needed to reproduce everything else, and compression
is understanding [1].
The "variation in outcomes" that an OP discussed is simply because many valid moves are possible in any given situation, just like in chess, and if you "understand" when a move is valid vs. not (eg. you remember it), then you have an advantage over someone who just memorized specific shortcuts, which I suspect is what you are thinking I mean by memorization.
I think you are confusing "memory" with strategies based on memorisation. Yes, memorising (ie putting things into memory) is always involved in learning in some way, but that is too general and not what is being discussed here. "Compression is understanding" possibly to some extent, but understanding is not just compression; that would be a reduction of what understanding really is, as it involves a certain range of processes and contexts in which the understanding is actually enacted rather than purely "memorised" or applied, and that is fundamentally relational. It is so relational that it can even go deep down into how motor skills are acquired or spatial relationships understood. It is no surprise that tasks like mental rotation correlate well with mathematical skills.
Current research in early mathematical education now focuses on teaching certain spatial skills to very young kids rather than (just) numbers. Mathematics is about understanding of relationships, and that is not a detached kind of understanding that we can make into an algorithm, but deeply invested and relational between the "subject" and the "object" of understanding. Taking the subject and all the relations with the world out of the context of learning processes is absurd, because that is in the exact centre of them.
I did memorize names of numbers, but that is not essential in any way to doing or understanding math, and I can remember a time where I understood addition but did not fully understand how names of numbers work (I remember, when I was six, playing with a friend at counting up high, and we came up with some ridiculous names for high numbers because we didn't understand decimal very well yet).
Addition is a thing you do on matchsticks, or fingers, or eggs, or whatever objects you're thinking about. It's merging two groups and then counting the resulting group. This is how I learned addition works (plus the invariant that you will get the same result no matter what kind of object you happen to work with). Counting up and down is one method that I learned, but I learned it by understanding how and why it obviously works, which means I had the ability to generate variants - instead of 2+8=3+7=... I can do 8+2=9+1=..., or I can add ten at a time, etc'.
Same goes for multiplication. I remember the very simple conversation in which I was taught multiplication. "Mom, what is multiplication?" "It's addition again and again, for example 4x3 is 3+3+3". That's it; from that point on I understood (integer) multiplication, and could e.g. wonder myself why people claim that xy=yx and convince myself that it makes sense, and explore and learn faster ways to calculate it while understanding how they fit in the world and what they mean. (An exception is long multiplication, which I was taught as a method one day; it was simple enough that I could memorize it, and it was many years before I was comfortable enough with math that, whenever I did it, it was obvious to me why what I was doing calculates exactly multiplication. Long division is a more complex method: it was taught to me twice by my parents, twice again in the slightly harder polynomial variant by university textbooks, and yet I still don't have it memorized, because I never bothered to figure out how it works nor to practice it enough.)
I never in my life had an ability to add 2+2 while not understanding what + means. I did for half an hour have the same for long division (kinda... I did understand what division means, just not how the method accomplishes it) and then forgot. All the math I remember, I was taught in the correct order.
edit: a good test for whether I understood a method or just memorized it would be, if there's a step I'm not sure I remember correctly, whether I can tell which variation has to be the correct one. For example, in long multiplication, if I remembered each line has to be indented one place more to the right or left but wasn't sure which, since I understand it, I can easily tell that it has to be the left because this accomplishes the goal of multiplying it by 10, which we need to do because we had x0 and treated it as x.
> The practice of solving problems that you describe is to ingrain/memorize those steps so you don't forget how to apply the procedure correctly
Perhaps that is how you learned math, but it is nothing like how I learned math. Memorizing steps does not help; I sucked at it. What works for me is understanding the steps and why we used them. Once I understood the process and why it worked, I was able to reason my way through it.
> The practice of solving problems that you describe is to ingrain/memorize those steps so you don't forget how to apply the procedure correctly.
Did you look at the types of problems presented by the ARC-AGI test? I don't see how memorization plays any role.
> They have now been able to achieve near perfect accuracy on comparison tasks, where GPT-4 is barely in the double digit success rate.
Then let's see how they do on the ARC test. While it is possible that generalized circuits can develop in LLMs with enough training, I am pretty skeptical till we see results.
> Perhaps that is how you learned math, but it is nothing like how I learned math.
Memorization is literally how you learned arithmetic, multiplication tables and fractions. Everyone starts learning math by memorization, and only later start understanding why certain steps work. Some people don't advance to that point, and those that do become more adept at math.
> Memorization is literally how you learned arithmetic, multiplication tables and fractions
I understood how to do arithmetic for numbers with multiple digits before I was taught a "procedure". Also, I am not even sure what you mean by "memorization is how you learned fractions". What is there to memorize?
> I understood how to do arithmetic for numbers with multiple digits before I was taught a "procedure"
What did you understand, exactly? You understood how to "count" using "numbers" that you also memorized? You intuitively understood that addition was counting up and subtraction was counting down, or did you memorize those words and what they meant in reference to counting?
> Also, I am not even sure what you mean by "memorization is how you learned fractions". What is there to memorize?
The procedure to add or subtract fractions by establishing a common denominator, for instance. The procedure for how numerators and denominators are multiplied or divided. I could go on.
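Spelled out as a procedure, the first of those looks something like this (a toy sketch of my own; Python's fractions module would do it for you, the point is just that these are memorized steps):

    # add two fractions by putting them over a common denominator, then simplify
    from math import gcd

    def add_fractions(n1, d1, n2, d2):
        common = d1 * d2              # a common denominator (not necessarily the least one)
        num = n1 * d2 + n2 * d1       # scale each numerator accordingly and add
        g = gcd(num, common)
        return num // g, common // g  # reduce to lowest terms

    print(add_fractions(1, 2, 1, 3))  # (5, 6), i.e. 1/2 + 1/3 = 5/6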
Fractions is exactly an area of mathematics where I learned by understanding the concept and how it was represented and then would use that understanding to re-reason the procedures I had a hard time remembering.
I do have the single-digit multiplication table memorized now, but there was a long time where that table had gaps and I would use my understanding of how numbers worked to calculate the result rather than remembering it. That same process still occurs for double-digit numbers.
Mathematics education, especially historically, has indeed leaned pretty heavily on memorization. That does not mean it's the only way to learn math, or even a particularly good one. I personally think over-reliance on memorization is part of why so many people think they hate math.
> Fractions is exactly an area of mathematics where I learned by understanding the concept and how it was represented and then would use that understanding to re-reason the procedures I had a hard time remembering.
Sure, I did that plenty too, but that doesn't refute the point that memorization is core to understanding mathematics; it's just a specific kind of memorization that results in maximal flexibility for minimal state retention. All you're claiming is that you memorized some core axioms/primitives and the procedures that operate on them, and then memorized how higher-level concepts are defined in terms of that core. I go into more detail on the specifics here:
I agree that this is a better way to memorize mathematics, eg. it's more parsimonious than memorizing lots of shortcuts. We call this type of memorizing "understanding" because it's arguably the most parsimonious approach, requiring the least memory, and machine learning has persuasively argued IMO that compression is understanding [1].
Every time I see people online reduce the human thinking process to just production of a perceptible output, I start questioning myself, whether somehow I am the only human on this planet capable of thinking and everyone else is just pretending. That can't be right. It doesn't add up.
The answer is that both humans and the model are capable of reasoning, but the model is more restricted in the reasoning that it can perform since it must conform to the dataset. This means the model is not allowed to invest tokens that do not immediately represent an answer but have to be derived on the way to the answer. Since these thinking tokens are not part of the dataset, the reasoning that the LLM can perform is constrained to the parts of the model that are not subject to the straitjacket of training loss. Therefore most of the reasoning occurs in between the first and last layers and ends with the last layer, at which point the produced token must cross the training loss barrier. Tokens that invest in the future but are not in the dataset get rejected and thereby limit the ability of the LLM to reason.
> People think that knowledge lies in the texts themselves; it does not, it lies in what these texts relate to and the processes that they are part of, a lot of which are out in the real world and in our interactions
And almost all of it is just more text, or described in more text.
You're very much right about this. And that's exactly why LLMs work as well as they do - they're trained on enough text of all kinds and topics, that they get to pick up on all kinds of patterns and relationships, big and small. The meaning of any word isn't embedded in the letters that make it, but in what other words and experiences are associated with it - and it so happens that it's exactly what language models are mapping.
It is not "just more text". That is an extremely reductive approach on human cognition and experience that does favour to nothing. Describing things in text collapses too many dimensions. Human cognition is multimodal. Humans are not computational machines, we are attuned and in constant allostatic relationship with the changing world around us.
I think there is a component of memorizing solutions. For example, for mathematical proofs there is a set of standard "tricks" that you should have memorized.
I think the reasoning is very simple. Everything that happens happens in space through time. Intelligent systems must solve problems where they observe what's happening in space over some amount of time, and then predict what's going to happen to that space over some other amount of time.
> Would an intelligent but blind human be able to solve these problems?
Blind people can have spatial reasoning just fine. Visual =/= spatial [0]. Now, one would have to adapt the colour-based tasks to something that would be more meaningful for a blind person, I guess.