Yes, it is indeed misleading. From the description, the program extrapolated the emotional responses collected from 100 people for 500 paintings (??) to other paintings. As you point out, no explanation is involved that would justify the "why" in the title. The title probably made the original authors cringe.
Also consider this: the predictive power depends on the features tracked by the machine vision algo, and those features are chosen by humans in the first place. We can ask whether the prediction would still be as good as 80% if the vision algo only processed monochrome images. So was the "explanation" baked in beforehand?
So the predictions are customized for each person, right? The computer has to calibrate your emotional responses to known paintings in order to predict how you, personally, will react to a new painting. So emotional response to abstract art is still subjective and personal, and the computer isn't so much understanding the art as getting to know the people.
80% accuracy doesn't seem that impressive, IMO. I would actually be surprised if you couldn't get nearly that accurate with nothing but comparing overall colors and quantity of edges (pick an edge detection effect, count pixels affected) - 4 dimensions of data. People are pretty simple, and pretty similar, and more importantly so are artists.
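To make that concrete, here is a rough sketch of what such a 4-number summary could look like. It assumes the "4 dimensions" are the mean red, green, and blue values plus the fraction of pixels flagged as edges; that reading, the function name `simple_features`, the crude neighbor-difference edge test, and the threshold of 30 are all my own guesses for illustration, not anything from the study.

```python
def simple_features(image, edge_threshold=30):
    """Summarize an image as 4 numbers: mean R, mean G, mean B, edge fraction.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    The edge test is a hypothetical stand-in for "pick an edge detection effect".
    """
    height, width = len(image), len(image[0])
    n = height * width

    # Overall color: the mean of each channel across the whole image.
    mean_rgb = [sum(px[c] for row in image for px in row) / n for c in range(3)]

    # Brightness per pixel, used only for the edge test.
    gray = [[sum(px) / 3 for px in row] for row in image]

    # Count pixels whose brightness jumps sharply to the right or downward.
    edge_pixels = 0
    for y in range(height):
        for x in range(width):
            dx = gray[y][min(x + 1, width - 1)] - gray[y][x]
            dy = gray[min(y + 1, height - 1)][x] - gray[y][x]
            if abs(dx) + abs(dy) > edge_threshold:
                edge_pixels += 1

    return mean_rgb + [edge_pixels / n]
```

Feed those four numbers per painting, along with each person's known ratings, into any off-the-shelf classifier and see how close it gets. Whether it would actually land near 80% is a guess on my part; the point is only that the baseline is cheap to try.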