I wonder if it would be best to return multiple colors when something appears to have multiple primary ones - e.g. black and white for a zebra, green and red for a watermelon.
Separating the subject from the background would also be useful - as simple as weighting the center of the image more than the edges?
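To make the center-weighting idea concrete, here is a minimal sketch using a Gaussian falloff from the image center. The function name `center_weighted_mean`, the `sigma_frac` parameter and the flat row-major pixel list are all assumptions made for illustration, and a plain weighted mean is only the simplest thing these weights could feed into.

```python
import math

def center_weighted_mean(pixels, width, height, sigma_frac=0.25):
    """Weighted mean of RGB pixels, favouring the image center.

    pixels: flat row-major list of (r, g, b) tuples, len == width * height.
    sigma_frac: hypothetical tuning knob; Gaussian sigma as a fraction
                of the shorter image side.
    """
    cx, cy = (width - 1) / 2, (height - 1) / 2
    sigma = sigma_frac * min(width, height)
    total_weight = 0.0
    acc = [0.0, 0.0, 0.0]
    for y in range(height):
        for x in range(width):
            # Weight decays with squared distance from the center.
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            w = math.exp(-d2 / (2 * sigma ** 2))
            for i, channel in enumerate(pixels[y * width + x]):
                acc[i] += w * channel
            total_weight += w
    return tuple(c / total_weight for c in acc)
```

The same per-pixel weights could just as well be used to weight samples in a clustering step rather than in a mean.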
I've done this exercise, and generally what you want is a combination of your suggestion (which is elaborated on by 'polyphora [0]) and 'p1necone's suggestion [1].
'polyphora mentioned CIELAB as just one example, and it's a good one. I believe the state of the art these days is Oklab [2], discussed here [3]. I'd like to pull out a comment from 'jiggawatts in that discussion:
> This is a tour de force of colour theory, and should be mandatory reading for anyone serious about computer colour!
I completely agree.
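For anyone who wants to try Oklab, a minimal sketch of the sRGB-to-Oklab conversion follows. The matrix coefficients are the ones from the published Oklab definition; the function names `srgb_to_linear` and `srgb_to_oklab` are my own, and the code assumes 8-bit, in-gamut sRGB input.

```python
def srgb_to_linear(c8):
    """Undo the sRGB transfer curve for one 8-bit channel (0..255)."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def srgb_to_oklab(rgb):
    """Convert an 8-bit sRGB triple to Oklab (L, a, b)."""
    r, g, b = (srgb_to_linear(c) for c in rgb)

    # Linear sRGB -> approximate cone responses (LMS).
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b

    # Cube-root non-linearity; inputs are non-negative for in-gamut sRGB.
    l_, m_, s_ = l ** (1 / 3), m ** (1 / 3), s ** (1 / 3)

    # LMS' -> Oklab.
    return (
        0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_,
        1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_,
        0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_,
    )
```

White comes out at roughly L = 1 with a and b near zero, and black at all zeros. Doing the distance calculations in this space rather than in raw RGB is what makes "close" colors look close.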
With regard to 'p1necone's suggestion, k-means clustering (not to be confused with k-nearest neighbors, which is a classifier) is one simple and relatively easy way to separate the colors into bins. I've only done this on a single image, but with multiple images maybe you could also cluster the resulting colors from each image and only return bins which have multiple members.
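A rough sketch of that binning step, using k-means clustering, which is the usual technique for grouping pixels into k color bins. Everything here is illustrative: the function name `kmeans_colors`, the farthest-point initialisation (chosen so the result is deterministic) and the fixed iteration cap are my assumptions, and the pixels could just as well be Oklab triples instead of RGB.

```python
def dist2(p, q):
    """Squared Euclidean distance between two color triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_colors(pixels, k, iterations=20):
    """Cluster pixels into k bins; return [(center, size)], largest first."""
    # Farthest-point initialisation: start from the first pixel, then
    # repeatedly add the pixel farthest from all centers chosen so far.
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(
            max(pixels, key=lambda p: min(dist2(p, c) for c in centers))
        )

    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each pixel goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda i: dist2(p, centers[i]))
            clusters[nearest].append(p)

        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(ch) / len(members) for ch in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers

    sizes = [len(members) for members in clusters]
    return sorted(zip(centers, sizes), key=lambda cs: -cs[1])
```

For a zebra-like image with k = 2 this returns one dark and one light center together with their pixel counts, so a caller can drop bins below some size threshold instead of averaging everything together.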
"Average color" is not really what I want.