
I think rather than going back and forth, perhaps someone should explain the difference between average and median?


That shouldn't be necessary on this site, but since you asked.

Average (or mean) is the sum of the values divided by the number of values. Because of the way it is calculated, it is more likely to skew when the data includes extremely large or extremely small values. Median, on the other hand, measures the "middle value" of the sorted data. For data sets of odd cardinality, it is simply the middle value, and for even cardinality, it is the mean of the two middlemost values. Depending on the data, you can have a large number of extreme values with no effect on the median.
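
A rough Python sketch of both definitions (hand-rolled rather than using the statistics module, just to make the odd/even rule concrete):

    def mean(xs):
        # sum of the values divided by the number of values
        return sum(xs) / len(xs)

    def median(xs):
        s = sorted(xs)
        n = len(s)
        mid = n // 2
        if n % 2 == 1:
            return s[mid]                    # odd cardinality: the middle value
        return (s[mid - 1] + s[mid]) / 2     # even: mean of the two middlemost values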

As an example, let us look at a hypothetical neighborhood. We have a family making 50k, another 100k, another 150k. Their average salary is 100k, and so is the median. Now assume a super rich guy making 10 million moves in. Our average jumps to roughly 2.6 million, but the median sits at 125k, far more representative of where the data is clumped.
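
You can check those numbers with Python's statistics module:

    from statistics import mean, median

    incomes = [50_000, 100_000, 150_000]
    print(mean(incomes), median(incomes))   # average 100k, median 100k

    incomes.append(10_000_000)
    print(mean(incomes), median(incomes))   # average 2,575,000, median 125,000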

None of this is to say that the median is a good measurement in what I'll call malicious data sets. Imagine a data set with 1000 entries of negative 10 million and 1000 of positive 10 million. If I add a new data point of 5000, that becomes my median, but it doesn't represent my data at all.
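
Same check for the pathological case; the median lands exactly on the one value that looks nothing like the rest of the data:

    from statistics import mean, median

    data = [-10_000_000] * 1000 + [10_000_000] * 1000 + [5_000]
    print(median(data))   # 5000, the lone added point
    print(mean(data))     # roughly 2.5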

Bonus: there's also the "mode", which is the value that occurs most often. E.g. in the set 1 2 2 2 3 4 5, the mode is 2.
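
The statistics module covers that one too (multimode needs Python 3.8+):

    from statistics import mode, multimode

    print(mode([1, 2, 2, 2, 3, 4, 5]))   # 2
    print(multimode([1, 1, 2, 2, 3]))    # [1, 2] when there's a tie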


The average is skewed towards outliers in the distribution; the more asymmetric the distribution, the more pronounced the effect.

The median, on the other hand, just gives you the value that splits the distribution into two equal halves.

Price and income distributions are the perfect example if you want to see this average bias taken to its extreme.

e.g.

    median of [2,2,2,2,2,2,2,100] = 2
    average of [2,2,2,2,2,2,2,100] = 14.25
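
A small sketch of how the bias grows with the outlier while the median doesn't move (the larger outlier sizes are picked arbitrarily):

    from statistics import mean, median

    base = [2] * 7
    for outlier in (100, 10_000, 1_000_000):
        xs = base + [outlier]
        print(outlier, mean(xs), median(xs))
    # the median stays at 2 in every case; the average tracks the outlier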



