The wikipedia article doesn't have the best illustration, but I think the inherent idea is really wise: in trying too hard to meet your initial constraints, you can come up with a solution that's only useful at those constraint points.
Interesting, I thought that was exactly what you were talking about :)
I learned about it in numerical analysis. It illustrates the downside of trying to be too precise--you can make a polynomial that will go through an arbitrary number of data points.
As the number of points increases, your function will look less like a line and more like a magnitude 9 earthquake on a seismograph. The function will pass through all the points used to define it. However, it'll be useless when predicting the original data's behavior, as it changes too quickly on small input.
Instead, mathematicians find more useful functions by relaxing the conditions so that the model function only has to come 'near' the data points.
The wikipedia article doesn't have the best illustration, but I think the inherent idea is really wise: in trying too hard to meet your initial constraints, you can come up with a solution that's only useful at those constraint points.