It’s a heuristic argument that critical points are extremely unlikely to be local minima (i.e., points where the Hessian is positive definite). Loss surfaces of DNNs do typically have a global minimum (zero loss if the network fits the training data exactly).
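To make the heuristic concrete, here's a minimal sketch (my own illustration, not part of the original argument) that uses a randomly drawn symmetric matrix as a stand-in for the Hessian at a critical point. If each eigenvalue's sign behaved roughly like a coin flip, the chance that all of them are positive (a local minimum) would shrink like 2^(-d); the random-matrix toy model below isn't exactly that (its eigenvalues are correlated), but it shows the same qualitative effect, with positive definiteness becoming vanishingly rare as the dimension grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: draw random symmetric matrices as stand-ins for the Hessian at a
# critical point and count how often they are positive definite (all
# eigenvalues > 0), i.e. how often the critical point would be a local minimum.
for d in [1, 2, 4, 8, 16]:
    trials = 10_000
    pos_def = 0
    for _ in range(trials):
        g = rng.standard_normal((d, d))
        h = (g + g.T) / 2.0                      # random symmetric "Hessian"
        if np.all(np.linalg.eigvalsh(h) > 0):    # positive definite?
            pos_def += 1
    print(f"d={d:2d}  fraction positive definite: {pos_def / trials:.4f}")
```

Already at d=16 the fraction is essentially zero, and real networks have millions of parameters, which is why almost all critical points are expected to be saddle points rather than minima.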
Arguably, a DNN is likely to have many global minima: given the level of (over)parametrization commonly used, a set of parameters that achieves the lowest possible loss won't be unique; there will be huge sets of parameters that give exactly identical results (see the sketch below).
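One source of this non-uniqueness is permutation symmetry: relabeling the hidden units of a layer gives a different parameter vector that computes exactly the same function, so any minimizer comes with factorially many equally good copies. A minimal sketch (the network sizes here are my own made-up example):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    """Tiny one-hidden-layer network: x -> relu(W1 x + b1) -> W2 h + b2."""
    h = np.maximum(0.0, W1 @ x + b1)
    return W2 @ h + b2

# Hypothetical small network: 3 inputs, 5 hidden units, 1 output.
W1 = rng.standard_normal((5, 3))
b1 = rng.standard_normal(5)
W2 = rng.standard_normal((1, 5))
b2 = rng.standard_normal(1)

# Permute the hidden units: reorder the rows of W1 and b1, and the columns of
# W2 to match. The resulting parameters are different numbers but compute
# exactly the same function, so they sit at exactly the same loss value.
perm = rng.permutation(5)
W1_p, b1_p, W2_p = W1[perm], b1[perm], W2[:, perm]

x = rng.standard_normal(3)
print(mlp(x, W1, b1, W2, b2))        # original parameters
print(mlp(x, W1_p, b1_p, W2_p, b2))  # permuted parameters: identical output
```

With 5 hidden units that's already 5! = 120 parameter settings with identical loss, and the count explodes with realistic layer widths (on top of other symmetries, such as rescaling adjacent ReLU layers).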