The way I see it, there's a whole spectrum of methods for checking program correctness.

At one end are "point tests": manually written tests that check correctness on a handful of predetermined inputs. That's what most software engineers mean by "testing".
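
Sticking with the equality from the article (does 2 * x equal x + x?), point tests would look something like this in Lean syntax, each line pinning the property at one concrete input:

  -- point tests: check 2 * x = x + x at a few hand-picked inputs
  example : 2 * 0 = 0 + 0 := dec_trivial
  example : 2 * 1 = 1 + 1 := dec_trivial
  example : 2 * 17 = 17 + 17 := dec_trivial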

Then come randomized tests/fuzz testing/QuickCheck, where properties are stated to hold for all values of some variables and then checked on randomly generated inputs. This offers better correctness guarantees than hand-picked point tests, but it can still miss failing cases, since it samples the space instead of searching it exhaustively.
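
As a minimal sketch of the idea, with no framework, just scanning a finite slice of an infinite input space (mathlib also ships a QuickCheck-style tactic, slim_check, that does the random-generation version properly):

  -- poor man's property check: scan part of the input space for counterexamples
  -- (a real property tester like QuickCheck draws the inputs at random instead)
  #eval (list.range 1000).filter (λ x, 2 * x ≠ x + x)  -- [] means no counterexample below 1000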

Finally, at the far end are proofs, where one formally proves that the aforementioned properties always hold. I cannot imagine a stronger guarantee that something holds than an actual mathematical proof of it.

Dependently typed languages (Agda, Idris, Coq, Lean, etc.) push software development toward that far end, where correctness can actually be proved. Imo that's where the future of software lies for certain classes of programs where correctness is a must.

Here's a proof of the equality from the original question in Lean:

  -- Lean 3; the `two_mul` lemma comes from mathlib
  import tactic

  lemma foo : (λ x : ℕ, 2 * x) = (λ x : ℕ, x + x) :=
  begin
    funext x,            -- function extensionality: compare the functions pointwise
    show 2 * x = x + x,  -- beta-reduce the goal
    exact two_mul x      -- mathlib lemma: two_mul n : 2 * n = n + n
  end
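
Once that compiles, the kernel has checked the equality for every natural number at once, which is exactly the guarantee that no amount of sampling can give you.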

