One reason might be that music is (to some extent) translation invariant in a way text is not: you can play the very same piece starting on any of the 12 pitches from C to B. Those 12 transpositions will sound different, but not that different. So perhaps note names are distracting when you are focusing on that abstract structure?
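
To make the invariance concrete, here is a minimal sketch. It assumes pitches are encoded as MIDI-style integers (60 = middle C) and uses sharps only for note names. The melody and the helper names `transpose`, `intervals`, and `names` are just illustrative choices of mine. Shifting the melody changes every note name, but the sequence of intervals stays the same:

```python
# Sharps-only spelling of the 12 pitch classes (an assumption for simplicity).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def transpose(melody, semitones):
    """Shift every pitch by the same number of semitones."""
    return [pitch + semitones for pitch in melody]


def intervals(melody):
    """Differences between successive pitches: the 'abstract structure'."""
    return [b - a for a, b in zip(melody, melody[1:])]


def names(melody):
    """Note names of the pitches, ignoring octave."""
    return [NOTE_NAMES[pitch % 12] for pitch in melody]


# Illustrative melody starting on C: C C G G A A G
melody_in_c = [60, 60, 67, 67, 69, 69, 67]

if __name__ == "__main__":
    for shift in range(12):
        shifted = transpose(melody_in_c, shift)
        # The note names differ for every shift, the intervals never do.
        assert intervals(shifted) == intervals(melody_in_c)
        print(names(shifted), intervals(shifted))
```

The names in the left column change on every row while the interval list on the right is identical in all 12, which is roughly what I mean by the structure being independent of the names.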