The real "a-ha" moment that demystified this for me was not so much learning about the elementary rotation, translation, or even perspective projection operations. It was understanding how all of those operations can be composed into a single transformation, and that all 3D graphics really does is transform coordinates from one relative space to another.
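To make the composition point concrete, here is a minimal sketch in plain Python. The helper names (`matmul`, `apply`, `translation`, `rotation_y`) and the 4x4 row-major layout are my own assumptions for illustration, not anything from the video. It rotates a point and then translates it, once step by step and once through a single pre-multiplied matrix.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, point):
    """Apply a 4x4 matrix to a 3D point, treating it as (x, y, z, 1)."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return (out[0], out[1], out[2])

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_y(angle):
    """Rotation about the y axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c,   0.0, s,   0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s,  0.0, c,   0.0],
            [0.0, 0.0, 0.0, 1.0]]

rotate = rotation_y(math.pi / 2)
move = translation(0.0, 0.0, 5.0)

# The rightmost matrix is applied first: rotate, then translate.
combined = matmul(move, rotate)

point = (1.0, 0.0, 0.0)
step_by_step = apply(move, apply(rotate, point))
all_at_once = apply(combined, point)
```

Both routes land on (approximately) the same coordinates, which is why a whole chain of transforms can be collapsed into one matrix before it ever touches a vertex.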
One important revelation in that regard, for instance, was that moving a camera within a world is mathematically exactly the same as moving the world in the opposite direction relative to the camera. Once you get a feel for how transformations and coordinate spaces work, you can start playing around with them and a whole new world of possibilities opens up to you.
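A rough sketch of the camera/world equivalence, restricted to translation only (I'm ignoring camera rotation to keep it short, and the function names are made up for this example):

```python
def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def to_camera_space(point, camera_pos):
    """Coordinates of `point` relative to a camera that only translates."""
    return sub(point, camera_pos)

point = (2.0, 1.0, 10.0)
camera = (0.0, 0.0, 0.0)
step = (1.0, 0.0, 3.0)

# Option A: move the camera by `step`, leave the world alone.
camera_moved = to_camera_space(point, add(camera, step))

# Option B: leave the camera alone, move the world by `-step`.
world_moved = to_camera_space(sub(point, step), camera)
```

Both options give the same camera-space coordinates, (1.0, 1.0, 7.0), so a renderer is free to keep the camera fixed at the origin and move everything else instead.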
Though in the real-world case, there's an important difference that breaks the symmetry: You experience acceleration, whereas everybody else standing around you doesn't.
The way he animated points with an increasing z value made it click for me. Now, when I look at the formula it makes sense. The larger the value of z, the smaller your projected x and y will be. This checks out because things get smaller as they move farther away. Something that’s twice as far away will seem half as big.
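For anyone who wants to poke at it, a tiny sketch, assuming the formula in question is the usual divide-by-z (x' = x / z, y' = y / z) with the camera at the origin looking down the positive z axis. The `project` name is mine.

```python
def project(x, y, z):
    """Perspective-project a 3D point onto the plane z = 1."""
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (x / z, y / z)

near = project(1.0, 1.0, 2.0)  # (0.5, 0.5)
far = project(1.0, 1.0, 4.0)   # (0.25, 0.25)
```

Doubling z from 2 to 4 halves both projected coordinates, which is the "twice as far away, half as big" behaviour.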
Interestingly, in a way, rotation is less mystical than the perspective projection. The rotation is linear: x' = Rx, but the perspective projection is non-linear.
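One quick way to see the difference is to test additivity, f(a + b) = f(a) + f(b), which any linear map has to satisfy. A small sketch, assuming a 90 degree rotation about the z axis and the plain divide-by-z projection (both functions are illustrative names of my own):

```python
def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def rotate_z_90(p):
    """Rotate a 3D point by 90 degrees about the z axis: (x, y, z) -> (-y, x, z)."""
    x, y, z = p
    return (-y, x, z)

def project(p):
    """Divide-by-z perspective projection onto the plane z = 1."""
    x, y, z = p
    return (x / z, y / z)

a = (1.0, 2.0, 2.0)
b = (3.0, 0.0, 2.0)

# Rotation: transforming the sum equals summing the transforms.
rotation_is_additive = rotate_z_90(add(a, b)) == add(rotate_z_90(a), rotate_z_90(b))

# Projection: the same check fails, because of the division by z.
projection_is_additive = project(add(a, b)) == add(project(a), project(b))
```

Here `rotation_is_additive` comes out True and `projection_is_additive` comes out False: projecting the sum gives (1.0, 0.5), while summing the projections gives (2.0, 1.0).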
This is where things become fun. Next up are homogeneous coordinates or quaternions. Takes a few years of your life to actually enjoy this though :)
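As a taste of why homogeneous coordinates are worth the effort: they let the non-linear perspective step be written as an ordinary 4x4 matrix multiply followed by one division by w. A minimal sketch, assuming the simplest possible projection matrix (it just copies z into w; real graphics APIs use more elaborate matrices with near/far planes), with names of my own choosing:

```python
# Last row copies z into w, so dividing by w later is dividing by z.
PERSPECTIVE = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]]

def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def project(point):
    """Project a 3D point: matrix multiply, then the perspective divide."""
    x, y, z = point
    hx, hy, hz, hw = transform(PERSPECTIVE, [x, y, z, 1.0])
    return (hx / hw, hy / hw)

projected = project((2.0, 4.0, 4.0))  # (0.5, 1.0)
```

The matrix part is linear and composes with rotations and translations like any other matrix; only the final divide by w is non-linear.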
This formula also leads to weird perceptual distortions: when you stand in front of a tall building and look up and down, the shape of the building changes depending on the viewing angle. VR got rid of that.
I can't really say that this formula demystifies things, but the video is nice if you're eager to learn about this.