It's just not physically feasible. For an antenna array with two-dimensional beamforming you need an N×N array of antennas, each with dimensions of a fraction of a wavelength, so we are talking about a not insignificant size. Additionally, to limit mutual coupling via the substrate and stray radiation, you need to keep certain distances between the elements. Third, you need to be careful to prevent grating lobes (unwanted extra beams that show up when the elements are spaced too far apart).
And even then, you will need individually controlled phase shifters for each array element, adding cost, complexity and mostly negating energy savings.
Edit: All of that is in addition to the problem that you need to know the location of the base station and have your array antenna roughly oriented towards it.
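To put rough numbers on the size and spacing points above, here is a minimal Python sketch. The function names are made up for illustration, the half-wavelength default spacing and the 2 GHz example frequency are my assumptions, and the side length is only the element pitch times N (it ignores housing, feed network and ground-plane margins). The spacing limit is the textbook grating-lobe condition d/λ ≤ 1 / (1 + |sin θ_max|) for a uniformly spaced array.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def array_side_m(n, freq_hz, spacing_wavelengths=0.5):
    """Approximate side length of an N x N array (element pitch times N)."""
    wavelength = C / freq_hz
    return n * spacing_wavelengths * wavelength


def max_spacing_wavelengths(max_scan_deg):
    """Largest element spacing (in wavelengths) that keeps grating lobes
    out of visible space when steering up to max_scan_deg off broadside."""
    return 1.0 / (1.0 + abs(math.sin(math.radians(max_scan_deg))))


if __name__ == "__main__":
    # Assumed example: 4 x 4 array at 2 GHz with half-wavelength pitch.
    print(f"side length: {array_side_m(4, 2e9) * 100:.1f} cm")
    print(f"max pitch for 60 deg scan: {max_spacing_wavelengths(60):.2f} wavelengths")
```

Under these assumptions a 4×4 array at 2 GHz is already about 30 cm on a side, which is the core of the size objection.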
In addition to all the above, phones move a lot during a call. You'd need very good sensors to keep track of the phone's orientation in space. If you're only shooting for 50%, the sensors in phones are probably good enough, but what happens when the user puts their head between the phone and the base station? The phone could say "please rotate your body to reconnect call" but that would be a lousy user experience.
There are neat tricks to keep phased array antennas in alignment. One, for example, is to just send the signal back in the same direction you're receiving it from (i.e. use the same phase delays in every antenna element for transmit as for receive).
Then, however much the phone is tumbling, you'll always be sending the data in exactly the right direction.
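A minimal sketch of that trick, under some loud assumptions: an idealized uniform linear array, a single narrowband plane wave, no noise or multipath, and the same frequency for receive and transmit. The function names are hypothetical. The per-element weights that line up the received signal are the complex conjugates of the received phases, and the same weights are reused for transmit.

```python
import cmath
import math


def received_phases(n_elements, spacing_wl, arrival_deg):
    """Phase seen at each element for a plane wave arriving from arrival_deg
    (measured from broadside); spacing_wl is the pitch in wavelengths."""
    s = math.sin(math.radians(arrival_deg))
    return [cmath.exp(2j * math.pi * spacing_wl * n * s) for n in range(n_elements)]


def retro_weights(received):
    """Unit-magnitude weights that cancel the received phase at each element."""
    return [r.conjugate() / abs(r) for r in received]


def array_gain(weights, spacing_wl, look_deg):
    """Magnitude of the transmit array factor in direction look_deg."""
    s = math.sin(math.radians(look_deg))
    total = sum(
        w * cmath.exp(2j * math.pi * spacing_wl * n * s)
        for n, w in enumerate(weights)
    )
    return abs(total)


if __name__ == "__main__":
    rx = received_phases(8, 0.5, 25.0)   # assumed: 8 elements, wave from 25 deg
    w = retro_weights(rx)
    print(array_gain(w, 0.5, 25.0))      # coherent sum back toward the source
    print(array_gain(w, 0.5, -25.0))     # much weaker in another direction
```

Note that the code never needs to know the arrival angle: the weights come straight from what each element measured, which is why the orientation of the phone drops out.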
Turns out the problem is moot anyway: unless you have a huge antenna to get a really narrow beam, sub-degree pointing errors won't have any real impact.
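A quick way to sanity-check that claim is to compute how much gain a uniform linear array loses when it is pointed slightly off target. This is a sketch only: it assumes ideal isotropic elements, a beam steered at broadside, and half-wavelength spacing in the examples, and `pointing_loss_db` is a made-up helper name.

```python
import math


def pointing_loss_db(n_elements, spacing_wl, error_deg):
    """Gain loss in dB (<= 0) of a uniform linear array when the target is
    error_deg away from the beam peak; spacing_wl is pitch in wavelengths."""
    psi = 2.0 * math.pi * spacing_wl * math.sin(math.radians(error_deg))
    denom = math.sin(psi / 2.0)
    if abs(denom) < 1e-12:
        return 0.0  # exactly on the peak
    ratio = abs(math.sin(n_elements * psi / 2.0) / denom) / n_elements
    if ratio == 0.0:
        return float("-inf")  # target sits exactly in a pattern null
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, "elements, 1 deg error:", round(pointing_loss_db(n, 0.5, 1.0), 3), "dB")
```

For a phone-sized array of a few elements per side, the loss at a 1 degree error is well under 0.1 dB under these assumptions; it only becomes several dB once the array is tens of wavelengths across.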