There's an argument to be made that, so long as your testing fully encompasses all visitors to your site, you aren't sampling the population; you're observing it in full, and statistical significance is irrelevant.
Sites are always gaining new visitors and losing old ones, and the visitors they have observed return irregularly (or frequently, or somewhere in between). So it's not realistic to treat a given sample of visitors as the population.
If future users are markedly different from past users, a p-value isn't going to help you.
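To make that concrete, here's a toy simulation (conversion rates and sample sizes are made up): variant B genuinely converts better for the users you observed, so the test on that period gives a small p-value, but once the population shifts the comparison reverses. The test did its job; it just can't speak for traffic it never saw.

    # Toy illustration (all rates invented): a small p-value on past visitors
    # says nothing about future visitors if the underlying population drifts.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)

    def two_prop_p(conv_a, n_a, conv_b, n_b):
        """Two-sided p-value for a pooled two-proportion z-test."""
        p_a, p_b = conv_a / n_a, conv_b / n_b
        p_pool = (conv_a + conv_b) / (n_a + n_b)
        se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
        return 2 * norm.sf(abs((p_b - p_a) / se))

    n = 20_000  # visitors per arm in each period

    # Period 1: the users you actually observed.  B's true rate is higher.
    a1, b1 = rng.binomial(n, 0.050), rng.binomial(n, 0.055)
    print("period 1 p-value:", round(two_prop_p(a1, n, b1, n), 4))

    # Period 2: later users drawn from a shifted population.  The effect reverses.
    a2, b2 = rng.binomial(n, 0.055), rng.binomial(n, 0.050)
    print("period 2, is B still ahead?", b2 / n > a2 / n)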
It's not unreasonable to treat it as a sample; I just don't think it's worth getting paralyzed worrying about whether or not you have enough power, or resorting to hacky tricks to try to fix it.
...but most power calculations are also sort of bullshit.
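They're easy to poke at because the answer is dominated by a number you have to invent before seeing any data. Here's a sketch of the textbook two-proportion sample-size formula (baseline rate and candidate lifts are placeholders): halve the lift you guess you can detect and the required sample size roughly quadruples.

    # Sketch of a standard two-proportion sample-size calculation.  The baseline
    # rate and candidate lifts are placeholders; the point is how sensitive the
    # result is to the effect size guessed up front.
    from math import ceil
    from scipy.stats import norm

    def n_per_arm(p_base, lift, alpha=0.05, power=0.80):
        """Visitors per arm for a two-sided two-proportion z-test."""
        p1, p2 = p_base, p_base + lift
        z_alpha = norm.ppf(1 - alpha / 2)   # significance threshold
        z_beta = norm.ppf(power)            # desired power
        var = p1 * (1 - p1) + p2 * (1 - p2)
        return ceil((z_alpha + z_beta) ** 2 * var / lift ** 2)

    for lift in (0.02, 0.01, 0.005):
        print(f"detect +{lift:.3f} on a 5% baseline: ~{n_per_arm(0.05, lift):,} per arm")

The formula itself is fine; the guessed lift is the part that tends to be, well, made up.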
It may be that after analyzing the data we still have substantial uncertainty; it depends on the process's inherent variability and on how much information the data provide (which is partly a function of the skill of the person deciding what should be collected).
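As a toy illustration of the variability side (numbers invented): for a fixed amount of data, the interval you end up with scales with the process's own spread, and no amount of analysis narrows it for free.

    # Toy example (invented numbers): for a fixed sample size, the residual
    # uncertainty in the mean scales with the process's inherent variability.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 5_000  # observations collected

    for sigma in (1.0, 5.0, 25.0):
        sample = rng.normal(loc=10.0, scale=sigma, size=n)
        half_width = 1.96 * sample.std(ddof=1) / np.sqrt(n)  # approx. 95% CI half-width
        print(f"sigma={sigma:>4}: mean {sample.mean():.2f} +/- {half_width:.2f}")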
That argument misses that you are using past users' behaviour as representative of future users' preferences. You are not sampling marbles from a jar; you are making a lot of assumptions, notably about continuity.