Inference with this model works fine on Google Colab, which provides a Tesla K80 GPU with access to 12GB of GPU RAM. You can buy a used K80 for probably about $850, but it's not really ideal for putting in a home computer because of its cooling requirements (it's a passively cooled server card that expects datacenter airflow).
If I remember correctly the K80 is actually a dual-GPU card: 24GB total, but split as 2x12GB, one pool per GK210 die, and Colab only exposes one of the two GPUs. This is a pretty important distinction in this context (training GPT-2).
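To see why the 12GB vs 24GB distinction matters for training, here's a rough back-of-envelope estimate. Assumptions: GPT-2's largest model has ~1.5B parameters, weights in fp32, and plain Adam keeps two extra fp32 states per parameter, so training memory is roughly 4x the weight memory before counting activations:

```python
# Rough VRAM estimate for GPT-2 (1.5B params) in fp32.
# Assumption: Adam holds gradients plus two moment buffers per
# parameter, so training takes ~4x weight memory (activations extra).
PARAMS = 1.5e9
BYTES_FP32 = 4

weights_gb = PARAMS * BYTES_FP32 / 1024**3
train_gb = weights_gb * 4  # weights + grads + 2 Adam moments

print(f"weights alone:   {weights_gb:.1f} GB")  # ~5.6 GB, fits in 12 GB for inference
print(f"training (Adam): {train_gb:.1f} GB")    # ~22.4 GB, needs a 24 GB card
```

So inference squeaks by on a single 12GB GPU, while naive fp32 Adam training already wants something close to a full 24GB card, even before activations.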
Also, you can get at least 6 K80s for the price of a single Titan RTX (also 24GB). Even if a Titan RTX isn't 6x faster than a K80 (I don't think it is), that's more aggregate throughput and 6x the memory for the same price. It's a very good deal.