2. Make sure you have the latest NVIDIA driver for your machine, along with the CUDA toolkit. This varies by OS but is fairly easy on most Linux distros.
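A quick way to sanity-check that step (using the standard NVIDIA tools; the script just reports what's missing rather than failing on machines without a GPU):

```shell
# Check that the NVIDIA driver is installed and visible
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  echo "nvidia-smi not found: driver not installed or not on PATH"
fi

# Check that the CUDA toolkit's compiler is installed
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc not found: CUDA toolkit not installed"
fi
```

If both commands print version info, you're good to go.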
4. Run the model following their instructions. Several flags matter, but you can also just use the server example that was added a few days ago; it gives a fairly solid chat interface.
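Assuming this refers to llama.cpp (the comment doesn't name the project), the invocation looks roughly like the following. The model path is a placeholder; check the project's README for the flags that match your build:

```shell
# Placeholder path: point this at whatever quantized model you downloaded
MODEL=models/ggml-model-q4_0.bin

# Interactive chat in the terminal (flags are illustrative, not exhaustive):
#   ./main -m "$MODEL" -i -r "User:"
# Or the server example, which serves a chat UI over HTTP:
#   ./server -m "$MODEL" --port 8080
echo "would launch with model: $MODEL"
```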
4) Run gpt4all and wait out its obnoxiously slow startup
... and that's it. On my machine it works perfectly well -- about as fast as the hosted web version of GPT. I have a decent GPU, but I never checked whether it's actually being used, since it's fast enough as-is.
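If you do want to check whether the GPU is being exercised, a quick way (again assuming the standard NVIDIA tooling) is to watch utilization while the model is generating:

```shell
# Print current GPU utilization and memory use; run this in a second
# terminal while the model is mid-generation. Falls back gracefully
# if nvidia-smi isn't installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader
else
  echo "nvidia-smi not found; can't check GPU usage this way"
fi
```

If utilization stays near 0% during generation, inference is running on the CPU.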