It will definitely not be slower. Local inference with a 7B model on a 3090/4090 will outpace GPT-3.5 Turbo and smoke GPT-4 Turbo on generation speed.