Isn't input/output lengths an arbitrary distinction. Under the hood, output becomes the input for the next token at each step. OpenAI may charge you more $$ by forcing you to add output to the input and call the API again. But running local you don't have that issue.