I briefly told ChatGPT 3.5 about the syntax of a CLI tool I wrote, then asked it to perform a few operations. It did a surprisingly good job, even when I said "and format the result as JSON with fields named Command and Description where the latter explains what the command does".
If I were to actually use this in a real system, I'd definitely build a restricted shell to execute in, and probably run it inside a Docker container with just the essential files mapped in. I don't trust an LLM not to describe what it's doing as "updating timestamps" or whatever when the actual command is "rm -rf ~".
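A rough sketch of what the first layer of that could look like in Python: parse the JSON reply, ignore the Description entirely, and only accept the Command if it matches an allowlist. The tool name `mytool`, its subcommands, and the image name `mytool-sandbox` are placeholders I made up for illustration, not the real CLI. The `docker` flags used (`--rm`, `--network none`, `--read-only`, `-v`, `-w`) are standard `docker run` options, but treat the whole thing as a starting point rather than a complete sandbox.

```python
import json
import shlex

# Hypothetical allowlist: tool name and subcommands are placeholders.
ALLOWED_TOOL = "mytool"
ALLOWED_SUBCOMMANDS = {"list", "show", "touch"}
# Hypothetical image that contains only the CLI tool.
SANDBOX_IMAGE = "mytool-sandbox"


def parse_reply(reply: str) -> list[str]:
    """Return the argv for a model reply, or raise ValueError if not allowed."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("Command"), str):
        raise ValueError("reply must be an object with a string 'Command' field")
    # The Description field is deliberately ignored: it is only the model's
    # claim about the command, so the decision is made on the command itself.
    try:
        argv = shlex.split(data["Command"])
    except ValueError as exc:
        raise ValueError(f"cannot tokenise command: {exc}") from exc
    if len(argv) < 2 or argv[0] != ALLOWED_TOOL or argv[1] not in ALLOWED_SUBCOMMANDS:
        raise ValueError(f"command not on the allowlist: {data['Command']!r}")
    return argv


def docker_argv(argv: list[str], workdir: str) -> list[str]:
    """Wrap an approved argv so it runs in a container with one directory mapped in."""
    return [
        "docker", "run", "--rm",
        "--network", "none",       # no network access
        "--read-only",             # root filesystem is read-only
        "-v", f"{workdir}:/work",  # only this directory is visible
        "-w", "/work",
        SANDBOX_IMAGE,
        *argv,
    ]
```

The resulting list is meant to be handed to something like `subprocess.run(docker_argv(argv, workdir), check=True)` as a list, without `shell=True`, so that characters like `;` or `|` in the model's output are passed to the tool as plain arguments instead of being interpreted by a shell.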