If you can't trace the output back to a source, then it's transformative use. If the output matches training data, then the tool needs to report the source, like a search engine does, and place all responsibility on the user.
And fuzzy code matching could be easily implemented by using a contrastive model similar to CLIP to embed code snippets.
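To make the matching idea concrete, here is a rough sketch of the lookup step only. Everything in it is an assumption for illustration: `embed` is a stand-in that hashes token trigrams into a vector instead of calling a trained contrastive encoder, and the names `find_sources`, `DIM`, the corpus paths, and the `0.8` threshold are all hypothetical. A real system would swap in a learned model and an approximate nearest-neighbor index.

```python
import hashlib
import math
import re
from collections import Counter

DIM = 256  # hypothetical embedding size


def tokenize(code):
    # Split into identifiers, numbers, and single punctuation characters,
    # so whitespace and indentation differences are ignored.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def embed(code, dim=DIM):
    """Stand-in for a learned encoder: hash token trigrams into a unit vector."""
    tokens = tokenize(code)
    grams = [" ".join(tokens[i:i + 3]) for i in range(max(1, len(tokens) - 2))]
    vec = [0.0] * dim
    for gram, count in Counter(grams).items():
        bucket = int(hashlib.md5(gram.encode()).hexdigest(), 16) % dim
        vec[bucket] += count
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a, b):
    # Both vectors are already unit length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


def find_sources(output, corpus, threshold=0.8):
    """Return (source_id, score) pairs at or above the threshold, best first."""
    query = embed(output)
    hits = [(sid, cosine(query, embed(code))) for sid, code in corpus.items()]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


# Hypothetical two-file "training corpus".
corpus = {
    "repo_a/sort.py": "def bubble(xs):\n    for i in range(len(xs)):\n        pass",
    "repo_b/io.py": "with open(path) as f:\n    data = f.read()",
}

# Same code as repo_a/sort.py, reformatted with different spacing.
generated = "def bubble( xs ):\n  for i in range( len(xs) ):\n      pass"
matches = find_sources(generated, corpus)
```

With this toy corpus, `matches` should contain only the `repo_a/sort.py` entry, since the reformatted snippet tokenizes identically. Catching renamed variables or reordered statements is where the hash-based stand-in breaks down and a learned embedding would have to do the work.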
That's the problem. The output of a generative model like Copilot usually can't be traced directly back to a single input.