This doesn’t sound right: if they don’t know the structure of the NN, how can they reconstruct anything from the weights alone? (Perhaps the structure is communicated along with the weights?)
Every agent training the model on their proprietary data must have access to the model form in some way (otherwise, how would they train it?).
For this reason, one must assume that the model form is known to the adversary.
With this, the question becomes: is it possible to reconstruct training data from a trained model? We already know that, at least for some image models, the answer to that question is "yes": https://arxiv.org/pdf/2301.13188.pdf
I don't think lossy compression is sufficient. The very first example in the paper I linked is clearly not identical to the original image (i.e., it is lossily compressed), yet it leaks a training image in a way that would be highly problematic in certain domains, e.g. medical imaging.
I see what you are saying. Agreed. It seems we need a set of patterns for NN models that reliably remove reversibility without affecting loss too drastically.
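One well-known pattern of that kind is DP-SGD-style training: clip each example's gradient to a fixed norm, then add Gaussian noise to the averaged update, bounding how much any single training example can influence the weights. Here is a minimal sketch of one such update step (the function name, toy gradients, and hyperparameter values are all illustrative, not from any particular library):

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.0, lr=0.1, rng=None):
    """Clip each example's gradient, average, add Gaussian noise, step.

    This is the core of DP-SGD: no single example (even an extreme
    outlier) can move the weights by more than a bounded amount, which
    limits how much of that example can be "memorized" and later
    reconstructed from the trained model.
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        scale = min(1.0, clip_norm / (norm + 1e-12))  # per-example clipping
        clipped.append(g * scale)
    mean_grad = np.mean(clipped, axis=0)
    # Noise scaled to the clipping norm hides any one example's contribution.
    noise = rng.normal(
        0.0,
        noise_multiplier * clip_norm / len(per_example_grads),
        size=mean_grad.shape,
    )
    return params - lr * (mean_grad + noise)

# Toy usage: one huge outlier gradient (a "memorizable" example) gets
# clipped down to norm <= clip_norm before it can dominate the update.
params = np.zeros(3)
grads = [np.array([100.0, 0.0, 0.0]),
         np.array([0.0, 0.5, 0.0])]
new_params = dp_sgd_step(params, grads)
print(np.linalg.norm(new_params - params))  # bounded update, despite the outlier
```

The trade-off is exactly the one raised above: clipping and noise degrade the loss somewhat, and the privacy/utility balance is tuned via `clip_norm` and `noise_multiplier`.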