Mitigating Embedding Leakage via Latent Disruption with Controlled Reconstruction

Publication details

Journal: Transactions on Machine Learning Research (TMLR), 2026
International standard numbers:
- Electronic: 2835-8856
Links:
- DOI: openreview.net/forum?id=nZWBrxJyrS
- ARCHIVE: hdl.handle.net/11250/5527300

Pre-trained encoders produce semantically rich latent embeddings, but these embeddings may expose unintended information to malicious inference or exploitation. We propose SEAL, a framework that mitigates embedding leakage by disrupting latent representations according to information-theoretic principles, reducing the risk of misuse while still enabling controlled reconstruction for trusted users. SEAL learns to encode controlled perturbations by minimizing the Matrix Norm-based Quadratic Mutual Information (MQMI) functional between original and perturbed embeddings within a hyperspherical latent space. A private decoder is trained jointly with the SEAL encoder to reconstruct the original data, so that reconstruction is available only to authorized users under an access-controlled setting. Extensive experiments on vision and text datasets demonstrate that SEAL reduces latent leakage, weakens the effectiveness of the evaluated inference attacks, and preserves reconstruction under the considered setting.
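
To make the idea of measuring dependence between original and perturbed embeddings on a hypersphere more concrete, here is a minimal sketch. It is not the SEAL method: the abstract does not define the MQMI functional, so the sketch substitutes a different, well-known kernel dependence measure (the biased HSIC estimator with a Gaussian kernel), and the perturbation is a fixed interpolation with noise rather than a learned encoder. The names `normalize`, `gram`, `center`, `hsic`, and `perturb`, as well as the dimension, sample size, and kernel width, are all illustrative assumptions.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit hypersphere (L2 normalization)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def gram(points, sigma=1.0):
    """Gaussian-kernel Gram matrix of a list of vectors."""
    n = len(points)
    return [
        [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                / (2.0 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


def center(K):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 11^T."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]  # equals the column means, K is symmetric
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased HSIC estimate, trace(K H L H) / n^2 (a stand-in, NOT MQMI)."""
    n = len(xs)
    Kc = center(gram(xs, sigma))
    L = gram(ys, sigma)
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / n ** 2


def perturb(x, noise, alpha):
    """Hypothetical fixed perturbation: mix x with noise, then re-normalize."""
    mixed = [(1.0 - alpha) * a + alpha * b for a, b in zip(x, noise)]
    return normalize(mixed)


rng = random.Random(0)  # fixed seed so the demo is repeatable


def random_unit(dim):
    """Draw a random point on the unit hypersphere."""
    return normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])


# Assumed toy setup: 64 embeddings of dimension 8 on the unit sphere.
X = [random_unit(8) for _ in range(64)]
N = [random_unit(8) for _ in range(64)]

dep_clean = hsic(X, [perturb(x, e, 0.0) for x, e in zip(X, N)])  # no perturbation
dep_noisy = hsic(X, [perturb(x, e, 1.0) for x, e in zip(X, N)])  # pure noise

print(f"dependence without perturbation: {dep_clean:.4f}")
print(f"dependence with full perturbation: {dep_noisy:.4f}")
```

In this toy setting the dependence score is higher when the perturbed embedding equals the original than when it is replaced by independent noise. A learned encoder would instead be trained to lower such a score while a private decoder remains able to invert the perturbation, which this sketch does not attempt.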