Vertical federated learning (VFL) is a popular approach to multi-party
collaboration in training machine learning models. Industrial frameworks adopt
secure multi-party computation methods such as homomorphic encryption to
guarantee data security and privacy. However, a line of work has revealed that
there are still leakage risks in VFL. The leakage is caused by the correlation
between the intermediate representations and the raw data. Due to the powerful
approximation ability of deep neural networks, an adversary can capture the
correlation precisely and reconstruct the data. To counter the threat of
data reconstruction attacks, we propose a hashing-based VFL framework, called
HashVFL, that cuts off this reversibility directly. The one-way nature of
hashing allows our framework to block all attempts to recover data from hash
codes. However, integrating hashing also brings challenges, e.g., the loss
of information. This paper identifies and addresses three challenges in
integrating hashing: learnability, bit balance, and consistency. Experimental
results demonstrate HashVFL’s effectiveness in preserving the main task’s
performance and defending against data reconstruction attacks. Furthermore, we
analyze its potential value in detecting abnormal inputs. In addition, we
conduct extensive experiments to demonstrate HashVFL’s generalizability across
various settings. In summary, HashVFL provides a new perspective on
protecting multiple parties’ data security and privacy in VFL. We hope our study
will attract more researchers to expand the application domains of
HashVFL.
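
To make the one-way property and the bit-balance challenge concrete, the following is a minimal sketch, not the paper's implementation. It assumes a sign-based binarization of each party's embedding, which the abstract does not specify; the names `binarize` and `bit_balance` are hypothetical. It shows that distinct embeddings can collapse to the same code, and how the balance of each bit could be measured over a batch.

```python
import random

def binarize(embedding):
    """Assumed hashing step: map each real value to +1 or -1 by its sign."""
    return [1 if x >= 0 else -1 for x in embedding]

def bit_balance(codes):
    """Per-bit mean over a batch of codes; a value near 0 means the bit
    is +1 about as often as it is -1."""
    n = len(codes)
    width = len(codes[0])
    return [sum(code[i] for code in codes) / n for i in range(width)]

# Many-to-one mapping: two different embeddings yield the same code, so the
# code alone does not determine the embedding that produced it.
a = [0.3, -1.2, 2.5, -0.1]
b = [4.0, -0.01, 0.7, -9.0]
same_code = binarize(a) == binarize(b)

# Bit balance on a batch of random stand-in embeddings (illustrative data only).
random.seed(0)
batch = [[random.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
codes = [binarize(e) for e in batch]
balance = bit_balance(codes)
```

The sign function has zero gradient almost everywhere, which is one way to read the learnability challenge: a model trained end to end through such a step needs some surrogate for the gradient. Which surrogate HashVFL uses is not stated in this abstract.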