This is a course project for CISC 870. Machine learning is used across many domains and often involves highly confidential data. When a model is trained on sensitive user data, that reliance makes it difficult to build machine learning workflows without compromising user privacy and confidentiality, and this concern applies to any machine learning method or model that handles sensitive data. Based on the findings of this project, we recommend adopting a customized homomorphic encryption scheme to mitigate this risk. The scheme encrypts user data using a combination of public and private keys. In this project, I explored different types of homomorphic encryption schemes and experimented with one of them on a sensitive dataset using linear regression. The encryption was observed to preserve the confidentiality of both the input test data and the regression model's outputs, protecting the machine learning model and any associated sensitive user data against model inversion and membership inference attacks.
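To give a concrete sense of what a homomorphic encryption scheme looks like, the sketch below implements the Paillier cryptosystem, a classic additively homomorphic scheme, in plain Python. This is an illustrative assumption, not the code in this repository, and the small primes are for demonstration only (real deployments use primes of roughly 1024 bits or more).

```python
# Minimal sketch of the Paillier additively homomorphic cryptosystem.
# Toy key sizes for illustration only -- NOT secure.
import math
import random

def keygen(p, q):
    """Generate a Paillier key pair from two primes."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # modular inverse; valid when g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt integer m (0 <= m < n) with fresh randomness r."""
    n, g = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    """Recover m via L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) / n."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

pub, priv = keygen(1789, 1861)     # toy primes
c1, c2 = encrypt(pub, 42), encrypt(pub, 58)
# Homomorphic addition: multiplying ciphertexts adds the plaintexts.
c_sum = (c1 * c2) % (pub[0] ** 2)
print(decrypt(pub, priv, c_sum))   # 100
```

The key property shown here is that anyone holding only the public key can combine ciphertexts (multiplication modulo n^2 corresponds to addition of plaintexts) while only the private-key holder can decrypt the result.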
- Implementing_HE_from_scratch.py contains the from-scratch Python implementation of HE.
- HE_in_ML.ipynb demonstrates the use of HE in machine learning.
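To illustrate how an additively homomorphic scheme can support linear-regression inference on encrypted inputs, here is a hypothetical sketch using Paillier: the server computes the weighted sum of the client's features without ever seeing them. The key sizes, weights, and features below are made-up toy values, and the model weights are scaled to integers because Paillier operates on integers.

```python
# Hypothetical sketch: private linear-regression scoring with Paillier.
# Toy parameters only -- NOT secure.
import math
import random

def keygen(p, q):
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    return (n, n + 1), (lam, pow(lam, -1, n))

def encrypt(pub, m):
    n, g = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

pub, priv = keygen(10007, 10009)
weights = [3, 5, 2]                # model weights, scaled to integers
features = [10, 20, 30]           # client's sensitive input
enc_feats = [encrypt(pub, x) for x in features]

# Server side, public key only: Enc(x)^w = Enc(w*x), and the product of
# ciphertexts encrypts the sum -- so this yields Enc(sum(w_i * x_i)).
n2 = pub[0] ** 2
enc_score = 1
for c, w in zip(enc_feats, weights):
    enc_score = (enc_score * pow(c, w, n2)) % n2

print(decrypt(pub, priv, enc_score))  # 3*10 + 5*20 + 2*30 = 190
```

Only the client, holding the private key, can decrypt the final score, which is what lets the regression model run over data it never sees in the clear.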