If you just read our previous post on data brokers, you might feel hopeless about the state of our security. Not only is personal information being sold and stolen, but the data also isn’t sufficiently anonymized. Last October, we reported that machine learning algorithms can unblur images and even reverse engineer the models trained on a dataset.
One way to mitigate this issue is to encrypt the data. Encryption, however, poses problems for big data analytics.
Previously, applying machine learning algorithms to encrypted data sets did not yield accurate results. But now, Microsoft Research has presented CryptoNet, a neural network that is compatible with encrypted data.
The key idea is to combine homomorphic encryption with neural networks, while not sacrificing speed or accuracy.
Naive encryption protocols of the past had some weak points. For example, if you requested a service from a server, you would send encrypted data, the server would use the decryption key to recover the raw values, and it would finally return encrypted results. The raw data was therefore exposed on the server while it was being processed.
Homomorphic encryption patches this weakness by taking away the decryption key from the server and making the server process the information in its encrypted form.
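To make the idea concrete, here is a minimal sketch of a server computing on data it cannot read. It uses textbook Paillier encryption, which is only additively homomorphic and is not the scheme used in the Microsoft paper; I chose it because it fits in a few lines. The primes are tiny and the function names (`keygen`, `encrypt`, `decrypt`, `server_weighted_sum`) are made up for illustration, so treat this as a toy and not as anything secure.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier key generation from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)   # public key, private key


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Decrypt with the private key; only the client can do this."""
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def server_weighted_sum(n, ciphertexts, weights):
    """Runs on the server, which holds no private key.

    Multiplying ciphertexts adds the hidden plaintexts, and raising a
    ciphertext to a power w multiplies its hidden plaintext by w.
    """
    n2 = n * n
    acc = 1  # 1 is a trivial encryption of 0
    for c, w in zip(ciphertexts, weights):
        acc = acc * pow(c, w, n2) % n2
    return acc


# Client side: encrypt the inputs and send only ciphertexts.
n, priv = keygen()
encrypted = [encrypt(n, m) for m in [3, 5, 7]]

# Server side: compute 2*3 + 4*5 + 1*7 without seeing 3, 5, or 7.
encrypted_result = server_weighted_sum(n, encrypted, [2, 4, 1])

# Client side: decrypt the answer.
print(decrypt(n, priv, encrypted_result))  # 33
```

A weighted sum like this is what a single linear layer of a neural network computes. A network also needs to multiply encrypted values together, which Paillier cannot do and which requires a richer homomorphic scheme.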
A homomorphic encryption system provides major advantages from a security standpoint. Data science can now work on sensitive information without compromising privacy. This has a significant impact on the healthcare and finance sectors, where data breaches can be devastating.
The problem with homomorphic encryption, however, was that it was 100 trillion times slower than unencrypted systems, rendering big data analytics infeasible.
The key idea in Microsoft’s paper is to match the operations used in homomorphic encryption with those used in neural networks, which increases parallelism and, ultimately, speed.
Knowing that homomorphic encryption works with polynomials, Microsoft tweaked the neural network to only work with polynomials. This allows a significant speedup using single-instruction-multiple-data (SIMD) instructions and is compatible with the way GPUs are employed for machine learning.
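Here is a rough sketch of what "only work with polynomials" can look like in practice. It assumes a squaring activation in place of ReLU or sigmoid, which is one polynomial choice and not necessarily the exact network in the paper. The weights, the `AddMulOnly` class, and the helper names are all made up for illustration. `AddMulOnly` stands in for a ciphertext: it allows addition and multiplication and nothing else.

```python
class AddMulOnly:
    """Stand-in for a ciphertext: supports + and *, but no comparisons."""

    def __init__(self, value):
        self._value = value

    def __add__(self, other):
        return AddMulOnly(self._value + _raw(other))

    __radd__ = __add__

    def __mul__(self, other):
        return AddMulOnly(self._value * _raw(other))

    __rmul__ = __mul__


def _raw(x):
    """Unwrap an AddMulOnly value; plain numbers pass through unchanged."""
    return x._value if isinstance(x, AddMulOnly) else x


def dense(x, weights, bias):
    """Affine layer: uses only additions and multiplications."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def square(v):
    """Polynomial activation: a single multiplication per value."""
    return [a * a for a in v]


def relu(v):
    """Standard activation: needs a comparison, so it is not a polynomial."""
    return [max(0, a) for a in v]


def forward(x, params):
    """Two-layer network whose output is a polynomial in the inputs."""
    (w1, b1), (w2, b2) = params
    return dense(square(dense(x, w1, b1)), w2, b2)


# Hypothetical integer weights, chosen only to keep the arithmetic readable.
PARAMS = (
    ([[1, -2], [3, 1]], [0, -1]),
    ([[2, 1]], [5]),
)

print(forward([2, 3], PARAMS))  # [101]

# The same network runs unchanged on values that only allow + and *.
hidden = forward([AddMulOnly(2), AddMulOnly(3)], PARAMS)
print(hidden[0]._value)  # 101

# ReLU fails on such values because it has to compare them with zero.
try:
    relu([AddMulOnly(2)])
except TypeError as err:
    print("relu is not available:", err)
```

The polynomial network never inspects its inputs, so every step maps onto an operation that a homomorphic scheme can perform on ciphertexts.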
CryptoNet was tested on the MNIST database of handwritten digits. It achieved 99% accuracy and made roughly 59,000 predictions per hour, which makes it feasible as a commercial service.
It’s been nearly a year since this paper was published, but CryptoNet has not yet been integrated into commercial systems. Still, Microsoft’s demonstration of combining homomorphic encryption with neural networks points to a promising future for secure data analytics.