I am a Ph.D. student at the University of Wisconsin-Madison majoring in Computer Sciences. My research interests lie at the intersection of Machine Learning and Security. I work with Prof. Somesh Jha and Prof. Kassem Fawaz at MADS&P. I also collaborate with Prof. Earlence Fernandes.
I completed my undergraduate studies at the Indian Institute of Technology Delhi, majoring in Electrical Engineering with a minor in Computer Science.
PhD in Computer Sciences, 2019 - Present
University of Wisconsin-Madison
B.Tech. in Electrical Engineering, 2014 - 2018
Indian Institute of Technology Delhi
Detecting deepfakes remains an open problem. Recent detection methods fail against an adversary who adds imperceptible adversarial perturbations to the deepfake to evade detection. We propose Disjoint Deepfake Detection (D3), the first adversarially robust deepfake detector to the best of our knowledge. D3 uses an ensemble of models over disjoint subsets of the frequency spectrum to significantly improve robustness beyond de facto solutions such as adversarial training. Our key insight is to leverage a redundancy in the frequency domain and apply a saliency partitioning technique to disjointly distribute individual frequency components across multiple models. We formally prove that these disjoint ensembles lead to a reduction in the dimensionality of the input subspace in which the adversarial deepfakes lie. We then empirically validate the D3 method against white-box attacks and black-box attacks, and find that D3 significantly outperforms existing state-of-the-art ensemble defenses in deepfake detection against an adaptive adversary.
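The core mechanism behind D3 is assigning each frequency component of the input to exactly one ensemble member. A minimal sketch of that idea is below; it uses a random round-robin assignment as a stand-in for the paper's saliency partitioning, which the abstract does not specify in detail, so the assignment rule and function names here are illustrative assumptions.

```python
import numpy as np

def disjoint_frequency_masks(shape, n_models, seed=0):
    """Partition every 2-D frequency bin across n_models disjoint masks.

    NOTE: random round-robin assignment is a hypothetical stand-in for
    the saliency partitioning used by D3.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(shape[0] * shape[1])
    masks = np.zeros((n_models,) + shape, dtype=bool)
    for i, idx in enumerate(order):
        masks[i % n_models].flat[idx] = True  # each bin goes to one model
    return masks

def project_to_subset(image, mask):
    """Keep only the frequency components assigned to one ensemble member."""
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * mask))
```

Because the masks are disjoint and jointly cover the spectrum, summing the per-model projections reconstructs the original image, while each individual model only ever sees its own frequency slice.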
Physical adversarial examples for camera-based computer vision have so far been achieved through visible artifacts – a sticker on a Stop sign, colorful borders around eyeglasses, or a 3D-printed object with a colorful texture. An implicit assumption here is that the perturbations must be visible so that a camera can sense them. By contrast, we contribute a procedure to generate, for the first time, physical adversarial examples that are invisible to human eyes. Rather than modifying the victim object with visible artifacts, we modify the light that illuminates the object. We demonstrate how an attacker can craft a modulated light signal that adversarially illuminates a scene and causes targeted misclassifications on a state-of-the-art ImageNet deep learning model. Concretely, we exploit the radiometric rolling shutter effect in commodity cameras to create precise striping patterns that appear on images. To human eyes, it appears that the object is simply illuminated, but the camera creates an image with stripes that cause ML models to output the attacker-desired classification. We conduct a range of simulation and physical experiments with LEDs, demonstrating targeted attack rates up to 84%.
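The striping effect arises because a rolling-shutter camera exposes each image row at a slightly different instant, so a light source flickering faster than the frame rate lands differently on different rows. The toy model below makes that mapping concrete; it treats each row as sampling the LED at a single instant (real sensors integrate over an exposure window), and all parameter values are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def rolling_shutter_rows(n_rows, row_time_s, led_freq_hz, duty=0.5):
    """Toy per-row brightness under a square-wave flickering LED.

    Each row i is exposed at time i * row_time_s; rows exposed while the
    LED is "on" appear bright, producing horizontal stripes. This is an
    instantaneous-sampling simplification of a real rolling shutter.
    """
    t = np.arange(n_rows) * row_time_s       # exposure instant of each row
    phase = (t * led_freq_hz) % 1.0          # position within the LED cycle
    return (phase < duty).astype(float)      # 1.0 = LED on, 0.0 = LED off
```

For example, at 10 µs per row and a 1 kHz LED, one flicker period spans 100 rows, so a 100-row frame shows one bright band followed by one dark band; an attacker who controls the LED waveform controls where those bands fall.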
Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We propose a systems-oriented defense against this class of attacks and demonstrate its functionality for Amazon Alexa. We ensure that only the skills a user intends to use execute in response to voice commands. Our key insight is that we can interpret a user's intentions by analyzing their activity on counterpart systems on the web and smartphones. For example, the Lyft ride-sharing Alexa skill has an Android app and a website. Our work shows how information from counterpart apps can help reduce ambiguities in the skill invocation process. We build SkillFence, a browser extension that existing voice assistant users can install to ensure that only legitimate skills run in response to their commands. Using real user data from MTurk (N = 116) and experimental trials involving synthetic and organic speech, we show that SkillFence provides a balance between usability and security by securing 90.83% of skills that a user will need with a false-acceptance rate of 19.83%.
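The matching step at the heart of this defense can be sketched as follows. The actual system links skills to counterpart apps and websites through richer signals than the simple developer-name equality used here, so the data shapes, field names, and tie-breaking rule in this snippet are hypothetical simplifications.

```python
def resolve_skill(candidate_skills, counterpart_developers):
    """Pick the one candidate skill backed by a counterpart the user has.

    candidate_skills: skills whose invocation phrases match the voice
    command (confusion attacks exploit this ambiguity).
    counterpart_developers: developers of apps/sites the user actually
    uses, gathered from their phone and browsing activity.

    NOTE: matching on developer-name equality is a hypothetical
    simplification of the real skill-to-counterpart linking.
    """
    matched = [s for s in candidate_skills
               if s["developer"] in counterpart_developers]
    # Execute only when exactly one legitimate candidate remains;
    # otherwise refuse, rather than let a squatting skill run.
    return matched[0] if len(matched) == 1 else None
```

The conservative "exactly one match or refuse" rule reflects the usability/security trade-off the abstract quantifies: refusing ambiguous invocations blocks squatting skills at the cost of occasionally rejecting legitimate ones.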