Experience
Microsoft, Data Scientist Intern (Applied Research)
May, 2024 - July, 2024
Developed a metadata-based classifier for sensitive documents in a positive-unlabeled (PU) learning setting, a form of semi-supervised learning, with only 1% of samples labeled. Applied traditional and state-of-the-art techniques, including non-negative risk estimation. Authored a paper on the methodology, submitted to the Microsoft Journal of Applied Research.
Google, Software Engineer
July, 2021 - July, 2023
Worked on Cloud Data Fusion, Google Cloud's cloud-native enterprise data integration service for building and managing data pipelines.
Siemens, Research Intern
Aug, 2020 - Dec, 2020
Worked with the Research and Automation Team to develop an interactive tool for querying and traversing knowledge graphs effectively. Leveraged technologies such as the Neo4j graph database, Blazegraph, SPARQL, Cypher, and Protégé.
D. E. Shaw and Co., Software Engineering Intern
Apr, 2020 - Jun, 2020
Extended the authorisation scheme of the Kafka cluster, using Python, to grant access to resources based on UNIX group and mailing-list membership. Created RESTful services that let developers grant resource access to other developers and let non-superusers create topics with custom configurations.
Google Summer of Code, Student Developer at Mifos Initiative
May, 2019 - Aug, 2019
Developed a computer vision-based Android app in Kotlin that lets users take pictures of households, detects objects in them, and fills in the Poverty Probability Index (PPI) survey using Google Cloud APIs. Handled the small dataset size by applying image augmentation and pre-processing techniques using TensorFlow and Scikit-learn.
Indian Institute of Science, Bangalore, Summer Research Fellow
May, 2019 - June, 2019
Worked on improving a statistical algorithm that removes noise from and refines images captured in low light. Decreased execution time by 83% by restructuring iterative sections as matrix operations in the MATLAB implementation.
Optym, India, Operations Research Intern
Dec, 2018
Redesigned the machine learning model for carrier truck halt-time prediction, improving accuracy by 16%. Built with Python and the Scikit-learn library.