Data Engineer
CV
Professional Summary
With 5+ years of experience in the IT industry, specialising in the telecom domain, I have delivered multiple models, prediction dashboards and tools that help teams gain insight from data in the most accessible format.
I have experience designing and developing an interactive dashboard for tracking customer churn and providing actionable insights to sales and marketing teams.
Skills
Programming Languages:
Python, R, SQL, Java, Unix/Linux Shell Scripting, ML, NLP, Django
Machine Learning Libraries:
Pandas, matplotlib, NumPy, seaborn, Keras, TensorFlow, NLTK, OpenCV, scikit-learn, caret
Data Warehouse & BI Tools:
MySQL, MS SQL Server, MS Access, Power Pivot, Power Query, Vertica, BigQuery, Informatica, MicroStrategy, Cloud Firestore
Data Visualisation Tools & Cloud Tech:
Looker, GCP, Docker, Kubernetes, AWS, Tableau, Google Analytics, Power BI
Big Data / Analytics:
Hive, Hadoop, Spark, TensorFlow
Other Tools / Skills:
Machine Learning, NLP, Jira, Selenium, SoapUI, Jenkins, Kafka, UiPath, MS Excel Analytics
Work Experience
February 2023 - April 2023
Machine Learning Engineer
Omdena
Mentored and guided junior team members on data science best practices and ETL approaches
Designed and implemented data ingestion pipelines to collect vaccine-related data from various sources, including online and offline tools, clinical trials, public health databases, and government reports
Used machine learning models such as CNNs, RNNs, Random Forest and SVM to predict the best-matching epitopes for vaccine creation with 70% accuracy
June 2022 - August 2022
Data Science Intern
SwitchPitch
Crawled, scraped, transformed, and organized data from over 11,500 startup websites, and ran sentiment analysis on it, using automated Python scripts, NLTK and Flair
Auto-categorized startup websites into 85 categories based on their content and data domain types, using Python, Beautiful Soup, APIs, NLTK, IBM Watson NLP and SVM
Streamlined web-crawling scripts to collect data from over 20,000 websites every month without any manual intervention using AWS EC2, AWS EKS, shell scripts, Python and MySQL
Supervised the Trending Words Spotting project to identify trending technologies across more than 15,000 startups using Named Entity Recognition (NER) and Topic Modeling
January 2022 - Present
Graduate Research Analyst
NEXIS | Syracuse University
Performed Formula One data analysis and prediction using drivers' standings in every race, based on each driver's performance on that racetrack
Analyzed track-specific driver performance using clustering (K-means) on the racetracks and classification models such as Decision Trees, XGBoost, Random Forests and the Naïve Bayes classifier
March 2020 - July 2021
Data Analyst
Amdocs
Built an end-to-end forecasting dashboard for order management with the Django framework and identified 10+ network/order failures in the production environment, which prevented 6 deployment failures in a year
Developed an order classification model using Python, SQL and machine learning to support stuck-order automation for the application support team, reducing manual effort by 300 hours per month
Collaborated with different business units on exploratory data analysis to understand order management system metrics such as regional order management issues, average order processing time and frequently stuck orders
February 2017 - February 2020
Data Developer
Tech Mahindra
Eliminated a 6 hrs/day delay in business report generation by developing an automated report generation model
Established an interactive dashboard for tracking customer churn and providing actionable insights to sales and marketing teams
Pioneered business operation automation analytics to provide tracking insights across 100+ databases
Cut data extraction time from 2 hours to 30 minutes by redesigning the application in the Django framework
Awarded for building a database management and visualization tool that prevents failures in a DB with 1 billion+ records
Education
2021-2023
Syracuse University
2012-2016
Savitribai Phule Pune University
Interests
Some of the areas I’m interested in exploring are data science and analytics, machine learning, artificial intelligence and big data. What intrigues me is how big data works in real time using technologies like Hadoop and Spark. I’m particularly keen on exploring the use of data science in the automobile industry.