About

Learn more about me

Data Scientist

I am an experienced software developer with strong problem-solving skills and proven experience in designing and building software in a test-driven environment. I have excellent analytical skills and experience working with large amounts of structured and unstructured data.

  • Birthday: October 28, 1996
  • Website: vikingpathak.github.io
  • Phone: +91 86527 96533
  • City: Mumbai, India
  • Age: 26
  • Degree: PGDM/MS
  • Email: amit_pathak@outlook.in
  • Experience: 5.5+ years

I also write technical blogs and articles on various platforms and have published more than 50 articles. I was among the top 10% of Stack Overflow users by reputation earned in 2021, 2022, and 2023.

Education

PGDM/MS in Data Science, Big Data & Business Analytics

2020 - 2021

Aegis School of Data Science Mumbai

Credits Earned: 36

Bachelor of Electronics & Telecommunication Engineering (B.E.)

2014 - 2018

Mumbai University

CGPA: 7.75

Higher Secondary School Certificate (H.S.C.)

2012 - 2014

Maharashtra State Board

Percentage: 75%

Secondary School Certificate (S.S.C.)

2000 - 2012

Maharashtra State Board

Percentage: 86%

Professional Experience (5.5+ years)

Senior Data Science & Operations Research Analyst

October 2022 - Present

Wipro, Pune

Deputy Manager, Analytics Services

January 2021 - October 2022

Tata Insights & Quants, Bangalore

Senior Software Developer, Data Science

September 2020 - January 2021

Pegasus Infocorp, Mumbai

Python Developer

June 2020 - September 2020

UnoLigo Solutions (now UnoBot), Mumbai

Systems Engineer

June 2018 - January 2020

Infosys Limited, Bangalore

Skills

Python Programming 100%
DBMS & SQL 90%
Data Analysis & Visualization 90%
Machine Learning 80%
Data Science Tools & Libraries 80%
Python API Framework 80%

Interests

AIML Modeling

Data Visualization

Data Analytics

Statistical Modeling

Big Data

Computer Vision

Technical Blogs

API Development

Projects

Check My Work

Highlights

Overview

  • Built scalable solutions using MLOps practices to deploy and maintain machine learning models in production reliably and efficiently, and to produce richer, more consistent insights with machine learning.
  • Experience in using multiple data science methodologies to solve complex business problems (e.g., statistical analysis, machine learning techniques, data modeling, financial analysis, demand modeling, etc.)
  • Able to create APIs in Flask/Django, build data models using SQLAlchemy, apply OOP concepts and data structures, write unit tests, and write modular Python code.
  • Proficient in developing Python code that meets pylint standards, is protected against SQL injection, and passes Bandit security checks.
  • Familiar with popular Python frameworks and libraries for data analysis and related work, such as pandas, numpy, matplotlib, plotly, Flask, scikit-learn, xgboost, beautifulsoup, mlflow, tkinter, etc.
  • Experience in gathering, cleansing, and analyzing data from multiple sources (PostgreSQL, APIs, MongoDB, etc.) using automated techniques, and in standardizing the data structure.
  • Experience in visualizing and presenting data to stakeholders using Tableau, Advanced Excel, Chart.js, PowerPoint, etc.
  • Proven ability to communicate verbally and in writing to technical peers and leadership teams with various levels of technical knowledge, educating them about data insights and data-driven recommendations.
  • Proficient with version control tools such as Git, with experience working on Azure cloud and Linux platforms.
  • Experience with Agile software development and Scrum, and in writing detailed technical documents, along with excellent communication and teamwork skills.

Operations Research Projects

Disruption Recovery Solver for Supply Chain Disruptions

  • The project aims to provide organizations with a powerful tool to proactively manage and recover from supply chain disruptions.
  • By optimizing recovery strategies using OR methods, businesses can reduce downtime, minimize financial losses, and maintain customer satisfaction even in the face of unforeseen disruptions.
  • Created a solver that generates optimal recovery strategies in the event of a disruption, taking into account factors such as inventory management, transportation rerouting, and resource allocation.

Inventory Optimization using OR tools

  • The project improves inventory management practices, resulting in cost savings and improved operational efficiency.
  • The project combines a wide array of OR tools and techniques, such as mathematical modeling, optimization algorithms, simulation, and data analytics, to achieve the objectives.
  • The objectives include reducing the risk of overstocking and understocking, aggregating orders from multiple vendors, estimating order delivery, forecasting orders, identifying the best warehouse routes, and recommending the best-fitting package.

Data Science Projects

Digitalizing Contracts & Services Procurement

Sentence Transformer
RapidMiner
Python

  • As part of digitizing contracts and services procurement, built an AI search feature that uses sentence transformer models to return relevant results when a user searches for a contract or service.
  • The system is built and deployed on the RapidMiner platform.

Vendor Procurement Analysis & Vendor Recommendation Engine

MLOps
XGBoost
LightGBM
Statistical Models

  • Implemented end-to-end software solutions based on MLOps lifecycle management principles using the Databricks MLflow library.
  • Analyzed vendors on 7 metrics, including vendor popularity, delivery duration, item pricing, and market reputation. Each vendor is compared with its counterparts based on the combinations of items they offer.
  • Vendors were then ranked accordingly. Estimated delivery time is modeled with XGBoost and LightGBM, market reputation is built on an NLP engine, and the remaining metrics use heuristic models.

Discount Recommendation for Interest-based Payout for Business Sellers

Linear Regression
Statistical Model
Clustering
Docker

  • Built a system that provides a dynamic interest rate for early payments to a business seller, based on the seller's firmographics and the type of historical business the seller has provided.
  • This helps businesses with ample cash flow save substantially on operational costs and provides a reward mechanism that encourages seller participation.
  • The system uses an SVM-based clustering approach on top of a statistical model that calculates 16 seller metrics from each seller's historical time-bound data.
  • The system was further enhanced with linear regression models that predict whether a seller will accept a given interest-based payout offered by a business, in order to maximize seller participation.
  • The system is designed end-to-end and deployed to the target environment using Docker containers.

Spare Parts Forecasting to improve Supply Chain Efficiency

Survival Analysis
Forecasting

  • Designed a survival model to learn the time horizon for appliance breakdowns and forecast spare parts requirements for the next six months.
  • The final solution is implemented with four different models whose inputs and outputs are interconnected.
  • The first model is a standard ARIMA model that forecasts primary appliance sales. The second is an algorithm built on cohort analysis that forecasts secondary sales. The third predicts spare parts requirements using a survival model, the Cox proportional hazards model. The final model is a feedback model that simulates the existing inventory, its location, and transit times, and adjusts the final quantity of spare parts to be ordered, stored, or manufactured accordingly.

Time Series Forecasting for Metal Procurement

Timeseries Forecasting
ARIMAX
SARIMAX

  • Enhanced the existing time series VAR models with a neural network-based approach to achieve higher accuracy.
  • Maintained the current models by regularly updating data, analyzing changing trends, and sharing reports with the client team to support better procurement decisions.
  • Automated the complete data download and processing stage, reducing the manual effort from 6 hours to a few seconds.
  • Refactored the existing code and built libraries and modules to make the models more efficient to use.

Article & News Recommendation Model using Heuristics Approach

Heuristics Modeling
NLP
Recommendation Engine
Docker

  • Built a heuristics-based model that dynamically scores news and articles based on the profile of the logged-in user.
  • Scores are computed with a ranking-based algorithm for articles and a highest-score approach for news, considering the age, category, sub-category, business vertical, user horizontal, word semantics, etc. of each news item and article on the platform.
  • Designed the complete end-to-end solution and deployed it to the target environment using Docker containers.
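
The scoring idea described above can be illustrated with a minimal sketch. The project's actual features and weights are not given here, so everything in this example is an assumption for illustration: the `Item` fields, the 7-day half-life, and the flat category bonus are all hypothetical. It shows articles being ranked by score and a single highest-scoring news item being picked.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    # Hypothetical fields; the real platform considers more attributes.
    title: str
    category: str
    published: date


def score(item, preferred_categories, today, half_life_days=7.0):
    """Hypothetical heuristic: recency decay plus a category-match bonus."""
    age_days = max((today - item.published).days, 0)
    # Recency halves every `half_life_days` days.
    recency = 0.5 ** (age_days / half_life_days)
    # Flat bonus when the item matches one of the user's preferred categories.
    category_bonus = 1.0 if item.category in preferred_categories else 0.0
    return recency + category_bonus


def rank_articles(items, preferred_categories, today):
    """Ranking-based approach: all articles, best score first."""
    return sorted(
        items,
        key=lambda it: score(it, preferred_categories, today),
        reverse=True,
    )


def top_news(items, preferred_categories, today):
    """Highest-score approach: the single best news item."""
    return max(items, key=lambda it: score(it, preferred_categories, today))
```

In this sketch a week-old item in a preferred category outranks a brand-new item in another category, which is the kind of trade-off a heuristic model has to tune.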

Intruder Detection System GUI Application - Security Surveillance

GUI Programming
Computer Vision
PyQt5
Security

  • Developed a PyQt5 desktop application for intruder detection surveillance using OpenCV.
  • The application captures a series of images when movement is observed in the monitored room during a given timeframe and sends a GIF to the authorized person, who can then take security measures if required.
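
One common way to detect movement between frames is frame differencing. The application's actual OpenCV pipeline is not described here, so the following is only a pure-Python sketch of that general technique, assuming grayscale frames given as nested lists of integers. The function names and both thresholds are hypothetical.

```python
def changed_fraction(prev_frame, curr_frame, pixel_threshold=25):
    """Fraction of pixels whose grayscale value changed by more than pixel_threshold."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def motion_detected(prev_frame, curr_frame, pixel_threshold=25, area_threshold=0.02):
    """True when enough of the frame changed to count as movement (assumed thresholds)."""
    return changed_fraction(prev_frame, curr_frame, pixel_threshold) > area_threshold


def frames_to_capture(frames, **kwargs):
    """Indices of frames that show movement relative to the previous frame."""
    return [
        i
        for i in range(1, len(frames))
        if motion_detected(frames[i - 1], frames[i], **kwargs)
    ]
```

The frames selected this way would be the ones saved and assembled into the GIF. The area threshold keeps small pixel noise from triggering a capture.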

Blog

My Writings

Contact

Contact Me

My Address

Mumbai, Maharashtra, India

Social Profiles

Email Me

amit_pathak@outlook.in | apathak092@gmail.com

Call Me

+91 86527 96533
