Jersey City, NJ

Mukul Desai

I'm a Data Engineer

Data Engineer building production ETL/ELT pipelines, cloud data warehouses on AWS and Snowflake, and AI-augmented analytics across healthcare and financial services. Comfortable across Airflow and dbt orchestration, dimensional modeling, RAG systems with LangChain, and stakeholder-facing Power BI and Tableau dashboards.

Available for work

Technical Skills

My expertise spans data engineering, machine learning, and analytics technologies.

Programming Languages

Python: 95%
SQL: 90%
R: 85%
JavaScript: 80%
Java: 75%
Scala: 70%

Data Engineering

Apache Spark: 90%
Apache Kafka: 85%
Apache Airflow: 88%
Hadoop: 80%
ETL/ELT: 92%
Data Warehousing: 87%

Machine Learning & AI

TensorFlow: 85%
PyTorch: 80%
Scikit-learn: 90%
OpenAI API: 88%
NLP: 82%
Computer Vision: 78%

Analytics & Visualization

Tableau: 90%
Power BI: 85%
Plotly: 88%
D3.js: 75%
Matplotlib: 92%
Seaborn: 90%

Cloud Platforms

AWS: 85%
Google Cloud: 80%
Azure: 82%
Snowflake: 88%
Databricks: 85%
Docker: 80%

Tools & Technologies

Git: 90%
Jupyter: 95%
VS Code: 92%
Linux: 85%
MongoDB: 80%
PostgreSQL: 88%

Work Experience

Building production data systems across healthcare and financial services

Johnson & Johnson

Data Engineer

Full-time
Nov 2025 – Present
Raritan, NJ
  • Developed event-driven monitoring and validation pipelines across 6+ enterprise applications, processing 200+ files and 100K+ records daily, with automated schema and anomaly checks that eliminated 15+ hours/week of manual support.
  • Standardized release workflows by engineering Jenkins CI/CD pipelines with shared libraries across multiple data and integration repositories, reducing average deployment time by 65%.
  • Engineered cross-system identity data reconciliation logic across ServiceNow, IAM, and Salesforce, automating access provisioning workflows for hundreds of users and cutting cycle time by 60% across 3 teams.
Jenkins, CI/CD, Python, Data Pipelines, ServiceNow, Salesforce

TripForCure Inc.

AI & Data Platform Engineer

Full-time
June 2025 – Oct 2025
Plainsboro, NJ
  • Designed and orchestrated 8+ Airflow DAGs running Python and FastAPI data workflows for hospital recommendation systems, cutting manual data prep time by 70% for downstream reporting teams.
  • Built ingestion and embedding pipelines indexing 2K+ healthcare document chunks into a ChromaDB vector store, powering a LangChain and GPT-4 RAG system serving 500+ clinical stakeholders across 31 hospital locations.
  • Led zero-downtime migration of data infrastructure to AWS (S3, RDS, IAM, CloudWatch) using a dual-write cutover, improving system reliability and HIPAA-aligned compliance posture.
Airflow, FastAPI, LangChain, ChromaDB, AWS, HIPAA

TripForCure Inc.

BI & Data Engineering Intern

Internship
Sep 2024 – May 2025
Plainsboro, NJ
  • Productionized Python and SQL ETL pipelines consolidating tens of thousands of healthcare records from 31 hospital locations into Snowflake, cutting manual consolidation effort by 80%.
  • Implemented dbt transformations and quality tests across staging and mart layers, maintaining 99.9% data quality across 7 production models.
  • Created Power BI dashboards on the Snowflake mart layer, tracking clinical KPIs (readmission rates, length of stay, patient throughput) for 500+ stakeholders across clinical and operational teams.
Python, SQL, Snowflake, dbt, Power BI, ETL

Larsen & Toubro Technology Services

Data Analyst Intern

Internship
Aug 2021 – Sep 2021
Mumbai, India
  • Delivered Tableau dashboards and SQL workflows tracking project performance and financial metrics across 5+ engineering units, cutting manual reporting effort by 40%.
Tableau, SQL, Financial Analytics, Reporting

Featured Projects

Data engineering, AI, and analytics projects across healthcare and financial services

Healthcare Data Reliability Platform
Latest
Healthcare
May 2026

End-to-end modern data engineering project simulating how healthcare organizations create trusted analytics systems. It ingests synthetic healthcare data, transforms it into analytics-ready warehouse models, applies automated quality checks, monitors pipeline health, and exposes business insights via a Streamlit dashboard.

dbt
Apache Airflow
Snowflake
+3
ZeroDay - Multi-Agent AI Developer Onboarding
Featured
AI Platform
July 2025

Architected a 4-agent LangChain system (code search, task recommendation, learning guidance, real-time help) sharing context through a ChromaDB vector store, achieving sub-second retrieval and reducing onboarding queries by 40%.

LangChain
OpenAI
ChromaDB
+2
QuantFlow - AI-Augmented DCF Valuation Platform
Active
Finance
June 2025

Automated end-to-end DCF valuation pipeline (NOPAT modeling, 5-year forecasts, terminal value, peer benchmarking) with scenario-driven sensitivity analysis, surfacing buy/sell recommendations via Power BI.

Python
yfinance
Alpha Vantage
+2
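
As a rough illustration of the valuation step described above, here is a minimal sketch of a discounted cash flow calculation with a terminal value and a small sensitivity grid. It is not QuantFlow's actual code: the function name `dcf_enterprise_value`, the flat cash-flow inputs, and the use of a Gordon growth terminal value are all assumptions made for the example.

```python
from typing import Sequence


def dcf_enterprise_value(
    free_cash_flows: Sequence[float],
    discount_rate: float,
    terminal_growth: float,
) -> float:
    """Present value of forecast cash flows plus a Gordon growth terminal value.

    Hypothetical helper for illustration; cash flows are assumed to arrive
    at the end of years 1..N.
    """
    if not free_cash_flows:
        raise ValueError("at least one forecast cash flow is required")
    if discount_rate <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")

    # Discount each forecast year back to today.
    pv_forecast = sum(
        cf / (1 + discount_rate) ** year
        for year, cf in enumerate(free_cash_flows, start=1)
    )

    # Terminal value at the end of the forecast horizon, then discounted.
    horizon = len(free_cash_flows)
    terminal_value = (
        free_cash_flows[-1] * (1 + terminal_growth)
        / (discount_rate - terminal_growth)
    )
    pv_terminal = terminal_value / (1 + discount_rate) ** horizon

    return pv_forecast + pv_terminal


if __name__ == "__main__":
    flows = [100.0] * 5  # made-up 5-year forecast
    # Scenario-style sensitivity: vary discount rate and terminal growth.
    for rate in (0.08, 0.10, 0.12):
        for growth in (0.01, 0.02, 0.03):
            value = dcf_enterprise_value(flows, rate, growth)
            print(f"rate={rate:.2f} growth={growth:.2f} value={value:,.1f}")
```

The guard on `discount_rate <= terminal_growth` matters because the Gordon growth formula is undefined, or negative, outside that range.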
Real-Time Financial Fraud Detection
Active
Finance
February 2025

Stream-based fraud detection system using Kafka and Flink for real-time anomaly monitoring, with 99.2% detection accuracy while processing 10M+ transactions per second.

Kafka Streams
Apache Flink
PostgreSQL
+2
InterviewGPT - AI Interview Trainer
Active
AI Platform
March 2025

AI-powered interview preparation platform providing personalized mock interviews, real-time feedback, and skill assessment for job seekers across 50+ job domains.

Next.js
OpenAI GPT-4o
Firebase
+1
AI-Driven Risk Analytics Dashboard
Completed
Finance
January 2025

ETL-based dashboard for VaR/CVaR risk analytics with dynamic Tableau visualizations, built with scalable Apache Airflow pipelines and PostgreSQL backend.

Apache Airflow
dbt
PostgreSQL
+1
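
For readers unfamiliar with the two risk metrics named above, the following is a minimal sketch of historical-simulation VaR and CVaR using only the standard library. It is an assumption-laden illustration, not the dashboard's implementation: the function name `historical_var_cvar`, the quantile convention, and the sample returns are all hypothetical.

```python
import math
from typing import Sequence, Tuple


def historical_var_cvar(
    returns: Sequence[float], confidence: float = 0.95
) -> Tuple[float, float]:
    """Return (VaR, CVaR) as positive loss figures from historical returns.

    Hypothetical helper: VaR is the loss at the given confidence quantile,
    CVaR is the average of losses at or beyond that quantile.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")

    # Express returns as losses and sort from smallest to largest loss.
    losses = sorted(-r for r in returns)

    # Index of the quantile observation (one of several common conventions).
    idx = min(len(losses) - 1, math.ceil(confidence * len(losses)) - 1)
    idx = max(idx, 0)

    var = losses[idx]
    tail = losses[idx:]
    cvar = sum(tail) / len(tail)
    return var, cvar


if __name__ == "__main__":
    sample_returns = [0.02, 0.01, -0.01, -0.03]  # made-up daily returns
    var, cvar = historical_var_cvar(sample_returns, confidence=0.75)
    print(f"VaR: {var:.4f}  CVaR: {cvar:.4f}")
```

CVaR is always at least as large as VaR here, since it averages the losses in the tail that begins at the VaR observation.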
Marketing Content Generation Tool
Completed
AI Platform
December 2024

AI-powered content generation tool leveraging OpenAI and ChromaDB to produce platform-specific, SEO-optimized digital marketing content.

OpenAI
ChromaDB
Python
+2
Lung Cancer Detection with AI
Completed
Healthcare
September 2024

Built a logistic regression model and deployed a Streamlit app for early lung cancer detection with explainability using SHAP values.

Logistic Regression
Streamlit
SHAP
+2
IPL 2023 vs 2024 Analysis
Completed
Analytics
May 2024

Tableau dashboard analyzing IPL match statistics, visualizing trends, and predicting match outcomes using ML algorithms with 75% accuracy.

Tableau
Python
Machine Learning
+1
UAE Vehicle Market Analysis
Completed
Analytics
April 2024

Power BI dashboard visualizing vehicle sales trends with clustering algorithms to segment the market and provide actionable insights for stakeholders.

Power BI
ETL
Data Pipeline
+1

Education & Certifications

My academic foundation and professional certifications in data science and engineering

Education

Northeastern University

Master of Science, Information Systems

Sep 2023 - May 2025
GPA: 3.38
Key Coursework:
Data Management and Database Design
Big Data Architecture & Governance
Data Science and Engineering Methods
Prompt Engineering and AI

University of Mumbai – VESIT

Bachelor of Engineering, Electronics & Telecommunications

Aug 2019 - May 2023
GPA: 3.29
Key Coursework:
Cloud Computing
Artificial Neural Networks and Fuzzy Logics
Financial Management

Professional Certifications

IBM Data Engineering Professional Certificate

IBM

December 2024
Google Cloud Data Analytics Specialization

Google Cloud

November 2024

Get In Touch

Ready to discuss your next data project? Let's connect and explore how we can work together.

Contact Information

Location

Jersey City, NJ

Ready to Collaborate?

I'm always excited to work on innovative data projects and help organizations unlock the power of their data. Let's build something amazing together!