Hi, my name is

Anchit Gupta.

I build data-driven solutions.

I'm a Senior Data Engineer with 4+ years of experience building scalable data pipelines and infrastructure. I currently work at Sevaro, solving real-world problems through data engineering and AI.

01. About Me

Hello! I'm Anchit, a Senior Data Engineer specializing in GenAI and Cloud-Native Analytics. I enjoy building scalable data platforms and integrating Generative AI solutions to solve complex business problems.

With over 4 years of experience, I've worked across the full data lifecycle, from building MDM pipelines for pharmaceutical giants to developing LLaMA-based gaming bots and automating medical document summarization.

My background includes an M.Tech in CS from IIIT Delhi and deep expertise in the modern data stack. I'm constantly exploring new technologies in the AI/ML space to build more intelligent systems.

Here are some technologies I use daily:

  • Python & PySpark
  • Databricks & AWS
  • GenAI (LangChain, LLaMA)
  • Snowflake & dbt
  • Airflow & Docker
  • SQL & NoSQL

02. Where I've Worked

Oct 2025 — Present

Senior Data Engineer

@ Sevaro Health

  • Architected migration to Medallion architecture, reducing technical debt by 20-30%
  • Engineered CI/CD pipelines using GitHub Actions for automated deployments
  • Implemented RBAC policies in Databricks for data security and governance
  • Built data observability framework and monitoring dashboards
  • Spearheaded cloud cost management audits to minimize infrastructure expenses

Jul 2024 — Oct 2025

Data Engineer III

@ Junglee Games

  • Built a LangChain-based GenAI summarization agent used across 5 games, saving 3.5 hours per analyst per day
  • Migrated Spark jobs from EMR to Databricks, improving ETL performance by 40% and reducing costs by 30%
  • Developed LLaMA-based Rummy bot for Learn/Practice section, enhancing user onboarding
  • Built MLOps pipelines for A/B testing and dynamic recommendation systems

Jul 2023 — Jul 2024

Data Engineer II

@ Junglee Games

  • Led 3 engineers to develop a centralized alerting and observability framework
  • Designed reusable alerting system with email support, reducing alert development effort by 70%
  • Created freshness and anomaly detection modules, reducing data quality incidents by 50%
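As a rough illustration of what freshness and anomaly checks can look like, here is a minimal sketch using only the Python standard library. It is not the production framework: the function names, the 24-hour freshness window, and the z-score threshold are all hypothetical choices made for the example.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def is_stale(last_loaded_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """Freshness check: True when the newest load is older than the allowed age."""
    return now - last_loaded_at > max_age


def is_anomalous(history: list[float], latest: float, z_threshold: float = 3.0) -> bool:
    """Anomaly check: True when `latest` is far from the historical mean (z-score)."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # any change from a constant series is suspicious
    return abs(latest - mu) / sigma > z_threshold


# Hypothetical example: a table last loaded 36 hours ago, checked against a 24-hour window.
now = datetime(2024, 1, 2, 12, 0)
stale = is_stale(datetime(2024, 1, 1, 0, 0), now, timedelta(hours=24))

# Hypothetical daily row counts, followed by a sudden drop and a normal day.
row_counts = [1000, 1020, 980, 1010, 990]
drop_flagged = is_anomalous(row_counts, 400)
normal_flagged = is_anomalous(row_counts, 1005)
```

A z-score is only one of several possible anomaly rules; it is used here because it needs no external dependencies.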

Oct 2022 — Jul 2023

Data Engineer I

@ Junglee Games

  • Built self-serve analytics layers for cross-functional teams
  • Enhanced GST/TDS pipelines, reducing turnaround time by 20%
  • Improved KPI dashboards, cutting SLA delays by 80%

Apr 2022 — Sep 2022

Associate

@ Axtria

  • Developed MDM pipelines to unify 50M+ records using NLP, achieving 87% match accuracy
  • Delivered ETL tools and Golden Record logic for pharma client data

Jul 2021 — Apr 2022

Data Analyst

@ Axtria

  • Generated HCP/HCO hierarchies and automated stakeholder notifications
  • Reduced matching algorithm time by 40% through memory optimization

03. Education

Master of Technology

IIIT Delhi

2019 — 2021

Specialized in Computer Science with a focus on Natural Language Processing and Deep Learning. Worked on research projects involving semantic analysis and information retrieval.

GATE 2019 Percentile: 98.14

Bachelor of Technology

Moradabad Institute of Technology (AKTU)

2014 — 2018

Completed a B.Tech in Computer Science with First Division. Developed foundational skills in programming, Android development, and machine learning.

First Division

04. Things I've Built

🤖

GenAI Medical Summarization

Automated system for summarizing complex medical documents using Generative AI. Streamlines information extraction for healthcare professionals to improve patient care.

LLMs, LangChain, Python, Healthcare
🎮

LLaMA Gaming Bot

Integrated LLaMA models into gaming environments to create intelligent, responsive bots that enhance player engagement at Junglee Games.

GenAI, LLaMA, Game Dev, Python
💊

Pharma MDM Pipeline

Enterprise Master Data Management pipeline processing 50M+ records. Implemented fuzzy matching and NLP for high-accuracy entity resolution.

Big Data, NLP, Spark, Axtria
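To illustrate the idea of fuzzy matching for entity resolution, here is a simplified sketch that uses `difflib` from the Python standard library. It is not the pipeline described above: the sample names, the `0.85` threshold, and the greedy grouping strategy are assumptions chosen for the example, and a pipeline at the 50M+ record scale would need blocking and distributed processing rather than pairwise comparison.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_records(names: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Greedy grouping: a name joins the first group whose first member is similar enough."""
    groups: list[list[str]] = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # no close match found, start a new group
    return groups


# Hypothetical records: two spellings each of two people.
records = ["Dr. John A. Smith", "John A Smith", "Jane Doe", "jane  doe"]
groups = group_records(records)
```

Each resulting group is a candidate for merging into a single golden record.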
📚

Wikipedia Citation Verifier

NLP-based Chrome extension that checks Wikipedia citations by finding semantically similar sentences in the cited documents using deep learning embeddings.

Python, Deep Learning, Chrome Ext
😄

Humour Detection

Text classification system to detect humor using deep learning embeddings, optimized for memory-sensitive mobile devices with model size <10 MB.

TensorFlow, NLP, Mobile AI
🩸

Blood Bank Finder

Android application to help users locate blood banks across India. Features real-time availability and location-based search.

Android, Java, Firebase, Maps API

05. Publications

Technical articles on Data Engineering, Apache Spark, and AI/ML

06. Certifications

🎓

GATE 2019

Percentile: 98.14

🔷

Databricks Fundamentals Accreditation

Databricks

🤖

LangChain for LLM Application Development

DeepLearning.AI

🔧

Apache Airflow Fundamentals

Astronomer

☁️

Azure Databricks & Spark Core

Udemy

📊

MTA Database Fundamentals

Microsoft

J2SE Java Certification

Oracle

📱

Android App Development

Google

07. What's Next?

Get In Touch

I'm currently open to new opportunities and exciting projects. Whether you have a question or just want to say hi, my inbox is always open. I'll try my best to get back to you!