Open to roles · Global

Data that ships.
Decisions that stick.

Data scientist with 5+ years shipping production pipelines, inventory systems & GenAI applications. I own the full arc — scoping, building, and landing the "so what" with leadership.

5+ Years shipping
10K+ SKUs at Wayfair
~25% Revenue growth · Pantone
2 GenAI projects shipped
01 — About

The short version.

Most of my five years have been in retail and commerce. At Wayfair, I owned end-to-end data pipelines for inventory allocation and replenishment across 10K+ SKUs. At Pantone, I led sales and marketing analytics (LTV, CAC, cohort ROI) that translated into pricing and GTM moves.

Lately I've been pulling that same discipline into GenAI. I built a Slack-based SQL bot at Wayfair that put self-serve analytics in stakeholders' hands, and I've been building a portfolio of production-grade GenAI systems (RAG with evaluation gates, local LLM benchmarking, and QLoRA fine-tuning, the last still in progress), applying the same correctness-and-measurement instincts that pipelines taught me.

The through-line: I care about building systems that produce correct, auditable outputs — and proving it.

// profile.json

Role: Data Scientist
Focus: Retail · Analytics · GenAI
Experience: 5+ yrs
Latest: Wayfair
Education: MS · Northeastern
Availability: Open globally
02 — Selected work

Shipped & live.

Two production-grade GenAI projects. Both run, both evaluated, both on GitHub.

2 shipped
RAG
01 / Production
Retrieval-Augmented Generation

Production RAG with a quality gate.

Domain-specific document Q&A with hybrid retrieval, re-ranking, citation enforcement, and CI-integrated evaluation. Instrumented with per-stage latency tracing, cost-per-request tracking, and a Streamlit dashboard. A GPT-4o-as-judge harness blocks deploys on faithfulness regression.

Python · FastAPI · ChromaDB · LangChain · GPT-4o · Streamlit · CI/CD
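A minimal sketch of how a faithfulness gate like the one described in this card could decide whether a deploy goes ahead, assuming per-question judge scores in the range 0 to 1 have already been collected. The function name `gate_passes`, the 0.02 tolerance, and the example scores are hypothetical and are not taken from the project itself.

```python
from statistics import mean


def gate_passes(baseline_scores, candidate_scores, tolerance=0.02):
    """Return True when mean faithfulness has not dropped by more than `tolerance`.

    Both arguments are lists of judge scores in [0, 1]; the tolerance is an
    illustrative default, not a value from the project.
    """
    if not baseline_scores or not candidate_scores:
        raise ValueError("both score lists must be non-empty")
    drop = mean(baseline_scores) - mean(candidate_scores)
    return drop <= tolerance


# Hypothetical scores: the candidate build is roughly level with the baseline.
baseline = [0.92, 0.88, 0.95]
candidate = [0.91, 0.90, 0.94]
deploy_allowed = gate_passes(baseline, candidate)
```

In a CI job, a `False` result would typically be turned into a non-zero exit code so the pipeline stops before deploy.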
LLM
02 / Benchmark
Open-Source LLM Evaluation

Benchmarking three LLMs on Apple Silicon.

A data-driven model-selection report measuring latency, throughput, and structured-output accuracy for Llama 3.2, Phi-4-mini, and Mistral 7B — with quantization tradeoff analysis and Pydantic-validated output with retry logic.

Python · Ollama · Pydantic · Llama 3.2 · Phi-4-mini · Mistral 7B · Apple M1
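The validate-then-retry loop mentioned in this card can be sketched roughly as below. To keep the example dependency-free, a hand-rolled validator stands in for Pydantic, and `generate` is any callable that takes a prompt and returns text. The schema, function names, and retry prompt wording are all assumptions made for illustration.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"name": str, "price": (int, float)}


def validate(raw):
    """Parse model text as JSON and check required fields and types."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected):
            raise ValueError(f"wrong type for field: {field}")
    return data


def generate_with_retry(generate, prompt, max_attempts=3):
    """Call `generate(prompt)` until the output validates.

    Returns (parsed_data, attempts_used); raises RuntimeError if every
    attempt fails validation.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        raw = generate(prompt)
        try:
            return validate(raw), attempt
        except ValueError as err:
            last_error = err
            # Feed the error back so the next attempt can correct itself.
            prompt = f"{prompt}\n\nPrevious output was invalid ({err}). Return valid JSON only."
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")
```

Returning the attempt count alongside the data makes it easy to report how often each model needs a retry, which is one way structured-output accuracy could be compared.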
03 — Up next

The roadmap.

What's queued. Updated as things ship.

1 in progress · 2 planned
01 In progress
Fine-Tuning

QLoRA on a free T4.

Fine-tuning Qwen2.5-3B-Instruct for structured JSON extraction from e-commerce product descriptions — chosen for retail overlap and clean field-level accuracy evaluation.

PyTorch · transformers · peft · bitsandbytes · Qwen2.5
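Field-level accuracy, as referenced in this card, can be computed along these lines. This is a sketch under the assumption that each prediction and reference is a flat dict of extracted fields and that exact match is the scoring rule. The function name and example field names are hypothetical.

```python
def field_accuracy(predictions, references, fields):
    """Return {field: fraction of examples where prediction equals reference}."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("need at least one example")
    correct = {field: 0 for field in fields}
    for pred, ref in zip(predictions, references):
        for field in fields:
            # A field missing from the prediction counts as wrong.
            if field in pred and pred[field] == ref.get(field):
                correct[field] += 1
    return {field: count / len(references) for field, count in correct.items()}


# Made-up examples: brand is always right, color never is.
preds = [{"brand": "Acme", "color": "red"}, {"brand": "Acme"}]
refs = [{"brand": "Acme", "color": "blue"}, {"brand": "Acme", "color": "green"}]
scores = field_accuracy(preds, refs, ["brand", "color"])
```

Reporting accuracy per field rather than per record shows which attributes the fine-tuned model still gets wrong.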
02 Planned
LLM Evals

Drop-in eval library.

Extending the RAG-project eval approach into a standalone library: dataset generation, rubric scoring, cost tracking, and regression detection across model versions.

Python · pytest · OpenAI · Anthropic
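As one illustration of the cost-tracking piece, the sketch below accumulates per-request cost from token counts. The class name `CostTracker` and its methods are hypothetical, and the prices in the usage lines are placeholders rather than real provider rates.

```python
from dataclasses import dataclass, field


@dataclass
class CostTracker:
    """Accumulates request costs; prices are USD per 1,000 tokens, supplied by the caller."""

    input_price_per_1k: float
    output_price_per_1k: float
    requests: list = field(default_factory=list)

    def record(self, input_tokens, output_tokens):
        """Store and return the cost of one request."""
        cost = (
            (input_tokens / 1000) * self.input_price_per_1k
            + (output_tokens / 1000) * self.output_price_per_1k
        )
        self.requests.append(cost)
        return cost

    def total(self):
        return sum(self.requests)

    def cost_per_request(self):
        """Mean cost across recorded requests, or 0.0 if none were recorded."""
        return self.total() / len(self.requests) if self.requests else 0.0


# Placeholder prices, not real provider rates.
tracker = CostTracker(input_price_per_1k=1.0, output_price_per_1k=2.0)
tracker.record(input_tokens=1000, output_tokens=500)
tracker.record(input_tokens=2000, output_tokens=0)
```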
03 Planned
Agentic Workflow

Retail replenishment agent.

An agent answering operational questions over a simulated inventory warehouse using tool use, SQL generation, and self-critique. Bridges the portfolio back to the domain I know best.

Python · LangGraph · DuckDB · Claude
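A rough sketch of the kind of read-only SQL tool such an agent might call. The standard-library `sqlite3` module stands in for DuckDB so the example runs without extra installs. The table layout, rows, and function names are invented for illustration.

```python
import sqlite3


def build_demo_inventory():
    """Create an in-memory inventory table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (sku TEXT, on_hand INTEGER, weekly_demand INTEGER)"
    )
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        [("SKU-1", 40, 20), ("SKU-2", 300, 25), ("SKU-3", 10, 15)],
    )
    return conn


def run_readonly_query(conn, sql):
    """Tool entry point: run a single SELECT statement and return its rows."""
    statement = sql.strip().rstrip(";")
    # Simple guard for the sketch: reject anything that is not one SELECT.
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(statement).fetchall()


conn = build_demo_inventory()
# SKUs with less than four weeks of cover at current weekly demand.
low_cover = run_readonly_query(
    conn,
    "SELECT sku FROM inventory WHERE on_hand < 4 * weekly_demand ORDER BY sku",
)
```

The prefix check only illustrates the idea of constraining generated SQL. A real deployment would more likely rely on a read-only connection or database-level permissions.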
04 — Career

Where I've shipped.

5 stops
# Company Role Highlight Period
01
Wayfair
Data Scientist · Physical Retail Analytics · Boston, USA
BigQuery/Python pipelines across 10K+ SKUs. Multi-warehouse allocation engine, Slack-based GenAI/SQL bot, Retail Inventory Toolkit flagging stockouts 2–4 weeks early.
2025–2026
02
Pantone
Data Analyst · Sales & Marketing · New York, USA
LTV, CAC payback, cohort ROI → ~25% revenue growth, ~30% acquisition lift. Churn & renewal propensity models improved retention ~20%.
2023–2024
03
Apps Associates
Associate Consultant · Co-op · Boston, USA
SQL pipelines integrating Oracle ERP and Salesforce into Snowflake — 40% faster reporting turnaround.
2022
04
MySyara
Strategy Analyst · Dubai, UAE
Tiered subscription model (SQL, GA4, Python) → +18% revenue. CRM campaigns lifted downloads +15% and MAUs +20%.
2019–2020
05
Sapience Technology
Data Scientist · Intern · Dubai, UAE
Predictive models forecasting customer behaviour; 20% engagement lift. Tableau BI from stakeholder requirements.
2018–2019
05 — Toolkit

The stack.

Tools assembled over five years, from warehouses and pipelines to GenAI backends.

Programming & Data

// languages & warehousing
  • Python
  • SQL
  • R
  • BigQuery
  • Snowflake
  • Redshift
  • AWS
  • GCP
  • ETL/ELT
  • Airflow
  • Composer
  • Vertex AI
  • Hadoop

Backend & GenAI

// LLM & API engineering
  • FastAPI
  • Flask
  • REST APIs
  • LangChain
  • OpenAI
  • Anthropic
  • ChromaDB
  • RAG
  • PyTorch
  • transformers
  • peft
  • Ollama

Analytics & Modeling

// statistical & ML
  • Forecasting
  • Regression
  • Classification
  • Clustering
  • A/B Testing
  • Churn Prediction
  • Cohort Analysis
  • LTV/CAC
  • Propensity Modeling

Visualisation

// BI & dashboards
  • Power BI (DAX)
  • Tableau
  • Looker Studio
  • Streamlit
  • Executive KPIs

Retail Domain

// where I've shipped
  • Category Management
  • Inventory & Replenishment
  • Pricing & Promotion
  • Assortment Optimisation
  • Consumer Insights
  • Supplier Scorecards

Workflow & Ops

// how I ship
  • Git
  • GitHub Actions
  • Jupyter
  • Docker
  • Slack
  • Jira
  • pytest
06 — Writing

Notes, forthcoming.

One post per project. Shipping shortly.

Short-form writing on what I'm building — the decisions, the surprises, the measured deltas. No LinkedIn thinkpieces.

first_drop.md
# QLoRA on a free T4
# field-level accuracy
# training wall-clock
# what broke
Let's talk

Build something measured.

Open to data-science and AI-engineering roles globally. Happy to chat about retail systems, retrieval, or what broke on your last LLM launch.