Open to opportunities

From data to product to real-world impact.

I'm Mylie. I combine data, code, and product thinking to ship things people actually use.

Now

A snapshot of my current life — updated regularly-ish.

Finishing my Master's

MS Data Science @ Stevens Institute of Technology, graduating May 2026. Currently figuring out how to deploy models that don't break in production (harder than it sounds).

📍 Hoboken, NJ

Job hunting

Seeking full-time Data, ML, or Product roles starting Summer 2026. Interested in teams solving user-driven problems with data, especially where analytics and models meaningfully influence product and customer experience.

Open to relocating

Building in public

Currently hacking on AI tools that automate the tedious parts of life. Started with job applications because I've sent way too many of those myself.

Side projects

Learning & exploring

Deep diving into LLMs, vector databases, and building with them. Also learning how to use chopsticks (I love sushi).

Always curious

Projects

Things I've built that actually shipped. No abandoned repos here.

Sales & Marketing Analytics

End-to-End Customer Analytics Framework

Python, RFM Analysis, SQL, Tableau

Comprehensive analytics framework analyzing customer lifecycle with advanced techniques including RFM segmentation, churn prediction, CLV modeling, market basket analysis, and uplift modeling for marketing optimization.

Key Challenges
  • Built comprehensive framework analyzing customer lifecycle
  • Implemented advanced techniques like uplift modeling for marketing optimization
  • Integrated 9 different analytical modules into unified dashboard
What I Learned
  • End-to-end analytics thinking across business functions
  • Business impact of ML beyond accuracy metrics
  • Multi-disciplinary approach to customer analytics (marketing + data)
9 Modules, 93% Churn Prediction, 5 Techniques
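For readers new to RFM segmentation, here is a minimal sketch of the idea: score each customer on Recency, Frequency, and Monetary value, then segment on the scores. It is illustrative only, not the framework's code. The order data, function names, and simple rank-based scoring are all assumptions.

```python
from datetime import date

# Hypothetical order log: (customer_id, order_date, amount). Illustrative data only.
ORDERS = [
    ("a", date(2025, 1, 5), 40.0),
    ("a", date(2025, 3, 1), 60.0),
    ("b", date(2024, 11, 20), 15.0),
    ("c", date(2025, 2, 10), 200.0),
    ("c", date(2025, 2, 25), 120.0),
    ("c", date(2025, 3, 3), 80.0),
]


def rfm_table(orders, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    table = {}
    for customer, day, amount in orders:
        recency, freq, total = table.get(customer, (None, 0, 0.0))
        days_ago = (today - day).days
        recency = days_ago if recency is None else min(recency, days_ago)
        table[customer] = (recency, freq + 1, total + amount)
    return table


def rank_score(table, column, reverse):
    """Map customer -> rank-based score on one column (1 = worst, n = best)."""
    ordered = sorted(table, key=lambda c: table[c][column], reverse=reverse)
    return {customer: rank + 1 for rank, customer in enumerate(ordered)}


def rfm_scores(table):
    """Combine the three rank scores into one (R, F, M) tuple per customer."""
    recency = rank_score(table, 0, reverse=True)     # fewer days since last order is better
    frequency = rank_score(table, 1, reverse=False)  # more orders is better
    monetary = rank_score(table, 2, reverse=False)   # more spend is better
    return {c: (recency[c], frequency[c], monetary[c]) for c in table}


table = rfm_table(ORDERS, today=date(2025, 3, 10))
scores = rfm_scores(table)
```

In practice the scores are usually bucketed into quantiles (for example 1 to 5) rather than raw ranks, so that segments stay comparable as the customer base grows.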

Stock Portfolio Optimization

Quantitative Finance & Modern Portfolio Theory

Python, Pandas, Quantitative Finance, MPT

Applied Modern Portfolio Theory to real financial data, optimizing the risk-return tradeoff across multiple sectors using advanced quantitative finance principles.

Key Challenges
  • Applied Modern Portfolio Theory to real financial data
  • Optimized risk-return tradeoff across multi-sector portfolio
  • Backtested allocation strategies against historical market performance
What I Learned
  • Quantitative finance principles and their application
  • How theory meets real market data and constraints
  • Risk modeling beyond traditional approaches
42.5% Annual Return, 1.57 Sharpe Ratio, Multi-sector Allocation
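As a rough illustration of the risk-return tradeoff in Modern Portfolio Theory, the sketch below computes a portfolio's expected return, volatility, and Sharpe ratio, then grid-searches a two-asset split for the highest Sharpe ratio. The inputs are made-up numbers, not figures from this project, and the grid search is a stand-in for a proper optimizer.

```python
import math


def portfolio_stats(weights, mean_returns, cov, risk_free=0.0):
    """Return (expected return, volatility, Sharpe ratio) for the given weights."""
    exp_return = sum(w * r for w, r in zip(weights, mean_returns))
    variance = sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(len(weights))
        for j in range(len(weights))
    )
    vol = math.sqrt(variance)
    return exp_return, vol, (exp_return - risk_free) / vol


# Hypothetical annualized inputs for two uncorrelated assets (assumptions).
MEAN_RETURNS = [0.10, 0.20]
COV = [[0.04, 0.00], [0.00, 0.09]]
RISK_FREE = 0.03


def best_two_asset_split(step=0.01):
    """Grid-search the weight on asset 0 that maximizes the Sharpe ratio."""
    n = round(1 / step)
    candidates = [i / n for i in range(n + 1)]
    return max(
        candidates,
        key=lambda w: portfolio_stats([w, 1 - w], MEAN_RETURNS, COV, RISK_FREE)[2],
    )


best_w = best_two_asset_split()
```

With more than two assets, a grid search becomes impractical and a constrained optimizer is the usual choice. The Sharpe calculation itself stays the same.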

Time Series Analysis

Statistical Forecasting & Trend Analysis

Python, ARIMA, Forecasting, Statsmodels

Comprehensive time series analysis framework including ARIMA modeling, seasonal decomposition, trend analysis, and forecasting across multiple datasets and domains.

Key Challenges
  • Handled non-stationary data with proper differencing
  • Tuned ARIMA parameters using ACF/PACF analysis
  • Implemented seasonal decomposition for complex patterns
What I Learned
  • Statistical foundations of time series forecasting
  • Balancing simplicity vs complexity in model selection
  • Importance of proper data preprocessing for temporal data
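To make "proper differencing" concrete, here is a small pure-Python sketch. First differencing removes a linear trend, and differencing at the seasonal lag removes a repeating pattern. The `d` in ARIMA(p, d, q) is the number of times the first difference is applied. The helper names and toy series are assumptions for illustration, not code from the project.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undo_difference(first_values, diffs, lag=1):
    """Invert difference() given the first `lag` values of the original series."""
    restored = list(first_values)
    for d in diffs:
        restored.append(restored[-lag] + d)
    return restored


# Toy series with a linear trend: one round of differencing leaves a constant.
trend = [2 * t + 5 for t in range(6)]
trend_diff = difference(trend)

# Toy series with period-3 seasonality: differencing at lag 3 removes the pattern.
seasonal = [10, 20, 30, 10, 20, 30, 10, 20, 30]
seasonal_diff = difference(seasonal, lag=3)
```

Forecasts made on the differenced series have to be mapped back to the original scale, which is what `undo_difference` sketches.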

YouBuy

Multimodal E-Commerce Search Engine

Apache Spark, CLIP, GCP, LSH

Built search system processing 10M+ products, combining text (TF-IDF) and image embeddings (CLIP) for hybrid retrieval.

Key Challenges
  • Reduced query latency from 12s to 800ms using Locality-Sensitive Hashing
  • Combined text and image embeddings achieving 0.78 NDCG@10
  • Optimized Spark pipeline to handle 500 queries/sec on 4-node cluster
What I Learned
  • Balancing accuracy vs. latency: achieved 95% recall while being 15x faster
  • Spark optimization is about partitioning strategy, not just cluster size
  • Multimodal search requires careful embedding space alignment
0.78 NDCG@10, 800ms Latency, 10M Records, 500 q/sec
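The latency gain from Locality-Sensitive Hashing comes from comparing a query only against items in the same hash bucket instead of the whole catalog. Below is a minimal single-machine sketch using random hyperplanes, one common LSH family for cosine similarity. Which LSH variant YouBuy used is not stated here, so treat the family, the names, and the toy vectors as assumptions.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (normal vectors) in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane side."""
    return tuple(
        int(sum(v * p for v, p in zip(vec, plane)) >= 0) for plane in planes
    )


def build_index(vectors, planes):
    """Group item ids by signature so lookups touch a single bucket."""
    buckets = {}
    for key, vec in vectors.items():
        buckets.setdefault(signature(vec, planes), []).append(key)
    return buckets


def query(vec, planes, buckets):
    """Return candidate ids sharing the query's bucket (may be empty)."""
    return buckets.get(signature(vec, planes), [])


planes = make_hyperplanes(dim=4, n_bits=8, seed=1)
# Hypothetical embeddings; real ones would come from TF-IDF or CLIP.
items = {
    "red_shoe": [0.3, -1.2, 0.7, 2.0],
    "blue_lamp": [-0.3, 1.2, -0.7, -2.0],
}
index = build_index(items, planes)
candidates = query([0.6, -2.4, 1.4, 4.0], planes, index)
```

Candidates from the bucket are then re-ranked with exact similarity. More bits mean smaller buckets and faster queries but lower recall, which is the accuracy-versus-latency tradeoff described above.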

Macro Volatility Prediction

Financial Modeling

Python, Bloomberg API, XGBoost, Time Series

Predicted equity market volatility using 10+ years of Bloomberg Terminal data (CESI, VIX, SPX indicators).

Key Challenges
  • Engineered 15+ macro features from Bloomberg data
  • Built ensemble models (XGBoost, Random Forest) with 78% directional accuracy
  • Backtested trading signals showing 12% annualized alpha
What I Learned
  • 78% accuracy != profitable trading after transaction costs
  • Overfitting is easy with macro time series data
  • Translating statistical tests into decision frameworks is the valuable skill
78% Accuracy, 12% Alpha, 10+ Years Data
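A quick worked example of why directional accuracy alone does not imply profit. Only the 78% hit rate comes from the project. The average win, average loss, and per-trade cost below are hypothetical numbers chosen to show the effect.

```python
def expected_net_return(hit_rate, avg_win, avg_loss, cost_per_trade):
    """Return (gross, net) expected return per trade, as fractions."""
    gross = hit_rate * avg_win - (1 - hit_rate) * avg_loss
    return gross, gross - cost_per_trade


# Hypothetical trade profile: small frequent wins, larger occasional losses.
gross, net = expected_net_return(
    hit_rate=0.78,          # from the project
    avg_win=0.0010,         # assumption: +0.10% on a correct call
    avg_loss=0.0030,        # assumption: -0.30% on a wrong call
    cost_per_trade=0.0005,  # assumption: 0.05% round-trip cost
)
```

Under these assumed numbers the strategy is slightly positive before costs and negative after them, even though it is right 78% of the time.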

AI Job Search Assistant

GPT-4, Flask, AWS, Web Scraping

AI tool generating personalized cold emails from LinkedIn profiles and job descriptions. Acquired 100+ users in 48 hours organically.

Key Challenges
  • Integrated GPT-4 with custom prompts maintaining professional tone
  • Built LinkedIn scraper without violating ToS
  • Deployed with rate limiting after $200 API bill on viral day
What I Learned
  • Product-market fit happens fast when solving real pain
  • Cost management is crucial for AI products
  • Users care about time saved over technical complexity
100+ Users in 48h, 93% Time Saved, 100% Organic
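One common way to cap API spend is a token-bucket rate limiter placed in front of the model call. The sketch below shows the idea. It is an assumption about the general technique, not the limiter this app actually shipped with, and the class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_sec` tokens per second."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Consume one token if available and report whether the call may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: at most 2 requests in a burst, then 1 per second (hypothetical limits).
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_per_sec=1.0, clock=lambda: fake_now[0])
first_burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0  # one second later, one token has refilled
after_wait = [bucket.allow(), bucket.allow()]
```

In a web app, one bucket per user or per IP address keeps a single heavy user from running up the bill for everyone.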

Currently Building

Side projects in progress. Always exploring new ideas and learning.

Beta

Intent Cart - AI Commerce

Demo this project →

From any recipe link to a ready-to-order cart in seconds

Paste a recipe URL and get an instant, shoppable grocery list — no more scrolling through ingredients and adding them one by one. AI does the parsing, you just checkout.

NLP, Web Scraping, E-commerce APIs, AI Parsing
Building MVP

Smart AI Bookmarker & Tagger

Your content, intelligently organized

Automatically tags, categorizes, and surfaces relevant bookmarks and content when you need them. Semantic understanding that actually works.

AI Classification, Vector Search, Semantic Analysis, Browser Extension
Prototype Phase

Live News RAG System

Real-time intelligence on what matters

RAG system that continuously ingests live news, understands context, and answers questions about breaking events as they unfold. Because Ctrl+F doesn't work on the entire internet.

RAG, Real-time Indexing, Vector Databases, News APIs, LLMs
Alpha Testing

Job applications that don't make you want to cry

Tracks applications, extracts job details automatically, tells you when to follow up. Because spreadsheets are so 2010.

Smart Parsing, Automation, Analytics Dashboard, NLP

Writing

I write about data science, building products, and occasionally rant about things that annoy me.

February 2026

How I Used Python to Optimise a Stock Portfolio — and What I Learned

Python, Portfolio Theory, Quantitative Finance

Applied Modern Portfolio Theory to real financial data, optimized the risk-return tradeoff, and learned why theory meets messy reality.

92 claps
Read article
February 2026

How I Built a 9-Module Analytics Framework to Solve Real E-Commerce Problems — and What I Learned

Analytics, E-Commerce, Python

A deep dive into building a comprehensive customer analytics framework — from RFM segmentation to churn prediction to uplift modeling.

5 claps
Read article
October 17, 2025

Same Features, Different Feels: Why TikTok, Reels, and Shorts Feel So Different

Product Design, UX, Social Media

Why do TikTok, Reels, and Shorts each feel so unique, despite sharing the same features? It's all about design choices...

Read article
October 6, 2025

Measuring Success in AI Products: Beyond Accuracy Metrics

AI Products, Metrics, Product Management

Why accuracy isn't everything when building AI products that people actually want to use. Real-world metrics that matter more.

Read article
September 29, 2025

Deep Dive: How Agentic AI is Actually Working in 2025 (And Why It's Wilder Than You Think)

Agentic AI, ML Systems, Technical

A technical deep-dive from someone who's been neck-deep in agent architectures for the past year.

16 claps, 1 response
Read article
March 16, 2025

Large Language Models in Finance: A Deep Dive into the Future of Finance

LLMs, Finance, AI Applications

The financial world is getting a serious AI-powered upgrade. By 2026, AI-driven analytics are projected to generate hundreds of billions...

Read article
July 22, 2024

Behind the Playlist: Analyzing Spotify's Recommendation System

Recommendation Systems, Data Science, Music Tech

Ever wondered how your favourite tunes magically appear in your Spotify playlists? As a data scientist and music enthusiast, I've been...

Read article

Core Skills

Languages

Python, SQL, R, Scala

Machine Learning

Scikit-learn, TensorFlow, PyTorch, XGBoost, Deep Learning, Neural Networks

Big Data & Cloud

Apache Spark, AWS, GCP, BigQuery, Snowflake, Docker

Data Visualization

Tableau, Power BI, Looker, Streamlit, Plotly

NLP & AI

NLTK, HuggingFace, GPT-4/LLMs, CLIP, Sentiment Analysis, Text Classification

Product & Analytics

A/B Testing, Product Analytics, User Behavior Analysis, Cohort Analysis, Statistical Modeling, KPI Development

Data Engineering

ETL Pipelines, Data Modeling, Airflow, Real-time Processing, Data Warehousing

Specialized

Time Series Analysis, Recommendation Systems, RAG Systems, Feature Engineering, Model Deployment

Certificates

Professional certifications in data science, ML, and cloud technologies.

Aug 2025

AWS Certified AI Practitioner

Amazon Web Services

AI/ML on AWS, SageMaker, Generative AI, ML Operations
View Certificate
Jun 2023

Azure AI Fundamentals

Microsoft

Azure AI Services, Machine Learning, Computer Vision, NLP
View Certificate
Nov 2022

Data Analytics Certificate

Google

Data Analysis, SQL, Tableau, R Programming
View Certificate

About

I'm Mylie — a data scientist who cares more about whether something works than whether it uses the trendiest framework. (Though I do love a good new framework.)

I grew up fascinated by patterns — in music, in games, in the way people make decisions. Data science felt like getting paid to solve puzzles, especially ones involving recommendation systems, personalization, and understanding human behavior.

Currently wrapping up my Master's in Data Science while building tools that help people with tedious tasks. Most of my curiosity lives at the intersection of ML and domains like finance, healthcare, e-commerce, and marketing, places where data products can actually move the needle. I believe the best ML is the kind you don't notice: it just makes things work better.

When I'm not arguing with Jupyter notebooks, you'll find me making unnecessarily complicated pour-over coffee, getting lost in random NYC neighborhoods, or deep in a Wikipedia rabbit hole about something completely useless.

Quick facts

  • 📍 Based in Hoboken, NJ — happy to relocate for the right role
  • Coffee snob (light roast, single origin, 30-second bloom)
  • 🍳 Experimental cook — some things work, many don't (flatmate can confirm)
  • 📊 Will judge your Excel charts (but nicely, and with suggestions)
  • 🤷‍♀️ A/B test everything — humans are surprisingly unpredictable
Try my project demos: Inboxd, Job Track Pro, Intent Cart