Models, experiments, and systems built on real data.

M.S. Data Science student at NYU. I work at the intersection of statistics, machine learning, and engineering—benchmarks, inference, and full-stack tools.

Graduate

NYU · Data Science

Expected

May 2027

Undergraduate

John Jay · CS & Security

Brooklyn-based · curious about small data & big models

I am an M.S. Data Science student at New York University (expected May 2027) with a B.S. in Computer Science & Information Security and a mathematics minor from John Jay College (CUNY), May 2025.

Recent work includes a multimodal benchmark (MathLABS), large-scale survey analytics (RateMyProfessor), and full-stack apps with Streamlit and Django. Outside the terminal: volleyball, pickleball, and trails.

Lucas Yao

Tools I reach for

Ordered by how I actually build: from raw tables to deployed interfaces.

Python · pandas · scikit-learn

Stats · hypothesis tests · regression

ML · classification · evaluation

SQL · MySQL · SQLite

MongoDB · JSON pipelines

APIs · REST · model routers

Streamlit · quick UIs

Django · web backends

C++ · structures & algorithms

Git · Linux · VS Code

Selected work

Research-style benchmarks, applied statistics, and shipping products. Each card links out to code, a live app, or a written report.

MathLABS

Small data · MLLM benchmark · Fall 2025

400+ visual discrete-math items, a MongoDB + Hugging Face pipeline, and a hybrid evaluator with Gemini and OpenRouter; nine VLMs evaluated on math MCQs.

RateMyProfessor Analytics

IDS capstone · 18k+ ratings

Welch’s t-tests on gender vs. difficulty, logistic regression on “hotness” (AUROC 0.80), 26+ feature drivers.

PlantID Care Companion

Streamlit · PlantNet · LLM chat

Species ID from photos, fuzzy-matched care guides, character-driven plant chat.

NLP · Phishing & spam

Streamlit · ML · capstone

Content-level classification beyond brittle keyword rules.

BMCConnect

Django · curriculum & chat · BMCC TLC

Search, live support, and analytics for ~1,500 CIS students.

Emotionfy

React · Flask · Spotify API

Facial emotion cues mapped to playlist recommendations.

Coursework & lab archive: cryptography write-ups, Qt project, bookstore demo, chatbot repo.

Education & experience

Education

M.S. Data Science · NYU

New York, NY · expected May 2027

Coursework: intro DS, optimization & linear algebra, learning from small data, ML, big data, advanced Python.

B.S. CS & Information Security · John Jay (CUNY)

Minor in Mathematics · May 2025

Coursework: OOP (C++), data structures & algorithms, probability & statistics.

Experience & programs

College Assistant · BMCC

Sep 2023 – Dec 2023

CS 101 lectures and tutoring for 20 students.

Technology Learning Community Intern · BMCC

May 2023 – Jul 2023

Curriculum search, chat, and dashboards for CIS students.

CodePath · Apprenticeship

Jun 2025 – Aug 2025

Web dev & technical interview prep tracks.

CUNY Tech Prep · Data Science Fellow

Jul 2024 – May 2025

EDA through ML with Python, Jupyter, pandas, TensorFlow, scikit-learn.

JJAY CS Society · Web developer

Sep 2024 – May 2025

Maintained club site at jjaycss.tech.

Let’s talk data

Email, GitHub, or LinkedIn—best for collaborations, internships, or swapping project ideas.