About me
Hi! My name is Lavender Jiang (蒋遥). I am a fourth-year Data Science PhD student at New York University, co-advised by Eric Oermann and Kyunghyun Cho. I work on natural language processing for clinical notes and I am interested in representation learning. I am a member of OLAB and ML2. I am honored to be the recipient of a medical fellowship from NYU Langone Health and the AIML PhD fellowship from Apple.
I received my BSc in Electrical and Computer Engineering and Mathematical Sciences from Carnegie Mellon, where I worked with José Moura on graph signal processing, Pulkit Grover on cortical spreading depression detection, and Howard Choset on sensor fusion.
If you want to discuss research with me, feel free to write me an email. (If we know each other, we can schedule a 30-minute meeting.) We can meet on Zoom or in person (Center for Data Science at 60 5th Ave, Tisch Hospital at 550 1st Ave, or Washington Square Park).
I enjoy cooking, gaming and yoga. I am married to my best friend Xujin Liu. Some of my new hobbies include bass guitar and homemaking. I’ve been vegan since 2020 and have not died from protein deficiency yet ;)
Publications and Talks
- Health system-scale language models are all-purpose prediction engines. Lavender Yao Jiang, Chris Liu, Mustafa Nasir-Moin, Nima Pour Nejatian, Duo Wang, Anas Abidin, Howard Riina, Ilya Laufer, Paawan Punjabi, Kevin Eaton, Madeline Miceli, Nora C. Kim, Cordelia Orillac, Zane Schnurman, Christopher Livia, Hannah Weiss, David Kurland, Sean Neifert, Yosef Dastagirzada, Douglas Kondziolka, Alexander M Cheung, Grace Yang, Ming Cao, Mona Flores, Anthony B. Costa, Yindalon Aphinyanaphongs, Kyunghyun Cho and Eric Karl Oermann. (Nature)
- NYUTron: Health System-scale Language Models for Clinical Operations: 30-day Readmissions. Lavender Y. Jiang, Nima P. Nejatian, Anthony B. Costa, Chris X. Liu, Yindalon Aphinyanaphongs, Mona G. Flores, Kyunghyun Cho, Eric K. Oermann. (NVIDIA GTC, 2022)
- Generalization in Healthcare AI: Evaluation of a Clinical Large Language Model. Salman Rahman, Lavender Yao Jiang, Saadia Gabriel, Yindalon Aphinyanaphongs, Eric Karl Oermann, Rumi Chunara.
- Language Models Can Guess Your Identities from De-identified Clinical Notes. This paper has not been published, which I believe is partly due to how it challenges some foundational assumptions of current clinical NLP practice. I still think it raises important questions about how we handle and share de-identified health data, and I invite you to read it and form your own opinion.
- Automated, Scalable and Generalizable Deep Learning for Tracking Cortical Spreading Depression Using EEG. Alireza Chamanzar¹, Xujin Liu¹, Lavender Y. Jiang, Kimon A. Vogt, José M. F. Moura, Pulkit Grover. (International IEEE/EMBS Conference on Neural Engineering, 2021.) Patent: System and method for deep learning for tracking cortical spreading depression using eeg (WO2022235467A1).
- Graph Signal Processing and Deep Learning, Mark Cheung, John Shi, Yao Jiang, Oren Wright and José Moura. (IEEE Signal Processing Magazine Special Issue on Graph Signal Processing)
Internship
05/2025 - 09/2025: Research Intern at Apple (New York).
06/2024 - 09/2024: Research Intern at Apple (Seattle).
Grants
Seed Grant Funding Award (Apple Workshop on ML for Health 2024, $25,000). Kyunghyun Cho, Eric Oermann, Lavender Jiang.
Academic Services
Reviewer for IEEE TNNLS, BMC Health Services Research, ICLR DMLR Workshop 2024
Emergency Reviewer for ACL Student Research Workshop 2023, ACL 2023, All Things Attention, AACL-IJCNLP 2022
Co-organizer for NYU AI School 2022
Teaching
01/2025 - 05/2025: Section leader and grader for NYU DS-GA.1003 Machine Learning (I wrote the lab on classification and backprop)
01/2024 - 05/2024: Grader for NYU DS-GA.1003 Machine Learning
09/2023 - 12/2023: Section leader and grader for NYU DS-GA.1011 Natural Language Processing with Representation Learning; Private tutor for NYU DS-GA.1005 Inference and Representation
03/2023 - 05/2023: Private tutor for NYU DS-GA.1003 Machine Learning
06/2022 - 05/2025: Research mentor for undergraduate, master's, and high school students (projects are sorted by time and names are in alphabetical order).
Spring 2024 - Spring 2025: Evaluating the impact of packing and shuffling strategies on training decoder models. Accepted to ACL Rolling Review.
- Ruilin (Luca) Wang (NYU DS MS 25’, NYU BSc Nutrition 23’)
- Yanbing (Cynthia) Chen (Penn State Bioinformatics PhD 30’, NYU Biostats MS 25’)
Summer 2023 - Summer 2024: Evaluating LLMs’ ability to generate knowledge graphs of medical concepts. Accepted to ML4H.
- Gabriel Rosenbaum (UChicago BSc CS 29’, Packer Collegiate 25’)
Summer 2023 - Spring 2024: Evaluating clinical LMs’ understanding of lab measurements.
- Avery Hang (Yale Stats MS 26’, NYU DS+Math 24’)
- Ruiqi Deng (Cornell Tech Health Tech MS 26’, NYU CS+DS 24’)
Fall 2023: Evaluating the predictive power of different note types and different note segments. Accepted to ACL SRW.
- Hongyi Zheng (Quantitative Strategist at Akuna Capital 23’-present, NYU Math+CS+DS BA 23’)
- Tracy Zhu (UChicago Stat MS 25’, NYU DS+Math BA 23’)
Summer 2022 - Spring 2023: Evaluating the token-level sensitivity of clinical language models. Accepted to ML4H.
- Grace Ge’er Yang (Data Scientist at Databricks 25’-present, Stanford DS MS 25’, NYU Math+DS BA 23’)
- Ming Cao (UPenn DS MS 25’, NYU DS+CS BA 23’)
Summer 2022 - Fall 2023: Using the correlation bias for automatic ICD-9 code assignment from discharge notes (co-mentor with Chris Xujin Liu). Accepted to ACL SRW.
- Gavin Yang (Northeastern CS PhD 30’, NYU CS+DS BA 24’)
- Lucy Wu (Data Scientist at Microsoft 25’-present, Columbia DS MS 25’, NYU DS+CS BA 23’)
- Stephen Zhang (UPenn DS MS 25’, NYU CS+DS BA 23’)
09/2020 - 12/2020: Section leader and grader for CMU 21-260 Differential Equations