Yifeng He
We must know, we shall know.

About Me

Hey there! I’m 贺一峰 (He, Yifeng), a third-year Ph.D. candidate at UC Davis. My advisor is Professor Hao Chen. I got my B.S. (with Honors) from UC Davis too, double majoring in Applied Math and Computer Science. During undergrad, I also served as president of HackerHub, a student-run tech club on campus.

These days, I spend most of my time thinking about how AI and software engineering can help each other out. More specifically, my research interests are:

  • using AI to improve software engineering, with a focus on testing and security,
  • making AI itself more secure — both foundation models and agents,
  • building better AI models for software engineering,
  • safe and secure programming languages (bonus points if useful for AI).
    • PS: what makes a PL useful for AI is still very unclear and deserves more research.

I’m also into video games. Some of my all-time favorites include Pokémon (recently played Sword 🙃, Violet, and Arceus 😊), The Witcher 3, Clash of Clans (hit world rank #169 back in April 2018!), and Genshin Impact (though I haven’t touched it since 4.4, thanks miHoYo 😅). Other games I’ve enjoyed or recommend: The Legend of Zelda series, Palworld (was obsessed for a while… then drifted back to egg hatching in Pokémon).

I also enjoy powerlifting and have trained for a little over two years. My (training) personal records are Squat 405lb (184kg), Bench Press 245lb (111kg), and Deadlift 455lb (206kg), for a total of 1105lb (501kg) at 85kg body weight. I also dabble in photography. If you’re curious, feel free to check out some of my shots on Instagram.

Curriculum vitae [PDF]

Education

Ph.D. in Computer Science, UC Davis (2023 – present)

B.S. in Computer Science and Applied Mathematics with Honors, UC Davis (2019 – 2023)

Publications

Peer-reviewed Papers

Yifeng He, Luning Yang, Christopher Castro Gaw Gonzalo, Hao Chen. Evaluating Program Semantics Reasoning with Type Inference in System F. Neural Information Processing Systems Datasets & Benchmarks Track (NeurIPS), 2025. [PDF (recommended version by LuaLaTeX)], [arXiv], [code], [slides], [poster].

Yifeng He, Jicheng Wang, Yuyang Rong, Hao Chen. FuzzAug: Data Augmentation by Coverage-guided Fuzzing for Neural Test Generation. Conference on Empirical Methods in Natural Language Processing: Findings (EMNLP), 2025. [DOI], [PDF], [arXiv], [code], [video], [slides], [poster].

Yifeng He, Ethan Wang, Yuyang Rong, Zifei Cheng, Hao Chen. Security of AI Agents. International Workshop on Responsible AI Engineering (ICSE-RAIE), 2025. [DOI], [PDF], [arXiv (recommended version with more figures!)], [code], [slides].

Yifeng He, Jiabo Huang, Yuyang Rong, Yiwen Guo, Ethan Wang, Hao Chen. UniTSyn: A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing. International Symposium on Software Testing and Analysis (ISSTA), 2024. [DOI], [PDF], [code], [slides], [poster].

Jiabo Huang, Jianyu Zhao, Yuyang Rong, Yiwen Guo, Yifeng He, Hao Chen. Code Representation Pre-training with Complements from Program Executions. Conference on Empirical Methods in Natural Language Processing: Industry Track (EMNLP), 2024. [DOI], [PDF], [poster].

Jianyu Zhao, Yuyang Rong, Yiwen Guo, Yifeng He, Hao Chen. Understanding Programs by Exploiting (Fuzzing) Test Cases. Findings of the Association for Computational Linguistics (ACL), 2023. [DOI], [PDF], [code].

Yifeng He. Big Data and Deep Learning Techniques Applied in Intelligent Recommender Systems. International Conference on Civil Aviation Safety and Information Technology (ICCASIT), 2022. [DOI].

Preprints

Hongxiang Zhang, Yifeng He, Hao Chen. SteerDiff: Steering towards Safe Text-to-Image Diffusion Models. https://arxiv.org/abs/2410.02710

Jicheng Wang, Yifeng He, Hao Chen. RepoGenReflex: Enhancing Repository-Level Code Completion with Verbal Reinforcement and Retrieval-Augmented Generation. https://arxiv.org/abs/2409.13122

Hongxiang Zhang, Yuyang Rong, Yifeng He, Hao Chen. LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing. https://arxiv.org/abs/2406.07714

Services

Program Committee Member / Reviewer for:

  • International Conference on Learning Representations (ICLR): 2026, 2025
  • International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA): 2026
  • International Conference on Multimedia (MM): 2025
  • International Conference on Software Engineering Advances (ICSEA): 2025
  • Association for Computational Linguistics LLM Security Workshop (ACL-LLMSEC): 2025
  • Conference of the ACM Special Interest Group on Data Communication, Artifact Evaluation (SIGCOMM): 2025

Honors and Awards

Citation For Outstanding Performance, Department of Mathematics, UC Davis

Dean’s Honor List, Fall 2019, Spring 2020, Spring 2021, Spring 2022, College of Letters & Science, UC Davis

Teaching

  • ECS 36C Data Structures
    • Teaching Assistant. Fall 2023, Spring 2023, Fall 2024.

Industry Experience

ByteDance (Toutiao), Creator Income Platform. Software Development Engineer Intern. 04/2021 - 07/2021

  • Used microservice tech to connect parts of the author income settlement business
  • Transformed the author-relation data architecture design from a relational database (SQL) to a graph database (Gremlin) to allow better efficiency for the business model
  • Refactored the income calculation control process in Python 3

ByteDance (Xigua Video), Creator Experience Team. Software Development Engineer Intern. 07/2021 - 08/2021

  • Created a data cleaner with ORM to maintain the size and readability of the online data settlement table
  • Created the offline flow of the Medium-Length Video Encouragement Project for weekly data calculation
  • Built the interface for the web and mobile app frontends to display data visualizations

Past Toy Projects

CourseReco 06/2022 - 09/2022

  • Designed the overall system architecture
  • Led the programming for API server and recommender engine
  • Negotiated with the third-party provider, SchedGo, for data service
  • Provided technical leadership to teammates

Music Genre Classifier 05/2022 - 06/2022

  • Processed music samples into spectrograms using the short-time Fourier transform
  • Designed a CNN model to classify spectrograms into genre categories
  • Analyzed the resulting model and test outputs with saliency maps

ImageOrientation 03/2022 - 04/2022

  • Pre-processed image data by rotating each image by a randomly generated angle and using that angle as its label
  • Designed a CNN for the regression task, then tested and improved the model
  • Applied hyperparameter tuning based on training, validation, and test results

Dcash-server 05/2021 - 07/2021

  • Created a multi-threaded API server in C++ that lets users create accounts and make deposits and transfers
  • Used MySQL to store and maintain user data
  • Made API calls to the Stripe API server to handle credit card information

Genshin Impact Gacha Analyzer 08/2021 - 09/2021

  • Designed and categorized the process for fetching gacha data from miHoYo
  • Stored data in a local database automatically, with optional export to Excel for data analysis
  • Generated text or graph visualization reports from the data analysis results

Activities

HackerHub Club, UC Davis 07/2020 - 06/2023

Co-founder, President, Technical Officer

  • Designed and maintained a course recommendation system, CourseReco, for UC Davis students
  • Organized and led the Code Jam Competition on data visualization, AI, augmented reality and virtual reality, and machine learning
  • Coached introductory programming workshops on topics including assembly, functional programming, recommender systems, and generative adversarial networks