Hanqi Yan is a lecturer (Assistant Professor) at King’s College London, Department of Informatics, affliated with the NLP group. Before that, she was the postdoctoral researcher in the same department (2024-2025). She mainly focuses on interpretability and robustness for language models on various reasoning tasks, especially from the representation learning perspective. She has published more than 10+ papers as (co-)first author on related topics in top conferences, such as ACL, EMNLP, NeurIPS, ICML. She serves as area chair for ACL, EMNLP, and co-organized the student workshop at AACL2022. She obtained her Ph.D from the University of Warwick(2024), advised by Prof. Yulan He and Dr. Lin Gui, Msc and BEng from Peking University (2020) and Beihang University (2017).

I’ve been incredibly lucky to have a number of amazing collaborators and mentors across KCL and a range of other institutions, including Carnegie Mellon University, MIT, MBZUAI, Hong Kong Polytechnic University,University of Warwick. None of the research so far—would be possible without their kind help and support.

I am always on the lookout for excellent PhD students to work with me about LLM reasoning, robustenss, interpretability etc.

Fully-funded positions for exceptional candidates through KCL scholarships. These funding allocations are for full fees and a living stipend (London weighted) for any nationality.
K-CSC (for Chinese nationals). These funding allocations are for full fees and a living stipend (London weighted).
Alternative fundings.

News

11.2025: I go to Suzhou, China to present our CODI (implicit CoT) and GraphMind (Paper novelty assessment demo)
11.2025: Zhanghao's paper "Spectrum Projection Score: Aligning Retrieved Summaries with Reader Models in Retrieval-Augmented Generation" accepted to AAAI oralstrong> !!!
09.2025: A joint tutorial "Structured Representation Learning: Interpretability, Robustness and Transferability for Large Language Models" is accepted by AAAI-26. See you in Singapore!
09.2025: A new work on safety vulnerabilities in reasoning-intensive setups—like think-mode or fine-tuning on narrow math tasks is released! "Thinking Hard, Going Misaligned: Emergent Misalignment in LLMs"
09.2025: I started the lecturer position at KCL. Looking forward to more challenges and opportunities ahead!
08.2025: 1*EMNLP paper is accepted. CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation.
07.2025: I go to Vienna, Austria🇦🇹 to present our faithful rationale generation and RAG papers accepted at ACL 2025. 🏞️ Excited to escape city life and explore Gosau & Hallstatt!
07.2025: I go to Vancouver, Canada🇨🇦 to present our LLM reasoning papers accepted at ICML 2025. 🍜 Can’t wait to revisit my favorite Chinese restaurant there!
07.2025: 1*paper, SciReplicate-Bench, accepted by COLM25, a benchmark for paper replication via code generation.
05.2025" 2*papers accepted by ACL25 main conference, including a co-first author paper in faithful rationale generation during LLM inference.
05.2025: 3*papers accepted by ICML25, including a first-author paper about meta-reasoning in the Position paper track.
11.2024: I go to Miami☀️🌊🍹🏝, US for EMNLP24 to present our accepted papers and connect with like-minded researchers👩‍💻👨‍💻.
10.2024: 1*first-author paper about monosemantic neuron in multi-modal model is accepted by Neurips-RBMF workshop.
09.2024: 3*papers (monomsemantic neurons, oral survey in ICL, weak2strong event extraction) are accepted by EMNLP24 Main Conference. 🎉
08.2024: I go to Bangkok, Thailand🇹🇭 for ACL24. ✈️
06.2024: 2* paper accepted by ACL24, including 1 1st author in main and 1 in findings.
04.2024: I pass the PhD viva with no corrections 🎓.
01.2024: I become a PostDoc👩‍🏫 at King's College London, NLP Group.
01.2024: I finish my PhD thesis (draft) on the same day of my birthday.
01.2024: 1*first-author paper is finally accepted by TKDE.
12.2023: I go to New Orleans🎷, US to present our Neurips paper.
07.2023: I go to Hawaii🌴, US to present our ICML-workshop paper.
07.2023: 1*first-author paper is accepted by Neurips (my Neurips paper).
02.2023: I go back to the UK from Abu Dhabi, UAE🇦🇪, finish my Machine Learning trip in MBZUAI.
02.2023: I attend the EMNLP23 held in Abu Dhabi, to present our Computational Linguistics paper.
01.2023: 1*paper is accepted by EACL23🇭🇷-findings (first time as a mentor for a master's student).
12.2022: Lionel Messi leads Argentina to win the ⚽️World Cup championship.
10.2022: I start to be a funded visiting student in Machine Learning, Department at MBZUAI🏫, Abu Dhabi, UAE, advised by Prof. Kun Zhang.
08.2022: I go to Eindhoven, Netherlands🇳🇱 to present our UAI paper.
05.2022: 1*first-author paper is accepted by UAI23-spotlight (🥳my first ML paper)
05.2021: The first time to get paper accepted! 1*first-author paper is accepted by ACL21 🌟Oral. A super encouragement in my early PhD career.
10.2020: I start my PhD📚 journey at University of Warwick, UK🇬🇧.

Research Summary

My research interests lie in the intersection of Machine Learning and Natural Language Processing, i.e., incorporating fundamental representation learning to enhance the interpretability and reliability of different NLP models, with 10+ (co-)first-authored papers published at top-tier venues:

Mechanistic interpretability (neuron-level) in language models and multi-modal models [EMNLP24,NeurIPS24-RBMF], explaining the conflicts between safety and reasoning advancement [Neurips25-MI]; self-explainable models with a conceptualised layer linking the input and decision layer [Computational Linguistics 22,TKDE24].
Empirical and principled methods to enhance model robustness over various test inputs, e.g., position bias [ACL24-findings,ACL21-oral], distribution shifts [NeurIPS23] and representation inefficiency in transformer-based models [EMNLP24,EACL23-findings,UAI22-spotlight].
Understanding and enhancing LM’s reasoning capabilities via injecting external knowledge[ACL21-oral][ACL25], weak supervision [EMNLP24], applying a self-refinement mechanism for factual knowledge reasoning [ACL24]. More recently focus in the two directions:
- LLM agent for scientific liteture understanding, such as code generation for scientific paper replication on our own SciReplicate-Bench and novelty assessment GraphMind.
- Reasoning in latent space, such as a position paper at ICML25 about meta-reasoning, EMNLP25 in implicit CoT, in retrieved-based QA [ACL25] [AAAI25 oral], navigating search in latent space [ICML25spotlight] and sparse feature for preference optimization [ICML25].

Mentee (Current PhD students in KCL NLP Group)

LLM reasoning
- Qinglin Zhu (latent reasoning), Zhanghao Hu (RAG), Zhenyi Shen (implicit CoT), Jiangnan Ye (efficiency)
Explainable AI
- Jiazheng Li (faithful explaination), Hainiu Xu (mechanistic interpretability)
LLM agent for scientific literature understanding
- Yanzheng Xiang (code generation), Italo Da Silva (Hypothesis assessment)

Professional Service

Organiser
- PreTrain 2025: Spotlight on ACL/ICML/ICLR at KCL
- Co-Chair of the AACL22-Student Research Workshop
Area Chair:
- ACL, EMNLP, EACL
Reviewers:
- NLP: AACL, NAACL, EACL, EMNLP, ACL, COLM
- AI/ML: UAI, AISTATS, NEURIPS, ICLR, ICML, AAAI
- Journal: NeuroComputing, TOIS, TMLR, Transactions on Big Data, Transactions on Artificial Intelligence.

💬 Invited Talks

08/2025. Thomson Reuters, Foundational Research Team invited by Jonathan Richard Schwarz.Towards More Robust Reasoning for LLMs
07/2024. Fudan University, NLP Group . Representation Learning and Mechanistic Interpretability
03/2023. Turing AI Fellowship Event , London, Distinguishability Calibration to In-Context Learning
04/2022. UKRI Fellows Workshop , University of Edinburgh, Interpreting Long Documents and Recommendation Systems via Latent Variable Models
07/2021. SPAAM Seminar. University of Warwick. Emotion Cause detection.

📝 Publications

(* indicates equal contribution)

When THINKING BACKFIRES: MECHANISTIC INSIGHTS INTO REASONING-INDUCED MISALIGNMENT
H. Xu, H. Yan, S. Qi, S. Yang, Y. He
Preprint | Paper
Representation Interpretability

Spectrum Projection Score: Aligning Retrieved Summaries with Reader Models in Retrieval-Augmented Generation
Z.Hu, Q.Zhu, S. Qi, Y. He, H. Yan, L. Gui
AAAI25 Oral | Paper
application Representation

GraphMind: Interactive Novelty Assessment System for Accelerating Scientific Discovery
I. Silva, H. Yan, L. Gui, Y. He
EMNLP25 Demo | Paper
application

CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
Z. Shen, H. Yan, L. Zhang, Y. Du, Y. He
EMNLP25 | Paper
application Representation

Position: LLMs Need a Bayesian Meta-Reasoning Framework for More Robust and Generalizable Reasoning
H. Yan, L. Zhang, J. Li, Z. S, Y. He
ICML25, Position Track | Paper
Application

Drift: Enhancing LLM Faithfulness in Rationale Generation via Dual-Reward Probabilistic Inference
J. Li, H. Yan, Y. He
ACL25, Main | Paper
application Interpretability

Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
Q. Zhu, R. Zhao. H. Yan, Y. He, Y. Chen, L. Gui
ICML25, Spotlight | Paper
Representation application

Direct preference optimization using sparse feature-level constraints
Q. Yin, C. Leong, H. Zhang, M. Zhu, H. Yan, Q. Zhang, Y. He, W. Li, J. Wang, Y. Zhang, L. Yang
ICML25 | Paper
Interpretability Representation

Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective
H. Yan, Y. Xiang, G Chen, Y. Wang, L. Gui, Y. He
EMNLP24, main | Paper
Interpretability Representation

Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems
I. Silva, H. Yan, L. Gui, Y. He
EMNLP24, main | Paper
Causality application

The Mystery and Fascination of LLMs: A Comprehensive Survey on the Interpretation and Analysis of Emergent Abilities
Y. Zhou, J. Li, Y.Xiang, H.Yan, L. Gui, Y. He
EMNLP24, main | Paper
Interpretability

Mirror: A Multiple-perspective Self-Reflection Method for Knowledge-rich Reasoning
H. Yan, Q. Zhu, X. Wang, L. Gui, Y. He
ACL24, main | Paper
application

Addressing Order Sensitivity of In-Context Demonstration Examples in Causal Language Models.
Y. Xiang, H. Yan, L. Gui, Y. He
ACL24, findings | Paper
Representation

Counterfactual Generation with Identifiability Guarantee
H. Yan, L. Kong, L. Gui, Y. Chi, Eric. Xing, Y. He, K. Zhang
Neurips23, main | Paper
Causality Representation application

Explainable Recommender with Geometric Information Bottleneck
H. Yan, L. Gui, M. Wang, K. Zhang and Y. He
TKDE | Paper
Interpretability application

Hierarchical Interpretation of Neural Text Classification
H. Yan, L. Gui and Y. He
Computational Linguistics, Present at EMNLP23 | Paper
Interpretability application

Addressing Token Uniformity in Transformers via Singular Value Transformation
H. Yan, L. Gui, W. Li and Y. He
UAI22, spotlight | Paper
Representation

Distinguishability Calibration to In-Context Learning
H. Li, H. Yan, L. Gui, W. Li and Y. He
EACL23, findings | Paper
Representation

A Knowledge-Aware Graph Model for Emotion Cause Extraction
H. Yan, L. Gui and Y. He
ACL21, Oral | Paper
Causality application

📝 Notes

o1-technical report (notes for video)
Machine Unlearning via CausalLens and in NLP tasks
Reading List For Large Language Model
Identifiability101 in Causality (3rd PhD)
Induction Head_ contribute to In-context Learning (3rd PhD)
Recommendation with Causality (2nd PhD)
Causality101 (Feb 2022, 2nd PhD)
Explaining Neural Networks (Oct 2020 1st PhD)