Hello! I’m a Staff Research Scientist and Tech Lead at Google Research, where I build factually grounded Large Language Models. My focus is on making Gemini both more knowledgeable and less prone to hallucinations. My research spans the full development cycle of LLMs, from designing pre-training strategies to reinforcement learning techniques that help models better distinguish fact from fiction.

I’m fascinated by a fundamental question: how do language models actually ‘know’ things? My research interests center on how LLMs acquire, represent, and apply knowledge to reason about the world. I’m particularly interested in the mechanisms by which they internalize factual information during pre-training and how that knowledge can be reliably extracted to enable reasoning.

I completed a PhD in Natural Language Processing at Tel Aviv University, supported by a Google PhD Fellowship. My doctoral research, as well as my three internships at Google, focused on reasoning over structured data. Prior to my PhD, I was a Research Staff Member at IBM Research, where I worked on deep learning for language understanding. My research has been recognized with best paper awards at INLG and NAACL.

Feel free to reach out if you are interested in collaborating!

📜 Publications


arXiv 2025 Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Gemini Team, ..., Jonathan Herzig and others
PDF
COLM 2025 Inside-out: Hidden factual knowledge in LLMs
Zorik Gekhman, Eyal Ben David, Hadas Orgad, Eran Ofek, Yonatan Belinkov, Idan Szpektor, Jonathan Herzig, Roi Reichart
PDF
arXiv 2025 DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs
Arie Cattan, Alon Jacovi, Ori Ram, Jonathan Herzig, Roee Aharoni, Sasha Goldshtein, Eran Ofek, Idan Szpektor, Avi Caciularu
PDF GitHub Repository
Findings of ACL 2024 Multilingual instruction tuning with just a pinch of multilinguality
Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
PDF
ACL 2024 A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva
PDF Project page
arXiv 2024 Constructing benchmarks and interventions for combating hallucinations in LLMs
Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov
PDF GitHub Repository
EMNLP 2024 Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman, Gal Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, Jonathan Herzig
PDF
NeurIPS 2024 TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools
Avi Caciularu, Alon Jacovi, Eyal Ben-David, Sasha Goldshtein, Tal Schuster, Jonathan Herzig, Gal Elidan, Amir Globerson
PDF Project page
ICML 2024 Representation Surgery: Theory and Practice of Affine Steering
Shashwat Singh, Shauli Ravfogel, Jonathan Herzig, Roee Aharoni, Ryan Cotterell, Ponnurangam Kumaraguru
PDF GitHub Repository
arXiv 2024 Can few-shot work in long-context? Recycling the context to generate demonstrations
Arie Cattan, Alon Jacovi, Alex Fabrikant, Jonathan Herzig, Roee Aharoni, Hannah Rashkin, Dror Marcus, Avinatan Hassidim, Yossi Matias, Idan Szpektor, Avi Caciularu
PDF
arXiv 2024 Distinguishing ignorance from error in LLM hallucinations
Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov
PDF GitHub Repository
Findings of ACL 2023 mFACE: Multilingual Summarization with Factual Consistency Evaluation
Roee Aharoni, Shashi Narayan, Joshua Maynez, Jonathan Herzig, Elizabeth Clark, Mirella Lapata
PDF GitHub Repository
NeurIPS 2023 What You See is What You Read? Improving Text-Image Alignment Evaluation
Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor
PDF Project page
EMNLP 2023 TrueTeacher: Learning factual consistency evaluation with large language models
Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor
PDF GitHub Repository
EMNLP 2023 Evaluating and Modeling Attribution for Cross-Lingual Question Answering
Benjamin Muller, John Wieting, Jonathan H Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang
PDF GitHub Repository
GEM Workshop 2023 QAMPARI: A benchmark for open-domain questions with many answers
Samuel Amouyal, Tomer Wolfson, Ohad Rubin, Ori Yoran, Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
Findings of EMNLP 2023 A Comprehensive Evaluation of Tool-Assisted Generation Strategies
Alon Jacovi, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva
PDF
Compendium of Neurosymbolic Artificial Intelligence 2023 Latent Trees for Compositional Generalization
Jonathan Herzig, Jonathan Berant, Ben Bogin
PDF
NAACL 2022 Learning To Retrieve Prompts for In-Context Learning
Ohad Rubin, Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
NAACL 2022 TRUE: Re-evaluating Factual Consistency Evaluation
Or Honovich, Roee Aharoni, Jonathan Herzig, Hagai Taitelbaum, Doron Kukliansy, Vered Cohen, Thomas Scialom, Idan Szpektor, Avinatan Hassidim, Yossi Matias
PDF GitHub Repository
EMNLP 2022 Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
Linlu Qiu, Peter Shaw, Panupong Pasupat, Tianze Shi, Jonathan Herzig, Emily Pitler, Fei Sha, Kristina Toutanova
PDF
arXiv 2022 Attributed question answering: Evaluation and modeling for attributed large language models
Bernd Bohnet, Vinh Q Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster
PDF GitHub Repository
ACL 2021 Span-based semantic parsing for compositional generalization
Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
NAACL 2021 Open Domain Question Answering over Tables via Dense Retrieval
Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
PDF GitHub Repository
arXiv 2021 Unlocking compositional generalization in pre-trained models using intermediate representations
Jonathan Herzig, Peter Shaw, Ming-Wei Chang, Kelvin Guu, Panupong Pasupat, Yuan Zhang
PDF GitHub Repository
EMNLP 2021 Finding needles in a haystack: Sampling Structurally-diverse Training Sets from Synthetic Data for Compositional Generalization
Inbar Oren, Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
ACL 2020 TAPAS: Weakly Supervised Table Parsing via Pre-training
Jonathan Herzig, Paweł Nowak, Thomas Müller, Francesco Piccinno, Julian Martin Eisenschlos
PDF GitHub Repository
Findings of EMNLP 2020 Improving Compositional Generalization in Semantic Parsing
Inbar Oren, Jonathan Herzig*, Nitish Gupta*, Matt Gardner, Jonathan Berant
PDF GitHub Repository
NAACL 2019 CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge (Best Resource Paper Award)
Alon Talmor*, Jonathan Herzig*, Nicholas Lourie, Jonathan Berant
PDF GitHub Repository
NAACL 2019 Value-based Search in Execution Space for Mapping Instructions to Programs
Dor Muhlgay, Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
ACL 2019 TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks
Guy Lev, Michal Shmueli-Scheuer, Jonathan Herzig, Achiya Jerbi, David Konopnicki
PDF GitHub Repository
*SEM 2019 Bot2vec: Learning representations of chatbots
Jonathan Herzig, Tommy Sandbank, Michal Shmueli-Scheuer, David Konopnicki
PDF
UMAP 2019 Detecting persuasive arguments based on author-reader personality traits and their interaction
Michal Shmueli-Scheuer, Jonathan Herzig, David Konopnicki, Tommy Sandbank
PDF
EMNLP 2019 Don't paraphrase, detect! Rapid and Effective Data Collection for Semantic Parsing
Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
EMNLP 2019 A Summarization System for Scientific Documents
Shai Erera, Michal Shmueli-Scheuer, Guy Feigenblat, Ora Peled Nakash, Odellia Boni, Haggai Roitman, Doron Cohen, Bar Weiner, Yosi Mass, Or Rivlin, Guy Lev, Achiya Jerbi, Jonathan Herzig, Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, Debasis Ganguly, David Konopnicki
PDF
EMNLP 2018 Decoupling Structure and Lexicon for Zero-Shot Semantic Parsing
Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
NAACL 2018 Detecting Egregious Conversations between Customers and Virtual Agents
Tommy Sandbank, Michal Shmueli-Scheuer, David Konopnicki, Jonathan Herzig, John Richards, David Piorkowski
PDF
IUI 2018 On the Expression of Agent Emotions in Customer Support Dialogs in Social Media
Michal Shmueli-Scheuer, Jonathan Herzig, Tommy Sandbank, David Konopnicki
PDF
ACL 2017 Neural semantic parsing over multiple knowledge-bases
Jonathan Herzig, Jonathan Berant
PDF GitHub Repository
IUI 2017 EHCTool: Managing emotional hotspots for conversational agents
Tommy Sandbank, Michal Shmueli-Scheuer, Jonathan Herzig, David Konopnicki, Rottem Shaul
PDF
INLG 2017 Neural response generation for customer service based on personality traits (Best Short Paper Award)
Jonathan Herzig, Michal Shmueli-Scheuer, Tommy Sandbank, David Konopnicki
PDF
ICTIR 2017 Emotion detection from text via ensemble classification using word embeddings
Jonathan Herzig, Michal Shmueli-Scheuer, David Konopnicki
PDF
CSCW 2016 I understand your frustration
Guy Feigenblat, David Konopnicki, Michal Shmueli-Scheuer, Jonathan Herzig, Hen Shkedi
PDF
UMAP 2016 Predicting customer satisfaction in customer support conversations in social media using affective features
Jonathan Herzig, Guy Feigenblat, Michal Shmueli-Scheuer, David Konopnicki, Anat Rafaeli
PDF
SIGDial 2016 Classifying emotions in customer support dialogues in social media
Jonathan Herzig, Guy Feigenblat, Michal Shmueli-Scheuer, David Konopnicki, Anat Rafaeli, Daniel Altman, David Spivak
PDF
IEEE Transactions on Biomedical Engineering 2014 Monitoring cardiac stress using features extracted from S1 heart sounds
Jonathan Herzig, Amitai Bickel, Arie Eitan, Nathan Intrator
PDF
Hypertext 2014 An author-reader influence model for detecting topic-based influencers in social media
Jonathan Herzig, Yosi Mass, Haggai Roitman
PDF
ImmersiveMe 2014 Mindful: A Platform for Large-Scale Affective Field Research
Guy Feigenblat, Jonathan Herzig, Michal Shmueli-Scheuer, David Konopnicki
PDF
IBM Journal of Research and Development 2013 A statistical approach to mining customers' conversational data from social media
David Konopnicki, Michal Shmueli-Scheuer, Doron Cohen, Benjamin Sznajder, Jonathan Herzig, Ariel Raviv, N Zwerling, Haggai Roitman, Yosi Mass
PDF
PCI 2013 EventSense: Capturing the pulse of large-scale events by mining social media streams
Emmanouil Schinas, Symeon Papadopoulos, Sotiris Diplaris, Yiannis Kompatsiaris, Yosi Mass, Jonathan Herzig, Lazaros Boudakidis
PDF