Yonatan Bitton

Senior Research Scientist at Google, CS PhD

The Hebrew University of Jerusalem

Biography

I am a Senior Research Scientist at Google Research in Tel-Aviv, where I work on multimodal consistency.

My research is centered on improving large vision-and-language models. I develop feedback models for text-to-image and text-to-video applications, specifically designed to enhance the alignment of visual outputs with their corresponding textual prompts. Additionally, I work on multimodal factuality, including visual understanding and image or video-to-text evaluation, ensuring that the generated text is factually correct and attributable to trustworthy textual or visual sources.

I completed my PhD at the Hebrew University of Jerusalem, Israel, where I had the privilege of being advised by Dr. Roy Schwartz and Dr. Gabriel Stanovsky. A recording 🎬 of my PhD talk, "Bridging Vision and Language with Data: From Perception to Understanding", is available here. I did my MSc at Ben Gurion University of the Negev with Prof. Michael Elhadad and Prof. Eitan Bachmat.

Download my complete CV: link.
📄 Download my bio: link.

Education
  • PhD in Computer Science (Vision-and-Language), 2020-2023

    The Hebrew University of Jerusalem, Israel

  • MSc in Computer Science (Natural Language Processing), Magna cum laude, 2018-2019

    Ben Gurion University of the Negev, Israel

  • BSc in Computer Science, 2015-2018

    Ben Gurion University of the Negev, Israel

Students

I've had the opportunity to collaborate with several MSc and PhD students towards their publication goals:

1. Wenbo (Gordon) Hu (University of California, Los Angeles), 1 paper: 3DLLM-Mem
2. Brian Gordon (Tel-Aviv University), 2 papers: Mismatch Quest; Unblocking Detailed Captions
3. Aviv Slobodkin (Bar-Ilan University), 1 paper: RefVNLI
4. Moran Yanuka (Tel-Aviv University), 1 paper: Bridging the Visual Gap
5. Mor Ventura (Technion – Israel Institute of Technology), 1 paper: NL-Eye
6. Orr Zohar (Stanford University), 1 paper: Video-STaR
7. Hritik Bansal (University of California, Los Angeles), 4 papers: VideoPhy2; VideoPhy; TALC; Video-Con
8. Nitzan Bitton-Guetta (Ben-Gurion University of the Negev), 2 papers: WHOOPS!; Visual Riddles
9. Ron Yosef (The Hebrew University of Jerusalem), 2 papers: IRFL; EditInspector
10. Oren Sultan (The Hebrew University of Jerusalem), 1 paper: ParallelPARC
11. Netta Madvil (The Hebrew University of Jerusalem), 1 paper: Read, Look or Listen?

If you'd like to work together on vision-and-language research, send me an email.

Papers by Venue

24 peer-reviewed papers · 2021 – 2025

Work Experience

Senior Research Scientist
Google Research
April 2024 – Present Israel
Advancing multimodal consistency: developing feedback models for text-to-image and text-to-video applications and enhancing multimodal factuality to ensure the accuracy of text generated from visual sources.

Research Scientist
Google Research
Jun 2023 – April 2024 Israel
Focused on vision-and-language research. Recent work includes image-text alignment, improving text-to-image models, and visual instruction tuning.

Research Intern
Google Research
Jul 2022 – Jun 2023 Israel
Cerebra team, Conversational AI. Worked with LLMs (LaMDA, PaLM, Bard, etc.).

Applied Scientist Intern
Amazon Lab126
Oct 2019 – July 2022 Israel
Visual Fitness - Halo team
Developed a virtual fitness trainer, specializing in 2D/3D pose estimation, action recognition, error correction, on-device deployment and more.

Research Student
IBM Research
Jun 2017 – Oct 2019 Israel
Used data science and machine learning methods to detect fraud.

Invited Talks

Bridging Vision and Language with Data: From Perception to Understanding
Commonsense Benchmarks for Vision and Language
q2d: Turning Questions into Dialogs to Teach Models How to Search
WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models
VASR: Visual Analogies of Situation Recognition

Others

Managing Research
This talk addresses several research-related questions, such as finding new research ideas, choosing a research topic, staying up to date with new research, and working with your supervisors.
AirPal
A platform that connects drone pilots with people in need of drone services.
This project took part in the Starter - Jump course and won 1st place at the final Demo Day event.
Press coverage: telecomnews, israeldefense, sheva7.