Naman Goyal

Machine Learning Software Engineer, Google DeepMind


SF Bay Area, CA

Exploring the world, one step at a time.

I am a Machine Learning Software Engineer at Google DeepMind, where I work on the Gemini team, helping make Gemini's output more useful and human-centric. My role involves advancing multimodal large language models through applied research and development, with a focus on reasoning, planning, and instruction-following capabilities. I work on LLM-based synthetic data generation to address data scarcity, instruction tuning, reinforcement learning from human feedback (RLHF), and LLM orchestration over tools and knowledge bases.

Previously, I worked on NVIDIA's autonomous vehicle team, developing the perception stack at scale. I designed horizontally scalable pipelines for data preparation and cloud inference, improving DNN training efficiency and resource utilization. Before that, I interned at Apple, working on multimodal learning for Visually Rich Document Understanding, and at Adobe Research, developing adversarially robust training strategies for deep metric learning.

I hold an M.S. in Computer Science from Columbia University (2021-2022) and a B.Tech. in Computer Science from IIT Ropar (2015-2019). In 2025, I was a sponsored speaker at multiple AI conferences and venues, including The AI Conference in San Francisco, the AI Risk Summit, the AI Dev Summit, and Adobe Research World Headquarters, speaking on topics ranging from enterprise AI agents to multimodal AI challenges.

LinkedIn · GitHub

talks

Sep 2025 Adobe Research World Headquarters, San Jose — Architectures for the Next Generation of Enterprise AI Agents
Sep 2025 The AI Conference, San Francisco — The Ascendancy and Challenges of Agentic Large Language Models
Aug 2025 AI Risk Summit, CISO Forum, Half Moon Bay — The Ascendancy and Challenges of Agentic Large Language Models
May 2025 AI Dev Summit, San Francisco — The Dual Edge of Multimodal AI: Advancing Accessibility While Navigating Bias

papers

  1. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
    Gheorghe Comanici, Eric Bieber, Mike Schaekermann, and 3 more authors
    arXiv preprint arXiv:2507.06261, 2025
  2. A survey on Self Supervised learning approaches for improving Multimodal representation learning
    Naman Goyal
    arXiv preprint arXiv:2210.11024, 2022
  3. Graph neural networks for image classification and reinforcement learning using graph representations
    Naman Goyal and David Steiner
    arXiv preprint arXiv:2203.03457, 2022
  4. A comprehensive study of on-device NLP applications–VQA, automated Form filling, Smart Replies for Linguistic Codeswitching
    Naman Goyal
    arXiv preprint arXiv:2409.19010, 2024

latest posts