Naman Goyal

Machine Learning Software Engineer, Google DeepMind


SF Bay Area, CA

Exploring the world, one step at a time.

I am a Machine Learning Software Engineer at Google DeepMind, where I work on the Gemini team, helping make Gemini's output more useful and human-centric. My role involves advancing multimodal large language models through applied research and development, with a focus on reasoning, planning, and instruction-following capabilities. I work on LLM-based synthetic data generation to address data scarcity, instruction tuning, reinforcement learning from human feedback (RLHF), and LLM orchestration over tools and knowledge bases.

Previously, I worked on NVIDIA's autonomous vehicle team, developing the perception stack at scale. I designed horizontally scalable pipelines for data preparation and cloud inference, improving DNN training efficiency and resource utilization. Before that, I interned at Apple, working on multimodal learning for Visually Rich Document Understanding, and at Adobe Research, developing adversarially robust training strategies for deep metric learning.

I hold an M.S. in Computer Science from Columbia University (2021-2022) and a B.Tech. in Computer Science from IIT Ropar (2015-2019). In 2025, I was a sponsored speaker at multiple AI conferences and venues, including The AI Conference in San Francisco, the AI Risk Summit, the AI Dev Summit, and Adobe Research World Headquarters, speaking on topics ranging from enterprise AI agents to multimodal AI challenges.

LinkedIn · GitHub

talks

Sep 2025 Adobe Research World Headquarters, San Jose — Architectures for the Next Generation of Enterprise AI Agents
Sep 2025 The AI Conference, San Francisco — The Ascendancy and Challenges of Agentic Large Language Models
Aug 2025 AI Risk Summit, CISO Forum, Half Moon Bay — The Ascendancy and Challenges of Agentic Large Language Models
May 2025 AI Dev Summit, San Francisco — The Dual Edge of Multimodal AI: Advancing Accessibility While Navigating Bias

papers

  1. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
    Gheorghe Comanici, Eric Bieber, Mike Schaekermann, and 3 more authors
    arXiv preprint arXiv:2507.06261, 2025
  2. A survey on Self Supervised learning approaches for improving Multimodal representation learning
    Naman Goyal
    arXiv preprint arXiv:2210.11024, 2022
  3. Graph neural networks for image classification and reinforcement learning using graph representations
    Naman Goyal and David Steiner
    arXiv preprint arXiv:2203.03457, 2022
  4. A comprehensive study of on-device NLP applications–VQA, automated Form filling, Smart Replies for Linguistic Codeswitching
    Naman Goyal
    arXiv preprint arXiv:2409.19010, 2024

latest posts