Naman Goyal

Exploring the world, one step at a time.

I am a Machine Learning Software Engineer on the Gemini team at Google, where I focus on advancing large multimodal language models to improve their reasoning, planning, and instruction-following capabilities. My role involves applied research in synthetic data generation, addressing AI data scarcity, and developing solutions for human-centered, large-scale applications. Previously, at NVIDIA, I contributed to the autonomous vehicle team, optimizing ML flow and workflow designs to maximize the efficiency of computing resources in large-scale training and inference tasks.

My academic background includes an M.S. in Computer Science from Columbia University, where I completed a thesis in Multi-Modal Learning and Natural Language Processing under the guidance of Prof. Kathleen McKeown. I also hold a B.Tech. in Computer Science from the Indian Institute of Technology (IIT), where I graduated with the highest academic rank. Throughout my career, I have had the privilege of interning with leading technology companies such as Apple and Adobe, where I developed and deployed advanced machine learning models to solve complex, multimodal problems in various domains.

news

Jul 08, 2024	Started work at Google DeepMind - Gemini.
Jan 30, 2023	Started work with the Autonomous Vehicles team at NVIDIA.

latest posts

Apr 20, 2025	Bridging the Divide - A Mac User Guide to Productivity with Android
Apr 07, 2025	Gemma 3 Technical Deep Dive - Architecture, Performance, and Implications
Mar 23, 2025	Navigating the Social Scene - A Young Professional's Guide to the South Bay

papers

A survey on Self Supervised learning approaches for improving Multimodal representation learning

Naman Goyal

arXiv preprint arXiv:2210.11024, 2022

Graph neural networks for image classification and reinforcement learning using graph representations

Naman Goyal and David Steiner

arXiv preprint arXiv:2203.03457, 2022

A comprehensive study of on-device NLP applications–VQA, automated Form filling, Smart Replies for Linguistic Codeswitching

Naman Goyal

arXiv preprint arXiv:2409.19010, 2024
