Nishad Gothoskar

I am currently at GenWeb, where I work on using probabilistic programming to scale up 3D perception. I received my PhD in Computer Science from MIT, advised by Vikash Mansinghka and Josh Tenenbaum. Check out my thesis here. My goal is to build AI vision systems that can learn as rapidly and generalize as broadly as humans.

Experience

Google
SWE Intern
2016

Uber ATG
Software Engineer
2017 - 2018

Vicarious AI
Research Engineer
2018 - 2020

MIT
Graduate Student
2020 - 2025

GenWeb

2025 - Present

Publication Highlights
Bayes3D: fast learning and inference in structured generative models of 3D objects and scenes
Nishad Gothoskar, Matin Ghavami, Eric Li, Aidan Curtis, Michael Noseworthy, Karen Chung, Brian Patton, William T. Freeman, Joshua B. Tenenbaum, Mirko Klukas, Vikash K. Mansinghka
[PDF]

We propose a generative probabilistic programming-based architecture for modeling 3D objects and scenes, and use our architecture to do accurate and robust object pose estimation from RGBD images.

3DP3: 3D Scene Perception via Probabilistic Programs
Nishad Gothoskar, Marco Cusumano-Towner, Ben Zinberg, Matin Ghavamizadeh, Falk Pollok, Austin Garrett, Dan Gutfreund, Joshua B. Tenenbaum, Vikash Mansinghka
NeurIPS, 2021 [MIT News] [PDF]

We propose a generative probabilistic programming-based architecture for modeling 3D objects and scenes, and use our architecture to do accurate and robust object pose estimation from RGBD images.

Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps
Dileep George, Rajeev V. Rikhye, Nishad Gothoskar, J. Swaroop Guntupalli, Antoine Dedieu, Miguel Lazaro-Gredilla
Nature Communications, 2021 [PDF]

Cognitive maps are mental representations of spatial and conceptual relationships in an environment, and are critical for flexible behavior. To form these abstract maps, the hippocampus has to learn to separate or merge aliased observations appropriately in different contexts in a manner that enables generalization and efficient planning. Here we propose a specific higher-order graph structure, clone-structured cognitive graph (CSCG), which forms clones of an observation for different contexts as a representation that addresses these problems.

DURableVS: Data-efficient Unsupervised Recalibrating Visual Servoing via online learning in a structured generative model
Nishad Gothoskar, Miguel Lazaro-Gredilla, Yasemin Bekiroglu, Abhishek Agarwal, Joshua B. Tenenbaum, Vikash K. Mansinghka, Dileep George
ICRA, 2022 [PDF]

In this work, we present a method for unsupervised learning of visual servoing that does not require any prior calibration and is extremely data-efficient. Our key insight is that visual servoing does not depend on identifying the veridical kinematic and camera parameters, but instead only on an accurate generative model of image feature observations from the joint positions of the robot. We demonstrate that with our model architecture and learning algorithm, we can consistently learn accurate models from fewer than 50 training samples (which amounts to less than 1 min of unsupervised data collection), and that such data-efficient learning is not possible with standard neural architectures. Further, we show that by using the generative model in the loop and learning online, we can enable a robotic system to recover from calibration errors and to detect and quickly adapt to possibly unexpected changes in the robot-camera system (e.g., a bumped camera or new objects).

