The rapid expansion of large-scale AI systems has created a new problem: they increasingly act in environments that humans never fully specify and never fully understand. A model can be competent inside a controlled workflow, yet become brittle once the real world shifts beneath it.

How do we get AI systems to learn stable, human-aligned behaviors from the messy reality they actually operate in, so that those behaviors transfer to new environments, new distribution shifts, and new operators?

I've explored these questions through reliability, safety, and agent-behavior research at Microsoft Research and Google DeepMind, with a focus on how large models generalize (and fail to generalize) when embedded in real production ecosystems rather than synthetic testbeds.

Recently, I left the frontier-lab path to pursue these questions directly.