Marius Binner
Position
PhD candidate, Artificial Intelligence
Affiliation
Research groups
Short info
Working on technical AI safety, currently the mechanistic interpretability of agentic frontier models. Frontier AIs can autonomously perform increasingly long-horizon tasks: https://arxiv.org/abs/2503.14499
It would be nice to have ways to ensure they're not dangerous.