I’m Iván, Lead Research Scientist at Poseidon Research working on AI Safety.
I earned my Computer Science PhD from the University of Buenos Aires, Argentina. Since 2024, I’ve focused on AI Safety research through the MATS program, publishing at NeurIPS, ICML, and ICLR.
My research centers on understanding and evaluating the internal reasoning of large language models. In 2024, I co-created InterpBench (NeurIPS 2024), a benchmark for evaluating Mechanistic Interpretability techniques.
My work on Chain-of-Thought unfaithfulness (ICLR 2025 Workshop) showed that frontier LLMs produce subtly unfaithful reasoning under realistic conditions; it has accumulated 137 citations in under a year.