Iván Arcuschin Moreno

Lead Research Scientist

Poseidon Research

I’m Iván, Lead Research Scientist at Poseidon Research working on AI Safety.

I earned my Computer Science PhD from the University of Buenos Aires, Argentina. Since 2024, I’ve focused on AI Safety research, completing two MATS terms (under Adrià Garriga-Alonso at FAR AI and Arthur Conmy at Google DeepMind) and publishing at NeurIPS, ICML, and ICLR workshops.

My research focuses on chain-of-thought interpretability of frontier LLMs, at the intersection of black-box methods (what models say) and white-box methods (what they compute). On the black-box side, I’ve shown that frontier LLMs produce unfaithful chain-of-thought reasoning that is rare but subtle, even without nudge prompting or response editing (ICML 2026, ~180 citations), and I’ve built automated pipelines that uncover the unverbalised biases LLMs hide in their stated reasoning (ICML 2026). On the white-box side, I’ve shown that thinking models like DeepSeek R1 mostly repurpose reasoning mechanisms already present in their base counterparts (ICML 2026 Spotlight), and I’ve collaborated on benchmarks for evaluating mechanistic interpretability techniques (NeurIPS 2024 and ICML 2025).