Iván Arcuschin Moreno

Independent Researcher

I’m Iván, an Independent Researcher working on AI Safety.

I earned my Computer Science PhD from the University of Buenos Aires, Argentina, specializing in Automated Test Generation for Android apps. My research has been published in top Software Engineering conferences such as ICST and AST.

Since then, I’ve been participating in the ML Alignment & Theory Scholars (MATS) Program, working on AI Safety. In 2024, I worked on InterpBench, a benchmark of semi-synthetic transformers with known circuits for evaluating Mechanistic Interpretability techniques. This work was published at NeurIPS 2024.

In 2025, I’ve been working on the paper Chain-of-Thought Reasoning in the Wild is Not Always Faithful. In this work, we show that frontier LLMs, both thinking and non-thinking, can produce subtly unfaithful reasoning: they can construct superficially coherent yet contradictory arguments, selectively interpreting ambiguous information or switching between different types of arguments to align with their inherent biases. In contrast to the unfaithfulness studied in other research, this form occurs without any explicit prompting.