Publications

Manuel Fernandez Burda, Santiago Aranguri, Iván Arcuschin, Enzo Ferrante (2026). Inference-Time Toxicity Mitigation in Protein Language Models via Logit-Diff Amplification. Workshop on Generative and Experimental Perspectives for Biomolecular Design (ICLR 2026).

Atticus Wang, Iván Arcuschin, Arthur Conmy (2026). Automatically Finding Reward Model Biases. 43rd International Conference on Machine Learning (ICML 2026).

Iván Arcuschin*, David Chanin*, Adrià Garriga-Alonso, Oana-Maria Camburu (2026). Biases in the Blind Spot: Detecting What LLMs Fail to Mention. 43rd International Conference on Machine Learning (ICML 2026).

Arthur Meek, Euan Sprejer, Iván Arcuschin, Austin J. Brockmeier, Steven Basart (2025). Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity. Preprint.

Constantin Venhoff*, Iván Arcuschin*, Philip Torr, Arthur Conmy, Neel Nanda (2025). Base Models Know How to Reason, Thinking Models Learn When. 43rd International Conference on Machine Learning (ICML 2026 Spotlight).

Fazl Barez, Tung-Yu Wu, Iván Arcuschin, Michael Lan, Vincent Wang, Noah Siegel, Nicolas Collignon, Clement Neo, Isabelle Lee, Alasdair Paren, Adel Bibi, Robert Trager, Damiano Fornasiere, John Yan, Yanai Elazar, Yoshua Bengio (2025). Chain-of-Thought Is Not Explainability. Oxford AI Governance Initiative (AIGI).

Aaron Mueller, Atticus Geiger, Sarah Wiegreffe, Dana Arad, Iván Arcuschin, Adam Belfki, Yik Siu Chan, Jaden Fiotto-Kaufman, Tal Haklay, Michael Hanna, Jing Huang, Rohan Gupta, Yaniv Nikankin, Hadas Orgad, Nikhil Prakash, Anja Reusch, Aruna Sankaranarayanan, Shun Shao, Alessandro Stolfo, Martin Tutek, Amir Zur, David Bau, Yonatan Belinkov (2025). MIB: A Mechanistic Interpretability Benchmark. 42nd International Conference on Machine Learning (ICML 2025).

Iván Arcuschin, Jett Janiak, Robert Krzyzanowski, Senthooran Rajamanoharan, Neel Nanda, Arthur Conmy (2025). Chain-of-Thought Reasoning In The Wild Is Not Always Faithful. 43rd International Conference on Machine Learning (ICML 2026).

Constantin Venhoff, Iván Arcuschin, Philip Torr, Arthur Conmy, Neel Nanda (2025). Understanding Reasoning in Thinking Language Models via Steering Vectors. Workshop on Reasoning and Planning for Large Language Models (ICLR 2025).

Rohan Gupta, Iván Arcuschin, Thomas Kwa, Adrià Garriga-Alonso (2024). InterpBench: Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques. 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks.

Michael Auer, Iván Arcuschin Moreno, Gordon Fraser (2024). WallMauer: Robust Code Coverage Instrumentation for Android Apps. 5th ACM/IEEE International Conference on Automation of Software Test (AST 2024).

Iván Arcuschin Moreno, Lisandro Di Meo, Michael Auer, Juan Pablo Galeotti, Gordon Fraser (2024). Brewing Up Reliability: Espresso Test Generation for Android Apps. 17th IEEE International Conference on Software Testing, Verification and Validation (ICST 2024).

Iván Arcuschin Moreno, supervised by Juan Pablo Galeotti (2024). Random Espresso Test Case Generation for Android. PhD Thesis.

Iván Arcuschin Moreno, Christian Ciccaroni, Juan Pablo Galeotti, José Miguel Rojas (2022). On the feasibility and challenges of synthesizing executable Espresso tests. 3rd ACM/IEEE International Conference on Automation of Software Test (AST 2022).

Iván Arcuschin Moreno, Juan Pablo Galeotti, Diego Garberbetsky (2021). An Empirical Study on How Sapienz Achieves Coverage and Crash Detection. Journal of Software: Evolution and Process (JSEP), Volume 35, Issue 4, 2023.

Iván Arcuschin Moreno (2020). Search-Based Test Generation for Android Apps. Companion Proceedings of the 42nd International Conference on Software Engineering (Doctoral Symposium at ICSE 2020).

Iván Arcuschin Moreno, Juan Pablo Galeotti, Diego Garberbetsky (2020). Algorithm or Representation? An Empirical Study on How SAPIENZ Achieves Coverage. 1st ACM/IEEE International Conference on Automation of Software Test (AST 2020).