We've been accepted for presentation at BlackboxNLP (EMNLP 2024).

LSE.AI

We are an AI research lab focused on mechanistic interpretability of LLMs.

Our Research

Our recent paper on layerwise transfer learning in sparse autoencoders was accepted for publication by ACL and for presentation at BlackboxNLP (EMNLP 2024)!

Objectives
Publish at top venues such as NeurIPS, ACL, ICLR, and ICML, and have a significant impact on the field.
Our Methodology
We improve sparse autoencoders (SAEs) and apply them to open problems across the sciences, with a particular focus on mechanistic interpretability.
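For readers new to the technique, the sketch below shows the basic shape of a sparse autoencoder in PyTorch: a wide dictionary trained to reconstruct model activations under an L1 sparsity penalty. This is a minimal illustrative example, not our actual architecture; the layer sizes and the L1 coefficient are placeholder values.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """A minimal sparse autoencoder: an overcomplete dictionary trained
    to reconstruct LLM activations while keeping features sparse."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        # ReLU keeps feature activations non-negative and sparse.
        features = torch.relu(self.encoder(x))
        reconstruction = self.decoder(features)
        return reconstruction, features

def loss_fn(x, reconstruction, features, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty that encourages each
    # input to activate only a few dictionary features.
    mse = (reconstruction - x).pow(2).mean()
    sparsity = features.abs().mean()
    return mse + l1_coeff * sparsity

# Illustrative sizes: d_model matches the LLM's residual stream;
# the hidden layer is typically several times wider (overcomplete).
sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)
x = torch.randn(32, 768)  # a batch of stand-in activations
recon, feats = sae(x)
loss = loss_fn(x, recon, feats)
```

The trade-off governed by the L1 coefficient is central: too little sparsity and features stay entangled; too much and reconstruction quality collapses.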
Open Problems
Which circuits do LLMs use for a given task, and how do they work? How can SAEs be made cheaper to train? What do LLMs tell us about language?
Benefits of Interpretability
An AI lie detector would significantly reduce AI-related risks. Moreover, interpretable models are well suited to fields like medicine and finance, where confidence in the output is critical.

Our team

We’re a dynamic group of researchers who are passionate about what we do and dedicated to producing the best work we can.