Posts
Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study
Using an LLM perplexity filter to detect weight exfiltration
Evaluating Sparse Autoencoders with Board Games
An Intuitive Explanation of Sparse Autoencoders for LLM Interpretability
Manipulating Chess-GPT's World Model
Chess-GPT's Internal World Model