All posts

Retrospective on my unsupervised elicitation challenge
My unsupervised elicitation challenge
On 'Inventing Temperature' and the realness of properties
Augustine of Hippo's Handbook on Faith, Hope, and Love in Latin (or: Claude as Pandoc++)
Consider not donating under $100 to political candidates
A theory of how alignment research should work
A failure of an argument against sola scriptura
Why keep a diary, and why wish for large language models
Bayesian inference without priors
n of m ring signatures
How to type Aleksander Mądry's last name in LaTeX
If a little is good, is more better?
Watermarking considered overrated?
Difficulties in making powerful aligned AI
On Blogging and Podcasting
Things I carry almost every day, as of late December 2022
Announcing The Filan Cabinet
Takeaways from a survey on AI alignment resources
What’s the chance a smart London resident dies of a Russian nuke in the next month?
A Nice Representation of the Laplacian
The Meta-Puzzle
Even if you're right, you're wrong
Handicapping competitive games
A second example of conditional orthogonality in finite factored sets
A simple example of conditional orthogonality in finite factored sets
Challenge: know everything that the best go bot knows about go
Privacy vs proof of character
Cognitive mistakes I've made about COVID-19
Announcing AXRP, the AI X-risk Research Podcast
Security Mindset and Takeoff Speeds
An Analytic Perspective on AI Alignment
A Personal Rationality Wishlist
Verification and Transparency
Test Cases for Impact Regularisation Methods
Bottle Caps Aren't Optimisers
Mechanistic Transparency for Machine Learning
Insights from 'The Strategy of Conflict'
Topology
A discussion on the usefulness of 538's forecasts
Kelly bettors
