Gwern — Deep Learning↗
by Gwern Branwen
Neural network research, scaling, and architectures.
12 posts
Delivery order
Each email contains one post, starting with #1
The Kelly Coin-Flipping Game: Exact Solutions
38 min · We can approximate it with our pre-existing value function for a known stopping time/edge/max wealth, sampling from the posterior; for example, we might draw 1000 values from 𝒩(300,25), the Pareto,...
Free-Play Periods for RL Agents
8 min · Proposal for incentivizing meta-learning of exploration in deep reinforcement learning: domain randomization with reward-shaping, where there is a fixed-length ‘play time’ with no rewards/losses at...
Novelty Nets: Classifier Anti-Guidance
10 min · Generative modeling proposal for increasing the diversity of samples: a helper NN memorizes past samples and ‘repels’ new samples away from old ones.
Number Search Engine via NN Embeddings
5 min · Proposal to create a ‘search engine’ like OEIS but for individual numbers, allowing fuzzy lookups, by training a neural net embedding on the scientific & mathematical literature’s corpus of...
Research Ideas
36 min · Choose-Your-Own-Adventure (CYOA) generative fiction for efficiency/editing (2021-06-06); try directly optimizing reward generation (2019-12-16): backpropping reward...
Absolute Unit NNs: Regression-Based MLPs for Everything
7 min · One might wonder: is an AUNN truly the very simplest possible NN architecture? Maybe not. We still have the index input making it more complex.
LLM Daydreaming
10 min · Proposal & discussion of how default mode networks for LLMs are an example of missing capabilities for search and novelty in contemporary AI systems.
‘end-to-end’ directory
1 min · https://arxiv.org/abs/2106.10316#deepmind: “Proper Value Equivalence”, Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh
‘NN sparsity’ directory
2 min · https://arxiv.org/abs/2510.15103#facebook: “Continual Learning via Sparse Memory Finetuning”, Jessy Lin, Luke Zettlemoyer, Gargi Ghosh, Wen-Tau Yih, Aram Markosyan, Vincent-Pierre Berges, Barlas...
‘AI scaling’ directory
32 min · https://www.sciencedirect.com/science/article/pii/S016028962500025X: “Psychometrically Derived 60-Question Benchmarks: Substantial Efficiencies and the Possibility of Human-AI Comparisons”, Gilles...
‘MLP NN’ directory
6 min · https://arxiv.org/abs/2503.24187: “NeuRaLaTeX: A Machine Learning Library Written in Pure LaTeX”, James A. D. Gardner, Will Rowan, William A. P. Smith
‘self-attention’ directory
7 min · https://arxiv.org/abs/2507.02754#facebook: “Fast and Simplex: 2-Simplicial Attention in Triton”, Aurko Roy, Timothy Chou, Sai Surya Duvvuri, Sijia Chen, Jiecao Yu, Xiaodong Wang, Manzil Zaheer,...