| | NeurIPS 2025 Best Papers in Comics: From Artificial Hivemind to 1000-Layer RL (gonzoml.substack.com) |
| 3 points by che_shr_cat 41 days ago | past |
|
| | Visualizing Research: How I Use Gemini 3.0 to Turn Papers into Comics (gonzoml.substack.com) |
| 1 point by che_shr_cat 48 days ago | past |
|
| | Tiny Recursive Model (TRM) vs. Hierarchical Reasoning Model (HRM) (gonzoml.substack.com) |
| 2 points by che_shr_cat 84 days ago | past |
|
| | Stochastic Activations (gonzoml.substack.com) |
| 2 points by che_shr_cat 3 months ago | past |
|
| | V-JEPA 2: Scaling V-JEPA (gonzoml.substack.com) |
| 2 points by che_shr_cat 4 months ago | past |
|
| | Tversky Neural Networks (gonzoml.substack.com) |
| 131 points by che_shr_cat 4 months ago | past | 12 comments |
|
| | Paper FOMO and ICML 2025 Outstanding Papers (gonzoml.substack.com) |
| 1 point by che_shr_cat 5 months ago | past |
|
| | Darwin Gödel Machine (gonzoml.substack.com) |
| 1 point by che_shr_cat 7 months ago | past |
|
| | Are Deeper LLMs Smarter, or Just Longer? (gonzoml.substack.com) |
| 3 points by che_shr_cat 7 months ago | past |
|
| | Muon Optimizer Accelerates Grokking (gonzoml.substack.com) |
| 8 points by che_shr_cat 8 months ago | past |
|
| | ThoughtTerminator (gonzoml.substack.com) |
| 2 points by che_shr_cat 8 months ago | past |
|
| | Chain of Continuous Thought (Coconut) (gonzoml.substack.com) |
| 3 points by che_shr_cat 8 months ago | past |
|
| | Intuitive Physics Emergence in V-JEPA (gonzoml.substack.com) |
| 1 point by che_shr_cat 9 months ago | past |
|
| | BLT: Byte Latent Transformer (gonzoml.substack.com) |
| 4 points by che_shr_cat on Dec 26, 2024 | past |
|
| | A Single 'Super Weight' Can Break Your Billion-Parameter Model (gonzoml.substack.com) |
| 2 points by che_shr_cat on Nov 29, 2024 | past |
|
| | Jax Things to Watch for in 2025 (gonzoml.substack.com) |
| 1 point by che_shr_cat on Nov 26, 2024 | past |
|
| | Diffusion models are evolutionary algorithms (gonzoml.substack.com) |
| 126 points by che_shr_cat on Nov 9, 2024 | past | 27 comments |
|
| | Make Softmax Great Again (gonzoml.substack.com) |
| 2 points by che_shr_cat on Nov 6, 2024 | past |
|
| | Deep Learning Frameworks: The Fourth Pillar of Deep Learning Revolution (gonzoml.substack.com) |
| 1 point by che_shr_cat on Nov 5, 2024 | past |
|
| | TextGrad: Automatic "Differentiation" via Text (gonzoml.substack.com) |
| 3 points by che_shr_cat on June 26, 2024 | past |
|
| | Superconducting Supercomputers (gonzoml.substack.com) |
| 1 point by che_shr_cat on June 24, 2024 | past |
|
| | Decoder-decoder architecture is coming (gonzoml.substack.com) |
| 2 points by che_shr_cat on June 2, 2024 | past |
|
| | Chronos: Using Pretrained LLMs for Probabilistic Time Series Forecasting (gonzoml.substack.com) |
| 2 points by che_shr_cat on April 28, 2024 | past |
|
| | Big Post About Big Context (gonzoml.substack.com) |
| 49 points by che_shr_cat on Feb 29, 2024 | past | 19 comments |
|
| | Neural Network Diffusion (gonzoml.substack.com) |
| 1 point by che_shr_cat on Feb 26, 2024 | past |
|
| | Thermodynamic AI is getting hotter (gonzoml.substack.com) |
| 51 points by che_shr_cat on Feb 8, 2024 | past | 5 comments |
|
| | Training LLMs with AMD GPUs on Frontier Supercomputer (gonzoml.substack.com) |
| 1 point by che_shr_cat on Jan 16, 2024 | past |
|
| | Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Law (gonzoml.substack.com) |
| 1 point by che_shr_cat on Jan 8, 2024 | past |
|
| | Project CETI (gonzoml.substack.com) |
| 2 points by che_shr_cat on Dec 17, 2023 | past |
|
| | GonzoML on Mamba and S6 (+previous post on S4) (gonzoml.substack.com) |
| 1 point by che_shr_cat on Dec 13, 2023 | past |
|