
yorwba 4 months ago, on: Fine-tuning LLMs is a waste of time

In addition to increasing the number of layers, you can also grow the weight matrices and initialize them by tiling with the smaller model's weights: https://neurips.cc/media/neurips-2023/Slides/83968_5GxuY2z.p...
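
To make the tiling idea concrete, here is a minimal NumPy sketch. The function name `grow_by_tiling` and the example shapes are made up for illustration, and this is not taken from the linked slides: it only shows the plain "repeat the small matrix until the big one is full" step. Any rescaling or added noise that a real method might apply on top is left out.

```python
import numpy as np


def grow_by_tiling(small: np.ndarray, new_shape: tuple[int, int]) -> np.ndarray:
    """Fill a matrix of `new_shape` by repeating `small` along both axes.

    Hypothetical helper for illustration only.
    """
    rows, cols = new_shape
    # Copies needed along each axis, rounded up so the target is fully covered.
    reps = (-(-rows // small.shape[0]), -(-cols // small.shape[1]))
    # np.tile repeats the block; the slice trims any overshoot when the
    # target shape is not an exact multiple of the small shape.
    return np.tile(small, reps)[:rows, :cols]


# Stand-in for one weight matrix of the smaller model.
small = np.arange(6, dtype=np.float32).reshape(2, 3)

# Initial value for the corresponding, twice-as-wide matrix in the larger model.
large = grow_by_tiling(small, (4, 6))
```

Every 2x3 block of `large` is a copy of `small`. Tiling alone changes the scale of the layer's outputs, since each output now sums over repeated inputs, so treat this as a starting point rather than a function-preserving initialization.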