
My view is to focus on doing stuff. That's what I did. Pick a task you want the model to do: try fine-tuning Llama, playing with the OpenAI APIs, etc., googling and asking GPT along the way.

Foundation-model training has gotten so expensive that unless you can get hired by a company that owns a nuclear power plant's worth of GPUs, you are not going to get any "research" done. And as the area became white-hot, those companies now have more available talent than hardware. So getting into the practitioner space is the best way to get productive with these models. And you improve as a practitioner by practicing, not by reading papers.

If you're at the computer, your time is best spent writing code and interacting with those models, in my opinion. When I can't be (e.g. on a commute), I listen to talks instead (e.g. https://www.youtube.com/watch?v=zjkBMFhNj_g, anything from Karpathy on YouTube, or the https://www.youtube.com/@YannicKilcher channel).


