Transformers readings:
Less-technical explainer: https://thenextweb.com/news/understanding-transformers-the-machine-learning-model-behind-gpt-3-machine-learning-ai-syndication
More-technical explainer: https://jalammar.github.io/illustrated-transformer/
Original paper (unreadable): https://arxiv.org/abs/1706.03762
From the currently dormant Yak Collective ML study group… we did a session on these readings about 6 months ago. Planning to revive it in 2023.
I found this “from scratch” walkthrough to be quite good, if you haven’t come across it: https://e2eml.school/transformers.html