transformer-architecture

Coverage of the neural network design that underpins modern language models, from attention mechanisms and embeddings to the layered structure that turns tokens into predictions. Expect breakdowns of how transformers process context, why they replaced earlier sequence models, and how architectural choices shape model behavior. Content connects the theory to practical concerns like running models locally and understanding what happens under the hood during inference.

Laptop on a desk with a small llama figurine, running a local LLM at home with Ollama.

local-llms, ollama, transformer-architecture, llm-tutorial

The Power of Llama - Part 1: The Brain, the Engine, and Your First Llama on Ollama

Get started with local LLMs! This post breaks down how the transformer architecture works, why you need an inference engine to run models, and how to set up your first local LLM using Ollama.