<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Sorry Engineering by Rafal Makara: AI Learning Notes]]></title><description><![CDATA[I've been using AI for a while—ended up running AI tasks by talking to my watch. However, after an experiment with TensorFlow in 2015, I skipped the basics needed for real engineering. These short learning notes are my way of catching up, focusing on theory rather than usage. Watch out! Some of them might even be wrong—that’s part of learning!]]></description><link>https://www.sorryengineering.com/s/ai-learning-notes</link><image><url>https://www.sorryengineering.com/img/substack.png</url><title>Sorry Engineering by Rafal Makara: AI Learning Notes</title><link>https://www.sorryengineering.com/s/ai-learning-notes</link></image><generator>Substack</generator><lastBuildDate>Wed, 06 May 2026 23:15:14 GMT</lastBuildDate><atom:link href="https://www.sorryengineering.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Rafal Makara]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[rafalmakara@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[rafalmakara@substack.com]]></itunes:email><itunes:name><![CDATA[Rafal Makara]]></itunes:name></itunes:owner><itunes:author><![CDATA[Rafal Makara]]></itunes:author><googleplay:owner><![CDATA[rafalmakara@substack.com]]></googleplay:owner><googleplay:email><![CDATA[rafalmakara@substack.com]]></googleplay:email><googleplay:author><![CDATA[Rafal Makara]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Model Architecture: seq2seq]]></title><description><![CDATA[A model architecture defines how a machine learning model is 
structured&#8212;how data flows through it, how different components interact, and how it makes predictions.]]></description><link>https://www.sorryengineering.com/p/model-architecture-seq2seq</link><guid isPermaLink="false">https://www.sorryengineering.com/p/model-architecture-seq2seq</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Wed, 12 Feb 2025 23:58:09 GMT</pubDate><content:encoded><![CDATA[<p>A model architecture defines how a machine learning model is structured&#8212;how data flows through it, how different components interact, and how it makes predictions.</p><p>There have been quite a few model architectures. Right now, the Transformer architecture dominates the field.</p><p>But before the Transformer, Seq2Seq (Sequence-to-Sequence) was the big thing.</p><h2><strong>How does Seq2Seq work?</strong></h2><p>Seq2Seq is built with two main components:</p><p>&#8226; <strong>Encoder</strong>: Processes the input.</p><p>&#8226; <strong>Decoder</strong>: Generates the output.</p><p>Both work with sequences of tokens and, in the classic approach, use Recurrent Neural Networks (RNNs) or their more powerful versions&#8212;LSTMs and GRUs.</p><ol><li><p>The <strong>encoder</strong> reads the input sequence step by step, updating its hidden state at each time step.</p></li><li><p>The <strong>final hidden state</strong> after processing the last input token represents the entire input sequence.</p></li><li><p>The <strong>decoder</strong> receives this final hidden state as its initial state and starts generating the output sequence, token by token.</p></li></ol><p>The metaphor used in the "AI Engineering" book is: working with the final hidden state is like answering questions about a book based on just the summary. The final hidden state tries to capture everything from the input, but some details may get lost.</p><p>Since RNNs work sequentially, we must process the entire input before generating even the first output token. 
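</p><p>The three steps above can be sketched as a toy Python loop. This is only an illustration of the data flow: the function names and the arithmetic inside are made up for this sketch, while a real Seq2Seq model uses trained LSTM/GRU weights.</p>

```python
# Toy sketch of the Seq2Seq data flow. Not a real neural network: the arithmetic
# below just mimics the *shape* of the computation.

def encode(tokens, state_size=4):
    """Encoder: fold the input, step by step, into one fixed-size hidden state."""
    state = [0.0] * state_size
    for token in tokens:
        x = sum(ord(c) for c in token)  # stand-in for the token's embedding
        for i in range(state_size):
            # stand-in for the RNN update h_t = f(h_{t-1}, x_t)
            state[i] = 0.5 * state[i] + ((x * (i + 1)) % 97) / 97.0
    return state  # the final hidden state: everything the decoder will ever see

def decode(state, length=3, vocab=("le", "chat", "noir", "dort")):
    """Decoder: generate output tokens from the final hidden state alone."""
    out = []
    for step in range(length):
        # stand-in for "pick the most likely next token given the state"
        idx = int(sum(state) * 31 * (step + 1)) % len(vocab)
        out.append(vocab[idx])
    return out

hidden = encode(["the", "black", "cat"])
print(decode(hidden))  # the decoder never sees the input tokens, only `hidden`
```

<p>Notice the bottleneck: <code>decode</code> receives only the fixed-size <code>hidden</code> list, no matter how long the input was.</p><p>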
The longer the input, the longer we wait before we get anything in return. It doesn&#8217;t create the best UX for the chatbots.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NlO_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NlO_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 424w, https://substackcdn.com/image/fetch/$s_!NlO_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 848w, https://substackcdn.com/image/fetch/$s_!NlO_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 1272w, https://substackcdn.com/image/fetch/$s_!NlO_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NlO_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png" width="1456" height="199" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:199,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:70798,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NlO_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 424w, https://substackcdn.com/image/fetch/$s_!NlO_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 848w, https://substackcdn.com/image/fetch/$s_!NlO_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 1272w, https://substackcdn.com/image/fetch/$s_!NlO_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41e9dff0-89bd-422d-bfb5-8eaa8bfa502d_2404x328.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>After some time, the problem of not looking back at original input tokens (but only at the Final Hidden State) got solved by the Attention Mechanism. But this is a story for another time. </p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly). 
Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4</em></p></li><li><p><em>https://d2l.ai/chapter_recurrent-modern/seq2seq.html</em></p></li><li><p><em>https://arxiv.org/abs/1409.3215v3</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[What is Inference Optimization?]]></title><description><![CDATA[It sounds like a smart term.]]></description><link>https://www.sorryengineering.com/p/what-is-inference-optimization</link><guid isPermaLink="false">https://www.sorryengineering.com/p/what-is-inference-optimization</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sat, 08 Feb 2025 22:53:41 GMT</pubDate><content:encoded><![CDATA[<p>It sounds like a smart term. In simple words, it is about making models faster and cheaper. It involves techniques to reduce computational costs, latency, and memory usage while maintaining or improving model accuracy.</p><p>If a chatbot generates a response of 300 tokens, with each token taking 10 milliseconds to produce, the user experience will be significantly better than if each token took 100 milliseconds to generate. </p><p>At this moment, I do not know much about inference optimization techniques. Perplexity lists techniques such as: pruning, quantization, knowledge distillation, weight sharing, low-rank factorization, early exit mechanisms, deployment strategy, caching and memoization, parallelism and batching. And probably more.</p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly). 
Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4</em></p></li><li><p><em>Perplexity</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Dataset Engineering]]></title><description><![CDATA[Dataset engineering refers to designing, collecting, curating, generating, annotating, and optimizing the data needed for training and adapting AI models.]]></description><link>https://www.sorryengineering.com/p/dataset-engineering</link><guid isPermaLink="false">https://www.sorryengineering.com/p/dataset-engineering</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sat, 08 Feb 2025 22:40:49 GMT</pubDate><content:encoded><![CDATA[<p>Dataset engineering refers to designing, collecting, curating, generating, annotating, and optimizing the data needed for training and adapting AI models.</p><p>Imagine two problem types: classification and open chat response.</p><p>In <strong>closed-ended models</strong>, such as traditional classification models, dataset engineering is straightforward. For example, labeling an image as &#8220;cat&#8221; or &#8220;not a cat&#8221; is a well-defined task with clear ground truth.</p><p>However, in <strong>open-ended models</strong>, such as foundation models, dataset engineering becomes more complex. Since these models (through, e.g., a chatbot UI) can generate responses in an almost unlimited number of ways, it extends beyond simple labeling. Instead, dataset engineering focuses on tasks like deduplication, tokenization, context retrieval, quality control, and removal of sensitive information.</p><p>How do you prepare datasets so a specific model can train effectively on them?</p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly). 
Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4</em></p></li><li><p><em>Additional research</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Pre-training, Finetuning, Post-training]]></title><description><![CDATA[Training is deeply associated with the process of adjusting the model weights.]]></description><link>https://www.sorryengineering.com/p/pre-training-finetuning-post-training</link><guid isPermaLink="false">https://www.sorryengineering.com/p/pre-training-finetuning-post-training</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sat, 08 Feb 2025 22:14:20 GMT</pubDate><content:encoded><![CDATA[<p>Training is deeply associated with the process of adjusting the model weights. While prompt engineering influences the output by modifying the given input/context, it doesn&#8217;t change the model weights.</p><p><strong>Pre-training</strong>: training a model from scratch. Model weights are initialised, then a huge amount of training data is processed to adjust them. This is the most resource-intensive phase. </p><p><strong>Finetuning</strong>: continuation of training with the weights obtained from previous training sessions. This process usually uses a much smaller or more specialised dataset.</p><p><strong>Post-training</strong>: from one perspective, finetuning and post-training are the same, as both happen after the model is pre-trained and both aim to improve the model. </p><h2>Finetuning vs Post-training</h2><p>Then, what&#8217;s the difference between finetuning and post-training? </p><p>I am not sure if this is a widely accepted definition, but in the context of foundation models, according to the book, finetuning is done by users of foundation models, while post-training is done by the foundation model engineers. There might also be a difference in goal. 
</p><p>If you&#8217;re building your end-user facing application on top of OpenAI, and you decide to adjust the weights of the existing model, then you do finetuning. At this point, your finetuning will probably target your specific use cases (e.g. domain) to make the model more accurate and knowledgeable in your context. </p><p>If you&#8217;re building your end-user facing application on top of OpenAI, and they decide to adjust the weights of the existing model, then they do post-training. For example, they can apply Reinforcement Learning from Human Feedback (RLHF) to align the model with human values, ethical principles, and intended use cases.</p><p>The last phase, post-training, which may use RLHF, doesn&#8217;t necessarily adjust the weights; it might apply some output filtering techniques instead. So, training is not always about adjusting the weights?</p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly). Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4</em></p></li><li><p><em>Additional research</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Basic Layers of AI Systems]]></title><description><![CDATA[We&#8217;ve had a lot of time to get used to lots of architectural approaches for building applications.]]></description><link>https://www.sorryengineering.com/p/basic-layers-of-ai-systems</link><guid isPermaLink="false">https://www.sorryengineering.com/p/basic-layers-of-ai-systems</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sat, 08 Feb 2025 21:43:49 GMT</pubDate><content:encoded><![CDATA[<p>We&#8217;ve had a lot of time to get used to lots of architectural approaches for building applications. In the AI world, there are a few as well. 
The most basic one seems to be:</p><ul><li><p>Application Development Layer</p><ul><li><p>AI Interface</p></li><li><p>Prompt engineering</p></li><li><p>Context construction</p></li><li><p>Evaluation</p></li></ul></li><li><p>Model Development Layer</p><ul><li><p>Inference optimization</p></li><li><p>Dataset engineering</p></li><li><p>Modeling &amp; training</p></li><li><p>Evaluation</p></li></ul></li><li><p>Infrastructure Layer</p><ul><li><p>Compute management</p></li><li><p>Data management</p></li><li><p>Serving</p></li><li><p>Monitoring</p></li></ul></li></ul><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly). Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[First thoughts on AI Metrics]]></title><description><![CDATA[While the non-AI software engineering world is still getting used to metrics, metrics are an important part of the AI world as well.]]></description><link>https://www.sorryengineering.com/p/first-thought-on-ai-metrics</link><guid isPermaLink="false">https://www.sorryengineering.com/p/first-thought-on-ai-metrics</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sat, 08 Feb 2025 21:38:56 GMT</pubDate><content:encoded><![CDATA[<p>While the non-AI software engineering world is still getting used to metrics, metrics are an important part of the AI world as well.</p><p>These might include: </p><ul><li><p>Quality of the chatbot responses. Thumbs up. Thumbs down.</p></li><li><p>Chat message follow-up rate. </p></li><li><p>Accuracy. Precision. Recall. F1 Score. Hallucination rate.</p></li><li><p>Inference Latency. Time to first token. Time per full answer. Time per token.</p></li><li><p>P70, P95 tokens per response. Tokens per prompt.</p></li><li><p>Memory usage. GPU/CPU Utilization.</p></li><li><p>Cost per job.</p></li><li><p>Median, P75 of agents involved in a job.</p></li><li><p>Energy consumption.</p></li><li><p>Labeling accuracy. 
</p></li><li><p>Error rates.</p></li></ul><p>There are more. Will be more. Some will be valuable in a given context and time. The other ones will be pointless. </p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly). Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4</em></p></li><li><p><em>Additional research</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Agents, learnings from Anthropic]]></title><description><![CDATA[Agents, Agentic Workflows and Workflow Patterns]]></description><link>https://www.sorryengineering.com/p/agents</link><guid isPermaLink="false">https://www.sorryengineering.com/p/agents</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Wed, 05 Feb 2025 23:16:59 GMT</pubDate><content:encoded><![CDATA[<h2>What&#8217;s an Agent?</h2><p>Agents function like workflows, but with AI deciding what process or tool to use next. </p><p>Most agents are just LLMs using external tools&#8212;taking in data, making decisions, and acting accordingly. They are LLMs with enhancements like retrieval, tool integration, and memory.</p><p>There&#8217;s a trade-off: higher costs, more compute, and sometimes slower execution. But in return, you gain flexibility&#8212;fewer hardcoded conditions and more dynamic problem-solving.</p><h2>Agentic Workflows</h2><p>Different patterns shape how agentic workflows operate. 
A few examples:</p><ul><li><p><strong>Prompt Chaining:</strong> Breaks tasks into steps for better results.<br><em>Example: Generate marketing copy &#8594; Translate &#8594; Format.</em></p></li><li><p><strong>Routing:</strong> Directs tasks to the right process.<br><em>Example: FAQ bot for general queries, automation for refunds, AI for tech support.</em></p></li><li><p><strong>Sectioning:</strong> Splits work across multiple models.<br><em>Example: One model generates responses, another moderates content.</em></p></li><li><p><strong>Voting:</strong> Runs multiple times for accuracy, enabling models to challenge each other.<br><em>Example: Content moderation using three models to balance false positives.</em></p></li><li><p><strong>Orchestrator-Workers:</strong> A central LLM assigns tasks to worker models&#8212;similar to map-reduce for LLMs.<br><em>Example: Separate research workers doing the research, which gets aggregated later.</em></p></li><li><p><strong>Evaluator-Optimizer:</strong> One LLM generates, another evaluates and refines, creating a self-improvement loop.<br><em>Example: One model prepares feedback, the other improves the output, execution by execution.</em></p></li></ul><p>Agentic workflows have best practices, design patterns, and bad smells&#8212;just like coding.</p><div><hr></div><p>Sources:</p><ul><li><p>https://www.anthropic.com/research/building-effective-agents</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Workflows: Learnings from Gumloop]]></title><description><![CDATA[Subflows, simple UIs, Chrome Extension, Custom Nodes]]></description><link>https://www.sorryengineering.com/p/learnings-from-gumloop</link><guid isPermaLink="false">https://www.sorryengineering.com/p/learnings-from-gumloop</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Tue, 04 Feb 2025 22:16:14 GMT</pubDate><content:encoded><![CDATA[<p>I watched <a href="https://www.youtube.com/watch?v=QFc7jXZ2pdE">this Gumloop presentation</a>. 
I personally use <strong>n8n</strong>, but I like checking out alternatives to get inspired by them. For sure, Gumloop is a product worth looking at.</p><h3><strong>Subflows Are Like Functions</strong></h3><p>Workflows often repeat the same steps. Gumloop has subflows, which let you reuse parts of a workflow instead of rebuilding the same logic again. It&#8217;s like writing functions in code.</p><p>If you have 20 different workflows, but all of them end with sending a Slack message written in a way aligned with your writing style, it can be a subflow.</p><h3><strong>Simple UI for Execution</strong></h3><p>Gumloop lets you create a basic website/HTML form with a few text fields for input parameters. You fill them in, click a button, and the workflow runs. Such forms can be shared with non-technical users.</p><p>Of course, without this functionality, you could set up such a website on your own and implement a button to call a webhook executing the workflow. However, it&#8217;s cool to see that they have a feature for it.</p><h3><strong>Chrome Extension for Instant Automation</strong></h3><p>Gumloop comes with a Chrome extension that lets you grab content from a currently opened website and send it as input to a workflow. One click, and automation kicks in, already populated with the content from the website. Faster than copy&amp;paste. Much easier than URL scraping, especially for pages behind authorization. </p><h3><strong>Building Custom Nodes in an AI-Powered Way</strong></h3><p>Gumloop supports custom nodes, letting you connect to third-party data sources that aren&#8217;t natively integrated. But what stood out in the video? The process. You copy &amp; paste the API documentation of a service that isn&#8217;t yet supported, and Gumloop&#8217;s AI generates the custom node for you. 
No manual coding, no complex setup&#8212;just an instant, AI-assisted connector, ready to be used in your workflows.</p><div><hr></div><p>Sources:</p><ul><li><p><a href="https://www.youtube.com/watch?v=QFc7jXZ2pdE">https://www.youtube.com/watch?v=QFc7jXZ2pdE</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Prompts: Zero-shot, Few-shot, Chain-of-Thought]]></title><description><![CDATA[How to prompt?]]></description><link>https://www.sorryengineering.com/p/prompts-zero-shot-few-shot-chain</link><guid isPermaLink="false">https://www.sorryengineering.com/p/prompts-zero-shot-few-shot-chain</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Tue, 04 Feb 2025 21:28:34 GMT</pubDate><content:encoded><![CDATA[<h2><strong>Zero-shot prompting</strong></h2><p>Zero-shot prompting means that the prompt used to interact with the model does not contain examples. </p><p>Prompt:</p><blockquote><p>Classify the text into neutral, negative or positive. </p><p>Text: I think the vacation is okay.</p><p>Sentiment:</p></blockquote><h2><strong>Few-shot prompting</strong></h2><p>Few-shot prompting provides examples in the prompt to steer the model to better performance.</p><p>Prompt:</p><blockquote><p>This is awesome! // Positive</p><p>This is bad! // Negative</p><p>Wow that movie was rad! // Positive</p><p>What a horrible show! //</p></blockquote><h2><strong>Chain-of-Thought</strong></h2><p>Chain-of-Thought prompting provides an example of a reasoning process in the prompt that the model can learn from. The sentence <em>&#8220;let&#8217;s take it step by step&#8221;</em> works like magic.</p><p>Prompt:</p><blockquote><p>Q: Bulb of garlic has 9 cloves. I ate 5 of them. I bought another bulb of garlic. How many cloves do I have?</p><p>A: Let's take it step-by-step. You had 9 cloves in a single bulb. You ate 5 cloves, so you had 4 cloves left. You bought another bulb, assuming it also has 9 cloves. 
So you have 9+4 cloves in total, which is 13 cloves.</p><p>Q: My plant gives 3 new flowers every week. One of them dies every 2 weeks. If I buy one more plant in the second week, how many flowers am I going to have after 4 weeks?</p><p>A: </p></blockquote><p>I haven&#8217;t shared the answers that the model gave me. Try it yourself. </p><div><hr></div><p>Sources:</p><ul><li><p>https://www.promptingguide.ai/</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Tokens and Vocabulary]]></title><description><![CDATA[["I", "would", "n", "'", "t", "like", "to", "go", "to", "the", "gym", "today", "because", "I", "am", "sick"]]]></description><link>https://www.sorryengineering.com/p/tokens-and-vocabulary</link><guid isPermaLink="false">https://www.sorryengineering.com/p/tokens-and-vocabulary</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sun, 02 Feb 2025 10:08:12 GMT</pubDate><content:encoded><![CDATA[<p>Phrase:</p><ul><li><p>&#8220;I wouldn&#8217;t like to go to the gym today because I am sick&#8221;</p></li></ul><p>might be split (depending on the model) into tokens like:</p><ul><li><p>["I", "would", "n", "'", "t", "like", "to", "go", "to", "the", "gym", "today", "because", "I", "am", "sick"]</p></li></ul><h2>Why tokenization?</h2><ul><li><p>Tokens are more meaningful than single characters. </p></li><li><p>There are fewer unique tokens than words. Also, the <em>&#8220;ing&#8221;</em> token is quite common in English, making the model more efficient.</p></li><li><p>Tokens help with unknown words, like &#8220;bananing&#8221;, which is made of &#8220;banana&#8221; and &#8220;ing&#8221;.</p></li></ul><h2>Vocabulary</h2><p>The set of all tokens a model can work with is the model&#8217;s vocabulary. 
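</p><p>A toy greedy longest-match tokenizer makes both ideas concrete. The vocabulary below is invented for this sketch; real models learn theirs with algorithms such as Byte-Pair Encoding.</p>

```python
# Toy tokenizer: split a word into the longest vocabulary entries, left to right.
# The vocabulary is hand-picked for this example; real vocabularies are learned.

VOCAB = {"i", "would", "n", "'", "t", "like", "to", "go", "the", "gym", "ing", "banana", "a", "b"}

def tokenize(word, vocab=VOCAB):
    tokens = []
    i = 0
    while i < len(word):
        # try the longest possible slice starting at position i first
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character: fall back to a single char
            i += 1
    return tokens

print(tokenize("wouldn't"))  # ['would', 'n', "'", 't']
print(tokenize("going"))     # ['go', 'ing']
```

<p>On &#8220;going&#8221; it finds <code>go</code> + <code>ing</code>, the same trick that tames unknown words.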
</p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly)</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Supervision and Self-supervision]]></title><description><![CDATA[Is labeled training data necessary?]]></description><link>https://www.sorryengineering.com/p/supervision-and-self-supervision</link><guid isPermaLink="false">https://www.sorryengineering.com/p/supervision-and-self-supervision</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sun, 02 Feb 2025 10:06:22 GMT</pubDate><content:encoded><![CDATA[<p><strong>Supervision</strong> means training a model using labeled data. You show it two pictures&#8212;one labeled as &#8220;cat&#8221; and the other not labeled as &#8220;cat&#8221;. The model learns to recognize pictures with cats.</p><p><strong>Self-supervision</strong> means learning directly from the input data itself. Take the phrase: <em>&#8220;Sky is blue and beautiful.&#8221;</em> The model treats it as a training sequence: 1. Sky; 2. Sky is; 3. Sky is blue; 4. Sky is blue and; 5. Sky is blue and beautiful. It learns patterns and context without explicit labels; the labels come from the input itself.</p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly). Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Key AI Model Types and Concepts]]></title><description><![CDATA[LM, LLM, MMM, GPM, FM, MLEng, AIEng]]></description><link>https://www.sorryengineering.com/p/key-ai-model-types-and-concepts</link><guid isPermaLink="false">https://www.sorryengineering.com/p/key-ai-model-types-and-concepts</guid><dc:creator><![CDATA[Rafal Makara]]></dc:creator><pubDate>Sun, 02 Feb 2025 10:05:03 GMT</pubDate><content:encoded><![CDATA[<p><strong>Language Models (LMs)</strong></p><p>Language Models are based on statistical patterns learned from one or more languages. 
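</p><p>That &#8220;statistical patterns&#8221; idea can be shown with a tiny bigram counter, a deliberately primitive language model. Real LMs are vastly more capable, but the core loop of learning patterns from text and predicting the next token is the same. The corpus below is made up for illustration.</p>

```python
# A bigram "language model": count which word follows which in a tiny corpus,
# then predict the most frequent follower.
from collections import Counter, defaultdict

corpus = "the sky is blue the sea is blue the grass is green".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Most frequent word observed right after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(predict_next("is"))  # 'blue' (seen twice, vs 'green' once)
```

<p>Swap the tiny corpus for a huge text dataset and the counter for a neural network, and you are directionally at modern language models.</p><p>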
If they complete sentences like <strong>&#8220;My favorite color is _&#8221;</strong>, then it is an <strong>autoregressive language model</strong> predicting the next word. Alternatively, when they fill in the blanks in sentences like <strong>&#8220;My favorite _ is blue&#8221;</strong>, then this is a <strong>masked language model</strong>.</p><p><strong>Large Language Models (LLMs)</strong></p><p>The key difference between a standard language model and a large language model is <strong>scale</strong>&#8212;LLMs are trained on <strong>larger datasets, have more parameters, and require greater computational power</strong>, making them significantly more capable. Size matters.</p><p><strong>Multimodal Models</strong></p><p>Unlike traditional language models that process only text, <strong>multimodal models</strong> can handle multiple types of input, such as <strong>text, images, audio, and speech</strong>, enabling more complex interactions and understanding.</p><p><strong>Task-Specific Models</strong></p><p>These models are <strong>optimized for a single function</strong>&#8212;for example, a translation model can convert text between languages but <strong>cannot</strong> perform sentiment analysis. They are highly efficient but limited in scope.</p><p><strong>General-Purpose Models</strong></p><p>These models are <strong>versatile</strong> and can handle multiple tasks, such as <strong>translation, sentiment analysis, and more</strong>, without requiring significant modifications.</p><p><strong>Foundation Models</strong></p><p>A <strong>subcategory of general-purpose models</strong>, serving as a <strong>base for building AI applications</strong>. Thanks to them, gazillions of startups can call themselves &#8220;AI Startup&#8221;.</p><p><strong>ML Engineering</strong></p><p>Machine Learning Engineering involves not only developing end-user applications but also designing, training, and optimizing machine learning models. 
It is sometimes referred to as <strong>MLOps, AIOps, or LLMOps</strong>. Before we had Foundation Models, that was the way to go.</p><p><strong>AI Engineering</strong></p><p>AI Engineering is the <strong>process of building applications on top of Foundation Models</strong>, making AI accessible and easy to use for most of us.</p><div><hr></div><p>Sources:</p><ul><li><p><em>AI Engineering by Chip Huyen (O&#8217;Reilly)</em></p></li></ul>]]></content:encoded></item></channel></rss>