Language Models (LMs)
Language models are based on statistical patterns learned from one or more languages. A model that completes sentences like “My favorite color is _” is an autoregressive language model: it predicts the next word. A model that instead fills in blanks in sentences like “My favorite _ is blue” is a masked language model.
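To make the distinction concrete, here is a minimal sketch using Hugging Face `transformers` pipelines; the checkpoints (`gpt2`, `bert-base-uncased`) are illustrative choices, not ones named in this text.

```python
from transformers import pipeline

# Autoregressive LM: predicts the next word, left to right.
generator = pipeline("text-generation", model="gpt2")
print(generator("My favorite color is", max_new_tokens=1))

# Masked LM: fills in a blank anywhere in the sentence.
filler = pipeline("fill-mask", model="bert-base-uncased")
print(filler("My favorite [MASK] is blue."))
```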
Large Language Models (LLMs)
The key difference between a standard language model and a large language model is scale—LLMs are trained on larger datasets, have more parameters, and require greater computational power, making them significantly more capable. Size matters.
Multimodal Models
Unlike traditional language models that process only text, multimodal models can handle multiple types of input, such as text, images, and audio (including speech), enabling more complex interactions and understanding.
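As a rough illustration, here is a minimal sketch of a multimodal pipeline that takes an image in and produces text out; the BLIP checkpoint and the image path are assumptions made for the example.

```python
from transformers import pipeline

# Image-to-text: a multimodal model that maps pixels to a caption.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

# Accepts a local path or URL to an image (placeholder below).
print(captioner("path/to/your/image.jpg"))
```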
Task-Specific Models
These models are optimized for a single function—for example, a translation model can convert text between languages but cannot perform sentiment analysis. They are highly efficient but limited in scope.
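A minimal sketch of the idea, assuming the Helsinki-NLP translation checkpoint on Hugging Face: the model does one thing well and nothing else.

```python
from transformers import pipeline

# A task-specific model: English-to-French translation only.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")
print(translator("My favorite color is blue."))
# The same model cannot label sentiment; it only maps English text to French text.
```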
General-Purpose Models
These models are versatile and can handle multiple tasks, such as translation, sentiment analysis, and more, without requiring significant modifications.
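A minimal sketch of one general-purpose model handling several tasks by prompt alone, assuming the instruction-tuned FLAN-T5 checkpoint as an illustrative choice:

```python
from transformers import pipeline

# One instruction-tuned model, many tasks, switched purely by the prompt.
llm = pipeline("text2text-generation", model="google/flan-t5-small")

print(llm("Translate to German: The weather is nice today."))
print(llm("Is this review positive or negative? I loved this movie."))
```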
Foundation Models
A subcategory of general-purpose models that serves as a base for building AI applications. Thanks to them, gazillions of startups can call themselves “AI startups.”
ML Engineering
Machine Learning Engineering involves not only developing end-user applications but also designing, training, and optimizing machine learning models. It is sometimes referred to as MLOps, AIOps, or LLMOps. Before foundation models existed, this was the way to build AI products.
AI Engineering
AI Engineering is the process of building applications on top of Foundation Models, making AI accessible and easy to use for most of us.
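A minimal sketch of AI engineering in this sense: a few lines that build on a hosted foundation model through the OpenAI Python SDK. The model name is an assumption for the example, not something taken from this text; substitute whatever model your provider offers.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; use your provider's model
    messages=[{"role": "user", "content": "Explain foundation models in one sentence."}],
)
print(response.choices[0].message.content)
```

The application logic lives entirely on top of the model: no training, no datasets, just prompts and API calls.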
Sources:
AI Engineering by Chip Huyen (O’Reilly)