02 AI Model Landscape

1. Foundation Models

Large-scale models pre-trained on broad data that can be adapted (by fine-tuning or prompting) to many downstream tasks.
Can be unimodal (text only, image only, audio only) or multimodal (a combination such as text + image + audio).
Examples: GPT-4, DALL·E, PaLM, LLaMA, Stable Diffusion

2. Generative AI (overlaps heavily with foundation models; most current generative systems are built on one)

Type	Description	Example Models
Text	Generates human-like language	GPT-4, Claude 3, LLaMA 2, Falcon
Image	Creates images from prompts	DALL·E, Midjourney, Stable Diffusion
Video	Generates videos or animations	Runway Gen-2, Synthesia
Audio / Music	Creates speech, sound, or music	MusicLM, AudioGen, VALL-E
3D / Simulation	Generates 3D models or environments	Point-E, Kaedim

✅ All LLMs that generate text fall under Generative AI.
⚠️ Not all Generative AI models are LLMs (image, video, and audio generators aren’t).
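
The two notes above describe a subset relationship, which can be checked with plain Python sets. This is only an illustrative sketch: the model names are taken from the table, the set names are made up for this example, and treating every listed text generator as an LLM is an assumption.

```python
# Illustrative groupings; names come from the table above.
TEXT = {"GPT-4", "Claude 3", "LLaMA 2", "Falcon"}
IMAGE = {"DALL·E", "Stable Diffusion"}
VIDEO = {"Runway Gen-2"}
AUDIO = {"MusicLM", "AudioGen"}

# Generative AI is the union of all generator types.
GENERATIVE = TEXT | IMAGE | VIDEO | AUDIO

# Assumption for this sketch: every text generator listed here is an LLM.
LLMS = set(TEXT)


def is_generative(name: str) -> bool:
    """True if the model appears in any generative category."""
    return name in GENERATIVE


def is_llm(name: str) -> bool:
    """True if the model is in the LLM group."""
    return name in LLMS


# Every LLM is generative, but not every generative model is an LLM.
assert LLMS <= GENERATIVE
assert not GENERATIVE <= LLMS
```

For example, is_generative("Stable Diffusion") is true while is_llm("Stable Diffusion") is false.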

3. Large Language Models (LLMs) (subset of text-based Generative AI)

Trained on large text corpora to understand, summarize, translate, answer questions about, and generate text.
Can also perform code generation, reasoning, and conversation.
Examples: GPT-4, Claude 3, LLaMA 2, Falcon

Special Notes:

LLMs can be instruction-tuned (follow user instructions better) or multimodal (accept text + images).
LLMs often serve as the core engine behind AI assistants like ChatGPT or Claude.
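
To make the "core engine" idea concrete, here is a rough sketch of an assistant as a thin wrapper around a text-generation function. Everything in it is hypothetical: the prompt layout, the Assistant class, and stub_model are invented for illustration and do not reflect how ChatGPT or Claude are actually built. A real system would replace stub_model with a call to an actual LLM.

```python
from typing import Callable, List, Tuple


def build_prompt(system: str, history: List[Tuple[str, str]], user_msg: str) -> str:
    """Flatten the instructions and conversation so far into one text prompt."""
    lines = [f"System: {system}"]
    for role, text in history:
        lines.append(f"{role}: {text}")
    lines.append(f"User: {user_msg}")
    lines.append("Assistant:")
    return "\n".join(lines)


class Assistant:
    """Hypothetical wrapper: keeps chat history and delegates text generation."""

    def __init__(self, generate: Callable[[str], str],
                 system: str = "You are a helpful assistant.") -> None:
        self.generate = generate  # the LLM (or a stand-in) goes here
        self.system = system
        self.history: List[Tuple[str, str]] = []

    def ask(self, user_msg: str) -> str:
        prompt = build_prompt(self.system, self.history, user_msg)
        reply = self.generate(prompt).strip()
        # Store both turns so the next prompt includes the full conversation.
        self.history.append(("User", user_msg))
        self.history.append(("Assistant", reply))
        return reply


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM: reports how many user turns it can see."""
    turns = sum(1 for line in prompt.splitlines() if line.startswith("User: "))
    return f"(stub reply to turn {turns})"


bot = Assistant(stub_model)
first = bot.ask("What is a foundation model?")
second = bot.ask("And what is an LLM?")
```

The design point is that the assistant itself holds no language ability; it only manages instructions and history, while the model turns each prompt into a reply.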

4. Other AI Models

Discriminative AI / Predictive AI: Models that classify inputs or predict values rather than generate new content.

Examples: BERT (text understanding, typically fine-tuned for tasks such as classification), ResNet (image classification), XGBoost (structured/tabular data).
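
The difference between predicting and generating can be shown with two toy models in plain Python. This is a minimal sketch under loud assumptions: the four-sentence dataset is invented, the perceptron stands in for discriminative models in general, and the bigram chain stands in for generative models in general. Neither resembles BERT or GPT-4 in scale or architecture.

```python
import random
from collections import defaultdict

# Tiny invented dataset, for illustration only.
LABELED = [
    ("great movie loved it", "pos"),
    ("loved the great acting", "pos"),
    ("terrible movie hated it", "neg"),
    ("hated the terrible plot", "neg"),
]


def train_perceptron(data, epochs=10):
    """Discriminative: learn word weights that separate the two labels."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in data:
            target = 1 if label == "pos" else -1
            tokens = text.split()
            score = sum(weights[t] for t in tokens)
            if target * score <= 0:  # misclassified, so nudge the weights
                for t in tokens:
                    weights[t] += target
    return weights


def classify(weights, text):
    """Map an input text to a label; nothing new is produced."""
    score = sum(weights.get(t, 0.0) for t in text.split())
    return "pos" if score > 0 else "neg"


def train_bigrams(corpus):
    """Generative: record which word follows which in the training text."""
    followers = defaultdict(list)
    for text in corpus:
        tokens = text.split()
        for current, nxt in zip(tokens, tokens[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, start, max_words=6, seed=0):
    """Sample a new word sequence one word at a time."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


weights = train_perceptron(LABELED)
bigrams = train_bigrams(text for text, _ in LABELED)

label = classify(weights, "loved it")   # a label for an existing input
sample = generate(bigrams, "loved")     # a newly sampled word sequence
```

The classifier's output is always one of a fixed set of labels, while the generator's output is new text whose content depends on sampling.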