02 AI Model Landscape
1. Foundation Models
- Large-scale pre-trained models that can be adapted (for example, by fine-tuning or prompting) to many downstream tasks.
- Can be unimodal (text, image, or audio only) or multimodal (for example, text + image + audio).
- Examples: GPT-4, DALL·E, PaLM, LLaMA, Stable Diffusion
2. Generative AI (mostly built on foundation models)
- AI that creates new content rather than only analyzing existing content.
- Output can be text, images, video, audio, or 3D assets.
- Subcategories:
| Type | Description | Example Models |
|---|---|---|
| Text | Generates human-like language | GPT-4, Claude 3, LLaMA 2, Falcon |
| Image | Creates images from prompts | DALL·E, Midjourney, Stable Diffusion |
| Video | Generates videos or animations | Runway Gen-2, Synthesia |
| Audio / Music | Creates speech, sound, or music | MusicLM, AudioGen, VALL-E |
| 3D / Simulation | Generates 3D models or environments | Point-E, Kaedim3D |
✅ All LLMs that generate text fall under Generative AI.
⚠️ Not all Generative AI models are LLMs (image, video, and audio generators are not).
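The subset relationship in the two notes above can be checked with plain Python sets. This is only an illustrative sketch: the variable names are made up for this example, and the model names are the ones listed in the table.

```python
# Example model names are taken from the table in this post.
llms = {"GPT-4", "Claude 3", "LLaMA 2", "Falcon"}
image_generators = {"DALL·E", "Midjourney", "Stable Diffusion"}
audio_generators = {"MusicLM", "AudioGen", "VALL-E"}

# Generative AI is the union of all content-generating model families.
generative_ai = llms | image_generators | audio_generators

print(llms <= generative_ai)   # True: every LLM listed here is Generative AI
print(generative_ai <= llms)   # False: not every Generative AI model is an LLM
print(sorted(generative_ai - llms))  # the generative models that are not LLMs
```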
3. Large Language Models (LLMs) (subset of text-based Generative AI)
- Trained primarily on large amounts of text to understand, summarize, translate, answer questions, and generate text.
- Can also perform code generation, reasoning, and conversation.
- Examples:
  - Proprietary: GPT-4, GPT-4 Turbo, Claude 3, Gemini 1.5
  - Open models (open source or open weights): LLaMA 2, Falcon 7B, Mistral 7B, RedPajama
Special Notes:
- LLMs can be instruction-tuned (to follow user instructions better) or multimodal (accepting images as well as text).
- LLMs often serve as the core engine behind AI assistants such as ChatGPT or Claude.
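To make "trained on text data to generate text" a little more concrete, here is a toy sketch that learns which word follows which in a tiny corpus and then generates text one word at a time. It is a deliberately simplified stand-in and not how the models listed above are built: real LLMs use neural networks trained on vastly more data. The corpus and the function names `train_bigram` and `generate` are invented for this illustration.

```python
import random
from collections import defaultdict

def train_bigram(text):
    """Record, for each word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, max_words, rng):
    """Generate text one word at a time by sampling an observed follower."""
    output = [start]
    for _ in range(max_words - 1):
        options = followers.get(output[-1])
        if not options:  # no known continuation, so stop early
            break
        output.append(rng.choice(options))
    return " ".join(output)

corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram(corpus)
sample = generate(model, "the", 8, random.Random(0))
print(sample)
```

Every word pair in the output was seen in the training text, but the sentence as a whole may be new. Generative models rely on the same idea at a far larger scale.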
4. Other AI Models
- Discriminative / Predictive AI: focuses on classifying or predicting rather than generating.
  - Examples: BERT (text understanding), ResNet (image classification), XGBoost (structured/tabular data).
- Reinforcement Learning Models: learn from trial-and-error feedback in the form of rewards.
  - Examples: AlphaGo, AlphaZero
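Trial-and-error learning can be sketched with a multi-armed bandit, one of the simplest reinforcement learning setups and far simpler than AlphaGo or AlphaZero. The agent below does not know which option ("arm") pays best. It tries the arms, observes rewards, and gradually favors the one with the highest estimated value. The reward values, the noise range, and the function name `run_bandit` are assumptions made for this illustration.

```python
import random

def run_bandit(true_means, steps, epsilon, rng):
    """Epsilon-greedy bandit: estimate each arm's value from observed rewards."""
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms

    def pull(arm):
        # Toy environment: the arm's hidden mean plus small bounded noise.
        reward = true_means[arm] + rng.uniform(-0.1, 0.1)
        counts[arm] += 1
        # Incremental average of all rewards seen so far for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    for arm in range(n_arms):  # try every arm once before acting greedily
        pull(arm)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        pull(arm)
    return estimates, counts

estimates, counts = run_bandit([0.2, 0.5, 0.8], steps=500, epsilon=0.1,
                               rng=random.Random(0))
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_arm)  # 2: the arm with the highest hidden mean
```

The `epsilon` parameter controls the trade-off between exploring new options and exploiting the best one found so far.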