Microservices and System Architecture for Applied AI
Imagine microservices as modular spacecraft and Docker as your personal spaceport for assembling them. Here you will learn how to design "orbital stations" for AI Agents: breaking monolithic systems down into autonomous satellite services, wiring up their interaction through interstellar protocols (REST/gRPC), and automating deployment with CI/CD launch vehicles. These skills will let your AI services scale like a galactic empire, update without downtime, and survive the failure of individual components without bringing the whole system down.
Ask AI Instructions
Since these topics are fairly stable over time, the best way to study them is with a personal tutor: ChatGPT.
The learning process should be as follows:
- create a system prompt for ChatGPT (templates; a hypothetical example is sketched at the end of this section), describing your background, preferences, desired level of detail in explanations, and so on
- copy a topic from the list (triple-click selects the whole line) and ask ChatGPT to explain it to you
- if you want to go deeper, ask follow-up questions
At the moment, this is the most convenient way to learn the basics. Besides the concepts themselves, you can study the additional materials in the Gold, Silver, and Extra sections.
- Gold - should be studied before you start talking to ChatGPT
- Ask AI - ask questions on each unfamiliar topic
- Silver - secondary materials
- Extra - in-depth topics
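For the system-prompt step above, here is a hypothetical example assembled in Python; the background, preferences, and wording are placeholder assumptions to replace with your own before pasting the printed result into ChatGPT.

```python
# Hypothetical system prompt for the tutoring workflow described above.
# Everything in the variables below is a placeholder; substitute your own
# background and preferences, run the script, and paste the output into ChatGPT.

background = "backend developer, comfortable with Python and REST, new to Kubernetes"
detail_level = "short explanation first, then one concrete example per concept"
preferences = "use web-backend analogies where possible"

system_prompt = f"""You are my personal tutor for microservices and system
architecture in applied AI.

My background: {background}.
Level of detail: {detail_level}.
Preferences: {preferences}.

When I paste a topic title, explain it at my level, give a minimal example,
and end with two or three follow-up questions I could ask to go deeper."""

print(system_prompt)
```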
Gold
1. Videos
System Design: Docker
2. Architectures for GenAI
Standard Architecture for an AI Agent
- a proxy in front of the LLM to respect the provider's rate limits
- a proxy in front of external APIs to respect rate limits and cache results
- a gateway in front of the backend to classify requests, determine user roles (paid tier, free tier), enforce limits on context size, and apply rate limiting (a minimal sketch of these components follows below)
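To make this layout concrete, here is a minimal, framework-free Python sketch of the pieces named above: a token-bucket rate limiter, a rate-limited LLM proxy, a caching proxy for external APIs, and a gateway that assigns per-tier context and request limits. The class names, tier policies, and placeholder calls are illustrative assumptions, not a real provider integration.

```python
import time
from dataclasses import dataclass

# --- Rate limiting (token bucket), reused by the proxies and the gateway ---
class TokenBucket:
    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# --- Proxy to the LLM: enforces the provider's rate limit ---
class LLMProxy:
    def __init__(self, limiter: TokenBucket):
        self.limiter = limiter

    def complete(self, prompt: str) -> str:
        if not self.limiter.allow():
            raise RuntimeError("429: LLM rate limit exceeded, retry later")
        # Placeholder for the real provider call (OpenAI, Anthropic, etc.).
        return f"LLM answer to: {prompt[:40]}..."

# --- Proxy to an external API: rate limit plus a TTL cache for results ---
class ExternalAPIProxy:
    def __init__(self, limiter: TokenBucket, ttl_seconds: float = 60.0):
        self.limiter = limiter
        self.ttl = ttl_seconds
        self.cache: dict[str, tuple[float, str]] = {}

    def fetch(self, query: str) -> str:
        hit = self.cache.get(query)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]                      # served from cache, no API call
        if not self.limiter.allow():
            raise RuntimeError("429: external API rate limit exceeded")
        result = f"external data for {query}"  # placeholder for the real HTTP call
        self.cache[query] = (time.monotonic(), result)
        return result

# --- Gateway: classifies the user, applies per-tier limits, then routes ---
@dataclass
class TierPolicy:
    max_context_chars: int
    requests_per_minute: int

POLICIES = {
    "free": TierPolicy(max_context_chars=4_000, requests_per_minute=5),
    "paid": TierPolicy(max_context_chars=32_000, requests_per_minute=60),
}

class Gateway:
    def __init__(self, llm: LLMProxy):
        self.llm = llm
        self.user_limiters: dict[str, TokenBucket] = {}

    def handle(self, user_id: str, tier: str, prompt: str) -> str:
        policy = POLICIES.get(tier, POLICIES["free"])
        limiter = self.user_limiters.setdefault(
            user_id,
            TokenBucket(policy.requests_per_minute / 60.0, policy.requests_per_minute),
        )
        if not limiter.allow():
            raise RuntimeError("429: per-user rate limit exceeded")
        if len(prompt) > policy.max_context_chars:
            prompt = prompt[: policy.max_context_chars]   # trim context to the tier limit
        return self.llm.complete(prompt)

if __name__ == "__main__":
    gateway = Gateway(LLMProxy(TokenBucket(rate_per_sec=2.0, capacity=10)))
    print(gateway.handle(user_id="u1", tier="free", prompt="Explain circuit breakers"))

    external = ExternalAPIProxy(TokenBucket(1.0, 5))
    print(external.fetch("weather in Berlin"))
    print(external.fetch("weather in Berlin"))  # second call is served from the cache
```

In a real deployment each of these components would usually run as its own service, with the rate counters and cache kept in shared storage such as Redis rather than in process memory, so that multiple replicas enforce the same limits.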
Ask AI
- Basic Concepts of Microservice Architecture (Starter Guide)
- Microservices vs Monolith: a complete comparison of architectures
- Domain-Driven Design: basic principles for microservices
- API Gateway: patterns for beginners (Overview)
- Docker for AI services: minimally necessary practices
- Kubernetes: orchestration basics for AI developers
- Load balancing of GPU tasks: basic approaches
- Versioning AI models: semantic versioning
- A/B testing of models: production use cases (Brief overview)
- Security of AI services: OAuth2/JWT in practice
- Rate limiting for AI operations: basics for beginners
- CI/CD for AI: a minimal working pipeline
- gRPC vs REST: a comparison for AI APIs (Concept)
- Caching ML model results: basic strategies
- Circuit Breaker: operating principle and implementation
- Monitoring neural network services: key metrics
- Caching: TTL vs invalidation (Comparative analysis)
- Caching: strategies and cache invalidation
- Message Queue: basic concepts and use cases (Overview)