AI Infrastructure & Scalability & Large Language Models

AI Infrastructure & Scalability
Interview Prep Portal

Master Large Language Models (LLMs), RAG pipelines, vector semantic search, embedding geometries, prompt engineering methodologies, and autonomous tool-calling AI agents.

AI Infrastructure & Scalability | Intermediate | Q1

What are the main LLM optimization techniques?

AI Infrastructure & Scalability | Intermediate | Q2

How do you select GPUs for LLM inference?

AI Infrastructure & Scalability | Advanced | Q3

What is the difference between model parallelism and data parallelism in distributed training?

AI Infrastructure & Scalability | Advanced | Q4

What is tensor parallelism, and how does it help serve large models?

AI Infrastructure & Scalability | Advanced | Q5

What is pipeline parallelism?

AI Infrastructure & Scalability | Advanced | Q6

How does continuous batching improve LLM inference throughput?

AI Infrastructure & Scalability | Advanced | Q7

What is speculative decoding, and how does it speed up inference?

AI Infrastructure & Scalability | Advanced | Q8

What is the KV cache, and how do you manage memory for it?

AI Infrastructure & Scalability | Advanced | Q9

What is PagedAttention?

AI Infrastructure & Scalability | Intermediate | Q10

How do you optimize inference for edge and mobile deployment?

AI Infrastructure & Scalability | Advanced | Q11

What is model quantization (INT8, INT4, FP16, BF16), and how does it affect quality?