Maths

  • Linear Algebra: Vectors, Matrices, Eigenvalues, Eigenvectors, Singular Value Decomposition (SVD), …
  • Calculus: Derivatives, Integrals, Gradient Descent, Chain Rule, …
  • Probability and Statistics: Distributions, Bayes’ Theorem, Maximum Likelihood Estimation, Hypothesis Testing, …
  • Optimization Theory
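
To show how two of the calculus items above (derivatives and gradient descent) fit together, here is a minimal sketch. The objective f(x) = (x - 3)^2, the learning rate, and the function names are illustrative assumptions, not part of the outline.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x


# Illustrative objective (an assumption): f(x) = (x - 3)^2.
# Its derivative is f'(x) = 2 * (x - 3), so the minimum sits at x = 3.
def grad_f(x):
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)  # should be very close to 3
```

Each step shrinks the distance to the minimum by a constant factor here because the objective is quadratic; on other objectives the learning rate usually needs tuning.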

Classical ML

  • Regression / Classification
  • Clustering Techniques
  • Evaluation: Confusion Matrix, ROC, AUC, Precision, Recall, F1 Score
  • Cross-Validation, Train-Test Split
  • Bias-Variance Tradeoff, Overfitting vs Underfitting
  • Decision Trees
  • SVMs, KNN, Naive Bayes
  • Clustering: K-Means, Hierarchical Clustering, DBSCAN, …
  • Dimensionality Reduction: PCA, t-SNE, UMAP, …
  • Ensemble Methods: Bagging, Boosting, Random Forests, Gradient Boosting, XGBoost, LightGBM, CatBoost
  • Recommendation Systems
  • Probabilistic Models, Logical Models, Geometric Models
  • Explainability: SHAP, LIME, Feature Importance
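
The evaluation metrics listed above all derive from the confusion matrix. The sketch below computes precision, recall, and F1 for a binary problem in plain Python; the function name `binary_metrics` and the toy labels are made up for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for this example.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In practice a library such as scikit-learn would normally be used instead; the hand-rolled version is only meant to make the definitions concrete.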

Deep Learning

  • Neurons, Perceptrons, Neural Networks
  • Activation Functions: Sigmoid, Tanh, ReLU, Leaky ReLU, Softmax, …
  • Vanishing or Exploding Gradients
  • Layers: Fully Connected, Convolutional, Recurrent, LSTM, GRU, …
  • Optimization Algorithms (Optimizers): SGD, Momentum, Nesterov, Adagrad, RMSProp, Adam, …
  • Backpropagation, MLPs, Loss Functions, Regularization, Dropout, Batch Normalization
  • PyTorch
  • Computer Vision: CNNs, ResNets, Object Detection, Image Segmentation
  • Generative Models: GANs, VAEs
  • Natural Language Processing: RNNs, LSTMs, GRUs, Transformers, BERT, GPT, NER
  • Audio and Speech Processing: RNNs, LSTMs, GRUs, Transformers, WaveNet, TTS, ASR
  • Transformers: Attention Mechanism, Self-Attention, Multi-Head Attention, Positional Encoding, BERT, GPT, T5, MoEs, latest advancements
  • Autoencoders
  • Diffusion Models
  • Vision Transformers (ViTs)
  • Multimodal Models
  • Graph Neural Networks (GNNs)
  • Quantization

LLMs

  • Sampling: Temperature, Top-k, Top-p, Beam Search
  • Training Stages: Pretraining, Mid-training, Post-training
  • Fine-tuning: SFT, Instruction Tuning, RLHF, PEFT (LoRA, QLoRA, Adapters)
  • Evaluation: Perplexity, BLEU, ROUGE, Human Evaluation, Arena-style evaluations
  • Encoders, Decoders, Encoder-Decoder Architectures
  • Tokenization: WordPiece, Byte-Pair Encoding (BPE), SentencePiece, Unigram
  • Embeddings: Word2Vec, GloVe, FastText, BERT Embeddings, Sentence Embeddings
  • KV Caches
  • Flash Attention
  • Context Length Scaling
  • Sparse Attention
  • New concepts:
    • Mixture of Experts Routing
    • Chain of Thought
    • Tool Use
    • Function Calling
    • Reasoning Models
    • Test-Time Compute
    • Long Context Models
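
The sampling controls named at the top of this section (temperature, top-k, top-p) can be sketched in a few lines of plain Python. This is a simplified illustration operating on a hypothetical list of logits, not how any particular inference library implements it; all function names here are assumptions.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=None, top_p=None):
    """Keep the top-k tokens, then the smallest prefix whose mass >= top_p."""
    ranked = sorted(enumerate(probs), key=lambda ip: ip[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for idx, p in ranked:
            kept.append((idx, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept
    total = sum(p for _, p in ranked)
    # Renormalize so the surviving probabilities sum to 1.
    return {idx: p / total for idx, p in ranked}


def sample(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = softmax(logits, temperature)
    filtered = filter_top_k_top_p(probs, top_k, top_p)
    indices = list(filtered)
    weights = [filtered[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Hypothetical logits for a three-token vocabulary.
logits = [2.0, 1.0, 0.1]
token = sample(logits, temperature=0.7, top_k=2, rng=random.Random(0))
print(token)
```

Beam search is deliberately left out: it is a deterministic search over whole sequences rather than a per-token sampling rule, so it does not fit the same function shape.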

Reinforcement Learning

  • MDPs, Q-Learning, Deep Q-Networks (DQNs), Policy Gradients
  • RL for LLMs: RLHF, RLAIF, DPO
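
For orientation, the core of tabular Q-learning is a single update rule: Q(s, a) ← Q(s, a) + α · (r + γ · max over a' of Q(s', a') − Q(s, a)). The sketch below applies it once; the state and action names, the dictionary-based table, and the default α and γ values are illustrative assumptions.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Best value reachable from the next state (unseen pairs default to 0).
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    td_target = reward + gamma * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (td_target - old)
    return q[(state, action)]


# Hypothetical two-action environment with a partially filled Q-table.
actions = ["left", "right"]
q_table = {("s1", "left"): 1.0, ("s1", "right"): 2.0}
new_value = q_learning_update(q_table, "s0", "right", reward=1.0,
                              next_state="s1", actions=actions)
print(new_value)
```

Deep Q-Networks replace the table with a neural network that approximates Q, but the target being regressed toward has the same form.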

Applied AI

  • LLM APIs
  • Embeddings and Vector Databases
  • RAG
  • Agents (Frameworks: LangChain, LangGraph, CAMEL, etc.)
  • MCP (Model Context Protocol)
  • Inference Optimization
  • Hybrid Search
  • Reranking
  • Workflows

MLOps

  • Experiment Tracking
  • Dataset Versioning
  • CI/CD for ML
  • Monitoring
  • Drift Detection
  • A/B Testing
  • Model Registries
  • Feature Stores
  • Deployment
    • ONNX
    • TensorRT
    • Triton
  • Kubernetes for ML
  • Observability

AI Infrastructure

  • GPUs, TPUs, CUDA Basics
  • Memory Management, Parallelism, Distributed Systems
  • Inference Servers, Serving Architectures
  • Batching, Caching
  • Model Compression

Research

  • Scaling Laws
  • Emergence
  • Mechanistic Interpretability
  • Sparse Autoencoders
  • Model Editing
  • World Models
  • Self-Supervised Learning
  • Contrastive Learning
  • Curriculum Learning
  • Meta Learning