ScyllaAgent: Scalable and Low Latency Agentic Chatbot

About the Project

Deploy a scalable, low-latency agentic chatbot in Python using cutting-edge techniques such as Cache Augmented Retrieval. Implement a professional MLOps pipeline that provides visibility, scalability, low latency, logging, and continuous integration and deployment of new features.

Resources

Week 0

  1. Understand Regression and Gradient Descent
  2. Neural Networks
  3. Coding an NN in PyTorch, and a micrograd framework from scratch

Week 1 | Introduction to NLP and Sequence Models

Basic NLP

  1. NLP Playlist by Tensorflow
  2. Side by side, refer to this github repo
  3. Text Preprocessing
  4. Text Normalization
  5. Bag of words representation
  6. Term Frequency-Inverse Document Frequency
  7. Continuous Bag of Words
  8. One Hot Encodings
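
The bag-of-words and TF-IDF representations listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: the toy corpus, the whitespace tokenizer, and the function names (`tokenize`, `bag_of_words`, `tf_idf`) are assumptions made for the example, and it uses the plain `tf * log(N / df)` weighting without the smoothing that libraries usually add.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (real pipelines also strip punctuation)."""
    return text.lower().split()


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def tf_idf(corpus_tokens):
    """Return one {word: weight} dict per document using tf * log(N / df)."""
    n_docs = len(corpus_tokens)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for tokens in corpus_tokens:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in corpus_tokens:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical toy corpus used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and dogs",
]
docs = [tokenize(text) for text in corpus]
vocabulary = sorted({word for tokens in docs for word in tokens})

bow_vectors = [bag_of_words(tokens, vocabulary) for tokens in docs]
tfidf_weights = tf_idf(docs)
```

Because "the" occurs in every document, its inverse document frequency is `log(3 / 3) = 0`, so TF-IDF discards it even though bag of words counts it twice in the first document. Words unique to one document, such as "cat", keep a positive weight.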

Sequence Models

  1. Recurrent Neural Networks
  2. Mathematics of RNNs
  3. Long Short Term Memory
  4. Overall idea of LSTMs and RNNs with a little maths (video link)
  5. Introduction to Transformers
  6. Attention in Transformers

Week 1 Additional Resources

  1. Attention Is All You Need
  2. QLoRA: Efficient Finetuning of Quantized LLMs
  3. LoRA: Low-Rank Adaptation of Large Language Models
  4. Fine Tuning Repository

Introduction to Python

  1. Intro to Python
  2. Exception Handling
  3. Anaconda
  4. Path and environment variables for Python and Anaconda in Windows
  5. Interactive Python Notebooks
  6. Venv
  7. Managing Packages with venv
  8. Python virtualenv
  9. PyEnv for Python Version Management
  10. Git

Week 2 | Retrieval Augmented Generation

APIs, LLMs and HuggingFace

  1. Requests and APIs:
    1. APIs
    2. REST API
    3. Requests
  2. Accessing LLMs via APIs:
    1. Tokens
    2. Tokens and Pricing
    3. LLM APIs
  3. Explore models on Hugging Face, resources are as follows:
    1. https://huggingface.co/docs/transformers/en/llm_tutorial
    2. https://www.analyticsvidhya.com/blog/2023/12/large-language-models-on-hugging-face/
    3. https://huggingface.co/models?other=LLM

RAGs

  1. LangChain Ecosystem
    1. LangChain Crash Course
    2. LangSmith
      1. Introduction
      2. Docs and Getting Started
    3. LangGraph Crash Course
  2. Prompting
    1. 12 Prompting Techniques
    2. Prompt engineering by Hugging Face
  3. RAGs
    1. RAG for Knowledge-Intensive NLP Tasks Paper
    2. LangChain Implementation

Week 3 | Agentic Systems

  1. Data Validation and Typing
    1. Typing module
    2. Typing Docs
    3. Intro to Pydantic
    4. Pydantic in Detail
    5. Pydantic Docs
  2. Concurrency
    1. Concurrency, Parallelism, and asyncio
    2. Repo for the code in (2.1)
    3. Async Programming in Python
    4. A nice video tutorial (alternative to 2.3)
    5. Exception Handling in Python asyncio
    6. Nice clarification from Stack Overflow
    7. Parallelism v/s Concurrency
  3. Design Patterns
    1. Abstract Factory and Abstract Base Classes
      1. Refactoring Guru Article
      2. Abstract Factory Python Implementation
      3. Intro to ABC
    2. Factory Method, Composite Patterns, Decorators, State, Iterators etc from Refactoring Guru Design Patterns
    3. Grokking OOPs
  4. LlamaIndex
  5. Tools, Agents, Agentic Orchestration
  6. LlamaIndex Workflows
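
To make the concurrency topics above concrete, the following sketch runs several simulated tool calls concurrently with `asyncio.gather` and collects failures instead of letting one exception cancel the whole batch. The `call_tool` coroutine, the tool names, and the delays are hypothetical stand-ins for real I/O-bound work such as LLM or search API requests.

```python
import asyncio


async def call_tool(name, delay, fail=False):
    """Simulate an I/O-bound tool call; asyncio.sleep stands in for network latency."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name}: done"


async def run_tools():
    # gather schedules all coroutines at once, so total time is close to the
    # slowest call rather than the sum of all calls. With return_exceptions=True,
    # exceptions are returned in the result list instead of being raised.
    return await asyncio.gather(
        call_tool("search", 0.02),
        call_tool("calculator", 0.01),
        call_tool("weather", 0.01, fail=True),
        return_exceptions=True,
    )


results = asyncio.run(run_tools())
for result in results:
    if isinstance(result, Exception):
        print("error:", result)
    else:
        print(result)
```

Results come back in the order the coroutines were passed to `gather`, not in the order they finished, which makes it easy to match each result to the call that produced it.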

Week 4 | Agentic and Advanced RAGs

Mini Advanced RAGs Roadmap

  1. Ingestion
    1. Data Preprocessing/Cleaning
    2. Chunking
      1. Fixed Size Chunking
      2. Content-aware Chunking
        1. Simple Sentence and Paragraph splitting
        2. Recursive Character Level Chunking
      3. Document structure-based chunking
      4. Semantic Chunking
      5. Contextual Retrieval: Provides scalability for larger documents
        1. Contextual BM25
        2. Chunk + General Doc Summary
        3. HyDE
        4. Summary Based Indexing
    3. Embedding:
      1. Semantic Embeddings
      2. Lexical Embeddings
        1. BM25 (Best Matching 25): a lexical ranking function that builds upon TF-IDF (Term Frequency-Inverse Document Frequency)
  2. Retrieval
    1. Search
      1. Semantic Search (dense vectors)
      2. Lexical Search (sparse vectors)
      3. Hybrid Search
        1. Querying Hybrid Index
        2. Querying Sparse and Dense Index and reranking
    2. Reranking: Increases quality of retrieved documents
      1. BGE Reranker
      2. Passage Reranking with BERT
  3. Augmentation
  4. Generation
  5. Evaluation
    1. Offline Metrics
      1. Binary Relevance Metrics
        1. Order-unaware:
          1. Precision@k: TP/(TP+FP), the fraction of the top-k results that are relevant
          2. Recall@k: TP/(TP+FN), the fraction of all relevant results for the query that the retrieval step returns in the top k
          3. F1@k: (2 * Precision@k * Recall@k)/(Precision@k + Recall@k)
        2. Order-aware:
          1. Mean Reciprocal Rank (MRR)
          2. Mean Average Precision@K (MAP@K)
      2. Graded Relevance Metrics
        1. Discounted Cumulative Gain (DCG@k)
        2. Normalized Discounted Cumulative Gain (NDCG@k)
    2. Online Metrics: Based on user data, RL-based
    3. Frameworks and Tooling
      1. Arize
      2. ARES
      3. RAGAS
      4. TraceLoop
      5. TruLens
      6. Galileo
  6. Benchmarking AI Assistants
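
The offline retrieval metrics in the Evaluation step of the roadmap above can be sketched as follows. This is an illustrative version under stated assumptions: the document ids, relevance labels, and function names are made up for the example, precision divides by `k` even when fewer than `k` results are returned, and NDCG computes the ideal ordering only from the gains of the retrieved list (a stricter version would use all judged documents for the query).

```python
import math


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved ids that are relevant."""
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant ids that appear in the top k."""
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)


def f1_at_k(retrieved, relevant, k):
    """Harmonic mean of Precision@k and Recall@k."""
    p = precision_at_k(retrieved, relevant, k)
    r = recall_at_k(retrieved, relevant, k)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0 if none is found."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    return sum(reciprocal_rank(ret, rel) for ret, rel in queries) / len(queries)


def dcg_at_k(gains, k):
    """Discounted cumulative gain of graded relevance scores in ranked order."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains, k):
    """DCG divided by the DCG of the best possible ordering of the same gains."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal else 0.0


# Hypothetical ranked results for one query.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d5"}
graded_gains = [0, 3, 0, 2]  # graded relevance of each retrieved doc, in rank order

p3 = precision_at_k(retrieved, relevant, 3)
r3 = recall_at_k(retrieved, relevant, 3)
f3 = f1_at_k(retrieved, relevant, 3)
rr = reciprocal_rank(retrieved, relevant)
ndcg4 = ndcg_at_k(graded_gains, 4)
```

In this example only one of the top three results is relevant, so Precision@3 and Recall@3 are both 1/3, and the first relevant document sits at rank 2, giving a reciprocal rank of 0.5. The order-aware metrics would improve if "d1" and "d2" were moved to the top of the list, while the order-unaware ones at a fixed `k` would change only if the set of top-k documents changed.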

Anthropic Cookbook

Additional Resources

  1. Advanced RAG Techniques
  2. RAG Optimizations implementations in LangChain

Week 5 and 6 | Creating a Python Module for Advanced RAGs

System Design and Patterns Review

  1. Python Design Patterns
  2. Design Patterns
  3. ABCs
  4. ABCs v/s Protocols
  5. Sequence Diagrams
  6. Activity Diagrams
  7. Some case studies
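
The difference between ABCs and protocols reviewed above is easiest to see side by side. In this sketch, the class names (`BaseRetriever`, `KeywordRetriever`, `SupportsRetrieve`, `EchoRetriever`) are hypothetical and chosen to resemble components of a RAG module; they are not taken from any particular library. An ABC requires explicit inheritance and blocks instantiation until every abstract method is implemented, while a `typing.Protocol` accepts any object with the right methods.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class BaseRetriever(ABC):
    """Nominal typing: subclasses must inherit and implement retrieve()."""

    @abstractmethod
    def retrieve(self, query: str) -> list:
        ...


class KeywordRetriever(BaseRetriever):
    def __init__(self, documents):
        self.documents = documents

    def retrieve(self, query: str) -> list:
        # Naive case-insensitive substring match, for illustration only.
        return [d for d in self.documents if query.lower() in d.lower()]


@runtime_checkable
class SupportsRetrieve(Protocol):
    """Structural typing: anything with a retrieve() method qualifies."""

    def retrieve(self, query: str) -> list:
        ...


class EchoRetriever:
    """Does not inherit from anything, yet satisfies SupportsRetrieve."""

    def retrieve(self, query: str) -> list:
        return [query]


def answer(retriever: SupportsRetrieve, query: str) -> list:
    return retriever.retrieve(query)


keyword = KeywordRetriever(["Scylla is a sea monster", "RAG uses retrieval"])
echo = EchoRetriever()

keyword_hits = answer(keyword, "rag")
echo_hits = answer(echo, "hello")
```

A plugin-style module often leans on protocols so that third-party components need no import from the core package, and uses ABCs where shared default behaviour or a strict contract is wanted. Note that `runtime_checkable` only checks that the method exists, not its signature.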

Project Management

  1. Structuring Python Projects
  2. Poetry
  3. Building Python Packages

Repos of similar modules for reference

  1. FlashRAG: https://github.com/RUC-NLPIR/FlashRAG
  2. RAGLight: https://github.com/Bessouat40/RAGLight

Building and structuring modules

  1. How to design modular Python projects
  2. Designing Modules
  3. Why the Plugin Architecture Gives You CRAZY Flexibility
  4. Clean Architectures in Python
  5. Why Use Design Patterns When Python Has Functions

Software Testing

  1. Unit, Integration and Functional Testing
  2. Types of Software Testing
  3. Unit v/s Integration Testing
  4. Unit v/s Integration Testing

Week 7 | Reading and Implementing Research Papers

  1. Cache Augmented Generation
  2. MetaGPT
  3. MedRAG
  4. Trading Agents
  5. Knowledge Augmented Generation

Week 8 | AI Dev