# RAG - Introduction

```{seealso}
If you are new to copilot and RAG concepts, consider going through these resources after the workshop:
- Watch [Vector search and state of the art retrieval for Generative AI apps.](https://ignite.microsoft.com/en-US/sessions/18618ca9-0e4d-4f9d-9a28-0bc3ef5cf54e?source=sessions)
- Read [Retrieval Augmented Generation (RAG) in Azure AI Search](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview)
```

## Introduction

[Retrieval Augmented Generation (RAG)](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview) is an architecture that augments the capabilities of a Large Language Model (LLM) by adding an information retrieval system that provides grounding data. It is a major architectural pattern for enterprise GenAI applications and is increasingly used in our latest engagements.

```{note}
RAG is not the only solution for incorporating domain knowledge, as illustrated below:

![domain-knowledge](./images/domain-knowledge2.png)

```

**RAG Architecture**

![RAG_pattern](./images/rag-pattern.png)

**Information retrieval system**

Why is the information retrieval system important? Because it gives you control over the knowledge that the LLM uses to formulate a response. You can constrain the LLM to your own content, drawn from _vectorized_ documents, images, and other data formats.
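To make the idea of grounding concrete, here is a minimal sketch of the retrieve-then-prompt flow. It is not the workshop's implementation: the `Document` class, the `retrieve` and `build_grounded_prompt` helpers, and the sample documents are all hypothetical, and the retriever is a toy keyword-overlap ranker standing in for a real search service. The LLM call itself is left out; the sketch stops at the prompt that would be sent.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    content: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: list[Document], top_k: int = 2) -> list[Document]:
    """Toy retriever (assumption): rank documents by shared query terms."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc.content)), doc) for doc in documents]
    # Drop documents with no overlap, then keep the best matches first.
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matches[:top_k]]


def build_grounded_prompt(query: str, sources: list[Document]) -> str:
    """Build a prompt that restricts the model to the retrieved sources."""
    context = "\n".join(f"[{doc.doc_id}] {doc.content}" for doc in sources)
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical sample content, for illustration only.
documents = [
    Document("hr-1", "Employees receive 25 vacation days per year."),
    Document("it-1", "Reset your password from the self-service portal."),
    Document("hr-2", "Vacation requests must be approved by a manager."),
]

question = "How many vacation days do employees receive?"
sources = retrieve(question, documents)
prompt = build_grounded_prompt(question, sources)
print(prompt)
```

Only the retrieved documents reach the prompt, which is what gives you control over the knowledge the LLM can draw on. In a real solution the `retrieve` step would query a search index instead of scanning an in-memory list.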

**Azure AI Search**

[Azure AI Search](https://learn.microsoft.com/en-us/azure/search/) is a proven solution for information retrieval in a RAG architecture. Architecturally, it sits between the external data stores (with un-indexed data) and your client app. The client app sends query requests to a search index and handles the response:

![search_service](./images/search-service.png)

<!-- https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/azure-ai-search-outperforming-vector-search-with-hybrid/ba-p/3929167 -->

<!-- In any search system, there are two layers of execution: retrieval and ranking.

- Retrieval - also called L1, has the goal to quickly find all the documents from the index that satisfy the search criteria (possibly across millions or billions of documents). These are scored to pick the top few (typically in order of 50) to return to the user or to feed the next layer. Azure AI Search supports three different models:

  - Keyword: Uses traditional full-text search methods – content is broken into terms through language-specific text analysis, inverted indexes are created for fast retrieval, and the BM25 probabilistic model is used for scoring.

  - Vector: Documents are converted from text to vector representations using an embedding model. Retrieval is performed by generating a query embedding and finding the documents whose vectors are closest to the query’s. We used Azure Open AI text-embedding-ada-002 (Ada-002) embeddings and cosine similarity for all our tests in this post.
  - Hybrid: Performs both keyword and vector retrieval and applies a fusion step to select the best results from each technique. Azure AI Search currently uses Reciprocal Rank Fusion (RRF) to produce a single result set.

- Ranking – also called L2, takes a subset of the top L1 results and computes higher quality relevance scores to reorder the result set. The L2 can improve the L1's ranking because it applies more computational power to each result. The L2 ranker can only reorder what the L1 already found – if the L1 missed an ideal document, the L2 can't fix that. L2 ranking is critical for RAG applications to make sure the best results are in the top positions.
  - [Semantic ranking](https://learn.microsoft.com/en-us/azure/search/semantic-search-overview) is performed by Azure AI Search's L2 ranker which utilizes multi-lingual, deep learning models adapted from Microsoft Bing. The Semantic ranker can rank the top 50 results from the L1. -->

**Vector search in Azure AI Search - Overview**

Vector search is an information retrieval approach that indexes numeric representations (vectors) of content and matches a query to content by vector similarity rather than by exact terms.

![vector_search](./images/vector-search-architecture-diagram.png)
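The following sketch illustrates the core idea of matching by vector similarity. It assumes cosine similarity as the metric, which is one common choice among several, and uses made-up three-dimensional vectors in place of real embeddings. The `cosine_similarity` and `vector_search` functions and the `index` contents are hypothetical, and the exhaustive scan shown here is only suitable for tiny examples.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(
    query_vector: list[float],
    index: dict[str, list[float]],
    top_k: int = 2,
) -> list[tuple[str, float]]:
    """Score every stored vector against the query and return the closest ones."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up vectors for illustration; a real system would obtain these
# from an embedding model.
index = {
    "vacation-policy": [0.9, 0.1, 0.0],
    "password-reset": [0.0, 0.2, 0.9],
    "leave-requests": [0.7, 0.3, 0.1],
}

query_vector = [0.8, 0.2, 0.0]
results = vector_search(query_vector, index)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The two documents whose vectors point in roughly the same direction as the query rank highest, even though no keywords are compared at any point.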

<!-- The generally available functionality of vector support requires that you call other libraries or models for data chunking and vectorization. However, [integrated vectorization (preview)](https://learn.microsoft.com/en-us/azure/search/vector-search-integrated-vectorization) embeds these steps. -->

<!--
\*_ The technology behind Azure AI Search_ -->

```{note}
### Approaches for RAG with Azure AI Search

Because of RAG's growing popularity, Microsoft offers several built-in implementations for using Azure AI Search in a RAG solution:

1. Azure AI Studio, [use a vector index and retrieval augmentation - Preview](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/retrieval-augmented-generation).
2. Azure OpenAI Studio, [use a search index with or without vectors - Preview](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/use-your-data?tabs=ai-search).
3. Azure Machine Learning, [use a search index as a vector store in a prompt flow - Preview](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-vector-index?view=azureml-api-2).

During this workshop, we will take a **code-first** approach.
```
