Replicating a RAG CoPilot for use with the Customer Service Engine

Hi Sam!

George’s answer is indeed comprehensive and offers a lot of valuable insights. I’d like to expand on a few points, particularly regarding the implementation of the Customer Service Engine (CSE) and how you can replicate a RAG-based workflow within it. At a high level, there are four steps:

  • Chunking: Divide your documents into manageable chunks.
  • Embedding: Convert the content of each chunk into a numerical representation (vector).
  • Indexing: Store these embeddings in the Ontology for efficient retrieval.
  • Retrieval: Use Semantic Search to find the chunks most relevant to a customer’s query.

Chunking

This step has already been covered in George’s answer, so I won’t repeat it here.
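That said, for anyone landing on this thread later, here is a minimal sketch of one common approach (fixed-size chunks with overlap); the sizes are illustrative and worth tuning to your documents:

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks of roughly `size` characters.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from either side; both numbers are illustrative defaults.
    """
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]
```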

Embedding

Once you’ve chunked your documentation, transform the text of each chunk into a numerical representation (vector). You can achieve this with the Pipeline Builder expression Text to Embedding.
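Under the hood, that expression calls an embedding model for you. For intuition, here is roughly the equivalent call made directly against OpenAI’s API (a sketch, assuming the openai Python SDK and an OPENAI_API_KEY in the environment; inside Foundry, the Language Modeling Service handles this for you):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> list[float]:
    """Return the embedding vector for one chunk of text."""
    response = client.embeddings.create(
        model="text-embedding-ada-002",  # the same model referenced below
        input=text,
    )
    return response.data[0].embedding  # 1536-dimensional for ada-002
```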

Indexing

With your chunks transformed into embeddings, create a new Object Type in the Ontology to store and index them.

Here’s an example of what the [Customer Service] Doc Object Type might look like:

  • content: The text content of the chunk
  • content_embedding: The embedding of the text, defined as a Vector type with the following Embedding Model:
    • Language Modeling Service Model
    • OpenAI’s text-embedding-ada-002
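If it helps to picture the shape outside the Ontology, each indexed object is essentially a record like the following (an illustrative stand-in, not a Foundry API):

```python
from dataclasses import dataclass

@dataclass
class CustomerServiceDoc:
    """Illustrative stand-in for one [Customer Service] Doc object."""
    content: str                    # the text content of the chunk
    content_embedding: list[float]  # 1536-dim vector from text-embedding-ada-002
```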

Retrieval

Indexing your documentation makes it ready for efficient retrieval. To enrich the generated emails with relevant context, modify the Generate Response for Customer Service Alert AIP Logic file as follows.

The retrieval block is the most critical operation for enhancing the generated responses. As George mentioned, this block fetches the most relevant chunks of documentation based on the customer’s query. I set it to return the top 50 chunks by default, but you can adjust this number depending on the context window of the LLM used to generate the answer.
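Foundry’s semantic search performs this lookup for you; conceptually, it is a k-nearest-neighbour search over the stored embeddings. A minimal sketch, reusing the CustomerServiceDoc record from above:

```python
import numpy as np

def top_k_chunks(query_embedding: list[float],
                 docs: list[CustomerServiceDoc],
                 k: int = 50) -> list[CustomerServiceDoc]:
    """Return the k chunks whose embeddings are closest to the query."""
    q = np.asarray(query_embedding)
    q = q / np.linalg.norm(q)

    def cosine_similarity(doc: CustomerServiceDoc) -> float:
        v = np.asarray(doc.content_embedding)
        return float(np.dot(q, v) / np.linalg.norm(v))

    return sorted(docs, key=cosine_similarity, reverse=True)[:k]
```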

Once the Most Relevant Documentation variable is populated with these chunks, pass it as an input to the reply_to_customer block. This ensures the generated response is grounded in precise, contextually appropriate information, improving response accuracy.
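In AIP Logic this wiring is done visually, but conceptually the reply block receives a prompt assembled roughly like this (the prompt wording is illustrative):

```python
def build_prompt(query: str, relevant_docs: list[CustomerServiceDoc]) -> str:
    """Combine the customer query with the retrieved context for the LLM."""
    context = "\n\n".join(doc.content for doc in relevant_docs)
    return (
        "Answer the customer's question using only the documentation below.\n\n"
        f"Documentation:\n{context}\n\n"
        f"Customer question: {query}"
    )
```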

Enhancing Retrieval with Ontology

The example provided is a basic version of semantic search, where embeddings are used to find the most relevant chunks of text. However, you can significantly improve the performance and relevance of the retrieved documents by leveraging the Ontology, as George suggested. This can involve setting up deterministic filters in AIP Logic coupled with the AI semantic search capability presented above.

For example, if you have Ontology objects for products, you can link the subset of docs related to each product in the Ontology. Leveraging this Ontology-backed document structure when looking up information in your Logic function will make retrieval significantly more performant and accurate.
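Building on the sketches above, the hybrid lookup amounts to a deterministic filter followed by semantic search. Here product_id is a hypothetical link property back to the Product object:

```python
from dataclasses import dataclass

@dataclass
class ProductDoc(CustomerServiceDoc):
    product_id: str = ""  # hypothetical link to a Product object in the Ontology

def retrieve_for_product(query_embedding: list[float],
                         docs: list[ProductDoc],
                         product_id: str,
                         k: int = 50) -> list[ProductDoc]:
    """Filter deterministically by product, then semantic-search the survivors."""
    scoped = [d for d in docs if d.product_id == product_id]
    return top_k_chunks(query_embedding, scoped, k)
```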

I hope this adds clarity and provides a structured approach to implementing and enhancing your Customer Service Engine. Feel free to reach out if you have any more questions or need further assistance!

Best, Jacopo