From RAGs to Riches

A Unique RAGOps Opportunity for NSPs to Offer RAG to their Enterprise Customers

Enterprises are going to embrace GenAI, of that there is no doubt. GenAI will add value in just about every function of an enterprise, and the speed at which an enterprise adopts it will translate into a competitive advantage. However, a more durable competitive moat will come from blending enterprise data with the GenAI model. The more data an enterprise can utilize for GenAI, the deeper its competitive moat.

There are two options for an enterprise to mix corporate data with the GenAI model:

Fine-tune an existing Foundational Model: In this option, an enterprise fine-tunes a private copy of an existing GenAI Foundational Model (FM) with its own corporate data. Though much simpler than training a new GenAI model from scratch, which we do not consider here, this option is difficult for most enterprises. It requires GPUs costing millions of dollars, a high level of skill to set up Large Language Model Operations (LLMOps) pipelines, and continuous fine-tuning to keep the model from drifting or going stale.
Retrieval Augmented Generation (RAG): In this approach, an enterprise uses a lightweight FM that has generic natural language processing capability but no real domain knowledge. Each prompt is then supplemented with relevant data retrieved in real time so that the model can produce a meaningful result. RAG can also reduce hallucination by citing the exact data source(s). However, this approach is network heavy: each prompt may trigger a large amount of traffic to retrieve the relevant data.
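
To make the RAG flow concrete, here is a minimal sketch of the retrieve-then-augment step with source citations. It rests on several assumptions: the corpus, file names, and function names (`embed`, `retrieve`, `build_prompt`) are hypothetical, and a bag-of-words vector stands in for a real text embedding model and vector database. The call to the FM itself is left out; the sketch only shows how the prompt might be assembled.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory corpus. In a real deployment these would be chunks
# of enterprise documents held in a vector database.
CORPUS = {
    "hr/leave-policy.txt": "Employees accrue 1.5 days of paid leave per month of service.",
    "it/vpn-guide.txt": "Connect to the corporate VPN before accessing internal dashboards.",
    "finance/expenses.txt": "Expense reports must be submitted within 30 days of purchase.",
}


def embed(text):
    """Toy stand-in for a text embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, k=2):
    """Return up to k (source, text) pairs ranked by similarity to the question."""
    query = embed(question)
    scored = [(cosine(query, embed(text)), source, text) for source, text in CORPUS.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    # Drop passages that share nothing with the question.
    return [(source, text) for score, source, text in scored[:k] if score > 0]


def build_prompt(question, passages):
    """Augment the user's question with numbered, source-tagged context."""
    context = "\n".join(
        f"[{i}] ({source}) {text}" for i, (source, text) in enumerate(passages, 1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    question = "How many days of paid leave do employees accrue?"
    print(build_prompt(question, retrieve(question)))
```

Because every passage carries its source name into the prompt, the model can be asked to cite by number, which is how RAG ties an answer back to its data. The retrieval step is also where the network cost shows up: in production, each call to `retrieve` would reach out to remote data stores.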

The two approaches can be pictured through the following analogies:

*Fine tuning an FM is akin to tapping into an* ***intelligent employee who has been fully trained in your corporate data****. Of course, they need to be trained on an ongoing basis to stay current.*


*RAG is similar to hiring an* ***intelligent employee/consultant who doesn’t have prior knowledge of any specific domain****, but is fast enough to read any information you want in real-time.*


Given the above: Most enterprises will use RAG


There are three deployment models for RAG:

Public model – In this option, a public model, e.g. one from Microsoft, is used for RAG. The public model uses corporate data to provide the response. The fly in the ointment is the requirement to move all the relevant data to a public GenAI service provider. Some enterprises might be comfortable with this, but most will not be, for a variety of reasons.
Private model in a public cloud – In this approach, an enterprise uses a private FM in a public cloud along with other components such as vector databases. This is convenient but again, all the data needs to be shipped to the public cloud. This is perhaps less scary than the previous option since the data would reside in a private repository; nevertheless, it is a lot to swallow.
Private model in a private cloud – In this option, the enterprise would use a private FM along with other components like a vector database in a private cloud. What makes this approach attractive is that the private cloud already has all the required network connections to internal data sources. However, this approach does require a bit more sophistication on the part of the user to deploy and manage RAG.

From the above, it is clear: A RAG model in a private cloud will dominate


Enter Network Service Providers (NSPs)

Unlike ML/LLMOps, RAG does not require significant ML expertise. Instead, it requires expertise in data connectivity, since the value of a RAG model is directly proportional to the amount of corporate data made available to it. Who better to provide managed RAG than the provider of SD-WAN and managed IP networks?

NSPs are best positioned to offer managed RAG

Getting Started with RAGOps

RAGOps may be summed up as a DevOps-based methodology to deploy and manage a RAG model. It involves the following steps:

Deploy virtual infrastructure with GPUs to host the RAG model. This may be a combination of virtual compute (containers, VMs), storage, virtual networks, and Kubernetes/hypervisor layer.
Deploy an FM along with a vector database, a text embedding model, and other data sources.
Deploy supporting guardrail/management/monitoring components.
Set up data pipelines to collect enterprise data from diverse sources and populate the vector database.
Monitor and manage (upgrade, scale, troubleshoot) the environment on Day 1 and Day 2 as needed.
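
The data pipeline step above can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of any particular product: the `Chunk` record, the `chunk_text` and `ingest` helpers, and the plain dictionary standing in for a vector database are all hypothetical, and a real pipeline would also compute an embedding for each chunk before storing it.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str    # where the text came from, kept for later citation
    position: int  # index of the chunk within its source
    text: str


def chunk_text(text, max_words=50, overlap=10):
    """Split text into word windows that overlap by `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the current window already reached the end of the text
    return chunks


def ingest(sources, store, max_words=50, overlap=10):
    """Chunk each source and upsert the chunks into `store`.

    `sources` maps a source name to its raw text. `store` is a dict keyed by
    (source, position), a hypothetical stand-in for a vector database.
    Returns the number of chunks written.
    """
    written = 0
    for source, text in sources.items():
        for position, piece in enumerate(chunk_text(text, max_words, overlap)):
            store[(source, position)] = Chunk(source, position, piece)
            written += 1
    return written


if __name__ == "__main__":
    store = {}
    sample = {"wiki/onboarding": " ".join(f"word{i}" for i in range(120))}
    print(ingest(sample, store), "chunks stored")
```

Keying each chunk by source and position means that re-running the pipeline overwrites existing entries instead of duplicating them. One known gap in this sketch: if a source shrinks between runs, its trailing chunks are not removed.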

Since NSPs can provide data connectivity, they hold a competitive advantage. However, this advantage will not last forever. For this reason:

NSPs need to start RAGOps PoCs for enterprise customers ASAP


Next Steps

Contact us for help on getting started with RAGOps.

The Aarna Networks Multi Cluster Orchestration Platform orchestrates and manages edge environments, including support for RAGOps. We have specifically created an offering that is suitable for NSPs by focusing not just on the FM and related ML components, but also on the infrastructure, e.g. using Equinix Metal to speed up deployment and Equinix Fabric for seamless data connectivity. As an NVIDIA partner, we have deep expertise with server platforms like the NVIDIA Grace Hopper and platform components such as NVIDIA Triton and NeMo.