Are you a data center provider, telco, NVIDIA Cloud Partner (NCP) or startup that has decided to offer a GPU-as-a-Service (GPUaaS) AI cloud? You need to rapidly decide what your offering is going to look like. With multiple technical options, the ultimate decision depends on your customer requirements, the type of competition you are facing and your desired differentiation in an increasingly commoditized service.
The first-level decision is whether your offering will be Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) or Software-as-a-Service (SaaS). These are not mutually exclusive, so you may choose to offer a combination. Let’s dig into some details.
IaaS
IaaS largely means offering compute instances with GPUs to end users. This is probably the most common offering today. The sizing of these instances will vary based on the GPU capability, vCPU count, memory and storage sizing, and network throughput. Even with IaaS, there are some sub-options:
Of course, with IaaS you will encounter challenges such as multi-tenancy and isolation, self-service APIs, and on-demand billing, all of which must be solved before you can offer customers a complete solution.
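To make the sizing and billing points concrete, here is a minimal sketch of how a catalog entry and an on-demand charge might be modeled. Everything in it is an assumption for illustration: the `InstanceShape` fields mirror the sizing dimensions listed above, while the shape name, GPU model, hourly rate and per-second billing granularity are hypothetical placeholders rather than real products or prices.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class InstanceShape:
    """One entry in a hypothetical GPU instance catalog."""
    name: str
    gpu_model: str
    gpu_count: int
    vcpus: int
    memory_gib: int
    storage_gib: int
    network_gbps: int
    hourly_rate: Decimal  # placeholder price, not a real quote


def on_demand_charge(shape: InstanceShape, seconds_used: int) -> Decimal:
    """Bill per second of use at the shape's hourly rate, rounded to cents."""
    if seconds_used < 0:
        raise ValueError("seconds_used must be non-negative")
    raw = shape.hourly_rate * Decimal(seconds_used) / Decimal(3600)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical shape; every figure below is illustrative only.
example_shape = InstanceShape(
    name="gpu.8x.large",
    gpu_model="example-gpu",
    gpu_count=8,
    vcpus=96,
    memory_gib=1024,
    storage_gib=4096,
    network_gbps=400,
    hourly_rate=Decimal("20.00"),
)

charge = on_demand_charge(example_shape, seconds_used=5400)  # 90 minutes
print(f"{example_shape.name}: ${charge}")
```

Using `Decimal` rather than floating point keeps the rounding of monetary amounts explicit. A production billing system would also need metering, tenant attribution and invoicing, which this sketch leaves out.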
PaaS
With PaaS, the complexities of the underlying infrastructure are hidden and the offering is a higher-level abstraction. The options include a GPU-based Kubernetes cluster optimized to run NVIDIA NIM, LLMOps/MLOps, fine-tuning-as-a-Service, vector-database-as-a-Service, and GPU spot instance creation (to sell excess unused capacity), among other services. A move from IaaS to PaaS creates more value around your offering but requires additional technical sophistication and instrumentation.
SaaS
The next level of sophistication is to offer managed software directly to users in the form of SaaS. This could include LLM-as-a-Service (similar to what OpenAI and the hyperscalers provide), RAG-as-a-Service, and more. This layer adds even more value than IaaS or PaaS.
To compete, you will need to move up the value chain, leaving the low-level “boring” infrastructure orchestration and management to Aarna Networks so that you can focus on building your differentiation.
The Aarna Multi Cluster Orchestration Platform (AMCOP) orchestrates and manages low-level infrastructure to achieve network isolation, InfiniBand isolation, GPU/CPU configuration, OS and Kubernetes orchestration, storage configuration and more. Once the initial orchestration is complete, AMCOP also monitors and manages the infrastructure. If you would like to slash your time-to-market and build a differentiated, sustainable GPUaaS, please get in touch with us for an initial 3-day architecture and strategy assessment.