
Utilize Cloud Resources Effectively

The Next AI Trend? Saving Costs While Conserving Resources!

“Rightsizing” is often synonymous with job cuts within companies. However, we aim to establish a much more positive context for this term, one that is increasingly relevant for the application of artificial intelligence now and in the future: the cost-efficient use of cloud resources.

Benjamin Krenn

VP Software & Co-Founder

Karin Schnedlitz

Content Manager

To make the most of the cloud, computing resources within the infrastructure are continuously redistributed according to current demand without affecting the system’s availability. In addition, virtual machines can be swapped out as requirements change, for example in favor of more cost-effective alternatives.

Closing Gaps Instead of Reinventing the Wheel

Rightsizing is an effective way to reduce the operational costs of software applications in the cloud while maintaining system availability. While trends come and go, rightsizing has the potential for long-term adoption. With challenges such as climate change, resource scarcity, inflation, and rising energy costs, it’s in everyone’s best interest to make IT solutions both efficient and resource-friendly. By optimizing cloud infrastructure, storage, and applications, companies can make better use of their resources and significantly reduce costs without needing to introduce new technologies or complex procedures.

The Solution: Cloud Rightsizing

Rightsizing simply means managing the cloud and its online storage better and providing cloud functions based on user needs. To use the cloud properly, certain prerequisites regarding the chosen cloud system architecture and security must be met. Popular public cloud providers offer a wide range of services that can have varying degrees of architectural impact on the overall system. In the portfolios of the most common providers, you’ll find services such as managed service bus solutions, serverless solutions, object storage, analytics streaming services, and many more. These services generally differ in their level of abstraction over virtual machines in the cloud and in the range of features offered. Broadly speaking, services fall into two groups: those with a high level of abstraction and strong architectural dependency (such as serverless solutions or proprietary event bus systems) and those with a low level of abstraction (such as managed Kubernetes clusters [1]).
High-abstraction services offer several key advantages: easy automatic scaling and usually excellent integration with other provider services. However, these cloud services are often proprietary solutions developed by the provider [2].

How Do Cloud Provider Costs Add Up for Customers?

The more extensively a high-abstraction service is used within the overall system, the greater the dependency on the cloud provider. Additionally, cost optimization in this service group is often very difficult. Pricing is usually based on multiple interdependent factors such as monthly requests, data transfer, time, and the chosen scaling model. The cost models for these services are often heterogeneous, meaning the more of these services are used, the more complex cost optimization becomes. Special cloud cost optimization (CCO) techniques and tools are used to reduce costs. The pricing model for low-abstraction services, on the other hand, is simpler to optimize. Several factors influence costs here as well, but the primary cost driver is the number of hours per month a virtual machine is rented. The hourly price increases with the performance of the chosen virtual machine. The pricing model of virtual machines differs only slightly between cloud providers. Rightsizing can be applied to optimize costs for these services.
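The hours-times-hourly-price model for virtual machines can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the VM names, sizes, and prices are hypothetical and not taken from any provider, 730 hours per month is a common approximation, and the 20% headroom is an arbitrary safety margin.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # approximation: 365 days * 24 h / 12 months


@dataclass(frozen=True)
class VmType:
    name: str
    vcpus: int
    memory_gib: int
    hourly_price: float  # hypothetical price, not a real provider quote


def monthly_cost(vm, hours=HOURS_PER_MONTH):
    """Rental cost of one VM: hours rented times the hourly price."""
    return vm.hourly_price * hours


def cheapest_fit(catalog, needed_vcpus, needed_memory_gib, headroom=0.2):
    """Return the cheapest VM type covering demand plus headroom, or None."""
    cpu = needed_vcpus * (1 + headroom)
    mem = needed_memory_gib * (1 + headroom)
    candidates = [vm for vm in catalog if vm.vcpus >= cpu and vm.memory_gib >= mem]
    return min(candidates, key=lambda vm: vm.hourly_price, default=None)


# Hypothetical catalog: price grows with the performance of the machine.
CATALOG = [
    VmType("small", 2, 8, 0.10),
    VmType("medium", 4, 16, 0.20),
    VmType("large", 8, 32, 0.40),
]

current = CATALOG[2]  # the workload currently runs on "large"
target = cheapest_fit(CATALOG, needed_vcpus=3, needed_memory_gib=10)
saving = monthly_cost(current) - monthly_cost(target)
print(f"{current.name} -> {target.name}: saves {saving:.2f} per month")
```

In practice the measured demand would come from telemetry data rather than fixed numbers, and a real catalog has many more dimensions (disk, network, region, commitment discounts) than this sketch considers.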

Rightsizing as an Ongoing Automated Process

Rightsizing is especially suitable for dynamic systems, meaning those that require rapid horizontal scalability or on-demand computing. It is best understood as an ongoing process that is applied continuously. Kubernetes, an open-source container orchestration platform, enables the automation of rightsizing to a large extent. The computing resources in the cluster can be continuously redistributed according to current needs, without affecting the system’s availability. Additionally, virtual machines can be replaced with more cost-effective alternatives depending on the system’s resource requirements. Kubernetes provides several tools [3,4,5,6] to achieve this, and further adjustments [7] can be made to Kubernetes’ scheduling through custom extensions.
By using these tools, companies can better utilize and continuously optimize their virtual machines, reducing operational costs. The prerequisite for this, particularly in horizontal scaling, is comprehensive application metrics and telemetry data, which serve as the basis for rightsizing operations. Establishing a powerful and comprehensive metrics and telemetry system is therefore essential before rightsizing can be performed at all.
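To show why metrics are the basis for horizontal scaling, here is a small Python sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler [4]: the desired replica count is the ceiling of the current replica count multiplied by the ratio of the observed metric to the target metric, and no scaling happens while that ratio stays within a tolerance (0.1 by default). The function name, the replica bounds, and the sample numbers are assumptions for illustration; the real autoscaler also handles missing metrics, pod readiness, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA rule: ceil(current_replicas * current_metric / target_metric)."""
    if current_replicas <= 0 or target_metric <= 0:
        raise ValueError("current_replicas and target_metric must be positive")
    ratio = current_metric / target_metric
    # Close enough to the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults here).
    return max(min_replicas, min(max_replicas, wanted))


# Average CPU utilization in percent, as a telemetry system might report it.
print(desired_replicas(4, current_metric=90, target_metric=60))  # scale out
print(desired_replicas(6, current_metric=30, target_metric=60))  # scale in
print(desired_replicas(5, current_metric=63, target_metric=60))  # within tolerance
```

Without a reliable value for `current_metric`, the calculation has nothing to work with, which is why the telemetry system comes first.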

Rightsizing à la Leftshift One

Kubernetes is an excellent platform for continuously optimizing the operational costs of complex software systems in the cloud environment. However, it’s worth mentioning that Kubernetes only makes sense when the software system reaches a certain level of complexity and the scale of software usage is significant. Moreover, it is a complex system that requires substantial training and specialized personnel. For simpler software projects that do not have strict portability requirements, proprietary cloud services can be a sensible and often cost-effective alternative. Leftshift One’s MLOps platform, AIOS, is built on Kubernetes and leverages its capabilities to perform rightsizing for machine learning models. Every machine learning workload (Skill) started via AIOS is automatically integrated into AIOS’s telemetry system and exposes the necessary application metrics. Workloads are continuously monitored for resource usage and are automatically redistributed or scaled to the most cost-effective virtual machines, depending on the system’s current load. Since the platform completely abstracts Kubernetes, users do not need expertise in Kubernetes or telemetry systems. Workloads can be brought into production easily and reliably, allowing customers to fully benefit from rightsizing without limitations.
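The idea of redistributing workloads onto as few virtual machines as possible can be sketched as a simple bin-packing problem. The Python example below uses the first-fit decreasing heuristic; it is not AIOS's or Kubernetes' actual scheduling logic, and the workload names, CPU requests (in millicores), and node capacity are hypothetical values chosen for illustration.

```python
def pack_workloads(workloads, node_capacity):
    """First-fit decreasing: place each workload, largest first, on the first node with room."""
    nodes = []  # workload names placed on each node
    free = []   # remaining capacity per node, in millicores
    for name, cpu in sorted(workloads.items(), key=lambda kv: kv[1], reverse=True):
        if cpu > node_capacity:
            raise ValueError(f"{name} does not fit on any node")
        for i, room in enumerate(free):
            if cpu <= room:
                nodes[i].append(name)
                free[i] -= cpu
                break
        else:
            # No existing node has room: rent one more virtual machine.
            nodes.append([name])
            free.append(node_capacity - cpu)
    return nodes


# Hypothetical CPU requests in millicores, e.g. derived from observed usage.
requests = {"skill-a": 2500, "skill-b": 1500, "skill-c": 1000, "skill-d": 3000}
placement = pack_workloads(requests, node_capacity=4000)
print(f"{len(placement)} nodes needed: {placement}")
```

A real scheduler weighs memory, affinity rules, and disruption budgets in addition to CPU, so this single-dimension sketch only conveys the consolidation principle.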

References
[1] https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/

[2] Proprietary software means that its use and distribution are restricted by the provider. The opposite is “open source”.

[3] https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler
[4] https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
[5] https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
[6] https://virtual-kubelet.io
[7] https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/