In a recent post I introduced one of the biggest problems that companies in the industry (like VMware) are trying to solve: mapping application workloads onto a next-generation IT infrastructure. The picture I used to highlight this mapping can be seen below:
The main problem that this diagram is trying to illustrate is the efficient and automated assignment and monitoring of millions of application workloads on an IT infrastructure.
In this post I'd like to highlight the mathematical approach taken in order to solve this problem. The formulas I describe below were supplied by our team at the St. Petersburg Center of Excellence, based on a research engagement with the Applied Research Center for Computer Networks (ARCCN).
For this post I will focus only on mapping based on performance, and not on other characteristics (e.g. availability, protection).
On the left we have a large number of applications that need to be placed efficiently onto an IT infrastructure. A first step toward mathematical placement is to create models of both of these elements.
A request for infrastructure to deploy an application workload (G) is defined as follows:
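The original equation image is not reproduced here; a plausible reconstruction, based on the symbol definitions that follow, is a tuple of virtual machines, data stores, and the virtual links between them (the tuple notation itself is my assumption):

```latex
G = (W, S, E), \qquad E \subseteq W \times S
```

with SLO functions such as $v : W \to \mathbb{R}^{+}$ giving each virtual machine's required performance.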
W is the application (e.g. the virtual machine), S is the storage (the application's data store), and E is the virtual connection between the two. Functions defined on each of these elements capture the SLO requirements of the workload. For example, v(w) represents the required performance of the application or virtual machine.
Modeling the physical infrastructure of the software-defined data center (H) could be described as follows:
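Again the original equation is not available; mirroring the workload model, a plausible sketch (the notation is my assumption) is:

```latex
H = (P, M, K, L)
```

with characteristic functions such as $v_h : P \to \mathbb{R}^{+}$ giving the performance capability of each compute node.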
P is the compute layer, M is the storage layer, K is the network layer, and L is the set of connections between them all. Functions can be run on each of these to extract their characteristics. For example, vh(p) specifies the performance capabilities of a given compute node.
An assignment (A) of an application workload on top of an SDDC, therefore, looks like this:
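The assignment equation is likewise missing; read against the two models above, one reasonable reconstruction (my notation, not the post's) is a mapping that sends each workload element onto an infrastructure element:

```latex
A : G \to H, \qquad A(w) \in P, \quad A(s) \in M, \quad A(e) \subseteq K \cup L
```

that is, each virtual machine lands on a compute node, each data store on a storage node, and each virtual link on a path through the network layer and its connections.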
This definition is the foundation onto which many other requirements can be built. The computer science problem that I mentioned previously boils down to this: how can one design algorithms that perform the above assignment in a way that ensures efficient utilization of an SDDC's physical infrastructure while meeting the SLOs of the workloads?
For example, consider equations for assigning virtual machines to physical nodes:
Virtual machine w can be assigned to physical node p if the following is true:
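The condition itself is not shown; a natural reading, consistent with the performance functions defined earlier, is a capacity check (my reconstruction):

```latex
v(w) + \sum_{w' \,:\, A(w') = p} v(w') \;\le\; v_h(p)
```

in words: node $p$ can host $w$ if $w$'s required performance, plus the demand of the virtual machines already assigned to $p$, does not exceed $p$'s performance capability.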
From a performance standpoint this is easy to understand. However, it leads to another set of questions that the industry must work on:
- How does one factor in security requirements for placement?
- How does one specify data protection? Data availability?
- What are the models that take into account "green" placement for power-challenged data centers?
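Setting those open questions aside, the performance-only assignment check described above can be sketched in code. The following is a minimal illustration, assuming a single scalar performance demand per VM and a scalar capacity per node, paired with a naive first-fit placement loop; all names are illustrative, not from the post:

```python
# Hypothetical sketch: performance-based VM placement.
# Assumes one scalar SLO per VM (its demand v(w)) and one scalar
# capacity per physical node (vh(p)). Not the post's actual algorithm.

def can_assign(vm_demand, node_capacity, node_load):
    """VM w fits on node p if its demand plus the node's current
    load stays within the node's capacity."""
    return vm_demand + node_load <= node_capacity

def first_fit(vms, nodes):
    """Greedily place each VM on the first node with spare capacity.

    vms:   dict of VM name -> performance demand v(w)
    nodes: dict of node name -> performance capacity vh(p)
    Returns a dict VM -> node; raises ValueError if a VM cannot fit.
    """
    load = {p: 0.0 for p in nodes}   # current demand assigned to each node
    assignment = {}
    for w, demand in vms.items():
        for p, capacity in nodes.items():
            if can_assign(demand, capacity, load[p]):
                assignment[w] = p
                load[p] += demand
                break
        else:
            raise ValueError(f"no node can host {w}")
    return assignment

vms = {"w1": 4.0, "w2": 3.0, "w3": 2.0}
nodes = {"p1": 5.0, "p2": 6.0}
print(first_fit(vms, nodes))  # -> {'w1': 'p1', 'w2': 'p2', 'w3': 'p2'}
```

First-fit is only a baseline: finding placements that maximize utilization is a bin-packing-style problem, which is why algorithm design is the hard part here.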
This very basic introduction should give readers some insight into the problem that the industry must solve in order to truly build an infrastructure that can efficiently run millions of applications. This post is the fifth in a series meant to stimulate dialogue on building next-generation (3rd platform) infrastructure.
In upcoming posts I will focus on the applications that sit on top of this SDDC orchestration layer.