PolyViNE: policy-based virtual network embedding across multiple domains
© Samuel et al.; licensee Springer. 2013
Received: 30 November 2012
Accepted: 8 February 2013
Published: 11 March 2013
Skip to main content
© Samuel et al.; licensee Springer. 2013
Received: 30 November 2012
Accepted: 8 February 2013
Published: 11 March 2013
Intra-domain virtual network embedding is a well-studied problem in the network virtualization literature. For most practical purposes, however, virtual networks (VNs) must be provisioned across heterogeneous administrative domains managed by multiple infrastructure providers (InPs).
In this paper, we present PolyViNE, a policy-based inter-domain VN embedding framework that embeds end-to-end VNs in a decentralized manner. PolyViNE introduces a distributed protocol that coordinates the VN embedding process across participating InPs and ensures competitive prices for service providers (SPs), i.e., VN owners, while providing monetary incentives for InPs to participate in the process even under heavy competition. We also present a location-aware VN request forwarding mechanism – basd on a hierarchical addressing scheme (COST) and a location awareness protocol (LAP) – to allow faster embedding. We outline scalability and performance characteristics of PolyViNE through quantitative and qualitative evaluations.
Network virtualization has gained significant attention in recent years as a means to support multiple coexisting virtual networks (VNs) on top of shared physical infrastructures [1–4]. The first step toward enabling network virtualization is to instantiate such VNs by embeddinga VN requests onto substrate networks. But the VN embedding problem, with constraints on virtual nodes and virtual links, is known to be -hard [5, 6]. Several heuristics [5–9] have been proposed to address this problem in the single infrastructure provider (InP) scenario. However, in realistic settings, VNs must be provisioned across heterogeneous administrative domains belonging to multiple InPs to deploy and deliver services end to end.
One of the biggest challenges in end-to-end VN embedding is to organize the InPs under a framework without putting restrictions on their local autonomy. Each InP should be able to embed parts or the whole of a VN request according to its internal administrative policies while maintaining global connectivity through mutual agreements with other InPs.
Moreover, InPs (i.e., network operators) are notoriously known for their secrecy of traffic matrices and topology information. As a result, existing embedding algorithms that assume complete knowledge of the substrate network are not applicable in this scenario. Each InP will have to embed a particular segment of the VN request without any knowledge of how the rest of the VN request has already been mapped or will be mapped.
Finally, there will be constant tussles between service providers (SPs) and InPs on multiple levels:
Each InP will be interested in getting as much of the deployment as possible put on its equipment, and then optimizing allocation under given constraints. In addition, InPs will be more interested in getting requests for their high-margin equipment while offloading unprofitable work onto their competitors.
SPs are also interested in getting their requirements satisfied while minimizing their expenditure. Tussles might arise between SPs and InPs when each party selfishly try to optimize their utility functions.
Any inter-domain VN embedding mechanism must enforce proper incentives and mechanisms to address these tussles.
In this paper, we introduce PolyViNE, a policy-based end-to-end VN embedding framework that embeds VNs across multiple InPs in a globally distributed manner while allowing each concerned InP to enforce its local policies. PolyViNE introduces a distributed protocol that coordinates the participating InPs and ensures competitive pricing through repetitive bidding at every step of the embedding process.
We do not claim PolyViNE to be the best or the only way of performing end-to-end VN embedding. However, to the best of our knowledge, this is the first foray into this unexplored domain in the context of network virtualization, and we believe this problem to be absolutely critical in realizing network virtualization for most practical purposes.
The rest of the paper is organized as follows. Section 2 formally defines the inter-domain VN embedding problem. In Section 3 we describe the design choices and the distributed embedding protocol used by PolyViNE, followed by a discussion of its enabling technologies in Section 5. Section 6 and Section 7 respectively provide preliminary quantitative and qualitative evaluations of PolyViNE. We discuss related work in Section 8. Finally, Section 9 concludes the paper with a discussion on possible future work.
The intra-domain VN embedding problem is well-defined in the literature [5–9]. In this section, we formally define the inter-domain VN embedding problem. For simplicity, we avoid intra-domain aspects (e.g., node and link attributes) wherever we see fit. We use the notation introduced here to discuss the details of the PolyViNE protocol in section 3.
We denote the underlay (shown in Figure 1b) as a graph G U =(N U ,L U ), where is the set containing border nodes across all InPs (1≤i≤D)and L U is the set of physical inter-domain links connecting the border nodes between two InPs.
However, the underlay does not have the full connectivity, which is achieved through simple topology abstraction method . All border nodes belonging to a single InP are collapsed to one single node corresponding to that InP (Figure 1c) in this representation resulting in a multigraph G W =(N W ,L W ), where N W essentially is the set of InPs in the underlay and L W (=L U ) is a multiset of inter-domain links that connect the InPs. G C =(N C ,L C )is a simple graph (Figure 1d) referring to the controller network , where N C (=N W ) represents the set of Controllers in InPs and L C is the set of links between Controllers obtained from the multiset L W .
Similar to substrate networks, we model VN requests as weighted undirected graphs and denote a VN request by G V =(N V ,E V ). We express the requirements on virtual nodes and virtual links in standard terms [6, 8]. Figure 1e depicts a VN request with virtual node and link requirements.
Each VN request has an associated non-negative value R V expressing how far a virtual node n V ∈N V can be placed from the location specified by loc(n V ), which can be interpreted as the preferred geolocation of that virtual node. Figure 1f shows the substrate nodes within the preferred geolocation for each virtual node using dashed vertical boxes.
From PolyViNE’s point of view, an end-to-end VN assignment is performed on the controller network, G C .
The VN request G V =(N V ,L V ) is partitioned into K subgraphs such that and , where is the set of virtual links that will cross domain boundaries. In this version of the PolyViNE protocol, we also consider subsets of , where . We define to be the set of all virtual links with a single incident virtual node mapped in and another node mapped in and an inter-domain path mapping that crosses . Thus, is simply the set of all virtual links crossing inter-domain boundaries with one end mapped .
Now we can formally express inter-domain VN embedding as two mappings, that embeds each subgraph to different InP and that embeds inter-domain links in the InP controller network. Figure 1(i) shows a possible InP-level embedding for the VN request shown in Figure 1(e). Note that, InP# 2has not embedded any virtual node but is still in the embedding by being in an inter-domain virtual link.
In this section, we discuss PolyViNE design decisions, explain its workflow, and describe the distributed protocol that coordinates the PolyViNE embedding process.
We have made the following design choices for PolyViNE aiming toward decentralization of the embedding process, promotion of policy-based decision making, and support for local agility within a flexible global framework.
PolyViNE argues for using a distributed (decentralized) VN embedding solution over a centralized broker-based one. In a centralized solution, the broker will have to know the internal details and mutual agreements between all the InPs to make an informed embedding. However, InPs are traditionally inclined to share as little information as possible with any party. A distributed solution will allow for embedding based only on mutual agreements. Moreover, in a distributed market there will be no single-point-of-failure or no opportunity for a monopolistic authority (e.g., the broker).
PolyViNE allows each InP to use its own policies and algorithms to take decisions without any external restrictions. However, it also creates a high level of competition among all the InPs by introducing competitive bidding at every level of distributed VN embedding. Even though each InP is free to make self-serving decisions, they have to provide competitive prices to take part and gain revenue in PolyViNE. To keep track of the behavior of InPs over time, a reputation management mechanism can also be introduced [12, 13].
PolyViNE decision making and embedding process is deeply rooted into the location constraints that come with each VN request. After an InP embeds a part of a VN request, instead of blindly disseminating the rest of the request, it uses geographic constraints as beacons to route the request to other possible providers. PolyViNE aggregates and disseminates location information about how to reach a particular geographical region in the controller network and which InPs might be able to provide virtual resources in that region.
PolyViNE is an enabling framework for multi-step distributed embedding of VN requests across InP boundaries. In its simplest form, an SP forwards its VN request to multiple known/trusted InPs; once they reply back with embeddings and corresponding prices, the SP chooses the VN embedding with the lowest price similar to a bidding process.
However, a complete end-to-end VN request may not be mappable by any individual InP. Instead, an InP can embed a part of the request and outsource the rest to other InPs in a similar bidding process giving rise to a recursive multi-step bidding mechanism. Not only does such a mechanism keep a VN embedding simple for an SP (since the SP does not need to contact all of the eventual InPs), but it also ensures competitive prices due to bidding at every step.
Through the PolyViNE protocol, when an InP receives a request to map a virtual network, it selects a connected subgraph of the VN to embed, and passes the remaining portion of the VN graph to other InPs. The process is started at the SP where it spawns off k SP instances of the VN request. At each subsequent stage, spawns off copies of the remaining portion of the VN request to an appropriate set of InPs determined by LAP.
The search space for all possible virtual network partitionings across the controller network is vast: O(D n ) where D is the number of InPs in the controller network and n is the number of nodes in the virtual network. Thus, it is infeasible to attempt all possible partitionings of a virtual network across all InPs. PolyViNE, instead, takes a best effort approach to mapping virtual networks onto substrate networks by exploring a constant subset of partitionings.
The PolyViNE protocol attempts to navigate InPs for solutions in a breadth-first-like manner, while fixing maximum depth, d, and varying branching factor based on the pricing model discussed in 3.4, giving an upper bound of visited InPs where is defined to be the maximum branching factor at participating InPs. As discussed below, the budget for processing is fixed, and effectively, so is .
The PolyViNE protocol operates under the assumption that every entity in the controller network is behaving in its own best interest, attempting to maximize its profit. Each entity provides a service (embedding reservation) to its predecessor in the recursive process and requests a service from its successors. It then selects the service that provides the best price and rejects the other services. However, when an InP reserves resources for a partial embedding for its predecessor, it incurs an opportunity cost: those reserved resources could have been used to service another VN request. For simplicity, we assume the opportunity cost is some constant per InP per VN request. Thus, an InP charges its predecessor a processing fee. This has the effect of producing a trade-off between exploration of the space of possible embedding solutions, and price. The more InPs visited and solutions explored, the more processing fees incurred. Thus, a high branching factor (k InP ) at an InP can be extremely expensive while a lower branching factor reduces the search space (potentially increasing prices), and increases the chance of failure (not finding a feasible solution to the VN constraints in the search horizon).
In this model, the SP sets an upper bound on the processing fees as a ratio relative to the embedding budget (e.g. 1:2processing fee to embedding fees). For example, an InP may wish to embed a virtual network for a maximum of $5000 and pay no more than an additional $2500 for processing. The processing fee cap implicitly limits the search space. We leave it up to the discretion of the InP to choose how to distribute the processing fee allocation to successors, and how many successors to relay the VN request to (k InP ). k InP may, for example, be expressed as a function of the processing fee allocation such that as the allocation grows so does the branching factor.
This model disincentivizes entities (SPs and InPs) from flooding the controller network to search for a cheaper solution. As each InP takes a processing fee, we eventually run out of money in the processing fee allocation, effectively reducing search depth. On the other hand, given a fixed fee allocation, a smaller branching factor increases search depth.
This model also provides InPs an additional incentive to participate in finding an embedding for a given virtual network as it will receive compensation for its work. When an entity sends an EMBED message to another entity, it enters a contractual agreement to pay a processing fee up to an upper bound it specifies.
In order to exchange information between the SP and the InPs, and to organize the distributed embedding process, a communication protocol must be established. We refer to this protocol as the PolyViNE Protocol, which is based on eleven types of messages. These messages are sent and received asynchronously between concerned InPs and the SP to carry out the embedding process from beginning to end. The protocol messages are described in the following:
EMBED (Req_id, G,, state_table, budget_remaining, processing_allocation_remaining, InPSet): This message is sent from the SP to InPs to initiate the embedding process of the VN request G with an empty InPSet. Upon receipt of this message, an will decide whether to process the request, or report failure based on its policies as well as processing_allocation_remaining. If determines that it requires more money to process the message than is allocated by the SP, then it will report failure. If an InP processes the request but later determines that the allocation would go over budget_remaining, it will cancel the reservation, and report failure.An InP also uses this message to outsource the unmapped part of the request after appending itself to InPSet, and updating G and the partial embedding as necessary. Req_id and state_table together uniquely identify a particular instance of the VN request (see section 4.5).
EMBED_SUCCESS: Once an embedding is successfully completed, an InP replies back to its predecessor with a price and . pred_state_id is a unique identifier used to call up the relevant state stored at the predecessor entity (SP or InP). The succ_id is a unique identifier indicating which InP sent the message.
EMBED_FAILURE (pred_state_id, succ_id, error_desc): In case of a failure, an InP replies back with a description outlining the reason of failure using error_desc.
EMBED_REJECT (pred_state_id, pred_id, succ_state_id): An InP may reject a mapping provided by one of its successors if its mapping does not meet the predecessor InP’s policy, or a better mapping has been discovered and chosen or the predecessor InP has itself received an EMBED_REJECT message and so it must also recursively reject successors.
EMBED_REJECT_ACK (pred_state_id, succ_id): When an InP is instructed to reject an embedding, it first recursively rejects any partial embeddings by successors, any inter-domain paths leading to it from predecessors, and finally deallocates all resources allocated locally for the given embedding request instance. Once all that has completed, it reports an acknowledgement to the predecessor that issued the EMBED_REJECT message.
LINK: Once an InP, finishes mapping a subgraph , it sends a LINK message to each of its predecessors to map if , the set of virtual links that map to inter-domain paths that pass through and end in .
LINK_SUCCESS (pred_id, pred_state_id, succ_state_id): Once successfully maps , it reports back the price of the link mapping to , along with pred_state_id, a unique identifier used to call up the latest allocations made at for the given VN request instance.
LINK_FAILURE (pred_id, succ_state_id): If fails to map due to insufficient resources or policy violations, it reports back LINK_FAILURE to .
LINK_REJECT (pred_state_id, succ_id): If any fails to map or if ’s partial embedding is rejected by an EMBED_REJECT message, then issues LINK_REJECT to all requesting they release their reservations for .
LINK_REJECT_ACK (pred_id, succ_state_id): Once releases , it replies to with an acknowledgement for rejecting a successful partial link embedding.
EMBED_ACCEPT (succ_state_id): Once an SP decides on an embedding after receiving one or more EMBED_SUCCESS messages, it will acknowledge the embedding by directly contacting the InPs involved using this message.
While an SP’s workflow is straightforward with a single decision at the end, it shifts much more work to the InPs. An InP has to work through several steps of decision making, organizing, and coordinating between heterogeneous policies to complete the embedding process.
Upon receiving a VN request, an InP must decide whether to reject or to accept the request. It can reject a VN request outright, in case of possible policy violations or insufficient processing budget provided by the predecessor, returning an EMBED_FAILURE message to its predecessor. Even if there are no discernible policy violations, it might still need to reject a VN request if it fails to profitably embed any part of that request or if it fails to find an embedding that meets the budget constraints.
In order to decide which part of a VN request to embed, if at all, the InP can use existing intra-domain VN embedding algorithms [6, 8] that can identify conflicting resource requirements in a VN request. This can be done iteratively by looking into the output of the linear programs used in both [6, 8] without modifying the actual algorithms presented in those work, and trimming out parts of the virtual network until a feasible solution is found. However, we argue that this heuristic may not be sufficient for high quality or even feasible partial embeddings. In particular, we must ensure that if an InP maps a virtual link, it also maps the two nodes incident to it. We also wish to minimize the number of virtual links that map across multiple domains as inter-domain paths tend to be long and thus are more costly.
In case of a failure, the InP will send back an EMBED_FAILURE message (optionally with reasons for the failure). However, sometimes the InP might know of other InPs that it believes will be able to embed part or whole of the VN request. In that case, it will relay the VN request forwarding the EMBED message to that InP after adding itself to the InPSet. In Figure 3, InP# 2 is relaying the VN request G’ to InP# 3.
PolyViNE expects each to complete a mapping of all virtual links in the set prior to forwarding the remainder of the VN request to new participants. This ensures that no additional InPs will be brought in to participate in the mapping before current participants are sure inter-domain paths are feasible between them and satisfy their respective policies.
At a given stage j of the embedding process, receives an EMBED message that contains an ordered set of InPs participating so far, InPset containing . For each virtual link, , must identify the predecessor mapping the other node incident to . Once identified, adds each to . Subsequently, for each set , issues LINK messages containing that set to each instructing it to map the virtual links in across its domain.
i = m: must map a path from the virtual node incident to in to the border node leading to .
i = j: must map a path from the border node leading to to the virtual node incident to in .
If all the virtual links specified by the LINK message are mapped successfully by the receiving InP (no policies are violated and physical resources are available to satisfy the paths), then it responds with a LINK_SUCCESS to the sender. Otherwise, the recipient will respond with a LINK_FAILURE message to the sender.
An embedding at a given InP is considered successful if and only if at least one node is mapped by the InP and all inter-domain paths are successfully mapped. If one or more predecessor InPs are unable to map inter-domain paths, then the resource reservations must be released. The current InP issues a LINK_REJECT message to all InPs that responded with a LINK_SUCCESS message. The InP then waits for acknowledgement that resources have been freed through a LINK_REJECT_ACK message. Once all pending acknowledgements have been received, the InP releases the resources it allocated locally and issues an EMBED_FAILURE message to its direct predecessor (see section 4.9 for more details on roll-back of resource allocations).
The design of PolyViNE allows for a fairly straightforward analysis of message complexity. If we assume that the implementation of the protocol at each InP does not involve inter-domain paths through new InPs that have not seen the given instance of the VN request, then in the worst case, with n participating InPs, each will visit all predecessors to map virtual links to inter-domain paths. The number of participants is bounded by d. Thus, the message complexity to map a virtual network is O(min(n,d)2). PolyViNE attempts a total of mappings (recall section 3.3) giving a total complexity of .
The service provider issues a VN mapping request to InP #1.
It performs a partial mapping of the VN request onto its substrate network. The remaining portion of the VN request is forwarded to InP #2.
InP #2 fails to map the VN request, and so it relays the EMBED message to InP #3. Subsequently, InP #3 successfully maps the remaining portion of the request, leaving virtual link to inter-domain path mappings remaining.
InP #3 sends a LINK message to InP #1 containing the virtual link reference set .
InP #3 sends a LINK message to InP #2 containing the virtual link reference set .
InP #3 sends a LINK message to itself containing the virtual link reference set .
InP #1 successfully maps paths for the virtual links in from the substrate node resources reserved for virtual nodes A and B to the B 1border node. Additionally, it allocates sufficient bandwidth for on the B 2 B 3inter-domain link (see Figure 2). It reports back LINK_SUCCESS to InP #3 including the total cost of the mapping of .
InP #2 successfully maps paths for the virtual links in from the B 2border node to the B 3border node. Additionally, it allocates sufficient bandwidth for on the B 3 B 4inter-domain link (see Figure 2). It reports back LINK_SUCCESS to InP #3 including the total cost of the mapping of .
InP #3 successfully maps paths for the virtual links in from the B 4border node to the substrate node resources reserved for virtual nodes C and D. It reports back LINK_SUCCESS to itself including the total cost of the mapping of .
InP #3 sees that all participating InPs have reported back LINK_SUCCESS and so the VN mapping is complete. It accumulates the prices it received from the LINK_SUCCESS messages and adds the cost of its own local mapping to produce a total that it sends back to its predecessor InP #2 within an EMBED_SUCCESS message.
InP #2 receives InP #3’s EMBED_SUCCESS message. It compares the price it receives from InP #3 ($300) with that of InP #4 ($400), rejects InP #4’s solution, and selects InP #3’s solution (see Figure 4). It adds its own local embedding price ($0 in this case, as it did not map any nodes, and the link allocations in InP #2 were accounted for by InP #3’s offer) to produce a total that it sends back to InP #1 within an EMBED_SUCCESS message.
InP #1 receives InP #2’s EMBED_SUCCESS message. In our example , and so InP #1 immediately adds the cost of its local mapping ($200) to the price of InP #2’s solution ($300). It reports the total price of the mapping ($500) back to the SP.
After the solution in Figure 2(c) is rejected, the SP accepts InP #1’s mapping, allowing the InP to instantiate and setup virtual machines on substrate nodes.
The SP accepts InP #2’s mapping, allowing the InP to instantiate and setup virtual machines on substrate nodes.
The SP accepts InP #3’s mapping, allowing the InP to instantiate and setup virtual machines on substrate nodes.
The message sequence corresponding to the solution in Figure 2 (c) would be very similar to the sequence in Figure 3 save for the final few message exchanges. The solution at InP #4 costs more than that at InP #3, and so it will be rejected. In particular, at step 11, InP #4 would receive an EMBED_REJECT message from InP #2, and would subsequently issue LINK_REJECT messages to each of the participating InPs. InP #4 would wait for LINK_REJECT_ACK from the participating InPs, and then will report EMBED_REJECT_ACK to its predecessor, InP #2 (see section 4.9 for more details on roll-back of resource allocations).
When an InP receives an EMBED message with virtual network G and finds a mapping of some subgraph of G, the InP reserves those resources for that particular VN request. The resources are reserved until either the PolyViNE protocol determines that there are no successors that can satisfy the remaining portion of the request or a predecessor has rejected the mapping provided by the current InP. Subsequently, when an InP receives a LINK message, it must call up the previous resource allocation associated with the current instance of the virtual network mapping request and bundle any link allocations it performs with the previous resource allocations.
We define a flow to be an ordered set of InPs visited to map a given virtual network V N ij with a unique identifier i, requested by some service provider, S P j . It is evident that the flow and corresponding to our running example in Figure 2 (b) and (c) respectively are mutually exclusive, as they are two different instances of the same VN request and, ultimately, at most one instance will be accepted by the SP. Thus, while the two instances share the same node embedding state produced by the initial EMBE D1 message, their subsequent link allocation state produced by the LINK messages are different and independent. Thus, any implementation of PolyViNE must be able to identify flows in order to be able to store and call up the appropriate state specific to that flow. We propose a solution to identify flows and their associated state at an InP. At any point in the embedding process, an InP only sees a prefix of a final InP set. After processing an EMBED message, an InP will spawn off k InP mutually exclusive instances of the VN request to map its remaining portion. At this point, the k InP flows diverge but they share a common prefix: the state formed in response to the EMBED message.
We propose bundling the allocations performed in response to EMBED and LINK messages in transactional state objects which we will call EmbeddingState objects from this point forward. Allocating an EmbeddingState object also reserves an associated unique state identifier, called an EmbeddingId, which is used to call up state. EmbeddingIds at each InP in the flow so far are bundled in EMBED messages in a state table. The state table is simply a mapping from a unique InP identifier (such as an IP address), to an EmbeddingId for the latest EmbeddingState object for the flow at the given InP.
When a given InP, sends a LINK message to a predecessor , it includes the EmbeddingId associated with the flow at that it received through EMBED message’s state table. creates a new EmbeddingState object, as a child of the previous EmbeddingState for that flow, along with a new EmbeddingId. When the link allocation successfully completes, reports LINK_SUCCESS back to along with the new EmbeddingId. As the PolyViNE embedding process progresses, InPs participating in the process begin to form chains of EmbeddingState objects associated with the flow. Flows that share a common prefix may share a prefix of an EmbeddingState chain, but diverge at some point, thus forming an EmbeddingState tree. Flows that do not share a prefix form disconnected state at any given InP. Thus, for a given InP and a given VN request, PolyViNE state consists of a forest of state trees.
When InP #1 maps a part of the VN request V N ij , it allocates an EmbeddingState object, stores the mapping with the EmbeddingId 0.
The mapping (ID(InP# 1)→0)is stored in the state table attached within the message EMBE D 2.
InP #3, and InP #4 send the messages LIN K 3and LIN K 6respectively to InP #1 with the field pred_state_id = EmbeddingId 0 allowing InP #1 to call up the state associated with .
When InP #1 maps the link allocation, , it creates a new EmbeddingState object with a InP-level unique EmbeddingId 1 as a child to the EmbeddingState object housing the mapping. Similarly, when InP #1 maps the link allocation , it creates another EmbeddingState object attached to the same parent with EmbeddingId 2.
InP #1 responds to InP #3 and InP #4 with
LINK_SUCCESS messages with pred_state_id = 1 and 2, respectively.
When a subsequent message requests state associated with one of the two new EmbeddingId s, we are able to disambiguate between the two flows, despite their sharing a common subgraph mapping.
Resource reservation blowup is a major issue in any implementation of PolyViNE. In our implementation of a simulation of the PolyViNE protocol, we quickly realized that in the worst case, the resources allocated in a single InP could grow exponentially as a function of d, the maximum search depth. The total number of flows explored by PolyViNE to find a good inter-domain VN embedding is . In a pathological case, some InP, may be involved in as many as flows. If an InP reserved resources for each flow of a given VN, then we could very quickly end up in a situation where a single, relatively simple VN request drains an InP of all its available resources.
However, we note that for any given VN request, a service provider will ultimately only accept a single flow of the which will be explored. This means that an InP only needs to reserve sufficient resources for any one such flow per VN request. We denote the capacities on all substrate nodes and links in , with vectors for nodes and = for links where each component indicates the maximum capacity of that resource. Each EmbeddingState object, at with EmbeddingId k can be thought to be composed of two resource allocation vectors, one for nodes and one for links, indicating the resources allocated by that state object in addition to its parent state object: .
The InP then simply applies a component-wise maximum (cmax) of the multiset representing all resource allocation state for a given VN request, , and reserves those resources. As more allocations are made for at , the InP updates its multiset and adjusts its resource reservations accordingly, leaving resources available for other VN requests. As all the mutually exclusive flows are for the same virtual network, the allocations at any given InP will be relatively similar and so, the component-wise maximum will typically not be much more than those required of any one flow.
If an InP can only partially embed a VN request, it will have to forward the rest of the request to other InPs in the controller network in order to complete the VN request. An InP should take care to not forward a VN request to another InP already in the InPSet to avoid cycles. For example, InP# 1in Figure 3 is forwarding the unmapped VN request G’ to InP# 2. Similar to SPs, InPs also forward the request to k InP (≥1)InPs for similar reasons (e.g., competitive prices). While forwarding a request, an InP can prefer to perform a transformation on the VN request in order to hide the details of its mapping (as in Figure 1h). At this point, it can use one of the two possible methods for unmapped VN request forwarding:
Recursive forwarding: In this case, when an InP forwards a VN request, the receiver InP embeds part of it based on its policies and forwards the rest further away to another InP.
Iterative forwarding: In iterative forwarding, the receiver InP return the control back to the sender InP once it is finished with embedding.
In any case, the forwarding decision is a non-trivial one and requires careful consideration. We believe that instead of blindly forwarding based on some heuristics, we can do informed forwarding by utilizing the location constraints attached to all the virtual nodes in a VN request. Details of this forwarding scheme are presented in the next section.
The VN request proceeds from one InP to the next, until either the maximum number of participants d has been reached, there are no available InPs to send the request to or the VN request has been satisfied completely. In case of a successful embedding of a VN request, the EMBED_SUCCESS message carries back the embedding details and corresponding price. At each step of this back-propagation of EMBED_SUCCESS and EMBED_FAILURE messages, the sender InP can select mappings based on internal policies or lower price or some other criteria and rejects the other successful embeddings by issuing EMBED_REJECT messages to the appropriate successors.
As VN embeddings follow paths back to the SP, the prices are accumulated and the SP ends up with multiple choices (Figure 4).
Within a single domain, a VN embedding is transactional in nature, as an embedding must be completed as a whole, or not at all. In the multi-domain scenario, each InP is free to map a subgraph of the embedding and so the algorithm used to perform that partial mapping may or may not be transactional (it’s up to the discretion of the InP how to implement it). However, from the SP’s perspective, the multi-domain scenario is the same as that of the single domain: it expects either a completed embedding reservation or a report of failure. In other words, in a given flow, either all participating InPs succeed in reserving resources or none of them reserve resources. This means that the PolyViNE protocol itself must provide a mechanism to roll-back the work done by other InPs in a given flow once a failure or rejection occurs. An SP is expected to accept only one flow, and reject all other flows. Rejection initiates a roll-back process of all resources allocated for that flow.
Two messages in the protocol can initiate resource allocations within an InP: EMBED, and LINK. Thus, corresponding roll-back messages must exist in the protocol: EMBED_REJECT and LINK_REJECT. In order to simplify the implementation of a controller’s PolyViNE message handling system and avoid race conditions, associated acknowledgement messages EMBED_REJECT_ACK and LINK_REJECT_ACK act as barriers to ensure that roll-back occurs in the opposite order to allocation. Note that link allocations corresponding to the set for a given are unordered as there are no dependencies among them. However, state dependencies exist between subgraph allocations on one InP and the next, and so PolyViNE ensures that roll-back occurs in the opposite order to allocation through the *_ACK messages.
The embedding is incomplete at and no successors are found or all successors fail for some reason.
An InP participating in an inter-domain path allocation on behalf of responds back to with LINK_FAILURE.
An embedding solution fails to meet the budget constraints.
The processing fee budget has been exhausted.
Internal InP policies do not allow the embedding to proceed.
InP #3 issues LINK_REJECT messages to InPs #1 and itself (for consistency purposes) with the appropriate EmbeddingIds informing them to deallocate their respective allocations for virtual links a and c. InP #1 and InP #3 release their link mappings, and report LINK_REJECT_ACK to InP #3 (steps 12-13). Subsequently, InP #3 releases its subgraph embedding for the VN request and reports EMBED_FAILURE to its predecessor, InP #2 (step 14). InP #2 also sees that InP #4 has failed. Thus, InP #2 has seen that all its successors fail, and so it must also fail. InP #2 has no subgraph embedding, and so it simply reports EMBED_FAILURE to its predecessor InP #1 (step 15). InP #1 has only one successor and so it must fail as well. It releases its subgraph embedding for the VN, and reports EMBED_FAILURE to the SP.
Naïvely an InP can forward a VN request to a set of InPs in the controller network at random. However, this decision is blind to the location requirements of the virtual nodes and the availability of virtual resources at the destination InP to satisfy the constraints for the VN request. This may result in high failure rate or prices well above the fair value. To avoid flooding a VN request or sending it to random InPs which might be unable to meet the constraints of the request, we propose using location constraints associated with unassigned virtual nodes to assist an InP in making this decision. Location constraints of the virtual nodes together with the location information of the underlay will allow informed VN request forwarding in the controller network.
To accommodate such location aware forwarding, we introduce a hierarchical geographic addressing scheme with support for aggregation, named COST. InPs in PolyViNE must associate COST addresses with all the substrate nodes and SPs must express location requirements in terms of COST. Controllers in different InPs publish/disseminate information about the geographic locations of their nodes along with the unit price of their resources. They can then aggregate and disseminate data collected from all neighboring Controllers to build their own knowledge bases of location to InP mappings, each accompanied by path vectors of InPs in the controller network and corresponding prices. We propose Location Awareness Protocol (LAP) to perform this task. Careful readers will notice in the following that COST and LAP are significantly influenced by BGP.
As outlined in the problem formulation (Section 2), each virtual node in a VN request comes with a permissible geographic region in which it must be embedded. One design question at this point is how to represent and encode the geolocation. We have chosen a hierarchical geolocation representation scheme similar to  with the form Continent.cOuntry.State.ciTy (hence the name COST). Even though in this paper we are using a simple postal address like scheme for simplicity, any hierarchical geolocation representation system will work with PolyViNE.
A virtual node may restrict its location preference to any prefix in this addressing scheme. For example, to restrict a node within Canada, one may assign the address NA.CA.* to a virtual node. This indicates that beyond requiring that the node be mapped within Canada, the SP does not care where in the country it is ultimately mapped.
On the other hand, each substrate node has a complete COST address associated with it. This address indicates within which city lies the given substrate node. If an InP is not willing to share the exact location, it can always choose a higher level address. For example, instead of announcing nodes in Toronto using NA.CA.ON.Toronto, the InP can announce NA.CA.ON.*. However, such announcements can result in receiving of VN requests that it may never be able to satisfy, which will affect its reputation among other InPs.
Location Awareness Protocol (LAP) is a hybrid of Gossip and Publish/Subscribe protocols that assists an InP in making informed decisions about which InPs to forward a VN request to without making policy violations, and thus progressing toward completing the VN embedding. Controllers in different InPs keep track of the geolocations of their internal substrate nodes in COST format and announce availability and prices of available resources to their neighbors using LAP updates in the controller network. This information is aggregated and propagated throughout the controller network to create global view of the resources in the underlay in each Controller’s LAP database.
However, in a rapidly changing environment with continuously fluctuating prices, gossip may not be sufficient to disseminate updated prices in a timely fashion. To reduce the number of failures stemming from staleness of pricing information, we propose extensions to LAP using a Publish/Subscribe mechanism along with its basic gossip protocol. By using this mechanism, any InP will be able to subscribe to announcements of Controllers that are not its direct neighbors. While we leave VN request routing decisions to the discretion of InPs, an InP may use the pricing information to prefer forwarding the VN request to a lower priced InP, all other things being equal.
The question that remains open to more investigation is why would an InP be honest when announcing pricing estimates? We believe that a reputation metric – indicating long-term accuracy of an InP’s pricing estimate to the actual cost of establishing a VN request – is necessary to remedy this situation. We would like to integrate such a reputation metric within LAP to allow dissemination of path vectors attributed with corresponding prices and overall reputation score of the InPs on the paths. An InP will then be able to use pricing and reputation scores to rank multiple paths to a common destination to make a forwarding decision.
We have written a 12000 line multi-threaded C++ simulator that allows independent responses from various entities in the controller network. The simulation comprises a complete implementation of the entire set of protocol messages discussed in section 3.6.
We examine four sets of experiments. In our first set of experiments, we look at some of the properties of inter-domain embeddings generated by PolyViNE as the VN request size (node count) is varied. In our second set of experiments, we look at some of the properties of the PolyViNE embeddings as the pricing model attributes are varied (embedding and processing budgets). In our third set of experiments, we look at properties of PolyViNE embeddings as we vary the maximum number of InPs involved in a mapping. In our last set of experiments, we examine the reliability of the PolyViNE protocol and the cost of embeddings generated in the case of lost or stale LAP information.
Each experiment is run to completion (a VN request has completed) multiple times per data point to produce the averaged results presented here. We found that that variation in tests was very small and so we did not include confidence intervals. Unless otherwise specified, we have used the following settings: For each experiment, we randomly create a controller network with 60 InPs. Each InP network consists of 120 to 150 nodes and 540 to 600 links on average. Each node has a maximum CPU capacity uniformly chosen from 1 to 100 CPU units, and each link has a maximum bandwidth capacity of 100 bandwidth units. Locations of substrate nodes are sampled from a normal distribution with a mean, and variance chosen uniformly from 0 to 255 representing 256 different major cities. InPs with low variance location distributions are effectively local or regional InPs, while high variance InPs have nodes that span the globe. The per unit cost per resource is chosen from a normal distribution with a mean sampled from a prior, per InP, normal distribution (of mean 4, variance 1) and a variance of 1. This means that some InPs will tend to be cheaper than others, on average.
Unless otherwise specified, each VN request has an expected value of 30 nodes, and 120 links (±20%). Each virtual node has a maximum CPU capacity uniformly chosen from 1 to 25, and each virtual link has a maximum bandwidth capacity of 1 to 15, chosen uniformly as well. Locations are chosen uniformly from 256 major cities represented in the controller network.
Each InP charges an embedding processing fee of 1 unit (±20%). A maximum InP count per flow (d) is a property of the VN request. Based on this property, an InP estimates the maximum branching factor it can use so that up to d InPs can be involved in a flow and uses that branching factor. Thus, the entire processing budget is always consumed. The processing fees are not refunded if the flow fails.
We begin by varying the number of nodes in a VN request and observing the properties of the embeddings and the success rate of the flows. In each experiment, VN requests have n nodes (where n is varied) and 4n links. We also fix K SP =7, maximum InPs per flow to 9, processing budget : embedding budget to 0.75:1, and embedding budget to 25000. The goal of this experiment is to test the limits of PolyViNE’s ability to find solutions given a fixed set of resources available to it (Large VN requests relative to InP resources, limited processing and embedding budgets, and thus limited search space). We expect that after a certain point, VN requests will get so large that no mapping will be possible, either because the request hits InP resource limits or it hits budget limits.
In this experiment, we vary the embedding budget and processing budget of a VN request and observe their impact on the success rate and total embedding cost (embedding + processing). In each experiment, VN requests have 30 nodes and 120 links (± 20%). We also fix K SP =7, and maximum InPs per flow to 8.
Budget allocation is distributed at each InP assuming a full n-ary subtree. Thus, at least half of the InPs in the search space are last hop InPs given a fixed search depth. Since processing budget is evenly distributed across all subtrees, to explore more space, the processing budget would need to increase significantly (dependent upon the current average branching factor on the second last hop).
However, we observe a much clearer correlation between the embedding budget and the success rate. A higher embedding budget tends to improve the flow success rate. Also, we see that as we increase the embedding budget, the variation in success rate between processing budgets decreases, suggesting that a larger processing budget does allow for more flows, but most of those new flows go above the embedding budget. As we increase the embedding budget, fewer of the new flows go over the embedding budget.
The maximum number of InPs involved in a single flow (d) impacts the size of the search space explored. As d decreases, the branching factor per InP tends to increase. In this experiment, we investigate the tradeoff between increasing the number of flows (k InP ) and the maximum number of InPs involved per flow (d) given varying processing budgets ratios (processing budget : embedding budget), and a fixed embedding budget of 25,000.
In the final set of experiments, we wish to assess the reliability of the PolyViNE protocol when faced with InPs that have stale LAP information. The propagation rate of LAP data can vary by relationships between InPs and by location and so we wish to ensure that PolyViNE is able to function under a variety of conditions.
The simple pricing model we suggested in this report succeeds in accomplishing the goal of incentivizing InPs to participate in a highly competitive environment and disincentivizing flooding the controller network to find feasible, low-cost embeddings. However, we did not study the practical implications of this pricing model. It may be possible for an InP to abuse this model. For example, the d th InP may be able to gouge prices, leaving the predecessor with no option but to accept the higher price. In the future, we will investigate alternative pricing models that will accomplish our primary goals while studying the strengths and weaknesses of each model.
Scalability concerns in PolyViNE come from several fronts: size of the search space, dissemination time of location information, and storage of location and price information among others. As the number of InPs increases in the controller network, the amount of control traffic will increase even with the tweaks proposed in this paper. Moreover, the size of stored location and path information will grow very quickly with more and more InPs joining the controller network. We can limit the number of stored paths to a certain destination based on some heuristics (e.g., keep only the top M paths and flush the rest after each update), but such loss can result in degraded embedding. Finally, the freshness of the location information is dependent upon the update frequency and the total number of InPs in the controller network.
Recursive processes, by definition, can go on for a long time in the absence of proper terminating conditions resulting in unsuitable response times. Combining iterative mechanism wherever possible and limiting the level of recursion at the expense of search completeness can improve the response time of PolyViNE. However, the question regarding suitable response time depends on the arrival rate and the average life expectancy of VN requests.
InPs participating in a PolyViNE embedding will face major computation overheads while trying to map the VN request and minor communication overheads due to relaying of the rest of the request. Since for each VN embedding every InP in each step except for the winning bidder will fail to take part in the embedding, the overheads can be discouraging. We are working toward finding incentives for the InPs to partake in the embedding process.
Since each InP will try to selfishly improve its own performance and will not expose its internal information, InPs can lie to or hide information from each other. From previous studies it is known that it is hard to use mechanism design or game theory to thwart such behaviors in a large scale distributed system . Our solution against such behavior is the use of competitive bidding at each step of embedding to expose the market price of any leased resource.
PolyViNE currently uses LAP for informed forwarding of VN requests. However, location information is not the only information available to an InP about other InPs. An InP should be capable of “learning” from past experience. That is, it should be able to collect data on previous embeddings, and make more informed decisions in the future based on its observations of the past.
The VN embedding problem, with constraints on both virtual nodes and virtual links, is known to be -hard [5, 6]. A number of heuristics have appeared in the literature based on the complete separation of the node mapping and the link mapping phases [5–7]. Existing research has also been restricting the problem space in different dimensions: [5, 7] consider the offline version of the problem;  ignores node requirements; [5, 7] assume infinite capacity in substrate nodes and links to obviate admission control; and  focuses on specific VN topologies. Chowdhury et al.  proposed a pair of algorithms that provide improved performance through increased correlation between the two phases of VN embedding, while  proposed a graph isomorphism-based integrated solution that can take exponential time in the worst case. All these algorithms address VN embedding as an intra-domain problem and take advantage of a centralized embedding entity.
Recently proposed V-Mart  framework approaches the inter-domain VN embedding problem using an auction-based model, where the SP performs the partitioning task using heuristics for simplification. As a result, V-Mart cannot enable local and inter-InP policy enforcement and fine-grained resource management.
Unlike inter-domain VN embedding, inter-domain lightpath provisioning [11, 17] as well as cross-domain QoS-aware path composition [12, 18] are well studied areas. UCLP  allows users to dynamically compose, modify, and tear down lightpaths across domain boundaries and over heterogeneous networking technologies (e.g., SONET/SDH, GMPLS etc.). Xiao et al. have shown in  that QoS-assured end-to-end path provisioning can be solved by reducing it to the classic k-MCOP (k-Multi Constrained Optimal Path) problem. iREX architecture , on the other hand, uses economic market-based mechanisms to automate inter-domain QoS policy enforcement through negotiation between participating domains. PolyViNE is similar to iREX in its allowance of intra-domain policy-enforcement and in using market-based mechanisms, but iREX is concerned about mapping simple paths whereas PolyViNE embeds more complicated VN requests. PeerMart  is another auction-based marketplace for resource trading in a network virtualization environment, but it basically deals only with virtual links.
The geographic location representation and related information dissemination protocol proposed in PolyViNE is inspired by the previous proposals of geographic addressing and routing in IPv6 networks [14, 20] as well as the predominant global routing protocol in the Internet, BGP . However, unlike these works, PolyViNE does not use the information for addressing or routing purposes; rather it uses the location information to find candidate InPs that will be able to embed part or whole of the remaining unmapped VN request. Moreover, such location information is disseminated between and stored in Controllers instead of border routers as in BGP or GIRO . The concepts of Controllers in InPs and controller network connecting multiple InPs’ Controllers are discussed in the iMark framework .
In this paper we have formally defined the inter-domain VN embedding problem and presented PolyViNE – a novel policy-based inter-domain VN embedding framework – to address it. PolyViNE allows embedding of end-to-end VNs in a distributed and decentralized manner by promoting global competition in the presence of local autonomy. We have laid down the workflows of InPs and SPs throughout the PolyViNE embedding process and identified the most crucial stage in the InP workflow, VN request forwarding. In this respect, we have proposed a hierarchical addressing system (COST) and a location dissemination protocol (LAP) that jointly allow InPs to make informed forwarding decisions. We have also presented preliminary performance characteristics of PolyViNE through simulation.
In the future we would like to address issues such as pricing models, InP interactions, reputation management, and incentives for InP truthfulness. Relative advantages and disadvantages of contrasting choices (e.g., recursive vs iterative forwarding) in different stages of InP workflow should also be scrutinized. Finally, the scalability, stability, and performance characteristics of PolyViNE require further studies through larger simulations and distributed experiments with a heterogeneous mix of intra-domain VN embedding algorithms and policies.
Another interesting direction of research for this problem would be to model it as a distributed constrained optimization problem (DCOP) and to try to solve that with minimal information exchange between InPs.
aThe words ‘embedding’, ‘mapping’, and ‘assignment’ are used interchangeably throughout this paper.bWe will use the terms InP and substrate network interchangeably throughout the rest of this paper.cEach InP uses its own pricing mechanism by which it attaches a price to any embedding it provides.
This work was supported in part by the Natural Science and Engineering Council of Canada (NSERC) under the Smart Applications on Virtual Infrastructure (SAVI) Research Network, and in part by the World Class University (WCU) Program under the Korea Science and Engineering Foundation funded by the Ministry of Education, Science and Technology (Project No. R31-2008-000-10100-0).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.