Saturday, January 14, 2012

White Paper: Sizing WAN Access Links

A PDF version of this paper is available here.

Table of Contents

1. INTRODUCTION
2. USING THE POISSON DISTRIBUTION FOR TRAFFIC MODELLING
3. DIMENSIONING VIRTUAL CIRCUITS
    3.1 MULTIMEDIA STREAMING DATA
    3.2 NON-REAL-TIME DATA
4. TRAFFIC DIMENSIONING
5. NOTES

1. INTRODUCTION

Imagine an office, part of a multi-site company. It has 12 staff who undertake normal office activities: access their email (with attachments); download Word, PowerPoint and Excel files from the remote data centre; make multimedia calls from their desktop conferencing software. All this traffic needs to enter/exit their site on the Wide-Area Network (WAN) access link, so what capacity link do they require? This White Paper describes a process to work it out.

In the old days, networking was circuit-switched telephone calls. Suppose there were 12 desk phones for the 12 staff. If they all phoned at once, the access circuit would need to carry 12 voice channels (at 64 kbps each this would be a 768 kbps link). But what a needless expense: the chances of 12 simultaneous phone calls being made are vanishingly small in normal business.

Instead, a model is used to work out the probability of 0, 1, 2, 3, 4, 5, ... calls being made simultaneously. If the probability of a given user being on the phone at any moment in the Busy Hour is 0.1 (a six-minute phone call during that hour) then 12 users contribute 12 * 0.1 = 1.2 Erlangs of traffic. If we want no more than a 1% chance of someone getting an engaged tone when they try to dial, we turn to the Erlang-B tables. These tell us that we need to provide only 5 circuits on our access link: it’s certainly cheaper than provisioning 12 circuits.
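For readers without Erlang-B tables to hand, the lookup can be reproduced with the standard Erlang-B recursion. The following is a minimal sketch rather than the method used to produce the tables; the function names erlang_b and circuits_needed are our own, chosen for illustration.

```python
def erlang_b(traffic_erlangs: float, circuits: int) -> float:
    """Blocking probability for the offered traffic on the given number of circuits."""
    blocking = 1.0  # with zero circuits every call is blocked
    for n in range(1, circuits + 1):
        # Standard recursion: B(n) = A*B(n-1) / (n + A*B(n-1))
        blocking = traffic_erlangs * blocking / (n + traffic_erlangs * blocking)
    return blocking


def circuits_needed(traffic_erlangs: float, max_blocking: float) -> int:
    """Smallest number of circuits whose blocking probability meets the target."""
    circuits = 0
    while erlang_b(traffic_erlangs, circuits) > max_blocking:
        circuits += 1
    return circuits


offered = 12 * 0.1  # 12 users, each busy for 6 minutes of the Busy Hour
print(circuits_needed(offered, 0.01))  # 5
```

With 4 circuits the blocking probability is about 2.6%, which misses the 1% target; with 5 it drops to about 0.6%.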

The Erlang-B formula is the Poisson distribution modified to capture more life-like behaviour when someone calls and can’t get through. However, in the high-level modelling of this paper we will just use the Poisson distribution.

2. USING THE POISSON DISTRIBUTION FOR TRAFFIC MODELLING

Whether it’s multimedia conferencing or data traffic, it’s convenient to conceptualise it in terms of old-fashioned voice calls. Suppose there are a very large number of desk phones, n, and each phone has a probability p of being in use at any particular time. This means that if we inspect any phone at a random time, the probability of finding it in use when we look is p.

Given n phones, the expected (i.e. average) number of phone calls in progress when we take a random look is n * p. Call this number l.

Example: for our 12 people with their desk phones, if each has a probability of 0.083 (i.e. 1/12, a five-minute call) of being on a call at any moment in the Busy Hour, then the expected number of simultaneous voice calls will be 12 * 0.083 = 1.

We would now like to know the probability, at a random moment, of observing k = 0, 1, 2, 3, etc simultaneous phone calls. This is what the Poisson distribution gives us: the formula is:

P(k; l) = (l^k / k!) e^{-l}, where e is 2.71828... and k is 0, 1, 2, 3, etc. Here is the table for l = 1.


Figure 1. Poisson distribution for l = 1 (l is the mean)

In this table, k is the number of simultaneous phone calls and P(k; l) is the probability of observing that value of k. If we add up the probabilities for all values of k, from zero to infinity, we get 1. As you can see, the probabilities get very small very rapidly.

Looking at this table, the probability that k is 0, 1, 2, 3 or 4 is 99.6%. So if we supplied 4 voice circuits, we’d satisfy the site’s phone needs 99.6% of the time. The remaining 0.4% of the time, 5 or more people would be trying to call at the same time and some of them would get the engaged tone.
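The values in the table follow directly from the Poisson formula, so they can be regenerated with a few lines of code. This is a small sketch using only the Python standard library; poisson_pmf is our own illustrative name.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P(k; l) = (l^k / k!) * e^(-l), with l passed in as `mean`."""
    return mean ** k / math.factorial(k) * math.exp(-mean)


mean = 1.0  # l = 1 Erlang
cumulative = 0.0
for k in range(7):
    p = poisson_pmf(k, mean)
    cumulative += p
    # One row per k: the probability of exactly k calls, and of k or fewer
    print(f"k={k}  P={p:.4f}  cumulative={cumulative:.4f}")
```

The cumulative column reaches 0.9963 at k = 4, which is the 99.6% figure quoted above.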

In this example, the number of Erlangs was one (l = 1). In the Introduction we considered the case with 1.2 Erlangs and found we just needed an extra circuit (5 circuits) to give us the required grade of service.

Fine, you may say, but these days we don’t do circuit-switched voice, we do data: streaming data and bursty data. This is true, but we can use the same method of analysis.

3. DIMENSIONING VIRTUAL CIRCUITS
3.1 MULTIMEDIA STREAMING DATA


Let’s return to the 12 people back at the office. Now they’re making multimedia calls using a desktop conferencing system. Let’s suppose that they each make one multimedia call of ten minutes’ duration during the Busy Hour. So each user’s system is in use for 10 minutes out of 60, i.e. with a probability of 1/6 at any given moment. The mean number of simultaneous multimedia sessions, l, therefore equals 12 * (1/6) = 2.

So now we can ask what is the probability of seeing k = 0, 1, 2, 3, ... simultaneous multimedia calls? Here is the Poisson table for l = 2.

Figure 2. Poisson distribution for Multimedia calls, l = 2

If we add up the probabilities for k = 0 to 6 we get 99.55%. So if we supply enough bandwidth for six simultaneous multimedia sessions we will only encounter capacity problems if 7 or more people want to make calls at the same time. The probability of this is less than half a percent.

*** Bottom line: provision capacity for six simultaneous sessions ***
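The step of adding up probabilities until the target is reached can be automated. The sketch below assumes a 99% service target, in line with the grade of service used in this paper; the function name sessions_needed is our own.

```python
import math


def sessions_needed(mean: float, target: float) -> int:
    """Smallest k such that P(0..k simultaneous sessions) >= target."""
    k = 0
    term = math.exp(-mean)  # P(0; l)
    cumulative = term
    while cumulative < target:
        k += 1
        term *= mean / k  # P(k; l) derived from P(k-1; l)
        cumulative += term
    return k


mean_sessions = 12 * (10 / 60)  # 12 users, 10 minutes each in the Busy Hour
print(sessions_needed(mean_sessions, 0.99))  # 6
```

Five sessions would cover only about 98.3% of cases, which is why the answer is six.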

Let’s assume a multimedia session occupies 512 kbps, which we round to 0.5 Mbps. We will need this when we come to size the access link below.

3.2 NON-REAL-TIME DATA

What about the Word, PowerPoint and Excel downloads; the email attachments; the Internet pages? We proceed by turning these downloads into ‘calls’.

Suppose that each of our 12 users downloads 64 MB in the Busy Hour. If there is contention, TCP will simply slow down the download so let’s decide on a minimum rate we’re prepared to give users: say 1 Mbps each.

The second assumption we need to make is how many separate files get downloaded per user (a file download looks like a ‘call’) during the Busy Hour. Let’s assume 16 files per user spread over the data files, email attachments and Internet traffic. We’re assuming therefore that the average file is 64/16 = 4 Megabytes = 32 Megabits.

At 1 Mbps, the slowest speed we find acceptable, this will take 32 seconds. This file download has a probability of ‘occupying’ the access link of 32/3,600 = 2/225 during the Busy Hour.

Note that if the access line rate is greater than 1 Mbps (which it almost certainly will be) then many file downloads will be a lot faster, which lowers the overall probability of contention, so it’s the worst case scenario we’re dealing with here.

So we now imagine 12 * 16 = 192 little ‘file daemons’ all wanting to do their file download (= ‘make a call’) during the Busy Hour, with probability 2/225 each. The expected number of simultaneous downloads is l = 192 * (2/225) ≈ 1.7.

We now need to know the probabilities of k = 0, 1, 2, 3 ... simultaneous file downloads as shown in figure 3 below.

Figure 3. Poisson distribution for Data/Internet traffic, l = 1.7

If we add up the probabilities for k = 0 – 5, we get 99.2%. So we can achieve the required service level if we provision enough bandwidth for five simultaneous file downloads (at 1 Mbps each).
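The chain of arithmetic in this section, from Busy Hour volume to number of simultaneous downloads, can be captured in one place. This is a sketch under the paper's stated assumptions (64 MB and 16 files per user, a 1 Mbps floor, a 99% target); the function names are our own, and the unrounded mean of about 1.707 is used rather than 1.7.

```python
import math


def mean_downloads(users, mb_per_user, files_per_user, min_rate_mbps,
                   busy_hour_s=3600):
    """Expected number of simultaneous downloads (l) in the Busy Hour."""
    file_megabits = mb_per_user / files_per_user * 8   # 4 MB -> 32 Mb
    download_s = file_megabits / min_rate_mbps         # 32 s at 1 Mbps
    occupancy = download_s / busy_hour_s               # 32/3600 = 2/225
    return users * files_per_user * occupancy          # 192 * 2/225


def downloads_needed(mean, target=0.99):
    """Smallest k such that P(0..k simultaneous downloads) >= target."""
    k, term = 0, math.exp(-mean)
    cumulative = term
    while cumulative < target:
        k += 1
        term *= mean / k
        cumulative += term
    return k


l = mean_downloads(users=12, mb_per_user=64, files_per_user=16,
                   min_rate_mbps=1.0)
print(round(l, 2), downloads_needed(l))  # 1.71 5
```

Using the unrounded mean does not change the conclusion: five simultaneous downloads still cover just over 99% of cases.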

As I mentioned, this analysis assumes that ALL file downloads occur at 1 Mbps. In fact most of them would happen much nearer the (faster) basic line rate, so freeing up more time for other downloads in a virtuous circle: in other words, this is very much the worst case.

4. TRAFFIC DIMENSIONING

We are now in a position to dimension the access link so that both multimedia conferencing users and data users meet their expectations more than 99% of the time.

We decided that we need to budget for:

* 6 simultaneous multimedia sessions at 0.5 Mbps each = 3 Mbps.
* 5 simultaneous file downloads at 1 Mbps = 5 Mbps.

Total bandwidth required for the site access link is 3 + 5 = 8 Mbps.

This analysis is conservative but to allow for some headroom and growth, we might order a 10 Mbps access link.

5. NOTES

5.1 Because of the non-linearity of the Poisson distribution, the required access link bandwidth does not scale in direct proportion to the number of users. If we considered 24 users, the access bandwidth required would be less than double the 8 Mbps calculated here, i.e. less than 16 Mbps.

5.2 We have considered the streaming real-time and data non-real-time traffic requirements separately and simply added them together. This is conservative: there would be further stochastic gains in reality – but not sufficient to invalidate the analysis here.

5.3 Traffic is uploaded as well as downloaded. However, the links are bidirectional and downloads are normally greater in volume than uploads, so sizing the link for the download direction should also cover the uploads.

5.4 In practice the analysis of this paper should be used to create a spreadsheet which automates all the calculations shown above.

5.5 Many file downloads can be expected to gain access to the full access line rate (10 Mbps in the example here). At this rate a 4 MB file will be received in just over three seconds. This is roughly equivalent to what a UK residential broadband service could provide, close to the exchange.
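As an illustration of notes 5.1 and 5.4, the whole calculation can be automated in a few lines instead of a spreadsheet. This is only a sketch: it hard-codes the assumptions of this paper (one 10-minute multimedia call per user at 0.5 Mbps; 16 files of 4 MB per user at a 1 Mbps floor; a 99% target for each traffic type), and the names sessions_needed and link_mbps are our own.

```python
import math


def sessions_needed(mean: float, target: float = 0.99) -> int:
    """Smallest k such that the Poisson probability of 0..k sessions >= target."""
    k, term = 0, math.exp(-mean)
    cumulative = term
    while cumulative < target:
        k += 1
        term *= mean / k
        cumulative += term
    return k


def link_mbps(users: int, target: float = 0.99) -> float:
    """Access link bandwidth: multimedia and data budgets simply added together."""
    # Multimedia: each user is in a call for 10 of the 60 Busy Hour minutes
    multimedia_mean = users * 10 / 60
    multimedia_mbps = sessions_needed(multimedia_mean, target) * 0.5
    # Data: 16 downloads per user, each occupying 32 s of the 3,600 s Busy Hour
    data_mean = users * 16 * 32 / 3600
    data_mbps = sessions_needed(data_mean, target) * 1.0
    return multimedia_mbps + data_mbps


for users in (12, 24):
    print(users, link_mbps(users))
# 12 8.0
# 24 12.5
```

Doubling the office from 12 to 24 users raises the requirement from 8 Mbps to 12.5 Mbps on these assumptions, comfortably under the 16 Mbps that simple doubling would suggest.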