by Stuart Harris
Telephone traffic shares many characteristics with other forms of traffic, such as automobile traffic. Traffic may be very busy and have to slow down and wait, or it may be very light with little slowing or blockage. Facilities such as roads, telephone lines, toll booths, service agents, and bank tellers may be either under- or over-utilized, causing costly idle time or poor service to customers. Call center workforce managers face a daily, continual challenge in finding the right number of agents to use at the right time to handle calls that arrive in a random manner. This paper explores why this challenge occurs and two methods used to meet it.
The fundamental challenge for the inbound call center manager is to correctly balance the number of agents and trunks to the varying volumes of calls throughout the day in order to:
1. Keep all the agents busy, and
2. Keep the time callers have to wait to an acceptable minimum.
The inbound call center is a customer/server/queuing system: customers (calls) arrive in the system to be served by servers (agents or operators), but may have to wait (in queue) for them. Almost a century of mathematical study has gone into the problem of “arrivals” that need service by a “server”.
How Calls Arrive in a Call Center
A simple-minded approach to determining the number of agents needed in a particular hour might be to take the number of calls expected to arrive in the hour, multiply that by the average length of the calls in minutes, and then divide by 60. As an easy example, if 60 calls arrive in 1 hour, each with an average length of 1 minute, you might say that you only need one agent with a single telephone line to answer all the calls.
The obvious flaw in this sort of logic is that call arrivals in the real world are not distributed one right after the other. The average arrival rate in the above example is one call per minute, but the actual arrivals are distributed randomly: some will come in at the same time, some will come in when another is being served, and during some periods of the hour no calls may arrive at all. Also, the length of the calls follows a random distribution. These random distributions are determined by the laws of probability.
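To make the flaw concrete, here is a minimal sketch in Python. It assumes calls arrive as a Poisson process (so the gaps between calls are drawn with `random.expovariate`) and that call lengths are exponentially distributed; the function names and the seed are illustrative choices, not part of any product mentioned in this paper.

```python
import random

def naive_agents(calls_per_hour, avg_call_minutes):
    """Workload-only estimate: total talk minutes divided by 60."""
    return calls_per_hour * avg_call_minutes / 60.0

def simulate_single_agent(calls_per_hour, avg_call_minutes, hours=1.0, seed=1):
    """One agent, random arrivals, random call lengths.

    Returns the average wait in minutes over the calls that arrived.
    """
    rng = random.Random(seed)
    rate_per_minute = calls_per_hour / 60.0
    clock = 0.0          # arrival time of the current call (minutes)
    agent_free_at = 0.0  # time the agent finishes the previous call
    waits = []
    while True:
        clock += rng.expovariate(rate_per_minute)  # gap to the next arrival
        if clock > hours * 60.0:
            break
        start = max(clock, agent_free_at)          # wait if the agent is busy
        waits.append(start - clock)
        agent_free_at = start + rng.expovariate(1.0 / avg_call_minutes)
    return sum(waits) / len(waits) if waits else 0.0

print("naive estimate:", naive_agents(60, 1.0), "agent")
print("average wait with one agent:",
      round(simulate_single_agent(60, 1.0), 2), "minutes")
```

The naive estimate says one agent is enough, yet the simulated callers still wait, because some calls arrive while the agent is busy with another.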
How then can the actual arrival rate of calls in your call center be predicted? While we cannot predict the exact time that each call will arrive, we can predict the probability of when the next call will arrive. This probability follows a well-known distribution called the Poisson distribution. It looks something like the following bar graph:
Figure 1: Poisson Distribution of Call Center Call Arrivals (vertical axis: probability of arrival; horizontal axis: average minutes to next call arrival)
It’s a bell-shaped distribution skewed out to the right. It says that at any given moment, the probability of what the time will be to the next call arrival is “scrunched up” in front of the average time to the next call arrival, and tapers off slowly thereafter. The “hump” of the curve is before the average time to the next call. If the hump were right at the average time to the next call arrival (a normal distribution), then call center staff planning would be far easier than it is.
So what does this mean in practice? The mathematics behind the Poisson distribution tell us that call arrivals will always tend to be clumped together and will not arrive in an even manner. This is the reason why a graph of observed data for % Agent Utilization has a gentle slope, but graphs for Average Speed to Answer and Average Time in Queue have very steep slopes. (See Fig. 2)
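The clumping can be quantified with the Poisson probabilities themselves. The sketch below is only an illustration: it assumes an average of one call per minute (the earlier 60-calls-per-hour example) and uses a hypothetical helper, `poisson_pmf`, to ask how many calls land in any single minute.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k arrivals in an interval with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

mean_calls_per_minute = 1.0  # assumed: 60 calls spread over one hour

p_none = poisson_pmf(0, mean_calls_per_minute)
p_one = poisson_pmf(1, mean_calls_per_minute)
p_two_or_more = 1.0 - p_none - p_one

print(f"minutes with no calls:         {p_none:.1%}")
print(f"minutes with exactly one call: {p_one:.1%}")
print(f"minutes with two or more:      {p_two_or_more:.1%}")
```

Under this assumption only about 37% of minutes see exactly the "average" one call; about 37% see none, and about 26% see two or more arriving together.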
Figure 2: Example Call Center Measurements
Thus, the relationship between how busy your agents are and the service your callers get is not a simple linear relationship. In other words, you can’t gauge the level of service your callers are getting by peeking over the cubicles and checking out how busy your agents are over some given time period. And, there are break-off points where the addition or subtraction of one or two agents will result in dramatic differences in customer service level, as evidenced in the average speed to answer and average time in queue graphs above. This runs counter to our common intuition, but it is true.
The lengths of calls are not uniform either. Call lengths nearly always closely follow what is called the exponential distribution:
Figure 3: Exponential Distribution of Call Center Call Lengths (vertical axis: probability of call length; horizontal axis: call length, with the average call length marked)
This distribution says that the most probable call lengths are those that are less than the average call length, but that there are going to be some that are a whole lot longer. This phenomenon also contributes to a “clumping effect”: the length of calls in progress at any given time may all be rather short or all be rather long, but will very seldom be close to the average length of all the calls experienced in a day. To complicate things even further, the average time that callers spend in queue is also exponentially distributed around the average queue time primarily because a large percentage of callers hang up very quickly when they get the “please hold” message.
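As a rough numerical check of that claim, the following sketch assumes call lengths are exactly exponential with a hypothetical average of 2 minutes. The helper name `prob_shorter_than` is made up for this illustration.

```python
import math
import random

def prob_shorter_than(minutes, avg_minutes):
    """P(call length < minutes) for exponentially distributed call lengths."""
    return 1.0 - math.exp(-minutes / avg_minutes)

avg = 2.0  # hypothetical average call length in minutes

print(f"shorter than the average:   {prob_shorter_than(avg, avg):.1%}")
print(f"longer than 3x the average: {1 - prob_shorter_than(3 * avg, avg):.1%}")

# Cross-check by sampling random call lengths with the same average.
rng = random.Random(7)
lengths = [rng.expovariate(1.0 / avg) for _ in range(100_000)]
share_short = sum(x < avg for x in lengths) / len(lengths)
print(f"sampled share shorter than the average: {share_short:.1%}")
```

With a pure exponential distribution, about 63% of calls are shorter than the average, while about 5% run longer than three times the average.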
The Two Solutions
Using the assumptions of the above distributions for call arrivals and call length, and the customer/server/queuing nature of an inbound call center, there are two mathematical methods that can be used to optimize a call center’s performance:
A formula method that calculates a statistical equation to predict queuing times, agent workloads, and optimal agent levels, or
A simulation method that imitates the flow of calls into a call center and their handling by agents, with queuing, blocking, and overflow.
The “Formula” Solution
The most common method used for call center staffing is a formula using the famous Erlang equations (Erlang C and Erlang B) for staffing and trunking. They compute the probability of delay in queue for calls when using a given agent level with a given amount of inbound call traffic.
The various call center staffing and scheduling products evaluate the Erlang C formula for several agent levels, then report the one agent level that gives a delay probability closest to the user’s desired service level. These products will do this for each hour, half hour, or quarter hour time period throughout a workday.
So, in the formula method, the user inputs number of calls, talk times, and wrap-up times for each period of time in the workday, along with the desired service level (e.g., answer 80% of calls within 20 seconds). The software calculates the Erlang equation and pops out optimum agent levels for all time periods in the day. It can also be used to observe the effects on queuing and agent workload when using different agent levels in any one time period.
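The procedure just described can be sketched in a few lines of Python. This is the textbook Erlang C calculation, not the modified formulas used by any particular product; the function names are illustrative, and the example inputs at the bottom are hypothetical.

```python
import math

def erlang_b(agents, load):
    """Erlang B blocking probability, computed with the standard recursion."""
    b = 1.0
    for n in range(1, agents + 1):
        b = load * b / (n + load * b)
    return b

def erlang_c(agents, load):
    """Erlang C: probability that an arriving call has to wait in queue."""
    if agents <= load:
        return 1.0  # not enough agents to carry the load; every call waits
    b = erlang_b(agents, load)
    rho = load / agents
    return b / (1.0 - rho * (1.0 - b))

def service_level(agents, load, aht_seconds, target_seconds):
    """Fraction of calls answered within target_seconds."""
    if agents <= load:
        return 0.0
    p_wait = erlang_c(agents, load)
    return 1.0 - p_wait * math.exp(-(agents - load) * target_seconds / aht_seconds)

def required_agents(calls, period_seconds, aht_seconds, sl_goal, target_seconds):
    """Smallest agent level whose predicted service level meets the goal."""
    load = calls * aht_seconds / period_seconds  # offered load in Erlangs
    agents = max(1, math.floor(load) + 1)
    while service_level(agents, load, aht_seconds, target_seconds) < sl_goal:
        agents += 1
    return agents

# Hypothetical half hour: 200 calls, 3:00 talk time plus 0:30 wrap-up,
# with a goal of answering 80% of calls within 20 seconds.
aht = 180 + 30
print("agents needed:", required_agents(200, 1800, aht, 0.80, 20))
```

Calling `required_agents` once per hour, half hour, or quarter hour gives the kind of agent-level table that the staffing products produce, and `service_level` can be called directly to see the effect of any other agent level in a given period.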
Modified versions of these formulas, with the ability to specify desired queue levels, are used in Portage Communications’ Call Center Designer™ Day Planner, Staffing, and Trunking software modules. In fact, almost all of the staffing and scheduling software for call centers uses the Erlang formulas or some minor modification of them.
The Erlang analytical method has done a good job over the years in helping call center managers determine what agent levels they should use. However, it does not account for some of the unique dynamics of inbound call centers using an ACD, and it tends to over-staff many call centers to some degree.
The most talked about shortcoming of the standard Erlang C formula is that it assumes calls will queue up infinitely and wait forever to be answered by an agent. For centers with limited inbound line capacity and callers with little queuing tolerance, the Erlang method may not be accurate enough. Also, what about caller abandonment (hanging up after being queued), or callers retrying after abandoning or getting a busy, or the ACD overflowing calls to another agent group? And what about that ACD ring delay feature (the number of rings the caller can be given before the ACD picks up the line)?
Some companies and mathematicians have derived “special versions” of the Erlang method that claim to at least partially account for some of these issues, especially the infinite queuing problem. However, no tweaking of the equation will fully take into account any of the issues mentioned above, much less all of them. Some of the variants of the Erlang C equation out there are highly suspect, having no published analysis of their mathematics, and may actually give less reliable staffing predictions.
If a call center manager finds that in practice Erlang C is not allowing them to maintain their desired service levels to the degree they want, or that it calls for many more agents than are actually needed, then they should probably just skip all the Erlang variants and analytical equation methods and try the second method: simulation software.
The Simulation Solution
So what is computer simulation? Most people have heard of weather and climate forecasts made by computer simulations, or maybe they have played computer games like SimCity. Industrial and natural processes may often be simulated by a computer program. The program imitates the flow of people, material, or events and is used to experiment and see what effects alternative configurations of the people, material, or events would have on a real-world situation. In the past few years simulation programs have become an increasingly popular method for call center staffing and performance prediction.
A call center or ACD simulator compresses a day’s worth of incoming call traffic into a short period of time by representing each second of a day with a few microseconds of computer time. Calls are made to randomly arrive in a “virtual” call center created in the computer’s memory and are answered by agents, queued, met with busy signals, overflowed, or abandoned depending on the parameters, agent/trunk combinations, and call volumes defined for the call center.
The simulator doesn’t just calculate equations in the way that a formula does; rather, it acts as an experimental call center that reacts to the random arrival of inbound calls under whatever agent and inbound line levels, inbound call volumes, caller abandonment and retry behavior, and ACD overflow the call center manager wants to try out. Simulation can then accurately predict the levels of call blockage, abandonment, caller retries, and overflow, and their effects on service level, when these are entered as parameters to the simulation. Also, a simulation program can make the random call arrivals and call lengths follow the Poisson and exponential distributions that are seen in the real world.
At the end of the simulation, you will see how many calls were handled, abandoned, given a busy, queued, and overflowed along with queue lengths, average speed to answer, and the actual service levels that would be given to your callers. More simulations with different combinations of agents and trunks are usually performed until the manager is satisfied with the service levels, agent work load, and call center finances.
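A stripped-down version of such a simulator fits in a page of Python. The sketch below is not SimACD or any other product; it assumes Poisson arrivals, exponentially distributed handle times and caller patience, first-come-first-served queuing, and a fixed number of trunks. Retries, overflow, and ring delay are left out, and every name and parameter is an illustrative choice.

```python
import heapq
import random

def simulate_period(calls, period_s, agents, trunks, aht_s, patience_s,
                    answer_within_s=20.0, seed=42):
    """Simulate one time period and return counts and service measures."""
    rng = random.Random(seed)
    rate = calls / period_s        # average arrivals per second
    agent_free = [0.0] * agents    # heap: time each agent is next free
    line_busy_until = []           # heap: release time of each occupied trunk
    stats = {"offered": 0, "busy": 0, "queued": 0, "abandoned": 0,
             "handled": 0, "answered_in_target": 0}
    total_wait = 0.0
    t = 0.0
    while True:
        t += rng.expovariate(rate)             # random gap to the next call
        if t > period_s:
            break
        stats["offered"] += 1
        while line_busy_until and line_busy_until[0] <= t:
            heapq.heappop(line_busy_until)     # trunks released before now
        if len(line_busy_until) >= trunks:
            stats["busy"] += 1                 # no free trunk: busy signal
            continue
        wait = max(0.0, agent_free[0] - t)     # time until an agent is free
        if wait > 0.0:
            stats["queued"] += 1
        patience = rng.expovariate(1.0 / patience_s)
        if wait > patience:
            stats["abandoned"] += 1            # caller hangs up while queued
            heapq.heappush(line_busy_until, t + patience)
            continue
        handle = rng.expovariate(1.0 / aht_s)  # talk plus wrap-up time
        end = t + wait + handle
        heapq.heapreplace(agent_free, end)     # earliest-free agent takes it
        # Simplification: the trunk is held for the whole handle time.
        heapq.heappush(line_busy_until, end)
        stats["handled"] += 1
        total_wait += wait
        if wait <= answer_within_s:
            stats["answered_in_target"] += 1
    handled, offered = stats["handled"], stats["offered"]
    stats["avg_speed_to_answer_s"] = total_wait / handled if handled else 0.0
    stats["service_level"] = (stats["answered_in_target"] / offered
                              if offered else 0.0)
    return stats

# Hypothetical half hour: try two agent levels and compare the outcomes.
for n in (44, 40):
    print(n, simulate_period(calls=430, period_s=1800, agents=n, trunks=60,
                             aht_s=170, patience_s=180))
```

Each run is one random day, so results vary with the seed; in practice a manager would run several seeds per agent and trunk combination and look at the spread as well as the averages.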
A good way to state the difference between the Erlang formula method and the simulation method is this: with the formula method, you enter your call volumes and what you want to happen in the form of the desired service level you want to maintain, and you are told what number of agents and lines you need. With the simulation method you enter the number of agents and lines you want to use along with your call volumes, and you see what will happen in the form of service levels actually given.
One drawback of simulation is that it can take longer to determine your optimal agent levels. The Erlang formula’s output is an optimal agent level: you only need to run it once. With simulation you have to test and try out different agent levels and see what happens. Are the resulting queue levels acceptable? Are the agents busy enough? Is the service level sufficient? No? Then adjust the agent and line levels and run another simulation, and another, and maybe another. This extra work pays back in what will be more accurate and practical predictions of optimal agent levels and call center performance. Also, you can see values for calls queued and overflowed, the number of callers who abandon or get busies, and the calls that retry later.
Portage Communications makes SimACD, an automatic call distributor simulator that takes into account such dynamics as ring delay, call abandonment and retry behavior, and overflow. It simulates a full workday by the hour, half hour, or quarter hour, and displays both numeric and graphic predictions for such items as calls handled, queued, overflowed, busy, retried, and more. It also gives average speed to answer times, queue times, and percentages of calls answered in less than ten, twenty, thirty, and sixty-second intervals. It is the call center industry’s first affordable simulator designed for smaller call centers or agent groups.
What Method Is Best For Your Call Center?
If you require staffing predictions to do simple scheduling and just need reasonably accurate levels on which to base your agent schedules, then staying with Erlang C or an Erlang variant will probably fit your needs. Practically all of the existing workforce management systems use an Erlang formula for calculating the required number of agents, and these requirements are then fed into a scheduling algorithm and agent database to create work schedules. Erlang C also works well with most of the home-grown, spreadsheet-based scheduling systems that many call centers have developed independently.
Problems occur if your call center is more complex. If you have multiple agent groups, skills based routing of calls, overflow of calls from one group to another, or auto attendant and IVR systems that handle some of the calls, then simulation becomes the better option since these dynamics cannot be represented in a mathematical formula. However, at present, there are hardly any simulation software products that handle these and are combined with a scheduling module in the same package. Many companies are working on one (including Portage Communications), so look for them in the near future.
Most call center managers use Erlang and simulation staffing methods for call center performance analysis in addition to agent scheduling. They want to see what their predicted queue levels and agent work load will be at varying agent levels. Analysis of call center performance is especially helpful in taking a close look at your call center’s busiest hours, least busy hours, and other problem periods.
For performance analysis, you don’t want an Erlang formula tool that just spits out a required number of agents based on a given service level goal. You want a tool that will also tell you what your average speed to answer, percent queued, average queue length, and agent occupancy will be. These and other performance measures are important to study in order to determine what your service level goal should be in the first place.
Simulation software will also give these measures for call center analysis and is more accurate. Also, as already stated, simulation can take into account caller abandonment, retry behavior, ACD ring delay, and overflow. On the minus side, call center simulators can be very expensive primarily because this type of software is difficult to write, and the market for them is still very new.
Determining the optimum number of agents and the effects of different agent levels is the key to a smoothly running call center and happy callers. The dynamics of modern call centers require managers to explore all the new formula and simulation software to meet this challenge.
Stuart Harris is the President of Portage Communications, Inc. He has worked in the telecommunications and call center industries for over thirty years and holds a BA in Mathematics and Computer Science from the University of Colorado, and an MS in Software Engineering from Seattle University. May be used with permission only.
Here’s a great article about the reality of call center call volume forecasting by James Lawther:
My favorite quote from it:
“If your forecasters were the sort of people who could precisely nail the outcome of a random, chaotic, complex system, day in and day out, then they wouldn’t be forecasting your call centre. They would be playing the euro-millions lottery for 5 minutes every Friday and spending the rest of the time cruising the shores of the Mediterranean in a Ferrari.”
In business-to-business selling the last month of the year is always slow, so I’ve been going through some old stuff. Normal people would take this time to look at pictures of loved ones, photos from old trips and so forth. But I’m a mathematician so I look at graphs and equations instead. Yes, this sounds pretty sad, but in this case I’ll impart some wisdom from a very old piece of stuff I came across.
Call centers have two fundamental time measurements that they pay close attention to: average talk time, and average queue time. What does it mean when we say, “our average talk time is 2 minutes”? It does mean that some calls last less than 2 minutes and others last longer, but averaging all of them we get 2 minutes. But, how are the call lengths distributed around that average? Does this mean that the probability of some calls lasting 1 minute is the same as the probability of some calls lasting 3 minutes? Is the probability of call lengths distributed on a normal bell-shaped curve with 2 minutes right in the middle like this graph?
You might think so, but you’d be wrong if you did. Caller behavior and the way calls get handled create something different. Here is an old graph dating all the way back to 1961 that shows the probability of call talk time lengths with an average talk time of 2 minutes.
It’s from some old AT&T data. It shows that call lengths follow an exponential distribution rather than a bell-curve-shaped distribution. Although the average is 2 minutes, more of the area under the graph lies to the left of the 2-minute mark than to the right (about 63% for a pure exponential), because the long tail of lengthy calls pulls the average up. The greatest probability is that calls are handled pretty quickly, in under two minutes, but there are a lot of calls that take longer and stretch out to 9 or 10 minutes, with decreasing probability as the times get that long. You’ll notice right at the very beginning of the graph near 0 minutes there’s a spike and then a dip, after which it settles into following an exponential distribution (the dashed curve) almost perfectly. This was caused by people hanging up right away because of having called the wrong number, or nowadays pressing the wrong selection at an IVR prompt.
The time spent in queue follows the same pattern because the highest probability for when someone is likely to abandon from queue is the very moment they are placed in queue or soon thereafter. There’s no certain way to avoid this because a lot of people have little to no tolerance for being placed on hold. If there is any alternative, such as a competitor of yours, or if they’ve called on an impulse, a larger number of people will be of the no-tolerance type. Try increasing your ring delay, the amount of time your ACD gives a ring tone to callers before it seizes the call; callers don’t feel they are delayed when hearing a ring.
It’s been said by others many times, and I’ll say it again. As a call center manager you can’t gauge how good your service levels are by looking at how “busy” your agents are. Intuitively, we would think that if you drop your agent levels at any moment by 10%, your callers will experience a 10% increase in the time they have to wait for service. 10% fewer agents means about 10% higher queue times, right?
No, absolutely not! Small decreases in agent levels will not make the agents that much busier, but they will make the callers experience steep increases in queue time. Stated more precisely:
1. If the agents on staff are reduced in a linear manner, this gives a slightly greater occupancy rate, and some small savings in staff expense.
2. But, the average speed to answer for the callers (your customers) increases in an exponential manner.
3. This is just a fact of the mathematics of queuing theory, and the fact that calls arrive in a random distribution over time. Do you want small linear savings, or large exponential revenues?
Here’s an example. Let’s say I’m a call center manager, and let’s look at the following table from a simple Erlang C calculation:
In the parameters at the top, I’ve selected 30 minutes as a time period over which I expect 430 calls to arrive with an average talk time of 2:30, and an average after call work time (wrap-up time) of 20 seconds. From my experience and historic reports, I think callers will, on average, wait for up to 3:00 in queue before hanging up (abandoning), and I want to maintain a service level goal of answering 80% of the calls within 20 seconds. In this calculator, I then click the Calculate button. In the staffing calculations at the bottom, the famous Erlang C formula tells me a level of 46 agents is best for achieving my goal.
Now, look closely at the columns labeled AvgSpdAns (average speed to answer), Occupancy (the percentage of time agents are on a call) and Abandon (the percentage of callers who hang up while waiting in queue). At 46 agents the occupancy rate is 88%, and indeed, when I look over the cubicles, I see my agents seem to be idle for over 10% of their time. But I’m a draconian call center manager! I want to crack the whip, make my agents sweat, and pinch every penny when it comes to payroll! I think I can drop the agent levels by 4 or 5 agents, about 10%, and save a lot of money and make management happy by having my agents work at near 100% occupancy! What a great idea!
Uh … no, I should be fired on the spot. Look what would happen if I were to drop from 46 agents down to 41 or 42 agents. Sure, my agents get a little bit busier, but the average speed to answer increases from 10 seconds to a minute and a half at 42 agents, and an utterly disastrous 6:46 at 41 agents. Now, let’s say my call center is a catalog order center. Every caller is waiting, credit card in hand, just eagerly wanting to give my company their hard-earned money. But we won’t get that revenue, because 10 to 62 percent are going to be abandoning, and they will be so ticked off they will never, ever, call again. They will abandon me; I will be rejected, dumped, and be “standing in the shadows of love, getting ready for the heartaches to come” as the Four Tops sang. (I really like how we use that term “abandoned” in our industry. It has such a sad, heartbreaking finality to it.)
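For readers who want to check the shape of this effect themselves, here is a small Python sketch that sweeps the agent level for the same inputs (430 calls in 30 minutes, 2:30 talk time plus 0:20 wrap-up). It uses the plain textbook Erlang C formula, which is an assumption on my part about what is adequate for illustration: the calculator in the example also models abandonment, so its figures will not match exactly and the Abandon column is not reproduced. The function names are made up for this sketch.

```python
def erlang_c(agents, load):
    """Erlang C probability of waiting, via the Erlang B recursion."""
    b = 1.0
    for n in range(1, agents + 1):
        b = load * b / (n + load * b)
    rho = load / agents
    return b / (1.0 - rho * (1.0 - b))

def sweep(calls, period_s, aht_s, agent_levels):
    """Return (agents, occupancy, average speed to answer in seconds) rows."""
    load = calls * aht_s / period_s  # offered load in Erlangs
    rows = []
    for n in agent_levels:
        occupancy = load / n
        if n <= load:
            asa = float("inf")       # queue grows without bound
        else:
            asa = erlang_c(n, load) * aht_s / (n - load)
        rows.append((n, occupancy, asa))
    return rows

rows = sweep(calls=430, period_s=1800, aht_s=150 + 20,
             agent_levels=range(46, 40, -1))
for n, occupancy, asa in rows:
    print(f"{n:>3} agents  occupancy {occupancy:6.1%}  "
          f"avg speed to answer {asa:6.0f} s")
```

Under these assumptions occupancy creeps up from about 88% at 46 agents to about 99% at 41, while the average speed to answer goes from roughly 10 seconds to several minutes, which is in line with the figures quoted above.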
Sure, you’re saying “this is an extreme example, and anyway we have real time displays, reader boards and such that warn us before things go haywire, and we’ve heard this sort of advice before”. Well, very often the first things learned are the first things forgotten, and a surprising number of call center managers do manage, to one degree or another, by looking at and perceiving the activity levels of their agents. This is especially true in smaller call centers. I guess it’s just human nature.
Also, as a caller yourself, have you ever noticed that the attitude and behavior of the agents you talk to is so perfectly correlated with the amount of time you’ve spent in queue? The longer the queue time, the grouchier the agent is, and the grouchier you are. The agent is being worked too hard, just as your patience is. Last month I spent over an hour in queue when calling the US Internal Revenue Service (that’s our tax agency here in the US that provides the wonderful “service” of sticking a vacuum cleaner hose into our bank accounts). In any other interaction, I’m quite sure the agent and I would be very pleasant people, but the conversation we had was as unpleasant as the long wait before it, and it was all because of high occupancy and average speed to answer.
So, I’m never going to interact with the IRS again; I have abandoned them! Well maybe not … death and taxes are the only certainties, so I’m sure I’ll be waiting forever in queue again with that particular call center operation. But, I’m really hoping the queue for St. Peter is a little quicker, as that would be another queue not to abandon! 😉
Do you know of any stats or surveys that answer the question of how many call center managers really pay attention to the call volume forecasts generated by their WFM software based on historical data?
My informal research indicates that about two thirds of them view the forecasts as inaccurate, and create their own based on side knowledge, hunches, selective use of historical reports, and a feel for what’s going to happen in the future rather than following an algorithm in their WFM software. This is usually because a company’s offerings, business climate, campaigns and so forth are always changing. As with most things, the past is not always a good indicator of the future.
I had a manager tell me a few weeks ago, “our call forecasting is as much a seance as it is a science”. Great quote!