If you have experimented with the M/M/s queuing calculator, you will have noticed that queuing performance is non-linear. As the number of parallel servers increases, the waiting time and queue length drop drastically. On the other hand, adding servers means additional cost to serve the customers. This leads to the question of whether we can use a queuing model to find the optimum number of servers that minimizes the cost of your business system.
The optimization considers the arrival rate of customers, the service rate, the cost of operating the servers, and the cost of customer waiting time (customer dissatisfaction or the risk of losing customers).
The queuing optimization model minimizes the total cost, which is the sum of the waiting cost and the service cost. We need to find the balance between demand (the customers' side) and supply (the number of servers). The most important aspect of queuing optimization is how you value your customers compared to how you value the cost of your servers. The ratio between the unit server cost Cs and the unit waiting cost Cw represents the quality of service of your queuing system. When Cs is large compared to Cw, you treat the servers as much more expensive than the customers' time, and therefore you get a lower quality of service in your queuing system. When you set Cw much higher than Cs, you treat each customer as a very important person (VIP) and you get a much better quality of service, but you need to bear a higher cost for that service. In the extreme case, when Cs is 10000 times Cw, the results approximate the queuing rule of thumb that I proposed.
The preferred reference for this tutorial is
Teknomo, Kardi. (2014) Queuing Theory Tutorial