How to Configure a Load Balancer for a Single Application


This guide explains how to configure a simple load-balancing setup for a single application.

The critical point is that the load balancer is configured transparently: the application does not know that a load balancer sits in front of it.

For that to work, your application must follow good practices. In particular, ensure that it is stateless, which is good practice for any application that runs in the cloud.

One of the advantages of stateless services is that you can bring up multiple instances of your application. Since there’s no per-instance state, any instance can handle any request. This is a natural fit for load balancers, which can be leveraged to help scale the service. They’re also readily available on cloud platforms.
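To see why per-instance state causes trouble behind a load balancer, consider the minimal Python sketch below. The class and function names are hypothetical and are not part of any platform API; the sketch only models two instances that receive requests in alternation.

```python
from itertools import cycle


class StatefulInstance:
    """Hypothetical instance that keeps a visit counter in local memory."""

    def __init__(self):
        self.visits = 0

    def handle(self, user):
        self.visits += 1
        return f"{user}: visit {self.visits}"


class StatelessInstance:
    """Hypothetical instance that answers using only what the request carries."""

    def handle(self, user, visit_number):
        return f"{user}: visit {visit_number}"


def send_alternating(instances, requests):
    """Deliver each request to the next instance in turn, like a simple balancer."""
    targets = cycle(instances)
    return [next(targets).handle(*request) for request in requests]


stateful = send_alternating(
    [StatefulInstance(), StatefulInstance()],
    [("alice",), ("alice",), ("alice",)],
)
stateless = send_alternating(
    [StatelessInstance(), StatelessInstance()],
    [("alice", 1), ("alice", 2), ("alice", 3)],
)

print(stateful)   # ['alice: visit 1', 'alice: visit 1', 'alice: visit 2']
print(stateless)  # ['alice: visit 1', 'alice: visit 2', 'alice: visit 3']
```

The stateful instances disagree about how often the user has visited, because each one only sees part of the traffic. The stateless instances give the same answer no matter which one handles the request.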

What is load balancing?

At a basic level, load balancing distributes web traffic among different servers. This keeps the service highly available, avoids overloading any one server, and helps defend against denial-of-service attacks.

Load balancers increase capacity and reliability.

How does the load balancer work?

To make load balancing possible, the platform provides Varnish, an HTTP accelerator designed for content-heavy dynamic web sites as well as APIs.

In Varnish you will find three different methods (“directors”) for load balancing:

  • round robin
  • fallback
  • random


Requirements

  • An application that you want to deploy, or one that is already deployed.
  • A text editor of your choice.

The example below uses a Java application, but any stateless application in any language should work the same.


1. Use network storage

Because this configuration involves multiple application instances, they will each have their own local storage. To share files between applications you must use a network-storage service. See the documentation for specific instructions.

2. Define the application

Define your application in the .platform/applications.yaml file, rather than in the usual per-application configuration file. The syntax is the same, but the definitions are written as entries of a YAML array. Give the first definition a YAML anchor so that it can be reused.

- &appdef
    name: app
    type: 'java:8'
    disk: 1024
    source:
        root: /
    hooks:
        build: mvn clean install
    mounts:
        'server':
            source: local
            source_path: server_source
    web:
        commands:
            start: |
                cp target/dependency/webapp-runner.jar server/webapp-runner.jar
                cp target/tomcat.war server/tomcat.war
                cd server && java -jar -Xmx$(jq .info.limits.memory /run/config.json)m -XX:+ExitOnOutOfMemoryError webapp-runner.jar --port $PORT tomcat.war

Next, add a second definition in the same file by aliasing the first definition. Add the following lines to applications.yaml:

-
    <<: *appdef
    name: app2

That will clone the appdef definition from the first application, then override the name property, which must be unique. You may override other values if desired, but for load balancing the configuration should ideally be identical.
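The effect of the alias can be sketched in Python, with plain dictionaries standing in for the parsed YAML mappings. This is only an illustration of the merge semantics, not how the platform reads the file, and the dictionary holds just a few illustrative keys.

```python
# Plain dictionaries stand in for the parsed YAML mappings (illustrative keys only).
appdef = {
    "name": "app",
    "type": "java:8",
    "disk": 1024,
}

# `<<: *appdef` copies every top-level key; keys written alongside it override the copy.
app2 = {**appdef, "name": "app2"}

applications = [appdef, app2]
names = [app["name"] for app in applications]
differing_keys = sorted(key for key in appdef if appdef[key] != app2[key])

print(names)           # ['app', 'app2']
print(differing_keys)  # ['name']
```

The YAML merge key is shallow: overriding a nested key such as mounts in the second definition replaces that whole nested mapping rather than merging into it.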

3. Create a Varnish instance

Add the following block to your .platform/services.yaml file:

varnish:
    type: varnish:6.0
    relationships:
        server1: 'app:http'
        server2: 'app2:http'
    configuration:
        vcl: !include
            type: string
            path: config.vcl

That will define a new Varnish service, named varnish, that can connect to both app and app2 (which are defined in applications.yaml). It will use the file config.vcl for its configuration. (See below.)

4. Map incoming requests to Varnish

Configure your .platform/routes.yaml file to route incoming requests to the Varnish service. The exact syntax will depend on your application but in most cases that means just updating the upstream as below.

"https://{default}/":
  type: upstream
  upstream: "varnish:http"

"https://www.{default}/":
  type: redirect
  to: "https://{default}/"

5. Configure Varnish

Varnish is configured via a .vcl file, referenced from the services.yaml file. Varnish offers three options for load balancing:

  • Round Robin: Requests are distributed evenly between all backends, in order.
  • Random: Requests are distributed between backends at random, potentially with uneven weighting.
  • Fallback: Requests only go to a secondary server if the first does not respond.

For load balancing purposes, Round Robin is almost always the best approach. A weighted random configuration may be used for A/B testing, but because sessions are not “sticky”, consecutive requests from the same user may reach different backends. It is therefore suitable only for changes that do not affect what users see, not for testing changed user-facing functionality.
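The three strategies can be modeled with a short Python sketch. It is a simplified model under stated assumptions, not Varnish's implementation: the function names are hypothetical, and the healthy dictionary stands in for whatever health information the real fallback director uses.

```python
import random
from itertools import cycle


def round_robin(backends):
    """Yield backends in a fixed, repeating order."""
    return cycle(backends)


def weighted_random(backends, weights, rng):
    """Yield backends picked at random, in proportion to their weights."""
    while True:
        yield rng.choices(backends, weights=weights, k=1)[0]


def fallback(backends, healthy):
    """Yield the first backend currently marked healthy, or None if all are down."""
    while True:
        yield next((b for b in backends if healthy[b]), None)


backends = ["server1", "server2"]

rr = round_robin(backends)
rr_picks = [next(rr) for _ in range(4)]

rnd = weighted_random(backends, weights=[1, 1], rng=random.Random(0))
rnd_picks = [next(rnd) for _ in range(4)]

healthy = {"server1": True, "server2": True}
fb = fallback(backends, healthy)
fb_before = next(fb)         # primary is up
healthy["server1"] = False   # simulate the primary going down
fb_after = next(fb)          # secondary takes over

print(rr_picks)             # ['server1', 'server2', 'server1', 'server2']
print(rnd_picks)            # order depends on the random generator
print(fb_before, fb_after)  # server1 server2
```

Round robin is the only strategy of the three that guarantees an even split over any short run of requests, which is why it suits identical backends.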

To configure a round-robin Varnish setup, use the following config.vcl file:

sub vcl_init {
    new lb = directors.round_robin();
    lb.add_backend(server1.backend());
    lb.add_backend(server2.backend());
}

sub vcl_recv {
    set req.backend_hint = lb.backend();
}

To use a weighted-random configuration, use the following config.vcl file:

sub vcl_init {
    new lb = directors.random();
    lb.add_backend(server1.backend(), 10.0);
    lb.add_backend(server2.backend(), 5.0);
}

sub vcl_recv {
    set req.backend_hint = lb.backend();
}

In this example, two thirds of requests will go to server1 and the remaining third to server2. That is only useful if there is some configured difference between them, or if they are resourced differently. (Resourcing load-balanced servers differently is generally not useful.)
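The split follows directly from the weights: each backend's expected share is its weight divided by the sum of all weights. A small Python check of that arithmetic, using the weights from the example (the variable names are illustrative):

```python
from fractions import Fraction

# Weights as passed to add_backend() in the weighted-random example.
weights = {"server1": Fraction(10), "server2": Fraction(5)}

total = sum(weights.values())
shares = {name: weight / total for name, weight in weights.items()}

print(shares["server1"])  # 2/3
print(shares["server2"])  # 1/3
```

These are long-run averages. Over a small number of requests the observed split can differ noticeably, because each request is assigned at random.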