Deploy an HAProxy Load Balancer and Multiple Web Servers on AWS Instances Using Ansible

Oct 23, 2020

Description:

Provision EC2 instances through Ansible.
Retrieve the IP addresses of the instances using a dynamic inventory.
Configure the web servers through an Ansible role.
Configure the load balancer through an Ansible role.
The target nodes of the load balancer should update automatically according to the status of the web servers.

Ansible is an open-source software provisioning, configuration management, and application-deployment tool enabling infrastructure as code. It runs on many Unix-like systems and can configure both Unix-like systems as well as Microsoft Windows. It includes its own declarative language to describe system configuration. Ansible was written by Michael DeHaan and acquired by Red Hat in 2015. Ansible is agentless, temporarily connecting remotely via SSH or Windows Remote Management (allowing remote PowerShell execution) to do its tasks.

HAProxy, which stands for High Availability Proxy, is a popular open-source software TCP/HTTP Load Balancer and proxying solution which can be run on Linux, Solaris, and FreeBSD. Its most common use is to improve the performance and reliability of a server environment by distributing the workload across multiple servers (e.g. web, application, database).

Load Balancer

Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool.

Load balancing distributes server loads across multiple resources — most often across multiple servers. The technique aims to reduce response time, increase throughput, and in general speed things up for each end-user.

A load balancer performs the following functions:

Distributes client requests or network load efficiently across multiple servers
Ensures high availability and reliability by sending requests only to servers that are online
Provides the flexibility to add or subtract servers as demand changes.
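The three functions above can be sketched as a toy model in Python. This is only an illustration of the idea, not how HAProxy is implemented; the class name `ServerPool`, its method names, and the addresses are hypothetical.

```python
class ServerPool:
    """Toy load balancer: hands out requests only to servers marked online."""

    def __init__(self):
        self.online = {}   # server address -> True (up) or False (down)
        self._turn = 0     # counter used to rotate through the online servers

    def add(self, server):
        """Add a server to the pool as demand grows."""
        self.online[server] = True

    def remove(self, server):
        """Take a server out of the pool as demand shrinks."""
        self.online.pop(server, None)

    def mark(self, server, up):
        """Record the result of a health check for a known server."""
        if server in self.online:
            self.online[server] = up

    def pick(self):
        """Return the next online server, or None if every server is down."""
        candidates = [s for s, up in self.online.items() if up]
        if not candidates:
            return None
        server = candidates[self._turn % len(candidates)]
        self._turn += 1
        return server


pool = ServerPool()
for address in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    pool.add(address)
pool.mark("10.0.0.2", up=False)            # pretend a health check failed
picks = [pool.pick() for _ in range(4)]
print(picks)   # 10.0.0.2 never appears while it is marked down
```

Marking the server up again with `pool.mark("10.0.0.2", up=True)` puts it back into rotation, which is the behaviour this article wants from the load balancer's target list.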

HAProxy Algorithms

Round Robin: This algorithm is the most commonly implemented. It works by using each server behind the load balancer in turns, according to their weights. It’s also probably the smoothest and most fair algorithm as the servers’ processing time stays equally distributed. As a dynamic algorithm, Round Robin allows server weights to be adjusted on the go.
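As a rough illustration of taking turns according to weights, here is a minimal Python sketch. It simply repeats each server in the schedule as many times as its weight; HAProxy's real scheduler spreads the turns more smoothly, so treat this as a simplified stand-in. The server names and weights are made up.

```python
from itertools import cycle


def round_robin(weights):
    """Return an endless iterator in which a server of weight w gets w turns per cycle."""
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)


servers = {"web1": 2, "web2": 1, "web3": 1}   # hypothetical names and weights
rr = round_robin(servers)
first_eight = [next(rr) for _ in range(8)]
print(first_eight)   # web1 gets twice as many turns as web2 or web3
```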

Static Round Robin: Similar to Round Robin, each server is used in turns per their weights. Unlike Round Robin though, changing server weight on the fly is not an option. There are, however, no design limitations as far as the number of servers is concerned. When a server goes up, it will always be immediately reintroduced into the farm once the full map is recomputed.

Least Connections: With this algorithm, the server with the lowest number of connections receives the connection. This type of load balancing is recommended when very long sessions are expected, such as LDAP, SQL, TSE, etc. It’s not, however, well-suited for protocols using short sessions such as HTTP. This algorithm is also dynamic like Round Robin.
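The selection rule itself is tiny. A minimal sketch in Python, assuming we already track the number of open connections per server (the names and counts below are invented):

```python
def least_connections(active):
    """Return the server that currently holds the fewest open connections."""
    return min(active, key=active.get)


active = {"web1": 12, "web2": 3, "web3": 7}   # hypothetical connection counts
chosen = least_connections(active)
active[chosen] += 1        # the new connection now counts against that server
print(chosen, active)
```

Because the counts change with every connection that opens or closes, the choice adapts continuously, which is why this algorithm suits long-lived sessions.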

Source: This algorithm hashes the source IP and divides it by the total weight of running servers. The same client IP always reaches the same server as long as no server goes down or up. If the hash result changes due to the changing number of running servers, clients are directed to a different server. This algorithm is generally used in TCP mode where cookies cannot be inserted. It’s also static by default.
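The idea of hashing the client address onto the total weight can be sketched in Python as follows. The hash used here (CRC32 from the standard library) is only a stand-in chosen because it gives the same result on every run; it is not the hash HAProxy uses, and the server names are hypothetical.

```python
import zlib


def pick_by_source(client_ip, weights):
    """Map a client IP to a server by hashing it modulo the total weight."""
    # One slot per unit of weight, so len(slots) is the total weight.
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    digest = zlib.crc32(client_ip.encode("ascii"))   # stand-in hash function
    return slots[digest % len(slots)]


servers = {"web1": 1, "web2": 1, "web3": 1}   # hypothetical names and weights
first = pick_by_source("203.0.113.7", servers)
again = pick_by_source("203.0.113.7", servers)
print(first == again)   # True: the same client reaches the same server
```

If a server is removed from `servers`, the total weight changes and the same IP may map to a different server, which matches the caveat described above.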

URI: This algorithm hashes either the left part of the URI or the whole URI and divides the hash value by the total weight of running servers. The same URI is always directed to the same server as long as no servers go up or down. It’s also a static algorithm and works the same way as the Source algorithm.
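A small Python sketch of the same idea applied to the URI, assuming equal server weights for brevity. "Left part" is taken here to mean everything before the question mark; CRC32 is again a stand-in hash, and the server names are hypothetical.

```python
import zlib


def pick_by_uri(uri, servers, whole=False):
    """Hash the part of the URI before '?' (or all of it) and pick a server."""
    key = uri if whole else uri.split("?", 1)[0]
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


servers = ["web1", "web2", "web3"]   # hypothetical, equally weighted
a = pick_by_uri("/images/logo.png?v=1", servers)
b = pick_by_uri("/images/logo.png?v=2", servers)
print(a == b)   # True: the query string is ignored unless whole=True
```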

URL Parameter: This static algorithm can only be used on an HTTP backend. The URL parameter that’s specified is looked up in the query string of each HTTP GET request. If the parameter that’s found is followed by an equal sign and value, the value is hashed and divided by the total weight of running servers.
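The lookup-then-hash step can be sketched in Python with the standard `urllib.parse` module. Equal weights are assumed, CRC32 is a stand-in hash, and the parameter name `user` and the URLs are invented for the example. When the parameter is missing or has no value, this sketch just returns `None` and leaves the fallback to the caller.

```python
import zlib
from urllib.parse import urlsplit, parse_qs


def pick_by_url_param(url, param, servers):
    """Hash the value of `param` from the query string; None if it is absent."""
    query = parse_qs(urlsplit(url).query)   # blank values are dropped by default
    values = query.get(param)
    if not values:
        return None   # no usable value: the caller must choose another way
    return servers[zlib.crc32(values[0].encode("utf-8")) % len(servers)]


servers = ["web1", "web2", "web3"]   # hypothetical, equally weighted
a = pick_by_url_param("http://example.com/cart?user=alice", "user", servers)
b = pick_by_url_param("http://example.com/pay?user=alice&step=2", "user", servers)
c = pick_by_url_param("http://example.com/cart", "user", servers)
print(a == b, c)   # True None
```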

Requirements:

An IAM user with administrator access
Ansible installed
The boto Python library, installed with “pip3 install boto”

First, I check the Ansible version.

Next, I ping localhost.

As you can see, localhost is reachable.
Now check that boto/boto3 is installed.
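One way to make this check from Python itself is to ask the import system whether it can locate the modules. This is a small sketch, not necessarily the check used in this walkthrough; the helper name `installed` is hypothetical. Run it with the same python3 interpreter that Ansible uses.

```python
from importlib.util import find_spec


def installed(*modules):
    """Report, for each top-level module name, whether Python can locate it."""
    return {name: find_spec(name) is not None for name in modules}


print(installed("boto", "boto3"))
```

A value of `False` for either name means the library is not visible to that interpreter and needs to be installed with pip3.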

Here I create a vault, which is essentially a locked box where we can hide our private keys.

To launch the AWS instances we need to write a playbook, and that playbook needs AWS secret credentials. So I create a secure.yml file that holds the AWS credentials; it is encrypted, so no one can read it without the vault password.

Here you can see that my “secure.yml” file is encrypted.

Now I create a playbook named “ec2.yml” that launches the EC2 instances: three web servers and one load balancer.

Before running the playbook, you can see that there are no EC2 instances running in my AWS account.

Running the Ansible playbook launches three web server instances and one load balancer instance on AWS.

Here you can see that the playbook ran successfully, with no errors.

Here you can see the instances launched on AWS: one load balancer and three web servers.

Now it is time to fetch the IP addresses of the instances from AWS using the ec2.py and ec2.ini files.

Both files have downloaded successfully. Now transfer your AWS key from your system to the Linux machine using WinSCP and make it executable. Also make the ec2.ini and ec2.py files executable.

chmod +x ec2.py

chmod +x ec2.ini

Next, I open the ec2.py file.

Make sure the shebang line points to Python 3.
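If you prefer to check this without opening an editor, a few lines of Python can read the first line of the script and test it. This is only a convenience sketch; both function names are hypothetical, and the test is a simple substring match rather than a full parse of the shebang.

```python
def shebang_uses_python3(first_line):
    """True if the line is a shebang whose interpreter mentions python3."""
    return first_line.startswith("#!") and "python3" in first_line


def check_script(path):
    """Read only the first line of the script at `path` and test it."""
    with open(path, encoding="utf-8") as handle:
        return shebang_uses_python3(handle.readline())


print(shebang_uses_python3("#!/usr/bin/env python"))   # False
print(shebang_uses_python3("#!/usr/bin/python3"))      # True
```

Calling `check_script("ec2.py")` from the directory that holds the file applies the same test to the real script.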

In the /etc/ansible directory I create one more inventory file named “inventory.txt”.

Here you can see that I have successfully retrieved the dynamic IPs of the EC2 instances after exporting the ini path, the Ansible host, the AWS region, the AWS access key, and the AWS secret key.

Now check whether all hosts are reachable through the ec2.py inventory by running “ansible all -m ping”.

Here you can see the contents of that file: how it fetches each IP and places it in its respective group.
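The grouping step can be illustrated with a short Python sketch. It takes a list of instance records and files each running instance's public IP under a group derived from its Name tag. The record layout, the tag values, the IP addresses, and the `tag_Name_` prefix used for group names are all assumptions made for this example, not output copied from ec2.py.

```python
from collections import defaultdict


def group_by_name_tag(instances):
    """Build {group name: [ip, ...]} from running instances, keyed by Name tag."""
    groups = defaultdict(list)
    for instance in instances:
        name = instance["tags"].get("Name")
        if name and instance["state"] == "running":
            groups["tag_Name_" + name].append(instance["public_ip"])
    return dict(groups)


# Hypothetical instance records; the IPs come from a documentation range.
instances = [
    {"public_ip": "203.0.113.10", "state": "running", "tags": {"Name": "webserver"}},
    {"public_ip": "203.0.113.11", "state": "running", "tags": {"Name": "webserver"}},
    {"public_ip": "203.0.113.12", "state": "stopped", "tags": {"Name": "webserver"}},
    {"public_ip": "203.0.113.20", "state": "running", "tags": {"Name": "loadbalancer"}},
]
inventory = group_by_name_tag(instances)
print(inventory)
```

Because the groups are rebuilt from the current state of the instances each time, a stopped web server simply drops out of its group on the next run.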

Now I create the directory that I set as my roles path, using the “mkdir /etc/myroles” command.

ansible-galaxy init webserver

ansible-galaxy init loadbalancer

With the “ansible-galaxy list” command you can see that the roles path is set up correctly.

Here you can see that the two roles were created successfully.

Now I go into the webserver role, then into its tasks folder, and edit the main.yml file.

Next, I go to the handlers file of the loadbalancer role to define the handler that restarts HAProxy.

This is the tasks file of the loadbalancer role: it installs HAProxy, sets the notify parameter on the configuration file task, and restarts the service.

Now I go to the haproxy.cfg file in the templates folder.

Here I use embedded Jinja2 code to register the web server IPs with the HAProxy load balancer dynamically.
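To show what such a template loop produces, here is the same idea in plain Python: one HAProxy `server` line is generated for each web server address. This is not the template from this setup; the backend name `app`, the server labels, port 80, and the addresses are assumptions made for the example.

```python
def render_backend(hosts, port=80):
    """Build a backend section with one 'server' line per web server address."""
    lines = ["backend app", "    balance roundrobin"]
    for index, host in enumerate(hosts, start=1):
        lines.append(f"    server app{index} {host}:{port} check")
    return "\n".join(lines)


# Hypothetical web server addresses, e.g. as returned by the dynamic inventory.
web_hosts = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]
config = render_backend(web_hosts)
print(config)
```

Since the list of hosts comes from the inventory at run time, re-running the playbook after web servers are added or removed regenerates these lines, and the notify on the configuration task restarts HAProxy to pick them up.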

Now I create one playbook to run all of these tasks together.

The whole setup for creating the load balancer on AWS is now ready, so we can run this playbook.

Here you can see that all the tasks ran successfully without any errors: the playbook installed the respective software on the instances, started it, and copied the content from source to destination.

Our setup is ready now. Enter the load balancer instance’s public IP in the browser.

Output:

From this output we can conclude that HAProxy is working properly as the load balancer.

I hope this is helpful to you. If you have any suggestions, please DM me or comment below.