Detailed Steps for Running TechEmpower Web Framework Benchmarks on AWS
This is a continuation of my previous article, Running TechEmpower Web Framework Benchmarks on AWS on my own.
See that article for the overview. Once you understand the motivation and background from it and want more detail on how I ran the benchmark, read on from here.
Before launching EC2, the first thing to set up is your VPC and associated Subnets, Route tables, etc. So create:
- a VPC
- a Subnet for the VPC
- an Internet Gateway (igw-08af305fa33ed191b in the screenshot below)
- a Route Table in the VPC, which can be like this:
- a Security Group
The Routes should allow local traffic so that the benchmark can run on a three-machine setup. SSH traffic from outside also needs to go through the route table, so a Route for the Internet Gateway is necessary.
Also you should create a Security Group:
You need to enable SSH, and it is a good idea to enable ICMP (Custom ICMP Rule) as well so that you can ping the instances when troubleshooting. The source IP 219.***.***.***/32 is my local PC's public IP address.
For AWS beginners, note that a Security Group is something you attach to EC2 instances, not to the Subnet. You'll attach this one to the EC2 instances you set up in the next section.
As the last note in the section, if you prefer a more secure environment, you can set up a jump host (a.k.a. Bastion host) or even split the network into multiple Subnets.
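As a quick sanity check on the addressing plan, here is a minimal sketch using Python's standard ipaddress module to confirm that a set of private IPs all fall inside one Subnet. The CIDR 10.0.0.0/24 is an assumption for illustration (this article does not state the actual range), and the function name is hypothetical.

```python
import ipaddress

# Assumed Subnet CIDR for illustration only; replace it with your own range.
SUBNET_CIDR = "10.0.0.0/24"


def hosts_outside_subnet(cidr, hosts):
    """Return the hosts whose address does not belong to the given subnet."""
    network = ipaddress.ip_network(cidr)
    return [h for h in hosts if ipaddress.ip_address(h) not in network]


if __name__ == "__main__":
    # Private IPs of the three benchmark machines (the ones used later in
    # the tfb command); replace them with your own.
    benchmark_hosts = ["10.0.0.207", "10.0.0.149", "10.0.0.216"]
    print("outside subnet:", hosts_outside_subnet(SUBNET_CIDR, benchmark_hosts))
```

An empty list means every machine sits in the same Subnet, so the local Route applies to traffic between them.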
Continuing from the VPC setup, you now need to set up EC2. The first step is to choose an AMI.
I chose the standard Amazon Linux 2, but other AMIs could work too. Amazon Linux 2 doesn't have Docker installed by default, but it only takes a few extra steps to install it and run the Docker daemon at startup.
Next, choose the instance type. It should be m5.xlarge, as in my previous article, Running TechEmpower Web Framework Benchmarks on AWS on my own.
After that you should configure the instance. Choose
- Number of instances = 3 (for 3-machine setup)
- Network = your VPC configured earlier
- Subnet = your Subnet configured earlier
- "Auto-assign Public IP" = Enable
- Set up "User Data" to launch the Docker daemon at startup
"Auto-assign Public IP" is convenient when you need to SSH from your local PC later like below:
> ssh -i .ssh/your-aws-ssh-key.pem ec2-user@18.104.22.168
A public IP like 18.104.22.168 above is allocated at EC2 instance startup.
(An Elastic IP would work too. Choose whichever you like.)
"User Data" is as follows, to bring up the docker daemon:
#!/bin/bash
yum update -y
amazon-linux-extras install docker
mkdir /etc/systemd/system/docker.service.d
echo "# /etc/systemd/system/docker.service.d/override.conf" >> /etc/systemd/system/docker.service.d/startup_options.conf
echo "[Service]" >> /etc/systemd/system/docker.service.d/startup_options.conf
echo "ExecStart= " >> /etc/systemd/system/docker.service.d/startup_options.conf
echo "ExecStart=/usr/bin/dockerd -H unix:// -H tcp://0.0.0.0:2375" >> /etc/systemd/system/docker.service.d/startup_options.conf
service docker start
usermod -a -G docker ec2-user
For more information about running docker at EC2 startup, see How do I enable the remote API for dockerd, and Failed to load listeners: no sockets found via socket activation: make sure the service was started by systemd:
// but use unix:// instead of fd://
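The echo lines in the user data above build a systemd drop-in file one line at a time. To make the resulting file easier to see, here is a minimal sketch that renders the same content as a single string. The function name render_docker_dropin is hypothetical, and the sketch leaves out the leading comment line that the user data also writes.

```python
def render_docker_dropin(listeners):
    """Build the text of a systemd drop-in that overrides dockerd's ExecStart."""
    flags = " ".join("-H " + listener for listener in listeners)
    lines = [
        "[Service]",
        # An empty ExecStart= clears the command inherited from docker.service,
        # so the next line replaces it instead of adding a second command.
        "ExecStart=",
        "ExecStart=/usr/bin/dockerd " + flags,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The same two listeners as in the user data: the local unix socket and
    # TCP port 2375 on all interfaces.
    print(render_docker_dropin(["unix://", "tcp://0.0.0.0:2375"]), end="")
```

The tcp://0.0.0.0:2375 listener is what later lets the controller talk to each machine's Docker daemon over the network.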
After the instance configuration, you can skip ahead to step 6, Security Group. Choose the security group you created earlier:
I actually set up one more EC2 instance as a "controller" to execute the docker run... command remotely against the 3 EC2 instances set up so far. So the list of EC2 instances was:
The controller instance doesn't need to be m5.xlarge, as it just sends docker run to the remote EC2 instances. In other words, it doesn't need to run the Docker daemon, but it does need a Docker client. You also need to git clone the TechEmpower GitHub repo on it to run the benchmark startup script. So its user data can be this:
#!/bin/bash
yum update -y
amazon-linux-extras install docker
mkdir /etc/systemd/system/docker.service.d
echo "# /etc/systemd/system/docker.service.d/override.conf" >> /etc/systemd/system/docker.service.d/startup_options.conf
echo "[Service]" >> /etc/systemd/system/docker.service.d/startup_options.conf
echo "ExecStart= " >> /etc/systemd/system/docker.service.d/startup_options.conf
echo "ExecStart=/usr/bin/dockerd -H unix:// -H tcp://0.0.0.0:2375" >> /etc/systemd/system/docker.service.d/startup_options.conf
service docker start
usermod -a -G docker ec2-user
yum -y install git
cd /home/ec2-user
git clone https://github.com/TechEmpower/FrameworkBenchmarks.git
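Before running anything from the controller, it can help to confirm that each benchmark machine's Docker daemon is actually listening on TCP port 2375. The following is a minimal sketch using only Python's standard socket module. The function name docker_port_open is hypothetical, and the check only proves that the port accepts connections, not that Docker itself is healthy.

```python
import socket


def docker_port_open(host, port=2375, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On the controller you could call docker_port_open("10.0.0.207"), and likewise for the other two private IPs. A False result usually points at the Security Group, the Route Table, or a Docker daemon that did not start with the tcp:// listener.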
Run the benchmark
From here, the interesting part starts. SSH into the "controller" EC2 instance.
// replace 22.214.171.124 with the actual controller EC2's public IP
> ssh -i .ssh/your-aws-ssh-key.pem ec2-user@22.214.171.124
In short you'll just need to execute the following command:
./tfb --test h2o \
  --network-mode host \
  --server-host 10.0.0.207 \
  --database-host 10.0.0.149 \
  --client-host 10.0.0.216
by replacing the parameters or adding extra ones as needed. The benchmark results are then saved on the "controller" EC2 instance, under the FrameworkBenchmarks directory checked out from GitHub.
To remotely run wrk, the web server and the database from the controller, the --server-host, --database-host and --client-host parameters are necessary. Otherwise, everything will run on the same EC2 machine as the "controller".
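If you run several tests with different hosts, assembling the command in a small script avoids typos in the long parameter list. The sketch below only builds the argument list using the flags shown in this article. The function name build_tfb_command is hypothetical and not part of the TechEmpower toolset.

```python
import shlex


def build_tfb_command(test, server_host, database_host, client_host):
    """Assemble the argument list for a three-machine tfb run."""
    return [
        "./tfb",
        "--test", test,
        # 'host' network mode is what switches tfb to the remote Docker hosts.
        "--network-mode", "host",
        "--server-host", server_host,
        "--database-host", database_host,
        "--client-host", client_host,
    ]


if __name__ == "__main__":
    cmd = build_tfb_command("h2o", "10.0.0.207", "10.0.0.149", "10.0.0.216")
    # Print a shell-safe form of the command for copy and paste.
    print(shlex.join(cmd))
```

The list form can also be handed to subprocess.run with cwd set to the FrameworkBenchmarks directory, since ./tfb is relative to the repository root.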
Let's see what this command means in more detail.
The tfb script is at the root of the FrameworkBenchmarks repository, and it is explained in the README.
The explanation is a bit wordy, but to understand its inner behavior and why I passed those parameters, you'll need to look into a couple of Python scripts. The tfb script executes the run-tests.py script, which has descriptions of all the possible parameters. The parameters you pass to tfb are mostly passed down to run-tests.py.
run-tests.py depends on lots of other Python scripts. What's important for remote benchmark execution is the following piece of code in the BenchmarkConfig class:
class BenchmarkConfig:
    def __init__(self, args):
        ...
        ...
        if self.network_mode is None:
            self.network = 'tfb'
            self.server_docker_host = "unix://var/run/docker.sock"
            self.database_docker_host = "unix://var/run/docker.sock"
            self.client_docker_host = "unix://var/run/docker.sock"
        else:
            self.network = None
            # The only other supported network_mode is 'host', and that means
            # that we have a tri-machine setup, so we need to use tcp to
            # communicate with docker.
            self.server_docker_host = "tcp://%s:2375" % self.server_host
            self.database_docker_host = "tcp://%s:2375" % self.database_host
            self.client_docker_host = "tcp://%s:2375" % self.client_host
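This piece of code means that if network_mode (equivalent to the --network-mode parameter of the tfb script) is not None, that is, when it is host, then tfb tries to remotely run the web server on server_host (the --server-host parameter), the database on database_host (the --database-host parameter), and wrk on client_host (the --client-host parameter), with the docker run command via port 2375. If network_mode is None, all three use the local Docker socket instead.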
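To see the branch in isolation, here is a standalone restatement of the logic quoted above as a plain function. It is a sketch for illustration, not code from the TechEmpower repository. The function name docker_hosts and the dictionary keys are hypothetical.

```python
LOCAL_SOCKET = "unix://var/run/docker.sock"


def docker_hosts(network_mode, server_host, database_host, client_host):
    """Pick the Docker endpoint for each role, mirroring the quoted branch."""
    if network_mode is None:
        # Default: everything runs through the local Docker socket.
        return {
            "server": LOCAL_SOCKET,
            "database": LOCAL_SOCKET,
            "client": LOCAL_SOCKET,
        }
    # 'host' mode: each role talks to a remote Docker daemon over TCP.
    return {
        "server": "tcp://%s:2375" % server_host,
        "database": "tcp://%s:2375" % database_host,
        "client": "tcp://%s:2375" % client_host,
    }


if __name__ == "__main__":
    hosts = docker_hosts("host", "10.0.0.207", "10.0.0.149", "10.0.0.216")
    for role, endpoint in hosts.items():
        print(role, endpoint)
```

This makes it clear why the Docker daemons on the three machines must listen on tcp://0.0.0.0:2375: without that listener, the TCP endpoints chosen in host mode have nothing to connect to.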
If the network setup and the parameters are all correct, you will start seeing the benchmark run remotely on the separate EC2 instances. You can SSH into the respective EC2 instances and run ps or similar commands to verify that.
Again, the results will be saved on the controller EC2 instance, under the directory where the TechEmpower GitHub repository was checked out.