docs update

Zsolt Ero
2024-10-24 02:15:27 +02:00
parent bbbc7230c0
commit d5365ef15b
19 changed files with 49 additions and 9 deletions

docs/benchmark/README.md Normal file

@@ -0,0 +1,43 @@
# HTTP Hosts Benchmarking
This repository contains tools and scripts for benchmarking the performance of HTTP hosts.
## Prerequisites
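Before running the benchmarks, you need to create a path list (`path_list_500k.txt`, the file name the Lua script expects). You have two options:
1. Generate it from real-world server logs using `nginx_to_path_list.py` (the script writes `path_list.txt`, so rename the output to match)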
2. Generate randomly (Note: real-world usage patterns are typically non-random, e.g., ocean tiles are rarely accessed)
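For the second option, a minimal sketch of a random generator could look like the following. It assumes the paths follow a `z/x/y.pbf` tile layout and that zoom levels run from 0 to 14. Neither is stated in this repository, and the function names are hypothetical.

```python
import random


def random_tile_paths(count, min_zoom=0, max_zoom=14, seed=None):
    """Return `count` random tile paths such as '14/8192/5461.pbf'.

    Assumption: tiles are addressed as z/x/y.pbf, with 2**z tiles per axis.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(count):
        z = rng.randint(min_zoom, max_zoom)
        tiles_per_axis = 2 ** z
        x = rng.randrange(tiles_per_axis)
        y = rng.randrange(tiles_per_axis)
        paths.append(f"{z}/{x}/{y}.pbf")
    return paths


def write_path_list(filename, count, seed=None):
    """Write one path per line, the format the Lua script reads."""
    with open(filename, "w") as fp:
        fp.writelines(path + "\n" for path in random_tile_paths(count, seed=seed))
```

Calling `write_path_list("path_list_500k.txt", 500_000)` would produce a file of the expected size. As noted above, such a list will not reflect real-world access patterns.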
## Important Notes
- Run the benchmarks on `localhost`, not over the internet! Otherwise you would just be testing your network speed.
- The benchmark uses the [wrk](https://github.com/wg/wrk) HTTP benchmarking tool.
## Usage
Basic command:
```bash
wrk -c10 -t4 -d10s -s /data/ofm/benchmark/wrk_custom_list.lua http://localhost
```
### Parameters Explained
- `-c10`: Number of connections to keep open
- `-t4`: Number of threads to use
- `-d10s`: Duration of the test (10 seconds)
- `-s`: Script file to use
### Thread Count Considerations
- `-t1`: More accurate results, as the URLs are requested exactly in list order
- `-t4`: Better reflects real-world usage patterns
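The difference can be illustrated with a small simulation. wrk's scripting documentation describes each thread as having its own Lua environment, so the sketch below assumes every thread keeps its own counter and starts at the top of the list. The strict round-robin interleaving between threads is a simplification for illustration, not how wrk schedules requests, and `request_order` is a hypothetical helper.

```python
def request_order(paths, threads, requests_per_thread):
    """Simulate which path each thread requests, one counter per thread."""
    order = []
    for step in range(requests_per_thread):
        for thread_id in range(threads):
            # Each thread walks the list from the start and wraps around.
            order.append((thread_id, paths[step % len(paths)]))
    return order


paths = ["a.pbf", "b.pbf", "c.pbf"]

# One thread: the list is replayed exactly in order, then wraps around.
single = [p for _, p in request_order(paths, threads=1, requests_per_thread=4)]

# Two threads: under the assumption above, each path is requested once per thread.
multi = [p for _, p in request_order(paths, threads=2, requests_per_thread=2)]

print(single)
print(multi)
```

If this assumption holds, a multi-threaded run requests the early part of the list several times, which may make caches look warmer than in a single-threaded replay.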
## Results
Benchmark results can be found in [results.md](results.md)
## Contributing
Feel free to submit your results including which hosts were used.


@@ -0,0 +1,31 @@
import json

# This script parses an nginx server log and creates a text file
# which can be used in the Lua script.
# The path file is not supplied in this repo.

with open('access.jsonl') as fp:
    json_lines = fp.readlines()

paths = []

for i, line in enumerate(json_lines):
    log_data = json.loads(line)

    # Keep only successful GET requests
    if log_data['status'] != 200:
        continue

    if log_data['request_method'] != 'GET':
        continue

    # Keep only vector tile requests
    uri = log_data['uri']
    if 'tiles/' not in uri or not uri.endswith('.pbf'):
        continue

    # Store the part of the URI after 'tiles/'
    path = log_data['uri'].split('tiles/')[1]
    paths.append(path + '\n')
    print(f'{i / len(json_lines) * 100:.1f}%')

with open('path_list.txt', 'w') as fp:
    fp.writelines(paths)
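# The filter above can be sketched as a standalone function and tried on a
# single log line. The sample line below is hypothetical: only the field names
# (status, request_method, uri) come from the script, and the URI prefix is
# made up for illustration.

```python
import json


def tile_path_from_log_line(line):
    """Return the tile path for a successful GET tile request, else None."""
    log_data = json.loads(line)
    if log_data["status"] != 200 or log_data["request_method"] != "GET":
        return None
    uri = log_data["uri"]
    if "tiles/" not in uri or not uri.endswith(".pbf"):
        return None
    # Same rule as the script: keep what follows 'tiles/'
    return uri.split("tiles/")[1]


# Hypothetical JSON log line; a real nginx log_format may differ.
sample = (
    '{"status": 200, "request_method": "GET", '
    '"uri": "/planet/20240101/tiles/14/8192/5461.pbf"}'
)
print(tile_path_from_log_line(sample))
```

# Note that the comparison with the integer 200 only matches if the log stores
# the status as a JSON number. If it is logged as the string "200", every line
# would be skipped.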

docs/benchmark/results.md Normal file

@@ -0,0 +1,71 @@
# wrk benchmarks
Real world usage, 500k requests replayed from server log.
### Hetzner dedicated server with NVMe SSD
#### localhost
Clean (cold) cache, right after an nginx restart.
```
service nginx restart
wrk -c10 -t4 -d60s -s /data/ofm/benchmark/wrk_custom_list.lua http://localhost
Running 1m test @ http://localhost
4 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.02ms 7.04ms 50.43ms 93.23%
Req/Sec 8.42k 2.01k 18.52k 69.79%
2871265 requests in 1.00m, 230.65GB read
Requests/sec: 47811.00
Transfer/sec: 3.84GB
```
This is massive overkill: we would only need 125 MB/s to saturate a Gigabit connection, and this is about 3840 MB/s.
The maximum latency (about 50 ms) is also very good, and there were no errors.
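The comparison above can be written out as a short calculation. It is a rough sketch that treats wrk's `3.84GB` as 3840 MB, as the text does. wrk may report binary units, which would shift the result slightly.

```python
GIGABIT_MBIT_PER_S = 1000

# 8 bits per byte: a Gigabit link carries at most 125 MB/s
link_mb_per_s = GIGABIT_MBIT_PER_S / 8

# Transfer/sec from the localhost run above (assumed to be decimal units)
localhost_mb_per_s = 3.84 * 1000

headroom = localhost_mb_per_s / link_mb_per_s
print(f"{link_mb_per_s:.0f} MB/s link, {headroom:.1f}x headroom")
```

In other words, the disk and nginx can serve roughly thirty times what a single Gigabit link can carry.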
#### over network
```
wrk -c10 -t4 -d60s -s /data/ofm/benchmark/wrk_custom_list.lua http://x.x.x.x
Running 1m test @ http://144.76.168.195
4 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 7.57ms 6.61ms 45.34ms 84.32%
Req/Sec 293.85 141.33 1.18k 73.07%
71628 requests in 1.00m, 6.05GB read
Requests/sec: 1191.88
Transfer/sec: 103.01MB
```
Realistically, about 103 MB/s is the maximum over a Gigabit connection.
---
### BuyVM KVM machine with 1 TB BuyVM Block Storage Slab
Advertisement: 40Gbit+ InfiniBand RDMA Storage Fabric giving near local storage performance.
Reality:
```
wrk -c10 -t4 -d60s -s /data/ofm/benchmark/wrk_custom_list.lua http://localhost
Running 1m test @ http://localhost
4 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 226.10ms 343.52ms 1.99s 87.75%
Req/Sec 29.77 38.06 272.00 89.72%
3655 requests in 1.00m, 232.76MB read
Socket errors: connect 0, read 0, write 0, timeout 8
Requests/sec: 60.87
Transfer/sec: 3.88MB
```
Wow, this is 60 requests per second compared to Hetzner's 47,000, just wow! Repeated tests with a hot cache resulted in slightly better performance, but still nowhere near Gigabit speed.
```
Requests/sec: 266.99
Transfer/sec: 23.07MB
```
I abandoned the idea of using BuyVM, even though their unlimited bandwidth is quite unique in this price range in the USA.
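To put the gap in numbers, here is a small calculation using only the figures reported above. The variable names are made up for illustration.

```python
# Requests/sec reported by wrk in the runs above
hetzner_rps = 47811.00
buyvm_cold_rps = 60.87
buyvm_hot_rps = 266.99

# Transfer/sec of the hot-cache BuyVM run, in MB/s
buyvm_hot_mb_per_s = 23.07
gigabit_mb_per_s = 1000 / 8  # 125 MB/s

cold_ratio = hetzner_rps / buyvm_cold_rps   # roughly 785x slower
hot_ratio = hetzner_rps / buyvm_hot_rps     # roughly 179x slower
hot_share = buyvm_hot_mb_per_s / gigabit_mb_per_s  # under a fifth of Gigabit

print(f"cold cache: {cold_ratio:.0f}x fewer requests/sec")
print(f"hot cache:  {hot_ratio:.0f}x fewer requests/sec")
print(f"hot cache uses {hot_share:.0%} of a Gigabit link")
```

Even with a hot cache, the block storage setup delivers less than a fifth of what a Gigabit link could carry.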


@@ -0,0 +1,39 @@
local counter = 1
local lines = {}

local url_base = "/planet/fake_version/" -- trailing slash
local path_list_txt = "/data/ofm/benchmark/path_list_500k.txt"

for line in io.lines(path_list_txt) do
    table.insert(lines, url_base .. line)
end

local function getNextUrl()
    -- Get the next URL from the list
    local url_path = lines[counter]
    counter = counter + 1

    -- If we've gone past the end of the list, wrap around to the start
    if counter > #lines then
        counter = 1
    end

    return url_path
end

request = function()
    -- Return the request object with the current URL path
    path = getNextUrl()
    local headers = {}
    headers["Host"] = "ofm"
    return wrk.format('GET', path, headers, nil)
end

response = function(status)
    if status ~= 200 then
        print("Non-200 response")
        print("Status: ", status)
        -- this only works in single threaded mode (-t1)
        print("Request path: ", path)
    end
end