30x to 70x faster than mitmproxy/mitmdump, 4x faster than Squid
I was recently contacted by a user asking about the performance overhead when using impersonation rules in Fluxzy. Since I had never conducted precise benchmarks before, I decided to set something up in a way that would allow for reproducible benchmarks across various configurations.
The idea was straightforward: measure basic performance indicators such as the number of requests per second and bandwidth usage, without all the rocket science typically found in benchmarks.
So, I decided to take it a step further and compare Fluxzy CLI with similar tools.
As a reminder, Fluxzy CLI is an open-source command-line application that acts as an HTTP intermediary, allowing various types of manipulation and recording of HTTP(S) traffic.
Since most MITM tools are either closed-source, paid, or restricted by non-comparison clauses, this test focuses only on comparing Fluxzy CLI with mitmproxy, or more precisely mitmdump, its console-based counterpart optimized for quick traffic dumping. To provide reference points, the benchmark is also executed under two additional configurations: one without a proxy and another using the well-known proxy Squid, configured with caching disabled (and tested only with plain-text traffic).
Setting Up the Benchmark
To carry out this experiment, we will need an HTTP client capable of performing benchmark tests and an HTTP server that is fast enough to handle the workload. The tests should be executed locally to avoid any influence from network-related factors.
The setup for this test is documented in the following repository: https://github.com/haga-rak/floody and follows this simple layout:
+-----------+       +------------+       +-----------+
|  floody   |------>|   Proxy    |------>|  floodys  |
| (Client)  |       | (to test)  |       | (Server)  |
+-----------+       +------------+       +-----------+
Client
After reviewing a dozen HTTP stress-testing tools available online, I realized that none of the regularly maintained, cross-platform tools support routing traffic through a proxy. This includes big names like wrk and k6.
Given that the performance requirements are relatively moderate compared to real reverse proxies, I decided to create a trivial wrapper around .NET's HttpClient to generate HTTP requests. You can find this implementation here: https://github.com/haga-rak/floody/tree/main/src/floody.
The input parameters are pretty simple: warm-up and test duration, number of concurrent connections, payload size, proxy options, extra headers, and so on.
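A rough sketch of what such a wrapper can look like is shown below; the ports, endpoint, and option values are illustrative assumptions, not floody's actual API or defaults:

```csharp
// Minimal sketch of a proxy-aware load generator built on HttpClient.
// Ports, endpoint and option values are illustrative; see the floody
// repository for the actual implementation.
using System.Diagnostics;
using System.Net;
using System.Net.Security;

var handler = new SocketsHttpHandler
{
    Proxy = new WebProxy("http://127.0.0.1:44344"),  // proxy under test
    UseProxy = true,
    MaxConnectionsPerServer = 16,                    // concurrency cap
    SslOptions = new SslClientAuthenticationOptions
    {
        // Accept the MITM certificate presented by the proxy (equivalent to -k)
        RemoteCertificateValidationCallback = (_, _, _, _) => true
    }
};

using var client = new HttpClient(handler);

long success = 0, failed = 0;
var duration = TimeSpan.FromSeconds(15);
var watch = Stopwatch.StartNew();

// 16 concurrent workers hammering the same endpoint for the test duration
var workers = Enumerable.Range(0, 16).Select(async _ =>
{
    while (watch.Elapsed < duration)
    {
        try
        {
            using var response = await client.GetAsync("https://localhost:5001/?size=8192");
            await response.Content.ReadAsByteArrayAsync();
            Interlocked.Increment(ref success);
        }
        catch
        {
            Interlocked.Increment(ref failed);
        }
    }
}).ToArray();

await Task.WhenAll(workers);
Console.WriteLine($"req/s: {success / duration.TotalSeconds:F1} (failed: {failed})");
```

The single shared HttpClient instance matters here: reusing the handler keeps connections pooled, so the benchmark measures the proxy rather than connection setup on the client side.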
Server
For those unfamiliar with the .NET ecosystem, Kestrel is the cross-platform web server used by ASP.NET Core, known for its speed, efficiency, and flexibility.
Given Kestrel's reputation as an exceptionally fast HTTP server, I used it to set up a lightweight endpoint that simply returns a response of the size specified by the client, which is sufficient for benchmarking purposes. By default, the server listens on both HTTP and HTTPS to allow testing every scenario.
This implementation can easily exceed 300K requests per second with 128 connections on a workstation rig when driven by tools like bombardier and wrk. When paired with the client described above, it reaches about 220K requests per second with TLS enabled and 16 connections.
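For reference, such an endpoint fits in a few lines of ASP.NET Core minimal API code. The sketch below is only an illustration under assumed ports and query parameter name, not necessarily the exact floodys implementation:

```csharp
// Minimal sketch of a Kestrel server returning a payload of the requested size.
// Ports and the "size" query parameter are assumptions for illustration.
var builder = WebApplication.CreateBuilder(args);

builder.WebHost.ConfigureKestrel(options =>
{
    options.ListenAnyIP(5000);                     // plain HTTP
    options.ListenAnyIP(5001, o => o.UseHttps());  // TLS, using the dev certificate
});

var app = builder.Build();

// Pre-allocate a buffer once so the hot path does no per-request allocation.
var payload = new byte[1024 * 1024];
Random.Shared.NextBytes(payload);

// GET /?size=8192 returns the first 8192 bytes of the pre-allocated buffer.
app.MapGet("/", (int? size) =>
    Results.Bytes(payload.AsMemory(0, Math.Clamp(size ?? 0, 0, payload.Length))));

app.Run();
```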
The Tests
The test is conducted with a limited number of connections to ensure that client and server resource usage does not affect the results. Depending on the testing machine, values up to 128 connections can be used without any issue.
Reproducing the Tests
Here are the quick steps to reproduce the tests:

- Download and start Fluxzy CLI: `fluxzy start -k`. The `-k` flag disables TLS verification. Use `--max-upstream-connection` to increase the number of upstream connections, which is set to 16 by default.
- Start mitmdump: `mitmdump -k -q`. The `-k` flag disables TLS verification and `-q` suppresses stdout logs that could significantly slow down the proxy.
- (Optional) Install and start Squid with caching disabled, using `cache deny all` and `cache_dir null /tmp` in its configuration.
- Clone the floody repository: `git clone https://github.com/haga-rak/floody`
- Run the benchmark: `./build.sh "compare:3128 44344 8080"`. Port `3128` is the Squid port, `44344` is the Fluxzy port, and `8080` is the mitmdump port. Of course, you can change these values to match your setup.
Results
The tests take into account the following configurations:

- Active MITM: the proxy decrypts the TLS traffic and forwards it to the other peer.
- 16 concurrent connections
- Plain `HTTP/1.1` and `H2` over TLS (see the snippet below)
- No response body and an 8192-byte response body
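For clarity, these two protocol modes map to the standard `HttpClient` version settings; the snippet below is purely illustrative and not necessarily how floody configures them:

```csharp
using System.Net;

// Illustrative only: requesting H2 over TLS explicitly with HttpClient.
// The plain-text runs use HttpVersion.Version11 against the HTTP endpoint instead.
using var client = new HttpClient();

var request = new HttpRequestMessage(HttpMethod.Get, "https://localhost:5001/?size=8192")
{
    Version = HttpVersion.Version20,                        // H2 over TLS
    VersionPolicy = HttpVersionPolicy.RequestVersionExact   // fail rather than downgrade
};

using var response = await client.SendAsync(request);
Console.WriteLine($"{response.Version} -> {response.StatusCode}");
```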
PLAIN - No response body - 15s
| | Total | Success | Fail | req/s | Bandwidth |
|---|---|---|---|---|---|
| No proxy | 4035035 | 4035031 | 0 | 269002.0 | 33.86 MB/s |
| squid | 389874 | 389874 | 0 | 25991.6 | 5.4 MB/s |
| fluxzy | 1525442 | 1525442 | 0 | 101696.1 | 12.8 MB/s |
| mitmproxy/mitmdump | 22064 | 22064 | 0 | 1470.9 | 189.61 KB/s |
| Diff. MITM | 69 times | 69 times | / | 69 times | 69 times |
Fedora Linux 41 (Workstation Edition), AMD Ryzen 9 7950X3D 16-Core Processor
TLS - No response body - 15s
| | Total | Success | Fail | req/s | Bandwidth |
|---|---|---|---|---|---|
| No proxy | 3317020 | 3317020 | 0 | 221134.667 | 27.84 MB/s |
| fluxzy | 852732 | 852732 | 0 | 56848.800 | 15.51 MB/s |
| mitmproxy/mitmdump | 20994 | 20994 | 0 | 1399.600 | 392.9 KB/s |
| Diff. MITM | 40 times | 40 times | / | 40 times | 40 times |
Fedora Linux 41 (Workstation Edition), AMD Ryzen 9 7950X3D 16-Core Processor
PLAIN - 8192 bytes response body - 15s
| | Total | Success | Fail | req/s | Bandwidth |
|---|---|---|---|---|---|
| No proxy | 2669850 | 2669850 | 0 | 177990.000 | 1.38 GB/s |
| squid | 228279 | 228279 | 0 | 15218.600 | 122.49 MB/s |
| fluxzy | 930238 | 930238 | 0 | 62015.867 | 493.61 MB/s |
| mitmproxy/mitmdump | 20860 | 20860 | 0 | 1390.667 | 11.07 MB/s |
| Diff. MITM | 44 times | 44 times | / | 44 times | 44 times |
Fedora Linux 41 (Workstation Edition), AMD Ryzen 9 7950X3D 16-Core Processor
TLS - 8192 bytes response body - 15s
| | Total | Success | Fail | req/s | Bandwidth |
|---|---|---|---|---|---|
| No proxy | 1822330 | 1822330 | 0 | 121488.667 | 966.97 MB/s |
| fluxzy | 532140 | 532134 | 0 | 35475.600 | 568.46 MB/s |
| mitmproxy/mitmdump | 18784 | 18784 | 0 | 1252.267 | 20.02 MB/s |
| Diff. MITM | 28.329 times | 28.329 times | / | 28.329 times | 28.401 times |
Fedora Linux 41 (Workstation Edition), AMD Ryzen 9 7950X3D 16-Core Processor
Breaking down the Performance Gap
This benchmark primarily measures I/O operations combined with TLS processing, examining how efficiently data is received, processed, and returned. The goal is to evaluate each tool's performance under similar conditions.
The performance gap between mitmproxy and Fluxzy can likely be partially attributed to their underlying platforms. mitmproxy relies on Python, which can exhibit slower performance characteristics in high-throughput scenarios. Fluxzy is built on .NET 8.0, which benefits from recent performance optimizations, particularly in garbage collection and memory management.
Fluxzy also incorporates several design choices aimed at maximizing efficiency:
- Single Buffer Usage: A single buffer processes client requests, reducing memory overhead and streamlining data handling.
- Always-On Streaming Mode: Built-in actions in Fluxzy maintain an active streaming approach that does not store an entire response in user-space memory if it exceeds the buffer size. This keeps memory usage consistent, even for large payloads.
- Stack Manipulation Techniques: Utilizing features such as `stackalloc` and `Span<T>` minimizes heap allocations for synchronous operations with moderate memory requirements, a common scenario in HTTP intermediary services (see the sketch below).
- Predefined Configuration Rules: Unlike dynamic scripting (as seen in mitmproxy), Fluxzy employs predefined rules mapped to compiled code. Testing showed that adding a response header in this manner has a negligible impact on performance.
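To give a concrete feel for the stack manipulation point, here is a simplified, illustrative sketch of allocation-free header parsing with `Span<byte>` and `stackalloc`; it is not Fluxzy's actual code:

```csharp
using System.Buffers.Text;
using System.Text;

// Parses a "Content-Length" header line without allocating any intermediate string.
static bool TryReadContentLength(ReadOnlySpan<byte> headerLine, out long contentLength)
{
    contentLength = 0;

    // UTF-8 literal: a ReadOnlySpan<byte> referencing static data, no allocation.
    ReadOnlySpan<byte> prefix = "Content-Length:"u8;

    if (!headerLine.StartsWith(prefix))
        return false;

    // Slice the value part in place, then parse it directly from the bytes.
    var value = headerLine[prefix.Length..].Trim((byte)' ');
    return Utf8Parser.TryParse(value, out contentLength, out _);
}

// Example usage: a small scratch buffer carved out on the stack.
Span<byte> scratch = stackalloc byte[64];
int written = Encoding.ASCII.GetBytes("Content-Length: 8192", scratch);

if (TryReadContentLength(scratch[..written], out var length))
    Console.WriteLine($"Announced body length: {length}");
```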
Finally, TLS implementation does not appear to be a decisive factor in the performance difference. Both tools use OpenSSL by default on Linux, offering native TLS support that operates independently of the application layer.
Final words
Are these results important? Probably not. Mitmproxy/mitmdump is fast enough for most use cases. It's a tool that has been around for over a decade, benefiting from extensive user feedback and a large, active community. The fact that it's built in Python, a very accessible language, makes it incredibly flexible to use and to extend. And in fact, I'm still secretly a mitmproxy enjoyer.
As for Fluxzy, while it already offers extensive capabilities for traffic manipulation (40+ possible actions at the time of writing), most users who integrate Fluxzy into their tools use it either to collect synthetic monitoring data with minimal overhead or to implement enterprise-level WAFs with advanced rules.
This simple benchmark session, of course, wasn’t conducted under perfect conditions or following strict scientific methods. It was simply designed to give a rough overview of performance differences.
Published on Wednesday, 22 January 2025