diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 97d039a..93177e6 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -12,10 +12,10 @@ jobs: runs-on: ubuntu-latest steps: - name: Checkout - uses: actions/checkout@v4 + uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4 - name: Set up Go - uses: actions/setup-go@v5 + uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5 with: go-version-file: go.mod cache: true @@ -28,3 +28,9 @@ jobs: - name: Test run: go test -race ./... + + - name: Install govulncheck + run: go install golang.org/x/vuln/cmd/govulncheck@latest + + - name: Run govulncheck + run: govulncheck ./... diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 01bf857..f95585d 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -14,12 +14,12 @@ jobs: runs-on: ubuntu-latest steps: - name: Checkout - uses: actions/checkout@v4 + uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4 with: fetch-depth: 0 - name: Set up Go - uses: actions/setup-go@v5 + uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5 with: go-version-file: go.mod cache: true @@ -28,7 +28,7 @@ jobs: run: go test -race ./... - name: Run GoReleaser - uses: goreleaser/goreleaser-action@v6 + uses: goreleaser/goreleaser-action@e435ccd777264be153ace6237001ef4d979d3a7a # v6 with: distribution: goreleaser version: "~> v2" diff --git a/.gitignore b/.gitignore index d125ac4..db6d8b7 100644 --- a/.gitignore +++ b/.gitignore @@ -1,6 +1,7 @@ # Build output /bin/ /dist/ +/static-web # Test and coverage artifacts coverage.out diff --git a/CHANGELOG.md b/CHANGELOG.md index 0504e1b..aebe21b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,21 @@ +## v1.3.0 (2026-03-08) + +### Perf + +- **server**: migrate HTTP layer from net/http to fasthttp — ~141k req/sec (55% faster than Bun) +- **server**: use `tcp4` listener to eliminate dual-stack overhead (2x throughput gain on macOS) + +### Refactor + +- **handler**: replace `http.ServeContent` with custom `parseRange()`/`serveRange()` for byte-range requests +- **compress**: convert gzip middleware from wrapping `ResponseWriter` to post-processing response body +- **security**: use `ctx.SetStatusCode()`+`ctx.SetBodyString()` instead of `ctx.Error()` to preserve headers +- **cache**: change `CachedFile` header fields from `[]string` to `string` + +### Build + +- **benchmark**: add fasthttp/net-http hello world baselines and update baremetal script + ## v1.2.0 (2026-03-07) ### Feat diff --git a/CLI.md b/CLI.md index 81a4556..e1809b5 100644 --- a/CLI.md +++ b/CLI.md @@ -186,6 +186,7 @@ Grouped by concern for readability. All flags are optional; unset flags do not o |------|------|---------|--------------| | `--host` | string | `` (all interfaces) | `server.addr` (host part) | | `--port`, `-p` | int | `8080` | `server.addr` (port part) | +| `--redirect-host` | string | — | `server.redirect_host` | | `--tls-cert` | string | — | `server.tls_cert` | | `--tls-key` | string | — | `server.tls_key` | | `--tls-port` | int | `8443` | `server.tls_addr` (port part) | @@ -205,6 +206,8 @@ Grouped by concern for readability. All flags are optional; unset flags do not o |------|------|---------|--------------| | `--no-cache` | bool | `false` | `cache.enabled = false` | | `--cache-size` | string | `256MB` | `cache.max_bytes` (parses `256MB`, `64MB`, `1GB`) | +| `--preload` | bool | `false` | `cache.preload` — load all files into cache at startup | +| `--gc-percent` | int | `0` | `cache.gc_percent` — Go GC target % (0 = default; try 400 for throughput) | #### Compression @@ -244,6 +247,7 @@ static-web --dir-listing --no-dotfile-block ~/Downloads # Serve with TLS (HTTPS on :443, HTTP redirect on :80) static-web --port 80 --tls-port 443 \ + --redirect-host static.example.com \ --tls-cert /etc/ssl/cert.pem \ --tls-key /etc/ssl/key.pem \ ./public @@ -265,6 +269,9 @@ static-web # Disable caching (useful during local development to see file changes immediately) static-web --no-cache ./dist +# Maximum throughput: preload all files + tune GC +static-web --preload --gc-percent 400 ./dist + # Print version info static-web version ``` @@ -385,7 +392,7 @@ The CLI was implemented using Go stdlib `flag.FlagSet` — no external framework - **`--host` + `--port` merging**: `net.SplitHostPort` / `net.JoinHostPort` used to decompose and reconstruct `server.addr`. - **`parseBytes()`**: a small helper that parses `256MB`, `1GB`, etc. with `B`/`KB`/`MB`/`GB` suffixes (case-insensitive). - **`//go:embed config.toml.example`**: the example config is embedded in `cmd/static-web/` at compile time. The binary is fully self-contained. -- **`--quiet`**: passes `io.Discard` to a `loggingMiddlewareWithWriter` variant, suppressing access log output with zero overhead. +- **`--quiet`**: skips access-log middleware entirely, removing per-request logging overhead. - **`--verbose`**: calls `logConfig(cfg)` after all overrides are applied, so you see the final resolved values. - **Version injection**: `internal/version.Version`, `Commit`, `Date` are set via `-ldflags` at build time. Default to `"dev"`, `"none"`, `"unknown"` for `go run`. diff --git a/Makefile b/Makefile index c7b9297..44134ee 100644 --- a/Makefile +++ b/Makefile @@ -1,4 +1,4 @@ -.PHONY: build run test bench lint precompress clean release install commit bump changelog benchmark benchmark-keep benchmark-down +.PHONY: build run test bench lint precompress clean release install commit bump changelog benchmark benchmark-keep benchmark-down benchmark-baremetal # Binary output path and name BIN := bin/static-web @@ -83,3 +83,7 @@ benchmark-keep: ## benchmark-down: tear down any running benchmark containers benchmark-down: docker compose -f benchmark/docker-compose.benchmark.yml down --remove-orphans + +## benchmark-baremetal: run bare-metal benchmark (static-web production vs Bun, no Docker) +benchmark-baremetal: + @bash benchmark/baremetal.sh diff --git a/README.md b/README.md index 8bbcacb..6e39cc9 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # static-web -A production-grade, high-performance static web file server written in Go. Zero external runtime dependencies beyond `BurntSushi/toml` and `hashicorp/golang-lru/v2`. +A production-grade, high-performance static web file server written in Go. Built on [fasthttp](https://github.com/valyala/fasthttp) for maximum throughput — **~141k req/sec**, 55% faster than Bun's native static server. ## Table of Contents @@ -53,11 +53,11 @@ static-web --help | Feature | Detail | |---------|--------| -| **In-memory LRU cache** | Size-bounded, byte-accurate; zero-alloc hot path (~27 ns/op) | +| **In-memory LRU cache** | Size-bounded, byte-accurate; ~28 ns/op lookup with 0 allocations. Optional startup preload for instant cache hits. | | **gzip compression** | On-the-fly via pooled `gzip.Writer`; pre-compressed `.gz`/`.br` sidecar support | | **HTTP/2** | Automatic ALPN negotiation when TLS is configured | | **Conditional requests** | ETag, `304 Not Modified`, `If-Modified-Since`, `If-None-Match` | -| **Range requests** | Byte ranges via `http.ServeContent` for video and large files | +| **Range requests** | Byte ranges via custom `parseRange`/`serveRange` implementation for video and large files | | **TLS 1.2 / 1.3** | Modern cipher suites; configurable cert/key paths | | **Security headers** | `X-Content-Type-Options`, `X-Frame-Options`, `Content-Security-Policy`, `Referrer-Policy`, `Permissions-Policy` | | **HSTS** | `Strict-Transport-Security` on all HTTPS responses; configurable max-age | @@ -68,7 +68,7 @@ static-web --help | **Symlink escape prevention** | `EvalSymlinks` re-verified against root; symlinks pointing outside root are blocked | | **CORS** | Configurable per-origin or wildcard (`*` emits literal `*`, never reflected) | | **Graceful shutdown** | SIGTERM/SIGINT drains in-flight requests with configurable timeout | -| **Live cache flush** | SIGHUP flushes the in-memory cache without downtime | +| **Live cache flush** | SIGHUP flushes both the in-memory file cache and the path-safety cache without downtime | --- @@ -83,7 +83,7 @@ HTTP request └────────┬────────┘ │ ┌────────▼────────┐ -│ loggingMiddleware │ ← pooled statusResponseWriter; logs method/path/status/duration +│ loggingMiddleware │ ← logs method/path/status/duration └────────┬────────┘ │ ┌────────▼────────────────────────────────────────┐ @@ -91,31 +91,28 @@ HTTP request │ • Method whitelist (GET/HEAD/OPTIONS only) │ │ • Security headers (set BEFORE path check) │ │ • PathSafe: null bytes, path.Clean, EvalSymlinks│ +│ • Path-safety cache (sync.Map, pre-warmed) │ │ • Dotfile blocking │ │ • CORS (preflight + per-origin or wildcard *) │ -│ • Injects validated path into context │ -└────────┬────────────────────────────────────────┘ - │ -┌────────▼────────────────────────────────────────┐ -│ headers.Middleware │ -│ • 304 Not Modified (ETag, If-Modified-Since) │ -│ • Cache-Control, immutable pattern matching │ -└────────┬────────────────────────────────────────┘ - │ -┌────────▼────────────────────────────────────────┐ -│ compress.Middleware │ -│ • lazyGzipWriter: decides at first Write() │ -│ • Skips 1xx/204/304, non-compressible types │ -│ • Respects q=0 explicit denial │ +│ • Injects validated path into ctx.SetUserValue │ └────────┬────────────────────────────────────────┘ │ ┌────────▼────────────────────────────────────────┐ │ handler.FileHandler │ -│ • Cache hit → serve from memory (zero os.Stat) │ +│ • Cache hit → direct ctx.SetBody() fast path │ +│ • Range/conditional → custom serveRange() │ │ • Cache miss → os.Stat → disk read → cache put │ │ • Large files (> max_file_size) bypass cache │ │ • Encoding negotiation: brotli > gzip > plain │ +│ • Preloaded files served instantly on startup │ │ • Custom 404 page (path-validated) │ +└─────────────────────────────────────────────────┘ + │ +┌────────▼────────────────────────────────────────┐ +│ compress.Middleware (post-processing) │ +│ • Compresses response body after handler runs │ +│ • Skips 1xx/204/304, non-compressible types │ +│ • Respects q=0 explicit denial │ └─────────────────────────────────────────────────┘ ``` @@ -125,32 +122,52 @@ HTTP request GET /app.js │ ├─ cache.Get("/app.js") hit? - │ YES → serveFromCache (no syscall) → done + │ YES → serveFromCache (direct ctx.SetBody, no syscall) → done │ └─ NO → resolveIndexPath → cache.Get(canonicalURL) hit? YES → serveFromCache → done NO → os.Stat → os.ReadFile → cache.Put → serveFromCache ``` +When `preload = true`, every eligible file is loaded into cache at startup. The path-safety cache (`sync.Map`) is also pre-warmed, so the very first request for any preloaded file skips both filesystem I/O and `EvalSymlinks`. + --- ## Performance -Benchmark numbers on Apple M2 Pro (`go test -bench=. -benchtime=5s`): +### End-to-end HTTP benchmarks -| Benchmark | ops/s | ns/op | allocs/op | -|-----------|-------|-------|-----------| -| `BenchmarkCacheGet` | 87–131 M | 27 | 0 | -| `BenchmarkCachePut` | 42–63 M | 57 | 0 | -| `BenchmarkCacheGetParallel` | 15–25 M | 142–147 | 0 | -| `BenchmarkHandler_CacheHit` | — | ~5,840 | — | +Measured on Apple M-series, localhost (no Docker), serving 3 small static files via `bombardier -c 100 -n 100000`: -Key design decisions driving these numbers: +| Server | Avg Req/sec | p50 Latency | p99 Latency | Throughput | +|--------|-------------|-------------|-------------|------------| +| **static-web** (fasthttp + preload) | **~141,000** | **619 µs** | **2.46 ms** | **469 MB/s** | +| Bun (native static serve) | ~90,000 | 1.05 ms | 2.33 ms | 306 MB/s | +| static-web (old net/http) | ~76,000 | 1.25 ms | 3.15 ms | — | +With `preload = true` and the fasthttp engine, static-web delivers **~141k req/sec** — **55% faster than Bun's native static serving**, while offering full security headers, TLS, and compression out of the box. + +### Micro-benchmarks + +Measured on Apple M2 Pro (`go test -bench=. -benchtime=5s`): + +| Benchmark | ops/s | ns/op | allocs/op | +|-----------|-------|-------|-----------| +| `BenchmarkCacheGet` | 35–42 M | 28–29 | 0 | +| `BenchmarkCacheGetParallel` | 6–8 M | 139–148 | 0 | + +### Key design decisions + +- **fasthttp engine**: Built on [fasthttp](https://github.com/valyala/fasthttp) — pre-allocated per-connection buffers with near-zero allocation hot path. Cache hits bypass all string formatting; headers are pre-computed at cache-population time. +- **`tcp4` listener**: IPv4-only listener eliminates dual-stack overhead on macOS/Linux — a 2× throughput difference vs `"tcp"`. +- **Preload at startup**: `preload = true` reads all eligible files into RAM before the first request — eliminating cold-miss latency. +- **Direct `ctx.SetBody()` fast path**: cache hits bypass range/conditional logic entirely; pre-formatted `Content-Type` and `Content-Length` headers are assigned directly. +- **Custom Range implementation**: `parseRange()`/`serveRange()` handle byte-range requests without `http.ServeContent`. +- **Post-processing compression**: compress middleware runs after the handler, compressing the response body in a single pass. +- **Path-safety cache**: `sync.Map`-based cache eliminates per-request `filepath.EvalSymlinks` syscalls. Pre-warmed from preload. +- **GC tuning**: `gc_percent = 400` reduces garbage collection frequency — the hot path avoids all formatting allocations, with only minimal byte-to-string conversions from fasthttp's `[]byte` API. - **Cache-before-stat**: `os.Stat` is never called on a cache hit — the hot path is pure memory. - **Zero-alloc `AcceptsEncoding`**: walks the `Accept-Encoding` header byte-by-byte without `strings.Split`. -- **Pooled `sync.Pool`**: both `gzip.Writer` and `statusResponseWriter` are pooled. -- **`filepath.Abs` at startup**: computed once during construction, never per-request. - **Pre-computed `ETagFull`**: the `W/"..."` string is built when the file is cached. --- @@ -194,10 +211,10 @@ Only `GET`, `HEAD`, and `OPTIONS` are accepted. All other methods (including `TR | Mitigation | Value | |------------|-------| -| `ReadHeaderTimeout` | 5 s (Slowloris) | -| `ReadTimeout` | 10 s | +| `ReadTimeout` | 10 s (covers full read phase including headers — Slowloris protection) | | `WriteTimeout` | 10 s | -| `MaxHeaderBytes` | 8 KiB | +| `IdleTimeout` | 75 s (keep-alive) | +| `MaxRequestBodySize` | 0 (no body accepted — static server) | --- @@ -211,10 +228,10 @@ Copy `config.toml.example` to `config.toml` and edit as needed. The server start |-----|------|---------|-------------| | `addr` | string | `:8080` | HTTP listen address | | `tls_addr` | string | `:8443` | HTTPS listen address | +| `redirect_host` | string | — | Canonical host used for HTTP→HTTPS redirects | | `tls_cert` | string | — | Path to TLS certificate (PEM) | | `tls_key` | string | — | Path to TLS private key (PEM) | -| `read_header_timeout` | duration | `5s` | Slowloris protection | -| `read_timeout` | duration | `10s` | Full request read deadline | +| `read_timeout` | duration | `10s` | Full request read deadline (covers headers; Slowloris protection) | | `write_timeout` | duration | `10s` | Response write deadline | | `idle_timeout` | duration | `75s` | Keep-alive idle timeout | | `shutdown_timeout` | duration | `15s` | Graceful drain window | @@ -232,9 +249,11 @@ Copy `config.toml.example` to `config.toml` and edit as needed. The server start | Key | Type | Default | Description | |-----|------|---------|-------------| | `enabled` | bool | `true` | Toggle in-memory LRU cache | +| `preload` | bool | `false` | Load all eligible files into cache at startup | | `max_bytes` | int | `268435456` | Cache size cap (bytes) | | `max_file_size` | int | `10485760` | Max file size to cache (bytes) | | `ttl` | duration | `0` | Entry TTL (0 = no expiry; flush with SIGHUP) | +| `gc_percent` | int | `0` | Go GC target percentage (0 = use Go default of 100) | ### `[compression]` @@ -276,9 +295,9 @@ All environment variables override the corresponding TOML setting. Useful for co |----------|-------------| | `STATIC_SERVER_ADDR` | `server.addr` | | `STATIC_SERVER_TLS_ADDR` | `server.tls_addr` | +| `STATIC_SERVER_REDIRECT_HOST` | `server.redirect_host` | | `STATIC_SERVER_TLS_CERT` | `server.tls_cert` | | `STATIC_SERVER_TLS_KEY` | `server.tls_key` | -| `STATIC_SERVER_READ_HEADER_TIMEOUT` | `server.read_header_timeout` | | `STATIC_SERVER_READ_TIMEOUT` | `server.read_timeout` | | `STATIC_SERVER_WRITE_TIMEOUT` | `server.write_timeout` | | `STATIC_SERVER_IDLE_TIMEOUT` | `server.idle_timeout` | @@ -287,9 +306,11 @@ All environment variables override the corresponding TOML setting. Useful for co | `STATIC_FILES_INDEX` | `files.index` | | `STATIC_FILES_NOT_FOUND` | `files.not_found` | | `STATIC_CACHE_ENABLED` | `cache.enabled` | +| `STATIC_CACHE_PRELOAD` | `cache.preload` | | `STATIC_CACHE_MAX_BYTES` | `cache.max_bytes` | | `STATIC_CACHE_MAX_FILE_SIZE` | `cache.max_file_size` | | `STATIC_CACHE_TTL` | `cache.ttl` | +| `STATIC_CACHE_GC_PERCENT` | `cache.gc_percent` | | `STATIC_COMPRESSION_ENABLED` | `compression.enabled` | | `STATIC_COMPRESSION_MIN_SIZE` | `compression.min_size` | | `STATIC_COMPRESSION_LEVEL` | `compression.level` | @@ -307,12 +328,13 @@ Set `tls_cert` and `tls_key` to enable HTTPS: [server] addr = ":80" tls_addr = ":443" +redirect_host = "static.example.com" tls_cert = "/etc/ssl/certs/server.pem" tls_key = "/etc/ssl/private/server.key" ``` When TLS is configured: -- HTTP requests on `addr` are automatically **redirected** to `tls_addr` with `301 Moved Permanently`. +- HTTP requests on `addr` are automatically **redirected** to HTTPS. Set `redirect_host` when `tls_addr` listens on all interfaces (for example `:443`) so redirects use a canonical host instead of the incoming `Host` header. - **HTTP/2** is enabled automatically via ALPN negotiation. - **HSTS** (`Strict-Transport-Security`) is added to all HTTPS responses (configurable max-age). - Minimum TLS version is **1.2**; preferred cipher suites are ECDHE+AES-256-GCM and ChaCha20-Poly1305. @@ -348,9 +370,9 @@ make precompress # runs gzip and brotli on all .js/.css/.html/.json/.svg |--------|--------| | `SIGTERM` | Graceful shutdown (drains in-flight requests up to `shutdown_timeout`) | | `SIGINT` | Graceful shutdown | -| `SIGHUP` | Flush in-memory cache; re-reads config pointer in `main` | +| `SIGHUP` | Flush in-memory file cache and path-safety cache; re-reads config pointer in `main` | -> **Note**: SIGHUP reloads the config pointer in `main` but the live middleware chain holds references to the old config. A full restart is required for config changes to take effect. SIGHUP is useful for flushing the cache without downtime. +> **Note**: SIGHUP reloads the config pointer in `main` but the live middleware chain holds references to the old config. A full restart is required for config changes to take effect. SIGHUP is useful for flushing both the file cache and the path-safety cache without downtime. --- @@ -400,5 +422,4 @@ go test -race ./... # all tests, race-free | Limitation | Detail | |------------|--------| | **Brotli on-the-fly** | Not implemented. Only pre-compressed `.br` sidecar files are served. | -| **Cache TTL not enforced** | `cache.ttl` is parsed but the expiry logic is not yet implemented. Use SIGHUP to flush manually. | | **SIGHUP config reload** | Reloads the config struct pointer in `main` only. Live middleware chains hold old references — full restart required for config changes to propagate. | diff --git a/USER_GUIDE.md b/USER_GUIDE.md index 965a031..3304ae9 100644 --- a/USER_GUIDE.md +++ b/USER_GUIDE.md @@ -110,10 +110,10 @@ Run `static-web --help` or see [CLI.md](CLI.md) for the full flag reference. [server] addr = ":8080" # HTTP listen address tls_addr = ":8443" # HTTPS listen address (requires tls_cert + tls_key) +redirect_host = "" # canonical host for HTTP→HTTPS redirects (recommended in production) tls_cert = "" # path to PEM certificate file tls_key = "" # path to PEM private key file -read_header_timeout = "5s" # Slowloris protection -read_timeout = "10s" +read_timeout = "10s" # full read deadline (covers headers; Slowloris protection) write_timeout = "10s" idle_timeout = "75s" shutdown_timeout = "15s" # graceful drain window on SIGTERM/SIGINT @@ -127,7 +127,9 @@ not_found = "404.html" # custom 404 page, relative to root (optional) enabled = true max_bytes = 268435456 # 256 MB total cache cap max_file_size = 10485760 # files > 10 MB bypass the cache -ttl = "0s" # 0 = no expiry; flush manually with SIGHUP +ttl = "0s" # 0 = no expiry; >0 evicts stale entries on access +preload = false # true = load all files into RAM at startup +# gc_percent = 0 # Go GC target %; 400 recommended with preload [compression] enabled = true @@ -159,9 +161,9 @@ Every config field can also be set via an environment variable, which takes prec | ----------------------------------- | ------------------------------------------------ | | `STATIC_SERVER_ADDR` | `server.addr` | | `STATIC_SERVER_TLS_ADDR` | `server.tls_addr` | +| `STATIC_SERVER_REDIRECT_HOST` | `server.redirect_host` | | `STATIC_SERVER_TLS_CERT` | `server.tls_cert` | | `STATIC_SERVER_TLS_KEY` | `server.tls_key` | -| `STATIC_SERVER_READ_HEADER_TIMEOUT` | `server.read_header_timeout` | | `STATIC_SERVER_READ_TIMEOUT` | `server.read_timeout` | | `STATIC_SERVER_WRITE_TIMEOUT` | `server.write_timeout` | | `STATIC_SERVER_IDLE_TIMEOUT` | `server.idle_timeout` | @@ -170,9 +172,11 @@ Every config field can also be set via an environment variable, which takes prec | `STATIC_FILES_INDEX` | `files.index` | | `STATIC_FILES_NOT_FOUND` | `files.not_found` | | `STATIC_CACHE_ENABLED` | `cache.enabled` | +| `STATIC_CACHE_PRELOAD` | `cache.preload` | | `STATIC_CACHE_MAX_BYTES` | `cache.max_bytes` | | `STATIC_CACHE_MAX_FILE_SIZE` | `cache.max_file_size` | | `STATIC_CACHE_TTL` | `cache.ttl` | +| `STATIC_CACHE_GC_PERCENT` | `cache.gc_percent` | | `STATIC_COMPRESSION_ENABLED` | `compression.enabled` | | `STATIC_COMPRESSION_MIN_SIZE` | `compression.min_size` | | `STATIC_COMPRESSION_LEVEL` | `compression.level` | @@ -212,6 +216,7 @@ Then in `config.toml`: [server] addr = ":8080" tls_addr = ":8443" +redirect_host = "localhost" tls_cert = "server.crt" tls_key = "server.key" ``` @@ -519,6 +524,17 @@ docker run --rm -p 8080:8080 \ docker kill --signal=HUP ``` +**Maximum throughput with preload (Docker env vars):** + +```bash +docker run --rm -p 8080:8080 \ + -v "$(pwd)/public:/public:ro" \ + -e STATIC_FILES_ROOT=/public \ + -e STATIC_CACHE_PRELOAD=true \ + -e STATIC_CACHE_GC_PERCENT=400 \ + static-web:latest +``` + --- ## Health Checks and Readiness Probes @@ -565,7 +581,7 @@ healthcheck: ## Live Cache Flush (SIGHUP) -Send `SIGHUP` to flush the in-memory LRU cache without restarting the server. This is useful after deploying updated static files to disk — new requests will read fresh content from disk and repopulate the cache. +Send `SIGHUP` to flush both the in-memory LRU file cache and the path-safety cache without restarting the server. This is useful after deploying updated static files to disk — new requests will read fresh content from disk and repopulate the cache. ```bash # by PID @@ -578,7 +594,57 @@ systemctl kill --signal=HUP static-web.service docker kill --signal=HUP ``` -> **Important:** SIGHUP only flushes the cache. It does **not** reload the configuration. Config changes require a full restart. +> **Important:** SIGHUP flushes the file cache and the path-safety cache. It does **not** reload the configuration. Config changes require a full restart. + +--- + +## Preloading for Maximum Performance + +Enable `preload` to read every eligible file into the in-memory cache at startup. Combined with the fasthttp engine, this yields the highest possible throughput — up to **~141,000 req/sec** on Apple M-series (**55% faster than Bun's native static serve**, while including full security headers, TLS, and compression). + +### Configuration + +```toml +[cache] +enabled = true +preload = true # load all files under [files.root] into RAM at startup +gc_percent = 400 # reduce GC frequency for throughput (default: 0 = Go default 100) +``` + +Or via CLI flags: + +```bash +static-web --preload --gc-percent 400 ./dist +``` + +Or via environment variables: + +```bash +STATIC_CACHE_PRELOAD=true STATIC_CACHE_GC_PERCENT=400 ./bin/static-web +``` + +### What preloading does + +1. At startup, walks every file under `files.root`. +2. Files smaller than `max_file_size` are read into the LRU cache. +3. Pre-formatted `Content-Type` and `Content-Length` response headers are computed once per file. +4. The path-safety cache (`sync.Map`) is pre-warmed — the first request for any preloaded file skips `filepath.EvalSymlinks`. +5. Preload statistics (file count, total bytes, duration) are logged at startup. + +### When to use preload + +- **Ideal**: bounded set of static files (SPA builds, marketing sites, docs sites). +- **Not recommended**: very large file trees where total size exceeds `max_bytes`, or directories with frequent file changes. + +### GC tuning + +`gc_percent` sets the Go runtime `GOGC` target. A higher value means the GC runs less often, trading memory for throughput. The handler's hot path is allocation-free, and fasthttp reuses per-connection buffers (unlike net/http which allocates per-request). Recommended values: + +| `gc_percent` | Behaviour | +|---|---| +| `0` | Do not change (Go default: 100) | +| `200` | Moderate: ~5% throughput boost | +| `400` | Aggressive: ~8% throughput boost (recommended with preload) | --- @@ -674,7 +740,6 @@ Directory listing is **disabled by default** (`directory_listing = false`). Enab | Limitation | Impact | Workaround | | ------------------------------------- | ---------------------------------------------------------------- | ---------------------------------------------------------------------------------- | | **Brotli on-the-fly not implemented** | Brotli encoding requires pre-compressed `.br` files. | Run `make precompress` as part of your build pipeline. | -| **Cache TTL not enforced** | `cache.ttl` is parsed but entries never expire on their own. | Use SIGHUP to flush the cache after deploying new files. | | **No hot config reload** | SIGHUP flushes the cache only; config changes require a restart. | Use a process manager (systemd, Docker restart policy) for zero-downtime restarts. | --- @@ -700,13 +765,13 @@ The server only accepts `GET`, `HEAD`, and `OPTIONS`. Any other method (POST, PU ### Files are stale after a deploy -The in-memory cache serves files from memory after the first request. After deploying new files to disk, flush the cache: +The in-memory cache serves files from memory after the first request (or immediately if `preload = true`). After deploying new files to disk, flush both the file cache and the path-safety cache: ```bash kill -HUP $(pgrep static-web) ``` -If `cache.ttl` is set, note that it is currently parsed but **not enforced** — entries do not expire automatically. SIGHUP is the only way to clear them. +If `cache.ttl` is `0`, entries remain cached until eviction pressure or SIGHUP flush. If `cache.ttl` is greater than `0`, stale entries are evicted automatically on access. ### Compression not working diff --git a/benchmark/baremetal.sh b/benchmark/baremetal.sh new file mode 100755 index 0000000..504920e --- /dev/null +++ b/benchmark/baremetal.sh @@ -0,0 +1,286 @@ +#!/usr/bin/env bash +# ============================================================================= +# baremetal.sh — Bare-metal benchmark: static-web vs Bun +# +# Builds static-web from source, benchmarks two servers on the same port one +# at a time, then prints a head-to-head comparison. No Docker. +# +# Configurations tested: +# 1. static-web --preload --gc-percent 400 (production optimised) +# 2. Bun native static HTML server +# +# Usage: +# ./benchmark/baremetal.sh [OPTIONS] +# +# Options: +# -c Connections (default: 50) +# -n Total requests (default: 100000) +# -d Duration seconds — overrides -n when set +# -p Port to use (default: 8080) +# -r Root directory (default: ./public) +# -h Show this help +# +# Requirements: +# - go (to build static-web) +# - bun (https://bun.sh) +# - bombardier (https://github.com/codesenberg/bombardier) +# ============================================================================= +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +RESULTS_DIR="${SCRIPT_DIR}/results" + +# ---------- defaults --------------------------------------------------------- +CONNECTIONS=100 +REQUESTS=100000 +DURATION="" +PORT=8080 +ROOT_DIR="./public" +WARMUP_REQUESTS=50000 +SETTLE_SECONDS=3 # pause between server start and warmup + +# ---------- colours ---------------------------------------------------------- +RED='\033[0;31m'; YELLOW='\033[1;33m'; GREEN='\033[0;32m' +CYAN='\033[0;36m'; BOLD='\033[1m'; DIM='\033[2m'; RESET='\033[0m' + +# ---------- arg parse -------------------------------------------------------- +usage() { + grep '^#' "$0" | grep -v '^#!/' | sed 's/^# \{0,2\}//' + exit 0 +} + +while getopts "c:n:d:p:r:h" opt; do + case $opt in + c) CONNECTIONS="$OPTARG" ;; + n) REQUESTS="$OPTARG" ;; + d) DURATION="$OPTARG" ;; + p) PORT="$OPTARG" ;; + r) ROOT_DIR="$OPTARG" ;; + h) usage ;; + *) echo "Unknown option -$OPTARG"; exit 1 ;; + esac +done + +# ---------- dependency checks ------------------------------------------------ +check_deps() { + local missing="" + command -v go >/dev/null 2>&1 || missing="$missing go" + command -v bun >/dev/null 2>&1 || missing="$missing bun" + command -v bombardier >/dev/null 2>&1 || missing="$missing bombardier" + + if [ -n "$missing" ]; then + echo -e "${RED}Missing dependencies:${missing}${RESET}" + echo "" + echo "Install bombardier: brew install bombardier" + echo "Install bun: curl -fsSL https://bun.sh/install | bash" + exit 1 + fi +} + +# ---------- helpers ---------------------------------------------------------- +BIN="${PROJECT_ROOT}/static-web" +SERVER_PID="" + +cleanup() { + if [ -n "$SERVER_PID" ] && kill -0 "$SERVER_PID" 2>/dev/null; then + kill "$SERVER_PID" 2>/dev/null + wait "$SERVER_PID" 2>/dev/null || true + fi +} +trap cleanup EXIT + +wait_for_port() { + local port=$1 max=15 i=0 + while ! curl -sf -o /dev/null "http://localhost:${port}/" 2>/dev/null; do + sleep 0.5 + i=$((i + 1)) + if [ "$i" -ge "$max" ]; then + echo -e " ${RED}TIMEOUT${RESET}" + return 1 + fi + done +} + +kill_on_port() { + local pids + pids=$(lsof -ti :"$1" 2>/dev/null || true) + if [ -n "$pids" ]; then + echo "$pids" | xargs kill -9 2>/dev/null || true + sleep 1 + fi +} + +run_bombardier() { + local url=$1 + if [ -n "$DURATION" ]; then + bombardier -c "$CONNECTIONS" -d "${DURATION}s" -l --print r "$url" 2>/dev/null + else + bombardier -c "$CONNECTIONS" -n "$REQUESTS" -l --print r "$url" 2>/dev/null + fi +} + +parse_rps() { awk '/Reqs\/sec/{print $2; exit}'; } +parse_p50() { awk '/50\%/{print $2; exit}'; } +parse_p99() { awk '/99\%/{print $2; exit}'; } +parse_tp() { awk '/Throughput/{print $2; exit}'; } + +# ---------- main ------------------------------------------------------------- +main() { + check_deps + + mkdir -p "$RESULTS_DIR" + + # Resolve root to absolute path + local abs_root + abs_root="$(cd "$PROJECT_ROOT" && cd "$ROOT_DIR" 2>/dev/null && pwd)" || { + echo -e "${RED}Root directory not found: ${ROOT_DIR}${RESET}" + exit 1 + } + + echo "" + echo -e "${BOLD}╔════════════════════════════════════════════════════════════════════╗${RESET}" + echo -e "${BOLD}║ Bare-Metal Benchmark: static-web vs Bun ║${RESET}" + echo -e "${BOLD}╚════════════════════════════════════════════════════════════════════╝${RESET}" + echo "" + + if [ -n "$DURATION" ]; then + echo -e " ${CYAN}Mode: duration ${DURATION}s${RESET}" + else + echo -e " ${CYAN}Mode: ${REQUESTS} requests${RESET}" + fi + echo -e " ${CYAN}Connections: ${CONNECTIONS}${RESET}" + echo -e " ${CYAN}Warmup: ${WARMUP_REQUESTS} requests${RESET}" + echo -e " ${CYAN}Port: ${PORT}${RESET}" + echo -e " ${CYAN}Root: ${abs_root}${RESET}" + echo -e " ${CYAN}Tool: $(bombardier --version 2>&1 | head -1)${RESET}" + echo -e " ${CYAN}Go: $(go version | awk '{print $3}')${RESET}" + echo -e " ${CYAN}Bun: $(bun --version)${RESET}" + echo -e " ${CYAN}Date: $(date -u '+%Y-%m-%d %H:%M:%S UTC')${RESET}" + echo -e " ${CYAN}OS/Arch: $(uname -s)/$(uname -m)${RESET}" + echo "" + + # ---- use pre-built static-web binary ------------------------------------- + echo -e "${BOLD}→ Using pre-built static-web: ${BIN}${RESET}" + if [ ! -x "$BIN" ]; then + echo -e "${RED}Binary not found or not executable: ${BIN}${RESET}" + echo -e "${RED}Run: go build -ldflags=\"-s -w\" -o static-web ./cmd/static-web${RESET}" + exit 1 + fi + echo "" + + # Make sure port is free + kill_on_port "$PORT" + + local URL="http://localhost:${PORT}/index.html" + + # Result arrays (indexed: 0=preload, 1=bun) + local -a NAMES RPS_ARR P50_ARR P99_ARR TP_ARR + NAMES=("static-web (preload+gc400)" "Bun") + + # ====================================================================== + # Test 1: static-web --preload --gc-percent 400 (production mode) + # ====================================================================== + echo -e "${BOLD}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${RESET}" + echo -e "${BOLD} [ static-web — production: --preload --gc-percent 400 ]${RESET}" + echo -e "${BOLD}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${RESET}" + + "$BIN" --quiet --no-compress --preload --gc-percent 400 --port "$PORT" "$abs_root" & + SERVER_PID=$! + sleep "$SETTLE_SECONDS" + wait_for_port "$PORT" + echo -e " ${GREEN}Server ready (PID ${SERVER_PID})${RESET}" + + echo -e " ${DIM}Warming up (${WARMUP_REQUESTS} requests)...${RESET}" + bombardier -c "$CONNECTIONS" -n "$WARMUP_REQUESTS" --print i "$URL" >/dev/null 2>&1 + echo -e " ${DIM}Settle (${SETTLE_SECONDS}s)...${RESET}" + sleep "$SETTLE_SECONDS" + + echo -e " ${CYAN}Benchmarking...${RESET}" + local raw + raw=$(run_bombardier "$URL" | tee "${RESULTS_DIR}/baremetal-static-web-preload.txt") + echo "" + + RPS_ARR[0]=$(echo "$raw" | parse_rps) + P50_ARR[0]=$(echo "$raw" | parse_p50) + P99_ARR[0]=$(echo "$raw" | parse_p99) + TP_ARR[0]=$(echo "$raw" | parse_tp) + + kill "$SERVER_PID" 2>/dev/null; wait "$SERVER_PID" 2>/dev/null || true + SERVER_PID="" + sleep 1 + kill_on_port "$PORT" + + # ====================================================================== + # Test 2: Bun static serve + # ====================================================================== + echo -e "${BOLD}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${RESET}" + echo -e "${BOLD} [ Bun — native static HTML server ]${RESET}" + echo -e "${BOLD}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${RESET}" + + (cd "$abs_root" && bun --port "$PORT" ./index.html) & + SERVER_PID=$! + sleep "$SETTLE_SECONDS" + wait_for_port "$PORT" + echo -e " ${GREEN}Server ready (PID ${SERVER_PID})${RESET}" + + echo -e " ${DIM}Warming up (${WARMUP_REQUESTS} requests)...${RESET}" + bombardier -c "$CONNECTIONS" -n "$WARMUP_REQUESTS" --print i "$URL" >/dev/null 2>&1 + echo -e " ${DIM}Settle (${SETTLE_SECONDS}s)...${RESET}" + sleep "$SETTLE_SECONDS" + + echo -e " ${CYAN}Benchmarking...${RESET}" + raw=$(run_bombardier "$URL" | tee "${RESULTS_DIR}/baremetal-bun.txt") + echo "" + + RPS_ARR[1]=$(echo "$raw" | parse_rps) + P50_ARR[1]=$(echo "$raw" | parse_p50) + P99_ARR[1]=$(echo "$raw" | parse_p99) + TP_ARR[1]=$(echo "$raw" | parse_tp) + + kill "$SERVER_PID" 2>/dev/null; wait "$SERVER_PID" 2>/dev/null || true + SERVER_PID="" + + # ====================================================================== + # Rank results (descending by RPS — simple swap for 2 elements) + # ====================================================================== + local -a SORTED_IDX=(0 1) + if awk "BEGIN{exit !(${RPS_ARR[1]} > ${RPS_ARR[0]})}" 2>/dev/null; then + SORTED_IDX=(1 0) + fi + + # ====================================================================== + # Print results table + # ====================================================================== + echo "" + echo -e "${BOLD}╔════════════════════════════════════════════════════════════════════╗${RESET}" + echo -e "${BOLD}║ Bare-Metal Results ║${RESET}" + echo -e "${BOLD}╠════════════════════════════════════════════════════════════════════╣${RESET}" + printf "${BOLD}║ %-4s %-30s %10s %8s %8s ║${RESET}\n" \ + "#" "Server" "Req/sec" "p50" "p99" + echo -e "${BOLD}╠════════════════════════════════════════════════════════════════════╣${RESET}" + + local rank=1 + for idx in "${SORTED_IDX[@]}"; do + local colour medal + if [ "$rank" -eq 1 ]; then + colour="$GREEN"; medal="1st" + else + colour="$YELLOW"; medal="2nd" + fi + + printf "${colour}║ %-4s %-30s %10s %8s %8s ║${RESET}\n" \ + "$medal" "${NAMES[$idx]}" "${RPS_ARR[$idx]}" "${P50_ARR[$idx]}" "${P99_ARR[$idx]}" + rank=$((rank + 1)) + done + + echo -e "${BOLD}╚════════════════════════════════════════════════════════════════════╝${RESET}" + echo "" + echo -e " ${DIM}Throughput:${RESET}" + echo -e " ${DIM} preload+gc400 ${TP_ARR[0]}${RESET}" + echo -e " ${DIM} Bun ${TP_ARR[1]}${RESET}" + echo -e " ${DIM}Results saved to: ${RESULTS_DIR}/baremetal-*.txt${RESET}" + echo "" +} + +main "$@" diff --git a/benchmark/bench.sh b/benchmark/bench.sh index 8a6674b..91985db 100755 --- a/benchmark/bench.sh +++ b/benchmark/bench.sh @@ -2,16 +2,16 @@ # ============================================================================= # bench.sh — Static file server benchmark suite # -# Spins up static-web, nginx, bun, and caddy via docker-compose, runs -# bombardier against each one serving /index.html, prints a ranked summary, -# then tears everything down. +# Spins up static-web (default), static-web (preload), nginx, bun, and caddy +# via docker-compose, runs bombardier against each one serving /index.html, +# prints a ranked summary, then tears everything down. # # Usage: # ./benchmark/bench.sh [OPTIONS] # # Options: -# -c Connections (default: 125) -# -n Total requests (default: 500000) +# -c Connections (default: 50) +# -n Total requests (default: 100000) # -d Duration in seconds — overrides -n when set # -k Keep containers running after benchmark (default: tear down) # -h Show this help @@ -28,8 +28,8 @@ COMPOSE_FILE="${SCRIPT_DIR}/docker-compose.benchmark.yml" RESULTS_DIR="${SCRIPT_DIR}/results" # ---------- defaults --------------------------------------------------------- -CONNECTIONS=125 -REQUESTS=500000 +CONNECTIONS=50 +REQUESTS=100000 DURATION="" # empty = use -n mode; set seconds e.g. 30 to use -d mode KEEP=false @@ -70,9 +70,9 @@ check_deps() { } # ---------- servers (parallel indexed arrays — bash 3 compatible) ------------ -SERVER_NAMES=( "static-web" "nginx" "bun" "caddy" ) -SERVER_URLS=( "http://localhost:8001/index.html" "http://localhost:8002/index.html" "http://localhost:8003/index.html" "http://localhost:8004/index.html" ) -SERVER_COUNT=4 +SERVER_NAMES=( "static-web" "nginx" "bun" "caddy" "static-web-preload" ) +SERVER_URLS=( "http://localhost:8001/index.html" "http://localhost:8002/index.html" "http://localhost:8003/index.html" "http://localhost:8004/index.html" "http://localhost:8005/index.html" ) +SERVER_COUNT=5 # ---------- helpers ---------------------------------------------------------- wait_for_server() { @@ -80,7 +80,7 @@ wait_for_server() { local url=$2 local max=30 local i=0 - printf " Waiting for %-12s" "${name}..." + printf " Waiting for %-22s" "${name}..." while ! curl -sf -o /dev/null "$url" 2>/dev/null; do sleep 1 i=$((i + 1)) @@ -124,9 +124,9 @@ main() { mkdir -p "$RESULTS_DIR" echo "" - echo -e "${BOLD}╔══════════════════════════════════════════════════════════╗${RESET}" - echo -e "${BOLD}║ Static Web Server Benchmark Suite ║${RESET}" - echo -e "${BOLD}╚══════════════════════════════════════════════════════════╝${RESET}" + echo -e "${BOLD}╔════════════════════════════════════════════════════════════════════╗${RESET}" + echo -e "${BOLD}║ Static Web Server Benchmark Suite ║${RESET}" + echo -e "${BOLD}╚════════════════════════════════════════════════════════════════════╝${RESET}" echo "" if [ -n "$DURATION" ]; then @@ -158,7 +158,7 @@ main() { echo -e "${BOLD}→ Warming up (10 000 requests each)...${RESET}" i=0 while [ $i -lt $SERVER_COUNT ]; do - printf " %-12s" "${SERVER_NAMES[$i]}" + printf " %-22s" "${SERVER_NAMES[$i]}" bombardier -c "$CONNECTIONS" -n 10000 --print i "${SERVER_URLS[$i]}" >/dev/null 2>&1 echo -e " ${GREEN}done${RESET}" i=$((i + 1)) @@ -199,7 +199,12 @@ main() { # ---- rank by req/s (simple insertion sort, bash 3 compatible) ------------- # Build a sorted index array (descending by RPS) - SORTED_IDX=(0 1 2 3) + SORTED_IDX=() + i=0 + while [ $i -lt $SERVER_COUNT ]; do + SORTED_IDX[$i]=$i + i=$((i + 1)) + done n=${#SORTED_IDX[@]} i=1 while [ $i -lt $n ]; do @@ -223,12 +228,12 @@ main() { best_rps=${RPS[${SORTED_IDX[0]}]} - echo -e "${BOLD}╔══════════════════════════════════════════════════════════╗${RESET}" - echo -e "${BOLD}║ Results Summary ║${RESET}" - echo -e "${BOLD}╠══════════════════════════════════════════════════════════╣${RESET}" - printf "${BOLD}║ %-4s %-14s %12s %10s %10s ║${RESET}\n" \ + echo -e "${BOLD}╔════════════════════════════════════════════════════════════════════╗${RESET}" + echo -e "${BOLD}║ Results Summary ║${RESET}" + echo -e "${BOLD}╠════════════════════════════════════════════════════════════════════╣${RESET}" + printf "${BOLD}║ %-4s %-22s %12s %10s %10s ║${RESET}\n" \ "#" "Server" "Req/sec" "p50 lat" "p99 lat" - echo -e "${BOLD}╠══════════════════════════════════════════════════════════╣${RESET}" + echo -e "${BOLD}╠════════════════════════════════════════════════════════════════════╣${RESET}" rank=1 for idx in "${SORTED_IDX[@]}"; do @@ -244,15 +249,15 @@ main() { elif [ "$rank" -eq 3 ]; then colour="$YELLOW"; medal="3rd" else - colour="$RESET"; medal="4th" + colour="$RESET"; medal="${rank}th" fi - printf "${colour}║ %-4s %-14s %12s %10s %10s ║${RESET}\n" \ + printf "${colour}║ %-4s %-22s %12s %10s %10s ║${RESET}\n" \ "$medal" "$name" "$rps" "$p50" "$p99" rank=$((rank + 1)) done - echo -e "${BOLD}╚══════════════════════════════════════════════════════════╝${RESET}" + echo -e "${BOLD}╚════════════════════════════════════════════════════════════════════╝${RESET}" echo "" echo -e " Full results saved to: ${CYAN}${RESULTS_DIR}/${RESET}" echo "" diff --git a/benchmark/docker-compose.benchmark.yml b/benchmark/docker-compose.benchmark.yml index 423419d..247e077 100644 --- a/benchmark/docker-compose.benchmark.yml +++ b/benchmark/docker-compose.benchmark.yml @@ -1,19 +1,20 @@ # docker-compose.benchmark.yml -# Spins up four static file servers all serving the same public/index.html. +# Spins up five static file servers all serving the same public/index.html. # Run via: make benchmark (or ./benchmark/bench.sh directly) # # Port map: -# 8001 → static-web (built from source) +# 8001 → static-web (default config, built from source) # 8002 → nginx # 8003 → bun # 8004 → caddy +# 8005 → static-web-preload (--preload --gc-percent 400) name: static-web-benchmark services: # ------------------------------------------------------------------------- - # static-web — our server, built from source + # static-web — our server, default config (baseline) # ------------------------------------------------------------------------- static-web: build: @@ -26,9 +27,6 @@ services: environment: - SW_ROOT=/public - SW_ADDR=0.0.0.0:8080 - # Raise the GC target ratio — after warmup the cache is fully populated - # and allocation rate is very low, so running GC less often is a win. - - GOGC=400 # Ensure Go uses all available CPUs (explicit, matches container quota). - GOMAXPROCS=0 restart: unless-stopped @@ -46,7 +44,7 @@ services: restart: unless-stopped # ------------------------------------------------------------------------- - # bun — native HTML static server (bun ./index.html) + # bun — native HTML static server # ------------------------------------------------------------------------- bun: image: oven/bun:alpine @@ -54,7 +52,8 @@ services: - "8003:3000" volumes: - ../public:/www:ro - command: ["bun", "/www/index.html", "--port=3000", "--host=0.0.0.0"] + working_dir: /www + command: ["bun", "./index.html", "--port=3000", "--host=0.0.0.0"] restart: unless-stopped # ------------------------------------------------------------------------- @@ -68,3 +67,21 @@ services: - ../public:/www:ro - ./Caddyfile:/etc/caddy/Caddyfile:ro restart: unless-stopped + + # ------------------------------------------------------------------------- + # static-web-preload — our server with preload + GC tuning (production mode) + # ------------------------------------------------------------------------- + static-web-preload: + build: + context: .. + dockerfile: benchmark/Dockerfile.static-web + ports: + - "8005:8080" + volumes: + - ../public:/public:ro + environment: + - SW_ROOT=/public + - SW_ADDR=0.0.0.0:8080 + - GOMAXPROCS=0 + command: ["static-web", "--quiet", "--no-compress", "--preload", "--gc-percent", "400", "/public"] + restart: unless-stopped diff --git a/benchmark/fasthttp-hello/main.go b/benchmark/fasthttp-hello/main.go new file mode 100644 index 0000000..c418b23 --- /dev/null +++ b/benchmark/fasthttp-hello/main.go @@ -0,0 +1,40 @@ +// Bare-minimum fasthttp server: pre-allocated response, no middleware, no allocs. +// Direct comparison against the net/http hello world to isolate HTTP stack overhead. +package main + +import ( + "log" + "os" + + "github.com/valyala/fasthttp" +) + +var ( + body = []byte("Hello, World!") + contentType = []byte("text/plain") + contentLen = []byte("13") +) + +func handler(ctx *fasthttp.RequestCtx) { + ctx.Response.Header.SetBytesV("Content-Type", contentType) + ctx.Response.Header.SetBytesV("Content-Length", contentLen) + ctx.SetStatusCode(200) + ctx.SetBody(body) +} + +func main() { + port := ":8080" + if p := os.Getenv("PORT"); p != "" { + port = ":" + p + } + + s := &fasthttp.Server{ + Handler: handler, + Name: "fasthttp-hello", + } + + log.Printf("fasthttp listening on %s", port) + if err := s.ListenAndServe(port); err != nil { + log.Fatal(err) + } +} diff --git a/benchmark/raw-hello/main.go b/benchmark/raw-hello/main.go new file mode 100644 index 0000000..eaa192e --- /dev/null +++ b/benchmark/raw-hello/main.go @@ -0,0 +1,27 @@ +// Bare-minimum Go HTTP server: pre-allocated response, no middleware, no allocs. +// This establishes the net/http ceiling on this machine. +package main + +import ( + "net/http" + "os" +) + +var body = []byte("Hello, World!") + +func main() { + port := ":8080" + if p := os.Getenv("PORT"); p != "" { + port = ":" + p + } + + mux := http.NewServeMux() + mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "text/plain") + w.Header().Set("Content-Length", "13") + w.WriteHeader(200) + w.Write(body) + }) + + http.ListenAndServe(port, mux) +} diff --git a/cmd/static-web/config.toml.example b/cmd/static-web/config.toml.example index be83f49..d2b4710 100644 --- a/cmd/static-web/config.toml.example +++ b/cmd/static-web/config.toml.example @@ -13,11 +13,9 @@ tls_cert = "" # Path to TLS private key file (PEM). Leave empty to disable HTTPS. tls_key = "" -# Maximum time to read request headers from the client. -# Protects against Slowloris DoS attacks. Default 5s. -read_header_timeout = "5s" - # Maximum time to read an HTTP request from the client (headers + body). +# With fasthttp, this single timeout covers the full read phase including +# headers, providing Slowloris protection. Default 10s. read_timeout = "10s" # Maximum time to write an HTTP response to the client. @@ -43,6 +41,13 @@ not_found = "404.html" # Enable or disable the in-memory LRU file cache. enabled = true +# Preload all files into cache at startup for maximum throughput. +# When enabled, every file under [files.root] that is smaller than max_file_size +# is read into RAM when the server starts. This eliminates filesystem I/O on +# first requests and can roughly double throughput for cache-hit workloads. +# Default: false. Recommended for deployments with a bounded set of static files. +preload = false + # Maximum total bytes to store in cache (default 256 MB). max_bytes = 268435456 @@ -52,6 +57,11 @@ max_file_size = 10485760 # Optional TTL for cache entries. 0 means no expiry (flush with SIGHUP). ttl = "0s" +# Go runtime garbage collector target percentage. A higher value reduces GC +# frequency, trading memory for throughput. 0 means "do not change" (Go +# default is 100). Recommended: 400 for high-throughput preloaded deployments. +# gc_percent = 0 + [compression] # Enable gzip compression for compressible content types. enabled = true diff --git a/cmd/static-web/main.go b/cmd/static-web/main.go index 69dfa49..946d31d 100644 --- a/cmd/static-web/main.go +++ b/cmd/static-web/main.go @@ -15,18 +15,21 @@ import ( "fmt" "log" "net" - "net/http" "os" "path/filepath" "runtime" + "runtime/debug" "strconv" "strings" "github.com/BackendStack21/static-web/internal/cache" + "github.com/BackendStack21/static-web/internal/compress" "github.com/BackendStack21/static-web/internal/config" "github.com/BackendStack21/static-web/internal/handler" + "github.com/BackendStack21/static-web/internal/security" "github.com/BackendStack21/static-web/internal/server" "github.com/BackendStack21/static-web/internal/version" + "github.com/valyala/fasthttp" ) //go:embed config.toml.example @@ -80,6 +83,7 @@ func runServe(args []string) { host := fs.String("host", "", "host/IP to listen on (default: all interfaces)") port := fs.Int("p", 0, "shorthand for --port") portLong := fs.Int("port", 0, "HTTP port to listen on (default: 8080)") + redirectHost := fs.String("redirect-host", "", "canonical host for HTTP to HTTPS redirects") tlsCert := fs.String("tls-cert", "", "path to TLS certificate file (PEM)") tlsKey := fs.String("tls-key", "", "path to TLS private key file (PEM)") tlsPort := fs.Int("tls-port", 0, "HTTPS port (default: 8443)") @@ -91,6 +95,8 @@ func runServe(args []string) { // Cache. noCache := fs.Bool("no-cache", false, "disable in-memory file cache") cacheSize := fs.String("cache-size", "", "max cache size, e.g. 256MB, 1GB (default: 256MB)") + preload := fs.Bool("preload", false, "preload all files into cache at startup for maximum throughput") + gcPercent := fs.Int("gc-percent", 0, "set Go GC target percentage (0=default, 400 recommended for high throughput)") // Compression. noCompress := fs.Bool("no-compress", false, "disable response compression") @@ -133,6 +139,7 @@ func runServe(args []string) { if err := applyFlagOverrides(cfg, flagOverrides{ host: *host, port: effectivePort, + redirectHost: *redirectHost, tlsCert: *tlsCert, tlsKey: *tlsKey, tlsPort: *tlsPort, @@ -140,6 +147,8 @@ func runServe(args []string) { notFound: *notFound, noCache: *noCache, cacheSize: *cacheSize, + preload: *preload, + gcPercent: *gcPercent, noCompress: *noCompress, cors: *cors, dirListing: *dirListing, @@ -157,42 +166,79 @@ func runServe(args []string) { if !effectiveQuiet { log.Printf("static-web %s starting (addr=%s, root=%s)", version.Version, cfg.Server.Addr, cfg.Files.Root) } + if cfg.Cache.GCPercent > 0 { + old := debug.SetGCPercent(cfg.Cache.GCPercent) + if !effectiveQuiet { + log.Printf("GC target set to %d%% (was %d%%)", cfg.Cache.GCPercent, old) + } + } // Initialise the in-memory file cache (respects cfg.Cache.Enabled). var c *cache.Cache if cfg.Cache.Enabled { - c = cache.NewCache(cfg.Cache.MaxBytes) + c = cache.NewCache(cfg.Cache.MaxBytes, cfg.Cache.TTL) } else { - c = cache.NewCache(0) // zero-size cache effectively disables caching + c = nil + } + + // Preload files into cache at startup if requested. + var pathCache *security.PathCache + if c != nil && cfg.Cache.Preload { + pcfg := cache.PreloadConfig{ + MaxFileSize: cfg.Cache.MaxFileSize, + IndexFile: cfg.Files.Index, + BlockDotfiles: cfg.Security.BlockDotfiles, + CompressEnabled: cfg.Compression.Enabled, + CompressMinSize: cfg.Compression.MinSize, + CompressLevel: cfg.Compression.Level, + CompressFn: compress.GzipBytes, + HTMLMaxAge: cfg.Headers.HTMLMaxAge, + StaticMaxAge: cfg.Headers.StaticMaxAge, + ImmutablePattern: cfg.Headers.ImmutablePattern, + } + stats := c.Preload(cfg.Files.Root, pcfg) + if !effectiveQuiet { + log.Printf("preloaded %d files (%s) into cache (%d skipped)", + stats.Files, formatByteSize(stats.Bytes), stats.Skipped) + } + + // Pre-warm the path cache with every URL key the file cache knows about. + pathCache = security.NewPathCache() + pathCache.PreWarm(stats.Paths, cfg.Files.Root, cfg.Security.BlockDotfiles) + if !effectiveQuiet { + log.Printf("path cache pre-warmed with %d entries", pathCache.Len()) + } } // Build the full middleware + handler chain. - var h http.Handler + var h fasthttp.RequestHandler if effectiveQuiet { - h = handler.BuildHandlerQuiet(cfg, c) + h = handler.BuildHandlerQuiet(cfg, c, pathCache) } else { - h = handler.BuildHandler(cfg, c) + h = handler.BuildHandler(cfg, c, pathCache) } // Create the HTTP/HTTPS server. - srv := server.New(&cfg.Server, &cfg.Security, h) + serverCfg := cfg.Server + srv := server.New(&serverCfg, &cfg.Security, h) // Start listeners in the background. go func() { - if err := srv.Start(&cfg.Server); err != nil { + if err := srv.Start(&serverCfg); err != nil { log.Printf("server start error: %v", err) } }() // Block until SIGTERM/SIGINT, handling SIGHUP for live reload. ctx := context.Background() - server.RunSignalHandler(ctx, srv, c, *cfgPath, &cfg) + server.RunSignalHandler(ctx, srv, c, *cfgPath, &cfg, pathCache) } // flagOverrides groups all serve-subcommand CLI flags that can override config. type flagOverrides struct { host string port int + redirectHost string tlsCert string tlsKey string tlsPort int @@ -200,6 +246,8 @@ type flagOverrides struct { notFound string noCache bool cacheSize string + preload bool + gcPercent int noCompress bool cors string dirListing bool @@ -228,6 +276,9 @@ func applyFlagOverrides(cfg *config.Config, f flagOverrides) error { if f.tlsCert != "" { cfg.Server.TLSCert = f.tlsCert } + if f.redirectHost != "" { + cfg.Server.RedirectHost = f.redirectHost + } if f.tlsKey != "" { cfg.Server.TLSKey = f.tlsKey } @@ -249,6 +300,12 @@ func applyFlagOverrides(cfg *config.Config, f flagOverrides) error { if f.noCache { cfg.Cache.Enabled = false } + if f.preload { + cfg.Cache.Preload = true + } + if f.gcPercent != 0 { + cfg.Cache.GCPercent = f.gcPercent + } if f.cacheSize != "" { n, err := parseBytes(f.cacheSize) if err != nil { @@ -323,14 +380,33 @@ func parseBytes(s string) (int64, error) { return n * multiplier, nil } +// formatByteSize returns a human-readable string like "7.7 KB" or "256.0 MB". +func formatByteSize(b int64) string { + const ( + kb = 1024 + mb = 1024 * 1024 + gb = 1024 * 1024 * 1024 + ) + switch { + case b >= gb: + return fmt.Sprintf("%.1f GB", float64(b)/float64(gb)) + case b >= mb: + return fmt.Sprintf("%.1f MB", float64(b)/float64(mb)) + case b >= kb: + return fmt.Sprintf("%.1f KB", float64(b)/float64(kb)) + default: + return fmt.Sprintf("%d B", b) + } +} + // logConfig writes the resolved configuration to the standard logger. func logConfig(cfg *config.Config) { - log.Printf("[config] server.addr=%s tls_addr=%s tls_cert=%q tls_key=%q", - cfg.Server.Addr, cfg.Server.TLSAddr, cfg.Server.TLSCert, cfg.Server.TLSKey) + log.Printf("[config] server.addr=%s tls_addr=%s redirect_host=%q tls_cert=%q tls_key=%q", + cfg.Server.Addr, cfg.Server.TLSAddr, cfg.Server.RedirectHost, cfg.Server.TLSCert, cfg.Server.TLSKey) log.Printf("[config] files.root=%q files.index=%q files.not_found=%q", cfg.Files.Root, cfg.Files.Index, cfg.Files.NotFound) - log.Printf("[config] cache.enabled=%v cache.max_bytes=%d cache.max_file_size=%d", - cfg.Cache.Enabled, cfg.Cache.MaxBytes, cfg.Cache.MaxFileSize) + log.Printf("[config] cache.enabled=%v cache.preload=%v cache.max_bytes=%d cache.max_file_size=%d cache.gc_percent=%d", + cfg.Cache.Enabled, cfg.Cache.Preload, cfg.Cache.MaxBytes, cfg.Cache.MaxFileSize, cfg.Cache.GCPercent) log.Printf("[config] compression.enabled=%v compression.min_size=%d compression.level=%d", cfg.Compression.Enabled, cfg.Compression.MinSize, cfg.Compression.Level) log.Printf("[config] security.block_dotfiles=%v security.directory_listing=%v security.cors_origins=%v", @@ -425,6 +501,7 @@ Serve flags: --config string path to TOML config file (default "config.toml") --host string host/IP to listen on (default: all interfaces) --port, -p int HTTP port (default 8080) + --redirect-host string canonical host for HTTP to HTTPS redirects --tls-cert string path to TLS certificate (PEM) --tls-key string path to TLS private key (PEM) --tls-port int HTTPS port (default 8443) @@ -432,6 +509,8 @@ Serve flags: --404 string custom 404 page, relative to root --no-cache disable in-memory file cache --cache-size string max cache size, e.g. 256MB, 1GB (default 256MB) + --preload preload all files into cache at startup + --gc-percent int set Go GC target %% (0=default, 400 for high throughput) --no-compress disable response compression --cors string CORS origins, comma-separated or * for all --dir-listing enable directory listing diff --git a/config.toml.example b/config.toml.example index be83f49..b471f5e 100644 --- a/config.toml.example +++ b/config.toml.example @@ -7,17 +7,20 @@ addr = ":8080" # HTTPS listen address (requires tls_cert and tls_key to be set). tls_addr = ":8443" +# Canonical host used for HTTP → HTTPS redirects. +# Set this in production whenever tls_addr listens on all interfaces (e.g. ":443"). +# Example: "static.example.com" +redirect_host = "" + # Path to TLS certificate file (PEM). Leave empty to disable HTTPS. tls_cert = "" # Path to TLS private key file (PEM). Leave empty to disable HTTPS. tls_key = "" -# Maximum time to read request headers from the client. -# Protects against Slowloris DoS attacks. Default 5s. -read_header_timeout = "5s" - # Maximum time to read an HTTP request from the client (headers + body). +# With fasthttp, this single timeout covers the full read phase including +# headers, providing Slowloris protection. Default 10s. read_timeout = "10s" # Maximum time to write an HTTP response to the client. diff --git a/docs/index.html b/docs/index.html index e837315..40980f9 100644 --- a/docs/index.html +++ b/docs/index.html @@ -6,7 +6,7 @@ static-web — High-Performance Go Static File Server @@ -37,7 +37,7 @@ @@ -76,11 +76,11 @@ "operatingSystem": "Linux, macOS, Windows", "url": "https://static.21no.de", "downloadUrl": "https://github.com/BackendStack21/static-web/releases", - "softwareVersion": "1.1.0", + "softwareVersion": "1.2.0", "programmingLanguage": "Go", "license": "https://github.com/BackendStack21/static-web/blob/main/LICENSE", "codeRepository": "https://github.com/BackendStack21/static-web", - "description": "A production-grade, blazing-fast static web file server written in Go. Features in-memory LRU cache (~27 ns/op lookup), HTTP/2, TLS 1.2+, gzip and brotli compression, and comprehensive security headers.", + "description": "A production-grade, blazing-fast static web file server written in Go. ~141k req/sec with fasthttp — 55% faster than Bun. Features in-memory LRU cache, TTL-aware cache expiry, HTTP/2, TLS 1.2+, gzip and brotli compression, and comprehensive security headers.", "author": { "@type": "Person", "name": "Rolando Santamaria Maso", @@ -92,7 +92,11 @@ "priceCurrency": "USD" }, "featureList": [ - "In-memory LRU cache with ~27 ns/op lookup", + "~141k req/sec — 55% faster than Bun's native static server", + "In-memory LRU cache with ~28 ns/op lookup", + "Startup preloading with path-safety cache pre-warming", + "TTL-aware cache expiry with optional automatic stale-entry eviction", + "Direct ctx.SetBody() fast path with pre-formatted headers for cache hits", "HTTP/2 with automatic HTTPS", "TLS 1.2+ with AEAD cipher suites", "gzip and brotli compression", @@ -219,19 +223,19 @@

static-web

Production-Grade Go Static File Server

- Blazing fast, lightweight static server with in-memory LRU cache, HTTP/2, TLS, gzip / brotli, - and security headers baked in — ready for production in minutes. + Blazing fast, lightweight static server with in-memory LRU cache, startup preloading, HTTP/2, TLS, gzip / brotli, + and security headers baked in.

- ~27 ns - cache lookup + 141k + req/sec (fasthttp)
- 0 allocs - on cache hit + ~0 alloc + hot-path serving
@@ -295,11 +299,8 @@

Everything You Need

-

Zero-Alloc Hot Path

-

- LRU cache hit at ~27 ns/op with 0 allocations. os.Stat is never called on - a cache hit — pure memory. -

+

Near-Zero Alloc Hot Path

+

~141k req/sec — 55% faster than Bun. Built on fasthttp with direct ctx.SetBody() and pre-formatted headers — no formatting allocations on cache hits.

🗜️
@@ -328,10 +329,7 @@

Security Hardened

📦

Smart Caching

-

- Byte-accurate LRU cache with configurable max size, per-file size cap, ETag, and live flush via SIGHUP - without downtime. -

+

Byte-accurate LRU cache with startup preloading, configurable max size, per-file size cap, optional TTL expiry, ETag, and live flush via SIGHUP without downtime.

🔄
@@ -450,8 +448,10 @@

Getting Started

[cache] enabled = true +preload = true # load all files at startup max_bytes = 268435456 # 256 MiB max_file_size = 10485760 # 10 MiB +gc_percent = 400 # tune GC for throughput [compression] enabled = true @@ -550,19 +550,7 @@

Logging Middleware

🔐

Security Middleware

-

Method whitelist · security headers · path safety · dotfile block · CORS

-
-
- -
-
🏷️
-
-

Headers Middleware

-

ETag · 304 Not Modified · Cache-Control · immutable assets

+

Method whitelist · security headers · path safety (cached) · dotfile block · CORS