I had set up MinIO as a way to self-host S3 buckets for an unrelated project. As a way to force myself to test that setup regularly (and to make my VPS setup more enterprise-y), I opted to host my site(s) from there. Previously I just had the files on an XFS filesystem like a caveman and had Caddy serve them.
Using Caddy is really nice with its automatic SSL, sane config files and whatnot. Thus I wanted to keep using it. So instead of simply serving static files via the `file_server` directive, it now reverse proxies public S3 buckets served by MinIO. This, at first, seems pretty straightforward. A basic MinIO setup provides public buckets via a URL like `minio.host/$BUCKET/$OBJECT`. `$OBJECT` can, of course, be an identifier that resembles a directory structure. So the initial hunch was to simply configure something like this in the Caddyfile:
```
rewrite * /$BUCKETNAME{uri}
reverse_proxy minio:9000
```
This works. Somewhat. Obviously it doesn't serve `index.html` when the request points towards a directory. This is bad. All routes on this damn page rely on this to work… Actually, it's even worse. By default MinIO serves a listing of all files in the bucket if you request `/`. So all subdirectories are just broken, because the object does not actually exist, and the root of the page is an ugly XML listing of all files. Not good.
To prevent MinIO from providing a file listing for publicly readable buckets, simply remove the following actions from the access policy: `s3:ListBucket` and `s3:ListBucketMultipartUploads`. Or, the other way around: you only want the permissions `s3:GetObject` and `s3:GetBucketLocation`.
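For reference, a minimal anonymous-read policy containing only those two actions could look like this (the bucket name `janw.name` is just an example here):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": ["*"] },
      "Action": ["s3:GetBucketLocation"],
      "Resource": ["arn:aws:s3:::janw.name"]
    },
    {
      "Effect": "Allow",
      "Principal": { "AWS": ["*"] },
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::janw.name/*"]
    }
  ]
}
```

Note that `s3:GetBucketLocation` applies to the bucket itself, while `s3:GetObject` applies to the objects inside it, hence the two separate statements.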
Now, MinIO will return 403 when trying to access `/` and 404 when trying to access a directory. We'll let Caddy handle both errors by simply trying the same route again, with `/index.html` appended to it:
```
rewrite * /$BUCKETNAME{uri}
reverse_proxy minio:9000 {
	@error status 403 404
	handle_response @error {
		rewrite * {uri}/index.html
		reverse_proxy minio:9000 {
			@nestedError status 404
			handle_response @nestedError {
				respond "not found" 404
			}
		}
	}
}
```
This retries the request when the first one returns 403 or 404. Only if the second attempt also returns 404 do we present "not found" to the end user.
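The fallback logic can be sketched in a few lines of Python, with a dict standing in for the bucket (the object names are made up, purely for illustration):

```python
# Sketch of the fallback: look up the object, and on a miss retry with
# "/index.html" appended. The in-memory dict stands in for the bucket.
bucket = {"blog/index.html": "<html>hello</html>"}

def fetch(key):
    # MinIO returns the object, or 403/404 once listing is disabled.
    return (200, bucket[key]) if key in bucket else (404, None)

def serve(uri):
    key = uri.lstrip("/")
    status, body = fetch(key)
    if status in (403, 404):
        status, body = fetch(key + "/index.html")
        if status == 404:
            return 404, "not found"
    return status, body

print(serve("/blog"))     # (200, '<html>hello</html>')
print(serve("/missing"))  # (404, 'not found')
```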
So. Done?
Not quite… In S3 one just pretends that object names are fully qualified paths. Right now, we always append `/index.html` to the request. This works fine for https://janw.name/blog but falls apart if the request URL is https://janw.name/blog/. That's because the second one ends up as a request for the object `blog//index.html`, which does not exist. Only `blog/index.html` exists. We'll need to trim the trailing slash if it is present in the request. This can be done by adding the following to the configuration:
```
@pathWithSlash path_regexp dir (.+)/$
handle @pathWithSlash {
	rewrite @pathWithSlash {re.dir.1}
}
```
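For illustration, this is what the `(.+)/$` regexp does, sketched in Python (Caddy uses Go's regexp engine, but the behaviour is the same for this pattern):

```python
import re

def trim_trailing_slash(path: str) -> str:
    # Mirrors the @pathWithSlash matcher: "(.+)/$" captures everything
    # before a trailing slash; without a match the path stays untouched.
    m = re.search(r"(.+)/$", path)
    return m.group(1) if m else path

print(trim_trailing_slash("/blog/"))  # /blog
print(trim_trailing_slash("/blog"))   # /blog
```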
We can then wrap the whole thing in a nice template like so:
```
(s3page) {
	@pathWithSlash path_regexp dir (.+)/$
	handle @pathWithSlash {
		rewrite @pathWithSlash {re.dir.1}
	}
	rewrite * /{args[0]}{uri}
	reverse_proxy minio:9000 {
		@error status 403 404
		handle_response @error {
			rewrite * {uri}/index.html
			reverse_proxy minio:9000 {
				@nestedError status 404
				handle_response @nestedError {
					respond "not found" 404
				}
			}
		}
	}
}
```
And then use the template like so:
```
janw.name {
	import s3page "janw.name"
}
```
In my case I simply hardcoded the MinIO instance on my internal network into the template (`minio:9000`). But this could be made configurable like the bucket name if required.
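As a sketch of that variant (untested, but using the same `{args[...]}` mechanism as the bucket name), the upstream could be passed as a second argument:

```
(s3page) {
	@pathWithSlash path_regexp dir (.+)/$
	handle @pathWithSlash {
		rewrite @pathWithSlash {re.dir.1}
	}
	rewrite * /{args[0]}{uri}
	reverse_proxy {args[1]} {
		# same 403/404 handling as above
	}
}

janw.name {
	import s3page "janw.name" "minio:9000"
}
```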