rlucas.net: The Next Generation Rotating Header Image

aws

s3put fails with ssl.CertificateError suddenly after upgrade

We had been using periods / dots in Amazon S3 bucket names in order to create some semblance of namespace / order. Pretty common convention.

A short while ago a cron job doing backups stopped working after some Python upgrades. Specifically, we were using s3put to upload a file to “my.dotted.bucket“. The error was:

ssl.CertificateError: hostname 'my.dotted.bucket.s3.amazonaws.com' doesn't match either of '*.s3.amazonaws.com', 's3.amazonaws.com'

It turns out that per Boto issue #2836 a recent strictifying of SSL certificate validation breaks the ability to validate the SSL cert when there are extra dots on the LHS of the wildcard. Boo.

If you don’t have the luxury of monkey-patching (or actually patching) the code that sits atop this version of boto, you can put the following section into your (possibly new) ~/.boto config file:

[s3]
calling_format = boto.s3.connection.OrdinaryCallingFormat

(Of course, expect that all of the nasty MITM attacks that stricter SSL validation is meant to mitigate to come back and bite you!)

s3put just stops working with “broken pipe”

So your cron job, which has been dutifully stuffing away into s3 your backups nightly or hourly or whatever, just stops working. s3put just breaks with the unhelpful complaint, “broken pipe.”

You can try running s3put with “–debug 2” added to your flags, and watch the lower protocol-level stuff seem to go along just fine until it barfs with the same error.

Check the size of your file. If you’ve got a backup that’s been slowing creeping up in size and is now over 5.0 GB, that’s your issue. AWS apparently has a 5 GB s3 limit for single-part HTTP PUT.

s3put accepts a “–multipart” option, but only if it can find the necessary Python libraries including “filechunkio,” so install filechunkio and try again. With any luck, you can just add –multipart to your s3put command and it will Just Work.

Don’t Bother with S3FS for Mounting Amazon S3 on Mac OS X (2014)

As of January 2014, it’s not worth bothering with the “s3fs” software for mounting Amazon S3 buckets on your local filesystem.

The idea of s3fs is simple and great: use FUSE (Filesystem in Userspace) to “mount” the S3 bucket the same way you’d mount, say, an NFS drive or a partition of a disk. Manipulate the files, let s3fs sync it in the background. Sure, you lose some reliability, but we’ve had NFS and SMB and all kinds of somewhat-latent-over-an-unreliable-link-but-mostly-with-filesystem-semantics software for decades now, right?

Well, forget it. s3fs as of January 2014, used on Mac OS X and against an existing set of buckets, is so utterly unreliable as to be useless.

First, s3fs cannot “see” existing folders. This is because folders are a bit of hack on S3 and weren’t done in a standardized, documented way when s3fs was first written. However, since then, at least two other ways of creating folders on S3 have gained currency: an older, deprecated way with Firefox plugin S3Fox, and a newer defacto standard way with Amazon’s own management dashboard/browser for S3. Whatever the historical reason, you can’t see the existing folders.

Second, although from mailing list posts, theoretically you can *create* an s3fs folder with the same name as your existing folder, and then its contents will magically become visible, empirically something rather different happens. A mkdir on the s3fs mount leads to the creation of a mangled *regular* file on the S3 dashboard. Now you have two “folders,” each of which is unusable as a folder on the other (S3 dashboard, or s3fs) system. Argh.

Finally, you might say, ok, fine, this will just make me use flat-level, non-folder-nested choices about my S3 architecture. (Leave aside for the moment that the very reason you want to use S3 is probably exactly so that you can have lots and lots and lots and lots (like 10^8+) files in a way that will cripple any reasonable filesystem tools that see them all in one “directory.”) However, even that doesn’t work reliably, as s3fs demonstrated today when it went into “write-only mode” such that I could create files locally that would show up on S3 but that subsequently would disappear from my local filesystem. WTF?!?

The unfortunate answer is: S3 is not a filesystem, and it was created by people who are smarter than you, and who have very craftily calculated that if you are forced to weave in the S3 API and its limitations into your application code, you will have a damn hard time ripping it out of your infrastructure, and so they are going to have you do just that. They do not want it to be used as a filesystem, and so guess what: you are *not* going to use it as a filesystem. Not gonna happen.

Say what you will, but our hometown heroes here in Seattle are no dummies. Embrace, extend, extinguish, baby. Not just for OS companies anymore…

(Yes, I know that s3fs is not an Amazon project. But it appears to be the community’s best attempt to put filesystem semantics around S3, and that attempt has been rejected by AWS’s immune system.)