Coder Perfect

[closed] How stable is s3fs for mounting an Amazon S3 bucket as a local directory?


In terms of mounting an Amazon S3 bucket as a local directory on Linux, how stable is s3fs? Is it suitable for high-traffic production environments?

Are there any other options that are better or similar?

Would it be better to use EBS and share it with the other instances over NFS?

Asked by arod

Solution #1

There’s a decent write-up about s3fs here, but after reading it I ended up using an EBS share instead.

It highlights a few key points to keep in mind when using s3fs, particularly those stemming from S3’s inherent limitations.

As a result, whether s3fs is a viable solution depends on what you’re storing. If you’re storing images, for example, and only ever write or read whole files without modifying them incrementally, it’s fine. But if that’s your workload, why not just use S3’s API directly?
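For whole-file workloads like that, the API route is just a pair of transfers. A minimal sketch using the AWS CLI (the bucket and key names here are placeholders):

```shell
# Upload a whole file to S3 (bucket and key names are hypothetical)
aws s3 cp ./photo.jpg s3://example-bucket/images/photo.jpg

# Read it back in full -- there is no partial or incremental update in S3
aws s3 cp s3://example-bucket/images/photo.jpg ./photo.jpg
```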

If you’re talking about application data (database or log files, say) that you want to modify with small incremental writes, S3 is a non-starter: you can’t incrementally modify an object in S3.

The post above also mentions s3backer, a related project that addresses the performance issues by implementing a virtual filesystem on top of S3. That fixes the performance problems, but it introduces its own set of trade-offs.

I ended up mounting an EBS volume on an EC2 instance and sharing it over NFS. Be aware, though, that while this is the most performant option, it has one major flaw: an NFS share backed by an EBS volume is a single point of failure. If the instance sharing the volume goes down, every machine that uses the share loses access.
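A minimal sketch of that setup, assuming the EBS volume is already attached and formatted (the device name, IP addresses, and paths are placeholders):

```shell
# On the instance that owns the EBS volume:
sudo mkdir -p /srv/share
sudo mount /dev/xvdf /srv/share          # mount point for the EBS volume

# Add a line like this to /etc/exports, then reload the export table:
#   /srv/share 10.0.0.0/24(rw,sync,no_subtree_check)
sudo exportfs -ra

# On each client instance:
sudo mount -t nfs 10.0.0.5:/srv/share /mnt/share
```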

This was a risk I could live with, and it was ultimately the option I chose. I hope this information is useful.

Answered by reach4thelasers

Solution #2

Because this is an old question, I’ll describe my experience with S3FS over the last year.

It was buggy and leaked memory at first (I had a cron job remount it every two hours), but with the recent 1.73 release it’s been very stable.
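For reference, a typical s3fs mount looks something like this (the bucket name, mount point, and cache directory are placeholders):

```shell
# Credentials for s3fs, kept out of the shell history
echo "ACCESS_KEY_ID:SECRET_ACCESS_KEY" > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs

# Mount the bucket with a local cache directory (helps with repeated reads)
s3fs example-bucket /mnt/s3 -o passwd_file=~/.passwd-s3fs -o use_cache=/tmp/s3fs-cache
```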

The best part about S3FS is that it’s one less thing to worry about, and you get some performance benefits as well.

GET (95%) and PUT (5%) requests will make up the majority of your S3 traffic. If you need post-processing of uploaded files (thumbnail generation, for example), S3FS is a good fit. If you don’t need any post-processing, you shouldn’t be hitting your web server at all: upload directly to S3 instead (using CORS).

Assuming requests do reach the server, you’ll probably need to do some image post-processing. You upload to the server, then to S3 via the S3 API. If the user then wants to crop, you have to download the file from S3 again, crop it on the server, and upload it back to S3. With S3FS and local caching enabled, this orchestration is handled for you and you avoid re-downloading files from S3.
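The point is that a filesystem view collapses the download/modify/re-upload dance into ordinary file I/O. A minimal sketch, using a plain local directory to stand in for the s3fs mount point and byte-slicing in place of real image cropping (all names here are illustrative):

```python
from pathlib import Path

# Stand-in for the s3fs mount point; with s3fs, writes here become S3 uploads.
MOUNT = Path("/tmp/s3fs-demo")
MOUNT.mkdir(parents=True, exist_ok=True)

def save_upload(name: str, data: bytes) -> Path:
    """Step 1: the user's upload lands on the 'mounted' bucket."""
    path = MOUNT / name
    path.write_bytes(data)
    return path

def crop_in_place(name: str, start: int, end: int) -> bytes:
    """Step 2: 'download, crop, re-upload' becomes a read and a write.
    Byte-slicing stands in for real image cropping (e.g. with Pillow)."""
    path = MOUNT / name
    cropped = path.read_bytes()[start:end]
    path.write_bytes(cropped)
    return cropped
```

With a real s3fs mount, the same two functions would transparently pull from and push to the bucket, with the local cache absorbing the repeated reads.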

You shouldn’t need to purge the cache unless you run out of disk space, and the local cache makes tasks such as searching and filtering a lot easier.

The only thing I wish it had was full two-way sync (rsync-style). That would make it an enterprise version of Dropbox or Google Drive for S3, without the quotas and fees that come with them.

Answered by aleemb
