So you one day get the task to move or copy some objects between S3 buckets. "How difficult can it be?" you ask yourself.
Your first thought is to check the AWS S3 Console, but, to your surprise, you find the options are fairly limited. Perhaps there are other tools or methods to get the job done?
I'll start off this post, Part 1 of 2, with a short summary of the tools available at the time of writing, and will present my complete solution and recommendation in Part 2.
Moving or Copying Objects Between S3 Buckets: Available Options
Here are the currently available options that I found, with comments:
- AWS S3 console - suitable when dealing with a limited number of objects, or transferring within the same AWS account.
- AWS Command Line Interface (CLI) - Amazon's official command line tool; runs anywhere, and comes pre-installed on Amazon Linux EC2 instances.
- AWS S3 SDK - If you are ready to do some coding and write your own script.
- s3cmd from s3tools.org - command line tool written in Python. Check out Alfie La Peter's article on this.
- S3 Browser - Windows only.
- CloudBerry S3 Explorer - Windows only, and at a price.
- Bucket Explorer - Supports Windows, Mac and Linux (at a price).
- CloudBuddy - MS Office plug-in.
- S3Fox - Firefox add-on.
Copying a few objects is not much of a problem; the AWS S3 Console handles it fine, as long as you transfer within the same AWS account.
The problem gets harder when you need to copy or move between different AWS accounts, when you are dealing with A LOT of objects (thousands or even millions), and especially with nested objects (objects that appear to be within a folder when you look at them in the AWS S3 Console).
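It helps to remember why "nested" objects are tricky: S3 has no real folders, only flat keys that happen to contain slashes, which the console renders as a folder tree. A quick sketch (the bucket and key names here are made up):

```shell
# S3 keys are flat strings; slashes only *look* like folders in the console.
# A hypothetical key:
KEY="reports/2016/january/summary.csv"

# The console renders this as reports/ > 2016/ > january/ > summary.csv,
# but it is a single object with a single key.
echo "${KEY%%/*}"    # first "folder" segment: reports

# Listing a prefix recursively shows every key under it (hypothetical bucket):
#   aws s3 ls s3://my-source-bucket/reports/ --recursive
```

Any tool that copies bucket contents therefore has to walk every key, which is exactly where things slow down at scale.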
Dealing with a large number of objects
The moment you deal with millions of objects, some of the popular tools in the list above become glitchy or even crash. Those that survive often take a long time (minutes, if not hours, to copy or move a large S3 bucket) or stop responding halfway through. That is not acceptable in production environments.
I won't go into the philosophy, history, and detailed documentation in this piece. Amazon has done a great job documenting all its services and tools, and I will reference that documentation as much as I can along the way. My goal with this blog is to give you a practical solution: the fastest, most accurate, and easiest recipe to accomplish this task.
So which solution did I go with? My solution is (of course) the AWS S3 CLI tool! When dealing with AWS, I have found that most of the time Amazon itself has the best tools for working with its own environment.
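As a taste of what Part 2 covers, the basic copy is a one-liner. The bucket names below are placeholders, and the script only builds and prints the command so nothing runs against your account:

```shell
# Placeholder bucket names -- substitute your own.
SRC="my-source-bucket"
DST="my-destination-bucket"

# --recursive walks every key under the bucket, so "nested" objects
# (keys containing slashes) come along with everything else.
CMD="aws s3 cp s3://$SRC s3://$DST --recursive"
echo "$CMD"

# Variations:
#   aws s3 mv   ...  -- move instead of copy
#   aws s3 sync ...  -- copy only objects missing from the destination,
#                       handy for resuming an interrupted transfer
```

The cross-account case needs the IAM and bucket policies from the recipe below before these commands will succeed.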
Using the AWS S3 CLI tool
So here are the ingredients for this recipe:
- 2 - S3 buckets (one for each AWS account)
- 1 - IAM User - most AWS accounts may already have a few users.
- 1 - User policy for the IAM user who is going to do the copy/move.
- 1 - Bucket policy
- 1 - AWS S3 CLI tool - comes pre-installed on Amazon Linux EC2 instances.
- 1 - EC2 Linux instance, which you most probably already have; if not, it takes just a few clicks to launch a micro instance.
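To give a feel for the policy ingredients, here is a rough sketch of a source-bucket policy granting a user from the other account read access. The account ID, user name, and bucket name are all placeholders; the exact policies are covered in Part 2:

```shell
# Write a draft bucket policy to a file for review (all names are
# placeholders -- this is a sketch, not the final policy).
cat > source-bucket-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCrossAccountRead",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:user/copy-user" },
      "Action": ["s3:ListBucket", "s3:GetObject"],
      "Resource": [
        "arn:aws:s3:::my-source-bucket",
        "arn:aws:s3:::my-source-bucket/*"
      ]
    }
  ]
}
EOF

# You would attach it with (hypothetical bucket name):
#   aws s3api put-bucket-policy --bucket my-source-bucket \
#     --policy file://source-bucket-policy.json
```

Note that `s3:ListBucket` applies to the bucket ARN and `s3:GetObject` to the object ARNs, which is why both resources appear.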
Here is the link to Part 2 of this blog, where I go through the full solution and the practical steps to complete this recipe. Check it out!