Suppose your customer needs to access some documents infrequently, only once per reporting period, and wants the most cost-effective storage option, on the condition that the documents can be retrieved within 5 minutes. What would you recommend?
The answer is S3 Glacier.
Retrieving an archive from S3 Glacier is an asynchronous operation: you first initiate a job, and once the job has completed you download its output. You can initiate an archive retrieval with the REST API, or with the equivalent AWS CLI or AWS SDK calls.
To retrieve an archive, you follow the two steps below:
1. Initiate an archive retrieval job.
→ Get the archive ID from the vault inventory, then use the “Initiate Job” operation to ask Glacier to prepare the entire archive, or a portion of it, for download.
In particular, when initiating a job, you can choose one of the following retrieval options:
Expedited
– This tier is suitable for urgent requests that need quick access to the data. With this tier, data is typically made available within 1 to 5 minutes for all but the largest archives (250 MB and larger).
Standard
– This tier typically completes retrieval within 3 to 5 hours. It is applied by default if the job request does not specify a tier.
Bulk
– This tier typically completes retrieval within 5 to 12 hours. It is the lowest-cost option and is suited to retrieving large amounts of data, even at petabyte scale.
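To make the tier choice concrete, here is a minimal sketch of the job parameters that an “Initiate Job” request carries. The helper name `build_retrieval_params` is hypothetical and exists only for illustration; the keys `Type`, `ArchiveId`, `Tier` and `Description` are the job parameter fields used for an archive retrieval.

```python
# The three retrieval tiers described above.
VALID_TIERS = ("Expedited", "Standard", "Bulk")


def build_retrieval_params(archive_id, tier="Standard", description=None):
    """Build the jobParameters for an archive retrieval job.

    Hypothetical helper for illustration. The default of "Standard" mirrors
    the tier Glacier applies when a request does not specify one.
    """
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}, got {tier!r}")
    params = {
        "Type": "archive-retrieval",  # as opposed to "inventory-retrieval"
        "ArchiveId": archive_id,
        "Tier": tier,
    }
    if description is not None:
        params["Description"] = description
    return params


if __name__ == "__main__":
    # "example-archive-id" is a placeholder, not a real archive ID.
    print(build_retrieval_params("example-archive-id", tier="Expedited"))
```

The resulting dictionary is what you would pass as `jobParameters` when calling `initiate_job` on a boto3 Glacier client.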
2. Download the bytes using the Get Job Output (GET output) operation
So, to meet our customer's need, we use the Expedited tier when initiating the job that retrieves the archive in which the necessary documents are stored, and then download the data with Get Job Output.
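The whole flow can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a production implementation: the function `retrieve_archive` and its parameters are hypothetical names, `client` is assumed to be a boto3 Glacier client (or any object exposing the same `initiate_job`, `describe_job` and `get_job_output` methods), and the archive is assumed to be small enough to read into memory in one go.

```python
import time


def retrieve_archive(client, vault_name, archive_id, tier="Expedited",
                     poll_seconds=30, timeout_seconds=600):
    """Initiate a retrieval job, wait for it to finish, return the bytes.

    Hypothetical helper; `client` is assumed to behave like
    boto3.client("glacier").
    """
    # Step 1: initiate the archive retrieval job with the chosen tier.
    job = client.initiate_job(
        accountId="-",  # "-" means the account that owns the credentials
        vaultName=vault_name,
        jobParameters={
            "Type": "archive-retrieval",
            "ArchiveId": archive_id,
            "Tier": tier,
        },
    )
    job_id = job["jobId"]

    # The job is asynchronous, so poll until Glacier reports completion.
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = client.describe_job(
            accountId="-", vaultName=vault_name, jobId=job_id
        )
        if status["Completed"]:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not complete in time")
        time.sleep(poll_seconds)

    if status.get("StatusCode") != "Succeeded":
        raise RuntimeError(f"job {job_id} ended with {status.get('StatusCode')}")

    # Step 2: download the bytes with Get Job Output.
    output = client.get_job_output(
        accountId="-", vaultName=vault_name, jobId=job_id
    )
    return output["body"].read()
```

With real credentials this would be called as `retrieve_archive(boto3.client("glacier"), "my-vault", archive_id)`, where `"my-vault"` is a placeholder vault name. Polling keeps the sketch short; for larger archives you would likely stream the body to a file in chunks instead of reading it all at once.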