Bulk uploading of files as an archive
My site has some 19,000 small images which I'm offloading to the CDN, with a total size of about 80 MB. It takes several hours to upload them using a script through the API, since a single transaction takes about 0.5 to 1 second and the files must be uploaded one by one. How about uploading a .tar or a .zip file in a single transaction through the API or the control panel, and having the Cloud Files server unpack the archive into the container? This would also reduce the load on the API service when uploading small files.
Thanks for the votes, we appreciate the feedback!
This was recently added to Swift / Cloud Files.
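For reference, a minimal sketch of what that could look like against the Swift bulk (extract-archive) feature, using Python and the requests library; the storage URL, auth token, container name, and archive filename below are placeholders, and exact availability and size limits on Cloud Files may differ:

    # Hypothetical example: upload one .tar.gz and have the server expand its
    # contents into a container, instead of one request per object.
    import requests

    STORAGE_URL = "https://storage.example.com/v1/AUTH_account"  # placeholder
    TOKEN = "AUTH_token"                                          # placeholder
    CONTAINER = "images"                                          # placeholder

    with open("photos.tar.gz", "rb") as archive:
        resp = requests.put(
            f"{STORAGE_URL}/{CONTAINER}?extract-archive=tar.gz",
            headers={"X-Auth-Token": TOKEN},
            data=archive,
        )

    # The response body summarizes how many objects were created and lists any
    # per-file errors.
    print(resp.status_code, resp.text)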
Aran Reeks commented
We regularly develop sites that need to upload vast image catalogues to your CDN, often resulting in days where our server is constantly sending the images across one by one, which is terrible for everyone's bandwidth.
In an ideal world I'd love to see something like the following added to Cloud Files via the APIs:
- Ability to upload archive files (e.g. tar, gz, rar, zip) and extract them into a specified directory, either via the API or within the control panel, ideally via the API
- Ability to optionally compress content once extracted, using tools like PngCrush for PNGs and other similar libraries
After all, the whole idea of a CDN is to deliver content to consumers as quickly as possible, yet there are still simple things like these that could make it work even better.
Hope to see some movement on this soon.
Chris Caldwell commented
Throwing my hat in the ring for this: we're uploading an archive of >400,000,000 files, and we add 40-50 thousand and delete a few hundred each week. We REALLY need some sort of batch operations here.
Note that Amazon S3 currently supports multi-object delete.
Any update on when bulk delete and upload will be added?
Is there any sense from the RS guys that this will be implemented at any point?
Michael Klocker commented
+1. We have roughly 400k files on Cloud Files currently. The API allows us to remove one file in about 2-3 seconds. There is just no way to scale with a lot of small files. Bulk delete and upload are key to scaling to larger applications with larger data sets.
Glen Coates commented
I just wrote to the Cloud Files devs about this separately, but this is desperately needed for anyone streaming large numbers of small files to the CDN. A separate HTTPS request for every one is a huge waste of resources on both the client and server sides.
There are plenty of ZIP / TAR libraries available for almost every language out there, so supporting the upload of an archive would be perfect for this (something like Container.createObjectsFromArchive()).
We also need a way to bulk-remove files from Cloud Files: upload a list of files to be removed from a container in a single transaction, and have Cloud Files then do the deletions.
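A sketch of how that delete side could look with the corresponding bulk-delete call (again Python + requests; the storage URL, token, and object paths are placeholders, and the per-request limit and exact behaviour on Cloud Files may differ):

    # Hypothetical example: remove many objects in one request by POSTing a
    # newline-separated list of /container/object paths with ?bulk-delete.
    import requests

    STORAGE_URL = "https://storage.example.com/v1/AUTH_account"  # placeholder
    TOKEN = "AUTH_token"                                          # placeholder

    # One "/container/object" path per line, URL-encoded where needed.
    paths = "\n".join(f"/images/photo_{i:06d}.jpg" for i in range(1000))

    resp = requests.post(
        f"{STORAGE_URL}?bulk-delete",
        headers={"X-Auth-Token": TOKEN, "Content-Type": "text/plain"},
        data=paths,
    )

    # The response summarizes how many objects were deleted or not found,
    # replacing a thousand individual DELETE round-trips with one request.
    print(resp.status_code, resp.text)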
We encounter numerous timeouts when trying to upload our thousands of changing photos each day over the service net at DFW. If we could upload zips for unpacking into containers, we could scale up to our goal of hundreds of thousands or millions of photos a day. This is our biggest need. (We use the Java API.)
Marc Schipperheyn commented
Totally need this NOW