Use s3cmd to Download from Requester Pays Buckets on S3

List the pdf prefix itself:

$ s3cmd ls --requester-pays s3://arxiv/pdf
                       DIR   s3://arxiv/pdf/

List the files under pdf (the \* is escaped so the shell passes it to s3cmd unexpanded):

$ s3cmd ls --requester-pays s3://arxiv/pdf/\*
2010-07-29 19:56 526202880   s3://arxiv/pdf/arXiv_pdf_0001_001.tar
2010-07-29 20:08 138854400   s3://arxiv/pdf/arXiv_pdf_0001_002.tar
2010-07-29 20:14 525742080   s3://arxiv/pdf/arXiv_pdf_0002_001.tar
2010-07-29 20:33 156743680   s3://arxiv/pdf/arXiv_pdf_0002_002.tar
2010-07-29 20:38 525731840   s3://arxiv/pdf/arXiv_pdf_0003_001.tar
2010-07-29 20:52 187607040   s3://arxiv/pdf/arXiv_pdf_0003_002.tar
2010-07-29 20:58 525731840   s3://arxiv/pdf/arXiv_pdf_0004_001.tar
2010-07-29 21:11  44851200   s3://arxiv/pdf/arXiv_pdf_0004_002.tar
2010-07-29 21:14 526305280   s3://arxiv/pdf/arXiv_pdf_0005_001.tar
2010-07-29 21:27 234711040   s3://arxiv/pdf/arXiv_pdf_0005_002.tar
...

Get all files under pdf:

$ s3cmd get --requester-pays s3://arxiv/pdf/\*

Save the listing of everything under src to a text file:

$ s3cmd ls --requester-pays s3://arxiv/src/\* > all_files.txt
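The saved listing can also drive a file-by-file download, which is easier to resume than one large wildcard get. The following is only a sketch: the function name print_get_commands is made up for illustration, it assumes the four-column layout shown in the listing above (date, time, size, URI), and it assumes the object names contain no spaces. It prints the commands instead of running them, so they can be reviewed first.

```shell
# Read an `s3cmd ls` listing on stdin and print one download command per file.
# Rows without a numeric size in the third column (such as DIR entries) are skipped.
# --skip-existing asks s3cmd not to re-download files already present locally.
print_get_commands() {
  awk '$3 ~ /^[0-9]+$/ { print "s3cmd get --requester-pays --skip-existing " $4 }'
}

# Preview the commands for the first three files in the saved listing, if it exists.
if [ -f all_files.txt ]; then
  print_get_commands < all_files.txt | head -n 3
fi
```

Once the preview looks right, the output can be piped to sh to run the downloads.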

Calculate the total and average file size (the average is printed in bytes):

$ awk '{s += $3} END { print "sum is", s/1000000000, "GB, average is", s/NR }' all_files.txt
sum is 844.626 GB, average is 4.80447e+08
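The awk one-liner above divides by NR, so any row without a size (for example a DIR row) would still count toward the average. A slightly more defensive variant is sketched below. The function name summarize_listing is hypothetical, and the sample rows are copied from the pdf listing earlier in this post, so the printed numbers describe only those two files.

```shell
# Sum the size column of an `s3cmd ls` listing read from stdin.
# Only rows whose third field is a whole number are counted, so DIR rows are ignored.
summarize_listing() {
  awk '$3 ~ /^[0-9]+$/ { s += $3; n++ }
       END { if (n) printf "%d files, %.3f GB total, %.1f MB average\n", n, s/1e9, s/n/1e6 }'
}

# Two file rows and one DIR row in the same format as the s3cmd output.
summarize_listing <<'EOF'
                       DIR   s3://arxiv/pdf/
2010-07-29 19:56 526202880   s3://arxiv/pdf/arXiv_pdf_0001_001.tar
2010-07-29 20:08 138854400   s3://arxiv/pdf/arXiv_pdf_0001_002.tar
EOF
# prints: 2 files, 0.665 GB total, 332.5 MB average
```

For the real listing, run it as: summarize_listing < all_files.txt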

Install s3cmd beta / alpha on OS X

Beta released: 1.1.0-beta2, which now supports CloudFront invalidation.

Install from a source checkout; --record writes the list of installed files to install.log:

sudo python setup.py install --record install.log

/Users/humanerrorprocessor/Git/s3cmd$ sudo python setup.py install
Password:
Using xml.etree.ElementTree for XML processing
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/S3
copying S3/__init__.py -> build/lib/S3
copying S3/AccessLog.py -> build/lib/S3
copying S3/ACL.py -> build/lib/S3
copying S3/BidirMap.py -> build/lib/S3
copying S3/CloudFront.py -> build/lib/S3
copying S3/Config.py -> build/lib/S3
copying S3/ConnMan.py -> build/lib/S3
copying S3/Exceptions.py -> build/lib/S3
copying S3/FileDict.py -> build/lib/S3
copying S3/FileLists.py -> build/lib/S3
copying S3/HashCache.py -> build/lib/S3
copying S3/MultiPart.py -> build/lib/S3
copying S3/PkgInfo.py -> build/lib/S3
copying S3/Progress.py -> build/lib/S3
copying S3/S3.py -> build/lib/S3
copying S3/S3Uri.py -> build/lib/S3
copying S3/SimpleDB.py -> build/lib/S3
copying S3/SortedDict.py -> build/lib/S3
copying S3/Utils.py -> build/lib/S3
running build_scripts
creating build/scripts-2.7
copying and adjusting s3cmd -> build/scripts-2.7
changing mode of build/scripts-2.7/s3cmd from 644 to 755
running install_lib
creating /Library/Python/2.7/site-packages/S3
copying build/lib/S3/__init__.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/AccessLog.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/ACL.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/BidirMap.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/CloudFront.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/Config.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/ConnMan.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/Exceptions.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/FileDict.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/FileLists.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/HashCache.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/MultiPart.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/PkgInfo.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/Progress.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/S3.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/S3Uri.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/SimpleDB.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/SortedDict.py -> /Library/Python/2.7/site-packages/S3
copying build/lib/S3/Utils.py -> /Library/Python/2.7/site-packages/S3
byte-compiling /Library/Python/2.7/site-packages/S3/__init__.py to __init__.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/AccessLog.py to AccessLog.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/ACL.py to ACL.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/BidirMap.py to BidirMap.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/CloudFront.py to CloudFront.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/Config.py to Config.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/ConnMan.py to ConnMan.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/Exceptions.py to Exceptions.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/FileDict.py to FileDict.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/FileLists.py to FileLists.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/HashCache.py to HashCache.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/MultiPart.py to MultiPart.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/PkgInfo.py to PkgInfo.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/Progress.py to Progress.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/S3.py to S3.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/S3Uri.py to S3Uri.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/SimpleDB.py to SimpleDB.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/SortedDict.py to SortedDict.pyc
byte-compiling /Library/Python/2.7/site-packages/S3/Utils.py to Utils.pyc
running install_scripts
copying build/scripts-2.7/s3cmd -> /usr/local/bin
changing mode of /usr/local/bin/s3cmd to 755
running install_data
creating /System/Library/Frameworks/Python.framework/Versions/2.7/share
creating /System/Library/Frameworks/Python.framework/Versions/2.7/share/doc
creating /System/Library/Frameworks/Python.framework/Versions/2.7/share/doc/packages
creating /System/Library/Frameworks/Python.framework/Versions/2.7/share/doc/packages/s3cmd
copying README -> /System/Library/Frameworks/Python.framework/Versions/2.7/share/doc/packages/s3cmd
copying INSTALL -> /System/Library/Frameworks/Python.framework/Versions/2.7/share/doc/packages/s3cmd
copying NEWS -> /System/Library/Frameworks/Python.framework/Versions/2.7/share/doc/packages/s3cmd
creating /System/Library/Frameworks/Python.framework/Versions/2.7/share/man
creating /System/Library/Frameworks/Python.framework/Versions/2.7/share/man/man1
copying s3cmd.1 -> /System/Library/Frameworks/Python.framework/Versions/2.7/share/man/man1
running install_egg_info
Writing /Library/Python/2.7/site-packages/s3cmd-1.5.0_alpha3-py2.7.egg-info
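
The install.log written by the --record option is useful later, because setup.py has no uninstall command. One common approach is to delete every path the log lists. The sketch below assumes the log holds one path per line; the function name remove_recorded_files is made up for illustration. It removes files only, so empty directories such as site-packages/S3 are left behind.

```shell
# Delete every path listed in a file produced by `setup.py install --record`.
# Reads line by line so that paths containing spaces are handled correctly.
remove_recorded_files() {
  while IFS= read -r path; do
    if [ -n "$path" ]; then
      rm -f -- "$path"
    fi
  done < "$1"
}

# Usage (left commented out because it deletes files; a system-wide install
# would need the same privileges that were used to install):
# remove_recorded_files install.log
```

Reading the log before running this is worthwhile, since every listed path is removed without confirmation.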

Setup

s3cmd --configure

Usage

s3cmd sync -rP --guess-mime-type --delete-removed --no-preserve --cf-invalidate --exclude '.DS_Store' /path-to-files/ s3://bucket/

Post Update

You can now use brew to install the beta or alpha version:

brew install s3cmd --devel
brew install s3cmd --HEAD