Hitachi Content Platform (HCP) S3 - How do I disable or delete previous versions?

I am (unfortunately) using Hitachi Content Platform for S3 object storage, and I need to sync around 400 images to a bucket every 2 minutes. The filenames are always the same, and the sync "updates" the original file with the latest image.
Originally, I was unable to overwrite existing files. Unlike other platforms, HCP will not update a file that already exists while versioning is disabled; it returns a 409 and won't store the file. I've therefore enabled versioning, which allows the files to be overwritten.
The issue now is that HCP is set to retain old versions for 0 days on my bucket (which my S3 admin says should cause it to retain no versions) and "Keep deleted versions" is also disabled, yet the bucket keeps filling up with objects (400 files every 2 minutes = ~288K versions per day). It seems to cap out at that amount: after the first day it stays at ~288K permanently, which suggests the old versions are eventually removed after 1 day.
Here's an example script that simulates the problem:
# Generate 400 files with the current date/time in them
for i in $(seq -w 1 400); do
    echo $(date +'%Y%m%d%H%M%S') > "file_${i}.txt"
done
# Sync the current directory to the bucket
aws --endpoint-url $HCP_HOST s3 sync . s3://$HCP_BUCKET/
# Run this a few times to simulate the 2 minute upload cycle
The initial sync is very quick, taking less than 5 seconds, but over the course of the day it becomes slower and slower as the bucket accumulates versions, eventually sometimes taking over 2 minutes to sync the files (which is bad, since I need to sync the files every 2 minutes).
If I try to list the objects in the bucket after 1 day, only 400 files come back in the list, but it can take over 1 minute to return (which is why I need to add --cli-read-timeout 0):
# List all the files in the bucket
aws --endpoint-url $HCP_HOST s3 ls s3://$HCP_BUCKET/ --cli-read-timeout 0 --summarize
# Output
Total Objects: 400
Total Size: 400
I can also list and see all of the old unwanted versions:
# List object versions and parse output with jq
aws --endpoint-url $HCP_HOST s3api list-object-versions --bucket $HCP_BUCKET --cli-read-timeout 0 | jq -c '.Versions[] | {"key": .Key, "version_id": .VersionId, "latest": .IsLatest}'
Output:
{"key":"file_001.txt","version_id":"107250810359745","latest":false}
{"key":"file_001.txt","version_id":"107250814851905","latest":false}
{"key":"file_001.txt","version_id":"107250827750849","latest":false}
{"key":"file_001.txt","version_id":"107250828383425","latest":false}
{"key":"file_001.txt","version_id":"107251210538305","latest":false}
{"key":"file_001.txt","version_id":"107251210707777","latest":false}
{"key":"file_001.txt","version_id":"107251210872641","latest":false}
{"key":"file_001.txt","version_id":"107251212449985","latest":false}
{"key":"file_001.txt","version_id":"107251212455681","latest":false}
{"key":"file_001.txt","version_id":"107251212464001","latest":false}
{"key":"file_001.txt","version_id":"107251212470209","latest":false}
{"key":"file_001.txt","version_id":"107251212644161","latest":false}
{"key":"file_001.txt","version_id":"107251212651329","latest":false}
{"key":"file_001.txt","version_id":"107251217133185","latest":false}
{"key":"file_001.txt","version_id":"107251217138817","latest":false}
{"key":"file_001.txt","version_id":"107251217145217","latest":false}
{"key":"file_001.txt","version_id":"107251217150913","latest":false}
{"key":"file_001.txt","version_id":"107251217156609","latest":false}
{"key":"file_001.txt","version_id":"107251217163649","latest":false}
{"key":"file_001.txt","version_id":"107251217331201","latest":false}
{"key":"file_001.txt","version_id":"107251217343617","latest":false}
{"key":"file_001.txt","version_id":"107251217413505","latest":false}
{"key":"file_001.txt","version_id":"107251217422913","latest":false}
{"key":"file_001.txt","version_id":"107251217428289","latest":false}
{"key":"file_001.txt","version_id":"107251217433537","latest":false}
{"key":"file_001.txt","version_id":"107251344110849","latest":true}
// ...
I thought I could just run a job that cleans up the old versions on a regular basis, but when I try to delete an old version it fails with an error:
# Try deleting an old version for the file_001.txt key
aws --endpoint-url $HCP_HOST s3api delete-object --bucket $HCP_BUCKET --key "file_001.txt" --version-id 107250810359745
# Error
An error occurred (NotImplemented) when calling the DeleteObject operation:
Only the current version of an object can be deleted.
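For reference, the kind of cleanup job I had in mind looks roughly like this (a minimal boto3 sketch, reusing the HCP_HOST and HCP_BUCKET environment variables from above; on HCP every delete of a non-current version fails with the NotImplemented error shown above):
import os
import boto3

# Minimal sketch: delete every non-current version in the bucket.
s3 = boto3.client("s3", endpoint_url=os.environ["HCP_HOST"])
bucket = os.environ["HCP_BUCKET"]

paginator = s3.get_paginator("list_object_versions")
for page in paginator.paginate(Bucket=bucket):
    for version in page.get("Versions", []):
        if not version["IsLatest"]:
            # Works against S3-compatible stores that allow versioned deletes;
            # on HCP this call raises NotImplemented.
            s3.delete_object(
                Bucket=bucket,
                Key=version["Key"],
                VersionId=version["VersionId"],
            )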
I've tested this using MinIO and AWS S3 and my use-case works perfectly fine on both of those platforms.
Is there anything I'm doing incorrectly, or is there a setting in HCP that I'm missing that could make it so I can overwrite objects on sync while retaining no previous versions? Alternatively, is there a way to manually delete the previous versions?

Related

How to get information on latest successful pod deployment in OpenShift 3.6

I am currently working on a CI/CD script to deploy a complex environment into another environment. We have multiple technologies involved, and I want to optimize this script because it takes too much time to fetch information on each environment.
For the OpenShift 3.6 part, I need to get the last successful deployment of each application in a specific project. I tried to find a quick way to do so, but so far this is the only solution I have found:
oc rollout history dc -n <Project_name>
This will give me the following output
deploymentconfigs "<Application_name>"
REVISION STATUS CAUSE
1 Complete config change
2 Complete config change
3 Failed manual change
4 Running config change
deploymentconfigs "<Application_name2>"
REVISION STATUS CAUSE
18 Complete config change
19 Complete config change
20 Complete manual change
21 Failed config change
....
I then take this output and parse each line to find the latest revision that has the status "Complete" (a rough sketch of this parsing step is shown after the list below).
In the above example, I would get this list:
<Application_name> : 2
<Application_name2> : 20
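For illustration, the parsing step could look roughly like this (a hypothetical Python sketch, assuming the plain-text output of oc rollout history dc shown above; latest_complete_revisions is just a made-up name):
import re
import subprocess

def latest_complete_revisions(project):
    # Run `oc rollout history dc -n <project>` and capture its plain-text output.
    out = subprocess.run(
        ["oc", "rollout", "history", "dc", "-n", project],
        capture_output=True, text=True, check=True,
    ).stdout

    latest = {}
    current_app = None
    for line in out.splitlines():
        header = re.match(r'deploymentconfigs "(.+)"', line)
        if header:
            current_app = header.group(1)
            continue
        parts = line.split()
        # Revision lines look like: "2 Complete config change"
        if current_app and len(parts) >= 2 and parts[0].isdigit() and parts[1] == "Complete":
            latest[current_app] = max(int(parts[0]), latest.get(current_app, 0))
    return latest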
Then for each application and each revision I do:
oc rollout history dc/<Application_name> -n <Project_name> --revision=<Latest_Revision>
In the above example, the Latest_Revision for Application_name is 2, which is the latest revision that is Complete rather than running or failed.
This gives me the output with the information I need, namely the version of the EAR and the version of the configuration that were used to build the image for this successful deployment.
But since I have multiple applications, this process can take up to 2 minutes per environment.
Does anybody have a better way of fetching the information I require?
Unless I am mistaken, it looks like there is no one-liner that can get this information for the currently running and accessible applications.
Thanks
Assuming that the currently active deployment is the latest successful one, you may try the following:
oc get dc -a --no-headers | awk '{print "oc rollout history dc "$1" --revision="$2}' | . /dev/stdin
It gets a list of deployments, feeds it to awk to extract the name ($1) and revision ($2), compiles your command to extract the details, and finally sends it to standard input to execute. It may be frowned upon for not using xargs or the like, but I found it easier for debugging (just drop the last part and see the commands printed out).
UPDATE:
On second thoughts, you might actually like this one better:
oc get dc -a -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.spec.template.spec.containers[0].env}{"\n\t"}{.spec.template.spec.containers[0].image}{"\n-------\n"}{end}'
The example output:
daily-checks
[map[name:SQL_QUERIES_DIR value:daily-checks/]]
docker-registry.default.svc:5000/ptrk-testing/daily-checks#sha256:b299434622b5f9e9958ae753b7211f1928318e57848e992bbf33a6e9ee0f6d94
-------
jboss-webserver31-tomcat
registry.access.redhat.com/jboss-webserver-3/webserver31-tomcat7-openshift#sha256:b5fac47d43939b82ce1e7ef864a7c2ee79db7920df5764b631f2783c4b73f044
-------
jtask
172.30.31.183:5000/ptrk-testing/app-txeq:build
-------
lifebicycle
docker-registry.default.svc:5000/ptrk-testing/lifebicycle#sha256:a93cfaf9efd9b806b0d4d3f0c087b369a9963ea05404c2c7445cc01f07344a35
You get the idea: with expressions like .spec.template.spec.containers[0].env you can reach specific variables, labels, etc. Unfortunately, jsonpath output is not available with oc rollout history.
UPDATE 2:
You could also use post-deployment hooks to collect the data, if you can set up a listener for the hooks. Hopefully the information you need is inherited by the PODs. More info here: https://docs.openshift.com/container-platform/3.10/dev_guide/deployments/deployment_strategies.html#lifecycle-hooks

Ceph s3 bucket space not freeing up

I have been testing Ceph with S3.
My test environment is 3 nodes, each with a 10GB data disk, so 30GB in total.
It is set to replicate 3 times, so I have "15290 MB" of space available.
I got the S3 bucket working and kept uploading files until the storage filled up. I tried to remove those files, but the disks still show as full:
cluster 4ab8d087-1802-4c10-8c8c-23339cbeded8
 health HEALTH_ERR
        3 full osd(s)
        full flag(s) set
 monmap e1: 3 mons at {ceph-1=xxx.xxx.xxx.3:6789/0,ceph-2=xxx.xxx.xxx.4:6789/0,ceph-3=xxx.xxx.xxx.5:6789/0}
        election epoch 30, quorum 0,1,2 ceph-1,ceph-2,ceph-3
 osdmap e119: 3 osds: 3 up, 3 in
        flags full,sortbitwise,require_jewel_osds
  pgmap v2224: 164 pgs, 13 pools, 4860 MB data, 1483 objects
        14715 MB used, 575 MB / 15290 MB avail
        164 active+clean
I am not sure how to get the disk space back. Can anyone advise on what I have done wrong or missed?
I'm a beginner with Ceph and had the same problem.
Try running the garbage collector.
List what will be deleted:
radosgw-admin gc list --include-all
Then run it:
radosgw-admin gc process
If that didn't work (as it didn't for me, for most of my data):
Find the pool that holds your data:
ceph df
Usually your S3 data goes in the default pool default.rgw.buckets.data
Purge every object from it. /!\ You will lose all your data! /!\
rados purge default.rgw.buckets.data --yes-i-really-really-mean-it
I don't know yet why Ceph is not purging this data itself (still learning...).
Thanks to Julien for this info.
You are right about steps 1 and 2. When you run
radosgw-admin gc list --include-all
you will see output like:
[
    {
        "tag": "17925483-8ff6-4aaf-9db2-1eafeccd0454.94098.295\u0000",
        "time": "2017-10-27 13:51:58.0.493358s",
        "objs": [
            {
                "pool": "default.rgw.buckets.data",
                "oid": "17925483-8ff6-4aaf-9db2-1eafeccd0454.24248.3__multipart_certs/boot2docker.iso.2~UQ4MH7uZgQyEd3nDZ9hFJr8TkvldwTp.1",
                "key": "",
                "instance": ""
            },
            {
                "pool": "default.rgw.buckets.data",
                "oid": "17925483-8ff6-4aaf-9db2-1eafeccd0454.24248.3__shadow_certs/boot2docker.iso.2~UQ4MH7uZgQyEd3nDZ9hFJr8TkvldwTp.1_1",
                "key": "",
                "instance": ""
            }, ....
If you notice the time
2017-10-27 13:51:58.0.493358s
then when running
radosgw-admin gc process
it will only clear/remove parts that are older than that time field.
E.g. I can run "radosgw-admin gc process" over and over again, but the files won't be removed until after "2017-10-27 13:51:58.0.493358s".
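To see how many entries are already past their time field (and would therefore be removed by a plain gc run), something like this rough Python sketch could work (it assumes radosgw-admin is on the PATH, that gc list prints the JSON shown above, and it ignores time-zone subtleties):
import json
import subprocess
from datetime import datetime

# Count GC entries whose time field has already passed.
raw = subprocess.run(
    ["radosgw-admin", "gc", "list", "--include-all"],
    capture_output=True, text=True, check=True,
).stdout
entries = json.loads(raw)

now = datetime.now()
ready = 0
for entry in entries:
    # Times look like "2017-10-27 13:51:58.0.493358s"; keep second precision only.
    ts = datetime.strptime(entry["time"][:19], "%Y-%m-%d %H:%M:%S")
    if ts <= now:
        ready += 1
print(f"{ready} of {len(entries)} GC entries are past their time field")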
But you are also right that
rados purge default.rgw.buckets.data --yes-i-really-really-mean-it
works as well
You can list all entries to be processed by GC (garbage collection) with:
radosgw-admin gc list --include-all
Then you can check that GC will run after the specified time. To run it manually:
radosgw-admin gc process --include-all
It will start the garbage collection process, and with "--include-all" it will process all entries, including unexpired ones.
Then you can check the progress of clean-up with:
watch -c "ceph -s"
or simply check with "ceph -s" that all the buckets that were supposed to be deleted are gone. Documentation on the GC settings can be found here:
https://docs.ceph.com/en/quincy/radosgw/config-ref/#garbage-collection-settings

How to remove multiple S3 buckets at once?

I have a dozen buckets that I would like to remove in AWS S3, all with a similar name containing bucket-to-remove and with some objects in them.
Using the UI is quite slow. Is there a way to remove all these buckets quickly using the CLI?
You could try this sample line to delete all at once. Remember that this is highly destructive, so I hope you know what you are doing:
for bucket in $(aws s3 ls | awk '{print $3}' | grep my-bucket-pattern); do aws s3 rb "s3://${bucket}" --force ; done
That's it. It may take a while depending on the number of buckets and their contents.
I did this
aws s3api list-buckets \
--query 'Buckets[?starts_with(Name, `bucket-pattern`) == `true`].[Name]' \
--output text | xargs -I {} aws s3 rb s3://{} --force
Then update the bucket pattern as needed.
Be careful though, this is a pretty dangerous operation.
The absolute easiest way to bulk delete S3 buckets is to not write any code at all. I use Cyberduck to browse my S3 account and delete buckets and their contents quite easily.
Using boto3 you cannot delete buckets that have objects in them, so you first need to remove the objects before deleting the bucket. The easiest solution is a simple Python script such as:
import boto3
import botocore
import json

s3_client = boto3.client(
    "s3",
    aws_access_key_id="<your key id>",
    aws_secret_access_key="<your secret access key>"
)

response = s3_client.list_buckets()
for bucket in response["Buckets"]:
    # Only removes the buckets with the name you want.
    if "bucket-to-remove" in bucket["Name"]:
        s3_objects = s3_client.list_objects_v2(Bucket=bucket["Name"])
        # Deletes the objects in the bucket before deleting the bucket.
        if "Contents" in s3_objects:
            for s3_obj in s3_objects["Contents"]:
                rm_obj = s3_client.delete_object(
                    Bucket=bucket["Name"], Key=s3_obj["Key"])
                print(rm_obj)
        rm_bucket = s3_client.delete_bucket(Bucket=bucket["Name"])
        print(rm_bucket)
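Note that list_objects_v2 returns at most 1,000 keys per call, so larger buckets would need pagination. A shorter variant using the higher-level boto3 resource API, which pages through the listing and batches the deletes for you, could look like this (a sketch, reusing the same "bucket-to-remove" name filter):
import boto3

s3 = boto3.resource("s3")

for bucket in s3.buckets.all():
    # Only removes the buckets with the name you want.
    if "bucket-to-remove" in bucket.name:
        bucket.objects.all().delete()  # empties the bucket (non-versioned objects)
        bucket.delete()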
Here is a Windows solution.
First test the filter before you delete:
aws s3 ls ^| findstr "<search here>"
and then execute
for /f "tokens=3" %a in ('<copy the correct command between the quotes>') do aws s3 rb s3://%a --force
According to the S3 docs, you can remove a bucket using the CLI command aws s3 rb only if the bucket does not have versioning enabled. If that's the case, you can write a simple bash script to get the bucket names and delete them one by one, like:
#!/bin/bash
# get buckets list => returns the timestamp + bucket name separated by lines
S3LS="$(aws s3 ls | grep 'bucket-name-pattern')"
# split the lines into an array. #see https://stackoverflow.com/a/13196466/6569593
oldIFS="$IFS"
IFS='
'
IFS=${IFS:0:1}
lines=( $S3LS )
IFS="$oldIFS"

for line in "${lines[@]}"
do
    BUCKET_NAME=${line:20:${#line}} # remove timestamp
    aws s3 rb "s3://${BUCKET_NAME}" --force
done
Be careful not to remove important buckets! I recommend printing each bucket name before actually removing it. Also be aware that the aws s3 rb command takes a while to run, because it recursively deletes all the objects inside the bucket.
To delete all S3 buckets in your account, use the technique below; it works very well when run locally.
Step 1: Export your profile using the command below, or export access_key and secret_access_key locally instead.
export AWS_PROFILE=<Your-Profile-Name>
Step 2: Run the Python code below locally and all of your S3 buckets will be deleted.
import boto3

client = boto3.client('s3', region_name='us-east-2')
response = client.list_buckets()

for bucket in response['Buckets']:
    s3 = boto3.resource('s3')
    s3_bucket = s3.Bucket(bucket['Name'])
    bucket_versioning = s3.BucketVersioning(bucket['Name'])
    if bucket_versioning.status == 'Enabled':
        s3_bucket.object_versions.delete()
    else:
        s3_bucket.objects.all().delete()
    response = client.delete_bucket(Bucket=bucket['Name'])
If you see an error like "boto3 not found", install boto3 with pip.
I have used a Lambda function for deleting buckets with a specified prefix.
It deletes all the objects regardless of whether versioning is enabled or not.
Note that you should give your Lambda appropriate S3 access.
import boto3

s3_client = boto3.client('s3')
s3 = boto3.resource('s3')

def lambda_handler(event, context):
    bucket_prefix = "your prefix"
    response = s3_client.list_buckets()
    for bucket in response["Buckets"]:
        # Only removes the buckets with the name you want.
        if bucket_prefix in bucket["Name"]:
            s3_bucket = s3.Bucket(bucket['Name'])
            bucket_versioning = s3.BucketVersioning(bucket['Name'])
            if bucket_versioning.status == 'Enabled':
                s3_bucket.object_versions.delete()
            else:
                s3_bucket.objects.all().delete()
            response = s3_client.delete_bucket(Bucket=bucket['Name'])
    return {
        'message': f"deleting buckets with prefix {bucket_prefix} was successful"
    }
If you're using PowerShell, this will work:
Foreach($x in (aws s3api list-buckets --query 'Buckets[?starts_with(Name, `name-pattern`) == `true`].[Name]' --output text))
{aws s3 rb s3://$x --force}
The best option I have found is to use Cyberduck. You can select all the buckets in the GUI and delete them together.

Hadoop jobs getting poor locality

I have some fairly simple Hadoop streaming jobs that look like this:
yarn jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar \
-files hdfs:///apps/local/count.pl \
-input /foo/data/bz2 \
-output /user/me/myoutput \
-mapper "cut -f4,8 -d," \
-reducer count.pl \
-combiner count.pl
The count.pl script is just a simple script that accumulates counts in a hash and prints them out at the end - the details are probably not relevant but I can post it if necessary.
The input is a directory containing 5 files encoded with bz2 compression, roughly the same size as each other, for a total of about 5GB (compressed).
When I look at the running job, it has 45 mappers, but they're all running on one node. The particular node changes from run to run, but always only one node. Therefore I'm achieving poor data locality as data is transferred over the network to this node, and probably achieving poor CPU usage too.
The entire cluster has 9 nodes, all the same basic configuration. The blocks of the data for all 5 files are spread out among the 9 nodes, as reported by the HDFS Name Node web UI.
I'm happy to share any requested info from my configuration, but this is a corporate cluster and I don't want to upload any full config files.
It looks like this previous thread [ why map task always running on a single node ] is relevant but not conclusive.
EDIT: at @jtravaglini's suggestion I tried the following variation and saw the same problem - all 45 map jobs running on a single node:
yarn jar \
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0.2.0.6.0-101.jar \
wordcount /foo/data/bz2 /user/me/myoutput
At the end of the output of that task in my shell, I see:
Launched map tasks=45
Launched reduce tasks=1
Data-local map tasks=18
Rack-local map tasks=27
which is the number of data-local tasks you'd expect to see on one node just by chance alone.
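As a rough sanity check of that expectation (a small calculation, assuming the default HDFS replication factor of 3):
# With 9 nodes and an assumed replication factor of 3, a given node holds a
# replica of roughly a third of the blocks, so if all maps land on one node,
# about a third of them will happen to be data-local.
nodes = 9
replication = 3      # assumed HDFS default
map_tasks = 45

expected_data_local = map_tasks * replication / nodes
print(expected_data_local)  # 15.0 -- in the same ballpark as the observed 18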

How to delete files recursively from an S3 bucket

I have the following folder structure in S3. Is there a way to recursively remove all files under a certain folder (say foo/bar1 or foo or foo/bar2/1 ..)
foo/bar1/1/..
foo/bar1/2/..
foo/bar1/3/..
foo/bar2/1/..
foo/bar2/2/..
foo/bar2/3/..
With the latest aws-cli Python command line tools, recursively deleting all the files under a folder in a bucket is just:
aws s3 rm --recursive s3://your_bucket_name/foo/
Or delete everything under the bucket:
aws s3 rm --recursive s3://your_bucket_name
If what you want is to actually delete the bucket, there is a one-step shortcut:
aws s3 rb --force s3://your_bucket_name
which will remove the contents in that bucket recursively then delete the bucket.
Note: the s3:// protocol prefix is required for these commands to work
This used to require a dedicated API call per key (file), but has been greatly simplified due to the introduction of Amazon S3 - Multi-Object Delete in December 2011:
Amazon S3's new Multi-Object Delete gives you the ability to
delete up to 1000 objects from an S3 bucket with a single request.
See my answer to the related question delete from S3 using api php using wildcard for more on this and respective examples in PHP (the AWS SDK for PHP supports this since version 1.4.8).
Most AWS client libraries have meanwhile introduced dedicated support for this functionality one way or another, e.g.:
Python
You can achieve this with the excellent boto Python interface to AWS roughly as follows (untested, off the top of my head):
import boto
s3 = boto.connect_s3()
bucket = s3.get_bucket("bucketname")
bucketListResultSet = bucket.list(prefix="foo/bar")
result = bucket.delete_keys([key.name for key in bucketListResultSet])
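For newer code, a roughly equivalent sketch with boto3 (also untested here; filter() paginates the listing and delete() batches the keys into multi-object delete requests of up to 1,000 at a time):
import boto3

# Delete every object under the "foo/bar" prefix in the given bucket.
s3 = boto3.resource("s3")
bucket = s3.Bucket("bucketname")
bucket.objects.filter(Prefix="foo/bar").delete()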
Ruby
This is available since version 1.24 of the AWS SDK for Ruby and the release notes provide an example as well:
bucket = AWS::S3.new.buckets['mybucket']
# delete a list of objects by keys, objects are deleted in batches of 1k per
# request. Accepts strings, AWS::S3::S3Object, AWS::S3::ObjectVersion and
# hashes with :key and :version_id
bucket.objects.delete('key1', 'key2', 'key3', ...)
# delete all of the objects in a bucket (optionally with a common prefix as shown)
bucket.objects.with_prefix('2009/').delete_all
# conditional delete, loads and deletes objects in batches of 1k, only
# deleting those that return true from the block
bucket.objects.delete_if{|object| object.key =~ /\.pdf$/ }
# empty the bucket and then delete the bucket, objects are deleted in batches of 1k
bucket.delete!
Or:
AWS::S3::Bucket.delete('your_bucket', :force => true)
You might also consider using Amazon S3 Lifecycle to create an expiration for files with the prefix foo/bar1.
Open the S3 console in your browser and click a bucket, then click Properties and then Lifecycle.
Create an expiration rule for all files with the prefix foo/bar1 and set the expiration to 1 day after the file was created.
Save and all matching files will be gone within 24 hours.
Just don't forget to remove the rule after you're done!
No API calls, no third party libraries, apps or scripts.
I just deleted several million files this way.
Note: if the Prefix of the Lifecycle Rule is left blank, the rule affects all keys in the bucket.
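The same kind of rule can also be created programmatically; for instance, a minimal boto3 sketch (the bucket name and rule ID are placeholders):
import boto3

# Expire everything under the foo/bar1/ prefix one day after creation.
s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="your_bucket_name",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-foo-bar1",
                "Filter": {"Prefix": "foo/bar1/"},
                "Status": "Enabled",
                "Expiration": {"Days": 1},
            }
        ]
    },
)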
The top-voted answer is missing a step.
Per aws s3 help:
Currently, there is no support for the use of UNIX style wildcards in a command's path arguments. However, most commands have --exclude "<value>" and --include "<value>" parameters that can achieve the desired result. [...] When there are multiple filters, the rule is the filters that appear later in the command take precedence over filters that appear earlier in the command. For example, if the filter parameters passed to the command were --exclude "*" --include "*.txt", all files will be excluded from the command except for files ending with .txt
aws s3 rm --recursive s3://bucket/ --exclude="*" --include="/folder_path/*"
With the s3cmd package installed on a Linux machine, you can do this:
s3cmd rm s3://foo/bar --recursive
In case you want to remove all objects with the "foo/" prefix using the AWS SDK for Java 2.0:
import java.util.ArrayList;
import java.util.Iterator;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;
//...
ListObjectsRequest listObjectsRequest = ListObjectsRequest.builder()
        .bucket(bucketName)
        .prefix("foo/")
        .build();

ListObjectsResponse objectsResponse = s3Client.listObjects(listObjectsRequest);

while (true) {
    ArrayList<ObjectIdentifier> objects = new ArrayList<>();
    for (Iterator<?> iterator = objectsResponse.contents().iterator(); iterator.hasNext(); ) {
        S3Object s3Object = (S3Object) iterator.next();
        objects.add(
            ObjectIdentifier.builder()
                .key(s3Object.key())
                .build()
        );
    }

    s3Client.deleteObjects(
        DeleteObjectsRequest.builder()
            .bucket(bucketName)
            .delete(
                Delete.builder()
                    .objects(objects)
                    .build()
            )
            .build()
    );

    if (objectsResponse.isTruncated()) {
        objectsResponse = s3Client.listObjects(listObjectsRequest);
        continue;
    }
    break;
}
In case you are using the AWS SDK for Ruby V2:
s3.list_objects(bucket: bucket_name, prefix: "foo/").contents.each do |obj|
  next if obj.key == "foo/"
  resp = s3.delete_object({
    bucket: bucket_name,
    key: obj.key,
  })
end
Please note: everything under the "foo/" prefix in the bucket will be deleted.
To delete all the versions of the objects under a specific folder, pass the path /folder/subfolder/ to the Prefix:
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket("my-bucket-name")
bucket.object_versions.filter(Prefix="foo/bar1/1/").delete()
I just removed all files from my bucket by using PowerShell:
Get-S3Object -BucketName YOUR_BUCKET | % { Remove-S3Object -BucketName YOUR_BUCKET -Key $_.Key -Force:$true }
Just saw that Amazon added a "How to Empty a Bucket" option to the AWS console menu:
http://docs.aws.amazon.com/AmazonS3/latest/UG/DeletingaBucket.html
The best way is to use a lifecycle rule to delete the whole bucket contents. Programmatically, you can use the following code (PHP) to PUT the lifecycle rule:
$expiration = array('Date' => date('U', strtotime('GMT midnight')));
$result = $s3->putBucketLifecycle(array(
    'Bucket' => 'bucket-name',
    'Rules' => array(
        array(
            'Expiration' => $expiration,
            'ID' => 'rule-name',
            'Prefix' => '',
            'Status' => 'Enabled',
        ),
    ),
));
In the above case, all the objects will be deleted starting from the Date "today, GMT midnight".
You can also specify Days as follows, but with Days it will wait at least 24 hours (1 day is the minimum) before starting to delete the bucket contents.
$expiration = array('Days' => 1);
I needed to do the following...
def delete_bucket
  s3 = init_amazon_s3
  s3.buckets['BUCKET-NAME'].objects.each do |obj|
    obj.delete
  end
end

def init_amazon_s3
  config = YAML.load_file("#{Rails.root}/config/s3.yml")
  AWS.config(:access_key_id => config['access_key_id'], :secret_access_key => config['secret_access_key'])
  s3 = AWS::S3.new
end
s3cmd del --recursive s3://your_bucket --force