Amazon Athena Performance on small S3 files [closed] - amazon-s3

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 days ago.
What kind of performance (latency) can I expect from Amazon Athena queries against an S3 bucket containing about 200 million files?
More details: each file is only 2 KB in size. Assume the file format is either CSV or Parquet (not both) and that there are about 50 searchable fields.
I could possibly use s3DistCp to merge the files to improve performance, but I don't want to go that route unless it is really needed.
Thanks for the help!
Esther
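As background for the compaction option mentioned above: Athena pays a per-object overhead on S3, so roughly 200 million 2 KB files is close to the worst case, and merging them into a smaller number of larger files (which is what s3DistCp's --groupBy grouping does) is the usual remedy. The sketch below only illustrates the merging idea locally on in-memory CSV payloads; the function name and chunk size are made up for the example, and in a real pipeline the inputs and outputs would live on S3.

```python
import csv
import io

def compact_csv_chunks(small_files, target_rows_per_chunk=4):
    """Merge many small CSV payloads (header plus a few data rows each,
    as in the 2 KB-per-file scenario) into fewer, larger CSV payloads.
    `small_files` is a list of CSV strings sharing the same header."""
    header = None
    rows = []
    for payload in small_files:
        reader = csv.reader(io.StringIO(payload))
        file_header = next(reader)
        if header is None:
            header = file_header
        rows.extend(reader)

    chunks = []
    for start in range(0, len(rows), target_rows_per_chunk):
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow(header)
        writer.writerows(rows[start:start + target_rows_per_chunk])
        chunks.append(buf.getvalue())
    return chunks

# Example: 10 tiny one-row "files" collapse into 3 larger ones.
tiny = [f"id,value\r\n{i},{i * i}\r\n" for i in range(10)]
merged = compact_csv_chunks(tiny)
```

The same effect can be had without any custom code via s3DistCp or an Athena CTAS query that rewrites the data; the point is simply that scan time is dominated by object count, not total bytes, at this file size.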

Related

Can I upload files in bzip2 to storage and then use them in bigquery? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have a bunch of (largish, 10 GB each) files in bz2 format. I would like to upload them and then run some queries on them. Does BigQuery "understand" bzip2 as it does gzip? Should I convert them? What would be the best way to upload them?
I assume the files are in CSV or JSON format. Per the BigQuery documentation (https://cloud.google.com/bigquery/preparing-data-for-loading), only gzip compression is supported. But even if bz2 were supported, it would not be a good idea to work with 10 GB compressed files. The problem is that, unlike with uncompressed files, BigQuery cannot split a compressed file into pieces; it has to process the entire 10 GB file in one worker, which is very slow.
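A practical way to follow that advice is to decompress each bz2 file and re-emit it as several smaller gzip parts before loading, so each part can be handled independently. Below is a minimal in-memory sketch using only the standard library; the function name and the tiny part size are illustrative, and a real job would stream to disk rather than hold 10 GB in memory.

```python
import bz2
import gzip

def bz2_to_gzip_parts(bz2_bytes, lines_per_part=2):
    """Decompress a bz2 payload and re-emit it as several gzip'd parts,
    splitting on line boundaries so each part stays independently loadable."""
    text = bz2.decompress(bz2_bytes).decode("utf-8")
    lines = text.splitlines(keepends=True)
    parts = []
    for start in range(0, len(lines), lines_per_part):
        chunk = "".join(lines[start:start + lines_per_part])
        parts.append(gzip.compress(chunk.encode("utf-8")))
    return parts

# Round-trip a small CSV payload: 5 lines, 2 per part -> 3 gzip parts.
original = "a,1\nb,2\nc,3\nd,4\ne,5\n"
parts = bz2_to_gzip_parts(bz2.compress(original.encode("utf-8")))
```

Splitting on line boundaries matters for CSV/JSON-lines data: a part that begins mid-record would fail to load.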

How to clean up elastic scale databases [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
While testing Elastic Scale I wanted to start over a couple of times without deleting the databases entirely. Since this might save you some time, I thought I would share the following SQL to clean up databases used for Elastic Scale shards and the ShardMapsManager.

Is it possible to sort files on Amazon S3? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Also, is it possible to set a default sort order for a bucket on Amazon S3?
For viewing, or for some additional processing?
If you want to sort the files for viewing on a Windows desktop, an inexpensive option is to install WebDrive and map a drive to Amazon S3. You can do a column sort by date, file type, name, etc.
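Worth adding for completeness: S3 itself always lists keys in lexicographic (UTF-8 binary) order of the key name and offers no server-side sort setting, so any other ordering, by date, size, extension, and so on, has to be applied client-side after listing. The sketch below sorts simulated listing entries shaped like the `Contents` items that `list_objects_v2` returns (the sample data here is made up; with boto3 it would come from `s3.list_objects_v2(Bucket=...)["Contents"]`).

```python
from datetime import datetime, timezone

# Simulated 'Contents' entries in the shape S3's ListObjectsV2 returns.
objects = [
    {"Key": "logs/b.txt", "Size": 120,
     "LastModified": datetime(2024, 1, 3, tzinfo=timezone.utc)},
    {"Key": "logs/a.txt", "Size": 300,
     "LastModified": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"Key": "logs/c.txt", "Size": 50,
     "LastModified": datetime(2024, 1, 2, tzinfo=timezone.utc)},
]

# Newest first; S3 itself will never return this order.
by_date = sorted(objects, key=lambda o: o["LastModified"], reverse=True)

# Largest first.
by_size = sorted(objects, key=lambda o: o["Size"], reverse=True)
```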

Are amazon s3 folders browsable? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I have a small cloud API built on Amazon S3 that uploads files under 20-character random names.
The files are public, but can anyone access one without knowing the filename?
Example: amazonurl.com/mybucket/myfolder/xxxxxxxxxxxxxxxxxxxxxxxxxxxxx.txt
Can someone browse mybucket or myfolder and access the file anyway?
Thanks.
No, S3 does not auto-generate an index.html the way some web servers do. You need to specify the exact object key to access a file. Note, though, that if the bucket's ACL or policy allows public listing, the ListObjects API can still enumerate the keys, so make sure list access is not public.
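This scheme is only as strong as the randomness of the names, since the key is effectively the access token. As a hedged illustration (the helper name and prefix are made up, not part of the original post), the standard library's `secrets` module generates names with enough entropy to be unguessable in practice:

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def random_object_key(prefix="myfolder/", length=20, suffix=".txt"):
    """Generate an unguessable S3 object key. 20 alphanumeric characters
    give log2(62**20), roughly 119 bits of entropy, which is infeasible
    to brute-force as long as bucket listing is not publicly allowed."""
    name = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return prefix + name + suffix

key = random_object_key()
```

The crucial caveat from the answer above still applies: random names protect nothing if the bucket permits public listing.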

How to delete a versioned S3 bucket? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
The AWS Console won't let me delete a bucket it says is empty (and whose files I deleted using the console): when I click delete, it tells me the bucket isn't empty. I assume this is because deleting the files didn't delete their earlier versions. Is this correct? If so, how can I clean out the bucket? This 2-cent charge on my credit card bill each month is getting annoying. =)
Via the AWS Forum: use Cloudberry, or BucketExplorer.
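The assumption in the question is correct: in a versioned bucket, a console delete only adds delete markers, and the bucket cannot be removed until every version and every delete marker is gone. For the programmatic route, here is a sketch of the batching logic as a pure function over listing pages shaped like `ListObjectVersions` responses (the function name and sample data are invented for illustration; with boto3 the pages would come from `s3.get_paginator("list_object_versions").paginate(Bucket=bucket)` and each batch would be passed to `s3.delete_objects(Bucket=bucket, Delete=batch)`):

```python
def version_delete_batches(pages, batch_size=1000):
    """Turn ListObjectVersions-style response pages into DeleteObjects
    payloads. Both 'Versions' and 'DeleteMarkers' entries must be removed
    before a versioned bucket can be deleted; DeleteObjects accepts at
    most 1000 keys per request, hence the batching."""
    entries = []
    for page in pages:
        for item in page.get("Versions", []) + page.get("DeleteMarkers", []):
            entries.append({"Key": item["Key"], "VersionId": item["VersionId"]})

    return [{"Objects": entries[i:i + batch_size]}
            for i in range(0, len(entries), batch_size)]

# Simulated single page: one object version plus one delete marker.
pages = [{
    "Versions": [{"Key": "file.txt", "VersionId": "v1"}],
    "DeleteMarkers": [{"Key": "file.txt", "VersionId": "v2"}],
}]
batches = version_delete_batches(pages)
```

Deleting by explicit `VersionId` is the key point; a plain delete of the key would just stack another delete marker on top.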