ec2-bundle-vol Chunking Size

Making an AMI and storing it in S3 using the ec2-bundle-vol/ec2-upload-bundle/ec2-register trifecta in AWS, I get 36 image chunks of 10 MB each. From a readability/testing standpoint I would much prefer something like four 100 MB images or one 3.5 GB file.
I don't see an easy way to change this behavior without finding and reverse-engineering the Ruby script wrapped by ec2-bundle-vol.
Alternatively, is there a good reason for three dozen small files?

Unfortunately, there is no way to do this without modifying the Ruby script.
The chunk size is hardcoded in
$AWS_PATH/amitools/ec2/lib/ec2/amitools/bundle.rb line 14
CHUNK_SIZE = 10 * 1024 * 1024 # 10 MB in bytes.
Also, registering the bundle might not work if the chunks have a different size.
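To see why the chunk size drives the number of parts, here is a minimal Python sketch of fixed-size chunking - not the actual AMI tools code, just an illustration of what CHUNK_SIZE controls; the part naming is made up for this sketch:

    import math, os

    def split_into_chunks(path, chunk_size=10 * 1024 * 1024):
        # Number of parts is simply ceil(file size / chunk size).
        size = os.path.getsize(path)
        total = math.ceil(size / chunk_size)
        with open(path, "rb") as src:
            for i in range(total):
                # Hypothetical part naming for this sketch only.
                with open(f"{path}.part.{i:02d}", "wb") as dst:
                    dst.write(src.read(chunk_size))
        return total

    # e.g. a ~360 MB bundle: 10 MB chunks -> 36 parts, 100 MB chunks -> 4 parts.

Keep in mind that the manifest produced by ec2-bundle-vol records the parts, which is presumably why registration may break if the chunk size is changed unilaterally.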

Related

Stored procedure returning 17 million rows throws "out of memory" when accessing the dataset in Delphi

I'm using Delphi 6 to develop a Windows application and have a stored procedure which returns around 17 million rows. It takes 3 to 4 minutes to return the data in SQL Server Management Studio.
I'm getting an "out of memory" exception when I try to access the result dataset. I suspect the sp.execute might not be executing fully. Are there any steps I need to follow to fix this, or should I use sleep() to work around the issue?
Delphi 6 can only compile 32 bit executables.
32 bit executables running on 32 bit Windows have a memory limit of 2 GiB. This can be extended to 3 GiB with the /3GB boot switch.
32 bit executables running on 64 bit Windows have the same memory limit of 2 GiB. Using the "large address aware" flag they can address at most 4 GiB of memory.
32 bit Windows executables emulated via WINE under Linux or Unix cannot overcome this either, because 32 bits can store at most the number 4,294,967,295 = 2³² - 1, so the logical limit is 4 GiB no matter what.
Wanting 17 million records in currently 1.9 GiB of memory means that 1.9 * 1024 * 1024 * 1024 = 2,040,109,465 bytes divided by 17,000,000 gives a mean of just 120 bytes per record. I can hardly imagine that is enough. And that would only be the gross payload; memory for variables is still needed on top. Even if you managed to pack everything into large arrays, you'd still need plenty of overhead memory for variables.
Your software design is wrong. As James Z and Ken White already pointed out: there can't be a scenario where you need all of those records at once, much less one where the user views them all at once. I feel sorry for the poor souls who have had to use that software so far - who knows what else is misconceived in there. Memory consumption should remain at sane levels.
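Delphi isn't available here, but a rough Python sketch of the usual fix - consume the result set in bounded batches instead of materializing all 17 million rows at once; the DSN, procedure name and process_batch below are placeholders:

    import pyodbc

    BATCH_SIZE = 50_000  # keeps memory bounded regardless of the total row count

    def process_batch(rows):
        # Placeholder: aggregate, export to a file, update a summary table, etc.
        pass

    conn = pyodbc.connect("DSN=MyServer;Trusted_Connection=yes")  # hypothetical DSN
    cursor = conn.cursor()
    cursor.execute("{CALL dbo.MyHugeProcedure}")  # hypothetical procedure name

    while True:
        rows = cursor.fetchmany(BATCH_SIZE)
        if not rows:
            break
        process_batch(rows)

    conn.close()

The same idea applies in Delphi: use a forward-only/unidirectional dataset or page the query on the server, so that only a small window of rows is ever in memory.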

Suggestion on RocksDB Configuration

I am looking for suggestions on my RocksDB configuration. Our use case is to load 100 GB of key-value pairs into RocksDB and, at run time, only serve the key-value pairs from the db. Keys are 32 bytes and values are 1.6 KB in size.
What we have right now: we use Hadoop to generate a 100 GB SST file using the SstFileWriter API and save it off in S3. Each new server that comes up ingests the file using db.ingestExternalFile(..). We use an i3.large machine (15.25 GiB RAM | 2 vCPUs | 475 GiB NVMe SSD). With the current configuration:
BlockSize = 2 KB
FormatVersion = 4
Read-Write = 100% read at runtime
the P95 and average response times from RocksDB are ~1 ms, but P99 and PMAX are pretty bad. We are looking for some way to reduce the PMAX response time, which is ~10x the P95.
Thanks.
Can you load the whole db into memory using tmpfs? Basically, just copy the data to RAM and see if that helps. It may also make sense to compact the SST files in a separate job rather than ingesting at startup. This may require changing your machine configuration to be geared more towards RAM and less towards SSD storage.
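A minimal sketch of that tmpfs experiment, assuming the DB lives in /data/rocksdb and /dev/shm is a RAM-backed tmpfs mount; measure_read stands in for whatever call your RocksDB binding actually makes:

    import shutil, time

    SRC = "/data/rocksdb"      # assumed on-disk DB directory
    DST = "/dev/shm/rocksdb"   # tmpfs: reads now come from RAM

    shutil.copytree(SRC, DST)  # then point the RocksDB instance at DST

    def measure_read(key):
        # Placeholder for a real db.get(key) through your binding.
        time.sleep(0.0005)

    latencies = []
    for i in range(10_000):
        start = time.perf_counter()
        measure_read(b"key-%d" % i)
        latencies.append((time.perf_counter() - start) * 1000)  # ms

    latencies.sort()
    p95 = latencies[int(0.95 * len(latencies)) - 1]
    p99 = latencies[int(0.99 * len(latencies)) - 1]
    print(f"p95={p95:.2f} ms  p99={p99:.2f} ms  pmax={latencies[-1]:.2f} ms")

If the tail (P99/PMAX) flattens out with the data in RAM, the outliers were coming from SSD reads or page-cache misses; if not, look at ingestion/compaction behaviour instead.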

Recommended VLF File Count in SQL Server

What is the recommended VLF count for a 120 GB database in SQL Server?
I'd appreciate a quick response.
Thanks,
Govarthanan
There are many excellent articles on managing VLFs in SQL Server, but the crux of all of them is: it depends on your workload!
Some people may need really quick recovery, and allocating a large VLF upfront is better.
DB size and VLFs are not really correlated.
You may have a small DB and still do a large amount of updates. Imagine a DB storing daily stock values: it deletes all data every night and inserts new data every day. This creates a lot of log data but may not impact the mdf file size at all.
Here's an article about VLF auto-growth settings. Quoting the important section:
Up to 2014, the algorithm for how many VLFs you get when you create, grow, or auto-grow the log is based on the size in question:
Less than 1 MB, complicated, ignore this case.
Up to 64 MB: 4 new VLFs, each roughly 1/4 the size of the growth
64 MB to 1 GB: 8 new VLFs, each roughly 1/8 the size of the growth
More than 1 GB: 16 new VLFs, each roughly 1/16 the size of the growth
So if you created your log at 1 GB and it auto-grew in chunks of 512 MB to 200 GB, you’d have 8 + ((200 – 1) x 2 x 8) = 3192 VLFs. (8 VLFs from the initial creation, then 200 – 1 = 199 GB of growth at 512 MB per auto-grow = 398 auto-growths, each producing 8 VLFs.)
IMHO 3,000+ VLFs is not a huge number, but it is alarming. Since you have some idea of your DB size, and assuming you know that your logs are typically approximately n times your DB size,
you can put in the right auto-growth settings to keep your VLF count in a range you are comfortable with.
I personally would be comfortable with an initial size of 10 GB and 5 GB auto-growth.
So for 120 GB of logs (n = 1) this gives me 16 + 22*16 = 368 VLFs: 16 from the initial 10 GB, then (120 - 10) / 5 = 22 auto-growths of 5 GB, each producing 16 VLFs.
And if my logs go up to 500 GB, then I'll have 16 + 98*16 = 1584 VLFs.
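A small Python sketch of the pre-2014 algorithm quoted above, handy for checking the arithmetic with your own initial size and auto-growth increment (sizes in MB; the sub-1 MB edge case is ignored, as in the quote):

    def vlfs_per_growth(mb):
        # Pre-2014 rule: up to 64 MB -> 4 VLFs, up to 1 GB -> 8, above that -> 16.
        if mb <= 64:
            return 4
        if mb <= 1024:
            return 8
        return 16

    def total_vlfs(initial_mb, growth_mb, final_mb):
        growths = (final_mb - initial_mb) // growth_mb
        return vlfs_per_growth(initial_mb) + growths * vlfs_per_growth(growth_mb)

    # 10 GB initial, 5 GB auto-growth, 120 GB of log -> 368 VLFs
    print(total_vlfs(10 * 1024, 5 * 1024, 120 * 1024))
    # 1 GB initial, 512 MB auto-growth, 200 GB of log -> 3192 VLFs (the quoted example)
    print(total_vlfs(1024, 512, 200 * 1024))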

Maximum upload file size in PHP 5.3

What is the maximum file upload size allowed by the post_max_size and upload_max_filesize configuration options (in PHP 5.3)?
According to the manual entry about post_max_size:
Note:
PHP allows shortcuts for byte values, including K (kilo), M (mega) and G (giga).
PHP will do the conversions automatically if you use any
of these. Be careful not to exceed the 32 bit signed integer limit (if
you're using 32bit versions) as it will cause your script to fail.
Your limit could be the 32 bit signed integer limit: 2,147,483,647 bytes (2³¹ - 1) on a 32 bit build. See the PHP_INT_MAX constant to get the value for your system:
PHP_INT_MAX (integer)
The largest integer supported in this build of PHP. Usually int(2147483647). Available since PHP 4.4.0 and PHP 5.0.5
Related:
How to have 64 bit integer on PHP?
There is no real limit set by PHP on post_max_size or upload_max_filesize. However, both values must be smaller than memory_limit (which you can also modify). In any case, use values that are (much) smaller than your RAM; otherwise an attacker could send a very large upload that completely consumes your system's resources. For uploading large files it is better to use an FTP server.
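To illustrate the shorthand-versus-32-bit issue, here is a rough Python sketch that mirrors (but does not reproduce exactly) PHP's shorthand parsing, converting a setting such as "3G" to bytes and checking it against a 32 bit PHP_INT_MAX:

    INT32_MAX = 2**31 - 1  # PHP_INT_MAX on a 32 bit build

    def shorthand_to_bytes(value):
        # K, M, G multipliers, as in php.ini shorthand (e.g. upload_max_filesize = 3G).
        multipliers = {"K": 1024, "M": 1024**2, "G": 1024**3}
        value = value.strip().upper()
        if value and value[-1] in multipliers:
            return int(value[:-1]) * multipliers[value[-1]]
        return int(value)

    for setting in ["512M", "2G", "3G"]:
        size = shorthand_to_bytes(setting)
        verdict = "fits" if size <= INT32_MAX else "overflows a 32 bit signed int"
        print(f"{setting} = {size} bytes ({verdict})")

On a 32 bit build, "3G" (3,221,225,472 bytes) already exceeds PHP_INT_MAX, which is the failure mode the manual note warns about.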

Redis - configure parameters vm-page-size and vm-pages

Using Redis, I am currently tuning redis.conf to use virtual memory.
For context, I have 18 million keys (max 25 chars), stored as hashes with 4 fields each (maximum 256 chars per field).
My server has 16 GB of RAM.
I wonder how to optimize the parameters vm-page-size (more than 64?) and vm-pages.
Any ideas? Thanks.
You probably don't need to in this case - your usage is pretty close to standard. It's only when your values are large (> ~4 KB, IIRC) that you can run into issues with insufficient contiguous space.
Also, with 16 GB available there won't be much swapping happening, which makes the vm config a lot less important.
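For completeness, a back-of-the-envelope sketch of the sizing arithmetic: the swap file is vm-page-size * vm-pages bytes, and each swapped value occupies a whole number of pages. The per-value size and swap fraction below are rough assumptions based on the question, not measurements:

    import math

    KEYS = 18_000_000
    FIELDS = 4
    MAX_FIELD_BYTES = 256

    # Rough upper bound on one hash value's payload (assumption): ~1 KB.
    value_bytes = FIELDS * MAX_FIELD_BYTES

    vm_page_size = 64  # bytes; the question asks whether to go above 64
    pages_per_value = math.ceil(value_bytes / vm_page_size)

    # If, say, half the dataset had to be swapped out:
    swapped_values = KEYS // 2
    vm_pages = swapped_values * pages_per_value
    swap_file_gib = vm_pages * vm_page_size / 1024**3

    print(f"pages per value: {pages_per_value}")
    print(f"vm-pages needed: {vm_pages:,}")
    print(f"swap file size: {swap_file_gib:.1f} GiB")

Since each value stays well under ~4 KB, page fragmentation isn't a concern here, which matches the point above.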