Estimating how much memory is available in PostgreSQL buffer caches?

I am tuning effective_cache_size for my PostgreSQL db. The PostgreSQL documentation refers to the memory expected to be available in the PostgreSQL buffer caches when estimating the memory available for disk caching. How do I estimate this? Is shared_buffers the only memory allocated for buffer caching?

effective_cache_size represents the total memory of the machine minus whatever you know is used for something other than disk caching.
From Greg Smith's 5-Minute Introduction to PostgreSQL Performance:
effective_cache_size should be set to how much memory is leftover for
disk caching after taking into account what's used by the operating
system, dedicated PostgreSQL memory, and other applications
shared_buffers counts as "dedicated PostgreSQL memory" in this sentence, but other than that it is not correlated with effective_cache_size.
On Linux, if you run free while your system is at its typical memory usage (all applications running and caches warm), the cached field gives a good value for effective_cache_size.
If you use monitoring tools that produce graphs, you can look at the cached size over long periods of time at a glance.
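If you want to turn that observation into a number automatically, here is a minimal sketch that reads the Cached figure from /proc/meminfo (the same value free reports) and prints it in a form PostgreSQL accepts. It assumes Linux and that the measurement is taken at typical load with warm caches; the function name is just for this sketch.

    # Sketch: suggest an effective_cache_size from the kernel's page cache size.
    # Assumes Linux (/proc/meminfo) and a system at typical load with warm caches.
    def suggest_effective_cache_size(meminfo_path="/proc/meminfo"):
        fields = {}
        with open(meminfo_path) as f:
            for line in f:
                key, rest = line.split(":", 1)
                fields[key] = int(rest.strip().split()[0])   # values are reported in kB

        cached_kb = fields.get("Cached", 0)
        return "{}MB".format(cached_kb // 1024)              # PostgreSQL accepts sizes with units

    if __name__ == "__main__":
        print("effective_cache_size =", suggest_effective_cache_size())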

One typical suggestion for a dedicated Postgres server is to set effective_cache_size to about 3/4 of your available RAM. A good tool to use for setting sane defaults is pgtune, which can be found here: https://github.com/gregs1104/pgtune
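If you prefer to script the 3/4 rule of thumb yourself, here is a sketch under stated assumptions: a Linux host, psycopg2, placeholder connection parameters, and PostgreSQL 9.4+ for ALTER SYSTEM (on older versions, edit postgresql.conf instead).

    # Sketch: compute 3/4 of physical RAM and apply it as effective_cache_size.
    # Assumes Linux, psycopg2, and PostgreSQL 9.4+ (ALTER SYSTEM); the connection
    # string is a placeholder.
    import os
    import psycopg2

    page_size = os.sysconf("SC_PAGE_SIZE")        # bytes per page
    phys_pages = os.sysconf("SC_PHYS_PAGES")      # pages of physical RAM
    total_mb = page_size * phys_pages // (1024 * 1024)
    suggested = "{}MB".format(total_mb * 3 // 4)  # the 3/4-of-RAM rule of thumb

    conn = psycopg2.connect("dbname=postgres user=postgres")
    conn.autocommit = True                        # ALTER SYSTEM cannot run inside a transaction
    with conn.cursor() as cur:
        cur.execute("ALTER SYSTEM SET effective_cache_size = %s", (suggested,))
        cur.execute("SELECT pg_reload_conf()")    # effective_cache_size only needs a reload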

Related

Configuring Lucene Index writer, controlling the segment formation (setRAMBufferSizeMB)

How should the setRAMBufferSizeMB parameter be set? Does it depend on the RAM size of the machine? On the size of the data that needs to be indexed? On some other parameter? Could someone please suggest an approach for deciding the value of setRAMBufferSizeMB.
Here is what the Lucene javadoc says about this parameter:
Determines the amount of RAM that may be used for buffering added
documents and deletions before they are flushed to the Directory.
Generally for faster indexing performance it's best to flush by RAM
usage instead of document count and use as large a RAM buffer as you
can. When this is set, the writer will flush whenever buffered
documents and deletions use this much RAM.
The maximum RAM limit is inherently determined by the JVMs available
memory. Yet, an IndexWriter session can consume a significantly larger
amount of memory than the given RAM limit since this limit is just an
indicator when to flush memory resident documents to the Directory.
Flushes are likely happen concurrently while other threads adding
documents to the writer. For application stability the available
memory in the JVM should be significantly larger than the RAM buffer
used for indexing.
By default, Lucene uses 16 MB for this parameter (which suggests to me that you don't need a very large value to get good indexing speed). I would recommend tuning it by setting it to, say, 500 MB and checking how your system behaves. If you get crashes, try a smaller value such as 200 MB, and so on, until your system is stable.
Yes, as stated in the javadoc, this parameter depends on the JVM heap, but for Python, I think it could allocate memory without any limit.
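For completeness, here is a rough PyLucene sketch of the 500 MB experiment suggested above. PyLucene mirrors the Java API and still embeds a JVM, so the heap limit described in the javadoc applies there too. The index path is a placeholder, and the exact imports and constructors vary by Lucene version; this follows the 5.x+ style.

    # Sketch: set a large RAM buffer in PyLucene and flush by RAM usage.
    # The index path is a placeholder; constructors vary by Lucene version.
    import lucene
    from java.nio.file import Paths
    from org.apache.lucene.analysis.standard import StandardAnalyzer
    from org.apache.lucene.index import IndexWriter, IndexWriterConfig
    from org.apache.lucene.store import FSDirectory

    lucene.initVM()                      # the JVM heap can be raised via initVM arguments

    directory = FSDirectory.open(Paths.get("/tmp/my-index"))
    config = IndexWriterConfig(StandardAnalyzer())
    config.setRAMBufferSizeMB(500.0)     # flush by RAM usage rather than by document count
    writer = IndexWriter(directory, config)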

mongodb high cpu usage

I have installed MongoDB 2.4.4 on Amazon EC2 with Ubuntu 64-bit OS and 1.6 GB RAM.
On this server, only MongoDB is running, nothing else.
But sometimes CPU usage reaches 99% and the load average is 500.01, 400.73, 620.77.
I have also installed MMS on the server to monitor what's going on.
Here are the MMS details.
According to the MMS details, indexing is working perfectly for each query.
The suspect details are as below:
1) HIGH non-mapped virtual memory
2) HIGH page faults
Can anyone help me understand what exactly is causing the high CPU usage?
EDIT:
After the comments from @Dylan Tong, I have reduced active connections, but there is still high non-mapped virtual memory.
Here's a summary of a few things to look into:
1. Observed a large number of connections and cursors (13k):
- Fix: make sure your connection pool is appropriately sized (see the pymongo sketch after this list). For reporting, and at your current request rate, you only need a few connections at most. Also, I'm guessing you have an m1.small instance, which means you only have 1 core.
2. Review queries and indexes:
- Run your queries with explain() to observe how they are executed. A good data model normally results in queries that pull only a few documents and make use of an index.
3. Memory (compact and readahead setting):
- Make the best use of memory. 1.6GB is low. Check how much free memory you have and compare it to what is reported as resident. A common cause of low resident memory is fragmentation: if a lot of documents are moving around or changing size, you should run the compact command to defragment your data files. A bad readahead setting can also lead to poor use of memory; check your readahead setting (http://manpages.ubuntu.com/manpages/lucid/man2/readahead.2.html) and try a few values, starting low (http://docs.mongodb.org/manual/administration/production-notes/). The production notes recommend 32 (for standard 512-byte blocks). Sometimes higher values are optimal if your documents are larger. The hope is that resident memory ends up close to your available memory and your page faults start to drop.
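A minimal pymongo sketch illustrating points 1 and 2 above (hostnames, collection names, and sizes are placeholders): cap the connection pool explicitly and inspect a query plan with explain() to confirm an index is used.

    # Sketch: explicit connection pool size and an explain() check in pymongo.
    # Names and values are placeholders for illustration.
    from pymongo import MongoClient, ASCENDING

    # Point 1: a small, explicit pool instead of thousands of connections.
    client = MongoClient("mongodb://localhost:27017", maxPoolSize=10)
    db = client["reports"]

    # Point 2: make sure the query can use an index, then check the plan.
    db.events.create_index([("account_id", ASCENDING), ("created_at", ASCENDING)])
    plan = db.events.find({"account_id": 42}).explain()
    print(plan)   # look for an index scan rather than a full collection scan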
If you're using resources to the fullest after this, and you're still capped out on CPU then it means you need to up your resources.

If not CPU, disk or network, what is the bottleneck during query execution?

I work with SQL Server 2005 and wonder: if not CPU, disk, or network, what are users waiting for when SQL Server is working? The strange thing is that the system monitor shows the 4 processors at an average of 5%, and the disk (which has demonstrated 50MB/s writes) works at about 5-8 MB/s, yet the executions (inserts and selects) take up to 10 minutes. I'd be happy to install additional hardware, but I can't see which device is the bottleneck, or how to measure its capacity and current workload.
Any advice would be appreciated.
Thanks
Additional info: RAM is constantly at about 70% capacity and I am running Windows XP.
Check your disk read and write 'wait' times. A heavily loaded database may just issue a lot of read and write requests for very small pieces of data, which saturates the IO.
As others mention, disks are rarely the bottleneck in terms of bandwidth; the limit is rather the number of IO operations they can perform per second, commonly called IOPS.
The IOPS capabilities of your disks will vary according to disk type, cache and the RAID setup you have.
Another thing you may run into is locking. If you have a lot of concurrent access to the same data, especially inside of large transactions, you may see other transactions being blocked - causing no network, CPU nor disk usage while being blocked, just wasted time.
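One way to tell which of these it is, without guessing from Performance Monitor counters, is to ask SQL Server itself what sessions are waiting on. Here is a sketch using pyodbc (the connection string is a placeholder); the DMVs queried exist from SQL Server 2005 onwards.

    # Sketch: query SQL Server's wait information to find the bottleneck.
    # The connection string is a placeholder.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=localhost;DATABASE=master;Trusted_Connection=yes"
    )
    cursor = conn.cursor()

    # Currently executing requests and what each one is waiting for (lock vs I/O waits).
    cursor.execute("""
        SELECT session_id, status, wait_type, wait_time, blocking_session_id
        FROM sys.dm_exec_requests
        WHERE session_id > 50            -- skip system sessions
    """)
    for row in cursor.fetchall():
        print(row)

    # Cumulative wait statistics since the server started.
    cursor.execute("""
        SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
        FROM sys.dm_os_wait_stats
        ORDER BY wait_time_ms DESC
    """)
    for row in cursor.fetchall():
        print(row)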
Probably the disks. If you are seeking all over the place, the throughput (MB/s) will be low even though the disks are running as fast as they can.
Generic advice: try increasing SQLServer's cache, tune your queries, check that the appropriate indexes exist and are used (and that you don't have too many).

Confused: is Redis an in-memory-only store that also writes to disk for backup/restore?

Confused: is Redis an in-memory-only store that also writes to disk for backup/restore?
If so, how long does a 16GB db take to write to disk and read back into memory?
As seen in the Redis README, all data is kept in memory but is also stored on disk for persistence and backup.
As for the 16GB database issue, it depends entirely on your server. For example:
Does the server have any other I/O bound software running?
What type of server is it? Shared? VPS? Dedicated?
The hardware specifications of both the hard disk and RAM.
For those reasons, it is impossible to give you an accurate estimate of how long reading and writing 16GB of data would take.
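If you want a number for your own hardware rather than an estimate, you can time a background save directly. A small redis-py sketch (the INFO field names assume Redis 2.6 or later; host and port are placeholders):

    # Sketch: time an RDB background save of the current dataset with redis-py.
    # Assumes Redis 2.6+ for the INFO persistence field names.
    import time
    import redis

    r = redis.Redis(host="localhost", port=6379)

    r.bgsave()                                    # fork a background RDB save (errors if one is already running)
    while r.info("persistence")["rdb_bgsave_in_progress"]:
        time.sleep(1)

    info = r.info("persistence")
    print("last RDB save took", info["rdb_last_bgsave_time_sec"], "seconds")
    print("dataset size:", r.info("memory")["used_memory_human"])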
Honestly, if you're storing 16GB of data, then Redis is most likely not the correct database program to use, simply because it is so heavy on RAM and disk space by design.

Redis - Can data size be greater than memory size?

I'm rather new to Redis and before using it I'd like to learn some important (as for me) details on it. So....
Redis uses RAM and HDD for storing data. RAM is used as fast read/write storage, and HDD is used to make this data persistent. When Redis is started, does it load all data from HDD into RAM, or does it load only frequently queried data into RAM? What if I have 500MB of Redis storage on HDD, but only 100MB of RAM for Redis? Where can I read about this?
Redis loads everything into RAM. All the data is written to disk, but will only be read for things like restarting the server or making a backup.
There are a couple of ways you can use it with less RAM than data though. You can set it up in combination with MySQL or another disk based store to work much like memcached - you manage cache misses and persistence manually.
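A minimal cache-aside sketch of that pattern, where Redis holds hot data, a disk-based store remains the source of truth, and misses are handled manually in application code (load_user_from_db is a hypothetical helper standing in for your SQL query):

    # Sketch: cache-aside with Redis in front of a disk-based store.
    # load_user_from_db is a hypothetical helper, not part of any library.
    import json
    import redis

    r = redis.Redis(host="localhost", port=6379)

    def get_user(user_id, ttl_seconds=3600):
        key = "user:%d" % user_id
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)             # cache hit

        user = load_user_from_db(user_id)         # cache miss: go to the disk-based store
        r.setex(key, ttl_seconds, json.dumps(user))
        return user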
Redis has a VM mode where all keys must fit in RAM but infrequently accessed data can be on disk. However, I'm not sure if this is in the stable builds yet.
Recent versions (>2.0) have improved significantly and memory management is more efficient. See this blog post that explains how to use hashes to optimize RAM memory footprint: http://antirez.com/post/redis-weekly-update-7.html
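As an illustration of the hash trick from that post, here is a sketch that buckets many small values into a few hashes so Redis can keep each bucket in its compact encoding (governed by hash-max-ziplist-entries / hash-max-ziplist-value in recent versions, hash-max-zipmap-entries in older ones). The bucket size is an arbitrary example.

    # Sketch: bucket small values into hashes to reduce Redis memory footprint.
    # BUCKET_SIZE is an arbitrary example; keep it below hash-max-ziplist-entries.
    import redis

    r = redis.Redis(host="localhost", port=6379)
    BUCKET_SIZE = 100

    def set_value(object_id, value):
        bucket = object_id // BUCKET_SIZE         # many ids share one hash key
        r.hset("obj:%d" % bucket, object_id % BUCKET_SIZE, value)

    def get_value(object_id):
        bucket = object_id // BUCKET_SIZE
        return r.hget("obj:%d" % bucket, object_id % BUCKET_SIZE)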
The feature is called Virtual Memory, and it is officially deprecated:
Redis VM is now deprecated. Redis 2.4 will be the latest Redis version featuring Virtual Memory (but it also warns you that Virtual Memory usage is discouraged). We found that using VM has several disadvantages and problems. In the future of Redis we want to simply provide the best in-memory database (but persistent on disk as usual) ever, without considering at least for now the support for databases bigger than RAM. Our future efforts are focused into providing scripting, cluster, and better persistence.
More information about VM: https://redis.io/topics/virtual-memory