We are seeing a very interesting graph in our Redis setup.
Green: master
Blue: slave
It looks like the master Redis is executing 35% more commands than the slave Redis, though the gap is not always the same.
Here is part of the log from the active Redis server (the master):
[26911] 14 Feb 13:28:44 - DB 0: 2399 keys (417 volatile) in 16384 slots HT.
[26911] 14 Feb 13:28:44 - DB 1: 498 keys (498 volatile) in 1024 slots HT.
[26911] 14 Feb 13:28:44 - DB 2: 1 keys (0 volatile) in 4 slots HT.
[26911] 14 Feb 13:28:44 - 706 clients connected (1 slaves), 33794240 bytes in use
and during the same time on the slave:
[17748] 14 Feb 13:28:44 - DB 0: 2398 keys (417 volatile) in 16384 slots HT.
[17748] 14 Feb 13:28:44 - DB 1: 497 keys (497 volatile) in 1024 slots HT.
[17748] 14 Feb 13:28:44 - DB 2: 1 keys (0 volatile) in 4 slots HT.
[17748] 14 Feb 13:28:44 - 1 clients connected (0 slaves), 24839792 bytes in use
So they look almost 1:1 synchronized.
We wonder what could be the cause of this gap, and whether it means there are unnecessary commands being sent to Redis that we could optimize away.
Here's a possible explanation: total_commands_processed counts all commands, including reads, writes, and server-related commands, but only write commands are propagated to the slave(s).
In a setup where you only write to the master, and read from the slave(s), you will have a higher total_commands_processed on the slave(s) (all reads + all writes).
If you write to and read from the master, and only keep the slave as a backup, or to persist to disk, the master will have a higher total_commands_processed.
In fact, it's very improbable that the master and slave will have the same number of total_commands_processed.
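If you want to see where the gap comes from on your own setup, you can compare the counters directly on both instances. A minimal sketch, assuming redis-cli can reach both hosts (the host names here are placeholders) and that your Redis version exposes the commandstats section:
redis-cli -h redis-master INFO stats | grep total_commands_processed
redis-cli -h redis-slave INFO stats | grep total_commands_processed
redis-cli -h redis-master INFO commandstats
redis-cli -h redis-slave INFO commandstats
The commandstats output lists per-command call counts (cmdstat_get, cmdstat_set, and so on), so subtracting the slave's counts from the master's shows which commands account for the difference.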
Related
I'm looking for information on the implications latency has on Redis Replication and Sentinel with a cross data center setup for Geo-HA.
Let's assume the following in regards to server/DC latency:
A <> B: 5 ms
A <> C: 25 ms
B <> C: 25 ms
The database is used with real-time messaging brokers and therefore we cannot have any long latencies with reads and writes.
What impact do network latencies (5 ms, 25 ms) have on read and write operations within a replication setup?
How does Sentinel handle such latencies?
What would the effect be if C is only a Sentinel instance to the above?
I've inherited a custom-built web page which uses a Redis server, and I noticed that every 3-5 minutes the Redis server peaks and uses 100% CPU for maybe 2-3 minutes.
Does anyone have any ideas or clues about what I can do to optimize this?
Log file:
2276:M 23 Apr 2019 18:22:44.060 * 10 changes in 300 seconds. Saving...
2276:M 23 Apr 2019 18:22:44.356 * Background saving started by pid 16081
16081:C 23 Apr 2019 18:25:03.575 * DB saved on disk
16081:C 23 Apr 2019 18:25:03.783 * RDB: 1 MB of memory used by copy-on-write
2276:M 23 Apr 2019 18:25:04.174 * Background saving terminated with success
2276:M 23 Apr 2019 18:30:05.089 * 10 changes in 300 seconds. Saving...
2276:M 23 Apr 2019 18:30:05.396 * Background saving started by pid 16984
16984:C 23 Apr 2019 18:32:26.841 * DB saved on disk
16984:C 23 Apr 2019 18:32:27.126 * RDB: 1 MB of memory used by copy-on-write
2276:M 23 Apr 2019 18:32:27.523 * Background saving terminated with success
2276:M 23 Apr 2019 18:47:28.032 * 1 changes in 900 seconds. Saving...
2276:M 23 Apr 2019 18:47:28.334 * Background saving started by pid 18748
18748:C 23 Apr 2019 18:49:53.540 * DB saved on disk
18748:C 23 Apr 2019 18:49:53.744 * RDB: 1 MB of memory used by copy-on-write
2276:M 23 Apr 2019 18:49:54.157 * Background saving terminated with success
2276:M 23 Apr 2019 18:54:55.023 * 10 changes in 300 seconds. Saving...
2276:M 23 Apr 2019 18:54:55.328 * Background saving started by pid 19422
19422:C 23 Apr 2019 18:57:18.455 * DB saved on disk
19422:C 23 Apr 2019 18:57:18.592 * RDB: 1 MB of memory used by copy-on-write
2276:M 23 Apr 2019 18:57:18.823 * Background saving terminated with success
That's just how Redis works. The background save forks a child process that serializes and compresses the whole dataset, and that will, for a time, pin a single core.
You can switch to using the AOF (append-only file) exclusively, but that requires periodic rewriting as well; it's just less frequent and demanding.
The manual has some specifics, including this caveat:
There are many users using AOF alone, but we discourage it since to have an RDB snapshot from time to time is a great idea for doing database backups, for faster restarts, and in the event of bugs in the AOF engine.
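For reference, both behaviours are controlled in redis.conf. The sketch below assumes a roughly stock configuration: the three save lines are the usual defaults (they match the "changes in N seconds" messages in the log), while appendonly is off by default and is shown here switched on as suggested above; adjust the values to your workload.
save 900 1
save 300 10
save 60 10000
appendonly yes
appendfsync everysec
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
The "10 changes in 300 seconds" entries in the log correspond to the save 300 10 rule, and "1 changes in 900 seconds" to save 900 1. Raising those thresholds, or disabling RDB snapshots entirely with save "", spreads the background saves further apart at the cost of a larger window of possible data loss after a crash.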
We're running Redis 5.0.3 on Docker, with both RDB saving and AOF turned off:
127.0.0.1:6379> config get save
1) "save"
2) ""
127.0.0.1:6379> config get appendonly
1) "appendonly"
2) "no"
Everything ran fine (no saves in the logs) until this morning, when we got several "DB saved" log entries in quick succession:
21 Mar 2019 04:12:58.453 * DB saved on disk
21 Mar 2019 04:12:58.454 * DB saved on disk
21 Mar 2019 04:12:58.456 * DB saved on disk
21 Mar 2019 04:13:50.153 * DB saved on disk
21 Mar 2019 04:13:51.573 * DB saved on disk
21 Mar 2019 04:13:52.282 * DB saved on disk
21 Mar 2019 04:21:18.539 * DB saved on disk
21 Mar 2019 04:21:18.540 * DB saved on disk
21 Mar 2019 04:21:18.541 * DB saved on disk
During this time period, Redis drops all of our keys - twice!
Any ideas why this is happening? The system is not under memory or CPU pressure, all graphs look normal.
Other useful things:
Memory usage of Redis is increasing but still well within the bounds of the box (as expected, since we're storing streams of data)
Number of keys is flat during this time period, until they all get dropped
Latency is flat the whole time
Redis reports no expired or evicted keys
The slow log bumps up during that time period and then is flat again immediately after.
EDIT
On further debugging using info commandstats, it seems that several FLUSHALL commands were issued during this time period, which, looking at the Redis source, would explain the "DB saved on disk" log lines.
I have no idea why these flushes are occurring - we do not have any flush commands in our applications. Debugging continues.
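For anyone debugging something similar, the check was essentially the one-liner below, and the redis.conf lines after it are one way to block stray flushes while you hunt for the caller. The rename-command approach is a temporary safeguard only, and it will break any client or tool that legitimately issues these commands:
redis-cli INFO commandstats | grep -i flush
# in redis.conf, then restart Redis:
rename-command FLUSHALL ""
rename-command FLUSHDB ""
A non-zero calls value on the cmdstat_flushall line confirms that something really is issuing FLUSHALL. Renaming the commands to "" disables them entirely, while renaming them to a long random string keeps them available to administrators who know the new name.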
Trying to convert MBR to GPT with mbr2gpt, introduced with Windows 10 build 1703, failed with:
mbr2gpt: Too many MBR partitions found, no room to create EFI system partition.
Full log:
2017-06-07 22:23:24, Info ESP partition size will be 104857600
2017-06-07 22:23:24, Info MBR2GPT: Validating layout, disk sector size is: 512 bytes
2017-06-07 22:23:24, Error ValidateLayout: Too many MBR partitions found, no room to create EFI system partition.
2017-06-07 22:23:24, Error Disk layout validation failed for disk 1
The mbr2gpt disk conversion tool needs three conditions met for the disk layout validation:
Admin rights (which you already know)
A physical disk (or hard drive) with a boot partition (MSR) AND an OS partition
The validation normally allows one additional partition (often a recovery partition)
If you have more than three partitions, check them with diskpart:
Microsoft Windows [Version 10.0.15063]
(c) 2017 Microsoft Corporation. All rights reserved.
C:\WINDOWS\system32>diskpart
Microsoft DiskPart version 10.0.15063.0
Copyright (C) Microsoft Corporation.
On computer: SILERRAS-PC
DISKPART> list disk
Disk ### Status Size Free Dyn GPT
-------- ------ ------ ------- ---- ---
Disk 0 Online 117 GB 1024 KB * *
Disk 1 Online 489 GB 455 MB *
Disk 2 Online 186 GB 0 B
Disk 3 Online 931 GB 0 B
Disk 4 Online 931 GB 1024 KB *
DISKPART> select disk 1
Disk 1 is now the selected disk.
DISKPART> list partition
Partition ### Type Size Offset
------------- ---------------- ------- -------
Partition 1 System 350 MB 1024 KB
Partition 2 Primary 487 GB 351 MB
Partition 3 Recovery 452 MB 488 GB
DISKPART>
Try to reduce the number of partitions to three.
If you have multiple recovery partitions, check them with "ReAgentc /info". This command shows you the current recovery partitions; often only one of them is active. You can delete the inactive ones with diskpart, but please be careful about which partition you delete. The diskpart command is "delete partition override".
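As a rough sketch of the cleanup and re-check, assuming (as in the listing above) that disk 1 is the one being converted and that the partition to remove really is an unused recovery partition; the partition number 3 is only an example, so verify it with ReAgentc /info and list partition before deleting anything:
ReAgentc /info
diskpart
DISKPART> select disk 1
DISKPART> list partition
DISKPART> select partition 3
DISKPART> delete partition override
DISKPART> exit
mbr2gpt /validate /disk:1 /allowFullOS
mbr2gpt /convert /disk:1 /allowFullOS
The /allowFullOS switch lets mbr2gpt run from the full Windows installation instead of Windows PE; run /validate first and only continue with /convert once validation succeeds.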
I hope my guide is helpful for you.
I am using HSQLDB 2.3.0. I have a database with the following schema:
CREATE TABLE MEASUREMENT (ID INTEGER NOT NULL PRIMARY KEY IDENTITY, OBJ CLOB);
When I fill this table with test data, the LOBS file in my database grows:
ls -lath
-rw-rw-r-- 1 hsqldb hsqldb 35 May 6 16:37 msdb.log
-rw-rw-r-- 1 hsqldb hsqldb 85 May 6 16:37 msdb.properties
-rw-rw-r-- 1 hsqldb hsqldb 16 May 6 16:37 msdb.lck
drwxrwxr-x 2 hsqldb hsqldb 4.0K May 6 16:37 msdb.tmp
-rw-rw-r-- 1 hsqldb hsqldb 1.6M May 6 16:37 msdb.script
-rw-rw-r-- 1 hsqldb hsqldb 625M May 6 16:35 msdb.lobs
After running the following commands:
TRUNCATE SCHEMA public AND COMMIT;
CHECKPOINT DEFRAG;
SHUTDOWN COMPACT;
The lobs file is still the same size:
-rw-rw-r-- 1 hsqldb hsqldb 84 May 6 16:44 msdb.properties
-rw-rw-r-- 1 hsqldb hsqldb 1.6M May 6 16:44 msdb.script
-rw-rw-r-- 1 hsqldb hsqldb 625M May 6 16:35 msdb.lobs
What is the best way to truncate the schema and get all the disk space back?
I have an application with the same problem using HSQLDB 2.3.3. The .lobs file seems to be growing indefinitely, even after calling "checkpoint defrag". My scenario is that I'm inserting 1000 blobs of 300 bytes each. I'm periodically deleting them all and inserting 1000 new blobs of about the same size. After a number of rounds of this, my .lobs file is now 1.3 GB in size, but it is really just storing around 300 kB of data. In spite of calling checkpoint defrag, the .lobs file just grows and grows. Is this behaviour a bug?
The database engine is designed for continuous use in real applications. If you have an application that uses lobs and deletes some of them, the space will be reused for future lobs after each checkpoint.
In normal application use, the DELETE statement is used to delete rows. This statement deallocates the lob space for reuse after each checkpoint.
You can design your tests in a way that recreates the database, rather than reuse the old database after removing the data.
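If the test has to keep reusing the same database files, here is a minimal sketch of the delete-and-checkpoint pattern described above, using the MEASUREMENT table from the question (how much space is actually reclaimed depends on the HSQLDB version):
-- delete the rows so their lob space is deallocated
DELETE FROM MEASUREMENT;
-- the checkpoint makes the deallocated lob space reusable for future lobs
CHECKPOINT;
-- compacting at shutdown rewrites the data files; the .lobs file may keep its
-- current size, but the freed space inside it is reused for new lobs
SHUTDOWN COMPACT;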