How to recover data from Aerospike

I have run into a strange issue.
Suddenly the Aerospike data has been erased, even though I have not executed any command to delete data from Aerospike.
namespace test {
    replication-factor 2
    memory-size 4G
    default-ttl 30d          # 30 days, use 0 to never expire/evict.
    storage-engine memory
}
I haven't configured the TTL per record here, but a few days back I ran a UDF to set the TTL of all records to -1 so that they never expire. The sets were also being updated periodically, so even without that they should not have expired after 30 days. I lost everything at once, which should not be the case.
I have been stuck on this for two days. Any help is appreciated.

You're using a namespace that is basically defined to be a cache. It is in-memory with no persistence. For example, a restart of the node will cause the namespace to start empty.
The Namespace Storage Configuration article in the deployment guide gives recipes for storage engine configuration. You can set the storage of a specific namespace to be one of the following:
Data stored on SSD
Data stored on a filesystem (not recommended for production)
Data stored in-memory with persistence to an SSD
Data stored in-memory with persistence on a filesystem
Data stored in-memory with no persistence
There is also a special case of data in-memory for numeric data such as counters, data-in-index, which keeps the data in the primary index and is used together with persistence.
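For example, the file-backed data-in-memory option from the list above could be configured roughly like this for the namespace in the question; the file path and filesize are placeholders that would need to match your deployment:

namespace test {
    replication-factor 2
    memory-size 4G
    default-ttl 30d
    storage-engine device {
        file /opt/aerospike/data/test.dat    # placeholder persistence file
        filesize 16G                         # sized above memory-size to leave room for defragmentation
        data-in-memory true                  # keep all data in RAM, persist writes to the file
    }
}

With a file or device behind the namespace, a node restart reloads the data from persistent storage instead of starting empty.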

Related

Aerospike 3.7.4: after reload there is data loss and some deleted data was restored automatically

We have a cluster of 3 nodes. I have a namespace that stores a history of operations; I did lots of delete operations on one of the sets and after that migrated the data from scratch.
For some reason one node failed after a while and we needed to reload the cluster. Later we found that most of the new data was lost and some of the deleted data had been restored.
Can you please help us avoid such behaviour, as we need consistency?
The Aerospike version is 3.7.4.
Here is the configuration for the namespace:
namespace dar_history {
    replication-factor 2
    memory-size 4G
    default-ttl 0            # use 0 to never expire/evict.
    storage-engine device {
        file ../dar_history.dat
        filesize 32G
        data-in-memory true  # Store data in memory in addition to file.
    }
}
Aerospike 3.10.0 introduces durable deletes for Aerospike Enterprise. Learn more about how they work here: https://www.aerospike.com/docs/guide/durable_deletes.html.
A few alternative solutions for community edition are discussed here: https://discuss.aerospike.com/t/expired-deleted-data-reappears-after-server-is-restarted/470/22.
We solved the problem by backing up the data, then restarting the cluster, importing the data from the backup, recreating all the indices we had, and deleting the old unnecessary data.
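For anyone following the same procedure, the backup and restore steps map to the standard asbackup and asrestore tools; the host address and backup directory below are placeholders:

# back up the namespace to a local directory
asbackup -h 127.0.0.1 -n dar_history -d /backups/dar_history

# ...restart / reconfigure the cluster...

# restore the namespace from that directory
asrestore -h 127.0.0.1 -d /backups/dar_history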
Thank you all for the replies.

Aerospike namespace configuration options

I've recently been getting involved in implementing an Aerospike data store in our product. We've been trying to work out the best configuration for our namespace. The requirement to persist data means we need the storage-engine to be a device, and we have specified data-in-memory as true.
My question is: does data-in-memory attempt to load ALL the backing store data into memory as the vague description implies?
Keep a copy of all data in memory always.
Or will it pay attention to the memory-size setting on the namespace and only load memory-size amount of data from the backing store?
That description of the setting was taken from the documentation.
I have been talking to the person who first implemented Aerospike here to try to find out, but he wasn't sure either, so I'm looking for clarification.
For reference, my namespace config looks something like this, with an obviously smaller memory quota than the backing store:
namespace Test {
    replication-factor 2
    memory-size 4G
    default-ttl 0
    storage-engine device {
        file /opt/aerospike/data/Test.dat
        filesize 16G
        data-in-memory true
    }
}
It will keep all the data in memory. Aerospike does not yet have a partial cache implementation to keep the most used data in the provided memory.
Your data will be served entirely from memory, while the disk is used for persistence so the data can be recovered in the event of a server restart. The reason the filesize is larger than the memory-size is that extra disk space is needed for maintenance operations such as defragmentation of blocks. Disk devices are block devices; with the default 1MB write-block-size, multiple records fit into a single block, and defragmentation works by moving records out of blocks that have fallen below defrag-lwm-pct full into new blocks. This takes extra blocks, so you need that spare capacity.
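To make those parameters concrete, here is the namespace from the question with the relevant storage settings spelled out; write-block-size 1M and defrag-lwm-pct 50 are simply the documented defaults, shown for illustration rather than as tuning advice:

namespace Test {
    replication-factor 2
    memory-size 4G
    default-ttl 0
    storage-engine device {
        file /opt/aerospike/data/Test.dat
        filesize 16G               # larger than memory-size to leave headroom for defragmentation
        data-in-memory true        # every record is kept in RAM; the file is only for persistence
        write-block-size 1M        # default block size; several records fit in one block
        defrag-lwm-pct 50          # blocks below 50% full become candidates for defragmentation
    }
}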

Aerospike data not found after server restart

I am new to Aerospike DB. I inserted data from MySQL into Aerospike using a migration script. Due to some issue the Aerospike server was restarted.
But after the restart, there was no data in the Aerospike DB.
Can someone please let me know what could be the issue? Is there any config problem in Aerospike?
What storage mechanism did you use with Aerospike? Did you use one of the default namespaces? One of the defaults is in-memory only, so data will be lost if the storage is in-memory only on a single node and that node is restarted.
So basically you should ensure that the namespace storage is configured for persistence [1], that the replication factor is 2 or more, and that the number of servers in the cluster is at least equal to the replication factor to ensure HA.
[1] https://www.aerospike.com/docs/operations/configure/namespace/storage/#recipe-for-an-ssd-storage-engine
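For illustration, the SSD recipe referenced above boils down to something like the following; the namespace name and device path are placeholders for your own setup:

namespace some_namespace {
    replication-factor 2
    memory-size 4G
    storage-engine device {
        device /dev/sdb          # placeholder raw SSD device
        write-block-size 128K    # block size suggested in the SSD recipe
    }
}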

What if Redis keys are never deleted programmatically?

What will happen to my Redis data if no expiry is set and no DEL command is used?
Will it be removed after some default time?
One more question:
How does Redis store data, is it in some file format? Because I can access the data even after restarting the computer. So which files are created by Redis, and where?
Thanks.
Redis is an in-memory data store, meaning all your data is kept in RAM (i.e. volatile). So theoretically your data will live as long as you don't turn the power off.
However, it also provides persistence in two modes:
RDB mode, which takes snapshots of your dataset and saves them to disk in a file called dump.rdb. This is the default mode.
AOF mode, which records every write operation executed by the server in an append-only file and then replays it on startup, thus reconstructing the original data.
Redis persistence is explained very well here and here by the creator of Redis himself.
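As a concrete illustration of those two modes, the relevant redis.conf directives look roughly like this; the thresholds are just example values:

# RDB snapshotting (the default): write dump.rdb whenever a rule matches
save 900 1        # after 900 seconds if at least 1 key changed
save 300 10
save 60 10000

# AOF: append every write to the append-only file and replay it on startup
appendonly yes
appendfsync everysec    # fsync the AOF roughly once per second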

Is Redis a memory only store like memcached or does it write the data to the disk

Is Redis a memory-only store like memcached, or does it write the data to disk? If it does write to disk, how often is the disk written to?
Redis persistence is described in detail here:
http://redis.io/topics/persistence
By default, Redis performs snapshotting:
By default Redis saves snapshots of the dataset on disk, in a binary file called dump.rdb. You can configure Redis to have it save the dataset every N seconds if there are at least M changes in the dataset, or you can manually call the SAVE or BGSAVE commands.
For example, this configuration will make Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys changed: save 60 1000
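In redis.conf that is literally the line below, and the manual snapshot commands mentioned above can be issued through redis-cli; shown here only as a sketch:

# redis.conf: snapshot if at least 1000 keys changed in the last 60 seconds
save 60 1000

# trigger a snapshot manually from the command line
redis-cli BGSAVE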
Another good reference is this link to the author's blog, where he tries to explain how Redis persistence works:
http://antirez.com/post/redis-persistence-demystified.html
Redis holds all data in memory. If the size of an application's data is too large for that, then Redis is not an appropriate solution.
However, Redis also offers two ways to make the data persistent:
1) Snapshots at predefined intervals, which may also depend on the number of changes. Any changes made between these intervals will be lost in the event of a power failure or crash.
2) Writing a kind of change log on every data change. You can fine-tune how often this is physically written to disk, but if you choose to always write immediately (which will cost you some performance), then there will be no data loss caused by the in-memory nature of Redis.
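The trade-off described in point 2 corresponds to the appendfsync directive in redis.conf; a sketch of the available options:

appendonly yes
# appendfsync always    # fsync after every write: no data loss on crash, but slowest
appendfsync everysec    # fsync about once per second: at most ~1 second of writes lost (the default)
# appendfsync no        # let the OS decide when to flush: fastest, least durable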