How to identify read hotkeys in an Aerospike cluster? - aerospike

We have an Aerospike cluster of 8 nodes. We saw that during peak hours one of the nodes has a significantly higher load average than the other nodes. Also, in the AMC dashboard we saw that the node shows only 30% read success. After reading a few similar issues posted in the Aerospike community, we thought that the presence of hotkeys might be the culprit.
Following (https://discuss.aerospike.com/t/how-to-identify-read-hotkeys/4193), we found a few hotkey digests with tcpdump in real time. Interestingly, among the top 10 digests, one key accounts for about 90% of the requests.
We then followed (https://discuss.aerospike.com/t/faq-how-keys-and-digests-are-used-in-aerospike/4663) to find the user key/record from those digests. We were able to map the user key for all of them except the one that accounts for 90% of the requests.
Is there any way we can identify that hotkey?

Depending on your version of Aerospike, you can also change the logging level for the rw-client module, which will also print the digest in the logs. That may remove any false positives from the tcpdump method.
Turn on detail-level logging for the rw-client context:
asinfo -v "set-log:id=0;rw-client=detail"
Turn it back to info:
asinfo -v "set-log:id=0;rw-client=info"
Also, did you try the UDF from the above article to determine the set and key? (The original key is only stored if the client has explicitly enabled the send-key policy.) Were there any corresponding record write failures, like record too big? Or possibly attempts to read a non-existent record (read not found)? The write failures from a record too big would have the most impact on your network infrastructure. In both of these cases the record would never make it to storage, so the digest would not match an existing record.
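As a quick cross-check, you can also probe the digest directly from a client. The following is a rough sketch with the Aerospike Python client, not the article's UDF; the host, namespace and set names are assumptions:
# Rough sketch: look up the captured digest directly to see whether any record
# actually exists behind it. Host, namespace and set names are assumptions.
import aerospike
from aerospike import exception as ex

client = aerospike.client({'hosts': [('10.0.0.1', 3000)]}).connect()

digest = bytearray.fromhex('00' * 20)   # placeholder: the 20-byte digest from tcpdump
key = ('test', 'demo', None, digest)    # (namespace, set, user key, digest)

try:
    (_, meta, bins) = client.get(key)
    # The user key is only recoverable if the writer enabled the send-key policy.
    print('record exists:', meta, bins)
except ex.RecordNotFound:
    print('no record behind this digest - consistent with repeated "not found" reads')
finally:
    client.close()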

It is possible that the frequent read requests with the rogue digest are failing with a 'not found' error (hence only 30% read success). But Aerospike will still spend resources (CPU) searching for this digest in the index tree. If this is true, there will be no record in the database corresponding to the digest that you found via tcpdump, so you will not get any details about it from the database. How did you identify the keys of the other digests, and what issue are you facing in finding the key corresponding to the rogue digest?
Another option is to track it back to the application. One approach is to check in the tcpdump whether all the requests for this rogue digest are coming from a single machine. That will narrow down your search greatly. We have seen bots creating such a mess in the past.
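If you still have the capture on disk, a rough way to do that check is to count the client IPs whose payloads contain the digest. The sketch below uses scapy; the pcap filename, service port and digest bytes are placeholders:
# Rough sketch: count which client IPs send requests containing the hot digest.
# The pcap filename and the digest bytes are placeholders.
from collections import Counter
from scapy.all import rdpcap, IP, TCP, Raw

digest = bytes.fromhex('00' * 20)       # the rogue digest captured earlier
sources = Counter()

for pkt in rdpcap('aerospike_port3000.pcap'):
    if pkt.haslayer(IP) and pkt.haslayer(TCP) and pkt.haslayer(Raw):
        if pkt[TCP].dport == 3000 and digest in bytes(pkt[Raw].load):
            sources[pkt[IP].src] += 1

for ip, count in sources.most_common(10):
    print(ip, count)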

Related

Redis search does not return results consistently

I am reading events from Kafka and depositing them into Redis. Then we read the events using Python, and in case we don't find events we drop/re-create the index.
However, I noticed that at times, even after re-creating the index, I still don't find events.
I have a couple of questions:
[Q1] Is re-indexing a good approach when we are continuously getting a huge flow of events?
[Q2] Also, I noticed that during a Redis search I sometimes do get events, and at another instance the query does not return results; can this be related to dropping/re-creating the index?
[Q3] Is there a better standard approach to ensure JSON documents are deposited/retrieved consistently?
[Q4] Is there an explanation as to why at times everything just seems to work fine continuously for several hours and then does not work at all for a few hours?
I would appreciate alternate approaches for this simple use case as I am fairly new to Redis.
Here are some points; I hope they are helpful.
[Q1]
Re-indexing can be a good approach if you are getting a huge flow of events, as it will help keep your data up to date. However, you also need to make sure that your Redis instance is able to handle the increased load; otherwise Redis can become very slow while it is re-indexing and may not be able to keep up with the flow of events. In that case, you may need to scale your Redis instance to handle the load, or incremental indexing may be a better option.
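As a side note, a RediSearch index defined over a key prefix is maintained incrementally as documents are written, so periodic drop/re-create cycles are usually unnecessary. A rough sketch with redis-py (index name, key prefix and fields are assumptions):
# Rough sketch: a search index over a JSON key prefix is kept up to date automatically
# as new events are written, so no drop/re-create step is needed.
# Index name, key prefix and field names are assumptions.
import redis
from redis.commands.search.field import TagField, NumericField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis(host='localhost', port=6379)

try:
    r.ft('idx:events').create_index(
        (
            TagField('$.type', as_name='type'),
            NumericField('$.timestamp', as_name='timestamp'),
        ),
        definition=IndexDefinition(prefix=['event:'], index_type=IndexType.JSON),
    )
except redis.ResponseError:
    pass  # index already exists

# every JSON.SET under the prefix is indexed without any re-index step
r.json().set('event:1', '$', {'type': 'click', 'timestamp': 1700000000})
print(r.ft('idx:events').search('@type:{click}').total)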
[Q2]
There could be many reasons why search results vary, but it is possible that the index is being dropped and re-created, which would cause the events not to be found. A few other things could also be happening:
There could be an issue with the search query or index definition, which would cause results not to be returned.
The data in the database could be changing between queries, which would cause results not to be returned.
[Q3]
There is no single standard approach to ensure JSON documents are deposited and retrieved consistently. However, some best practices to consider include using a library or tool that supports serialization and deserialization of JSON data, validating input and output data, and using an editor that highlights errors in JSON syntax. You can also use a locking mechanism to ensure that only one process can write to the Redis instance at a time, or use a queue to buffer writes to Redis. Also, different developers may have different preferences and opinions on the best way to handle JSON in Redis. Possible methods include using the RedisJSON module commands JSON.SET and JSON.GET to manage JSON documents, or using a client library with RedisJSON support to simplify the process.
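To make the RedisJSON option concrete, here is a small sketch (key names are placeholders) comparing plain serialization with Python's json module against JSON.SET/JSON.GET via redis-py:
# Rough sketch of two common ways to store JSON in Redis; key names are placeholders.
import json
import redis

r = redis.Redis(host='localhost', port=6379)
event = {'id': 42, 'type': 'click', 'ts': 1700000000}

# Option 1: serialize/deserialize yourself and store plain strings.
r.set('event:str:42', json.dumps(event))
restored = json.loads(r.get('event:str:42'))

# Option 2: RedisJSON (JSON.SET / JSON.GET), which can read or update
# individual fields by JSONPath without rewriting the whole document.
r.json().set('event:json:42', '$', event)
event_type = r.json().get('event:json:42', '$.type')

print(restored, event_type)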
[Q4]
Redis can be temperamental, and its behavior can vary depending on the specific configuration and usage scenario. There is no single explanation for this behavior, but it could be due to various factors such as load on the Redis server, network conditions, or other applications using the same Redis instance. In such cases, the server will not be able to handle requests properly and may stop responding. If everything works fine for a few hours and then suddenly stops working, check the logs for any errors that may have occurred before resorting to restarting Redis.
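When it does stall, a quick first pass before restarting is to look at memory pressure, evictions and slow commands; a rough sketch with redis-py:
# Rough sketch: basic health checks to run when Redis stops answering queries.
import redis

r = redis.Redis(host='localhost', port=6379)

info = r.info()
print('used_memory_human:', info.get('used_memory_human'))
print('maxmemory_policy:', info.get('maxmemory_policy'))
print('evicted_keys:', info.get('evicted_keys'))
print('connected_clients:', info.get('connected_clients'))

# the ten slowest recent commands, useful for spotting expensive searches
for entry in r.slowlog_get(10):
    print(entry['duration'], entry['command'])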

Baselining internal network traffic (corporate)

We are collecting network traffic from switches using Zeek in the form of 'connection logs'. The connection logs are then stored in Elasticsearch indices via Filebeat. Each connection log is a tuple with the following fields: (source_ip, destination_ip, port, protocol, network_bytes, duration). There are more fields, but let's just consider the above fields for simplicity for now. We get 200 million such logs every hour for internal traffic. (Zeek allows us to identify internal traffic through a field.) We have about 200,000 active IP addresses.
What we want to do is digest all these logs and create a graph where each node is an IP address, and a directed edge (source → destination) represents traffic between two IP addresses. There will be one unique edge for each distinct (port, protocol) tuple. The edge will have these properties: average duration, average bytes transferred, and a histogram of log counts by hour of the day.
I have tried using Elasticsearch's aggregations and also the newer Transform technique. While both work in theory, and I have tested them successfully on a very small subset of IP addresses, the processes simply cannot keep up with our entire internal traffic. For example, digesting 1 hour of logs (about 200M logs) using Transform takes about 3 hours.
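For reference, a composite aggregation over the edge tuple described above might look roughly like the sketch below with the Python Elasticsearch client; the index and field names are simplified placeholders rather than our actual mapping:
# Rough sketch: one page of a composite aggregation over the edge tuple.
# Index and field names are placeholders; paginate with "after_key" to cover all edges.
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

body = {
    'size': 0,
    'aggs': {
        'edges': {
            'composite': {
                'size': 10000,
                'sources': [
                    {'src':   {'terms': {'field': 'source_ip'}}},
                    {'dst':   {'terms': {'field': 'destination_ip'}}},
                    {'port':  {'terms': {'field': 'port'}}},
                    {'proto': {'terms': {'field': 'protocol'}}},
                ],
            },
            'aggs': {
                'avg_bytes':    {'avg': {'field': 'network_bytes'}},
                'avg_duration': {'avg': {'field': 'duration'}},
                # per-hour bucket counts; an hour-of-day histogram would need a
                # runtime/script field instead
                'by_hour':      {'date_histogram': {'field': '@timestamp',
                                                    'calendar_interval': 'hour'}},
            },
        },
    },
}

resp = es.search(index='zeek-conn-*', body=body)
for bucket in resp['aggregations']['edges']['buckets']:
    print(bucket['key'], bucket['avg_bytes']['value'], bucket['avg_duration']['value'])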
My question is:
Is post-processing Elasticsearch data the right approach to building this graph? Or is there some product that we can use upstream to do this job? Someone suggested looking into ntopng, but I did not find this specific use case in their product description. (Not sure if it is relevant, but we use ntop's PF_RING product as a front end for Zeek.) Are there other products that do the job out of the box? Thanks.
What problems or root causes are you attempting to elicit with a graph of Zeek east-west traffic?
Seems that a more-tailored use case, such as a specific type of authentication, or even a larger problem set such as endpoint access expansion might be a better use of storage, compute, memory, and your other valuable time and resources, no?
Even if you did want to correlate or group on Zeek data, try to normalize it to OSSEM, and there would be no reason to, say, collect the full tuple when you can collect community-id instead. You could correlate Zeek in the large to Suricata in the small. Perhaps a better data architecture would be VAST.
Kibana, in its latest iterations, does have Graph, and even older versions can leverage the third-party kbn_network plugin. I could see you hitting a wall with 200k active IP addresses and Elasticsearch aggregations, or even summary indexes.
Many orgs will build data architectures beyond the simple Serving layer provided by Elasticsearch. What I have heard of is a Kappa architecture streaming into a graph database directly, such as Dgraph, with perhaps just those edges of the graph exposed from a Serving layer.
There are other ways of asking questions of IP address data, such as the ML options in AWS SageMaker IP Insights or the Apache Spot project.
Additionally, I'm a huge fan of getting the right data only as the situation arises, although in an automated way so that the puzzle pieces bubble up for me and I can simply lock them into place. If I were working with Zeek data especially, I could leverage a platform such as SecurityOnion and its orchestrated Playbook engine to kick off other tasks for me, such as querying out with one of the Velocidex tools, or even cross-correlating using the built-in Sigma sources.

Membase caching pattern when one server in cluster is inaccessible

I have an application that runs a single Membase server (1.7.1.1) that I use to cache data I'd otherwise fetch from our central SQL Server DB. I have one default bucket associated with the Membase server, and follow the traditional data-fetching pattern of:
When specific data is requested, look up the relevant key in Membase
If data is returned, use it.
If no data is returned, fetch data from the DB
Store the newly returned data in Membase
I am looking to add an additional server to my default cluster, and rebalance the keys. (I also have replication enabled for one additional server).
In this scenario, I am curious as to how I can use the current pattern (or modify it) to make sure that I am not getting data out of sync when one of my two servers goes down in either an auto-failover or manual failover scenario.
From my understanding, if one server goes down (call it Server A), during the period that it is down but still attached to the cluster there will be a cache key miss (if the active key is associated with Server A, not Server B). In that case, in the data-fetching pattern above, I would get no data returned and fetch straight from SQL Server. But when I attempt to store the data back to my Membase cluster, will it store the data on Server B and remap that key to Server B on the next fetch?
I understand that once I mark Server A as "failed over", Server B's replica key will become the active one, but I am unclear about how to handle the intermittent situation when Server A is inaccessible but not yet marked as failed over.
Any help is greatly appreciated!
That's a pretty old version, but there are several things to clarify.
If you are performing caching you are probably using a memcached bucket, and in this case there is no replica.
Nodes are always considered attached to the cluster until they are explicitly removed by administrative action (autofailover attempts to automate this administrative action for you by attempting to remove the node from the cluster if it's determined to be down for n amount of time).
If the server is down (but not failed over), you will not get a "Cache Miss" per se, but some other kind of connectivity error from your client. Many older memcached clients do not make this distinction and simply return a NULL, False, or similar value for any kind of failure. I suggest you use a proper Couchbase client for your application which should help differentiate between the two.
As far as Couchbase is concerned, data routing for any kind of operation remains the same. So if you were not able to reach the item on Server A because it was not available, you will encounter this same issue upon attempting to store it back again. In other words, if you tried to get data from Server A and it was down, attempting to store data to Server A will fail in the exact same way, unless the server was failed over between the last fetch and the current storage attempt -- in which case the client will determine this and route the request to the appropriate server.
In "newer" versions of Couchbase (> 2.x) there is a special get-from-replica command available for use with couchbase (or membase)-style buckets which allows you to explicitly read information from a replica node. Note that you still cannot write to such a node, though.
Your overall strategy seems very sane for a cache, except that you need to understand that if a node is unavailable, then a certain percentage of your data will be unavailable (for both reads and writes) until the node is either brought back up again or failed over. There is no way around that while the node is down but not yet failed over.
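To make the miss-versus-connectivity-error distinction concrete, here is a rough cache-aside sketch using the current Couchbase Python SDK (the old 1.7-era memcached clients will not surface errors this cleanly); the cluster address, credentials and the fetch_from_sql helper are placeholders:
# Rough cache-aside sketch: only "document not found" counts as a cache miss;
# a timeout means the owning node is unreachable, so the follow-up write would
# most likely fail the same way. Addresses, credentials and fetch_from_sql are placeholders.
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from couchbase.exceptions import DocumentNotFoundException, TimeoutException

cluster = Cluster('couchbase://localhost',
                  ClusterOptions(PasswordAuthenticator('user', 'password')))
coll = cluster.bucket('default').default_collection()

def get_cached(key, fetch_from_sql):
    try:
        return coll.get(key).content_as[dict]
    except DocumentNotFoundException:
        # genuine cache miss: fall through to SQL and repopulate
        value = fetch_from_sql(key)
        try:
            coll.upsert(key, value)
        except TimeoutException:
            pass  # owning node unreachable; serve the SQL result without caching
        return value
    except TimeoutException:
        # node down but not yet failed over: do NOT treat this as a miss
        return fetch_from_sql(key)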

How can I control Cassandra replication?

When I make an insert into a specified keyspace, I want the data to be stored only on a specified node (or list of nodes). The information contained in the insert may be confidential and should not be distributed to arbitrary nodes.
I first thought about implementing my own AbstractReplicationStrategy, but it seems that the first node chosen depends on the token (selected by the partitioner) and not on the implemented strategy.
How can I be sure that the information contained in a keyspace only goes where I allow it to?
I don't think it is possible to do what you are asking. Cassandra actively tries to maintain a certain number of replicas of each piece of data. Even if you managed to force only a single node to store your insert (which is fairly straightforward), you'd have no control over which node that was (as you found, this is controlled by the partitioner), and if the node went down your data would be lost.
The short answer is that controlling replication is not the way to achieve data security; you should use proper security techniques such as encryption, segregated networks, controlled access, etc.
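As a concrete example of the last point, confidential values can be encrypted in the application before they ever reach Cassandra, so replica placement no longer exposes the plaintext. A rough sketch (keyspace, table and key management are placeholders):
# Rough sketch: encrypt the confidential value client-side before inserting.
# Keyspace/table names and key handling are placeholders; in practice the key
# would come from a KMS or secrets store, not be generated inline.
from cassandra.cluster import Cluster
from cryptography.fernet import Fernet

fernet = Fernet(Fernet.generate_key())

session = Cluster(['127.0.0.1']).connect('my_keyspace')

ciphertext = fernet.encrypt(b'confidential payload')
session.execute('INSERT INTO secrets (id, payload) VALUES (%s, %s)', (1, ciphertext))

row = session.execute('SELECT payload FROM secrets WHERE id = %s', (1,)).one()
print(fernet.decrypt(row.payload))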

Selective replication with CouchDB

I'm currently evaluating possible solutions to the following problem:
A set of data entries must be synchronized between multiple clients, where each client may only view (or even know about the existence of) a subset of the data.
Each client "owns" some of the elements, and the decision about who else can read or modify those elements may only be made by the owner. To complicate this situation even more, each element (and each element revision) must have a unique identifier that is equal for all clients.
While the latter sounds like a perfect task for CouchDB (and a document-based data model would fit my needs perfectly), I'm not sure if the authentication/authorization subsystem of CouchDB can handle these requirements: while it should be possible to restrict write access using validation functions, there doesn't seem to be a way to authorize read access. All solutions I've found for this problem propose routing all CouchDB requests through a proxy (or an application layer) that handles authorization.
So, the question is: is it possible to implement an authorization layer that filters requests to the database so that access is granted only to documents that the requesting client has read access to, and still use the replication mechanism of CouchDB? Simplified, this would be some kind of "selective replication" where only some of the documents, and not the whole database, is replicated.
I would also be thankful for pointers to some detailed information about how replication works. The CouchDB wiki and even the "Definitive Guide" book are not too specific about that.
This begs for replication filters. You filter outbound replication based on whatever criteria you impose, and give the owner of the target unrestricted access to their own copy.
I haven't had the opportunity to play with replication filters directly, but the idea would be that each doc would have some information about who has access to it, and the filtering mechanism would then allow outbound replication of only those documents that you have access to. Replication from the target back to the master would be unrestricted, allowing the master to remain a rollup copy and potentially multicast changes to overlapping sets of data.
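To sketch what that looks like in practice: store a filter function in a design document, then reference it when triggering replication. The database names, the "owner" field and the server URL/credentials below are assumptions:
# Rough sketch: a filter function in a design document plus a filtered replication
# via the _replicate endpoint. Database names, the "owner" field and credentials
# are assumptions.
import requests

COUCH = 'http://admin:password@localhost:5984'

# 1. design doc with a filter that only passes docs owned by the requested user
requests.put(f'{COUCH}/master/_design/acl', json={
    'filters': {
        'by_owner': "function(doc, req) { return doc.owner === req.query.owner; }"
    }
})

# 2. replicate master -> alice_db, passing the filter name and its query parameter
requests.post(f'{COUCH}/_replicate', json={
    'source': 'master',
    'target': 'alice_db',
    'create_target': True,
    'filter': 'acl/by_owner',
    'query_params': {'owner': 'alice'},
})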
What you are after is replication filters. According to Chris Anderson, it is a 0.11 feature.
"The current status is that there is an API for filtering the _changes feed. The replicator in 0.10 consumes the changes feed, so the next step is getting the replicator to use the filter API. There is work in progress on this, so it should be fully ready to go in 0.11."
See the original post.
Here is a newer link to some documentation about this:
http://blog.couchbase.com/what%E2%80%99s-new-apache-couchdb-011-%E2%80%94-part-three-new-features-replication
Indeed, as others have said, replication filters are the way to go for this. Here is a link with some information on using them.
One caveat I would add is that, at scale, replication filters can be extremely slow. More information about this and other nuances of CouchDB can be found in the excellent blog post "What every developer should know about CouchDB". For large-scale systems, performing replication in the application layer has proven faster and more reliable.