How to use a wildcard in Redis search

I have the following hashes:
HMSET rules:1231231234_11:00_17:00 fw 4444 dm test.abc.com days 'thu, tue, wed'
HMSET rules:1231231234_9:00_10:59 fw 2211 dm anothertest.abc.com days 'thu'
Is there any way I can search the rules hashes and find all records whose keys have the prefix 1231231234?
Something like
HGET rules:1231231234*
OR... perhaps the way I've created the data is wrong. What's the best way to create a data set like this:
(JSON notation)
{
  "pn": 1231231234,
  "rules": [
    {
      "expiration_date": "",
      "days_of_week": "Thu, Tue, Wed",
      "start_time": "11:00",
      "end_time": "17:00",
      "fw": "9999"
    },
    {
      "expiration_date": "",
      "days_of_week": "Thu",
      "start_time": "9:00",
      "end_time": "10:59",
      "fw": "2222"
    }
  ]
}
How this data will be used:
I will need to find the rule that applies to me, based on the current time.
So for example, when my application gets a request to "process" pn 1231231234, I need to look up all rules for that pn number, and then find which rule matches my current day of week and timestamp.
I don't mind getting back all the rules for a given pn and then having the client code loop through to find the right rule.
EDIT 1
Using my data as it is currently created, I tried HSCAN like this:
127.0.0.1:6379[1]> HSCAN rules 0 MATCH 1231231234*
1) "0"
2) (empty list or set)
127.0.0.1:6379[1]>
EDIT 2
As a test, I tried this type of a structure instead:
HMSET rules:1231231234 tue_11:00_17:00 "fw 9999"
HMSET rules:1231231234 wed_11:00_17:00 "fw 9999"
HMSET rules:1231231234 thu_11:00_17:00 "fw 9999"
HMSET rules:1231231234 thu_9:00_10:59 "fw 2222"
Then I can just fetch all rules for the main pn and use my client app to loop through the results... ?

You need to use SCAN instead of HSCAN: HSCAN iterates the fields of a single hash, while SCAN iterates the keys in the database.
Combining SCAN and HGETALL you can achieve this.
1) Do a SCAN and get all the keys matching your pattern:
127.0.0.1:6379> scan 0 match rules:1231231234*
1) "0"
2) 1) "rules:1231231234_11:00_17:00"
2) "rules:1231231234_9:00_10:59"
2) Then, in your app logic, iterate over those keys and do an HGETALL on each:
127.0.0.1:6379> hgetall rules:1231231234_11:00_17:00
1) "fw"
2) "4444"
3) "dm"
4) "test.abc.com"
5) "days"
6) "thu, tue, wed"
3) If a hash matches your criteria, process it.
4) Repeat for the remaining keys.
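In app code, the same loop might look like this; a minimal sketch with the redis-py client, assuming a local Redis (the day check is just a stand-in for your real matching logic):

import redis

r = redis.Redis(decode_responses=True)

# scan_iter keeps calling SCAN until the cursor returns 0, so no page is missed
for key in r.scan_iter(match="rules:1231231234*"):
    rule = r.hgetall(key)  # e.g. {'fw': '4444', 'dm': 'test.abc.com', 'days': 'thu, tue, wed'}
    if "thu" in rule.get("days", ""):  # stand-in for your day-of-week/time matching
        print(key, rule["fw"])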
Hope this helps

Related

How to search an array of objects in RedisJSON using RediSearch

I have added a JSON document to my Redis DB using JSON.SET. Below are the document and my JSON.SET command:
{
  "books": [
    {
      "title": "Peter Pan",
      "price": 8.95
    },
    {
      "title": "Moby Dick",
      "price": 12.99
    }
  ]
}
JSON.SET myDoc $ '{"books": [{"title": "Peter Pan", "price": 8.95},{"title": "A Thousand Suns", "price": 8.15}, {"title": "Moby Dick", "price": 12.99}]}'
Now I want to search this specific array of objects to fetch the records with a price greater than or equal to 8.
I have tried creating an index, but it always returns 0 records. Below is my index:
FT.CREATE docIdx ON JSON SCHEMA $.myDoc.books.price AS price Numeric
The way you propose the index, you're ignoring that books is an array of objects, and the JSONPath should start at the document root ($), not at the key name. One way could be:
FT.CREATE docIdx ON JSON SCHEMA $.books[*].price AS price NUMERIC
or
FT.CREATE docIdx ON JSON SCHEMA $.books[0:].price AS price NUMERIC
Another recommendation is to use a prefix, since all documents whose keys share that prefix will be indexed by RediSearch.
Something like:
FT.CREATE docIdx ON JSON PREFIX 1 myDoc SCHEMA $.books[*].price AS price NUMERIC
Notice that since you are matching the condition within a single document, when you search for your price condition (between 8 and 12) the result will be the entire document myDoc with all the books, because "Peter Pan" and "A Thousand Suns" match the query condition. Such as:
> FT.SEARCH docIdx "@price:[8 12]"
1) "1"
2) "myDoc"
3) 1) "$"
2) "{\"books\":[{\"title\":\"Peter Pan\",\"price\":8.95},{\"title\":\"A Thousand Suns\",\"price\":8.15},{\"title\":\"Moby Dick\",\"price\":12.99}]}"

Is there a way to get results with the same data type as the column, using ServiceStack.Redis with the RediSQL package?

I connect to Redis with the ServiceStack.Redis package in C# using the statements below.
var redisManager = new RedisManagerPool("127.0.0.1:6380");
var redis = redisManager.GetClient();
redis.Custom("DEL", "DB");
redis.Custom("REDISQL.CREATE_DB", "DB");
redis.Custom("REDISQL.EXEC", "DB", "CREATE TABLE TABLE1(A INT, B TEXT);");
redis.Custom("REDISQL.CREATE_STATEMENT", "DB", "INSERTINTOTABLE1STMT", "INSERT INTO TABLE1 VALUES(?1,?2)");
redis.Custom("REDISQL.EXEC_STATEMENT", "DB", "INSERTINTOTABLE1STMT", 1, "Value1");
redis.Custom("REDISQL.EXEC_STATEMENT", "DB", "INSERTINTOTABLE1STMT", 2, "Value2");
var res = redis.Custom("REDISQL.EXEC", "DB", "SELECT * FROM TABLE1");
Queries execute fine. Column 'A' of TABLE1 is defined as an integer, but the final command, which retrieves all rows from TABLE1, returns the data of column 'A' as text, because the result is of type RedisText.
Is there a way to get the results with the same data type as the column?
127.0.0.1:6379> REDISQL.V2.CREATE_DB DB
1) 1) "OK"
127.0.0.1:6379> REDISQL.EXEC DB COMMAND "CREATE TABLE TABLE1(A INT, B TEXT);"
1) 1) "DONE"
2) 1) (integer) 0
127.0.0.1:6379> REDISQL.EXEC DB COMMAND "INSERT INTO TABLE1 VALUES (1,'A'),(2,'B');"
1) 1) "DONE"
2) 1) (integer) 2
127.0.0.1:6379> REDISQL.EXEC DB COMMAND "SELECT * FROM TABLE1;"
1) 1) "RESULT"
2) 1) "A"
2) "B"
3) 1) "INT"
2) "TEXT"
4) 1) (integer) 1
2) "A"
5) 1) (integer) 2
2) "B"
127.0.0.1:6379> REDISQL.EXEC DB COMMAND "SELECT * FROM TABLE1;" json
"{\"rows\":[{\"A\":1,\"B\":\"A\"},{\"A\":2,\"B\":\"B\"}],\"number_of_rows\":2,\"columns\":{\"A\":\"INT\",\"B\":\"TEXT\"}}"
So, as you mention, the problem is ServiceStack, which returns a plain RedisText instead of interpreting the typed data in the Redis protocol.
In the example above I am using RediSQL V2 (now known as zeeSQL, https://zeesql.com), which also returns the types of the columns.
Moreover, it can return JSON directly, with the correct typing.
More info about all the commands in zeeSQL are here: https://doc.zeesql.com/references
Unfortunately I don't believe there is much we can do, besides making a suggestion to the ServiceStack maintainers.
The obvious suggestion is to switch to RediSQL V2 / zeeSQL. We maintain perfect backward compatibility with RediSQL V1.
You just have to prefix your commands with REDISQL.V1 instead of just REDISQL.
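To show what consuming the typed JSON output looks like from a client, here is a hedged sketch using redis-py rather than ServiceStack (execute_command passes the module command through verbatim; the DB and table from the question are assumed to already exist):

import json
import redis

r = redis.Redis(decode_responses=True)
raw = r.execute_command("REDISQL.EXEC", "DB", "COMMAND", "SELECT * FROM TABLE1;", "json")
result = json.loads(raw)
for row in result["rows"]:
    print(row["A"] + 1, row["B"])  # "A" arrives as a real integer, not a string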

Using Athena to get terminatingrule from rulegrouplist in AWS WAF logs

I followed these instructions to get my AWS WAF data into an Athena table.
I would like to query the data to find the latest requests with an action of BLOCK. This query works:
SELECT
  from_unixtime(timestamp / 1000e0) AS date,
  action,
  httprequest.clientip AS ip,
  httprequest.uri AS request,
  httprequest.country AS country,
  terminatingruleid,
  rulegrouplist
FROM waf_logs
WHERE action = 'BLOCK'
ORDER BY date DESC
LIMIT 100;
My issue is cleanly identifying the "terminatingrule" - the reason the request was blocked. As an example, a result has
terminatingrule = AWS-AWSManagedRulesCommonRuleSet
And
rulegrouplist = [
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesAmazonIpReputationList",
    "terminatingrule": "null",
    "excludedrules": "null"
  },
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesKnownBadInputsRuleSet",
    "terminatingrule": "null",
    "excludedrules": "null"
  },
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesLinuxRuleSet",
    "terminatingrule": "null",
    "excludedrules": "null"
  },
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesCommonRuleSet",
    "terminatingrule": {
      "rulematchdetails": "null",
      "action": "BLOCK",
      "ruleid": "NoUserAgent_HEADER"
    },
    "excludedrules": "null"
  }
]
The piece of data I would like separated into a column is rulegrouplist[terminatingrule].ruleid, which has the value NoUserAgent_HEADER.
AWS provides useful information on querying nested Athena arrays, but I have been unable to get the result I want.
I have framed this as an AWS question but since Athena uses SQL queries, it's likely that anyone with good SQL skills could work this out.
It's not entirely clear to me exactly what you want, but I'm going to assume you are after the array element where terminatingrule is not "null" (I will also assume that if there are multiple you want the first).
The documentation you link to says that the type of the rulegrouplist column is array<string>. The reason it is string and not a complex type is that there seem to be multiple different schemas for this column, one example being that the terminatingrule property is either the string "null" or a struct/object – something that can't be described using Athena's type system.
This is not a problem, however. When dealing with JSON there's a whole set of JSON functions that can be used. Here's one way to use json_extract combined with filter and element_at to remove array elements where the terminatingrule property is the string "null" and then pick the first of the remaining elements:
SELECT
  element_at(
    filter(
      rulegrouplist,
      rulegroup -> json_extract(rulegroup, '$.terminatingrule') <> CAST('null' AS JSON)
    ),
    1
  ) AS first_non_null_terminatingrule
FROM waf_logs
WHERE action = 'BLOCK'
ORDER BY timestamp DESC
You say you want the "latest", which to me is ambiguous and could mean both first non-null and last non-null element. The query above will return the first non-null element, and if you want the last you can change the second argument to element_at to -1 (Athena's array indexing starts from 1, and -1 is counting from the end).
To return the individual ruleid element of the JSON:
SELECT
  from_unixtime(timestamp / 1000e0) AS date,
  action,
  httprequest.clientip AS ip,
  httprequest.uri AS request,
  httprequest.country AS country,
  terminatingruleid,
  json_extract(
    element_at(
      filter(
        rulegrouplist,
        rulegroup -> json_extract(rulegroup, '$.terminatingrule') <> CAST('null' AS JSON)
      ),
      1
    ),
    '$.terminatingrule.ruleid'
  ) AS ruleid
FROM waf_logs
WHERE action = 'BLOCK'
ORDER BY date DESC
I had the same issue but the solution posted by Theo didn't work for me, even though the table was created according to the instructions linked to in the original post.
Here is what worked for me, which is basically the same as Theo's solution, but without the JSON conversion:
SELECT
  from_unixtime(timestamp / 1000e0) AS date,
  action,
  httprequest.clientip AS ip,
  httprequest.uri AS request,
  httprequest.country AS country,
  terminatingruleid,
  rulegrouplist,
  element_at(
    filter(ruleGroupList, ruleGroup -> ruleGroup.terminatingRule IS NOT NULL),
    1
  ).terminatingRule.ruleId AS ruleId
FROM waf_logs
WHERE action = 'BLOCK'
ORDER BY date DESC
LIMIT 100;

Complex data structures Redis

Let's say I have a hash of hashes, e.g.
$data = {
  'harry' : {
    'age' : 25,
    'weight' : 75,
  },
  'sally' : {
    'age' : 25,
    'weight' : 75,
  }
}
What would be the 'usual' way to store such a data structure (or would you not?)
Would you be able to directly get a value (e.g. get harry : age)?
Once stored, could you directly change the value of a sub-key (e.g. set sally : weight = 100)?
What would be the 'usual' way to store such a data structure (or would you not?)
For example, harry and sally would each be stored in a separate hash, where fields represent their properties, like age and weight. Then a set structure would hold all the members (harry, sally, ...) that you have stored in Redis.
Would you be able to directly get a value (e.g. get harry : age)?
Yes, see HGET or HMGET or HGETALL.
Once stored, could you directly change the value of a sub-key (e.g. sally : weight = 100)?
Yes, see HSET.
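A minimal sketch of that layout with the redis-py client (the key names person:harry, person:sally, and persons are made up for illustration):

import redis

r = redis.Redis(decode_responses=True)

# one hash per person, one field per property
r.hset("person:harry", mapping={"age": 25, "weight": 75})
r.hset("person:sally", mapping={"age": 25, "weight": 75})
# a set tracks which members exist
r.sadd("persons", "harry", "sally")

print(r.hget("person:harry", "age"))   # directly read one value -> '25'
r.hset("person:sally", "weight", 100)  # directly update one sub-key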
Let's take complex data that we have to store in Redis, for example this:
$data = {
  "user:1" : {
    name : "sally",
    password : "123",
    logs : ["25th october", "30th october", "12 sept"],
    friends : ["34", "24", "10"]
  },
  "user:2" : {
    name : "",
    password : "4567",
    logs : [],
    friends : []
  }
}
The problem we face is that friends and logs are lists.
So what we can do to represent this data in Redis is use hashes and lists, something like this:
Option 1: A hash map with keys user:1 and user:2
hmset user:1 name "sally" password "12344"
hmset user:2 name "pally" password "232342"
Create a separate list of logs, such as:
logs:1 (here 1 is the user id)
lpush logs:1 "" "" ""
lpush logs:2 "" "" ""
and similarly for friends.
Option 2: A hash map with dumped JSON data as encoded strings
hmset user:1 name "sally" password "12344" logs "String_dumped_data" friends "string of dumped data"
Option 3: This is another representation of #1
something like user:1:friends -> as a list
and user:2:friends -> as a list
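A hedged redis-py sketch of Options 1 and 3, using the placeholder values from above:

import redis

r = redis.Redis(decode_responses=True)

# Option 1: one hash per user for the scalar fields
r.hset("user:1", mapping={"name": "sally", "password": "123"})

# separate per-user lists for the nested collections
r.rpush("logs:1", "25th october", "30th october", "12 sept")

# Option 3: the same idea, with the list key scoped under the user
r.rpush("user:1:friends", "34", "24", "10")

print(r.lrange("user:1:friends", 0, -1))  # ['34', '24', '10']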
Please correct me if I'm wrong.
It depends on what you want to do, but if your data structure is not nested any deeper and you need access to each field, I would recommend using hashes: http://redis.io/commands#hash
Here is a good overview of the Redis data types, each with pros and cons: http://redis.io/topics/data-types

Effect of MongoDB _id generation on indexing

I am using MongoDB as a database.
I am going to generate an _id for each document; for that I use the userId and the folderId for that user.
Here userId is different for each user, and each user also has different folderIds.
I generate the _id as:
userId = "user1"
folderId = "Folder1"
_id = userId + folderId
Is there any effect of this _id generation on MongoDB indexing?
Will it be as fast as the _id generated by MongoDB?
A much better solution would be to leave the _id field as it is and have separate userId and folderId fields in your document, or create a separate field with them both combined.
As for whether it will be "as fast": it depends on your query, but for ordering by the document's creation date, for example, you'd lose the ability to simply order by _id; you'd also lose the benefits for sharding and distribution.
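A hedged PyMongo sketch of that suggestion (the database, collection, and field names here are made up for illustration):

from pymongo import ASCENDING, MongoClient

coll = MongoClient()["mydb"]["folders"]  # hypothetical db/collection names

# index the two fields together; unique=True mirrors the one-id-per-pair intent
coll.create_index([("userId", ASCENDING), ("folderId", ASCENDING)], unique=True)

coll.insert_one({"userId": "user1", "folderId": "Folder1"})
doc = coll.find_one({"userId": "user1", "folderId": "Folder1"})
print(doc["_id"])  # the default ObjectId, still sortable by creation time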
However, if you want to use both those IDs for your _id, there is one other option...
You can actually use both but leave them separate... for example, this is a valid _id:
> var doc = { "_id" : { "userID" : 12345, "folderID" : 5152 },
"field1" : "test", "field2" : "foo" };
> db.crazy.save(doc);
> db.crazy.findOne();
{
"_id" : {
"userID" : 12345,
"folderID" : 5152
},
"field1" : "test",
"field2" : "foo"
}
>
It should be fine - the one foreseeable issue is that you'll lose the ability to reverse out the date/timestamp from the Mongo ObjectId. Why not just add another ID object within the document? You're only losing a few bytes, and you're not screwing with the built-in indexing system.
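To make that trade-off concrete, the timestamp being given up is the one embedded in a default ObjectId; a small sketch with PyMongo's bson package:

from bson import ObjectId

oid = ObjectId()
print(oid.generation_time)  # the creation time embedded in a default ObjectId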