How to extract all matches as an array by parsing a JSON message using AWS CloudWatch Logs Insights?

I log messages that are JSON objects. Each JSON object has an array containing objects with a typeId:
{
  ...
  "arr": [{"typeId": 1, "foo": "bar"}, {"typeId": 10, "foo": "other"}, ...],
  ...
}
Now I want to count usage of objects by type by counting all occurrences of their corresponding ID in the array.
I've tried the following query:
filter #message like '"typeId":'
| parse #message '"typeId":*,' as id
| stats count(id) as objCount by id
| sort objCount desc
which produces this result:
id  | objCount
----+---------
113 |       28
1   |       12
296 |       10
2   |        9
But it looks wrong. It seems to extract only the very first ID value from each message. I know this because if I filter the logs with filter #message like '"typeId":113,' I get 44 occurrences.
How can I extract all matches using parse?
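One workaround, sketched outside Logs Insights: export the messages and count every occurrence with a regex, since re.findall (unlike a single parse capture) returns all matches per message. The sample messages below are invented:

```python
import re
from collections import Counter

# Hypothetical log messages; real ones would be the JSON objects described
# above, each possibly containing several "typeId" entries.
messages = [
    '{"arr": [{"typeId": 113, "foo": "bar"}, {"typeId": 1, "foo": "x"}]}',
    '{"arr": [{"typeId": 113, "foo": "other"}]}',
]

# re.findall returns every match in a string, not just the first one.
counts = Counter()
for msg in messages:
    counts.update(re.findall(r'"typeId":\s*(\d+)', msg))

print(counts.most_common())  # [('113', 2), ('1', 1)]
```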

Related

Parse/Ignore specific string in CloudWatch Logs Insights

I have the following AWS Cloudwatch query:
fields #timestamp, #message
| filter #message like /(?i)(error|except)/
| filter !ispresent(level) and !ispresent(eventType)
| stats count(*) as ErrorCount by #message
| sort ErrorCount desc
Results end up looking something like this with the message and a count:
The first 4 results are actually the same error. However, since they have different (node:*) values at the beginning of the message, they end up grouped as different errors.
Is there a way for the query to parse/ignore the (node:*) part so that the first 4 results would be considered a single result with a total count of 2,997?
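In Logs Insights itself the prefix can likely be split off with something like parse #message "(node:*) *" as node, msg | stats count(*) by msg, assuming the prefix always has that shape. The normalization idea itself can be sketched in Python; the error strings below are made up:

```python
import re
from collections import Counter

# Hypothetical messages that differ only in their "(node:<pid>)" prefix.
errors = [
    "(node:123) UnhandledPromiseRejection: connection reset",
    "(node:456) UnhandledPromiseRejection: connection reset",
    "(node:789) UnhandledPromiseRejection: connection reset",
    "(node:123) TypeError: foo is not a function",
]

# Strip the variable prefix so identical errors land in one bucket.
normalized = [re.sub(r"^\(node:\d+\)\s*", "", e) for e in errors]
counts = Counter(normalized)
print(counts.most_common())  # the three identical errors collapse into one
```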

what's the most efficient way to query all the messages in a group chat application?

i will use an example to illustrate my question.
you have a group-chat table that stores data about group chats:
id | name | owner_id
---+------+---------
33 | code |       45
you have a messages table that holds messages:
id | content | user_id | chat_room_id
---+---------+---------+-------------
 5 | "hello" |      41 |           33
 2 | "hi"    |      43 |           33
you have a users table that holds user information and which group chat they are part of:
id | name   | chat_room_id
---+--------+-------------
41 | "nick" |           33
43 | "mike" |           33
is this the right way to set up the database?
without joins or foreign keys, what's the most efficient way to load all the messages and user data in a form that lets you build a ui where the user data is displayed next to each message?
My solutions:
if you query the messages table and retrieve all the messages where chat_room_id equals 33, you get an array that looks like:
[
  {
    "id": 5,
    "user_id": 41,
    "content": "hello"
  },
  {
    "id": 2,
    "user_id": 43,
    "content": "hi"
  }
]
as you can see the user ids are part of the message object.
solution 1 (naive):
loop through the messages array and query the database once per user id.
this is a bad solution, since querying the database from inside a loop is never a good idea.
solution 2 (efficient, less data to send in the response):
loop through the messages array, collect an array of user ids, and use it in a single query with WHERE user_id IN (...).
then loop through the returned users and build a hash table keyed by user id, since it is unique.
on the front end, just loop through the messages array and look up each user.
is this solution going to be very slow if you have a large number of messages? will it scale well, since it's O(n)?
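solution 2 can be sketched in Python with an in-memory sqlite3 stand-in for the database (table name and data invented to match the example): one IN (...) query, then a dict for O(1) lookups on the front end.

```python
import sqlite3

# Hypothetical users table, populated in memory for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(41, "nick"), (43, "mike")])

messages = [
    {"id": 5, "user_id": 41, "content": "hello"},
    {"id": 2, "user_id": 43, "content": "hi"},
]

# One query for all distinct user ids instead of one query per message.
user_ids = sorted({m["user_id"] for m in messages})
placeholders = ",".join("?" * len(user_ids))
rows = conn.execute(
    f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
).fetchall()

# Hash table keyed by user id, as in solution 2.
users_by_id = {uid: name for uid, name in rows}
for m in messages:
    print(m["content"], "-", users_by_id[m["user_id"]])
```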
solution 3 (efficient, more data to send in the response):
the same as before, but here you add properties to each message object that store the user data.
the problem with this solution is duplicate data, since one user can publish multiple messages.
these are my solutions; i hope to hear yours.
for context: system design videos on youtube don't address this part of chat apps. if you found one that does, please post the link.

How to parse/retrieve a value from a JSON string field in Redshift/SQL

I have rows that look like this:
id  | json_list                                               | expected_result
----+---------------------------------------------------------+----------------
"1" | [{"id":"1", "text":"text1"},{"id":"3", "text":"text3"}] | "text1"
"2" | [{"id":"2", "text":"text2"},{"id":"3", "text":"text3"}] | "text2"
I want to retrieve the "text" field based on the id column. How can I achieve that in AWS Redshift? I know Redshift has some JSON functions and that this probably needs some kind of loop condition, but I wasn't sure if it's possible in SQL.
Please let me know if I have understood your question correctly, because the data is a bit confusing:
id column value - 2
your JSON value - [{"id":"1", "text":"text1"},{"id":"3", "text":"text3"}]
expected result - text1
Solution:
Step 1 - iterate through the JSON array elements:
select json_extract_array_element_text('[{"id":"1", "text":"text1"},{"id":"3", "text":"text3"}]',1)
Step 2 - parse the key/value pairs of a particular JSON element:
select json_extract_path_text(json_extract_array_element_text('[{"id":"1", "text":"text1"},{"id":"3", "text":"text3"}]',1),'text')
So, suppose I have a table -
create table dev.gp_test1_20200731
(
    id int,
    json_list varchar(1000),
    expected_result varchar(100)
)
Inserting Data -
insert into dev.gp_test1_20200731
values
(1,'[{"id":"1", "text":"text1"},{"id":"3", "text":"text3"}]', 'text1'),
(2,'[{"id":"2", "text":"text2"},{"id":"3", "text":"text3"}]', 'text2')
This is what the data looks like after inserting.
This is how the query would be (using array index 0, since the expected text for each row sits in the first array element) -
select json_extract_path_text(json_extract_array_element_text(json_list,0),'text')
from dev.gp_test1_20200731
where id = 2
Result - text2
But storing JSON in Redshift is not good practice.
Documentation on why - Link
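For what it's worth, the Redshift query hardcodes an array position; when the matching element's position varies per row, the lookup the question describes is straightforward in application code. A Python sketch over the same sample rows:

```python
import json

# The same sample rows as in the question, as plain Python dicts.
rows = [
    {"id": "1", "json_list": '[{"id":"1", "text":"text1"},{"id":"3", "text":"text3"}]'},
    {"id": "2", "json_list": '[{"id":"2", "text":"text2"},{"id":"3", "text":"text3"}]'},
]

def text_for_id(row):
    """Return the "text" of the array element whose "id" matches the row id."""
    for element in json.loads(row["json_list"]):
        if element["id"] == row["id"]:
            return element["text"]
    return None

for row in rows:
    print(row["id"], text_for_id(row))  # 1 text1, then 2 text2
```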

Need Splunk query for finding common elements between two fields when each field is a list

Each of my events is a JSON object like the one below, indexed by Splunk. How can I write a Splunk query that finds all failures present in both the "failed" and "passed" arrays?
"output":{
"date" : "21-09-2017"
"failed": [ "fail_1", **"fail_2"** ],
"passed": [ "pass_1", "pass_2" , **"fail_2"**]
}
For the above example, the result would be "fail_2".
You can do something like:
| makeresults
| eval x = "{\"output\":{\"date\" : \"21-09-2017\",\"failed\": [ \"fail_1\", \"fail_2\"],\"passed\": [ \"pass_1\", \"pass_2\" , \"fail_2\"]}}"
| eval x = mvappend(x,"{\"output\":{\"date\" : \"21-09-2017\",\"failed\": [ \"f_1\", \"f_2\"],\"passed\": [ \"f_1\", \"pass_2\" , \"f_2\"]}}")
| mvexpand x
| streamstats count as id
| spath input=x
| rename "output.failed{}" as failed, "output.passed{}" as passed, "output.date" as date
| mvexpand failed
| eval common_field = if(isnotnull(mvfind(passed, failed)),failed,null)
| stats values(date) as date, values(failed) as failed, values(passed) as passed, values(common_field) as common_field by id
The example contains 2 sample log events in which failed and passed share common values. streamstats is used to assign a unique id to each event, since I did not see one in your sample. spath parses the JSON object into fields. Once that is done, mvexpand creates one row for each value of failed. mvfind is then used to find the values of the failed field that match any of the values of the passed field. Finally, the related rows are combined again using the assigned unique id.
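The heart of the query is a multivalue intersection; the same computation on the sample event can be sketched in plain Python:

```python
# The sample event from the question, as a Python dict.
output = {
    "date": "21-09-2017",
    "failed": ["fail_1", "fail_2"],
    "passed": ["pass_1", "pass_2", "fail_2"],
}

# Values present in both lists, mirroring the mvfind/if step above.
common = sorted(set(output["failed"]) & set(output["passed"]))
print(common)  # ['fail_2']
```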

How to convert recordset to json array in postgres

i have a table like:
name | age
-----+----
abc  |  20
pqr  |  30
I want the result as a JSON array like:
[
  {
    "name": "abc",
    "age": 20
  },
  {
    "name": "pqr",
    "age": 30
  }
]
I know there is a method
row_to_json();
but that gives me only a single row as JSON and I want an array,
please help me with this
select json_agg(row_to_json(t))
from t
;
json_agg
----------------------------------------------------
[{"name":"abc","age":20}, {"name":"pqr","age":30}]
You can try:
SELECT array_to_json(array_agg(row_to_json(data)))
FROM (select name, age from your_table) data
You can also read more at this link:
https://hashrocket.com/blog/posts/faster-json-generation-with-postgresql
Hope it's useful to you.
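For comparison, what json_agg(row_to_json(t)) builds can be sketched in plain Python from the sample rows (the rows here are hardcoded rather than fetched from Postgres):

```python
import json

# Sample rows as they would come back from "SELECT name, age FROM t".
rows = [("abc", 20), ("pqr", 30)]

# Equivalent of json_agg(row_to_json(t)): one JSON array of row objects.
result = json.dumps([{"name": n, "age": a} for n, a in rows])
print(result)  # [{"name": "abc", "age": 20}, {"name": "pqr", "age": 30}]
```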