I am ingesting some custom logs to Azure LogAnalytics. One of the columns contains nested json objects. I would like to return each nested object to a separate column value.
Was trying the mvexpand statement but have not had any luck.
customLog_CL
| extend test = parsejson(target_s)
| mvexpand test
The column data looks like below.
[ { "id": "00phb49dl40lBsasC0h7", "type": "PolicyEntity", "alternateId": "unknown", "displayName": "Default Policy", "detailEntry": "#{policyType=hello}" }, { "id": "0pri9mxp9vSc4lpiU0h7", "type": "PolicyRule", "alternateId": "00phb49dl40lBsasC0h7", "displayName": "All Users Login", "detailEntry": null } ]
I'm in the exact same situation, so hopefully we can share the knowledge.
I ended up doing something like this, if it's the correct way of doing it, or I have any bugs, I honestly can't tell you right now (still doing my data validation, so I'll update later on), but this should at least get you started.
customLog_CL
| mvexpand parsejson(target_s)
| extend Id=target_s["id"]
| extend type=target_s["type"]
| extend OtherId=target_s["alternateId"]
| project Id, type, OtherId
This should work:
datatable(d:dynamic)
[
dynamic(
[
{ "id": "00phb49dl40lBsasC0h7", "type": "PolicyEntity", "alternateId": "unknown", "displayName": "Default Policy", "detailEntry": "#{policyType=hello}" },
{ "id": "0pri9mxp9vSc4lpiU0h7", "type": "PolicyRule", "alternateId": "00phb49dl40lBsasC0h7", "displayName": "All Users Login", "detailEntry": "" }
]
)
]
| mv-expand(d)
| project key = tostring(d['id']), value = d
| extend p = pack(key, value)
| summarize bag = make_bag(p)
| evaluate bag_unpack(bag)
Output
Please check if this fits your requirement.
let hosts_object = parsejson('{"hosts": [ { "id": "00phb49dl40lBsasC0h7", "type": "PolicyEntity", "alternateId": "unknown", "displayName": "Default Policy", "detailEntry": "#{policyType=hello}" }, { "id": "0pri9mxp9vSc4lpiU0h7", "type": "PolicyRule", "alternateId": "00phb49dl40lBsasC0h7", "displayName": "All Users Login", "detailEntry": null } ]}');
print hosts_object
| extend json1 = hosts_object.hosts[0] , json2 = hosts_object.hosts[1]
Output for this should be as below
Additional Documentation Reference
Hope this helps.
Related
Hi Can someone help me simulate this scenario, Example this is the response I got, I want to extract all alertId with the name parameter contains test. You response is highly appreciated. Thank you so much.
Response:
[
{
"duplicateCount": 0,
"fqdn": "qa-ubuntu14-4",
"appName": "TEST_APD_UB14",
"stateString": "OPEN",
"category": "FILESCAN",
"alkey": {
"agentId": "8470ea64-a710-3e46-ba6b-ccd37ebc4074",
"role": "AD SERVER",
"alertId": "0258a7ca-bc72-3a53-aa98-3098c87411ba",
"id": "6695a7fa-ab9f-43fa-871b-620cd1eeb75054af7770-604b-11e9-b486-8d59ab9344597cea0ea2-d897-3696-852d-5f3cb36f270e8470ea64-a710-3e46-ba6b-ccd37ebc4074/var/log/test321.txttest321.txtA",
"applicationContextId": "7cea0ea2-d897-3696-852d-5f3cb36f270e"
},
"properties": {
"name": "test321.txt",
"acl": ""
}
},
{
"duplicateCount": 0,
"fqdn": "qa-ubuntu14-4",
"appName": "TEST_APD_UB18",
"stateString": "OPEN",
"category": "FILESCAN",
"alkey": {
"agentId": "8470ea64-a710-3e46-ba6b-ccd37ebc4074",
"role": "AD SERVER",
"alertId": "0258a7ca-bc72-3a53-aa98-3098c8741CDA",
"id": "6695a7fa-ab9f-43fa-871b-620cd1eeb75054af7770-604b-11e9-b486-8d59ab9344597cea0ea2-d897-3696-852d-5f3cb36f270e8470ea64-a710-3e46-ba6b-ccd37ebc4074/var/log/test321.txttest321.txtA",
"applicationContextId": "7cea0ea2-d897-3696-852d-5f3cb36f270e"
},
"properties": {
"name": "test555.txt",
"acl": ""
}
}
]
Screenshot:
Expected Result:
I want to extract all alertId with the name parameter contains test
You could use the following JSON query to extract the values:
[*].[?(#.properties.name contains 'test')]alkey.agentId
I found this reference with JSON Path Syntax is really useful.
I'm trying to combine 2 select queries on 2 different tables which have a foreign key in common project_id included with a condition and returned in a single result set with the project_id a json_array called sprints and a json_array called backlog. The output should look something like this.
{
"id": "1920c79d-69d7-4b63-9662-ed5333e9b735",
"name": "Test backend v1",
"backlog_items": [
{
"id": "961b2438-a16b-4f30-83f1-723a05592d68",
"name": "Another User Story 1",
"type": "User Story",
"backlog": true,
"s3_link": null,
"sprint_id": null
},
{
"id": "a2d93017-ab87-4ec2-9589-71f6cebba936",
"name": "New Comment",
"type": "Comment",
"backlog": true,
"s3_link": null,
"sprint_id": null
}
],
"sprints": [
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62ed4",
"name": "Test name 2",
"sprint_items": [
{
"id": "1285825b-1669-40f2-96b8-de02ec80d8bd",
"name": "As an admin I should be able to delete an organization",
"type": "User Story",
"backlog": false,
"s3_link": null,
"sprint_id": "1cd165c7-68f7-4a1d-b018-609989d62ed4"
}
]
},
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62f44",
"name": "Test name 1",
"sprint_items": []
}
]
}
In case there are no backlog items associated with the project_id or sprints with the project_id I want to return an empty list. I figured using Postgres COALESCE function might help in this case but I'm not sure how to use is to achieve what I want.
Sprint table
id | end_date | start_date | project_id | name
--------------------------------------+----------+------------+--------------------------------------+-------------
1cd165c7-68f7-4a1d-b018-609989d62ed4 | | | 1920c79d-69d7-4b63-9662-ed5333e9b735 | Test name 2
Sprint item table
id | sprint_id | name | type | s3_link | backlog | project_id
--------------------------------------+--------------------------------------+--------------------------------------------------------+------------+---------+---------+--------------------------------------
961b2438-a16b-4f30-83f1-723a05592d68 | | Another User Story 1 | User Story | | t | 1920c79d-69d7-4b63-9662-ed5333e9b735
a2d93017-ab87-4ec2-9589-71f6cebba936 | | New Comment | Comment | | t | 1920c79d-69d7-4b63-9662-ed5333e9b735
1285825b-1669-40f2-96b8-de02ec80d8bd | 1cd165c7-68f7-4a1d-b018-609989d62ed4 | As an admin I should be able to delete an organization | User Story | | f | 1920c79d-69d7-4b63-9662-ed5333e9b735
The query I'm using right now which returns multiple duplicates in the result set.
with si as (
select si.id, si.name, si.backlog, si.project_id
from sprint_items si
), s as (
select s.id, s.name, s.project_id, jsonb_agg(to_jsonb(si) - 'project_id') as sprint_items
from sprints s
left join sprint_items si
on si.sprint_id = s.id
group by s.id, s.name, s.project_id
), p as (
select p.id, p.name, jsonb_agg(to_jsonb(s) - 'project_id') as sprints,
jsonb_agg(to_jsonb(case when si.backlog = true then si end) - 'project_id') as backlog_items
from projects p
left join s
on s.project_id = p.id
left join si
on si.project_id = p.id
group by p.id, p.name
)
select to_jsonb(p) from p
where p.id = '1920c79d-69d7-4b63-9662-ed5333e9b735'
Updated
This is what the above query is producing in terms of duplicating the sprint items and sprints
{
"id": "1920c79d-69d7-4b63-9662-ed5333e9b735",
"name": "Test backend v1",
"sprints": [
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62ed4",
"name": "Test name 2",
"sprint_items": [
{
"id": "1285825b-1669-40f2-96b8-de02ec80d8bd",
"name": "As an admin I should be able to delete an organization",
"type": "User Story",
"backlog": false,
"s3_link": null,
"sprint_id": "1cd165c7-68f7-4a1d-b018-609989d62ed4"
}
]
},
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62ed4",
"name": "Test name 2",
"sprint_items": [
{
"id": "1285825b-1669-40f2-96b8-de02ec80d8bd",
"name": "As an admin I should be able to delete an organization",
"type": "User Story",
"backlog": false,
"s3_link": null,
"sprint_id": "1cd165c7-68f7-4a1d-b018-609989d62ed4"
}
]
},
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62ed4",
"name": "Test name 2",
"sprint_items": [
{
"id": "1285825b-1669-40f2-96b8-de02ec80d8bd",
"name": "As an admin I should be able to delete an organization",
"type": "User Story",
"backlog": false,
"s3_link": null,
"sprint_id": "1cd165c7-68f7-4a1d-b018-609989d62ed4"
}
]
},
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62f44",
"name": "Test name 1",
"sprint_items": [
null
]
},
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62f44",
"name": "Test name 1",
"sprint_items": [
null
]
},
{
"id": "1cd165c7-68f7-4a1d-b018-609989d62f44",
"name": "Test name 1",
"sprint_items": [
null
]
}
],
"backlog_items": [
null,
{
"id": "961b2438-a16b-4f30-83f1-723a05592d68",
"name": "Another User Story 1",
"backlog": true
},
{
"id": "a2d93017-ab87-4ec2-9589-71f6cebba936",
"name": "New Comment",
"backlog": true
},
null,
{
"id": "961b2438-a16b-4f30-83f1-723a05592d68",
"name": "Another User Story 1",
"backlog": true
},
{
"id": "a2d93017-ab87-4ec2-9589-71f6cebba936",
"name": "New Comment",
"backlog": true
}
]
}
Any pointers to what functions I should read up would be greatly appreciated.
I am a Mulesoft newbie trying to dynamically add data to a JSON list according to the payload but currently I am stuck on it.
Below is an example of what I pretend to return:
{
"id": "1",
"name": "Example A",
"languages": [
{
"code": "en",
"label": "English",
"translatedName": "Example A"
}
]
},
{
"id": "2",
"name": "Example B",
"languages": [
{
"languageCode": "nl",
"translatedName": "Voorbeeld B"
},
{
"languageCode": "fr",
"translatedName": "Exemple B"
}
]
}
As you can see, the languages depend on the name (Example A has 1 language, Example B has 2).
Am I able to build a map, or a function, in order to return the correct languages?
Best regards and thank you in advance!
Suppose I have the following JSON, which is the result of parsing urls parameters from a log file.
{
"title": "History of Alphabet",
"author": [
{
"name": "Larry"
},
]
}
{
"title": "History of ABC",
}
{
"number_pages": "321",
"year": "1999",
}
{
"title": "History of XYZ",
"author": [
{
"name": "Steve",
"age": "63"
},
{
"nickname": "Bill",
"dob": "1955-03-29"
}
]
}
All the fields in top-level, "title", "author", "number_pages", "year" are optional. And so are the fields in the second level, inside "author", for example.
How should I make a schema for this JSON when loading it to BQ?
A related question:
For example, suppose there is another similar table, but the data is from different date, so it's possible to have different schema. Is it possible to query across these 2 tables?
How should I make a schema for this JSON when loading it to BQ?
The following schema should work. You may want to change some of the types (e.g. maybe you want the dob field to be a TIMESTAMP instead of a STRING), but the general structure should be similar. Since types are NULLABLE by default, all of these fields should handle not being present for a given row.
[
{
"name": "title",
"type": "STRING"
},
{
"name": "author",
"type": "RECORD",
"fields": [
{
"name": "name",
"type": "STRING"
},
{
"name": "age",
"type": "STRING"
},
{
"name": "nickname",
"type": "STRING"
},
{
"name": "dob",
"type": "STRING"
}
]
},
{
"name": "number_pages",
"type": "INTEGER"
},
{
"name": "year",
"type": "INTEGER"
}
]
A related question: For example, suppose there is another similar table, but the data is from different date, so it's possible to have different schema. Is it possible to query across these 2 tables?
It should be possible to union two tables with differing schemas without too much difficulty.
Here's a quick example of how it works over public data (kind of a silly example, since the tables contain zero fields in common, but shows the concept):
SELECT * FROM
(SELECT * FROM publicdata:samples.natality),
(SELECT * FROM publicdata:samples.shakespeare)
LIMIT 100;
Note that you need the SELECT * around each table or the query will complain about the differing schemas.
It could be that I've created my index wrong, but I have a lead index with variable field names that I need to search through. I created a sub object called fields that contains name and value. Sample:
[
{
"name": "first_name",
"value": "XXX"
},
{
"name": "last_name",
"value": "XXX"
},
{
"name": "email",
"value": "X0#yahoo.com"
},
{
"name": "address",
"value": "X Thomas RD Apt 1023"
},
{
"name": "city",
"value": "phoenix"
},
{
"name": "state",
"value": "AZ"
},
{
"name": "zip",
"value": "12345"
},
{
"name": "phone",
"value": "5554448888"
},
{
"name": "message",
"value": "recently had XXXX"
}
]
name field is not_analyzed, and value field is analyzed and not, as .exact and .search
I thought I could get the results I want from a query string query doing something like
+fields.name: first_name +fields.value.exact: XXX
But it doesn't quite work the way I thought. I figure its because I'm trying to use this as mysql instead of as nosql, and there is a fundamental brain shift I must have.
While the approach you are taking probably should work with enough effort, you are much better off having explicit field names for everything, eg:
{
"name.first_name" : "XXX",
"name.last_name" : "XXX",
etc...
}
Then your query_string looks like this:
name.first_name:XXX
If you are a new to elasticsearch, play around with things before you add your mappings. The dynamic defaults should kick in and things will work. You then add mappings to get fine grained control over the field behavior.