MarkLogic - Xpath on JSON document

MarkLogic - Xpath on JSON document - marklogic-9

MarkLogic Version: 9.0-6.2
I am trying to apply Xpath in extract-document-data (using Query Options) on a JSON document shown below. I need to filter out "Channel" property if the underneath property "OptIn" has a value of "True".
{
"Category":
{
"Name": "Severe Weather",
"Channels":[
{
"Channel":
{
"Name":"Email",
"OptIn": "True"
}
},
{
"Channel":
{
"Name":"Text",
"OptIn": "False"
}
}
]
}
}
I tried below code,
'<extract-document-data selected="include">' +
'<extract-path>//*[OptIn="True"]/../..</extract-path>' +
'</extract-document-data>' +
which is only pulling from "Channel" property as shown below.
[
{
"Channel": {
"Name": "Email",
"OptIn": "True"
}
}
]
But my need is to pull from parent "Category" property, but filter out the Channels that have OptIn value as False.
Any pointers?

If I understand correctly, you'd like to extract 'Category', but only with those 'Channel's that have 'OptIn' equalling 'true', right?
Extract-document-data is not advanced enough for that. You best extract entire Categories which have at least one OptIn equalling true (//Category[//OptIn = 'true']), and use a REST transform on the search response to trim down the unwanted Channels..
HTH!

Related

Flatten complex json using Databricks and ADF

I have following json which I have flattened partially using explode
{
"result":[
{
"employee":[
{
"employeeType":{
"name":"[empName]",
"displayName":"theName"
},
"groupValue":"value1"
},
{
"employeeType":{
"name":"#bossName#",
"displayName":"theBoss"
},
"groupValue":[
{
"id":"1",
"type":{
"name":"firstBoss",
"displayName":"CEO"
},
"name":"Martha"
},
{
"id":"2",
"type":{
"name":"secondBoss",
"displayName":"cto"
},
"name":"Alex"
}
]
}
]
}
]
}
I need to get following fields:
employeeType.name
groupValue
I am able to extract those fields and value. But, if name value starts with # like in "name":"#bossName#", I am getting groupValue as string from which I need to extract id and name.
"groupValue":[
{
"id":"1",
"type":{
"name":"firstBoss",
"displayName":"CEO"
},
"name":"Martha"
},
{
"id":"2",
"type":{
"name":"secondBoss",
"displayName":"cto"
},
"name":"Alex"
}
]
How to convert this string to json and get the values.
My code so far:
from pyspark.sql.functions import *
db_flat = (df.select(explode("result.employee").alias("emp"))
.withColumn("emp_name", col(emp.employeeType.name))
.withColumn("emp_val",col("emp.groupValue")).drop("emp"))
How can I extract groupValue from db_flat and get id and name from it. Maybe use python panda library.

Since you see they won't be dynamic. You can traverse through the json while mapping like as below. Just identify record and array, specify index [i] as needed.
Example:
id --> $['employee'][1]['groupValue'][0]['id']
name --> $['employee'][1]['groupValue'][0]['type']['name']

how to select a single item and get it's relations in faunadb?

I have two collections which have the data in the following format
{
"ref": Ref(Collection("Leads"), "267824207030650373"),
"ts": 1591675917565000,
"data": {
"notes": "voicemail ",
"source": "key-name",
"name": "Glenn"
}
}
{
"ref": Ref(Collection("Sources"), "266777079541924357"),
"ts": 1590677298970000,
"data": {
"key": "key-name",
"value": "Google Ads"
}
}
I want to be able to query the Leads collection and be able to retrieve the corresponding Sources document in a single query
I came up with the following query to try and use an index but I couldn't get it to run
Let(
{
data: Get(Ref(Collection('Leads'), '267824207030650373'))
},
{
data: Select(['data'],Var('data')),
source: q.Lambda('data',
Match(Index('LeadSourceByKey'), Get(Select(['source'], Var('data') )) )
)
}
)
Is there an easy way to retrieve the Sources document ?

What you are looking for is the following query which I broke down for you in multiple steps:
Let(
{
// Get the Lead document
lead: Get(Ref(Collection("Leads"), "269038063157510661")),
// Get the source key out of the lead document
sourceKey: Select(["data", "source"], Var("lead")),
// use the index to get the values via match
sourceValues: Paginate(Match(Index("LeadSourceValuesByKey"), Var("sourceKey")))
},
{
lead: Var("lead"),
sourceValues: Var("sourceValues")
}
)
The result is:
{
lead: {
ref: Ref(Collection("Leads"), "269038063157510661"),
ts: 1592833540970000,
data: {
notes: "voicemail ",
source: "key-name",
name: "Glenn"
}
},
sourceValues: {
data: [["key-name", "Google Ads"]]
}
}
sourceValues is an array since you specified in your index that there will be two items returned, the key and the value and an index always returns the array. Since your Match could have returned multiple values in case it wasn't a one-to-one, this becomes an array of an array.
This is only one approach, you could also make the index return a reference and Map/Get to get the actual document as explained on the forum.
However, I assume you asked the same question here. Although I applaud asking questions on stackoverflow vs slack or even our own forum, please do not just post the same question everywhere without linking to the others. This makes many people spend a lot of time while the question is already answered elsewhere.

You might probably change the Leads document and put the Ref to Sources document in source:
{
"ref": Ref(Collection("Leads"), "267824207030650373"),
"ts": 1591675917565000,
"data": {
"notes": "voicemail ",
"source": Ref(Collection("Sources"), "266777079541924357"),
"name": "Glenn"
}
}
{
"ref": Ref(Collection("Sources"), "266777079541924357"),
"ts": 1590677298970000,
"data": {
"key": "key-name",
"value": "Google Ads"
}
}
And then query this way:
Let(
{
lead: Select(['data'],Get(Ref(Collection('Leads'), '267824207030650373'))),
source:Select(['source'],Var('lead'))
},
{
data: Var('lead'),
source: Select(['data'],Get(Var('source')))
}
)

express-graphql: How to remove external "data" object layer.

I am replacing an existing REST endpoint with GraphQL.
In our existing REST endpoint, we return a JSON array.
[{
"id": "ABC"
},
{
"id": "123"
},
{
"id": "xyz"
},
{
"id": "789"
}
]
GraphQL seems to be wrapping the array in two additional object layers. Is there any way to remove the "data" and "Client" layers?
Response data:
{
"data": {
"Client": [
{
"id": "ABC"
},
{
"id": "123"
},
{
"id": "xyz"
},
{
"id": "789"
}
]
}
}
My query:
{
Client(accountId: "5417727750494381532d735a") {
id
}
}

No. That was the whole purpose of GraphQL. To have a single endoint and allow users to fetch different type/granularity of data by specifying the input in a query format as opposed to REST APIs and then map them onto the returned JSON output.
'data' acts as a parent/root level container for different entities that you have queried. Without these keys in the returned JSON data, there won't be any way to segregate the corresponding data. e.g.
Your above query can be modified to include another entity like Owner,
{
Client(accountId: "5417727750494381532d735a") {
id
}
Owner {
id
}
}
In which case, the output will be something like
{
"data": {
"Client": [
...
],
"Owner": [
...
]
}
}
Without the 'Client' and 'Owner' keys in the JSON outout, there is no way to separate the corresponding array values.
In your case, you can get only the array by doing data.Client on the returned output.

DataTables ajax requires explicit json collection name for the returned datasource?

I recently ran into a problem when implementing the ajax functionality of jquery DataTables. Until I actually gave my json object collection an explicit name I couldn't get anything to display. Shouldn't there be a default data source if nothing named is returned?
Client Side control setup (includes hidden field that supplies data to dynamic anchor:
$('#accountRequestStatus').dataTable(
{
"destroy": true, // within a method that will be called multiple times with new/different data
"processing": true,
"ajax":
{
"type": "GET",
"url": "#Url.Action("SomeServerMethod", "SomeController")",
"data": { methodParam1: 12341, methodParam2: 123423, requestType: 4123421 }
}
, "paging": false
, "columns": [
{ "data": "DataElement1" },
{ "data": "DataElement2", "title": "Col1" },
{ "data": "DataElement3", "title": "Col2" },
{ "data": "DataElement4", "title": "Col3" },
{ "data": "DataElement5", "title": "Col4" },
]
, "columnDefs": [
{
"targets": 0, // hiding first column, userId
"visible": false,
"searchable": false,
"sortable": false
},
{
"targets": 5, // creates action link using the hidden data for that row in column [userId]
"render": function (data, type, row) {
return "<a href='#Url.Action("ServerMethod", "Controller")?someParam=" + row["DataElement1"] + "'>Details</a>"
},
"searchable": false,
"sortable": false
}
]
});
Here's a snippet of my server side code that returns the json collection.
tableRows is a collection of models containing the data to be displayed.
var json = this.Json(new { data = tableRows });
json.JsonRequestBehavior = JsonRequestBehavior.AllowGet;
return json;
As I said before, the ajax call returned data but wouldn't display until I gave the collection a name. Maybe I missed this required step in the documentation, but wouldn't it make sense for the control to wire up to a single returned collection as the default data source and not require the name? Figuring out the name thing equated to about 2+ hours of messin' around trying different things. That's all I'm saying.
Maybe this'll help someone else too...

dataTables does actually have a dataSrc property! dataTables will look for either a data or an aaData section in the JSON. Thats why you finally got it to work with new { data=tableRows }. That is, if dataSrc is not specified! If your JSON differs from that concept you must specify dataSrc :
If you return a not named array / collection [{...},{...}] :
ajax: {
url: "#Url.Action("SomeServerMethod", "SomeController")",
dataSrc: ""
}
If you return a JSON array named different from data or aaData, like customers :
ajax: {
url: "#Url.Action("SomeServerMethod", "SomeController")",
dataSrc: "customers"
}
If the content is nested like { a : { b : [{...},{...}] }}
ajax: {
url: "#Url.Action("SomeServerMethod", "SomeController")",
dataSrc: "a.b"
}
If you have really complex JSON or need to manipulate the JSON in any way, like cherry picking from the content - dataSrc can also be a function :
ajax: {
url: "#Url.Action("SomeServerMethod", "SomeController")",
dataSrc: function(json) {
//do what ever you want
//return an array containing JSON / object literals
}
}
Hope the above clears things up!

Raven DB HTTP API - Property Traversal

I am given the following JSON structure:
{
"document": {
"sections": {
"x": {
"title": "foo"
},
"y": {
"title": "bar"
}
}
}
}
How do I update value of the title property for a given section using the HTTP API?
I would like to provide a path (string) to get to the property.

This was fixed in build 2254. You should now be able to issue a single scripted patch like this:
EVAL http://localhost:8080/docs/foos/1
{Script:"this.document.sections.x.title = newTitle;",Values:{"newTitle":"Whatever"}}

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

MarkLogic - Xpath on JSON document - marklogic-9

Related

Flatten complex json using Databricks and ADF

how to select a single item and get it's relations in faunadb?

express-graphql: How to remove external "data" object layer.

DataTables ajax requires explicit json collection name for the returned datasource?

Raven DB HTTP API - Property Traversal

Categories

Resources