I need to extract the email from a nested JSON 'dict' (I am new to SQL).
I have seen several previous posts on the same topic (e.g. this one); however, none of them seem to work on my data.
select au.details
from table_au au
result:
{
  "id":3526,
  "contacts":[
    {
      "contactType":"EMAIL",
      "value":"name#email.be",
      "private":false
    },
    {
      "contactType":"PHONE",
      "phoneType":"PHONE",
      "value":"025/6251111",
      "private":false
    }
  ]
}
I need:
name#email.be
select d.value -> 0 -> 'value' as Email
from json_each('{"id":3526,"contacts":[{"contactType":"EMAIL","value":"name#email.be","private":false},{"contactType":"PHONE","phoneType":"PHONE","value":"025/6251111","private":false}]}') d
where d.key::text = 'contacts'
Output:
|   | email           |
-----------------------
| 1 | "name#email.be" |
You can run it here: https://rextester.com/VHWRQ89385
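The query above takes the first element of "contacts" (-> 0), so it relies on the e-mail entry coming first. As a rough sketch of matching on contactType instead, here is the same lookup done client-side in Python. The helper name first_email is made up for illustration, and the payload is the sample from the question.

```python
import json

# Sample payload copied from the question.
details = json.loads(
    '{"id":3526,"contacts":['
    '{"contactType":"EMAIL","value":"name#email.be","private":false},'
    '{"contactType":"PHONE","phoneType":"PHONE","value":"025/6251111","private":false}]}'
)


def first_email(doc):
    """Return the value of the first EMAIL contact, or None if there is none.

    Hypothetical helper: it matches on contactType, not on array position.
    """
    for contact in doc.get("contacts", []):
        if contact.get("contactType") == "EMAIL":
            return contact.get("value")
    return None


email = first_email(details)
print(email)
```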
I want to remove an object from a JSON column in SQLite and I can't make it work. The JSON column contains a nested object with the following type:
{
  a: number;
  pair: {
    field1: string;
    field2: string;
  }[]
}
I want to update the column "ArrayColumn" with the same values but remove the object that has field1 equal to "0" and field2 equal to "1". Every row contains the "pair" array, but not every "pair" array in ArrayColumn contains this value ({"field1":"0", "field2":"1"}).
I have the following structure:
Id| ArrayColumn
--------------------------------------------------------------------------------------------
1 | { "a":1, "pair":[{"field1":"0", "field2":"1"},{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
2 | { "a":5, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
3 | { "a":8, "pair":[{"field1":"G", "field2":"G"},{"field1":"0", "field2":"1"},{"field1":"A", "field2":"A"}] }
4 | { "a":1, "pair":[{"field1":"F", "field2":"T"},{"field1":"C", "field2":"D"},{"field1":"0", "field2":"1"}] }
5 | { "a":1, "pair":[{"field1":"A", "field2":"B"}] }
After updating the rows, the values would be:
Id| ArrayColumn
--------------------------------------------------------------------------------------------
1 | { "a":1, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
2 | { "a":5, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
3 | { "a":8, "pair":[{"field1":"G", "field2":"G"},{"field1":"A", "field2":"A"}] }
4 | { "a":1, "pair":[{"field1":"F", "field2":"T"},{"field1":"C", "field2":"D"}] }
5 | { "a":1, "pair":[{"field1":"A", "field2":"B"}] }
I tried with JSON_TREE but can't make it work.
I was thinking that the first step would be to select all the rows that contain that value. I retrieved them in these two ways:
With LIKE operator searching for the stringified form:
select Id, json_extract(json(par), '$.pair') as pair from Table where pair like '%{"field1":"0","field2":"1"}%'
Using json_tree
select Id, value from Table, json_tree(Table.ArrayColumn, '$.pair' ) where json_extract(value, '$.field1' ) = '0' AND json_extract(value, '$.field2' ) = '1'
I tried using json_remove with this small example but no luck:
SELECT json_remove('[{"field1":"1","field2":"0"},{"field1":"A","field2":"B"}]', '${"field1":"1","field2":"0"}' )
Thank you
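For this sample data, the simplest way is to treat the JSON column as a string and use string functions to remove the value that you want. The innermost REPLACE adds a comma after the last array element, so that every element can be removed together with its trailing comma; the outermost REPLACE then removes that extra comma again: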
UPDATE tablename
SET ArrayColumn = REPLACE(REPLACE(REPLACE(ArrayColumn, ']', ',]'), '{"field1":"0", "field2":"1"},', ''), ',]', ']')
WHERE ArrayColumn LIKE '%{"field1":"0", "field2":"1"}%';
See the demo.
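The string approach only matches when the stored text is exactly the search string, including the space after the comma. If that can't be guaranteed, one alternative is to parse the JSON and filter the array. Below is a minimal sketch of that idea, done from Python with the standard sqlite3 and json modules rather than in pure SQL. The table name tablename comes from the answer above, the helper drop_pair is hypothetical, and only rows 1 and 5 of the sample are loaded to keep it short.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (Id INTEGER PRIMARY KEY, ArrayColumn TEXT)")
conn.executemany(
    "INSERT INTO tablename (Id, ArrayColumn) VALUES (?, ?)",
    [
        (1, '{ "a":1, "pair":[{"field1":"0", "field2":"1"},'
            '{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }'),
        (5, '{ "a":1, "pair":[{"field1":"A", "field2":"B"}] }'),
    ],
)

# The object to remove from every "pair" array.
UNWANTED = {"field1": "0", "field2": "1"}


def drop_pair(raw):
    """Return the JSON text with every element equal to UNWANTED removed from "pair"."""
    doc = json.loads(raw)
    doc["pair"] = [p for p in doc["pair"] if p != UNWANTED]
    return json.dumps(doc)


# Read all rows first, filter in Python, and write back only the rows that changed.
for row_id, raw in conn.execute("SELECT Id, ArrayColumn FROM tablename").fetchall():
    cleaned = drop_pair(raw)
    if json.loads(cleaned) != json.loads(raw):
        conn.execute(
            "UPDATE tablename SET ArrayColumn = ? WHERE Id = ?", (cleaned, row_id)
        )
conn.commit()

result = {
    row_id: json.loads(raw)
    for row_id, raw in conn.execute("SELECT Id, ArrayColumn FROM tablename")
}
```

Because the comparison is done on parsed objects, whitespace and key order in the stored text no longer matter.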
I have one dataframe and one dataset :
Dataframe 1 :
+------------------------------+-----------+
|City_Name                     |Level      |
+------------------------------+-----------+
|{City -> Paris}               |86         |
+------------------------------+-----------+
Dataset 2 :
+-----------------------------------+-----------+
|Country_Details                    |Temperature|
+-----------------------------------+-----------+
|{City -> Paris, Country -> France} |31         |
+-----------------------------------+-----------+
I am trying to make a join of them by checking if the map in the column "City_Name" is included in the map of the Column "Country_Details".
I am using the following UDF to check the condition :
val mapEqual = udf((col1: Map[String, String], col2: Map[String, String]) => {
  if (col2.nonEmpty){
    col2.toSet subsetOf col1.toSet
  } else {
    true
  }
})
And I am making the join this way :
dataset2.join(dataframe1 , mapEqual(dataset2("Country_Details"), dataframe1("City_Name"), "leftanti")
However, I get this error:
terminated with error scala.MatchError: UDF(Country_Details#528) AS City_Name#552 (of class org.apache.spark.sql.catalyst.expressions.Alias)
Has anyone seen this error before?
I am using Spark 3.0.2 and SQLContext, with Scala.
There are two issues here. The first is that you are passing an extra argument, "leftanti", to the UDF call; you meant to pass it to the join function.
The second is that the UDF logic won't work as expected. I suggest you use this instead:
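val mapContains = udf { (col1: Map[String, String], col2: Map[String, String]) =>
  // True when every key of col2 is present in col1 with an equal value.
  // == compares the string contents; eq would compare object references.
  col2.keys.forall { key =>
    col1.get(key).exists(_ == col2(key))
  }
}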
Result:
scala> ds.join(df1 , mapContains(ds("Country_Details"), df1("City_Name")), "leftanti").show(false)
+----------------------------------+-----------+
|Country_Details                   |Temperature|
+----------------------------------+-----------+
|{City -> Paris, Country -> France}|31         |
+----------------------------------+-----------+
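Outside Spark, the containment test the UDF is meant to perform (every key of the smaller map appears in the larger map with an equal value) can be sketched as follows. This is only an illustration in plain Python; map_contains is a hypothetical name, and the dictionaries mirror the sample rows from the question.

```python
def map_contains(big, small):
    """True when every key/value pair of `small` also appears in `big`."""
    return all(key in big and big[key] == value for key, value in small.items())


country_details = {"City": "Paris", "Country": "France"}
city_name = {"City": "Paris"}

contained = map_contains(country_details, city_name)              # pair is present
not_contained = map_contains(country_details, {"City": "Lyon"})   # value differs
empty_case = map_contains(country_details, {})                    # vacuously true
```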
There are two tables. One is called "user_preference" and contains all users:
id | firstname | lastname | email |
The other is "match", which links users to the meetups they joined:
id | matcher | partner | meetup |
Both matcher and partner are foreign keys that reference user_preference.id, meaning that the same user can be both a matcher and a partner in the same meetup.
What I need to know is what percentage of all unique users joined each number of meetups.
For example:
17% of users joined 5 meetups
20% of users joined 3 meetups
40% of users joined 1 meetup
23% of users joined 0 meetups
The number of meetups should not be hardcoded but dynamic.
I also want to avoid counting a user more than once for a single meetup. For example, this:
id | matcher | partner | meetup |
1 | user1 | user2 | meetup1 |
2 | user1 | user3 | meetup1 |
3 | user5 | user1 | meetup1 |
4 | user6 | user1 | meetup2 |
should count user1 as having visited only 2 meetups.
What I managed to do so far is to display the count of meetups each user visited but that is not what I need:
SELECT distinct up.email users, COUNT(m.user) meetups
FROM user_preference up
LEFT JOIN
(
SELECT matcher AS user FROM match
UNION ALL
SELECT partner AS user FROM match
) m ON m.user = up.id
GROUP BY up.email
ORDER BY meetups desc;
In the end I did this by running simple queries and looping through the results in code. It's far from an elegant solution, but it should work.
If someone posts an SQL solution I will accept and upvote it...
export const getDevStats = async () => {
  const users = await getRepository(UserPreference).query(
    `SELECT * FROM user_preference;`
  );
  const meetups = await getRepository(Meetup).query(
    `SELECT * FROM meetup;`
  );
  const matches = await getRepository(Match).query(
    `SELECT * FROM match;`
  );
  let userMatches: any = {};
  users.forEach((user: any) => {
    userMatches[user.id] = []
    matches.forEach((match: any) => {
      if(user.id == match.matcher || user.id == match.partner) {
        if(userMatches[user.id].indexOf(match.meetup) === -1) {
          userMatches[user.id].push(match.meetup);
        }
      }
    });
  });
  let matchStats: any = {};
  for (var userId of Object.keys(userMatches)) {
    if (typeof matchStats[userMatches[userId].length] === 'undefined') {
      matchStats[userMatches[userId].length] = 0;
    }
    matchStats[userMatches[userId].length]++;
  }
  return {
    users: users,
    meetups: meetups,
    matches: matches,
    userMatches: userMatches,
    matchStats: matchStats
  };
};
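A SQL-only version might look like the sketch below. It is run here through Python's sqlite3 module so it can be tried without a database server. It assumes integer ids and uses a small made-up data set based on the example rows (user 4 is added as someone who joined no meetups); the CTE names attendance and per_user are arbitrary. The query uses only common constructs (UNION, LEFT JOIN, GROUP BY), but it has not been tested against the original database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_preference (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE "match" (id INTEGER PRIMARY KEY, matcher INTEGER, partner INTEGER, meetup INTEGER);
INSERT INTO user_preference (id, email) VALUES
  (1, 'u1@example.com'), (2, 'u2@example.com'), (3, 'u3@example.com'),
  (4, 'u4@example.com'), (5, 'u5@example.com'), (6, 'u6@example.com');
INSERT INTO "match" (id, matcher, partner, meetup) VALUES
  (1, 1, 2, 1), (2, 1, 3, 1), (3, 5, 1, 1), (4, 6, 1, 2);
""")

QUERY = """
WITH attendance AS (
    -- UNION (not UNION ALL) keeps each (user, meetup) pair only once
    SELECT matcher AS user_id, meetup FROM "match"
    UNION
    SELECT partner AS user_id, meetup FROM "match"
),
per_user AS (
    -- LEFT JOIN keeps users without any meetup, with a count of 0
    SELECT up.id, COUNT(a.meetup) AS meetups
    FROM user_preference up
    LEFT JOIN attendance a ON a.user_id = up.id
    GROUP BY up.id
)
SELECT meetups,
       COUNT(*) AS users,
       ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM user_preference), 2) AS percent
FROM per_user
GROUP BY meetups
ORDER BY meetups DESC;
"""

# In the sample rows, user 1 joined 2 meetups, users 2, 3, 5 and 6 joined one each,
# and user 4 joined none.
rows = conn.execute(QUERY).fetchall()
for meetups, users, percent in rows:
    print(f"{percent}% of users joined {meetups} meetups ({users} users)")
```

The first CTE is what removes the double counting: a user who appears several times in the same meetup, as matcher or as partner, contributes a single (user, meetup) pair.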
I have Azure AD audit events sent to a Log Analytics workspace and I'd like to build a query that shows me all Unified Groups created with the IsPublic property set to True.
I have the relevant events in TargetResources[0].modifiedProperties; however, this is a multi-valued object, and depending on how the group was provisioned, the position of the attribute I am looking for differs.
For example,
TargetResources[0].modifiedProperties contains IsPublic in the 3rd position, but sometimes it's in the second or fourth position.
[
  {"displayName":"DisplayName","oldValue":"[]","newValue":"[\"Test Group\"]"},
  {"displayName":"GroupType","oldValue":"[]","newValue":"[\"Unified\"]"},
  {"displayName":"IsPublic","oldValue":"[]","newValue":"[false]"}
]
I am guessing there is a way to find the exact property and value dynamically?
Sincerely,
Tonino Bruno
You could use the mv-apply operator.
Below are a few examples:
datatable(i:int, TargetResources:dynamic)
[
1, dynamic([{"p2":"v2","modifiedProperties":[{"displayName":"DisplayName","oldValue":"[]","newValue":"[\"Test Group\"]"},{"displayName":"GroupType","oldValue":"[]","newValue":"[\"Unified\"]"},{"displayName":"IsPublic","oldValue":"[]","newValue":"[false]"}]}]),
2, dynamic([{"p4":"v4","modifiedProperties":[{"displayName":"DisplayName","oldValue":"[]","newValue":"[\"Test Group\"]"},{"displayName":"GroupType","oldValue":"[]","newValue":"[\"Unified\"]"},{"displayName":"IsSomething","oldValue":"[]","newValue":"[false]"},{"displayName":"IsPublic","oldValue":"[]","newValue":"[true]"}]}]),
3, dynamic([{"p2":"v2","modifiedProperties":[{"displayName":"DisplayName","oldValue":"[]","newValue":"[\"Test Group\"]"},{"displayName":"GroupType","oldValue":"[]","newValue":"[\"Unified\"]"},{"displayName":"IsPublic","oldValue":"[]","newValue":"[true]"}]}]),
]
| project i, mp = TargetResources[0].modifiedProperties
| mv-apply mp on (
where mp.displayName == "IsPublic" and mp.newValue == '[true]'
)
datatable(i:int, TargetResources:dynamic)
[
1, dynamic([{"p2":"v2","modifiedProperties":[{"displayName":"DisplayName","oldValue":"[]","newValue":"[\"Test Group\"]"},{"displayName":"GroupType","oldValue":"[]","newValue":"[\"Unified\"]"},{"displayName":"IsPublic","oldValue":"[]","newValue":"[false]"}]}]),
2, dynamic([{"p4":"v4","modifiedProperties":[{"displayName":"DisplayName","oldValue":"[]","newValue":"[\"Test Group\"]"},{"displayName":"GroupType","oldValue":"[]","newValue":"[\"Unified\"]"},{"displayName":"IsSomething","oldValue":"[]","newValue":"[false]"},{"displayName":"IsPublic","oldValue":"[]","newValue":"[true]"}]}]),
3, dynamic([{"p2":"v2","modifiedProperties":[{"displayName":"DisplayName","oldValue":"[]","newValue":"[\"Test Group\"]"},{"displayName":"GroupType","oldValue":"[]","newValue":"[\"Unified\"]"},{"displayName":"IsPublic","oldValue":"[]","newValue":"[true]"}]}]),
]
| project i, mp = TargetResources[0].modifiedProperties
| mv-apply mp on (
project displayName = tostring(mp.displayName), newValue = tostring(parse_json(tostring(mp.newValue))[0])
| summarize b = make_bag(pack(displayName, newValue))
)
| where b.GroupType == "Unified" and b.IsPublic == "true"
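To make the second example easier to follow, here is a rough Python analogue of what the make_bag step builds: a lookup from displayName to the first element of the JSON-encoded newValue, so the position of IsPublic no longer matters. It only illustrates the data shape and is not how the query engine works; to_bag is a hypothetical name. Note that the query above converts values with tostring and compares against "true", whereas this sketch keeps the parsed boolean.

```python
import json

# modifiedProperties from the sample event in the question.
modified_properties = [
    {"displayName": "DisplayName", "oldValue": "[]", "newValue": "[\"Test Group\"]"},
    {"displayName": "GroupType", "oldValue": "[]", "newValue": "[\"Unified\"]"},
    {"displayName": "IsPublic", "oldValue": "[]", "newValue": "[false]"},
]


def to_bag(properties):
    """Map each displayName to the first element of its JSON-encoded newValue."""
    bag = {}
    for prop in properties:
        values = json.loads(prop["newValue"])
        bag[prop["displayName"]] = values[0] if values else None
    return bag


bag = to_bag(modified_properties)

# The filter from the last line of the query, applied to the lookup.
is_public_unified = bag.get("GroupType") == "Unified" and bag.get("IsPublic") is True
```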
I have a one-column table in my Snowflake database that contains a JSON mapping structure, as follows:
ColumnMappings : {"Field Mapping": "blank=Blank,E=East,N=North,"}
How can I write a query so that if I feed the Field Mapping a value of E I get East, if the value is N I get North, and so on, without hard-coding the values in the query the way a CASE statement would?
You really want your mapping in this JSON form:
{
"blank" : "Blank",
"E" : "East",
"N" : "North"
}
You can achieve that in Snowflake e.g. with a simple JS UDF:
create or replace table x(cm variant) as
select parse_json(*) from values('{"fm": "blank=Blank,E=East,N=North,"}');
create or replace function mysplit(s string)
returns variant
language javascript
as $$
  // Snowflake exposes UDF arguments to JavaScript in uppercase, hence S.
  var res = S
    .split(",")
    .reduce(
      (acc, val) => {
        var vals = val.split("=");
        acc[vals[0]] = vals[1];
        return acc;
      },
      {});
  return res;
$$;
select cm:fm, mysplit(cm:fm) from x;
-------------------------------+--------------------+
CM:FM | MYSPLIT(CM:FM) |
-------------------------------+--------------------+
"blank=Blank,E=East,N=North," | { |
| "E": "East", |
| "N": "North", |
| "blank": "Blank" |
| } |
-------------------------------+--------------------+
And then you can simply extract values by key with GET, e.g.
select cm:fm, get(mysplit(cm:fm), 'E') from x;
-------------------------------+--------------------------+
CM:FM | GET(MYSPLIT(CM:FM), 'E') |
-------------------------------+--------------------------+
"blank=Blank,E=East,N=North," | "East" |
-------------------------------+--------------------------+
For performance, you might want to make sure you call mysplit only once per value in your mapping table, or even pre-materialize it.
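If the mapping string is also consumed by client code, the same parsing can be done once there and the resulting lookup reused. Here is a minimal Python sketch of that idea; parse_mapping is a hypothetical helper, and it skips the empty segment produced by the trailing comma.

```python
def parse_mapping(raw):
    """Turn 'k1=v1,k2=v2,' into {'k1': 'v1', 'k2': 'v2'}, skipping empty segments."""
    mapping = {}
    for segment in raw.split(","):
        if not segment:
            continue  # the trailing comma produces an empty segment
        key, _, value = segment.partition("=")
        mapping[key] = value
    return mapping


# Parse once, then look up as many codes as needed.
lookup = parse_mapping("blank=Blank,E=East,N=North,")
print(lookup["E"])  # prints East
```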