JCR SQL2 multivalue property search

I want to search the content repository using one or more values as input parameters for a multivalue property.
Something like: find all nodes with the primary type 'nt:unstructured' whose multivalue property 'multiprop' contains both the values "one" and "two".
What should the queryString passed to queryManager.createQuery look like?
Thank you.

You can treat the criteria on multi-valued properties just like other criteria. For example, the following query will find all nodes that have a value of 'white dog' on the 'someProp' property:
SELECT * FROM [nt:unstructured] WHERE someProp = 'white dog'
If the 'someProp' property has multiple values, then a node with at least one value that satisfies the criteria will be included in the results.
To find nodes that have multiple values of a multi-valued property, simply AND together multiple criteria. For example, the following query will return all nodes that have both of the specified values:
SELECT * FROM [nt:unstructured] WHERE someProp = 'white dog'
AND someProp = 'black dog'
Any of the operators will work, including 'LIKE':
SELECT * FROM [nt:unstructured] WHERE someProp LIKE '%white%'
AND someProp LIKE '%black%'
Other combinations are possible, of course.

Related

SparkSQL: How to query a column with datatype: List of Maps

I have a dataframe with a column of arrays (or lists), where each element is a map from String to a complex data type (a String, nested map, list, etc.; you may assume the column's data type is similar to List[Map[String,AnyRef]]).
Now I want to query this table like:
select * from tableX where column.<any of the array element>['someArbitaryKey'] in ('a','b','c')
I am not sure how to represent <any of the array element> in Spark SQL. Need help.
The idea is to transform the list of maps into a list of booleans, where each boolean indicates whether the respective map contains the wanted key (k2 in the code below). After that, all we have to do is check whether the boolean array contains at least one true element.
select * from tableX where array_contains(transform(col1, map->map_contains_key(map,'k2')), true)
I have assumed that the name of the column holding the list of maps is col1.
The second parameter of the transform function could be replaced by any expression that returns a boolean value. In this example map_contains_key is used, but any check resulting in a boolean value would work.
A bit unrelated: I believe that the data type of the map cannot be Map[String,AnyRef] as there is no encoder for AnyRef available.
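For intuition, the same transform-then-array_contains logic can be sketched in plain Python (the data here is hypothetical; the real query runs in Spark SQL):

```python
# Hypothetical rows mirroring a column of List[Map] values.
rows = [
    [{"k1": "a"}, {"k2": "b"}],   # one map contains the wanted key 'k2'
    [{"k1": "a"}, {"k3": "c"}],   # no map contains 'k2'
]

def any_map_has_key(list_of_maps, key):
    # transform(col1, m -> map_contains_key(m, key)) ...
    flags = [key in m for m in list_of_maps]
    # ... then array_contains(flags, true)
    return True in flags

matches = [r for r in rows if any_map_has_key(r, "k2")]
```

Only the first row survives the filter, just as only rows whose boolean array contains true are returned by the query.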

A column has many misspelled or similar values. How to set the correct values using some rules?

I have combined multiple dataframes, and the resulting dataframe df, has three columns 'Date', 'Product', 'Price'.
There are many rows where:
the 'Product' value is either 'Kiwi' or 'Kiwi '.
the 'Product' value is either 'apricot' or 'Apricot'.
the 'Product' value is either 'Apple / imported' or 'Apple / local'
and so on.
I am trying to apply some rules to rename the values such as:
if value contains 'Kiwi' then set the value as 'Kiwi'
if value contains 'apricot' then set the value as 'Apricot'
if value contains 'Apple' then set the value as 'Apple'
Using df.loc[:,'Product'].sort_values().unique() and examining the results, I have created a dictionary 'rename_product' containing key:value pairs, where the keys are the texts to search for, and the values are the new values that should be assigned, such as:
rename_product = {
'Kiwi' : 'Kiwi',
'apricot' : 'Apricot',
'Apple' : 'Apple'
}
How to proceed to the substitution of values?
I think this is what you are looking for:
pandas replace
Your implementation in the comment doesn't seem correct to me. Try implementing one of the solutions in the article I linked and see if it works. You can also look at the re package; if I remember right, though, you'll need to use apply with a lambda to loop through each row in your column.
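One possible sketch of the apply-based approach, using the rename_product dictionary from the question (the sample rows here are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; rename_product is the mapping from the question.
df = pd.DataFrame({"Product": ["Kiwi ", "apricot", "Apple / imported", "Kiwi", "Apple / local"]})
rename_product = {"Kiwi": "Kiwi", "apricot": "Apricot", "Apple": "Apple"}

def normalize(value):
    # return the canonical name for the first search text found in the value
    for text, new_value in rename_product.items():
        if text in value:
            return new_value
    return value  # leave unknown products unchanged

df["Product"] = df["Product"].apply(normalize)
```

Note that a plain substring test like this is order-sensitive if one key is a substring of another; for more complex rules, Series.replace with regex=True may be a better fit.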

Is it possible to look for an instance of a particular value in a Azure Data Factory expression?

I return a JSON array from a TSQL procedure to Azure Data Factory. I want to know if at least 1 value in the array is equal to true. The JSON array has multiple fields included and multiple rows.
Setup overview:
Data Factory Lookup activity.
TSQL procedure that returns 2 or more rows.
Data Factory If activity with a conditional that checks the returned JSON for at least 1 instance of x.
Dummy procedure:
CREATE PROC dbo.usp_dummyProc
AS
SET NOCOUNT ON;
SELECT 1 AS Id, 'a' AS Letter, 1 AS SqlFieldName
UNION
SELECT 2, 'b', 0;
Data Factory pipe:
I tried:
@contains(activity('ActivityName').output.value.SqlFieldName, true)
Which, unsurprisingly, led to:
The expression 'contains(activity('ActivityName').output.value.SqlFieldName, true)' cannot be evaluated because property 'SqlFieldName' cannot be selected. Array elements can only be selected using an integer index.
I cannot see an expression component that can iterate over the list returned to check for a value. I could write another procedure to deal with this, but ideally, I would prefer not to need to do so every time I want to solve this problem. This is where I looked.
You can cast the result of activity('ActivityName').output.value to a String, then use contains() to check whether it contains '"SqlFieldName":true'. Something like the following expression:
@contains(join(activity('ActivityName').output.value,','),'"SqlFieldName":true')
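The join/contains pair amounts to a substring check over the serialized rows. A plain-Python sketch of the same idea, with hypothetical rows shaped like the Lookup activity's output:

```python
import json

# Hypothetical rows, as the Lookup activity would return them.
rows = [
    {"Id": 1, "Letter": "a", "SqlFieldName": True},
    {"Id": 2, "Letter": "b", "SqlFieldName": False},
]

# join(..., ',') serializes the array of objects into one string ...
serialized = ",".join(json.dumps(r, separators=(",", ":")) for r in rows)
# ... and contains() is then just a substring check on that string
has_true = '"SqlFieldName":true' in serialized
```

One caveat of the substring approach: it assumes the field name is unique enough that the literal '"SqlFieldName":true' cannot match anything unintended in the serialized output.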

How to "search" with a string of values for records whose columns contain said values

Okay, I'm going to try to explain this as simply as I can.
Say I have 3 records of a certain table. We will call this table Objs, and Objs have an attribute, of type string, called colors (notice how it's plural). Here are the 3 hypothetical records in the database and their corresponding colors values:
obj1 colors: "red, green, blue"
obj2 colors: "blue, orange, yellow, green"
obj3 colors: "teal, purple"
Okay, so now say I want to be able to find a subset of the records that have something in common (a good situation to use the WHERE method, right?). However, I HAVE to be able to support searching for these records using either single or multiple values. For instance:
Say my query is, "red, green".
Then the resulting collection of records would need to be obj1, and obj2 since their color values include the keywords "red" and "green".
Say my query is, "blue, purple".
The resulting collection should include obj1, obj2, and obj3.
Also, the query and the colors attribute will always be values delimited by ", ", since both are generated from an array. I.e., the attribute of the object and the queries themselves will always have this format:
"value1, value2, value3, value4"
It will never be like this:
"value1 value2 value3 value4"
or any other possible format.
Thanks for all the help.
You could compare the value, treating it as an array, against another array. For example:
SELECT *
FROM example
WHERE string_to_array(colors,', ') && array['red','blue'];
The && operator here checks for overlaps.
Disclosure: I am an EnterpriseDB (EDB) employee.
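For intuition, the && overlap test is just a non-empty set intersection; a minimal Python sketch:

```python
def colors_overlap(colors, wanted):
    # mirrors string_to_array(colors, ', ') && ARRAY[...]:
    # true when the two sets share at least one element
    return bool(set(colors.split(", ")) & set(wanted))

colors_overlap("red, green, blue", ["red", "blue"])  # → True
colors_overlap("teal, purple", ["red", "green"])     # → False
```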
Let's wrap it in a function:
first we split the search string into an array so we know how many OR clauses we need,
then we build the query,
and finally we pass the query and the search words to our model.
def search_function(search_string)
  names_array = search_string.split(', ')
  # one LIKE placeholder per search word, joined with OR
  query = (['name LIKE ?'] * names_array.count).join(' OR ')
  # wrap each word in wildcards so it matches anywhere in the column
  Objects.where(query, *names_array.map { |name| "%#{name}%" })
end

from string to map object in Hive

My input is a string that can contain any characters from A to Z (no duplicates, so it has at most 26 characters).
For example:
set Input='ATK';
The characters within the string can appear in any order.
Now I want to create a map object out of this which has fixed keys from A to Z. The value for a key is 1 if its corresponding character appears in the input string, and 0 otherwise. So for this example (ATK), the map should hold 1 for the keys A, T, and K and 0 for every other letter.
So what is the best way to do this? The code should look like:
set Input='ATK';
select <some logic>;
It should return a map object (Map<string,int>) with 26 key-value pairs in it. What is the best way to do it without creating any user-defined functions in Hive? I know the function str_to_map easily comes to mind, but it only works if key-value pairs exist in the source string, and it will only consider the key-value pairs present in the input.
Maybe not efficient, but it works:
select str_to_map(
         concat_ws('&', collect_list(
           concat_ws(':', a.dict, case when b.character is null then '0' else '1' end))),
         '&', ':')
from (
  select explode(split("A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z", ',')) as dict
) a
left join (
  select explode(split(${hiveconf:Input}, '')) as character
) b
on a.dict = b.character
The result:
{"A":"1","B":"0","C":"0","D":"0","E":"0","F":"0","G":"0","H":"0","I":"0","J":"0","K":"1","L":"0","M":"0","N":"0","O":"0","P":"0","Q":"0","R":"0","S":"0","T":"1","U":"0","V":"0","W":"0","X":"0","Y":"0","Z":"0"}
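The underlying idea, building a full A-Z dictionary and marking which letters occur in the input, can be sketched in Python for illustration:

```python
import string

def chars_to_map(value):
    # 1 for each letter present in the input, 0 for every other letter A-Z
    present = set(value)
    return {letter: int(letter in present) for letter in string.ascii_uppercase}

result = chars_to_map("ATK")  # {"A": 1, "B": 0, ..., "K": 1, ..., "T": 1, ...}
```

This mirrors the Hive query: the dictionary comprehension plays the role of the exploded A-Z list, and the membership test replaces the left join against the input's characters.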