Choosing between literal map and an existing map using BinCond in Pig


Suppose an alias has a column consisting of a map:
A = LOAD .. AS ( id : int, data : [ DOUBLE ] );
I want to keep the map if it has at least one element; otherwise I want to replace it with the map [ 'dummy' : 1.0d ].
B = FOREACH A GENERATE id,
( SIZE( data ) == 0 ? TOMAP('dummy', (DOUBLE)1.0) : data ) AS data;
This results in the following error:
Two inputs of BinCond must have compatible schemas. left hand side: #133:map right hand side: udf_results#144:map(#145:double)

The issue seems to be that Pig does not implicitly cast MAP:[ ] to MAP:[ DOUBLE ]. An explicit cast on the TOMAP result solved the problem:
B = FOREACH A GENERATE id,
( SIZE( data ) == 0 ? ( ([DOUBLE]) TOMAP('dummy', (DOUBLE)1.0) ) : data ) AS data;

Related

Multi-value object search in CrateDB when this one is within an array of objects

I'm trying to migrate our current ES to CrateDB and one of the issues I'm facing is searching for two specific values within the same object when this object is part of an array of objects.
CREATE TABLE test.artefact (
id INTEGER,
metadata ARRAY(OBJECT(STATIC) AS (
key_id INTEGER,
value TEXT
))
);
insert into test.artefact(id, metadata) values (
1,
[
{
"key_id" = 1,
"value" = 'TEST1'
},
{
"key_id" = 2,
"value" = 'TEST2'
}
]
);
So basically, I'm trying to search metadata providing key_id and value.
A select like this one finds artefact 1 as a match, even when key and value are in different objects:
select * from test.artefact where 1 = ANY(metadata['key_id']) AND 'TEST2' = ANY(metadata['value'])
I have tried other functions, like UNNEST, with no luck.

Copy from CrateDB Community:
One way that should work is
SELECT *
FROM test.artefact
WHERE {key_id = 1, value = 'TEST2'} = ANY(metadata)
However, this is probably not the most performant way.
Combined with the queries on the individual fields, it might be quick enough:
SELECT *
FROM test.artefact
WHERE
1 = ANY(metadata['key_id'])
AND 'TEST2' = ANY(metadata['value'])
AND {key_id = 1, value = 'TEST2'} = ANY(metadata)
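To see why the original two-ANY query matches artefact 1, it may help to model the array in plain Python. This is only a sketch: the list below is a hypothetical in-memory copy of the inserted row, and the two helper functions are illustrative names, not CrateDB functions.

```python
# Hypothetical in-memory copy of the metadata array from the INSERT above.
metadata = [
    {"key_id": 1, "value": "TEST1"},
    {"key_id": 2, "value": "TEST2"},
]

def matches_per_field(objs, key_id, value):
    # Mirrors: key_id = ANY(metadata['key_id']) AND value = ANY(metadata['value'])
    # Each ANY scans the whole array on its own, so the hits may come
    # from two different objects.
    return any(o["key_id"] == key_id for o in objs) and any(
        o["value"] == value for o in objs
    )

def matches_same_object(objs, key_id, value):
    # Mirrors: {key_id = ..., value = ...} = ANY(metadata)
    # The whole object is compared, so both fields must sit in one element.
    return any(o == {"key_id": key_id, "value": value} for o in objs)

print(matches_per_field(metadata, 1, "TEST2"))    # True  (false positive)
print(matches_same_object(metadata, 1, "TEST2"))  # False
print(matches_same_object(metadata, 2, "TEST2"))  # True
```

The combined query in the answer corresponds to requiring both checks, with the per-field check acting as a cheap pre-filter.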

Update object field of element in array jsonb with postgres

I have the following jsonb column, named data, in my SQL table.
{
"special_note": "Some very long special note",
"extension_conditions": [
{
"condition_id": "5bfb8b8d-3a34-4cc3-9152-14139953aedb",
"condition_type": "OPTION_ONE"
},
{
"condition_id": "fbb60052-806b-4ae0-88ca-4b1a7d8ccd97",
"condition_type": "OPTION_TWO"
}
],
"floor_drawings_file": "137c3ec3-f078-44bb-996e-161da8e20f2b"
}
What I need to do is update the condition_type field of every object in the extension_conditions array from OPTION_ONE to MARKET_PRICE, leaving OPTION_TWO unchanged.
Note that the extension_conditions array is optional, so I need to filter out rows where extension_conditions is null.
I need a query that updates the jsonb column in all rows of this table according to the rules described above.
Thanks in advance!

You can use a statement built on the JSONB_SET() function, after determining the position (index) of the relevant element within the array:
WITH j AS
(
SELECT ('{extension_conditions,'||idx-1||',condition_type}')::TEXT[] AS path, j
FROM tab
CROSS JOIN JSONB_ARRAY_ELEMENTS(data->'extension_conditions')
WITH ORDINALITY arr(j,idx)
WHERE j->>'condition_type'='OPTION_ONE'
)
UPDATE tab
SET data = JSONB_SET(data,j.path,'"MARKET_PRICE"',false)
FROM j
Update: to update multiple elements within the array, the following query with nested JSONB_SET() calls may be preferable:
UPDATE tab
SET data =
(
SELECT JSONB_SET(data,'{extension_conditions}',
JSONB_AGG(CASE WHEN j->>'condition_type' = 'OPTION_ONE'
THEN JSONB_SET(j, '{condition_type}', '"MARKET_PRICE"')
ELSE j
END))
FROM JSONB_ARRAY_ELEMENTS(data->'extension_conditions') AS j
)
WHERE data @> '{"extension_conditions": [{"condition_type": "OPTION_ONE"}]}';
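To check the expected result outside the database, the per-element rewrite that the second query performs can be sketched in plain Python. This is an illustration under assumptions, not part of the answer: rename_condition_type is a hypothetical helper name, and the sample document is trimmed from the one in the question.

```python
import copy

def rename_condition_type(data, old="OPTION_ONE", new="MARKET_PRICE"):
    """Return a copy of data with matching condition_type values replaced."""
    conditions = data.get("extension_conditions")
    if not conditions:
        # The array is optional: leave rows without it untouched.
        return data
    updated = copy.deepcopy(data)
    for cond in updated["extension_conditions"]:
        if cond.get("condition_type") == old:
            cond["condition_type"] = new
    return updated

doc = {
    "special_note": "Some very long special note",
    "extension_conditions": [
        {"condition_id": "5bfb8b8d-3a34-4cc3-9152-14139953aedb",
         "condition_type": "OPTION_ONE"},
        {"condition_id": "fbb60052-806b-4ae0-88ca-4b1a7d8ccd97",
         "condition_type": "OPTION_TWO"},
    ],
}

result = rename_condition_type(doc)
print([c["condition_type"] for c in result["extension_conditions"]])
# ['MARKET_PRICE', 'OPTION_TWO']
```

The element order and all other keys are preserved, which is also what rebuilding the array with JSONB_AGG over JSONB_ARRAY_ELEMENTS is meant to achieve.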

How to group by duplicate value of nested array in Postgresql?

Previous question: How to group by duplicate value and nested the array Postgresql
Using this query:
SELECT json_build_object(
'nama_perusahaan',"a"."nama_perusahaan",
'proyek', json_agg(
json_build_object(
'no_izin',"b"."no_izin",
'kode',c.kode,
'judul_kode',d.judul
)
)
)
FROM "t_pencabutan" "a"
LEFT JOIN "t_pencabutan_non" "b" ON "a"."id_pencabutan" = "b"."id_pencabutan"
LEFT JOIN "t_pencabutan_non_b" "c" ON "b"."no_izin" = "c"."no_izin"
LEFT JOIN "t_pencabutan_non_c" "d" ON "c"."id_proyek" = "d"."id_proyek"
GROUP BY "a"."nama_perusahaan"
The result is shown below:
{
"nama_perusahaan" : "JASA FERRIE",
"proyek" :
[{
"no_izin" : "26A/E/IU/PMA/D8FD",
"kode" : "14302",
"judul_kode" : "IND"
},
{
"no_izin" : "26A/E/IU/PMA/D8FD",
"kode" : "13121",
"judul_kode" : "IND B"
}]
}
As you can see, the proyek entries are now nested under each company, so rows with the same nama_perusahaan are grouped. Now I also need to group entries that share the same no_izin, producing a doubly nested array like the expected result below.
{
"nama_perusahaan" : "JASA FERRIE",
"proyek" :
[{
"no_izin" : "26A/E/IU/PMA/D8FD",
"kode_list":[
{
"kode" : "14302",
"judul_kode" : "IND"
},
{
"kode" : "13121",
"judul_kode" : "IND B"
}]
}]
}
I tried to use this query:
SELECT json_build_object(
'nama_perusahaan',"a"."nama_perusahaan",
'proyek', json_agg(
json_build_object(
'no_izin',"b"."no_izin",
'kode_list',json_agg(
json_build_object(
'kode',c.kode,
'judul_kode',d.judul
)
)
)
)
)
FROM "t_pencabutan" "a"
LEFT JOIN "t_pencabutan_non" "b" ON "a"."id_pencabutan" = "b"."id_pencabutan"
LEFT JOIN "t_pencabutan_non_b" "c" ON "b"."no_izin" = "c"."no_izin"
LEFT JOIN "t_pencabutan_non_c" "d" ON "c"."id_proyek" = "d"."id_proyek"
GROUP BY "a"."nama_perusahaan", b.no_izin
but it didn't work. It gives ERROR: aggregate function calls cannot be nested LINE 6:'kode_list',json_agg(.
What is wrong with my code?

Disclaimer: It is very hard for us to construct a query without knowing the input data and table structure, and while having to handle a language we don't know. For further questions, please minimize the example (e.g. for this question it is not relevant that you need to join some tables before converting the result into JSON), write it in English (unfamiliar identifiers make the code confusing and lead to spelling errors, so an otherwise correct idea can fail on a mistyped name) and add the input data. This helps you as well: you get an answer faster and the chance of mistakes in the code is much lower (without the data we cannot create a runnable example to check our ideas).
A nested JSON structure can only be built from the innermost nested object outwards. So first you have to create the no_izin array in a subquery. This can then be used to create the proyek object:
SELECT
json_build_object(
'nama_perusahaan',"s"."nama_perusahaan",
'proyek', json_agg(no_izin)
)
FROM (
SELECT
"a"."nama_perusahaan",
json_build_object(
'no_izin',
"b"."no_izin",
'kode_list',
json_agg(
json_build_object(
'kode',c.kode,
'judul_kode',d.judul
)
)
) AS no_izin
FROM "t_pencabutan" "a"
LEFT JOIN "t_pencabutan_non" "b" ON "a"."id_pencabutan" = "b"."id_pencabutan"
LEFT JOIN "t_pencabutan_non_b" "c" ON "b"."no_izin" = "c"."no_izin"
LEFT JOIN "t_pencabutan_non_c" "d" ON "c"."id_proyek" = "d"."id_proyek"
GROUP BY "b"."no_izin", "a"."nama_perusahaan"
) AS s
GROUP BY "s"."nama_perusahaan"
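The inside-out order can be illustrated with a small Python sketch of the same two-step grouping. The rows below are hypothetical flattened join results reconstructed from the expected output in the question, and nest is an illustrative name; the real table contents are not known.

```python
from collections import defaultdict

# Hypothetical flattened join rows: (nama_perusahaan, no_izin, kode, judul)
rows = [
    ("JASA FERRIE", "26A/E/IU/PMA/D8FD", "14302", "IND"),
    ("JASA FERRIE", "26A/E/IU/PMA/D8FD", "13121", "IND B"),
]

def nest(rows):
    # Step 1 (the subquery): group by company and no_izin, collect kode_list.
    inner = defaultdict(list)
    for company, no_izin, kode, judul in rows:
        inner[(company, no_izin)].append({"kode": kode, "judul_kode": judul})

    # Step 2 (the outer query): group the step-1 results by company.
    outer = defaultdict(list)
    for (company, no_izin), kode_list in inner.items():
        outer[company].append({"no_izin": no_izin, "kode_list": kode_list})

    return [{"nama_perusahaan": c, "proyek": p} for c, p in outer.items()]

print(nest(rows))
```

Each aggregation level gets its own pass, which is why SQL needs one query level per json_agg rather than nesting one aggregate call inside another.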

neo4j cypher index query

I used to search for nodes through indexes with node_auto_index( condition ), but now I have used batch-import ( https://github.com/jexp/batch-import/ ) and it created indexes with specific names ( type, code, etc ).
So, how do I write a Cypher query that uses indexes on multiple properties?
Old query example:
START n = node : node_auto_index( 'type: NODE_TYPE AND code: NODE_CODE' ) RETURN n;
How do I run the 'same' query without node_auto_index, using the specific index names instead?
START n = node : type( "type = NODE_TYPE" ) RETURN n;
Also, the next query does not work (no errors, but the result is empty and it shouldn't be):
START n = node : type( 'type: NODE_TYPE AND code: NODE_CODE' ) RETURN n;
So, type is an index and code is an index. How do I combine the two in the same query for a single node?
Another question: what is the difference between node_auto_index and these indexes with specific names?
Thank you.

You almost had it:
START n = node:type("type:NODE_TYPE") RETURN n;
or
START n = node:type(type="NODE_TYPE") RETURN n;

Semantic Similarity Result interpretation

I'm computing semantic similarity using a tool,
and I'm getting the following results but cannot properly interpret them:
apple#n#1,banana#n#1 0.04809463683080774
apple#n#1,banana#n#2 0.13293629283742603
apple#n#2,banana#n#1 0.0
apple#n#2,banana#n#2 0.0
Here is the code:
URL url = new URL("file", null, "dictionary/3.0/dict");
IDictionary dict = new Dictionary(url);
dict.open();
// look up the first sense of the word "dog"
IIndexWord idxWord = dict.getIndexWord("dog", POS.NOUN);
IWordID wordID = idxWord.getWordIDs().get(0); // 1st meaning
List<IWordID> wordIDs = idxWord.getWordIDs();
JWS ws = new JWS("dictionary", "3.0");
TreeMap<String, Double> scores1 = ws.getJiangAndConrath().jcn("apple", "banana", "n");
for (String s : scores1.keySet())
System.out.println(s + "\t" + scores1.get(s));

From the NLTK Documentation:
The Jiang Conrath similarity returns a score denoting how similar two
word senses are, based on the Information Content (IC) of the Least
Common Subsumer (most specific ancestor node) and that of the two
input Synsets. The relationship is given by the equation 1 / (IC(s1) +
IC(s2) - 2 * IC(lcs)).
A result of 0 means that the two concepts are not related at all.
A result near 1 would mean a very close relationship.
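As a worked example of the quoted equation, here is a minimal Python sketch. The Information Content numbers are made up for illustration and are not taken from WordNet or from the tool in the question; jcn_similarity is a hypothetical helper, not the JWS or NLTK API, and returning infinity for a zero distance is a choice made for this sketch.

```python
def jcn_similarity(ic_s1, ic_s2, ic_lcs):
    """Jiang-Conrath similarity: 1 / (IC(s1) + IC(s2) - 2 * IC(lcs))."""
    distance = ic_s1 + ic_s2 - 2 * ic_lcs
    if distance == 0:
        # Identical senses give zero distance; infinity is this sketch's choice.
        return float("inf")
    return 1.0 / distance

# Made-up IC values: two senses sharing a fairly specific ancestor ...
close = jcn_similarity(8.0, 9.0, 5.0)   # distance 7  -> about 0.143
# ... and the same senses sharing only a very general ancestor.
far = jcn_similarity(8.0, 9.0, 2.0)     # distance 13 -> about 0.077

print(close, far)
```

The more specific the shared ancestor (the higher IC(lcs)), the smaller the distance and the larger the score, so among the pairs listed in the question the higher number indicates the more closely related pair of senses.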

Could you post the Java source code that runs the LeacockAndChodorow algorithm? I am having some problems with the URL variable.
