Multi-value object search in CrateDB when this one is within an array of objects - cratedb

I'm trying to migrate our current ES to CrateDB and one of the issues I'm facing is searching for two specific values within the same object when this object is part of an array of objects.
CREATE TABLE test.artefact (
id INTEGER,
metadata ARRAY(OBJECT(STATIC) AS (
key_id INTEGER,
value TEXT
))
);
insert into test.artefact(id, metadata) values (
1,
[
{
"key_id" = 1,
"value" = 'TEST1'
},
{
"key_id" = 2,
"value" = 'TEST2'
}
]
);
So basically, I'm trying to search metadata providing key_id and value.
A select like this one finds artefact 1 as a match, even when key and value are in different objects:
select * from test.artefact where 1 = ANY(metadata['key_id']) AND 'TEST2' = ANY(metadata['value'])
I have tried other functions, like UNNEST, with no luck.

Copy from CrateDB Community:
One way that should work is
SELECT *
FROM test.artefact
WHERE {key_id = 1, value = 'TEST2'} = ANY(metadata)
however this is probably not the most performant way.
together with the queries on the fields it might be quick enough.
SELECT *
FROM test.artefact
WHERE
1 = ANY(metadata['key_id'])
AND 'TEST2' = ANY(metadata['value'])
AND {key_id = 1, value = 'TEST2'} = ANY(metadata)

Related

Query for retrieve matching json Objects as a list

Assume i have a table called MyTable and this table have a JSON type column called myjson and this column have next value as a json array hold multiple objects, for example like next:
[
{
"budgetType": "CF",
"financeNumber": 1236547,
"budget": 1000000
},
{
"budgetType": "ENVELOPE",
"financeNumber": 1236888,
"budget": 2000000
}
]
So how i can search if the record has any JSON objects inside its JSON array with financeNumber=1236547
Something like this:
SELECT
t.*
FROM
"MyTable",
LATERAL json_to_recordset(myjson) AS t ("budgetType" varchar,
"financeNumber" int,
budget varchar)
WHERE
"financeNumber" = 1236547;
Obviously not tested on your data, but it should provide a starting point.
with a as(
SELECT json_array_elements(myjson)->'financeNumber' as col FROM mytable)
select exists(select from a where col::text = '1236547'::text );
https://www.postgresql.org/docs/current/functions-json.html
json_array_elements return setof json, so you need cast.
Check if a row exists: Fastest check if row exists in PostgreSQL

Update specific object in array of objects Postgres jsonb

I am attempting to update a jsonb column pagesRead on table Books which contains an array of objects. The structure of it looks similar to this:
[{
"book": "Moby Dick",
"pagesRead": [
"1",
"2",
"3",
"4"
]
},
{
"book": "Book Thief",
"pagesRead": [
"1",
"2"
]
}]
What I am trying to do is update the pagesRead when a specific page of the book is read or if someone has started a new book, add an extra entry into it.
I am able to retrieve the specific book details, but I am unsure about how to update it.
EDIT: So I had to use the Update query from S-Man to add a book entry, but I used the Insert query from Barbaros Özhan to handle updating the page
Some thoughts before:
You should never store structured data as it is in one column. This yields problems with updates, indexing (so, searching/performance), filtering, everything. Please normalize everything into proper tables and columns
You should never store arrays. Normalize it.
Do not use type text to store integer (pages)
"pagesRead" is a sibling of your filter element ("book"). This makes it much more complicated to reference it than referencing it as a child. So think about the book name (or better: an id) as key like {"my_id_for_book_thief": {"name" : "Book Thief", "pagesRead": [...]}}. In that case, you could use a path for referencing it. Otherwise, we need to extract the array, have a look into each book attribute and reference its sibling
demo:db<>fiddle
Adding a book is quite simple (Assuming that you are using type jsonb instead of type json):
SELECT mydata || '{"book": "Lord Of The Rings", "pagesRead": []}'
FROM mytable
Update:
UPDATE mytable
SET mycolumn = mycolumn || '{"book": "Lord Of The Rings", "pagesRead": []}'
Adding a pagesRead value:
SELECT
jsonb_agg( -- 4
jsonb_build_object( -- 3
'book', elem -> 'book',
'pagesRead', CASE WHEN elem ->> 'book' = 'Moby Dick' THEN -- 2
elem -> 'pagesRead' || '"42"'
ELSE elem -> 'pagesRead' END
)
) as new_array
FROM mytable,
jsonb_array_elements(mydata) as elem -- 1
Extract the array into one record per element
Add a page if element contains correct book
Rebuild the object
Reaggregate your array.
Update would be:
UPDATE mytable
SET mycolumn = s.new_array
FROM (
-- <query above>
) s
Assuming you want to add a new page for the second book (Book Thief), then using JSONB_INSERT() function with the following Update Statement will be enough
UPDATE books
SET pagesRead = JSONB_INSERT(pagesRead,'{1,pagesRead,1}','"3"'::JSONB,true)
But, in order to make it a dynamical solution, without knowing the position of the book within the main array, and adding the new page number to the end of the pagesRead array of the desired book, determine the position, and the related array's length within the subquery as
WITH b AS
(
SELECT idx-1 AS pos1,
JSONB_ARRAY_LENGTH( (j ->> 'pagesRead')::JSONB )-1 AS pos2
FROM books
CROSS JOIN JSONB_ARRAY_ELEMENTS(pagesRead)
WITH ORDINALITY arr(j,idx)
WHERE j ->> 'book' = 'Book Thief'
)
UPDATE books
SET pagesRead =
JSONB_INSERT(
pagesRead,
('{'||pos1||',pagesRead,'||pos2||'}')::TEXT[],
--# pos1 stands for the position within the main array
--# pos2 stands for the position within the related pagesRead array
'"3"'::JSONB, --# an arbitrary page number
true --# the new page value will be inserted after the target path
)
FROM b
Demo

Update object field of element in array jsonb with postgres

I have following jsonb column which name is data in my sql table.
{
"special_note": "Some very long special note",
"extension_conditions": [
{
"condition_id": "5bfb8b8d-3a34-4cc3-9152-14139953aedb",
"condition_type": "OPTION_ONE"
},
{
"condition_id": "fbb60052-806b-4ae0-88ca-4b1a7d8ccd97",
"condition_type": "OPTION_TWO"
}
],
"floor_drawings_file": "137c3ec3-f078-44bb-996e-161da8e20f2b",
}
What I need to do is to update every object's field with name condition_type in extension_conditions array field from OPTION_ONE to MARKET_PRICE and OPTION_TWO leave the same.
Consider that this extension_conditions array field is optional so I need to filter rows where extension_conditions is null
I need a query which will update all my jsonb columns of rows of this table by rules described above.
Thanks in advance!
You can use such a statement containing JSONB_SET() function after determining the position(index) of the related key within the array
WITH j AS
(
SELECT ('{extension_conditions,'||idx-1||',condition_type}')::TEXT[] AS path, j
FROM tab
CROSS JOIN JSONB_ARRAY_ELEMENTS(data->'extension_conditions')
WITH ORDINALITY arr(j,idx)
WHERE j->>'condition_type'='OPTION_ONE'
)
UPDATE tab
SET data = JSONB_SET(data,j.path,'"MARKET_PRICE"',false)
FROM j
Demo 1
Update : In order to update for multiple elements within the array, the following query containing nested JSONB_SET() might be preferred to use
UPDATE tab
SET data =
(
SELECT JSONB_SET(data,'{extension_conditions}',
JSONB_AGG(CASE WHEN j->>'condition_type' = 'OPTION_ONE'
THEN JSONB_SET(j, '{condition_type}', '"MARKET_PRICE"')
ELSE j
END))
FROM JSONB_ARRAY_ELEMENTS(data->'extension_conditions') AS j
)
WHERE data #> '{"extension_conditions": [{"condition_type": "OPTION_ONE"}]}';
Demo 2

Groovy SQL named list parameter

I want to use a keyset of a Map as a list parameter in a SQL query:
query = "select contentid from content where spaceid = :spaceid and title in (:title)"
sql.eachRow(query, [spaceid: 1234, title: map.keySet().join(',')]) {
rs ->
println rs.contentid
}
I can use single values but no Sets or Lists.
This is what I've tried so far:
map.keySet().join(',')
map.keySet().toListString()
map.keySet().toList()
map.keySet().toString()
The map uses Strings as key
Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
Also, I don't get an error. I just get nothing printed like have an empty result set.
You appoach will not give the expected result.
Logically you are using a predicate such as
title = 'value1,value2,value3'
This is the reason why you get no exception but also no data.
Quick search gives a little evidence, that a mapping of a collections to IN list is possible in Groovy SQL.
Please check here and here
So very probably you'll have to define the IN list in a proper length and assign the values from your array.
title in (:key1, :key2, :key3)
Anyway something like this works fine:
Data
create table content as
select 1 contentid, 1 spaceid, 'AAA' title from dual union all
select 2 contentid, 1 spaceid, 'BBB' title from dual union all
select 3 contentid, 2 spaceid, 'AAA' title from dual;
Groovy Script
map['key1'] = 'AAA'
map['key2'] = 'BBB'
query = "select contentid from content where spaceid = :spaceid and title in (${map.keySet().collect{":$it"}.join(',')})"
println query
map['spaceid'] = 1
sql.eachRow(query, map) {
rs ->
println rs.contentid
}
Result
select contentid from content where spaceid = :spaceid and title in (:key1,:key2)
1
2
The key step is to dynamicall prepare the IN list with proper names of the bind variable using the experssion map.keySet().collect{":$it"}.join(',')
Note
You may also want to check the size if the map and handle the case where it is greater than 1000, which is an Oracle limitation of a single IN list.
It has worked for me with a little adaptation, I've added the map as a second argument.
def sql = Sql.newInstance("jdbc:mysql://localhost/databaseName", "userid", "pass")
Map<String,Long> mapProduitEnDelta = new HashMap<>()
mapProduitEnDelta['key1'] = 1
mapProduitEnDelta['key2'] = 2
mapProduitEnDelta['key3'] = 3
produits : sql.rows("""select id, reference from Produit where id IN (${mapProduitEnDelta.keySet().collect{":$it"}.join(',')})""",mapProduitEnDelta),
Display the 3 products (colums + values from the produit table) of id 1, 2, 3

Creating a new table from grouped substring of existing table

I am having some trouble creating some SQL (for SQL server 2008).
I have a table of tasks that are priority ordered, comma delimited tasks:
Id = 1, LongTaskName = "a,b,c"
Id = 2, LongTaskName = "a,c"
Id = 3, LongTaskName = "b,c"
Id = 4, LongTaskName = "a"
etc...
I am trying to build a new table that groups them by the first task, along with the id:
GroupName: "a", TaskId: 1
GroupName: "a", TaskId: 2
GroupName: "a", TaskId: 4
GroupName: "b", TaskId: 3
Here is the naive, slow, linq code:
foreach(var t in Tasks)
{
var gt = new GroupedTasks();
gt.TaskId = t.Id;
var firstWord = t.LongTaskName.Split(',');
if(firstWord.Count() > 0)
{
gt.GroupName = firstWord.First();
}
else
{
gt.GroupName = t.LongTaskName;
}
GroupedTasks.InsertOnSubmit(gt);
}
I wrote a sql function to do the string split:
create function fn_Split(
#String nvarchar (4000),
#Delimiter nvarchar (10)
)
returns nvarchar(4000)
begin
declare #FirstComma int
set #FirstComma = charindex(#Delimiter,#String)
if(#FirstComma = 0)
return #String
return substring(#String, 0, #FirstComma)
end
go
However, I am getting stuck on the real sql to do the work.
I can get the group by alone:
SELECT dbo.fn_Split(LongTaskName, ',')
FROM [dbo].[Tasks]
GROUP BY dbo.fn_Split(LongTaskName, ',')
And I know I need to head down something like this:
DECLARE #RowSet TABLE (GroupName nvarchar(1024), Id nvarchar(5))
insert into #RowSet
select ???
FROM [dbo].Tasks as T
INNER JOIN
(
SELECT dbo.fn_Split(LongTaskName, ',')
FROM [dbo].[Tasks]
GROUP BY dbo.fn_Split(LongTaskName, ',')
) G
ON T.??? = G.???
ORDER BY ???
INSERT INTO dbo.GroupedTasks(GroupName, Id)
select * from #RowSet
But I am not quite groking how to reference the grouped relationships and am confused about having to call split multiple times.
Any thoughts?
If you only care about the first item in the list, there's no need really for a function. I would recommend this way. You also don't need the #RowSet table variable for any temporary holding.
INSERT dbo.GroupedTasks(GroupName, Id)
SELECT
LEFT(LongTaskName, COALESCE(NULLIF(CHARINDEX(',', LongTaskName)-1, -1), 1024)),
Id
FROM dbo.Tasks;
It is even easier if the tasks are 1-character long, you can use LEFT(LongTaskName, 1) instead of the ugly SUBSTRING/CHARINDEX mess. But I'm guessing your task names are not one character long (if this is the case, you should include some data that varies a bit so that others don't make assumptions about length).
Now, keep in mind that you'll have to do something like this to keep dbo.GroupedTasks up to date every time a dbo.Tasks row is inserted, updated or deleted. How are you going to keep these two tables in sync?
More to the point, you should consider storing the top priority task separately in the first place, either by using a computed column or separating it out before the insert. Munging data together is something that you do with hash tables and arrays in application code, but it rarely has any positive attributes inside a database. You almost always spend more time and effort extracting the data apart than you ever saved by keeping it together in the first place. This will negate the need for a second table at all.
Select Id, Split( ',', LongTaskName ) as GroupName into TasksWithGroupInfo
Does this answer your question?