Make an IN statement using two attributes in ActiveRecord - sql

I've been trying this for a while, and can't seem to get it right in ActiveRecord.
Given an array of asset_id and asset_type pairs, query a class that has both those attributes, only where both asset_id and asset_type match.
So given the array
[[4,"Logo"],[1,"Image"]]
I want to generate the SQL
SELECT "asset_attachments".* FROM "asset_attachments" WHERE ((asset_id,asset_type) IN ((4,'Logo'),(1,'Image')))
I can do this by manually entering a string using where like this:
AssetAttachment.where("(asset_id,asset_type) IN ((4,'Logo'),(1,'Image'))")
But I'm trying to use it with an array of any length and asset type/id.
So far I've tried
AssetAttachment.where([:asset_id, :asset_type] => [[4,"Logo"],[1,"Image"]])
NoMethodError: undefined method `to_sym' for [:asset_id, :asset_type]:Array
and
AssetAttachment.where("(asset_id,asset_type)" => [[4,"Logo"],[1,"Image"]])
ActiveRecord::StatementInvalid: PG::Error: ERROR: column asset_attachments.(asset_id,asset_type) does not exist
and
AssetAttachment.where("(asset_id,asset_type) IN (?,?)",[[4,"Logo"],[1,"Image"]])
ActiveRecord::PreparedStatementInvalid: wrong number of bind variables (1 for 2) in: (asset_id,asset_type) IN (?,?)
Does anyone know how to do this? Thanks in advance

set vs. array
The core of the problem is: you are mixing sets and arrays in an impossible way.
elem IN (...) .. expects a set.
elem = ANY(...) .. expects an array.
You can use unnest() to transform an array to a set.
You can use the aggregate function array_agg() to transform a set to an array.
Errors
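Here, you are trying to use the array [:asset_id, :asset_type] as a single hash key:
AssetAttachment.where([:asset_id, :asset_type] => [[4,"Logo"],[1,"Image"]])
.. which fails before any SQL is generated: hash conditions expect each key to name one column, so the array cannot be converted (hence the to_sym error). An SQL array of (asset_id, asset_type) values would be impossible as well, since arrays have to consist of identical types, while we are obviously dealing with a numeric and a string constant (you kept the actual types a secret).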
Here, you force "(asset_id, asset_type)" to be treated as a single column name by passing it as a hash key, which gets quoted as an identifier:
AssetAttachment.where("(asset_id,asset_type)" => [[4,"Logo"],[1,"Image"]])
And finally, here you try to provide a single bind variable for two ? placeholders:
AssetAttachment.where("(asset_id,asset_type) IN (?,?)",[[4,"Logo"],[1,"Image"]])
Valid SQL
In pure SQL, any of these works:
SELECT * FROM asset_attachments
WHERE (asset_id, asset_type) IN ((4, 'Logo'), (1, 'Image'));
SELECT * FROM asset_attachments
WHERE (asset_id, asset_type) IN (VALUES(4, 'Logo'), (1, 'Image'));
SELECT * FROM asset_attachments
WHERE (asset_id, asset_type) = ANY (ARRAY[(4, 'Logo'), (1, 'Image')]);
If you have a long list of possible matches, an explicit JOIN would prove faster:
SELECT *
FROM asset_attachments
JOIN (VALUES(4, 'Logo'), (1, 'Image')) AS v(asset_id, asset_type)
USING (asset_id, asset_type)
Valid syntax for AR
I am an expert with Postgres, with AR not so much. This simple form might work:
AssetAttachment.where("(asset_id,asset_type) IN ((?,?),(?,?))", 4,"Logo",1,"Image")
I also wondered about a tuple-style form like the following, but it is not valid Ruby (there is no (a, b) tuple literal), so it will not even parse:
AssetAttachment.where((:asset_id, :asset_type) => [(4,'Logo'),(1,'Image')])
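For a list of any length, the placeholder form above can be generated instead of typed by hand. The following is a minimal sketch, not a tested ActiveRecord recipe: pair_in_condition is a hypothetical helper name, and it assumes every element of the input is a two-element [asset_id, asset_type] pair.

```ruby
# Hypothetical helper (not part of ActiveRecord): builds the SQL fragment
# and the flat list of bind values for a row-value IN condition.
def pair_in_condition(pairs)
  # An empty IN () list is invalid SQL, so refuse it up front.
  raise ArgumentError, "pairs must not be empty" if pairs.empty?

  # One "(?,?)" group per pair, joined with commas.
  placeholders = (["(?,?)"] * pairs.length).join(",")

  # Flatten one level so each ? receives exactly one value.
  ["(asset_id,asset_type) IN (#{placeholders})", *pairs.flatten(1)]
end

sql, *binds = pair_in_condition([[4, "Logo"], [1, "Image"]])

# With ActiveRecord loaded, the result would be passed on like this:
#   AssetAttachment.where(sql, *binds)
```

Only the placeholders are interpolated into the string; the values themselves still go through bind variables, so quoting is left to ActiveRecord.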

Related

extract values inside an array column in amazon athena

I have a table in athena aws where the column 'metadata_stopinfo' has the structure that you can see in the image.
I am trying to extract values that are inside that array, however when I try
SELECT
"json_extract_scalar"(metadata_stopinfo, '$.city')
FROM "table"
I have the following problem
SYNTAX_ERROR: line 2:5: Unexpected parameters (array(row("address" row("addressline" varchar,"city" varchar,"countrycode" varchar,"countrycodeoriginal" varchar,"state" varchar,"zipcode" varchar),"carrierreference" varchar,"contacts" array(row("contacttype" varchar,"email" varchar,"fax" varchar,"mobilephone" varchar,"name" varchar,"officephone" varchar,"userid" varchar)),"containerinfo" array(row("containerid" varchar,"containeridtype" varchar,"equipmentcode" varchar,"equipmenttype" varchar)),"conveyancelinenumber" varchar,"conveyancetype" varchar,"conveyancetypeoriginal" varchar,"dateinfo" row("arrivalestimateddate" varchar,"arrivalestimateddateend" varchar,"arrivalestimatedendoffset" varchar,"arrivalestimatedoffset" varchar,"arrivalrequesteddate" varchar,"deliveryestimateddate" varchar,"deliveryestimateddateend" varchar,"deliveryestimatedendoffset" varchar,"deliveryestimatedoffset" varchar,"deliveryrequesteddate" varchar,"deliveryrequesteddateend" varchar,"deliveryrequestedendoffset" varchar,"deliveryrequestedoffset" varchar,"departureestimateddate" varchar,"departureestimateddateend" varchar,"departureestimatedendoffset" varchar,"departureestimatedoffset" varchar,"departurerequesteddate" varchar,"pickuprequesteddate" varchar,"pickuprequesteddateend" varchar,"pickuprequestedendoffset" varchar,"pickuprequestedoffset" varchar,"pickupestimateddate" varchar,"pickupestimateddateend" varchar,"pickupestimatedendoffset" varchar,"pickupestimatedoffset" varchar),"deliverynotenumber" varchar,"instructions" array(row("customerspecificsubtype" varchar,"header" boolean,"instructionsubtype" varchar,"instructiontype" varchar,"text" varchar)),"locationid" varchar,"partnercarrieraddress" row("addressline" varchar,"city" varchar,"countrycode" varchar,"countrycodeoriginal" varchar,"state" varchar,"zipcode" varchar),"partnercarriercontacts" array(row("contacttype" varchar,"email" varchar,"fax" varchar,"name" varchar,"officephone" varchar)),"partnercarrierid" 
varchar,"partnercarriername" varchar,"partnerid" varchar,"partnername" varchar,"partnertimezone" varchar,"partnertype" varchar,"productquantity" row("number" double,"originalunitofmeasure" varchar,"quantitytype" varchar,"unitofmeasure" varchar),"sequencenumber" bigint,"shipmentidentifier" varchar,"stoptype" varchar,"transportinfo" row("description" varchar,"transportcode" varchar,"transportoriginalcode" varchar),"vesselinfo" row("lloydsnumber" varchar,"shipsradiocallnumber" varchar,"vesselname" varchar,"vesselnumber" varchar,"voyagetripnumber" varchar))), varchar(6)) for function json_extract_scalar. Expected: json_extract_scalar(varchar(x), JsonPath) , json_extract_scalar(json, JsonPath)
My question is: how can I extract values inside the column?
json_extract_scalar unsurprisingly works with JSON (note that even if your data were in JSON format, json_extract_scalar(metadata_stopinfo, '$.city') still would not have worked, because your data is an array), while your column contains arrays of rows, so you need to work with it accordingly. For example, you can use indexes to access elements in an array (in Presto, array indexes start from 1):
SELECT
metadata_stopinfo[1] r
FROM "table"
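And then access the fields. As the documentation puts it:
"The fields may be of any SQL type, and are accessed with field reference operator ."
According to the type printed in your error message, city is nested inside the address row, so it is reached through address:
SELECT
metadata_stopinfo[1].address.city city
FROM "table"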
You can also flatten the array with unnest (city sits inside the address row of each element):
SELECT r.address.city
FROM "table",
unnest(metadata_stopinfo) as t(r)

Fetching attribute from JSON string with JSON_VAL cause "<attribute> is invalid in the used context" error

A proprietary third-party application stores JSON strings in its database like this one:
{"state":"complete","timestamp":1614776473000}
I need the timestamp and found out that DB2 offers JSON functions. Since it's stored as a string in the PROF_VALUE column, I guess that converting with SYSTOOLS.JSON2BSON is required before I can use JSON_VAL to fetch the timestamp:
SELECT SYSTOOLS.JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), "timestamp", "f")
FROM EMPINST.PROFILE_EXTENSIONS ext
WHERE PROF_PROPERTY_ID = 'touchpointState'
This causes an error that timestamp is invalid in the used context (SQLCODE=-206, SQLSTATE=42703, DRIVER=4.26.14). The same error is thrown when I remove the JSON2BSON call like this:
SELECT SYSTOOLS.JSON_VAL(PROF_VALUE, "timestamp", "f")
These variants with a different result type fail with the same error:
SELECT SYSTOOLS.JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), "state", "s:1000")
SELECT SYSTOOLS.JSON_VAL(PROF_VALUE, "state", "s:1000")
I don't understand this error. My syntax follows the documented JSON_VAL ( json-value , search-string , result-type ) and is the same as in the examples, where they show how to fetch the name field of an object.
I also played around a bit with JSON_TABLE to use raw input data for testing (instead of the database data), but it does not seem suitable for that.
SELECT *
FROM TABLE(SYSTOOLS.JSON_TABLE( SYSTOOLS.JSON2BSON('{"state":"complete","timestamp":1614776473000}'), 'state','s:32')) DATA
This gave me a table with one row: Type = 2 and Value = complete.
I had two problems in my query. First, double quotes " are for object references (identifiers) in DB2, so "timestamp" was interpreted as a column name rather than a string, which explains the SQLCODE -206 error; string literals need single quotes '. I wasn't aware of the difference, because in most databases I have used so far, single and double quotes are interchangeable.
The second problem is that JSON_VAL needs to be called without the SYSTOOLS prefix, while the prefix is still needed on SYSTOOLS.JSON2BSON(PROF_VALUE).
With those changes, the following query worked:
SELECT JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), 'timestamp', 'f')
FROM EMPINST.PROFILE_EXTENSIONS ext
WHERE PROF_PROPERTY_ID = 'touchpointState'

Clean a JSON in a PostgreSQL request

I have a SQL request that is almost perfect (for what I want to do):
WITH liste_fichiers_joints AS (
SELECT
id_dans_table,
ARRAY_AGG (row_to_json(f)) ids_fichier
FROM
fichiers_joints fj
LEFT JOIN fichiers f ON f.id = fj.id_fichier
WHERE
nom_table = 'taches'
GROUP BY
id_dans_table
)
SELECT t.id, t.nom, lfj.ids_fichier
FROM taches t
JOIN liste_fichiers_joints lfj ON lfj.id_dans_table = t.id
As you may have guessed, I'd like a single request that returns all the tasks: the id of each task, its name, and also an array with the ids and names of the attached files, if there are any.
The result is nearly what I want, but the last column displays this:
{"{\"uuid\":\"fd809b1f-6849-4322-a654-67f70c46a435\",\"nom\":\"test.png\",\"date\":\"2020-11-17T01:21:24.223354\",\"status\":\"TMP\",\"id\":185}"}
I'd like to remove the uuid and status parts. I tried some subqueries, to no avail.
Also, I'd like to remove the backslashes \, because otherwise it will be complicated to use this column as JSON in my JavaScript.
Does anybody have a clue?
Thanks in advance.
You can use json[b]_build_object() instead of row_to_json(): it accepts a list of key/value pairs, so you have fine-grained control over what goes into your objects.
Also, you most likely want a JSON array, rather than a Postgres array of JSON objects.
I would recommend changing this:
ARRAY_AGG (row_to_json(f)) ids_fichier
To:
jsonb_agg(
jsonb_build_object('nom', f.nom, 'date', f.date, 'id', f.id)
) as ids_fichier

Rails 4 querying against postgresql column with array data type error

I am trying to query a table with a column with the postgresql array data type in Rails 4.
Here is the table schema:
create_table "db_of_exercises", force: true do |t|
t.text "preparation"
t.text "execution"
t.string "category"
t.datetime "created_at"
t.datetime "updated_at"
t.string "name"
t.string "body_part", default: [], array: true
t.hstore "muscle_groups"
t.string "equipment_type", default: [], array: true
end
The following query works:
SELECT * FROM db_of_exercises WHERE ('Arms') = ANY (body_part);
However, this query does not:
SELECT * FROM db_of_exercises WHERE ('Arms', 'Chest') = ANY (body_part);
It throws this error:
ERROR: operator does not exist: record = character varying
This does not work for me either:
SELECT * FROM "db_of_exercises" WHERE "body_part" IN ('Arms', 'Chest');
That throws this error:
ERROR: array value must start with "{" or dimension information
So, how do I query a column with an array data type in ActiveRecord??
What I have right now is:
@exercises = DbOfExercise.where(body_part: params[:body_parts])
I want to be able to query records that have more than one body_part associated with them, which was the whole point of using an array data type, so if someone could enlighten me on how to do this that would be awesome. I don't see it anywhere in the docs.
Final solution for posterity:
Using the overlap operator (&&):
SELECT * FROM db_of_exercises WHERE ARRAY['Arms', 'Chest'] && body_part;
I was getting this error:
ERROR: operator does not exist: text[] && character varying[]
so I cast ARRAY['Arms', 'Chest'] to varchar[]:
SELECT * FROM db_of_exercises WHERE ARRAY['Arms', 'Chest']::varchar[] && body_part;
and that worked.
I don't think this is related to Rails.
What if you do the following?
SELECT * FROM db_of_exercises WHERE 'Arms' = ANY (body_part) OR 'Chest' = ANY (body_part)
I know that Rails 4 supports the PostgreSQL array data type, but I'm not sure whether ActiveRecord adds new query methods for it. Maybe you can use array overlap, I mean the && operator, and then do something like:
WHERE ARRAY['Arms', 'Chest'] && body_part
or maybe give a look to this gem: https://github.com/dockyard/postgres_ext/blob/master/docs/querying.md
And then do a query like:
DbOfExercise.where.overlap(:body_part => params[:body_parts])
@Aguardientico is absolutely right that what you want is the array overlaps operator &&. I'm following up with some more explanation, but would prefer you to accept that answer, not this one.
Anonymous rows (records)
The construct ('item1', 'item2', ...) is a row-constructor unless it appears in an IN (...) list. It creates an anonymous row, which PostgreSQL calls a "record". The error:
ERROR: operator does not exist: record = character varying
is because ('Arms', 'Chest') is being interpreted as if it were ROW('Arms', 'Chest'), which produces a single record value:
craig=> SELECT ('Arms', 'Chest'), ROW('Arms', 'Chest'), pg_typeof(('Arms', 'Chest'));
row | row | pg_typeof
--------------+--------------+-----------
(Arms,Chest) | (Arms,Chest) | record
(1 row)
and PostgreSQL has no idea how it's supposed to compare that to a string.
I don't really like this behaviour; I'd prefer it if PostgreSQL required you to explicitly use a ROW() constructor when you want an anonymous row. I expect that the behaviour shown here exists to support SET (col1,col2,col3) = (val1,val2,val3) and other similar operations where a ROW(...) constructor wouldn't make as much sense.
But the same thing with a single item works?
The reason the single ('Arms') case works is because unless there's a comma it's just a single parenthesised value where the parentheses are redundant and may be ignored:
craig=> SELECT ('Arms'), ROW('Arms'), pg_typeof(('Arms')), pg_typeof(ROW('Arms'));
?column? | row | pg_typeof | pg_typeof
----------+--------+-----------+-----------
Arms | (Arms) | unknown | record
(1 row)
Don't be alarmed by the type unknown. It just means that it's a string literal that hasn't yet had a type applied:
craig=> SELECT pg_typeof('blah');
pg_typeof
-----------
unknown
(1 row)
Compare array to scalar
This:
SELECT * FROM "db_of_exercises" WHERE "body_part" IN ('Arms', 'Chest');
fails with:
ERROR: array value must start with "{" or dimension information
because of implicit casting. The type of the body_part column is varchar[] (treated much like text[] in PostgreSQL, although an operator such as && still needs both sides to be the same array type, as the text[] && character varying[] error in the question shows). You're comparing it for equality with the values in the IN clause, which are unknown-typed literals. The only valid equality operator for an array is = another array of the same type, so PostgreSQL figures that the values in the IN clause must also be arrays of that type and tries to parse them as arrays.
Since they aren't written as array literals, like {"FirstValue","SecondValue"}, this parsing fails. Observe:
craig=> SELECT 'Arms'::text[];
ERROR: array value must start with "{" or dimension information
LINE 1: SELECT 'Arms'::text[];
^
See?
It's easier to understand this once you see that IN is actually just a shorthand for = ANY. It's an equality comparison to each element in the IN list. That isn't what you want if you really want to find out if two arrays overlap.
So that's why you want to use the array overlaps operator &&.
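On the Rails side, the overlap test can be written as a plain string condition, without the extra gem. This is a minimal sketch under stated assumptions: overlap_condition is a hypothetical helper name, the column is assumed to be varchar[] as in the schema above, and it relies on ActiveRecord expanding an array bound to ? into a comma-separated list of quoted values.

```ruby
# Hypothetical helper (not part of ActiveRecord): returns the SQL fragment
# and the bind value for an array-overlap (&&) test on a varchar[] column.
def overlap_condition(column, values)
  values = Array(values).map(&:to_s)
  raise ArgumentError, "values must not be empty" if values.empty?

  # The column name is interpolated into the SQL string, so only accept
  # a plain lowercase identifier here.
  unless column.to_s.match?(/\A[a-z_][a-z0-9_]*\z/)
    raise ArgumentError, "unexpected column name"
  end

  # The cast makes both sides varchar[], avoiding the
  # "text[] && character varying[]" operator error.
  ["#{column} && ARRAY[?]::varchar[]", values]
end

sql, values = overlap_condition(:body_part, ["Arms", "Chest"])

# With ActiveRecord loaded, the result would be passed on like this:
#   @exercises = DbOfExercise.where(sql, values)
```

The cast to varchar[] mirrors the fix from the question; on a text[] column it would be ::text[] instead.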

doctrine native sql not accepting parameter list

I'm trying to run native SQL in Doctrine. Basically I have 2 parameters:
CANDIDATE_ID - the user for whom we delete entries,
a list of FILE_ID values to keep
So I wrote
$this->getEntityManager()->getConnection()->
executeUpdate( "DELETE FROM FILE WHERE CANDIDATE_ID = :ID AND NOT ID IN :KEEPID",
array(
"ID" => $candidate->id,
"KEEPID" => array(2) )
);
But Doctrine fails:
Notice: Array to string conversion in D:\xampp\htdocs\azk\vendor\doctrine\dbal\lib\Doctrine\DBAL\Connection.php on line 786
Is this a bug in Doctrine? Elsewhere I'm doing a SELECT with IN, but through QueryBuilder, and it works. Maybe someone could suggest a better way of deleting entries, with QueryBuilder for example?
$stmt = $conn->executeQuery('SELECT * FROM articles WHERE id IN (?)',
array(array(1, 2, 3, 4, 5, 6)),
array(\Doctrine\DBAL\Connection::PARAM_INT_ARRAY)
);
From Doctrine's documentation.
You can't pass an array of IDs to a plain parameter. You can do this for scalar values, but even if the array had a 'toString', the result wouldn't be what you want.
String concatenation is one method:
"DELETE FROM FILE WHERE CANDIDATE_ID = :ID AND NOT ID IN (". implode(",", $list_of_ids) .")"
But this method bypasses parameters entirely, so it suffers in terms of readability, exposes the query to SQL injection unless every ID is validated first, and is limited to a maximum statement length, which can vary between databases.
Another approach is to write a function returning a table result, which takes a string of IDs as a parameter.
You could also solve this with a join to a table containing the IDs to keep.
It's a problem I've seen many times with few good answers, but it's usually caused by a misunderstanding in the way the database is modelled. This is a 'code smell' for database access.