How to query a Postgres `RECORD` datatype

I have a query that will return a row as a RECORD data type from a subquery - see below for example:
select *
from (
select row(st.*) table_rows
from some_table st
) x
where table_rows[0] = 339787
I am trying to further qualify it in the WHERE clause, and I need to do so by extracting one of the fields of the returned RECORD data type.
When I do the above, I get an error saying:
ERROR: cannot subscript type record because it is not an array
Does anybody know of a way of implementing this?

Use (row).column_name. You can just refer to the table itself to create the record:
select *
from (
select r
from some_table r
) x
where (r).column_name = 339787
There is a small chance that a column is later created with the same name as the alias you chose, in which case the above query will fail: select r would then return the newly created column instead of the record. The first solution is to use the row constructor, as you did in your question:
select row(r.*) as r
The second solution is to use the schema qualified name of the table:
select my_schema.some_table as r
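For instance, a minimal end-to-end sketch of the alias approach (assuming some_table has an integer column named id):
select *
from (
    select st as r          -- st aliases the table; r becomes a composite column
    from some_table st
) x
where (x.r).id = 339787;    -- the parentheses tell the parser r is a row value, not a table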

Alternatively, you can try this:
select *
from (
select *
from tbl
) x
where x.col_name = 339787


How to pass a string of column name as a parameter into a CREATE TABLE FUNCTION in BigQuery

I want to create a table function that takes two arguments, fieldName and parameter, so that I can later use this function to create tables with other fieldName and parameter pairs. I tried multiple ways, and it seems like fieldName (a column name) is always parsed as a string literal in the WHERE clause. I'm wondering how I should do this correctly.
CREATE OR REPLACE TABLE FUNCTION dataset.functionName( fieldName ANY TYPE, parameter ANY TYPE)
as
(SELECT *
FROM `dataset.table`
WHERE format("%t",fieldName ) = parameter
)
Later, I call the function as:
SELECT *
from dataset.functionName( 'passed_qa', 'yes')
(passed_qa is a column name; assume it only has 'yes' and 'no' values.)
I tried using EXECUTE IMMEDIATE and it works, but I just want to know if there's a way to approach this with a table function.
Thanks for any help!
Good news - it is possible! (Side note: in my experience there have been very few cases, if any, where something couldn't be achieved in BigQuery either directly or through a workaround.)
See example below
create or replace table function dataset.functionName(fieldName any type, parameter any type)
as (
  select *
  from `bigquery-public-data.utility_us.us_states_area` t
  where exists (
    -- serialize the row to JSON, then pair its keys and values by position
    select true
    from unnest(`bqutil.fn.json_extract_keys`(to_json_string(t))) key with offset
    join unnest(`bqutil.fn.json_extract_values`(to_json_string(t))) value with offset
    using(offset)
    -- keep the row if any column name/value pair matches the arguments
    where key = fieldName and value = parameter
  )
)
Now that the table function is created, run the query below and see the result:
select *
from dataset.functionName('state_abbreviation', 'GU')
You will get the record for Guam.
Then try the following:
select *
from dataset.functionName('division_code', '0')
and you will see the corresponding output.
For details see:
https://cloud.google.com/bigquery/docs/reference/standard-sql/table-functions
A workaround can be to use a CASE expression to select the desired column. If any column may be needed, use the solution of Mikhail Berlyant above.
CREATE OR REPLACE TABLE FUNCTION Test.HL(fieldName STRING, parameter ANY TYPE)
AS (
  SELECT *
  FROM (SELECT "1" AS tmp, 2 AS passed_qa)  # generate a dummy table
  WHERE CASE
          WHEN fieldName = "passed_qa" THEN FORMAT("%t", passed_qa)
          WHEN fieldName = "tmp" THEN FORMAT("%t", tmp)
          ELSE ERROR(CONCAT('column ', fieldName, ' not found'))
        END = parameter
);
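For example, a hypothetical call against that dummy table (note that FORMAT("%t", 2) renders the integer as the string '2'):
SELECT * FROM Test.HL('passed_qa', '2')   -- returns the dummy row
SELECT * FROM Test.HL('oops', 'yes')      -- raises: column oops not found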

show columns in CTE returns an error - why?

I have a show columns query that works fine:
SHOW COLUMNS IN table
but it fails when trying to put it in a CTE, like this:
WITH columns_table AS (
SHOW COLUMNS IN table
)
SELECT * from columns_table
Any ideas why, and how to fix it?
Use RESULT_SCAN, which:
Returns the result set of a previous command (within 24 hours of when you executed the query) as if the result was a table. This is particularly useful if you want to process the output from a SHOW or DESC[RIBE] command that you executed.
Run the SHOW command first, then read its result through LAST_QUERY_ID() in the same session:
SHOW COLUMNS IN ...;
WITH columns_table AS (
SELECT *
FROM table(RESULT_SCAN(LAST_QUERY_ID()))
)
SELECT *
FROM columns_table;
A CTE requires a SELECT clause, so SHOW COLUMNS cannot be used inside one. As an alternative, use INFORMATION_SCHEMA to retrieve the metadata, like below:
WITH columns_table AS (
SELECT * FROM INTL_DB.INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'CURRENCIES'
)
SELECT * from columns_table;
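For instance, to pull just the names and types of the columns, a minimal sketch against the same view:
WITH columns_table AS (
    SELECT COLUMN_NAME, DATA_TYPE
    FROM INTL_DB.INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = 'CURRENCIES'
)
SELECT * FROM columns_table;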

How to use a multi-element string for an IN SQL query?

Is it possible to use the value from one field of the database in another query in combination with the IN statement? The issue is that the string I use for IN contains several comma-separated values:
SELECT id, name
FROM refPlant
WHERE id IN (SELECT cover
FROM meta_characteristic
WHERE id = 2);
The string returned by the subquery is: 1735,1736,1737,1738,1739,1740,1741,1742,1743,1744
The query above gives me only the first element of the string. But when I put the string directly into the query, I get all ten elements:
SELECT id, name
FROM refPlant
WHERE id IN (735,1736,1737,1738,1739,1740,1741,1742,1743,1744);
Is it possible to get all ten elements, and not only one, with a query like the first one?
My SQL version is 10.1.16-MariaDB.
You can use FIND_IN_SET in the join condition.
SELECT r.id, r.name
FROM refPlant r
JOIN (SELECT * FROM meta_characteristic m WHERE id=2) m
ON FIND_IN_SET(r.id,m.cover) > 0
If you use a subquery as in the first code snippet, IN gets one comparison value per row the subquery returns. It does not work when the whole list comes back as a single string field: the string is coerced to a number, which keeps only the leading numeric prefix, so only the first element matches. Instead, apply FIND_IN_SET to the returned string:
SELECT id, name
FROM refPlant
WHERE FIND_IN_SET(id, (SELECT cover
FROM meta_characteristic
WHERE id = 2));
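For reference, a small sketch of how FIND_IN_SET behaves with such a string (list values taken from the question):
SELECT FIND_IN_SET('1737', '1735,1736,1737,1738'); -- 3: position within the list
SELECT FIND_IN_SET('9999', '1735,1736,1737,1738'); -- 0: not found, i.e. falsy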

BigQuery IF field exists

In short: is there a way to query fields in BQ that don't exist, receiving NULLs for those fields?
I have almost the same issue as in
BigQuery IF field exists THEN, but sometimes my APIs query tables where certain fields do not exist (historic tables), and that approach fails because it needs a table with the field:
SELECT a, b, c, COALESCE(my_field, 0) as my_field
FROM
(SELECT * FROM <somewhere w/o my_field>),
(SELECT * FROM <somewhere with my_field>)
Is there a way to do something like:
SELECT IFEXISTS(a, NULL) as the-field
FROM <somewhere w/o my_field>
Let's assume your table has x and y fields only!
So the query below will work perfectly:
SELECT x, y FROM YourTable
But the one below will fail because of the non-existing field z:
SELECT x, y, z FROM YourTable
The way to address this is as below (in legacy SQL the comma between tables means UNION ALL, so the fake one-row table adds a z column to the combined result; real rows get NULL for it, which COALESCE turns into 0):
#legacySQL
SELECT x, y, COALESCE(z, 0) as z
FROM
(SELECT * FROM YourTable),
(SELECT true AS fake, NULL as z)
WHERE fake IS NULL
EDIT: added an explicit #legacySQL marker so as not to confuse those trying to apply this exact approach in Standard SQL :o)
Like #phaigeim, I wasn't able to use Mikhail's answer in 2019 - I got "Column name z is ambiguous".
I wound up using the BigQuery Information Schema tables to check if the column exists, and otherwise do SELECT NULL as z. I did this in dbt using a jinja macro since I couldn't figure out a way to do it in straight SQL. That restricts its applicability, but it may be an option in some use cases.
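For illustration, a minimal sketch of the existence check itself (my_dataset, my_table and column z are placeholders; the dbt/jinja branching around it is omitted):
SELECT COUNT(*) > 0 AS has_z
FROM my_dataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'my_table'
  AND column_name = 'z';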
This can be done by using a script (if a column with that name exists it takes precedence over the variable; otherwise the SELECT resolves to the variable's value):
DECLARE my_field STRING;
SET my_field = "default";
-- my_field falls back to "default" if there is no such column in my_table
SELECT my_field FROM my_table;
I ran into this issue recently. Apparently BigQuery has exception handling, so you could do:
BEGIN
SELECT a, b FROM your_table;
EXCEPTION WHEN ERROR THEN
SELECT a, NULL AS b FROM your_table;
END
assuming column a is guaranteed to exist, but b might not.

Does PostgreSQL have a mechanism to update the same row multiple times in a single query?

Consider the following:
create table tmp.x (i integer, t text);
create table tmp.y (i integer, t text);
delete from tmp.x;
delete from tmp.y;
insert into tmp.x values (1, 'hi');
insert into tmp.y values(1, 'there');
insert into tmp.y values(1, 'wow');
In the above, there is one row in table x, which I want to update. In table y, there are two rows, both of which I want to "feed data into" the update.
Below is my attempt:
update tmp.x
set t = x.t || y.t
from ( select * from tmp.y order by t desc ) y
where y.i = x.i;
select * from tmp.x;
I want the value of x.t to be 'hiwowthere' but the value ends up being 'hiwow'. I believe the cause of this is that the subquery in the update statement returns two rows (the y.t value of 'wow' being returned first), and the where clause y.i = x.i only matches the first row.
Can I achieve the desired outcome using a single update statement, and if so, how?
UPDATE: The use of the text type above was for illustration purposes only. I do not actually want to modify textual content, but rather JSON content using the json_set function that I posted here (How do I modify fields inside the new PostgreSQL JSON datatype?), although I'm hoping the principle could be applied to any function, such as the fictional concat_string(column_name, 'string-to-append').
UPDATE 2: Rather than waste time on this issue, I actually wrote a small function to accomplish it. However, it would still be nice to know if this is possible, and if so, how.
What you can do is to build up a concatenated string using string_agg, grouped by the integer i, which you can then join onto during the update:
update tmp.x
set t = x.t || y.txt
from (
    select z.i, string_agg(z.t, '') as txt
    from (
        select tmp.y.i, tmp.y.t
        from tmp.y
        order by t desc
    ) z
    group by z.i
) y
where y.i = x.i;
In order to preserve the order, you may need an additional wrapping derived table.
Use string_agg, as follows:
update tmp.x x
set t = x.t || (
select string_agg(t,'' order by t desc)
from tmp.y where i = x.i
group by i
)
with cte as (
    select y.i, string_agg(y.t, '' order by y.t desc) as txt
    from tmp.y y
    group by y.i
)
update tmp.x x
set t = x.t || cte.txt
from cte
where cte.i = x.i;
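After any of these updates, a quick check against the sample data from the question:
select * from tmp.x;
-- expected: i = 1, t = 'hiwowthere'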