Unnest string array and transpose in BigQuery - sql

I'm using BigQuery. I have a table A with a string array column, and I need to cast its values to INT64/STRING (if possible) so that I can join with table B, whose key is INT64/STRING.
The main ask here is:
Table A has a string array mapped to a Ref ID, as below:
I'm trying to unnest it, and my desired output should be as below.
I tried the script below:
SELECT a0_String_array,
ref_id
FROM TableA AS t,
t.String_array AS a0_String_array
The challenge with the script above is that I have close to 1000 Ref IDs, but the output contains only 100.
If I try the query below, I get all 1000 rows.
SELECT string_array,
ref_id
FROM TableA
The end goal is to unnest the array and cast the values to INT64/STRING. The script above does not do what I need. Can someone help with this?

You can use CROSS JOIN + UNNEST() in order to get the values from the array attributed to each ref_id:
select
ref_id,
unnested_numbers
from tablea
cross join unnest(string_array) as unnested_numbers
order by 2, 1
This should give you the desired output that you specified.
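To make the row-level behaviour concrete, here is a minimal Python sketch of what CROSS JOIN UNNEST does. The sample rows and the helper name cross_join_unnest are hypothetical and not taken from the question. It also shows the cast to an integer (in BigQuery that would be something like CAST(unnested_numbers AS INT64)). Note that a row whose array is empty or NULL produces no output rows under a cross join, which may be one reason fewer Ref IDs appear than expected; in BigQuery, LEFT JOIN UNNEST(...) keeps such rows.

```python
# Hypothetical sample rows standing in for TableA: (ref_id, string_array).
table_a = [
    (1, ["10", "20"]),
    (2, ["30"]),
    (3, []),  # empty array: contributes no rows after the cross join
]


def cross_join_unnest(rows):
    """Emit one (ref_id, value) pair per array element, casting each value to int."""
    return [(ref_id, int(value)) for ref_id, values in rows for value in values]


unnested = cross_join_unnest(table_a)
print(unnested)  # [(1, 10), (1, 20), (2, 30)]
```

ref_id 3 is absent from the result because its array has no elements to pair with.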

Related

Select rows according to another table with a comma-separated list of items

I have a table test.
select b from test
b is a text column and contains Apartment,Residential
The other table is a parcel table with a classification column. I'd like to use test.b to select the right classifications in the parcels table.
select * from classi where classification in(select b from test)
this returns no rows
select * from classi where classification =any(select '{'||b||'}' from test)
same story with this one
I may make a function to loop through the b column but I'm trying to find an easier solution
Test case:
create table classi as
select 'Residential'::text as classification
union
select 'Apartment'::text as classification
union
select 'Commercial'::text as classification;
create table test as
select 'Apartment,Residential'::text as b;
You don't actually need to unnest the array:
SELECT c.*
FROM classi c
JOIN test t ON c.classification = ANY (string_to_array(t.b, ','));
The problem is that = ANY takes a set or an array, and IN takes a set or a list, and your ambiguous attempts resulted in Postgres picking the wrong variant. My formulation makes Postgres expect an array as it should.
For a detailed explanation see:
How to match elements in an array of composite type?
IN vs ANY operator in PostgreSQL
Note that my query also works for multiple rows in table test. Your demo only shows a single row, which is a corner case for a table ...
But also note that multiple rows in test may produce (additional) duplicates. You'd have to fold duplicates or switch to a different query style to de-duplicate. Like:
SELECT c.*
FROM classi c
WHERE EXISTS (
SELECT FROM test t
WHERE c.classification = ANY (string_to_array(t.b, ','))
);
This prevents duplication from elements within a single test.b, as well as from across multiple test.b. EXISTS returns a single row from classi per definition.
The most efficient query style depends on the complete picture.
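The difference between the JOIN form and the EXISTS form can be sketched in plain Python. This is only an illustration of the semantics, not of how Postgres executes the queries; the second row in test and both function names are hypothetical additions.

```python
classi = ["Residential", "Apartment", "Commercial"]
# The first row is from the test case; the second is a hypothetical extra row.
test = ["Apartment,Residential", "Apartment"]


def join_any(classifications, lists):
    # JOIN ... ON classification = ANY (string_to_array(b, ',')):
    # one output row per (classification, matching test row) pair.
    return [c for c in classifications for b in lists if c in b.split(",")]


def where_exists(classifications, lists):
    # WHERE EXISTS (...): each classification appears at most once.
    return [c for c in classifications if any(c in b.split(",") for b in lists)]


joined = join_any(classi, test)
filtered = where_exists(classi, test)
print(joined)    # ['Residential', 'Apartment', 'Apartment']
print(filtered)  # ['Residential', 'Apartment']
```

With two rows in test that both contain Apartment, the join returns it twice while the EXISTS variant returns it once.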
You need to first split b into an array and then get the rows. A couple of alternatives:
select * from nj.parcels p where classification = any(select unnest(string_to_array(b, ',')) from test)
select p.* from nj.parcels p
INNER JOIN (select unnest(string_to_array(b, ',')) from test) t(classification) ON t.classification = p.classification;
Essential to both is the unnest surrounding string_to_array.

How to use a multi-element string for a IN sql query?

Is it possible to use the value of one database field as input for another query in combination with the IN operator? The point is that the string I use for IN contains several comma-separated values:
SELECT id, name
FROM refPlant
WHERE id IN (SELECT cover
FROM meta_characteristic
WHERE id = 2);
The string returned by the subquery is: 1735,1736,1737,1738,1739,1740,1741,1742,1743,1744
The query above gives me only the first element of the string. But when I put the string directly in the query, I get all ten elements:
SELECT id, name
FROM refPlant
WHERE id IN (1735,1736,1737,1738,1739,1740,1741,1742,1743,1744);
Is it possible to get all ten elements, and not only one, with a query like the first one?
My sql version is 10.1.16-MariaDB
You can use FIND_IN_SET in the join condition.
SELECT r.id, r.name
FROM refPlant r
JOIN (SELECT * FROM meta_characteristic m WHERE id=2) m
ON FIND_IN_SET(r.id,m.cover) > 0
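If you use a sub-query as in the first code snippet, IN compares id against each row the sub-query returns. It does not work when the sub-query returns a single string field: the whole string is compared as one value, and MariaDB converts it to the number 1735 for the comparison, so only the first id matches.
FIND_IN_SET can also be used directly in the WHERE clause: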
SELECT id, name
FROM refPlant
WHERE FIND_IN_SET(id, (SELECT cover
FROM meta_characteristic
WHERE id = 2));
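As a rough illustration of what FIND_IN_SET does, here is a small Python stand-in. The function find_in_set and the sample ids are hypothetical, and the sketch ignores NULL handling: it returns the 1-based position of the value in the comma-separated string, or 0 when it is absent.

```python
def find_in_set(needle, csv):
    """Return the 1-based position of needle in a comma-separated string, or 0."""
    items = csv.split(",")
    target = str(needle)
    return items.index(target) + 1 if target in items else 0


cover = "1735,1736,1737"  # shortened version of the string in the question
plant_ids = [1734, 1735, 1736, 1737, 1738]  # hypothetical refPlant ids

# Equivalent in spirit to: WHERE FIND_IN_SET(id, cover) > 0
matched = [pid for pid in plant_ids if find_in_set(pid, cover) > 0]
print(matched)  # [1735, 1736, 1737]
```

Every id listed in the string matches, not just the first one.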

Array field in postgres, need to do self-join with results

I have a table that looks like this:
stuff
id integer
content text
score double
children[] (an array of id's from this same table)
I'd like to run a query that selects all the children for a given id, and then right away gets the full row for all these children, sorted by score.
Any suggestions on the best way to do this? I've looked into WITH RECURSIVE but I'm not sure that's workable. Tried posting at postgresql SE with no luck.
The following query will find all rows corresponding to the children of the object with id 14:
SELECT *
FROM unnest((SELECT children FROM stuff WHERE id=14)) t(id)
JOIN stuff USING (id)
ORDER BY score;
This works by finding the children of 14 as array first, then we convert it into a table using the unnest function, and then we join with stuff to find all rows with the given ids.
The ANY construct in the join condition would be simplest:
SELECT c.*
FROM stuff p
JOIN stuff c ON c.id = ANY (p.children)
WHERE p.id = 14
ORDER BY c.score;
It doesn't matter for the query whether the array of children IDs is in the same table or a different one. You just need table aliases here to be unambiguous.
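The logic of the self-join can be sketched in Python with a dictionary standing in for the stuff table. The rows, scores, and the function name children_by_score are hypothetical; the sketch only mirrors what the query computes.

```python
# Hypothetical rows of the stuff table, keyed by id.
stuff = {
    14: {"content": "parent", "score": 0.5, "children": [15, 16]},
    15: {"content": "first child", "score": 0.9, "children": []},
    16: {"content": "second child", "score": 0.2, "children": []},
    17: {"content": "unrelated", "score": 0.7, "children": []},
}


def children_by_score(table, parent_id):
    """Mirror of: JOIN stuff c ON c.id = ANY (p.children) ... ORDER BY c.score."""
    child_ids = set(table[parent_id]["children"])
    rows = [(cid, table[cid]["score"]) for cid in table if cid in child_ids]
    return sorted(rows, key=lambda row: row[1])


result = children_by_score(stuff, 14)
print(result)  # [(16, 0.2), (15, 0.9)]
```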
Related:
Check if value exists in Postgres array
Similar solution:
With Postgres you can use a recursive common table expression:
with recursive rel_tree as (
select rel_id, rel_name, rel_parent, 1 as level, array[rel_id] as path_info
from relations
where rel_parent is null
union all
select c.rel_id, rpad(' ', p.level * 2) || c.rel_name, c.rel_parent, p.level + 1, p.path_info||c.rel_id
from relations c
join rel_tree p on c.rel_parent = p.rel_id
)
select rel_id, rel_name
from rel_tree
order by path_info;
Ref: Postgresql query for getting n-level parent-child relation stored in a single table

Compare Items in the "IN" Clause and the resultset

I'd like to achieve the following. I have this query (as simple as this):
SELECT ENT_ID,TP_ID FROM TC_LOGS WHERE ENT_ID IN (1,2,3,4,5)
Now the table TC_LOGS may not have all the items in the IN clause. So assuming that the table TC_LOGS has only 1,2, I'd like to compare the items in the IN clause, i.e. 1,2,3,4,5, with 1,2 (the resultset) and get a result such as FOUND - 1,2 NOT FOUND - 3,4,5. I have implemented this by applying an XSL transformation on the resultset in the application code, but I'd like to achieve this in a query, which I feel is a more elegant solution to this problem. I also tried the following query with NVL, just to separate out the FOUND and NOT FOUND items:
SELECT NVL(ENT_ID,"NOT FOUND") FROM TC_LOGS WHERE ENT_ID IN(1,2,3,4,5)
I was expecting a result such as 1,2,NOT FOUND,NOT FOUND,NOT FOUND
But the above query doesn't return any result. I'd appreciate it if someone could point me in the right direction. Thanks in advance.
Assuming that the items in your IN list come (or can come) from another query, you can do something like
WITH src AS (
SELECT level id
FROM dual
CONNECT BY level <= 5)
SELECT nvl(to_char(ent_id), 'Not Found' )
FROM src
LEFT OUTER JOIN tc_logs ON (src.id = tc_logs.ent_id)
In my case, the src query is just generating the numbers 1 through 5. You could just as easily fetch that data from a different table, load the numbers into a collection that you query using the TABLE operator, load the numbers into a temporary table that you query, etc. depending on how the IN list data is determined.
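The same outer-join idea can be tried end to end with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for Oracle, a recursive CTE replaces CONNECT BY, the tp_id values are made up, and a CASE expression labels each id instead of NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tc_logs (ent_id INTEGER, tp_id INTEGER)")
# Hypothetical data: only ids 1 and 2 exist in the log table.
conn.executemany("INSERT INTO tc_logs VALUES (?, ?)", [(1, 100), (2, 200)])

rows = conn.execute(
    """
    WITH RECURSIVE src(id) AS (      -- generates 1 through 5
        SELECT 1
        UNION ALL
        SELECT id + 1 FROM src WHERE id < 5
    )
    SELECT src.id,
           CASE WHEN tc_logs.ent_id IS NULL THEN 'NOT FOUND' ELSE 'FOUND' END
    FROM src
    LEFT OUTER JOIN tc_logs ON src.id = tc_logs.ent_id
    ORDER BY src.id
    """
).fetchall()

for row in rows:
    print(row)
```

Driving the query from the list of wanted ids, rather than from tc_logs, is what lets the missing ids show up at all.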
NVL isn't going to work because no rows (not even NULLs) are returned for values in the IN list that have no match in the table.
What you can do is something like this:
SELECT NVL(TO_CHAR(ENT_ID), 'NOT FOUND')
FROM TC_LOGS
RIGHT OUTER JOIN (
SELECT 1 AS TempID FROM dual UNION
SELECT 2 FROM dual UNION
SELECT 3 FROM dual UNION
SELECT 4 FROM dual UNION
SELECT 5 FROM dual) Sub ON ENT_ID = TempID
The outer join will return NULLs for ENT_ID where there are no matches. Note, I'm not an Oracle person so I can't guarantee that this syntax is perfect.
If you have a table (let's call it src) that contains all the values (1,2,3,4,5), you can use a full join.
(You can also use WITH src AS ( SELECT level id FROM dual CONNECT BY level <= 5) as the src table.)
SELECT
ent_id,tl.tp_id,src.tp_id
FROM
src
FULL JOIN
tc_logs tl
USING (ent_id)
ORDER BY
ent_id
Here is a web page on the Oracle full join: http://psoug.org/snippet/Oracle-PL-SQL-ANSI-Joins-FULL-JOIN_738.htm

What is the most efficient way to write this SQL query?

I have two lists of ids. List A and List B. Both of these lists are actually the results of SQL queries (QUERY A and QUERY B respectively).
I want to 'filter' List A, by removing the ids in List A if they appear in list B.
So for example if list A looks like this:
1, 2, 3, 4, 7
and List B looks like this:
2,7
then the 'filtered' List A should have ids 2 and 7 removed, and so should look like this:
1, 3, 4
I want to write an SQL query like this (pseudo code of course):
SELECT id FROM (QUERYA) as temp_table where id not in (QUERYB)
Using classic SQL:
select [distinct] number
from list_a
where number not in (
select distinct number from list_b
);
I've put the first "distinct" in square brackets since I'm unsure as to whether you wanted duplicates removed (remove either the brackets or the entire word). The second "distinct" should be left in just in case your DBMS doesn't optimize IN clauses.
It may be faster (measure, don't guess) with a left join along the lines of:
select [distinct] list_a.number from list_a
left join list_b on list_a.number = list_b.number
where list_b.number is null;
Same deal with the "[distinct]".
see Doing INTERSECT and MINUS in MySQL
The query:
select id
from ListA
where id not in (
select id
from ListB)
will give you the desired result.
I am not sure which way is best. In my experience, the performance can differ a lot depending on the situation and the size of the tables.
1.
select id
from ListA
where id not in (
select id
from ListB)
2.
select ListA.id
from ListA
left join ListB on ListA.id=ListB.id
where ListB.id is null
3.
select id
from ListA
where not exists (
select *
from ListB where ListB.id=ListA.id)
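Option 2 is often the fastest, since it uses a join (a left join filtered on NULL) rather than a sub-query, but this depends on the optimizer, so measure it.
Some people suggest 3 rather than 1 because EXISTS only checks whether a matching row exists and does not need to return data from the table.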
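The three variants can be compared side by side with Python's built-in sqlite3 module. This is a sketch using SQLite as a stand-in for whatever DBMS is in use, with the sample ids from the question; it says nothing about relative speed. It does show one behavioural difference worth knowing: if ListB contains a NULL, NOT IN returns no rows at all, while the left join and NOT EXISTS forms are unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ListA (id INTEGER);
    CREATE TABLE ListB (id INTEGER);
    INSERT INTO ListA VALUES (1), (2), (3), (4), (7);
    INSERT INTO ListB VALUES (2), (7);
    """
)

queries = {
    "not_in": "SELECT id FROM ListA WHERE id NOT IN (SELECT id FROM ListB) ORDER BY id",
    "left_join": (
        "SELECT ListA.id FROM ListA LEFT JOIN ListB ON ListA.id = ListB.id "
        "WHERE ListB.id IS NULL ORDER BY ListA.id"
    ),
    "not_exists": (
        "SELECT id FROM ListA WHERE NOT EXISTS "
        "(SELECT 1 FROM ListB WHERE ListB.id = ListA.id) ORDER BY id"
    ),
}


def run_all():
    """Run every variant and collect the returned ids per variant name."""
    return {name: [r[0] for r in conn.execute(sql)] for name, sql in queries.items()}


before = run_all()
conn.execute("INSERT INTO ListB VALUES (NULL)")  # introduce a NULL into ListB
after = run_all()

print(before)
print(after)
```

If ids in List B can be NULL, either filter them out in the NOT IN sub-query or prefer one of the other two forms.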