BigQuery: Union two different tables which are based on federated Google Spreadsheet - sql

I have two different Google Spreadsheets:
One with 4 columns
+------+------+------+------+
| Col1 | Col2 | Col5 | Col6 |
+------+------+------+------+
| ID1  | A    | B    | C    |
| ID2  | D    | E    | F    |
+------+------+------+------+
One with the 4 columns of the previous file, and 2 more columns
+------+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
+------+------+------+------+------+------+
| ID3  | G    | H    | J    | K    | L    |
| ID4  | M    | N    | O    | P    | Q    |
+------+------+------+------+------+------+
I configured them as federated sources in Google BigQuery; now I need to create a view that combines the data of both tables.
Both tables have a Col1 column containing an ID. This ID is unique across all the tables; there is no replicated data.
The resulting table I'm looking for is the following one:
+------+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
+------+------+------+------+------+------+
| ID1  | A    | NULL | NULL | B    | C    |
| ID2  | D    | NULL | NULL | E    | F    |
| ID3  | G    | H    | J    | K    | L    |
| ID4  | M    | N    | O    | P    | Q    |
+------+------+------+------+------+------+
For the columns that the first file does not have, I'm expecting a NULL value.
I'm using standard SQL; here is a statement you can use to generate sample data:
#standardSQL
WITH table1 AS (
SELECT "A" as Col1, "B" as Col2, "C" AS Col3
UNION ALL
SELECT "D" as Col1, "E" as Col2, "F" AS Col3
),
table2 AS (
SELECT "G" as Col1, "H" as Col2, "J" AS Col3, "K" AS Col4, "L" AS Col5
UNION ALL
SELECT "M" as Col1, "N" as Col2, "O" AS Col3, "P" AS Col4, "Q" AS Col5
)
A simple UNION ALL does not work because the tables have different column counts:
SELECT * FROM table1
UNION ALL
SELECT * FROM table2
Error: Queries in UNION ALL have mismatched column count; query 1 has 3 columns, query 2 has 5 columns at [17:1]
And the wildcard operator is not an option, because federated sources do not support it:
SELECT * FROM `table*`
Error: External tables cannot be queried through prefix
Of course this is sample data with only 3-5 columns; the real tables have 20-40 columns, so an approach where I have to explicitly SELECT field by field is not workable.
Is there a working way to combine these two tables?

You can pass the rows through a UDF to handle the case where column names aren't aligned by position or there are different numbers of them between tables. Here is an example:
CREATE TEMP FUNCTION CoerceRow(json_row STRING)
RETURNS STRUCT<Col1 STRING, Col2 STRING, Col3 STRING, Col4 STRING, Col5 STRING>
LANGUAGE js AS """
return JSON.parse(json_row);
""";
WITH table1 AS (
SELECT "A" as Col5, "B" as Col3, "C" AS Col2
UNION ALL
SELECT "D" as Col5, "E" as Col3, "F" AS Col2
),
table2 AS (
SELECT "G" as Col1, "H" as Col2, "J" AS Col3, "K" AS Col4, "L" AS Col5
UNION ALL
SELECT "M" as Col1, "N" as Col2, "O" AS Col3, "P" AS Col4, "Q" AS Col5
)
SELECT CoerceRow(json_row).*
FROM (
SELECT TO_JSON_STRING(t1) AS json_row
FROM table1 AS t1
UNION ALL
SELECT TO_JSON_STRING(t2) AS json_row
FROM table2 AS t2
);
+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 |
+------+------+------+------+------+
| NULL | C    | B    | NULL | A    |
| NULL | F    | E    | NULL | D    |
| G    | H    | J    | K    | L    |
| M    | N    | O    | P    | Q    |
+------+------+------+------+------+
Note that the CoerceRow function needs to declare the explicit row type that you want in the output. Outside of that, the columns in the tables being unioned are just matched by name.
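Outside BigQuery, the by-name coercion the UDF performs can be sketched in plain Python. This is a hypothetical illustration: `TARGET_COLS` and `coerce_row` are made-up names, and `json.dumps` stands in for BigQuery's `TO_JSON_STRING`.

```python
# Hypothetical sketch of what CoerceRow does: each row is serialized to JSON
# (BigQuery's TO_JSON_STRING), then coerced to a fixed column list; columns
# absent from a row become None, the stand-in for NULL.
import json

TARGET_COLS = ["Col1", "Col2", "Col3", "Col4", "Col5"]

def coerce_row(json_row):
    parsed = json.loads(json_row)
    # Match columns by name, not by position.
    return {col: parsed.get(col) for col in TARGET_COLS}

rows = [
    # table1's shape: only Col2, Col3, Col5, in a different order
    json.dumps({"Col5": "A", "Col3": "B", "Col2": "C"}),
    # table2's shape: all five columns
    json.dumps({"Col1": "G", "Col2": "H", "Col3": "J", "Col4": "K", "Col5": "L"}),
]
coerced = [coerce_row(r) for r in rows]
```

As in the SQL version, the row from the narrower table comes out with `None` in the columns it never had.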

Is there a working way to combine these two tables?
#standardSQL
SELECT *, NULL AS Col5, NULL AS Col6 FROM table1
UNION ALL
SELECT * FROM table2
You can check this using your sample data:
#standardSQL
WITH table1 AS (
SELECT "ID1" AS Col1, "A" AS Col2, "B" AS Col3, "C" AS Col4
UNION ALL
SELECT "ID2", "D", "E", "F"
),
table2 AS (
SELECT "ID3" Col1, "G" AS Col2, "H" AS Col3, "J" AS Col4, "K" AS Col5, "L" AS Col6
UNION ALL
SELECT "ID4", "M", "N", "O", "P", "Q"
)
SELECT *, NULL AS Col5, NULL AS Col6 FROM table1
UNION ALL
SELECT * FROM table2
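As a sanity check outside BigQuery, the same NULL-padding idea can be run against an in-memory SQLite database (SQLite merely stands in for BigQuery here; the table and column names come from the question):

```python
# Pad the narrower table with NULL columns so both sides of the
# UNION ALL have the same column count.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Col1 TEXT, Col2 TEXT, Col3 TEXT, Col4 TEXT);
CREATE TABLE table2 (Col1 TEXT, Col2 TEXT, Col3 TEXT, Col4 TEXT, Col5 TEXT, Col6 TEXT);
INSERT INTO table1 VALUES ('ID1','A','B','C'), ('ID2','D','E','F');
INSERT INTO table2 VALUES ('ID3','G','H','J','K','L'), ('ID4','M','N','O','P','Q');
""")
rows = conn.execute("""
    SELECT *, NULL AS Col5, NULL AS Col6 FROM table1
    UNION ALL
    SELECT * FROM table2
    ORDER BY Col1
""").fetchall()
```

Rows from table1 come back with `None` (NULL) in the two padded columns.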

Related

Create View with reverse Row values SQL

I have data like
name | col1 | col2 | col3 | col4 | col5 | col6 |
rv   | rv1  | rv2  | rv3  | rv4  |      |      |
sgh  | sgh1 | sgh2 |      |      |      |      |
vik  | vik1 | vik2 | vik3 | vik4 | vik5 | vik6 |
shv  | shv1 | shv2 | shv3 | shv4 | shv5 |      |
Table Name: emp_data
to create View to get DATA like
name | col1 | col2 | col3 | col4 | col5 | col6 |
rv   | rv4  | rv3  | rv2  | rv1  |      |      |
sgh  | sgh2 | sgh1 |      |      |      |      |
vik  | vik6 | vik5 | vik4 | vik3 | vik2 | vik1 |
shv  | shv5 | shv4 | shv3 | shv2 | shv1 |      |
MySQL 8 supports LATERAL; this way you can sort the values by position and conditionally aggregate them back:
with tbl(name, col1, col2, col3 ,col4 ,col5 , col6) as
(
select 'rv ','rv1 ','rv2 ','rv3 ','rv4 ',null,null union all
select 'sgh','sgh1','sgh2', null,null,null,null union all
select 'vik','vik1','vik2','vik3','vik4','vik5','vik6' union all
select 'shv','shv1','shv2','shv3','shv4','shv5', null
)
select tbl.name, t.*
from tbl
, lateral (
select
max(case n when 1 then val end) col1,
max(case n when 2 then val end) col2,
max(case n when 3 then val end) col3,
max(case n when 4 then val end) col4,
max(case n when 5 then val end) col5,
max(case n when 6 then val end) col6
from (
select row_number() over( order by n) n, val
from (
select case when col1 is null then 99 else 6 end n, col1 val union all
select case when col2 is null then 99 else 5 end n, col2 val union all
select case when col3 is null then 99 else 4 end n, col3 val union all
select case when col4 is null then 99 else 3 end n, col4 val union all
select case when col5 is null then 99 else 2 end n, col5 val union all
select case when col6 is null then 99 else 1 end n, col6 val
) t
) t
) t
db<>fiddle

Insert into table, in 2 phases

I have two tables: table1 and table2. (The tables are almost identical: 30 columns in table1 and 31 in table2; the extra column is a key.)
I also have a procedure which receives a number as a parameter. If the number is above 10, I want to insert the row with all 30 columns from table1 into table2. Otherwise, I want to insert columns 1-20 (from table1) as-is, and columns 21-30 multiplied by 30. I have created two different INSERT INTO statements for the two situations (above/under 10), but I believe there is a more efficient way, since the first 20 columns are the same in either case. I thought of first inserting the first 20 columns, then using an IF statement to insert the remaining columns according to the given parameter, but of course that gives me two rows instead of one.
What is the solution, so that all the data ends up in one row?
Here is an example with 10 columns (instead of 30). In this example, if the parameter is above 10, we insert the row as-is into table2; otherwise, we insert Col1-Col7 unchanged and multiply Col8-Col10 by 30.
parameter = 15
table1
 Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | Col7 | Col8 | Col9 | Col10 |
======+======+======+======+======+======+======+======+======+=======+
    1 |    1 |    1 |    2 |    2 |    2 |    2 |    5 |    5 |     5 |
table2 (identical to table1, because the parameter > 10)
 Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | Col7 | Col8 | Col9 | Col10 |
======+======+======+======+======+======+======+======+======+=======+
    1 |    1 |    1 |    2 |    2 |    2 |    2 |    5 |    5 |     5 |
If the parameter were 3, then table2 would be:
table2 (columns 8-10 multiplied)
 Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | Col7 | Col8 | Col9 | Col10 |
======+======+======+======+======+======+======+======+======+=======+
    1 |    1 |    1 |    2 |    2 |    2 |    2 |  150 |  150 |   150 |
A template of my code:
if @Parameter > 10
begin
INSERT INTO Table2
(Col1
,Col2
,Col3
...
,Col29
,Col30)
SELECT
Col1
,Col2
,Col3
...
,Col29
,Col30
FROM ...
WHERE ...
end
else
begin
INSERT INTO Table2
(Col1
,Col2
,Col3
...
,Col29
,Col30)
SELECT
Col1
,Col2
,Col3
...
,Col29
,Col30
FROM ...
WHERE ...
end
Right now I have more than 120 lines, and two thirds of them are duplicated.
How can I make this more efficient?
As far as I understand your problem, you could use the INSERT ALL command (Oracle). For example:
INSERT ALL
WHEN number > 10 THEN
INTO table2 VALUES (col1, col2, col3)
WHEN number <= 10 THEN
INTO table2 VALUES (col1, col2, col3 * 30)
SELECT * FROM table1
Adjust the conditions and column lists to your requirement.
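An alternative that avoids the duplicated branches altogether is a single INSERT whose affected columns are wrapped in CASE expressions on the parameter. A minimal sketch with SQLite and 4 columns instead of 30 (the table and column names are illustrative only; the same pattern works in T-SQL with `@Parameter`):

```python
# One INSERT for both cases: columns that differ between the branches are
# wrapped in CASE WHEN parameter > 10 THEN ... ELSE ... * 30 END.
import sqlite3

def copy_rows(conn, parameter):
    conn.execute("""
        INSERT INTO table2 (Col1, Col2, Col3, Col4)
        SELECT Col1, Col2,
               CASE WHEN ? > 10 THEN Col3 ELSE Col3 * 30 END,
               CASE WHEN ? > 10 THEN Col4 ELSE Col4 * 30 END
        FROM table1
    """, (parameter, parameter))

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Col1 INT, Col2 INT, Col3 INT, Col4 INT);
CREATE TABLE table2 (Col1 INT, Col2 INT, Col3 INT, Col4 INT);
INSERT INTO table1 VALUES (1, 2, 5, 5);
""")
copy_rows(conn, 15)  # above 10: row copied as-is
copy_rows(conn, 3)   # at or below 10: last two columns multiplied by 30
rows = conn.execute("SELECT * FROM table2 ORDER BY Col3").fetchall()
```

The shared first columns are written only once in the statement, so the 120-line duplication collapses to a single INSERT.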

sql table comparisons - postgres

I have two tables with some columns being the same:
TABLE A
| Col1 | Col2 | Col3 |
+------+------+------+
| 1    | aa   | ccc  |
| 2    | null | ccc  |
| null | bb   | null |
TABLE B
| Col1 | Col2 | Col3 | Col4 |
+------+------+------+------+
| 1    | aa   | ccc  | aaaa |
| 2    | null | ccc  | cccc |
| null | bb   | null | sss  |
| 4    | bb   | null | ddd  |
I'd like to return the following:
| Col1 | Col2 | Col3 | Col4 |
+------+------+------+------+
| 4    | bb   | null | ddd  |
How do I check which rows from table B are in table A, and also return Col4 (from table B) in the same query?
I was using EXCEPT, which worked great, but now I need Col4 in the returned query results.
Thanks.
Something like this?
SELECT Col1, Col2, Col3, Col4
FROM TableB
WHERE NOT EXISTS (
SELECT 1
FROM TableA
WHERE TableA.Col1 IS NOT DISTINCT FROM TableB.Col1
AND TableA.Col2 IS NOT DISTINCT FROM TableB.Col2
AND TableA.Col3 IS NOT DISTINCT FROM TableB.Col3
)
(Using IS NOT DISTINCT FROM to say that columns with null are equal to each other.)
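The same anti-join can be checked locally with SQLite, whose `IS` operator plays the role of Postgres's `IS NOT DISTINCT FROM` (null-safe equality); the data is taken from the question:

```python
# NOT EXISTS anti-join with null-safe comparisons: SQLite's IS treats
# NULL IS NULL as true, like Postgres's IS NOT DISTINCT FROM.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (Col1 INT, Col2 TEXT, Col3 TEXT);
CREATE TABLE TableB (Col1 INT, Col2 TEXT, Col3 TEXT, Col4 TEXT);
INSERT INTO TableA VALUES (1,'aa','ccc'), (2,NULL,'ccc'), (NULL,'bb',NULL);
INSERT INTO TableB VALUES (1,'aa','ccc','aaaa'), (2,NULL,'ccc','cccc'),
                          (NULL,'bb',NULL,'sss'), (4,'bb',NULL,'ddd');
""")
rows = conn.execute("""
    SELECT Col1, Col2, Col3, Col4
    FROM TableB
    WHERE NOT EXISTS (
        SELECT 1
        FROM TableA
        WHERE TableA.Col1 IS TableB.Col1
          AND TableA.Col2 IS TableB.Col2
          AND TableA.Col3 IS TableB.Col3
    )
""").fetchall()
```

Only the row that has no null-safe match in TableA survives, Col4 included.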
The answer to your question is very easy: none of the rows from table 'A' are in table 'B'.
Those are separate tables and they have separate rows.
Now, if you want to find 'a row in table B that has similar values in its columns as a specific row in table A', you may do the following:
select a.*,
case when b.ctid is null then 'I AM A VERY SAD PENGUIN AND ROW IN B WAS NOT FOUND :('
else b.col4 end as col4
from table_a a
left join table_b b on
((a.col1 = b.col1 or (a.col1 is null and b.col1 is null)) and
(a.col2 = b.col2 or (a.col2 is null and b.col2 is null)) and
(a.col3 = b.col3 or (a.col3 is null and b.col3 is null))
)
I assume that by 'similar' you mean "if null appears in the same column in both tables, the rows are still similar".
Notice the last insert into table_b I added - you are not guaranteed to find unique values of col4!
dbfiddle, modified version of VBoka's answer
If you have no duplicates within the b table, then the following handles NULL values rather elegantly:
select col1, col2, col3, max(col4)
from ((select col1, col2, col3, null as col4, 'a' as which
from a
) union all
(select col1, col2, col3, col4, 'b' as which
from b
)
) b
group by col1, col2, col3
having min(which) = 'b';
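A quick SQLite check of this trick (SQLite, like Postgres, treats NULLs as equal for GROUP BY, which is what makes the NULL handling elegant; data is from the question):

```python
# Tag each side with 'a' or 'b'; groups whose MIN(which) is 'b' exist
# only in table b. GROUP BY puts NULLs into the same group, so no
# null-safe comparison is needed.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (col1 INT, col2 TEXT, col3 TEXT);
CREATE TABLE b (col1 INT, col2 TEXT, col3 TEXT, col4 TEXT);
INSERT INTO a VALUES (1,'aa','ccc'), (2,NULL,'ccc'), (NULL,'bb',NULL);
INSERT INTO b VALUES (1,'aa','ccc','aaaa'), (2,NULL,'ccc','cccc'),
                     (NULL,'bb',NULL,'sss'), (4,'bb',NULL,'ddd');
""")
rows = conn.execute("""
    SELECT col1, col2, col3, MAX(col4)
    FROM (SELECT col1, col2, col3, NULL AS col4, 'a' AS which FROM a
          UNION ALL
          SELECT col1, col2, col3, col4, 'b' AS which FROM b)
    GROUP BY col1, col2, col3
    HAVING MIN(which) = 'b'
""").fetchall()
```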

Join two table and shows with null values

I have two tables:
Table 1
Col     | Col2
--------+---------
AA      | CC
Table 2
Col     | Col2
--------+---------
BB      | CC
Result I need
Col1    | Col2     | Col4
--------+----------+---------
AA      | CC       | null
null    | CC       | BB
I can't find any relation between the two tables, so I would do:
select col as col1, col2, null as col4
from table1 t1
union all
select null, col2, col
from table2 t2;
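Run against SQLite as a stand-in, this produces exactly the requested result (table and column names are from the question):

```python
# Each side contributes NULL for the column the other side owns, and
# UNION ALL stacks the two result shapes on top of each other.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Col TEXT, Col2 TEXT);
CREATE TABLE table2 (Col TEXT, Col2 TEXT);
INSERT INTO table1 VALUES ('AA','CC');
INSERT INTO table2 VALUES ('BB','CC');
""")
rows = conn.execute("""
    SELECT Col AS Col1, Col2, NULL AS Col4 FROM table1
    UNION ALL
    SELECT NULL, Col2, Col FROM table2
""").fetchall()
```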

SQL query to group a column and ignore null values

I have a table like:
Col1 Col2 Col3 Col4
1 a
1 b
1 c
2 e
2 f
2 g
I need to write a query which will have the output like this
Col1 Col2 Col3 Col4
1 a b c
2 e f g
I am using Oracle 10g.
If you only have one non-null value per column within each group, then you can use an aggregate function:
select
col1,
max(col2) col2,
max(col3) col3,
max(col4) col4
from yourtable
group by col1
See SQL Fiddle with Demo
The result is:
| COL1 | COL2 | COL3 | COL4 |
-----------------------------
|    1 |    b |    a |    c |
|    2 |    e |    f |    g |
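The question's input table lost its column alignment, so the exact layout is an assumption here: taking each row to carry a single value in one of Col2-Col4 (an arrangement consistent with the answer's result), the MAX-per-group trick checks out in SQLite:

```python
# MAX ignores NULLs, so grouping by Col1 collapses the sparse rows
# into one row per group with the single non-null value per column.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourtable (Col1 INT, Col2 TEXT, Col3 TEXT, Col4 TEXT);
-- assumed layout: one value per row, in one column
INSERT INTO yourtable VALUES
  (1, NULL, 'a', NULL), (1, 'b', NULL, NULL), (1, NULL, NULL, 'c'),
  (2, 'e', NULL, NULL), (2, NULL, 'f', NULL), (2, NULL, NULL, 'g');
""")
rows = conn.execute("""
    SELECT Col1, MAX(Col2), MAX(Col3), MAX(Col4)
    FROM yourtable
    GROUP BY Col1
    ORDER BY Col1
""").fetchall()
```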