BigQuery: SQL JOIN with column prefix - google-bigquery

I have two tables in Google BigQuery. One table is a kind of product catalogue (1:1), while the second one holds product-related information (1:n). For a query I'm joining both, but the join fails because the column pid and some others are present in both tables.
#standardSQL
SELECT tbl1.*, tbl2.* FROM (
SELECT * FROM `my_project.my_dataset.my_table_1`
) AS tbl1
LEFT JOIN ( SELECT * FROM `my_project.my_dataset.my_table_2`) AS tbl2
ON tbl1.pid = tbl2.pid
WHERE tbl1.category LIKE '111002%'
Idea 1: How to select * without the duplicated columns (that I can put in manually).
Idea 2: How to provide a left/right prefix for the columns in the join?
Any help is appreciated.

To avoid duplicating pid from both sides of the join, use a USING clause instead:
#standardSQL
SELECT * FROM (
SELECT * FROM `my_project.my_dataset.my_table_1`
) AS tbl1
LEFT JOIN ( SELECT * FROM `my_project.my_dataset.my_table_2`) AS tbl2
USING(pid)
WHERE tbl1.category LIKE '111002%'
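To see the effect of USING on the column list, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module rather than on BigQuery, and the tables catalogue and details are hypothetical stand-ins for my_table_1 and my_table_2. SQLite, like the answer describes for BigQuery, returns a USING column only once for SELECT *.

```python
import sqlite3

# Hypothetical stand-ins for my_table_1 / my_table_2 (SQLite, not BigQuery).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalogue (pid INTEGER, category TEXT);
    CREATE TABLE details   (pid INTEGER, info TEXT);
    INSERT INTO catalogue VALUES (1, '1110021'), (2, '2220001');
    INSERT INTO details   VALUES (1, 'red'), (1, 'blue');
""")


def column_names(sql):
    """Return the result column names of a query."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description]


# Join with ON: both pid columns survive in SELECT *.
on_cols = column_names(
    "SELECT * FROM catalogue AS tbl1 "
    "LEFT JOIN details AS tbl2 ON tbl1.pid = tbl2.pid"
)

# Join with USING: the shared pid column is emitted once.
using_cols = column_names(
    "SELECT * FROM catalogue AS tbl1 "
    "LEFT JOIN details AS tbl2 USING (pid)"
)

print(on_cols)
print(using_cols)
```

Note that USING only collapses the columns you list in it. Any other column name shared by both tables still appears twice.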
To prefix the column names from both sides of the join, use a reference to the tables in the select list instead of applying .* to them:
#standardSQL
SELECT tbl1, tbl2 FROM (
SELECT * FROM `my_project.my_dataset.my_table_1`
) AS tbl1
LEFT JOIN ( SELECT * FROM `my_project.my_dataset.my_table_2`) AS tbl2
USING(pid)
WHERE tbl1.category LIKE '111002%'
The columns resulting from the query will be tbl1 and tbl2, which are STRUCTs containing the columns from each of those tables as fields.
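If you want flat, prefixed column names instead of STRUCTs, one alternative is to generate explicit aliases for every column. This is a different technique from the STRUCT approach above, and the sketch below is only an illustration: it runs on SQLite via Python's sqlite3, the tables catalogue and details are hypothetical, and the helper prefixed_columns is made up for this example. On another engine you would read the column names from that engine's own catalog.

```python
import sqlite3

# Hypothetical tables that share every column name (SQLite, not BigQuery).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalogue (pid INTEGER, name TEXT);
    CREATE TABLE details   (pid INTEGER, name TEXT);
    INSERT INTO catalogue VALUES (1, 'widget');
    INSERT INTO details   VALUES (1, 'spare part');
""")


def prefixed_columns(table, alias):
    """Build 'alias.col AS alias_col' entries for every column of a table."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return [f"{alias}.{c} AS {alias}_{c}" for c in cols]


select_list = ", ".join(
    prefixed_columns("catalogue", "tbl1") + prefixed_columns("details", "tbl2")
)
sql = (
    f"SELECT {select_list} "
    "FROM catalogue AS tbl1 LEFT JOIN details AS tbl2 ON tbl1.pid = tbl2.pid"
)

cur = conn.execute(sql)
names = [d[0] for d in cur.description]
rows = cur.fetchall()
print(names)
print(rows)
```

Every output column now carries the prefix of the table it came from, so no two names collide.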

Related

Left Join with Blanks in B.Key

I would like to combine two tables.
The first table (tbl1) contains all Articles I need.
The second table (tbl2) contains some additional information - but not for every article.
That means some columns in tbl2 have no value.
I am using the following join:
SELECT *
FROM tbl1
LEFT JOIN tbl2 ON tbl1.c4 = tbl2.C4
This join filters out all articles where tbl2.c4 = ''.
But I need all the articles that are listed in tbl1.
How can I manage that?
The database is Oracle.
You can use window functions. I think:
SELECT *
FROM (SELECT t1.*, COUNT(*) OVER () as cnt
FROM tbl1 t1
) t1 LEFT JOIN
tbl2
ON t1.c4 = tbl2.C4;
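A small sketch of what that query returns, assuming made-up sample rows and SQLite (via Python's sqlite3) instead of Oracle. It is meant to show that the LEFT JOIN keeps every row of tbl1 and that the window count carries the total number of tbl1 rows on each line. Oracle's handling of empty strings differs from SQLite's, so this does not reproduce the '' case from the question.

```python
import sqlite3

# Made-up sample data; only article 'A' has a match in tbl2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (c4 TEXT, article TEXT);
    CREATE TABLE tbl2 (c4 TEXT, extra TEXT);
    INSERT INTO tbl1 VALUES ('A', 'bolt'), ('B', 'nut'), ('C', 'washer');
    INSERT INTO tbl2 VALUES ('A', 'steel');
""")

rows = conn.execute("""
    SELECT t1.c4, t1.article, t1.cnt, tbl2.extra
    FROM (SELECT tbl1.*, COUNT(*) OVER () AS cnt  -- total rows in tbl1
          FROM tbl1) t1
    LEFT JOIN tbl2 ON t1.c4 = tbl2.c4
    ORDER BY t1.c4
""").fetchall()

for row in rows:
    print(row)
```

Unmatched articles come back with NULL (None in Python) in the tbl2 columns rather than disappearing.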

Oracle Sql to retrieve sum when match exists

SQL help needed.
Totalcount = Employees + Count
The column names are as shown. These are two unrelated tables that we are trying to join.
Important: what exists in Table1 may not exist in Table2, and what exists in Table2 may not exist in Table1. If a value exists in both, the summed total is needed; if not, the individual value.
One method uses a full join:
select coalesce(t1.company, t2.entity) as company,
coalesce(t1.employees, 0) + coalesce(t2.count, 0) as totalcount
from table1 t1 full join
table2 t2
on t1.company = t2.entity
You can use union all and aggregation:
select entity, sum(cnt) total_count
from (
select entity, cnt from table2
union all select company, employees from table1
) t
group by entity
order by entity
For this to work properly, the columns in both tables need to have the same datatype, i.e. table2.entity should have the same datatype as table1.company (and likewise table2.cnt and table1.employees). If the datatypes do not match, you must explicitly cast the columns to adjust.
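Here is the UNION ALL plus aggregation approach as a runnable sketch. The sample rows are invented, and it runs on SQLite through Python's sqlite3 rather than Oracle, but the query itself is the one from the answer.

```python
import sqlite3

# Invented sample data: 'Acme' is in both tables, the others in only one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (company TEXT, employees INTEGER);
    CREATE TABLE table2 (entity TEXT, cnt INTEGER);
    INSERT INTO table1 VALUES ('Acme', 10), ('Globex', 5);
    INSERT INTO table2 VALUES ('Acme', 3), ('Initech', 7);
""")

totals = conn.execute("""
    SELECT entity, SUM(cnt) AS total_count
    FROM (
        SELECT entity, cnt FROM table2
        UNION ALL
        SELECT company, employees FROM table1
    ) t
    GROUP BY entity
    ORDER BY entity
""").fetchall()

for entity, total in totals:
    print(entity, total)
```

Names present in both tables get the summed value, and names present in only one table keep their individual value.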

How to map each distinct value of a column in one table with each distinct value of a column in another table in Hive

I have two tables in Hive, Table1 and Table2. I want to get each distinct customerID in Table1 and map it to each distinct value in a column called category of Table2. However, I am a bit lost on how to do this in Hive. A better example of what I am trying to do is the following: let's say Table1 contains 5 distinct customerIDs and Table2 contains 3 distinct categories. I want my query result to look something like the following:
However, Table1 and Table2 do not have any columns in common, so I am a bit lost on how to perform a join on these two tables in Hive. Is this task possible in Hive? Any insights on this would be greatly appreciated!
You can do that with a cross join of distinct values from both tables.
select t1.customerid,t2.categories
from (select distinct customerid from tbl1) t1
cross join (select distinct categories from tbl2) t2
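A quick way to check the shape of the result: with 5 distinct customer IDs and 3 distinct categories, the cross join should give 15 rows. The sketch below assumes invented sample data and uses SQLite via Python's sqlite3 in place of Hive; the table and column names follow the query above.

```python
import sqlite3

# Invented data with duplicates on purpose, so DISTINCT has something to do.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (customerid INTEGER);
    CREATE TABLE tbl2 (categories TEXT);
    INSERT INTO tbl1 VALUES (1), (2), (3), (4), (5), (1);
    INSERT INTO tbl2 VALUES ('a'), ('b'), ('c'), ('a');
""")

pairs = conn.execute("""
    SELECT t1.customerid, t2.categories
    FROM (SELECT DISTINCT customerid FROM tbl1) t1
    CROSS JOIN (SELECT DISTINCT categories FROM tbl2) t2
    ORDER BY t1.customerid, t2.categories
""").fetchall()

print(len(pairs))  # one row per (customer, category) combination
print(pairs[:3])
```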

How to retrieve only those rows of a table (db1) which are not in another table (db2)

I have a table t1 in db1, and another table t2 in db2. I have the same columns in both tables.
How do I retrieve only those rows which are not in the other table?
select id_num
from [db1].[dbo].[Tbl1]
except
select id_num
from [db2].[dbo].[Tb01]
You can use a LEFT JOIN or a WHERE NOT IN condition.
Using WHERE NOT IN:
select
dbase1.id_num from [db1].[dbo].[Tbl1] as dbase1
where dbase1.id_num not in
(select dbase2.id_num from [db2].[dbo].[Tb01] as dbase2)
Using LEFT JOIN (recommended, as this is often faster, though that depends on the optimizer and the available indexes):
SELECT dbase1.id_num
FROM [db1].[dbo].[Tbl1] as dbase1
LEFT JOIN [db2].[dbo].[Tb01] as dbase2 ON dbase2.id_num COLLATE Latin1_General_CI_A = dbase1.id_num COLLATE Latin1_General_CI_A
WHERE dbase2.id_num IS NULL
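One thing worth checking before choosing between these: NOT IN behaves differently when the subquery returns a NULL. The sketch below compares EXCEPT, LEFT JOIN ... IS NULL, and NOT IN on invented data. It uses SQLite via Python's sqlite3 rather than SQL Server, with lower-case table names tbl1 and tb01 standing in for the tables above, and the helper ids is made up for the example.

```python
import sqlite3

# Invented data: tb01 contains a NULL id_num on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id_num TEXT);
    CREATE TABLE tb01 (id_num TEXT);
    INSERT INTO tbl1 VALUES ('a'), ('b'), ('c');
    INSERT INTO tb01 VALUES ('b'), (NULL);
""")


def ids(sql):
    """Run a single-column query and return its values sorted."""
    return sorted(r[0] for r in conn.execute(sql))


via_except = ids(
    "SELECT id_num FROM tbl1 EXCEPT SELECT id_num FROM tb01"
)
via_left_join = ids(
    "SELECT t1.id_num FROM tbl1 t1 "
    "LEFT JOIN tb01 t2 ON t2.id_num = t1.id_num "
    "WHERE t2.id_num IS NULL"
)
# A NULL in the subquery makes every NOT IN comparison unknown.
via_not_in = ids(
    "SELECT id_num FROM tbl1 WHERE id_num NOT IN (SELECT id_num FROM tb01)"
)
# Filtering the NULLs out restores the expected result.
via_not_in_filtered = ids(
    "SELECT id_num FROM tbl1 WHERE id_num NOT IN "
    "(SELECT id_num FROM tb01 WHERE id_num IS NOT NULL)"
)

print(via_except, via_left_join, via_not_in, via_not_in_filtered)
```

So if the column in the second table is nullable, either filter the NULLs in the subquery or prefer EXCEPT or the LEFT JOIN form.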
To compare tables with DB2 (other databases may have a select a - b statement or similar): because my database didn't have a - b at the time, I use the following. Wrap the statement in a CREATE TABLE statement to dig into the results. If no rows come back, the tables are identical. I've added a BEFORE|AFTER column, which makes the results easy to read.
SELECT 'AFTER', A.* FROM
(SELECT * FROM &AFTER
EXCEPT
SELECT * FROM &BEFORE) AS A
UNION
SELECT 'BEFORE', B.* FROM
(SELECT * FROM &BEFORE
EXCEPT
SELECT * FROM &AFTER) AS B
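The same before/after comparison as a runnable sketch. The tables t_before and t_after and their rows are hypothetical replacements for the &BEFORE and &AFTER substitution variables, and it runs on SQLite via Python's sqlite3 rather than DB2.

```python
import sqlite3

# Hypothetical snapshots: row 2 changed and row 3 was added.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_before (id INTEGER, val TEXT);
    CREATE TABLE t_after  (id INTEGER, val TEXT);
    INSERT INTO t_before VALUES (1, 'x'), (2, 'y');
    INSERT INTO t_after  VALUES (1, 'x'), (2, 'z'), (3, 'w');
""")

diff = sorted(conn.execute("""
    SELECT 'AFTER' AS side, a.* FROM
        (SELECT * FROM t_after
         EXCEPT
         SELECT * FROM t_before) AS a
    UNION
    SELECT 'BEFORE' AS side, b.* FROM
        (SELECT * FROM t_before
         EXCEPT
         SELECT * FROM t_after) AS b
""").fetchall())

for row in diff:
    print(row)
```

An unchanged row shows up on neither side, a changed row shows up once per side, and an empty result means the two tables hold the same rows.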

How do I merge data from two tables in a single database call into the same columns?

If I run the two statements in a batch, will they return one table or two to my SqlCommand object, with the data merged? What I am trying to do is optimize a search by searching twice, the first time on one set of data and the second time on another. They have the same fields, and I'd like all the records from both tables to show and be added to each other. I need this so that I can sort the data across both sets, but short of writing a stored procedure I can't think of a way of doing this.
E.g. Table 1 has columns A and B; Table 2 has the same columns but a different data source. I then want to merge them so that if A only exists in one table it is added to the result set, and if it exists in both tables column B is summed between the two.
Please note that this is not the same as a full outer join operation, as that does not merge the data.
[EDIT]
Here's what the code looks like:
Select * From
(Select ID,COUNT(*) AS Count From [Table1] Group By ID) as T1
full outer join
(Select ID,COUNT(*) AS Count From [Table2] Group By ID) as T2
on t1.ID = T2.ID
Perhaps you're looking for UNION?
i.e.:
SELECT A, B FROM Table1
UNION
SELECT A, B FROM Table2
Possibly:
select table1.a, table1.b
from table1
where table1.a not in (select a from table2)
union all
select table1.a, table1.b+table2.b as b
from table1
inner join table2 on table1.a = table2.a
Edit: perhaps you would benefit from unioning the tables before counting, e.g.
select id, count(*) as count from
(select id from table1
union all
select id from table2) t
group by id
I'm not sure if I understand completely but you seem to be asking about a UNION
SELECT A,B
FROM tableX
UNION ALL
SELECT A,B
FROM tableY
To do it, you would go:
SELECT * INTO TABLE3 FROM TABLE1
UNION
SELECT * FROM TABLE2
Provided both tables have the same columns
I think what you are looking for is this, but I am not sure I am understanding your language correctly.
select id, sum(count) as count
from (
select id, count(*) as count
from table1
group by id
union all
select id, count(*) as count
from table2
group by id
) a
group by id
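For the A/B description in the question (keep A if it is in only one table, sum B if it is in both), a minimal sketch of the union-then-aggregate idea follows. The sample rows are invented, and it runs on SQLite via Python's sqlite3 rather than SQL Server. It also shows why UNION ALL matters here: plain UNION removes duplicate rows before the sum.

```python
import sqlite3

# Invented data: ('y', 2) appears identically in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a TEXT, b INTEGER);
    CREATE TABLE table2 (a TEXT, b INTEGER);
    INSERT INTO table1 VALUES ('x', 1), ('y', 2);
    INSERT INTO table2 VALUES ('y', 2), ('z', 5);
""")

QUERY = """
    SELECT a, SUM(b) AS b
    FROM (SELECT a, b FROM table1
          {set_op}
          SELECT a, b FROM table2) AS t
    GROUP BY a
    ORDER BY a
"""

# UNION ALL keeps both ('y', 2) rows, so they are summed.
merged = conn.execute(QUERY.format(set_op="UNION ALL")).fetchall()

# UNION collapses the two identical rows into one before aggregation.
merged_with_union = conn.execute(QUERY.format(set_op="UNION")).fetchall()

print(merged)
print(merged_with_union)
```

Since the result is a single set, you can sort it with one ORDER BY in the same database call.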