How to write a subquery like column=(select xx from table) in Hive?

I have a scenario, for example:
with tmp as (select name from table1)
select * from table2 b
where b.name=(select max(name) from tmp)
However, Hive doesn't recognize this syntax. Is there a valid syntax for this?
After searching, I learned that a join can be used to achieve it:
select table2.* from table2
join (select max(name) as name from tmp) t2
where table2.name = t2.name
But I don't want to use a join, as the join will be very slow. I just want to treat the result as a reference.
In MySQL, for example, you can store the result in a user variable and reference it:
set @max_date := (select max(date) from some_table);
select * from some_other_table where date > @max_date;
Hive can achieve a similar effect by storing the query result in a shell variable. See: HiveQL: Using query results as variables
Does Hive support such a feature in SQL mode?

In Hive, you can write it as follows:
select * from table2 b
where b.name=(select max(name) from table1)
Another way:
You can also create a temporary table in Hive, which lets you replicate the query above.
CREATE TEMPORARY TABLE tmp AS SELECT name FROM table1;
SELECT * FROM table2 b WHERE b.name=(SELECT max(name) FROM tmp);
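To compare the scalar-subquery form with the join form from the question, here is a minimal sketch. It assumes SQLite (via Python's sqlite3) as a stand-in, not Hive, and the sample rows and the score column are made up for illustration. It only shows that the two forms return the same rows; it says nothing about how Hive plans or executes either one.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not Hive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT);
    CREATE TABLE table2 (name TEXT, score INTEGER);  -- score is a made-up column
    INSERT INTO table1 VALUES ('alice'), ('bob'), ('carol');
    INSERT INTO table2 VALUES ('bob', 1), ('carol', 2), ('carol', 3);
""")

# Form 1: compare the column against a scalar subquery.
scalar_form = """
    SELECT b.name, b.score FROM table2 b
    WHERE b.name = (SELECT max(name) FROM table1)
    ORDER BY b.score
"""

# Form 2: join against a one-row derived table holding the same value.
join_form = """
    SELECT b.name, b.score FROM table2 b
    JOIN (SELECT max(name) AS name FROM table1) t ON b.name = t.name
    ORDER BY b.score
"""

scalar_rows = conn.execute(scalar_form).fetchall()
join_rows = conn.execute(join_form).fetchall()
print(scalar_rows)
print(join_rows)
```

Because the derived table holds exactly one row, the join cannot multiply rows from table2, which is why both forms agree.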

Related

How to insert data from CTE to a Temp Table?

I am trying to build some logic using a CTE and then, instead of using a DML statement after the CTE, create a temp table from it. This is possible in T-SQL. Is it possible in GBQ?
I know I can create a temp table instead of the CTE in the example below, but I just want to know whether it is possible!
WITH xyz AS
(SELECT * FROM table1)
CREATE TEMP TABLE temp1 AS (
SELECT * FROM xyz INNER JOIN table2 on ...);
Use the following instead:
CREATE TEMP TABLE temp1 AS (
WITH xyz AS
(SELECT * FROM table1)
SELECT * FROM xyz INNER JOIN table2 on ...
);
As of 2022, I believe that no longer works in GBQ without a script or session.
You could write your query as follows:
WITH xyz AS (
SELECT
*
FROM table1
)
SELECT
*
FROM xyz
INNER JOIN table2
ON ...
and then click the More button -> Query Settings.
After that, you can set a temporary table as the destination for your results and define the name of your table there. In your case it's temp1.
This way you can just save the results of your query into a temporary table. Hope it helps!

Extract all tables and respective columns from long SQL Query

The task I am trying to solve is to extract all tables and their respective columns from a long SQL query.
E.g.
SELECT
t1.id, t1.gender, t1.name,
t2.age, t2.salary
FROM table1 t1
LEFT JOIN table2 t2
ON t1.id = t2.id
Wanted output:
{'table1': ['id', 'gender', 'name'], 'table2': ['age', 'salary']}
I considered using string splitting etc. to get all table names and then, based on the alias (if available), get the columns.
But this gets way too complicated if there are multiple joins and maybe also UNIONs.
Is there an available library or smart way to do that?
If it's only for one query, I would advise using MS Excel and filtering on the table alias. Generate the select statement via MS Excel and you could create something like this:
SELECT
'table1:', t1.id, t1.gender, t1.name,
'table2:',t2.age, t2.salary
FROM table1 t1
LEFT JOIN table2 t2
ON t1.id = t2.id
In case this helps:
Take the column names from ALL_TAB_COLUMNS and pivot them. This is still not the exact result you want.
select * from (
select TABLE_NAME,COLUMN_NAME from ALL_TAB_COLUMNS where TABLE_NAME in
('Table1','Table2'))
pivot
(
max(table_name)
for COLUMN_NAME in ('gender','name','age','salary')
)
order by 1;
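For the simple aliased case in the question, a regex-based sketch in Python can produce the wanted dictionary. This is a minimal illustration under strong assumptions: one top-level SELECT, every column written as alias.column, and no subqueries or UNIONs. The function name tables_and_columns and the keyword list are hypothetical, and a dedicated SQL parser would be the safer choice for real queries.

```python
import re

# Words that can follow a table name but are not aliases (partial list, assumed).
NOT_ALIASES = {"ON", "WHERE", "LEFT", "RIGHT", "INNER", "OUTER", "FULL",
               "CROSS", "JOIN", "GROUP", "ORDER", "UNION", "LIMIT", "HAVING"}

def tables_and_columns(sql):
    """Map each table to the columns referenced through its alias in the SELECT list."""
    # Collect alias -> table from "FROM table [AS] alias" and "JOIN table [AS] alias".
    alias_to_table = {}
    pattern = r"\b(?:FROM|JOIN)\s+(\w+)(?:\s+(?:AS\s+)?(\w+))?"
    for table, alias in re.findall(pattern, sql, flags=re.IGNORECASE):
        if not alias or alias.upper() in NOT_ALIASES:
            alias = table  # no alias given: columns are qualified by the table name
        alias_to_table[alias] = table

    # Only look at the SELECT list, so columns used in ON/WHERE are not counted.
    select_list = re.search(r"\bSELECT\b(.*?)\bFROM\b", sql,
                            flags=re.IGNORECASE | re.DOTALL)
    result = {}
    if select_list is None:
        return result
    for alias, column in re.findall(r"\b(\w+)\.(\w+)\b", select_list.group(1)):
        table = alias_to_table.get(alias)
        if table is not None:
            columns = result.setdefault(table, [])
            if column not in columns:
                columns.append(column)
    return result

query = """
SELECT
t1.id, t1.gender, t1.name,
t2.age, t2.salary
FROM table1 t1
LEFT JOIN table2 t2
ON t1.id = t2.id
"""
print(tables_and_columns(query))
```

Restricting the column scan to the SELECT list is a deliberate choice here: it keeps t2.id from the ON clause out of the result, matching the wanted output in the question.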

How to insert all rows from one table in the column in () statement in Hive

I have a table with 100 strings that I would like to use in a where column in (value, value, etc) clause. Something like: select cookies from table where field in (select * from table)
I don't think Hive supports subqueries in in clauses, but you can accomplish the same with an inner join:
select table1.cookies
from table1 join table2 on table1.field = table2.field
Hive has supported subqueries in the WHERE clause (IN / NOT IN and EXISTS / NOT EXISTS) since version 0.13.
So you can use that version or later.
Or you can try this query:
select * from table1 t1 JOIN (select 100_string_column as col2 from table2 where (whatever your condition is)) t2 ON t1.<matching_column> = t2.col2
Hope this helps...!!!
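One caveat when replacing IN (subquery) with an inner join: if the lookup table contains duplicate values, the join repeats matching rows, while IN does not. The sketch below illustrates this using SQLite through Python's sqlite3 as a stand-in for Hive; the sample data is made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not Hive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (cookies TEXT, field TEXT);
    CREATE TABLE table2 (field TEXT);
    INSERT INTO table1 VALUES ('c1', 'x'), ('c2', 'y'), ('c3', 'z');
    INSERT INTO table2 VALUES ('x'), ('x'), ('z');  -- 'x' appears twice
""")

def fetch(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# IN with a subquery: each table1 row appears at most once.
in_rows = fetch("""
    SELECT cookies FROM table1
    WHERE field IN (SELECT field FROM table2)
    ORDER BY cookies
""")

# Plain inner join: 'c1' is repeated because 'x' occurs twice in table2.
join_rows = fetch("""
    SELECT table1.cookies FROM table1
    JOIN table2 ON table1.field = table2.field
    ORDER BY table1.cookies
""")

# Join against de-duplicated values: matches the IN result again.
dedup_join_rows = fetch("""
    SELECT table1.cookies FROM table1
    JOIN (SELECT DISTINCT field FROM table2) t2 ON table1.field = t2.field
    ORDER BY table1.cookies
""")

print(in_rows, join_rows, dedup_join_rows)
```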

In SQL, how can I perform a "subtraction" operation?

Suppose I have two tables, which both have user ids. I want to perform an operation that would return all user IDS in table 1 that are not in table 2. I know there has to be some easy way to do this - can anyone offer some assistance?
It's slow, but you can normally accomplish this with something like NOT IN. (Various RDBMSs offer better ways to do this; Oracle, for instance, has an EXISTS clause that can be used for it.)
You could write:
select id from table1 where id not in (select id from table2)
There are a few ways to do it. Here's one approach using NOT EXISTS:
SELECT userid
FROM table1
WHERE NOT EXISTS
(
SELECT *
FROM table2
WHERE table1.userid = table2.userid
)
And here's another approach using a join:
SELECT table1.userid
FROM table1
LEFT JOIN table2
ON table1.userid = table2.userid
WHERE table2.userid IS NULL
The fastest approach depends on the database.
One way is to use EXCEPT if your T-SQL dialect supports it. It is roughly equivalent to performing a left join with a NULL test (note that EXCEPT also removes duplicate rows):
SELECT user_id FROM table1 LEFT JOIN table2 ON table1.user_id = table2.user_id WHERE table2.user_id IS NULL;
If it is
SQL Server:
SELECT id FROM table1
EXCEPT
SELECT id FROM table2
Oracle:
SELECT id FROM table1
MINUS
SELECT id FROM table2
Other databases: I'm not sure.
Try this:
SELECT id FROM table1 WHERE id NOT IN
(
SELECT id FROM table2
)
select ID from table1
where ID not in (select ID from table2)
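The approaches above can be compared side by side. The following sketch runs them in SQLite through Python's sqlite3 (a stand-in chosen for runnability; SQLite uses EXCEPT rather than Oracle's MINUS) on made-up sample data. It also shows a well-known difference: once the second table contains a NULL, NOT IN returns no rows, while NOT EXISTS is unaffected.

```python
import sqlite3

# In-memory SQLite database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (userid INTEGER);
    CREATE TABLE table2 (userid INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 VALUES (2), (4);
""")

queries = {
    "not_in": """
        SELECT userid FROM table1
        WHERE userid NOT IN (SELECT userid FROM table2)
        ORDER BY userid""",
    "not_exists": """
        SELECT userid FROM table1
        WHERE NOT EXISTS (SELECT * FROM table2
                          WHERE table1.userid = table2.userid)
        ORDER BY userid""",
    "left_join": """
        SELECT table1.userid FROM table1
        LEFT JOIN table2 ON table1.userid = table2.userid
        WHERE table2.userid IS NULL
        ORDER BY table1.userid""",
    "except": """
        SELECT userid FROM table1
        EXCEPT
        SELECT userid FROM table2
        ORDER BY userid""",
}

def run(name):
    """Execute one of the named queries and return the ids as a list."""
    return [row[0] for row in conn.execute(queries[name])]

results = {name: run(name) for name in queries}
print(results)

# Add a NULL to table2: "x NOT IN (..., NULL)" is never true, so NOT IN
# now filters out every row, while NOT EXISTS still returns the same ids.
conn.execute("INSERT INTO table2 VALUES (NULL)")
not_in_with_null = run("not_in")
not_exists_with_null = run("not_exists")
print(not_in_with_null, not_exists_with_null)
```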

How can I reference a single table multiple times in the same query?

Sometimes I need to treat the same table as two separate tables. What is the solution?
You can reference the same table more than once; just be sure to give each reference its own table alias:
select a.EmployeeName, b.EmployeeName as Manager
from Employees a
join Employees b on a.Mgr_id = b.Id
Use an alias like a variable name in your SQL:
select
A.Id,
A.Name,
B.Id as SpouseId,
B.Name as SpouseName
from
People A
join People B on A.Spouse = B.id
Use an alias:
SELECT t1.col1, t2.col3
FROM tbl t1
INNER JOIN tbl t2
ON t1.col1 = t2.col2
An alias is the most obvious solution:
SELECT * FROM x1 AS x,y1 AS y
However, if the table is the result of a query, a common table expression is quite useful:
;WITH ctx AS
( select * from z)
SELECT c1.* FROM ctx AS c1,ctx AS c2
A third solution -- suitable when your query lasts a long time -- is temporary tables:
SELECT *
INTO #monkey
FROM chimpanzee
SELECT * FROM #monkey m1,#monkey m2
DROP TABLE #MONKEY
Note that a common table expression is only available to the single query directly after it, whereas a temporary table persists until it is dropped or the session ends.
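As a runnable illustration of the alias approach, the sketch below self-joins an Employees table in SQLite through Python's sqlite3. The column names follow the first answer; the rows themselves are made up.

```python
import sqlite3

# In-memory SQLite database with made-up employees.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Id INTEGER PRIMARY KEY, EmployeeName TEXT, Mgr_id INTEGER);
    INSERT INTO Employees VALUES (1, 'Dana', NULL), (2, 'Eli', 1), (3, 'Fay', 1);
""")

# The same table is referenced twice: alias a is the employee, alias b the manager.
rows = conn.execute("""
    SELECT a.EmployeeName, b.EmployeeName AS Manager
    FROM Employees a
    JOIN Employees b ON a.Mgr_id = b.Id
    ORDER BY a.Id
""").fetchall()
print(rows)
```

Dana has no manager (Mgr_id is NULL), so the inner join leaves her out; a LEFT JOIN would keep her with a NULL manager.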