I have a question about a database query. Please refer to the tables below.
Table : 1
ID Country
1 x
2 y
3 z
4 k
Table : 2
eng fre fre1 fre2
x x
x1 k y t
x2 n z
Output Table
id country
1 x
2 x1
3 x2
4 x1
How to achieve this in Hive?
Thank you so much for help.
You can join three times, but it may run slowly:
select a.id, coalesce(b.eng, c.eng, d.eng) as Country
from table_1 a
left join table_2 b on a.country=b.fre
left join table_2 c on a.country=c.fre1
left join table_2 d on a.country=d.fre2
;
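As a quick sanity check, the query above can be run against the sample rows. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption, since the join and COALESCE behave the same way here), and it assumes the blank cells in Table 2 are NULL and that the first row is eng = x, fre = x.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id INTEGER, country TEXT);
CREATE TABLE table_2 (eng TEXT, fre TEXT, fre1 TEXT, fre2 TEXT);
INSERT INTO table_1 VALUES (1, 'x'), (2, 'y'), (3, 'z'), (4, 'k');
-- Blank cells from the question are assumed to be NULL.
INSERT INTO table_2 VALUES
    ('x',  'x', NULL, NULL),
    ('x1', 'k', 'y',  't'),
    ('x2', 'n', 'z',  NULL);
""")

# One left join per candidate column; COALESCE keeps the first match found.
rows = conn.execute("""
SELECT a.id, COALESCE(b.eng, c.eng, d.eng) AS country
FROM table_1 a
LEFT JOIN table_2 b ON a.country = b.fre
LEFT JOIN table_2 c ON a.country = c.fre1
LEFT JOIN table_2 d ON a.country = d.fre2
ORDER BY a.id
""").fetchall()

print(rows)  # [(1, 'x'), (2, 'x1'), (3, 'x2'), (4, 'x1')]
```

If a country could match more than one row of Table 2, the joins would multiply rows, so this check only covers the one-match-per-value case shown in the question.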
Related
I need a recommendation.
I have two tables. Table 1 is the main table, and I initially planned to attach Table 2 to it with a left join. Table 2 is much larger than Table 1. What would be the best-performing way to join Table 1 and Table 2 when the join condition is that column b equals column b, or column c equals column c, or column d equals column d? That is, a row matches if any one of these conditions is met, but empty values must not match each other. I want to do this without using OR in the left join, because of the poor performance and long execution time it would cause. I appreciate any help.
Note: Table 1 and Table 2 are each the result of a 40-line query. The database does not support recursive queries. The database is SAP HANA.
Table 1
ID  column b  column c  column d
1   d         g         j
2   e         h         k
3   f         i
Table 2
ID_2  column b  column c  column d
4     d         g
5                         k
6               i
Desired Result
ID  column b  column c  column d  ID_2
1   d         g         J         4
2   e         h         k         5
3   f         i                   6
Use three left joins:
select t1.*,
coalesce(t2_b.id_2, t2_c.id_2, t2_d.id_2) as id_2
from table1 t1 left join
table2 t2_b
on t1.b = t2_b.b left join
table2 t2_c
on t1.c = t2_c.c and t2_b.b is null left join
table2 t2_d
on t1.d = t2_d.d and t2_c.c is null;
Note that for optimal performance, you want three indexes:
table2(b, id_2)
table2(c, id_2)
table2(d, id_2)
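The chained left joins can be tried on the sample data with a small sketch. It runs in SQLite through Python's sqlite3 module rather than SAP HANA (an assumption made only so the example is runnable), and the placement of the values in Table 2 (k in column d, i in column c) is inferred from the desired result, not stated in the question.

```python
import sqlite3

# SQLite stands in for SAP HANA here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, b TEXT, c TEXT, d TEXT);
CREATE TABLE table2 (id_2 INTEGER, b TEXT, c TEXT, d TEXT);
INSERT INTO table1 VALUES (1, 'd', 'g', 'j'), (2, 'e', 'h', 'k'), (3, 'f', 'i', NULL);
-- Column placement of 'k' and 'i' is inferred from the desired result.
INSERT INTO table2 VALUES (4, 'd', 'g', NULL), (5, NULL, NULL, 'k'), (6, NULL, 'i', NULL);
""")

# Each later join is attempted only when the previous one found nothing.
# NULL = NULL is never true, so empty values do not match each other.
rows = conn.execute("""
SELECT t1.id, t1.b, t1.c, t1.d,
       COALESCE(t2_b.id_2, t2_c.id_2, t2_d.id_2) AS id_2
FROM table1 t1
LEFT JOIN table2 t2_b ON t1.b = t2_b.b
LEFT JOIN table2 t2_c ON t1.c = t2_c.c AND t2_b.b IS NULL
LEFT JOIN table2 t2_d ON t1.d = t2_d.d AND t2_c.c IS NULL
ORDER BY t1.id
""").fetchall()

for row in rows:
    print(row)
# (1, 'd', 'g', 'j', 4)
# (2, 'e', 'h', 'k', 5)
# (3, 'f', 'i', None, 6)
```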
I have a problem that I can only describe with an example:
So there are 2 columns like:
X Y
A 2
A 1
A 3
B 3
C 2
A 1
D 2
B 1
B 3
C 1
A 1
D 3
D 1
Now I would like to select only the rows whose X value has at least one row where Y is 2.
So my output should look like:
X Y
A 2
A 1
A 3
C 2
A 1
D 2
C 1
A 1
D 3
D 1
because Y=2 for X=B doesn't exist in the main table.
My question is: what is the query for this operation? I tried something with CASE WHEN, but it didn't work for me.
Try
SELECT X, Y FROM Table WHERE X IN (SELECT X FROM Table WHERE Y=2)
Or try
SELECT t1.X, t1.Y FROM Table t1
INNER JOIN Table t2 ON t1.X = t2.X
WHERE t2.Y = 2
Try a subquery:
SELECT X, Y FROM table WHERE X IN (SELECT X FROM table WHERE Y = 2);
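To see the IN subquery at work on the sample data, here is a minimal sketch using Python's sqlite3 module. The table name xy is hypothetical (the question does not name the table), and both columns are selected so the result can be compared with the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "xy" is a hypothetical table name chosen for this example.
conn.execute("CREATE TABLE xy (X TEXT, Y INTEGER)")
conn.executemany(
    "INSERT INTO xy VALUES (?, ?)",
    [("A", 2), ("A", 1), ("A", 3), ("B", 3), ("C", 2), ("A", 1), ("D", 2),
     ("B", 1), ("B", 3), ("C", 1), ("A", 1), ("D", 3), ("D", 1)],
)

# Keep every row whose X appears somewhere with Y = 2.
# ORDER BY rowid preserves insertion order so the output lines up with the question.
rows = conn.execute("""
SELECT X, Y FROM xy
WHERE X IN (SELECT X FROM xy WHERE Y = 2)
ORDER BY rowid
""").fetchall()

print(rows)
# [('A', 2), ('A', 1), ('A', 3), ('C', 2), ('A', 1),
#  ('D', 2), ('C', 1), ('A', 1), ('D', 3), ('D', 1)]
```

Every B row is dropped because no B row has Y = 2, while duplicate rows such as A 1 are kept as they are.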
I have the following tables in a Hive database:
table1:
id t X
1 1 a
1 4 a
2 5 a
3 10 a
table2:
id t Y
1 3 b
2 6 b
2 8 b
3 15 b
And I would like to merge them to have a table like:
id t Z
1 1 a
1 3 b
1 4 a
2 5 a
2 6 b
2 8 b
3 10 a
3 15 b
Basically what I want to do is:
1. a join on the column id (that part is easy)
2. merge the columns table1.t and table2.t into a new column t
3. have the variable Z that is equal to table1.X if the corresponding t comes from table1.t, and table2.Y if it comes from table2.t
4. order the table by id and then by t (that shouldn't be too hard)
I have no idea how to do parts 2 and 3. I tried with an outer join on
table1.id = table2.id and table1.t = table2.t, but it doesn't merge the two columns t.
Any pointer would be appreciated. Thanks!
CREATE TABLE table3 as SELECT * FROM (SELECT id,t,X as Z FROM t3_1 UNION ALL SELECT id,t,Y as Z FROM t3_2) u1 order by id,t;
Although not always required, wrapping the union'd queries in a subquery helps keep things organized, and you can then reference the fields from the union (e.g. u1.id) in other parts of the query.
You'll need the alias on the 3rd column to make the schemas match. If the source table name was not already a column, you could do something like this:
select * from (select id,t,'a' from t3_1 UNION ALL select id,t,'b' from t3_2) u1;
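The UNION ALL approach can be checked against the sample rows with a short sketch. It uses Python's sqlite3 module instead of Hive (an assumption for runnability) and keeps the answer's table names t3_1 and t3_2 for the question's table1 and table2.

```python
import sqlite3

# SQLite stands in for Hive; t3_1 / t3_2 correspond to table1 / table2 in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t3_1 (id INTEGER, t INTEGER, X TEXT);
CREATE TABLE t3_2 (id INTEGER, t INTEGER, Y TEXT);
INSERT INTO t3_1 VALUES (1, 1, 'a'), (1, 4, 'a'), (2, 5, 'a'), (3, 10, 'a');
INSERT INTO t3_2 VALUES (1, 3, 'b'), (2, 6, 'b'), (2, 8, 'b'), (3, 15, 'b');
""")

# Stack the two tables, aliasing X and Y to a common column Z, then sort.
rows = conn.execute("""
SELECT id, t, Z
FROM (SELECT id, t, X AS Z FROM t3_1
      UNION ALL
      SELECT id, t, Y AS Z FROM t3_2) u1
ORDER BY id, t
""").fetchall()

for row in rows:
    print(row)
# (1, 1, 'a')
# (1, 3, 'b')
# (1, 4, 'a')
# (2, 5, 'a')
# (2, 6, 'b')
# (2, 8, 'b')
# (3, 10, 'a')
# (3, 15, 'b')
```

No join is involved: the rows are simply stacked, which is why the t columns end up in a single column.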
Try this one. It inserts all the values from the other two tables into table3:
INSERT INTO table3 ( id, t, Z )
SELECT id, t, X
FROM table1
UNION ALL
SELECT id, t, Y
FROM table2
I have two tables (A and B) with two columns in common (x and y). I'd like to inner join A and B on x but keep only the values of A's column y (the left join). I'm looking for a way that will combine the two y columns (can't just specify A.y in the select statement). How can I do this?
Example
Table A
x y
1 2
3 4
5 6
7 8
Table B
x y
1 2
3 8
9 null
11 0
I'd like the resulting table to look like
x y
1 2
3 4
select a.x, a.y
from TableA a
inner join TableB b on a.x = b.x
Do you mean:
SELECT *
FROM A
INNER JOIN B b1 ON A.x = b1.x
LEFT JOIN B b2 ON a.y = b2.y
Take a look at the question "SQL exclude a column using SELECT * [except columnA] FROM tableA?", in particular the second answer. It is not the best solution, but you can use it as a workaround. In general, you should specify the full list of columns explicitly.
I'm using SQL Server 2008 and I have 3 tables, x, y and z. y exists to create a many-to-many relationship between x and z.
x    y     z
--   ---   ----
id   xid   id
     zid   sort
All of the above fields are int.
I want to find the best-performing method (excluding denormalising) of finding the z with the highest sort for any x, and return all fields from all three tables.
Sample data:
x: id
--
1
2
y: xid zid
--- ---
1 1
1 2
1 3
2 2
z: id sort
-- ----
1 5
2 10
3 25
Result set should be
xid zid
--- ---
1 3
2 2
Note that if more than one z exists with the same highest sort value, then I still only want one row per x.
Note also that in my real-world situation, there are other fields in all three tables which I will need in my result set.
One method is with a sub query. This however is only good for getting the ID of Z. If you need more/all columns from both x and z tables then this is not the best solution.
SELECT
x.id,
(
SELECT TOP 1
z.id
FROM
y
INNER JOIN
z
ON
z.id = y.zid
WHERE
y.xid = x.id
ORDER BY
z.sort DESC
)
FROM
x
This is how you can do it and return all the data from all the tables.
SELECT
*
FROM
x
INNER JOIN
y
ON
y.xid = x.id
AND
y.zid =
(
SELECT TOP 1
z2.id
FROM
y y2
INNER JOIN
z z2
ON
z2.id = y2.zid
WHERE
y2.xid = x.id
ORDER BY
z2.sort DESC
)
INNER JOIN
z
ON
z.id = y.zid
select xid,max(zid) as zid from y
group by xid
select xid, zid /* columns from x; and columns from y or z taken from q */
from (select y.xid, y.zid, /* columns from y or z */
row_number() over(partition by y.xid order by z.sort desc) r
from y
join z on z.id = y.zid
) q
join x on x.id = q.xid
where r = 1
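The row_number() variant can be tried on the sample data as follows. This is a sketch under two assumptions: SQLite (through Python's sqlite3 module) stands in for SQL Server 2008, and the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# SQLite stands in for SQL Server; window functions need SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x (id INTEGER);
CREATE TABLE y (xid INTEGER, zid INTEGER);
CREATE TABLE z (id INTEGER, sort INTEGER);
INSERT INTO x VALUES (1), (2);
INSERT INTO y VALUES (1, 1), (1, 2), (1, 3), (2, 2);
INSERT INTO z VALUES (1, 5), (2, 10), (3, 25);
""")

# Number the z rows of each x by descending sort and keep only the first one.
rows = conn.execute("""
SELECT x.id AS xid, q.zid, q.sort
FROM (SELECT y.xid, y.zid, z.sort,
             ROW_NUMBER() OVER (PARTITION BY y.xid ORDER BY z.sort DESC) AS r
      FROM y
      JOIN z ON z.id = y.zid) q
JOIN x ON x.id = q.xid
WHERE q.r = 1
ORDER BY x.id
""").fetchall()

print(rows)  # [(1, 3, 25), (2, 2, 10)]
```

Because ROW_NUMBER assigns exactly one row the value 1 per partition, ties on the highest sort still yield a single row per x, although which tied row wins is not specified unless a tie-breaker is added to the ORDER BY.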