I need a recommendation.
I have two tables. Table 1 is the main table and table 2 is the table that I initially thought to join Table 1 through a left join, table 2 is much larger than table 1. What would be the best performing way to join Table 1 and Table 2 being the union condition that Column b is equal to column b or that column c is equal to column c and column d is equal to column d, that is, any of these conditions is met but no empty values are met. This without using OR in the left join due to the poor performance it would have and the execution time. I appreciate any help.
Note: table 1 and table 2 is the result of 40 lines query. Database do not support recursive query. The database is sap hana.
Table 1
ID
column b
column c
column d
1
d
g
j
2
e
h
k
3
f
i
Table 2
ID_2
column b
column c
column d
4
d
g
5
k
6
i
Desired Result
ID
column b
column c
column d
ID_2
1
d
g
J
4
2
e
h
k
5
3
f
i
6
Use two left joins:
select t1.*,
coalesce(t2_b.id_2, t2_c.id_2, t2_d.id_2) as id_2
from table1 t1 left join
table2 t2_b
on t1.b = t2_b.b left join
table2 t2_c
on t1.c = t2_d.c and t2_b.b is null
table2 t2_d
on t1.d = t2_d.d and t2_c.c is null;
Note that for optimal performance, you want three indexes:
table2(b, id_2)
table2(c, id_2)
table2(d, id_2)
Related
I have these 3 tables
Table 1:
id_Table1 field_table1_1 field_table1_2
1 A B
2 C D
3 E F
Table 1:
id_Table2 field_table2_1 field_table2_2
4 G H
5 I J
List item
Table 3:
id_Table3 id_Table1 id_Table2
1 1 4
2 1 5
3 2 5
So table 3 holds the relation between table 1 and 2.
What I want to do, is with a query, get all the fields in the table 1, plus one extra field that contains all the ids of the table 2 separated by coma.
So the result should be something like this:
id_Table1 field_table1_1 field_table1_2 id_Table2
1 A B 4, 5
2 C D 5
3 E F
One option use a lateral join and string_agg():
select t1.*, x.*
from table1 t1
outer apply (
select string_agg(t3.id_table2) id_table2
from table3 t3
where t3.id_table1 = t1.id_table1
) x
There is no need to bring table2 to get the results you want.
I have bug in my query, but I have no idea what happen.
So there is 3 tables.
table 1
name grade min max
a
b
c
d
e
table 2
fullname name min max
a a123 1 10
bbbb b 2 20
c cccc 3 30
d dd 1 10
E Ed 2 20
table 3
value grade
25 A
15 B
5 C
my goal is using name, show the grade of the name( the max > value in table 3).
for example, c has 30 in max, it should have A grade, instead of B and C.
Also, the name usually is the fullname in table 2, but sometime it is name in the table2(like b)(here is one of bugs). That's how the table look like, I can't change it.
if I am not include the checking table 1.name = table 2.name . no bug at all, but cannot get grade for b
if i include the table 1.name = table 2.name.then, it has problem
for the query of matching the grade, it is like(assume get the min and max from table 2 before)
update table1
set table1.grade = table3.grade
from table1 inner join table3
on table1.max > table3.value
All the cases are include the checking table 1.name = table 2.name
case 1:
the grade will equal = C for all data if there is some data inlcude.
for example, in table1, if I am not include E, then everything is fine.
but if I include E, will get C grade for all records.
case 2:
if I run the query for all data at the same time, the result goes wrong.
it work fine if I update the record one by one.,
for example, i add one more condition in update query
update table1
set table1.grade = table3.grade
from table1 inner join table3
on table1.max > table3.value and fullname='c'
after getting wrong result, i add the condition and run it again,
then c will get grade 'A' instead of 'C'. but if I remove the condition and run the query again.
c will get grade 'C' again.
case 3:
there is no problem when I only run the set of data that will cause case 1 problem independently.
but if I put the data together, It cases problem.
That is all cases. I don't know what cause the problem. Please help
The result should be:
table 1
name grade min max
a C 1 10
b B 2 20
c A 3 30
d C 1 10
e B 2 20
If I remove table1.name = table2.name, result will be
table 1
name grade min max
a C 1 10
b null null null
c A 3 30
d C 1 10
e B 2 20
with table1.name = table2.name, result will be
table 1
name grade min max
a C 1 10
b C 2 20
c C 3 30
d C 1 10
e C 2 20
with table1.name = table2.name but remove e , result will be
table 1
name grade min max
a C 1 10
b B 2 20
c A 3 30
d C 1 10
with table1.name = table2.name but only for e,result will be
name grade min max
e B 2 20
those situation happen when I run the update query for whole table.
there is no problem with table1.name = table2.name if I update each row one by one.
At the very least, you should avoid the way you update the table. You should be carefull with joins on update. If you happen to have more than one value for the same row, the result is not deterministic. Better use this form:
update table1
set table1.grade = (SELECT TOP 1 table3.grade FROM table3
WHERE table3.value < table1.max
ORDER BY table3.value DESC)
I am not sure about the expected results but you may have traced the problem already: incorrect criteria.
Before doing an update, just try a select with the same criteria eg:
select *
from table1 inner join table3
on table1.max > table3.value
and see what you get.
i have following SQL table A in my database:
index, group, foo
1 A 2
2 A 2
3 A 0
4 A 1
5 B 2
6 B 1
7 C 1
There are few more groups and I need to write a query based on this filter table B. For each group in table A it's index should be equal or greater than index_egt from table B for the same group.
If the group is not listed in table B, the group won't be filtered.
index_egt, group
3 A
5 B
Expected result:
index, group, foo
3 A 0
4 A 1
5 B 2
6 B 1
7 C 1
Try this, the A.index>=B.index_egt will handle cases where the group is listed in TableB and the B.index_egt IS NULL will handle cases where the group is not listed:
SELECT
A.index,
A.group,
A.foo
FROM TableA AS A
LEFT JOIN TableB AS B ON A.group=B.group
WHERE A.index>=B.index_egt
OR B.index_egt IS NULL
select
a.*
from
A a
left join
B b ON b.group = a.group
where
a.index >= b.index_egt OR b.index_egt IS NULL
I always like this trick with coalesce
SELECT a.*
FROM a_table_with_no_name a
LEFT JOIN b_table_with_no_name b ON b.group = a.group
WHERE a.index >= COALESCE(b.index_egt,a.index)
Consider below 3 tables.
Table a
Col a Col b Col c
1 000 Actual data
1 001 Actual data
2 000 Actual data
3 000 Actual data
3 001 Actual data
3 002 Actual data
Table b
Col a Col b Col d
1 000 Actual data
1 001 Actual data
2 000 Actual data
Table c
Col a Col b Col d
3 000 Actual data
3 001 Actual data
3 002 Actual data
Table a is parent table and table b and c are child table having col a & b common among 3 and needs to be joined.
Now Join should be such if data is not found in table b then only it should be searched in table c
Desired:
cola col b col c col d
1 000 somedata moredata
1 001 somedata moredata
2 000 somedata moredata
3 000 somedata moredata
3 001 somedata moredata
3 002 somedata moredata
Well, currently what i am doing is, left join b to a and c to a, but i think every time for record in a will be searched in b and c both making it Less cost effective. hence want to make it cost effective/fine-tune such that if records NOT exist in b then only search c.
What you really need is a way to "collect" all the rows from table B, and if there are none, then all the rows from table C. Doing the join to A is then standard.
Something like this should work. Make it a subquery and join to your first table.
select col_a, col_b, col_c
from table_b
union all
select col_a, col_b, col_c
from table_c
where (select count(*) from table_b) = 0
If table_b has at least one row, then nothing will be selected from table_c (because the where condition will be false for all rows in table_c). However, if table_b is empty, all the rows from table_c will be selected.
What you need to do is first create a union of two tables B and C with only those records where are in B and C but if they are in B then we should ignore the C ones then do a join with Table A. Thus:
SELECT B.cola, B.colb from B
UNION ALL
SELECT C.cola, C.colb from C
Now using this table, you can join with Table A like:
SELECT A.cola, A.colb, tmp.colc
FROM A
JOIN
( SELECT B.cola, B.colb, B.colc from B
UNION ALL
SELECT C.cola, C.colb from C) AS tmp
ON A.cola = tmp.cola
AND A.colb = tmp.colb
Two left joins:
select a.*, b.*, c.*
from a
left join b
on a.cola=b.cola
and a.colb = b.colb
left join c
on a.cola=c.cola
and a.colb=c.colb
I have the following tables in a Hive database:
table1:
id t X
1 1 a
1 4 a
2 5 a
3 10 a
table2:
id t Y
1 3 b
2 6 b
2 8 b
3 15 b
And I would like to merge them to have a table like:
id t Z
1 1 a
1 3 b
1 4 a
2 5 a
2 6 b
2 8 b
3 10 a
3 15 b
Basically what I want to do is :
a join on the column id (that part is easy)
merge the columns table1.t and table2.t into a new column t
have the variable Z that is equal to table1.X if the corresponding t comes from table1.t, and table2.Y if it comes from table2.t
order the table by id and then by t (that shouldn't be too hard)
I have no idea on how to do the parts 2 and 3. I tried with an outer join on
table1.id = table2.id and table1.t = table2.t, but it doesn't merge the two columns t.
Any pointer would be appreciated. Thanks!
CREATE TABLE table3 as SELECT * FROM (SELECT id,t,X as Z FROM t3_1 UNION ALL SELECT id,t,Y as Z FROM t3_2) u1 order by id,t;
Although not always required, using a subquery for the union'd queries help to organize, plus you can then reference the fields from the union (e.g. u1.id ) in other parts of the query.
You'll need the alias on the 3rd column to make the schemas match. If the source table name was not already a column, you could do something like this:
select * from (select id,t,'a' from t3_1 UNION ALL select id,t,'b' from t3_2) u1;
Try this one. It will insert in table 3, all the values from the other 2 tables
INSERT INTO table3 ( t, Z )
SELECT t, X
FROM table1
UNION ALL
SELECT t, Y
FROM table2