Iterate over the rows of a second table to return resultset - sql

I'm trying to achieve a SELECT query that takes in consideration the result of a second table, by making combinations, replicating a result. I must follow a certain order to don't get all combinations, using a ORDER column.
To ilustrate what I'm trying to achieve:
Until now, I tried to use SUBSELECT with JOIN to replicate the results based on a second table.
SELECT a.table_a_id, b.label_x, b.label_y
FROM table_a a
JOIN
(
SELECT label_x, label_y
FROM table_b
WHERE b.table_a_id = a.table_a_id
) b
ON b.table_a_id = a.table_a_id
But, of course, I cant reference the table_a from inside the SUBSELECT.
What should be my next steps for achieving my desired ResultSet?

Use a self join here on the table_b table, with the join condition being that the table_a_id values match, but label_y > label_x.
SELECT
b1.table_a_id,
b1.label_x,
b2.label_y
FROM table_a a
INNER JOIN table_b b1
ON b1.table_a_id = a.table_a_id
INNER JOIN table_b b2
ON b2.table_a_id = b1.table_a_id AND
b2.label_y > b1.label_x
ORDER BY
b1.table_a_id,
b1.label_x,
b2.label_y;
Demo

Related

Distinct IDs from one table for inner join SQL

I'm trying to take the distinct IDs that appear in table a, filter table b for only these distinct IDs from table a, and present the remaining columns from b. I've tried:
SELECT * FROM
(
SELECT DISTINCT
a.ID,
a.test_group,
b.ch_name,
b.donation_amt
FROM table_a a
INNER JOIN table_b b
ON a.ID=b.ID
ORDER by a.ID;
) t
This doesn't seem to work. This query worked:
SELECT DISTINCT a.ID, a.test_group, b.ch_name, b.donation_amt
FROM table_a a
inner join table_b b
on a.ID = b.ID
order by a.ID
But I'm not entirely sure this is the correct way to go about it. Is this second query only going to take unique combinations of a.ID and a.test_group or does it know to only take distinct values of a.ID which is what I want.
Your first and second query are similar.(just that you can not use ; inside your query) Both will produce the same result.
Even your second query which you think is giving you desired output, can not produce the output what you actually want.
Distinct works on the entire column list of the select clause.
In your case, if for the same a.id there is different a.test_group available then it will have multiple records with same a.id and different a.test_group.

SQL function to create a one-to-one match between two tables?

I am trying to join 2 tables. Table_A has ~145k rows whereas Table_B has ~205k rows.
They have two columns in common (i.e. ISIN and date). However, when I execute this query:
SELECT A.*,
B.column_name
FROM Table_A
JOIN
Table_B ON A.date = B.date
WHERE A.isin = B.isin
I get a table with more than 147k rows. How is it possible? Shouldn't it return a table with at most ~145k rows?
What you are seeing indicates that, for some of the records in Table_A, there are several records in Table_B that satisfy the join conditions (equality on the (date, isin) tuple).
To exhibit these records, you can do:
select B.date, B.isin
from Table_A
join Table_B on A.date = B.date and A.isin = B.isin
group by B.date, B.isin
having count(*) > 1
It's up to you to define how to handle those duplicates. For example:
if the duplicates have different values in column column_name, then you can decide to pull out the maximum or minimum value
or use another column to filter on the top or lower record within the duplicates
if the duplicates are true duplicates, then you can use select distinct in a subquery to dedup them before joining
... other solutions are possible ...
If you want one row per table A, then use outer apply:
SELECT A.*,
B.column_name
FROM Table_A a OUTER APPLY
(SELECT TOP (1) b.*
FROM Table_B b
WHERE A.date = B.date AND A.isin = B.isin
ORDER BY ? -- you can specify *which* row you want when there are duplicates
) b;
OUTER APPLY implements a lateral join. The TOP (1) ensures that at most one row is returned. The OUTER (as opposed to CROSS) ensures that nothing is filtered out. In this case, you could also phrase it as a correlated subquery.
All that said, your data does not seem to be what you really expect. You should figure out where the duplicates are coming from. The place to start is:
select b.date, b.isin, count(*)
from tableb b
group by b.date, b.isin
having count(*) >= 2;
This will show you the duplicates, so you can figure out what to do about them.
Duplicate possibilities is already discuss.
When millions of records are use in join then often due to poor Cardianility Estimate,
record return are not accurate.
For this just change join order,
SELECT A.*,
B.column_name
FROM Table_A
JOIN
Table_B ON A.isin = B.isin
and
A.date = B.date
Also create non clustered index on both table.
Create NonClustered index isin_date_table_A on Table_A(isin,date)include(*Table_A)
*Table_A= comma seperated list Table_A column which is require in resultset
Create NonClustered index isin_date_table_B on Table_B(isin,date)include(column_nameA)
Update STATISTICS Table_A
Update STATISTICS Table_B
Keeping the DATE columns of both tables in the same format in the JOIN condition you should be getting the result as expected.
Select A.*, B.column_name
from Table_A
join Table_B on to_date(a.date,'DD-MON-YY') = to_date(b.date,'DD-MON-YY')
where A.isin = B.isin

SQL join including all rows from one table irrespective of how many are represented in the other

I have two tables:
I want to output the following:
I tried this statement:
SELECT TableA.bu_code
, SUM(TableB.count_invalid_date) AS TotalInvDate
FROM TableA
LEFT JOIN TableB ON TableA.bu_code = TableB.bu_code
GROUP BY TableA.bu_code
But it doesn't show every row represented in TableA, instead it does this:
Is there a single SQL statement that can output what I want?
You could use a left join after performing the group by:
SELECT a.bu_code, COALESCE (TotalInvDate, 0)
FROM TableA a
LEFT JOIN (SELECT bu_code, SUM(count_invalid_date) AS TotalInvDate
FROM TableB
GROUP BY bu_code) b ON a.bu_code = b.bu_code
There may be orphaned TableB rows without a parent row in TableA.
If so, use this GROUP BY syntax to see them.
GROUP BY ROLLUP(TableA.bu_code)
Refer to this Microsoft SQL-Server page on the GROUP BY clause for more details on the ROLLUP option.
SELECT A.bu_code AS [bu_code],
ISNULL(sum( B.count_invalid_date),0) AS [TotalInvDate]
FROM
TableA A
left JOIN
TableB B
ON
A.bu_code=B.bu_code
GROUP BY
A.bu_code

Select returns the same row multiple times

I have the following problem:
In DB, I have two tables. The value from one column in the first table can appear in two different columns in the second one.
So, the configuration is as follows:
TABLE_A: Column Print_group
TABLE _B: Columns Print_digital and Print_offset
The value from the different rows and Print_group column of the Table_A can appear in one row of the Table_B but in different column.
I have the following query:
SELECT DISTINCT * FROM Table_A
INNER JOIN B ON (Table_A. Print_digital = Table_B.Print_group OR
Table_A.Print_offset = Table_B.Print_group)
The problem is that this query returns the same row from the Table_A two times.
What I am doing wrong? What is the right query?
Thank you for your help
If I'm understanding your question correctly, you just need to clarify your fields to come from Table_A:
SELECT DISTINCT A.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group
EDIT:
Given your comments, looks like you just need SELECT DISTINCT B.*
SELECT DISTINCT B.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group
I've still another question... first,to be clear, the right query version is
SELECT DISTINCT A.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group.
If I want it returns also one column from the B table it again returns duplicate rows. My query (the bad one) is the following one:
SELECT DISTINCT A.*, B.Id
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group

SQL LEFT outer join with only some rows from the right?

I have two tables TABLE_A and TABLE_B having the joined column as the employee number EMPNO.
I want to do a normal left outer join. However, TABLE_B has certain records that are soft-deleted (status='D'), I want these to be included. Just to clarify, TABLE_B could have active records (status= null/a/anything) as well as deleted records, in this case i don't want that employee in my result. If however there are only deleted records of the employee in TABLE_B i want the employee to be included in the result.I hope i'm making my requirement clear. (I could do a lengthy qrslt kind of thingy and get what I want, but I figure there has to be a more optimized way of doing this using the join syntax). Would appreciate any suggestions(even without the join). His newbness is trying the following query without the desired result:
SELECT TABLE_A.EMPNO
FROM TABLE_A
LEFT OUTER JOIN TABLE_B ON TABLE_A.EMPNO = TABLE_B.EMPNO AND TABLE_B.STATUS<>'D'
Much appreciate any help.
Just to clarify -- all records from TABLE_A should appear, unless there are rows in table B with statues other than 'D'?
You'll need at least one non-null column on B (I'll use 'B.ID' as an example, and this approach should work):
SELECT TABLE_A.EMPNO
FROM TABLE_A
LEFT OUTER JOIN TABLE_B ON
(TABLE_A.EMPNO = TABLE_B.EMPNO)
AND (TABLE_B.STATUS <> 'D' OR TABLE_B.STATUS IS NULL)
WHERE
TABLE_B.ID IS NULL
That is, reverse the logic you might think -- join onto TABLE_B only where you have rows that would exclude TABLE_A entries, and then use the IS NULL at the end to exclude those. This means that only those which didn't match (those with no row in TABLE_B, or with only 'D' rows) get included.
An alternative might be
SELECT TABLE_A.EMPNO
FROM TABLE_A
WHERE NOT EXISTS (
SELECT * FROM TABLE_B
WHERE TABLE_B.EMPNO = TABLE_A.EMPNO
AND (TABLE_B.STATUS <> 'D' OR TABLE_B.STATUS IS NULL)
)
The following query will get you the employee records that aren't deleted, or only the employ only has deleted records.
select
a.*
from
table_a a
left join table_b b on
a.empno = b.empno
where
b.status <> 'D'
or (b.status = 'D' and
(select count(distinct status) from table_b where empno = a.empno) = 1)
This is in ANSI SQL, but if I knew your RDBMS, I could give a more specific solution that may be a bit more elegant.
ah crud, this apparently works ><
SELECT TABLE_A.EMPNO
FROM TABLE_A
LEFT OUTER JOIN TABLE_B ON TABLE_A.EMPNO = TABLE_B.EMPNO
where TABLE_B.STATUS<>'D'
If you guys have any extra info to chime in with though, please feel free.
UPDATE:
Saw this question after sometime and thought i'll add more helpful info: This link has good info regarding ANSI syntax - http://www.oracle-base.com/articles/9i/ANSIISOSQLSupport.php
In particular this part from the linked page is informative:
Extra filter conditions can be added to the join to using AND to form a complex join. These are often necessary when filter conditions are required to restrict an outer join. If these filter conditions are placed in the WHERE clause and the outer join returns a NULL value for the filter column the row would be thrown away. if the filter condition is coded as part of the join the situation can be avoided.
SELECT A.*, B.*
FROM
Table_A A
INNER JOIN Table_B B
ON A.EmpNo = B.EmpNo
WHERE
NOT EXISTS (
SELECT *
FROM Table_B X
WHERE
A.EmpNo = X.EmpNo
AND X.Status <> 'D'
)
I think this does the trick. The left join is not needed because you only want to include employees with all (and at least one) deleted rows.
This is how I understand the question. You need to include only those employees for which either of the following is true:
an employee has only (soft-)deleted rows in TABLE_B;
an employee has only non-deleted rows in TABLE_B;
an employee has no rows in TABLE_B at all.
In other words, if an employee has both deleted and non-deleted rows in TABLE_B, omit that employee, otherwise include them.
This is how I think it could be solved:
SELECT DISTINCT a.EMPNO
FROM TABLE_A a
LEFT JOIN TABLE_B b1 ON a.EMPNO = b1.EMPNO
LEFT JOIN TABLE_B b2 ON b1.EMPNO = b2.EMPNO
AND (b1.STATUS = 'D' AND (b2.STATUS <> 'D' OR b2 IS NULL) OR
b2.STATUS = 'D' AND (b1.STATUS <> 'D' OR b1 IS NULL))
WHERE b2.EMPNO /* or whatever non-nullable column there is */ IS NULL
Alternatively, though, you could use grouping:
SELECT a.EMPNO
FROM TABLE_A a
LEFT JOIN TABLE_B b ON a.EMPNO = b1.EMPNO
GROUP BY a.EMPNO
HAVING 0 IN (COUNT(CASE b.STATUS WHEN 'D' THEN 1 ELSE NULL END),
COUNT(CASE b.STATUS WHEN 'D' THEN NULL ELSE 1 END))