SQL help, left join with filter

SQL help, left join with filter - sql

Please have a look at the SQL in the SQLFiddle link below.
http://sqlfiddle.com/#!9/22e094/4
My goal is to get all the records from Table1, and if SecId exists in Table2, join only if the status is 'Y'.
Result should be that it pulls from table 1: ID 1 and 2. And for ID 1, it successfully left joins Table2 and pulls 'Y'
As you can see in the fiddle, I tried 3 different ways but can't seem to get it.
It's got me stumped... Help would be awesome! :)

The left join, with just the join condition in the ON clause, is fine.
But then, you said you want to keep the rows where status is 'Y' only when the secid exists in table2. So this means you also want to keep the rows where the secid from table2 is NULL.
select * won't fly, because you have columns by the same name, secid, in both tables. You must distinguish them (by giving them aliases - or at least one of them; I gave aliases to both of them) if you need to reference them in the where clause, or anywhere else in the query, to break the ambiguity. And, since the aliases can only be given in the SELECT clause, which is evaluated after the WHERE clause, you need to do the left join in a subquery.
select id, secid_a, secid_b, status
from (
select a.id, a.secid as secid_a, b.secid as secid_b, b.status
from table1 a left join table2 b
on a.secid = b.secid
)
where status = 'Y' or secid_b is null;

Related

sql left join explanation

Found this code example online regarding sql left joins and I want to make sure i get it correctly ( since I am no expert )
SELECT table1.column1, table2.column2...
FROM table1
LEFT JOIN table2
ON table1.common_field = table2.common_field AND table1.common_field_2 = table2.common_field_2
WHERE table1.column3 = ... AND table2.common_field IS NULL
My question comes for the AND table2.common_field IS NULL part and how it affects the ON above.
For me it seems that join result will contain only those that they exist on table1, but not on table2 based on the common_field.
Is that correct? Can it be written simpler since the above seems confusing to me.

The first step in any SQL development is to check the data that is actually stored in the tables you intend to use in your query.
How the data is stored will affect the results of the query, particularly when filtering for NULLs or checking for the existence of a row.
Using EXISTS or NOT EXISTS to check for existence/non-existence of one or more rows is very effective, providing the WHERE clause within the EXISTS sub-query doesn't have conflicting logic (e.g. NOT EXISTS and <> are used together), which can be confusing and produce results that are difficult to test.
Does table2.common_field contain any NULLs? If it does, it would be wise to filter on those in a nested query, CTE or view first, then use the results of that in the main query.
If table2.common_field doesn't contain NULLs or has a NOT NULL constraint, then perhaps you are using table2.common_field IS NULL to filter on the results of the LEFT JOIN, where there is no match on the join criteria for table2. If this is the case and you want to stick with using LEFT JOIN, I recommend to nest your query and filter on the NULL in the outer query.
Here's a couple of options:
Option 1: Use LEFT JOIN, filter on NULL in the outer query.
Note the careful use of an alias for table2.common_field which is important.
SELECT
result.*
FROM
(
SELECT table1.column1, table2.column2, table2.common_field as table2_common_field...
FROM table1
LEFT JOIN table2
ON table1.common_field = table2.common_field AND table1.common_field_2 = table2.common_field_2
WHERE table1.column3 = ...
) result
WHERE result.table2_common_field IS NULL;
Option 2 (recommended): Use NOT EXISTS.
SELECT table1.column1, table2.column2...
FROM table1
WHERE NOT EXISTS (
select 1
from table2
where table2.common_field = table1.common_field
AND table2.common_field_2 = table1.common_field_2
)
AND table1.column3 = ...

SQL filter LEFT TABLE before left join

I have read a number of posts from SO and I understand the differences between filtering in the where clause and on clause. But most of those examples are filtering on the RIGHT table (when using left join). If I have a query such as below:
select * from tableA A left join tableB B on A.ID = B.ID and A.ID = 20
The return values are not what I expected. I would have thought it first filters the left table and fetches only rows with ID = 20 and then do a left join with tableB.
Of course, this should be technically the same as doing:
select * from tableA A left join table B on A.ID = B.ID where A.ID = 20
But I thought the performance would be better if you could filter the table before doing a join. Can someone enlighten me on how this SQL is processed and help me understand this thoroughly.

A left join follows a simple rule. It keeps all the rows in the first table. The values of columns depend on the on clause. If there is no match, then the corresponding table's columns are NULL -- whether the first or second table.
So, for this query:
select *
from tableA A left join
tableB B
on A.ID = B.ID and A.ID = 20;
All the rows in A are in the result set, regardless of whether or not there is a match. When the id is not 20, then the rows and columns are still taken from A. However, the condition is false so the columns in B are NULL. This is a simple rule. It does not depend on whether the conditions are on the first table or the second table.
For this query:
select *
from tableA A left join
tableB B
on A.ID = B.ID
where A.ID = 20;
The from clause keeps all the rows in A. But then the where clause has its effect. And it filters the rows so on only id 20s are in the result set.
When using a left join:
Filter conditions on the first table go in the where clause.
Filter conditions on subsequent tables go in the on clause.

Where you have from tablea, you could put a subquery like from (select x.* from tablea X where x.value=20) TA
Then refer to TA like you did tablea previously.
Likely the query optimizer would do this for you.
Oracle should have a way to show the query plan. Put "Explain plan" before the sql statement. Look at the plan both ways and see what it does.

In your first SQL statement, A.ID=20 is not being joined to anything technically. Joins are used to connect two separate tables together, with the ON statement joining columns by associating them as keys.
WHERE statements allow the filtering of data by reducing the number of rows returned only where that value can be found under that particular column.

Comparing two datasets SQL SSRS 2005

I have two datasets on two seperate servers. They both pull one column of information each.
I would like to build a report showing the values of the rows that only appear in one of the datasets.
From what I have read, it seems I would like to do this on the SQL side, not the reporting side; I am not sure how to do that.
If someone could shed some light on how that is possible, I would really appreciate it.

You can use the NOT EXISTS clause to get the differences between the two tables.
SELECT
Column
FROM
DatabaseName.SchemaName.Table1
WHERE
NOT EXISTS
(
SELECT
Column
FROM
LinkedServerName.DatabaseName.SchemaName.Table2
WHERE
Table1.Column = Table2.Column --looks at equalities, and doesn't
--include them because of the
--NOT EXISTS clause
)
This will show the rows in Table1 that don't appear in Table2. You can reverse the table names to find the rows in Table2 that don't appear in Table1.
Edit: Made an edit to show what the case would be in the event of linked servers. Also, if you wanted to see all of the rows that are not shared in both tables at the same time, you can try something as in the below.
SELECT
Column, 'Table1' TableName
FROM
DatabaseName.SchemaName.Table1
WHERE
NOT EXISTS
(
SELECT
Column
FROM
LinkedServerName.DatabaseName.SchemaName.Table2
WHERE
Table1.Column = Table2.Column --looks at equalities, and doesn't
--include them because of the
--NOT EXISTS clause
)
UNION
SELECT
Column, 'Table2' TableName
FROM
LinkedServerName.DatabaseName.SchemaName.Table2
WHERE
NOT EXISTS
(
SELECT
Column
FROM
DatabaseName.SchemaName.Table1
WHERE
Table1.Column = Table2.Column
)

You can also use a left join:
select a.* from tableA a
left join tableB b
on a.PrimaryKey = b.ForeignKey
where b.ForeignKey is null
This query will return all records from tableA that do not have corresponding records in tableB.

If you want rows that appear in exactly one data set and you have a matching key on each table, then you can use a full outer join:
select *
from table1 t1 full outer join
table2 t2
on t1.key = t2.key
where t1.key is null and t2.key is not null or
t1.key is not null and t2.key is null
The where condition chooses the rows where exactly one match.
The problem with this query, though, is that you get lots of columns with nulls. One way to fix this is by going through the columns one by one in the SELECT clause.
select coalesce(t1.key, t2.key) as key, . . .
Another way to solve this problem is to use a union with a window function. This version brings together all the rows and counts the number of times that key appears:
select t.*
from (select t.*, count(*) over (partition by key) as keycnt
from ((select 'Table1' as which, t.*
from table1 t
) union all
(select 'Table2' as which, t.*
from table2 t
)
) t
) t
where keycnt = 1
This has the additional column specifying which table the value comes from. It also has an extra column, keycnt, with the value 1. If you have a composite key, you would just replace with the list of columns specifying a match between the two tables.

Can we add an empty Table in From clause

In my SQL query from clause there is a table (table 1)I have noticed if this table is empty my whole query results null.Also if i do not use this table in where and select clause.
What is reason behind this and how to prevent this.
select ps.ProductName from
productsale ps,StockTransfer dtc
where
ps.productcode = '010134600223'
and ps.productcode=
(case when ps.productcode=dtc.productcode then
dtc.productcode else ps.productcode
end)
This query works fine if I not add StockTransfer table in form clause(If StockTransfer table is empty) else it works fine.

Depends on how you did your join.
If you do
select * from tablea join tableb on tablea.id=tableb.id
and tableb is empty, then nothing could result as nothing was there.
left or right joins depending on which is the empty table, work to do exactly that.
left join will return all data in the first table and any data from the second table that matches so
select * from tablea left join tableb on tablea.id=tableb.id
would return in short, all but just tablea data, because thats all there is (but any additional fields you asked for from tableb, but obviously no data as there was none)

You do a cross join. That means that for every row in Table1 you will get all rows in Table2. So if Table1 has 4 rows and Table2 has 2 rows you will get 8 rows in the result set (4*2=8). If Table2 contains 0 rows you will get 0 rows in result set (4*0=0).
To fix this you need to add a join condition.

Getting distinct rows from a left outer join

I am building an application which dynamically generates sql to search for rows of a particular Table (this is the main domain class, like an Employee).
There are three tables Table1, Table2 and Table1Table2Map.
Table1 has a many to many relationship with Table2, and is mapped through Table1Table2Map table. But since Table1 is my main table the relationship is virtually like a one to many.
My app generates a sql which basically gives a result set containing rows from all these tables. The select clause and joins dont change whereas the where clause is generated based on user interaction. In any case I dont want duplicate rows of Table1 in my result set as it is the main table for result display. Right now the query that is getting generated is like this:
select distinct Table1.Id as Id, Table1.Name, Table2.Description from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id)
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
For simplicity I have excluded the where clause. The problem is when there are multiple rows in Table2 for Table1 even though I have said distinct of Table1.Id the result set has duplicate rows of Table1 as it has to select all the matching rows in Table2.
To elaborate more, consider that for a row in Table1 with Id = 1 there are two rows in Table1Table2Map (1, 1) and (1, 2) mapping Table1 to two rows in Table2 with ids 1, 2. The above mentioned query returns duplicate rows for this case. Now I want the query to return Table1 row with Id 1 only once. This is because there is only one row in Table2 that is like an active value for the corresponding entry in Table1 (this information is in Mapping table).
Is there a way I can avoid getting duplicate rows of Table1.
I think there is some basic problem in the way I am trying to solve the problem, but I am not able to find out what it is. Thanks in advance.

Try:
left outer join (select distinct YOUR_COLUMNS_HERE ...) SUBQUERY_ALIAS on ...
In other words, don't join directly against the table, join against a sub-query that limits the rows you join against.

You can use GROUP BY on Table1.Id ,and that will get rid off the extra rows. You wouldn't need to worry about any mechanics on join side.
I came up with this solution in a huge query and it this solution didnt effect the query time much.
NOTE : I'm answering this question 3 years after its been asked but this may help someone i believe.

You can re-write your left joins to be outer applies, so that you can use a top 1 and an order by as follows:
select Table1.Id as Id, Table1.Name, Table2.Description
from Table1
outer apply (
select top 1 *
from Table1Table2Map
where (Table1Table2Map.Table1Id = Table1.Id) and Table1Table2Map.IsActive = 1
order by somethingCol
) t1t2
outer apply (
select top 1 *
from Table2
where (Table2.Id = Table1Table2Map.Table2Id)
) t2;
Note that an outer apply without a "top" or an "order by" is exactly equivalent to a left outer join, it just gives you a little more control. (cross apply is equivalent to an inner join).
You can also do something similar using the row_number() function:
select * from (
select distinct Table1.Id as Id, Table1.Name, Table2.Description,
rowNum = row_number() over ( partition by table1.id order by something )
from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id)
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
) x
where rowNum = 1;
Most of this doesn't apply if the IsActive flag can narrow down your other tables to one row, but they might come in useful for you.

To elaborate on one point: you said that there is only one "active" row in Table2 per row in Table1. Is that row not marked as active such that you could put it in the where clause? Or is there some magic in the dynamic conditions supplied by the user that determines what's active and what isn't.
If you don't need to select anything from Table2 the solution is relatively simply in that you can use the EXISTS function but since you've put TAble2.Description in the clause I'll assume that's not the case.
Basically what separates the relevant rows in Table2 from the irrelevant ones? Is it an active flag or a dynamic condition? The first row? That's really how you should be removing duplicates.
DISTINCT clauses tend to be overused. That may not be the case here but it sounds like it's possible that you're trying to hack out the results you want with DISTINCT rather than solving the real problem, which is a fairly common problem.

You have to include activity clause into your join (and no need for distinct):
select Table1.Id as Id, Table1.Name, Table2.Description from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id) and Table1Table2Map.IsActive = 1
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)

If you want to display multiple rows from table2 you will have duplicate data from table1 displayed. If you wanted to you could use an aggregate function (IE Max, Min) on table2, this would eliminate the duplicate rows from table1, but would also hide some of the data from table2.
See also my answer on question #70161 for additional explanation

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas