SQL Join With Fallback

Given
CREATE TABLE Addresses (
    Id INT NOT NULL,
    Zip NVARCHAR(5) NULL,
    ZipPlus4 NVARCHAR(9) NULL
);
CREATE TABLE ZipLookup (
    Zip NVARCHAR(5) NULL,
    Code NVARCHAR(10) NULL
);
CREATE TABLE ZipPlus4Lookup (
    ZipPlus4 NVARCHAR(9) NULL,
    Code NVARCHAR(10) NULL
);
And data like
Addresses
1 | 92123 | 921234444
ZipLookup
92123 | Type A
ZipPlus4Lookup
921234444 | Type B
Is it possible to construct a query such that:
A given row in Addresses is outer joined to ZipPlus4Lookup if there is a match
Addresses.ZipPlus4 = ZipPlus4Lookup.ZipPlus4
Otherwise, the given row in Addresses is outer joined to ZipLookup if there is a match
Addresses.Zip = ZipLookup.Zip
Otherwise neither table is outer joined
In plain English, the Addresses table has a Zip and a ZipPlus4 column and I need to look up a code using the most precise match. If there's a match on Zip+4, use the code from that match. Otherwise, use the code from a Zip match.
I wish I had an attempted query to share, but with this one I don't know where to start.

This basic query will work:
SELECT
A.*,
Code = IsNull(Z4.Code, Z.Code)
FROM
dbo.Addresses A
LEFT JOIN dbo.ZipPlus4Lookup Z4
ON A.ZipPlus4 = Z4.ZipPlus4
LEFT JOIN dbo.ZipLookup Z
ON A.Zip = Z.Zip
AND Z4.ZipPlus4 IS NULL;
Or you could try something like this:
SELECT
A.*,
Z.Code
FROM
dbo.Addresses A
OUTER APPLY (
SELECT TOP 1 Code
FROM (
SELECT 0, Code FROM dbo.ZipPlus4Lookup Z4
WHERE A.ZipPlus4 = Z4.ZipPlus4
UNION ALL
SELECT 1, Code FROM dbo.ZipLookup Z
WHERE A.Zip = Z.Zip
) X (Seq, Code)
ORDER BY X.Seq
) Z;
They may have different performance characteristics, so it's worth testing both. My guess is that the second query is unnecessary, although it is conceivable that it performs better in some cases.
See these in action in a SQL Fiddle.
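As a rough way to check the fallback behaviour, the first query can be run against an in-memory SQLite database through Python's sqlite3 module. This is only a sketch: SQLite stands in for SQL Server, COALESCE replaces IsNull, the column types are simplified to TEXT, and address rows 2 and 3 are hypothetical additions that exercise the "Zip only" and "no match" cases.

```python
import sqlite3


def lookup_codes(conn):
    """Return (Id, Code) per address, preferring a Zip+4 match over a Zip match."""
    return conn.execute(
        """
        SELECT A.Id, COALESCE(Z4.Code, Z.Code) AS Code
        FROM Addresses A
        LEFT JOIN ZipPlus4Lookup Z4 ON A.ZipPlus4 = Z4.ZipPlus4
        LEFT JOIN ZipLookup Z ON A.Zip = Z.Zip AND Z4.ZipPlus4 IS NULL
        ORDER BY A.Id
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Addresses (Id INT NOT NULL, Zip TEXT, ZipPlus4 TEXT);
    CREATE TABLE ZipLookup (Zip TEXT, Code TEXT);
    CREATE TABLE ZipPlus4Lookup (ZipPlus4 TEXT, Code TEXT);
    INSERT INTO Addresses VALUES (1, '92123', '921234444');  -- row from the question
    INSERT INTO Addresses VALUES (2, '92123', '921239999');  -- hypothetical: Zip match only
    INSERT INTO Addresses VALUES (3, '10001', '100010001');  -- hypothetical: no match at all
    INSERT INTO ZipLookup VALUES ('92123', 'Type A');
    INSERT INTO ZipPlus4Lookup VALUES ('921234444', 'Type B');
    """
)

rows = lookup_codes(conn)
print(rows)
```

Address 1 gets the Zip+4 code, address 2 falls back to the Zip code, and address 3 keeps a NULL code because neither lookup matches.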

Related

Don't select rows where column A is duplicated AND any row of column B is a specific value

I'm working on generating a report that merges multiple tables. The report must show only projects that did not have any document marked 'Not Received'. These document markings are stored in a table that lists each document on its own row, so when it is joined to my other table it creates multiple rows for the same project. For example, the following table:
Project Number | ChecklistValue
565 | Received
565 | Not Received
465 | Received
465 | Not Applicable
As you can see, only two projects are listed in this table, but the desired output is:
Project Number | Other Info
465 | etc
I do not need the checklist value on the actual report, so I can use GROUP BY to combine all the good rows. The issue is that this would still include project 565, even if I add something like WHERE ChecklistValue <> 'Not Received'. Project 565 needs to be hidden from the report entirely, because one of its rows contains 'Not Received'.
So that's my actual question: how do I exclude every row for any project number that has at least one row containing 'Not Received'?
I'm adding the entire query, with generalized names, below:
SELECT
Project Number
,Name
,Contractor
,ABS(DATEDIFF(day,(ActualDate),(EstDate))) AS DelayPeriod
,S.NoteDate
,S.FinalAppDate
,Status
,S.ONE
,S.TWO
,S.THREE
,S.FOUR
,CH.ChecklistValue
FROM [DB1] A
INNER JOIN [DB2] C ON A.Contractor = C.Contractor
INNER JOIN [DB3] S ON A.AppID = S.AppID
INNER JOIN [DB4] LS ON S.StatusID = LS.StatusID
LEFT OUTER JOIN [DB5] CH ON A.AppID = CH.AppID AND CH.OtherID = 1
WHERE C.TypeID = 4 AND A.YEAR = 2022, AND S.THING = 1 AND
(CH.CheckListValue IS NULL OR A.AppID NOT IN (SELECT * FROM [DB5] WHERE
CheckListValue = 'Not Reveived'))
GROUP BY Project Number,Name,Contractor,ABS(DATEDIFF(day,(ActualDate),(EstDate))) AS DelayPeriod,S.NoteDate,S.FinalAppDate,Status,S.ONE,S.TWO,S.THREE,S.FOUR
The last portion of the WHERE clause was added from a suggestion, but I'm clearly not implementing it correctly, as it raises an error.
You can use NOT IN, like this:
create table test(
num int,
description varchar(20)
);
insert into test(num,description)
values(565,'Received'),
(565,'Not Received'),
(465,'Received'),
(465,'Not Applicable');
select *
from test
where num not in
(
select num -- Only select one column here
from test
where description = 'Not Received'
);
Results:
+-----+----------------+
| num | description    |
+-----+----------------+
| 465 | Received       |
| 465 | Not Applicable |
+-----+----------------+
db<>fiddle. This was run on SQL Server, but it works on other DBMSs as well.
So in your query you should have (in my understanding):
OR A.AppID NOT IN
(
SELECT AppID -- Select a single column, not *
FROM [DB5]
WHERE CheckListValue = 'Not Received' -- spelling fixed; the question's query has 'Not Reveived'
)
Another way to do it is with a CTE, although it looks more complicated at first glance:
with x as(
select num
from test
where description = 'Not Received'
)
select t.num, t.description
from test t
left join x
on t.num = x.num
where x.num is null
First I create a CTE containing the num values whose description is 'Not Received'. Then I select from the test table and left join it to the CTE, keeping only the rows whose num is not in the CTE by filtering on x.num IS NULL. This returns only the rows for 465.
Which one is better? I don't know. Sometimes the join is faster and sometimes NOT IN is; you can find more on this in this post.
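To see that the two approaches agree on the sample data, both can be run side by side. The sketch below assumes SQLite (through Python's sqlite3 module) rather than SQL Server; the table and values are the ones from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE test (num INT, description TEXT);
    INSERT INTO test VALUES (565, 'Received'), (565, 'Not Received'),
                            (465, 'Received'), (465, 'Not Applicable');
    """
)

# Approach 1: exclude every num that appears in the 'Not Received' subquery.
not_in = conn.execute(
    """
    SELECT num, description FROM test
    WHERE num NOT IN (SELECT num FROM test WHERE description = 'Not Received')
    ORDER BY description
    """
).fetchall()

# Approach 2: anti-join against a CTE and keep only the unmatched rows.
anti_join = conn.execute(
    """
    WITH x AS (SELECT num FROM test WHERE description = 'Not Received')
    SELECT t.num, t.description
    FROM test t
    LEFT JOIN x ON t.num = x.num
    WHERE x.num IS NULL
    ORDER BY t.description
    """
).fetchall()

print(not_in)
print(anti_join)
```

One difference to keep in mind: if the subquery can return a NULL num, NOT IN matches no rows at all, whereas the anti-join is unaffected.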

Use WHERE + AND + CASE + IN when crafting T-SQL

I have a stored procedure that has a few variables that may or may not be passed. They are lists of PKs from other tables (so FKs), formatted as a string of comma-separated values.
Here's what the query essentially looks like:
DECLARE @SomeIds VARCHAR(MAX) = ''
CREATE TABLE #TempIds (Id INT)
IF (@SomeIds = '' OR @SomeIds IS NULL) INSERT INTO #TempIds VALUES (NULL)
ELSE INSERT INTO #TempIds SELECT * FROM SplitString(@SomeIds,',') -- SplitString() is a user function
SELECT cont.varchar_LastName AS LastName
,cred.varchar_CredentialName AS CredentialName
FROM [dbo].[tbl_Contacts] AS cont
LEFT JOIN [dbo].[tbl_ContactsCredentials] AS cc ON cont.pk_int_Id = cc.fk_ContactId
LEFT JOIN [dbo].[tbl_Credentials] AS cred ON cc.fk_CredentialId = cred.pk_int_Id
So this query basically gives me a full list of contacts both with and without a credential name. I don't have a WHERE clause, so not surprised.
I get data basically like:
LastName | CredentialName
---------------------------
Stevens | Admin
Arnolds | User
Bishop | NULL
Evans | NULL
So if I add a WHERE clause like this:
WHERE cred.pk_int_Id IN (SELECT * FROM #TempIds)
I get zero results.
When I run this:
SELECT * FROM #TempIds
I get this:
Id
-----------
NULL
When I run it with "real values" in @SomeIds like '1,2' then it works fine.
I presume this is because my WHERE clause is looking in the cred table and there are no NULL values in that table, so that's why I'm not getting anything.
But I'm not sure how I fix it?
I guess I really want to do something like this:
WHERE CredentialName IN (SELECT * FROM #TempIds)
But I believe to do that, I'd have to run the first query into another temp table, then run a second query on that table.
Any help is greatly appreciated.
You can avoid a temp table:
WHERE (NULLIF(@SomeIds,'') IS NULL OR cred.pk_int_Id IN (SELECT value FROM SplitString(@SomeIds,',')))
Or, if your SQL Server version supports STRING_SPLIT:
WHERE (NULLIF(@SomeIds,'') IS NULL OR cred.pk_int_Id IN (SELECT value FROM STRING_SPLIT(@SomeIds,',')))
Then leave @SomeIds uninitialized (NULL) or empty to get all records.
I would consider using a UNION for this. Putting an OR in a WHERE clause can make for some bad execution plans, and often makes indexes unusable.
SELECT cont.varchar_LastName AS LastName
,cred.varchar_CredentialName AS CredentialName
FROM [dbo].[tbl_Contacts] AS cont
LEFT JOIN [dbo].[tbl_ContactsCredentials] AS cc
ON cont.pk_int_Id = cc.fk_ContactId
LEFT JOIN [dbo].[tbl_Credentials] AS cred
ON cc.fk_CredentialId = cred.pk_int_Id
WHERE cred.pk_int_Id IN (SELECT value FROM STRING_SPLIT(@SomeIds,','))
UNION ALL
SELECT cont.varchar_LastName AS LastName
,cred.varchar_CredentialName AS CredentialName
FROM [dbo].[tbl_Contacts] AS cont
LEFT JOIN [dbo].[tbl_ContactsCredentials] AS cc
ON cont.pk_int_Id = cc.fk_ContactId
LEFT JOIN [dbo].[tbl_Credentials] AS cred
ON cc.fk_CredentialId = cred.pk_int_Id
WHERE NULLIF(@SomeIds,'') IS NULL
I talk about using OR in UPDATE statements here, but the same logic applies to SELECT.

How to get different data from two different tables in SQL query?

I have two tables named Soft and Web. Each contains multiple rows, and I want only the rows that differ between the two. For example:
The Soft table contains 5 rows.
The Web table also contains 5 rows.
I want the output to contain only the rows that differ.
I wrote a query, but unfortunately it did not succeed. Here it is:
SELECT DISTINCT soft.GSTNo AS SoftGST
,web.GSTNo AS WebGST
,soft.InvoiceNumber AS SoftInvoice
,web.InvoiceNumber AS WebInvoice
,soft.Rate AS SoftRate
,web.Rate AS WebRate
FROM soft
LEFT OUTER JOIN web ON web.GstNo = soft.GSTNo
AND web.InvoiceNumber = soft.invoicenumber
AND web.rate = soft.rate
I also tried an inner join, but that did not work either.
You can achieve this with EXCEPT:
;WITH cte_soft AS
(SELECT * FROM soft
EXCEPT
SELECT * FROM web)
,cte_web AS
(SELECT * FROM web
EXCEPT
SELECT * FROM soft)
SELECT *
FROM
(SELECT gst softgst, NULL webgst, invoice softinvoice, NULL webinvoice, rate softrate, NULL webrate
FROM cte_soft
UNION ALL
SELECT NULL, gst, NULL, invoice, NULL , rate
FROM cte_web) tbl
ORDER BY coalesce(softgst, webgst),coalesce(softinvoice,webinvoice)
Fiddle
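The EXCEPT approach can be tried on a small data set as follows. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, the column names follow the answer's query (gst, invoice, rate), and the sample rows are hypothetical because the question's data is not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical sample rows: the first row matches, the second differs in rate.
    CREATE TABLE soft (gst TEXT, invoice TEXT, rate INT);
    CREATE TABLE web  (gst TEXT, invoice TEXT, rate INT);
    INSERT INTO soft VALUES ('G1', 'INV1', 5), ('G2', 'INV2', 12);
    INSERT INTO web  VALUES ('G1', 'INV1', 5), ('G2', 'INV2', 18);
    """
)

diff = conn.execute(
    """
    WITH cte_soft AS (SELECT * FROM soft EXCEPT SELECT * FROM web),
         cte_web  AS (SELECT * FROM web  EXCEPT SELECT * FROM soft)
    SELECT *
    FROM (
        SELECT gst AS softgst, NULL AS webgst,
               invoice AS softinvoice, NULL AS webinvoice,
               rate AS softrate, NULL AS webrate
        FROM cte_soft
        UNION ALL
        SELECT NULL, gst, NULL, invoice, NULL, rate
        FROM cte_web
    ) tbl
    -- The second sort key puts the soft row before the web row for the same key.
    ORDER BY COALESCE(softgst, webgst), softgst IS NULL
    """
).fetchall()

for row in diff:
    print(row)
```

The matching row ('G1', 'INV1', 5) disappears, and the row that differs in rate shows up once per side.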
You can use full join:
SELECT s.gst as softgst, w.gst as webgst,
s.invoice as softinvoice, w.invoice as webinvoice,
s.rate as softrate, w.rate as webrate
FROM soft s FULL JOIN
web w
ON s.gst = w.gst AND s.invoice = w.invoice AND s.rate = w.rate
WHERE s.gst IS NULL OR w.gst IS NULL
ORDER BY COALESCE(s.gst, w.gst), COALESCE(s.invoice, w.invoice);
No subqueries or CTEs are needed. This is really just a slight variant of your query.

SQL query: Iterate over values in table and use them in subquery

I have a simple SQL table containing some values, for example:
id | value (table 'values')
----------
0 | 4
1 | 7
2 | 9
I want to iterate over these values, and use them in a query like so:
SELECT value[0], x1
FROM (some subquery where value[0] is used)
UNION
SELECT value[1], x2
FROM (some subquery where value[1] is used)
...
etc
In order to get a result set like this:
4 | x1
7 | x2
9 | x3
It has to be in SQL as it will actually represent a database view. Of course the real query is a lot more complicated, but I tried to simplify the question while keeping the essence as much as possible.
I think I have to select from values and join the subquery, but as the value should be used in the subquery I'm lost on how to accomplish this.
Edit: I oversimplified my question; in reality I want to have 2 rows from the subquery and not only one.
Edit 2: As suggested I'm posting the real query. I simplified it a bit to make it clearer, but it's a working query and the problem is there. Note that I have hardcoded the value '2' in this query two times. I want to replace that with values from a different table, in the example table above I would want a result set of the combined results of this query with 4, 7 and 9 as values instead of the currently hardcoded 2.
SELECT x.fantasycoach_id, SUM(round_points)
FROM (
SELECT DISTINCT fc.id AS fantasycoach_id,
ffv.formation_id AS formation_id,
fpc.round_sequence AS round_sequence,
round_points,
fpc.fantasyplayer_id
FROM fantasyworld_FantasyCoach AS fc
LEFT JOIN fantasyworld_fantasyformation AS ff ON ff.id = (
SELECT MAX(fantasyworld_fantasyformationvalidity.formation_id)
FROM fantasyworld_fantasyformationvalidity
LEFT JOIN realworld_round AS _rr ON _rr.id = round_id
LEFT JOIN fantasyworld_fantasyformation AS _ff ON _ff.id = formation_id
WHERE is_valid = TRUE
AND _ff.coach_id = fc.id
AND _rr.sequence <= 2 /* HARDCODED USE OF VALUE */
)
LEFT JOIN fantasyworld_FantasyFormationPlayer AS ffp
ON ffp.formation_id = ff.id
LEFT JOIN dbcache_fantasyplayercache AS fpc
ON ffp.player_id = fpc.fantasyplayer_id
AND fpc.round_sequence = 2 /* HARDCODED USE OF VALUE */
LEFT JOIN fantasyworld_fantasyformationvalidity AS ffv
ON ffv.formation_id = ff.id
) x
GROUP BY fantasycoach_id
Edit 3: I'm using PostgreSQL.
SQL works with tables as a whole, which basically involves set operations. There is no explicit iteration, and generally no need for any. In particular, the most straightforward implementation of what you described would be this:
SELECT value, (some subquery where value is used) AS x
FROM values
Do note, however, that a correlated subquery such as that is very hard on query performance. Depending on the details of what you're trying to do, it may well be possible to structure it around a simple join, an uncorrelated subquery, or a similar, better-performing alternative.
Update:
In view of the update to the question indicating that the subquery is expected to yield multiple rows for each value in table values, contrary to the example results, it seems a better approach would be to just rewrite the subquery as the main query. If it does not already do so (and maybe even if it does) then it would join table values as another base table.
Update 2:
Given the real query now presented, this is how the values from table values could be incorporated into it:
SELECT x.fantasycoach_id, SUM(round_points) FROM
(
SELECT DISTINCT
fc.id AS fantasycoach_id,
ffv.formation_id AS formation_id,
fpc.round_sequence AS round_sequence,
round_points,
fpc.fantasyplayer_id
FROM fantasyworld_FantasyCoach AS fc
-- one row for each combination of coach and value:
CROSS JOIN values
LEFT JOIN fantasyworld_fantasyformation AS ff
ON ff.id = (
SELECT MAX(fantasyworld_fantasyformationvalidity.formation_id)
FROM fantasyworld_fantasyformationvalidity
LEFT JOIN realworld_round AS _rr
ON _rr.id = round_id
LEFT JOIN fantasyworld_fantasyformation AS _ff
ON _ff.id = formation_id
WHERE is_valid = TRUE
AND _ff.coach_id = fc.id
-- use the value obtained from values:
AND _rr.sequence <= values.value
)
LEFT JOIN fantasyworld_FantasyFormationPlayer AS ffp
ON ffp.formation_id = ff.id
LEFT JOIN dbcache_fantasyplayercache AS fpc
ON ffp.player_id = fpc.fantasyplayer_id
-- use the value obtained from values again:
AND fpc.round_sequence = values.value
LEFT JOIN fantasyworld_fantasyformationvalidity AS ffv
ON ffv.formation_id = ff.id
) x
GROUP BY fantasycoach_id
Note in particular the CROSS JOIN, which forms the cross product of two tables. It is equivalent to an INNER JOIN whose join condition is always true (in PostgreSQL, INNER JOIN ... ON TRUE), and it can be written that way if desired.
The overall query could be at least a bit simplified, but I do not do so because it is a working example rather than an actual production query, so it is unclear what other changes would translate to the actual application.
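The CROSS JOIN idea can be seen in isolation on a much smaller case. The sketch below assumes SQLite through Python's sqlite3 module rather than PostgreSQL, and both tables are hypothetical: vals stands in for the question's table of values, and scores for whatever the real subquery reads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical stand-ins for the question's tables.
    CREATE TABLE vals (id INT, value INT);
    CREATE TABLE scores (round_sequence INT, points INT);
    INSERT INTO vals VALUES (0, 4), (1, 7), (2, 9);
    INSERT INTO scores VALUES (3, 10), (4, 20), (7, 5), (8, 1);
    """
)

totals = conn.execute(
    """
    SELECT v.value, SUM(s.points) AS total
    FROM vals v
    CROSS JOIN scores s                 -- every (value, score) combination
    WHERE s.round_sequence <= v.value   -- the value replaces the hardcoded constant
    GROUP BY v.value
    ORDER BY v.value
    """
).fetchall()

print(totals)
```

Each row of vals produces its own group, so the one query returns what would otherwise need a separate query per hardcoded value.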
In this example I create two tables. Notice how the outer table has an alias that is used in the inner SELECT:
SQL Fiddle Demo
SELECT T.[value], (SELECT [property] FROM Table2 P WHERE P.[value] = T.[value])
FROM Table1 T
A join is usually the better option for performance:
SELECT T.[value], P.[property]
FROM Table1 T
INNER JOIN Table2 p
on P.[value] = T.[value];
Table2 can be a query (a derived table) instead of a real table.
Third Option
Using a cte to calculate your values and then join back to the main table. This way you have the subquery logic separated from your final query.
WITH cte AS (
SELECT
T.[value],
T.[value] * T.[value] as property
FROM Table1 T
)
SELECT T.[value], C.[property]
FROM Table1 T
INNER JOIN cte C
on T.[value] = C.[value];
It might be helpful to extract the computation into a function that is called in the SELECT clause and executed for each row of the result set.
Here's the documentation for CREATE FUNCTION for SQL Server. It's probably similar to whatever database system you're using, and if not you can easily Google for it.
Here's an example of creating a function and using it in a query:
CREATE FUNCTION dbo.DoComputation(@parameter1 int)
RETURNS int
AS
BEGIN
-- Do some calculations here and return the function result.
-- This example returns the value of @parameter1 squared.
-- You can add additional parameters to the function definition if needed
DECLARE @Result int
SET @Result = @parameter1 * @parameter1
RETURN @Result
END
Here is an example of using the example function above in a query.
-- SQL Server requires a schema-qualified name when calling a scalar function.
SELECT v.value, dbo.DoComputation(v.value) AS ComputedValue
FROM [Values] v
ORDER BY v.value

querying 2 tables with the same spec for the differences

I recently had to solve this problem and find I've needed this info many times in the past so I thought I would post it. Assuming the following table def, how would you write a query to find all differences between the two?
table def:
CREATE TABLE feed_tbl
(
code varchar(15),
name varchar(40),
status char(1),
update char(1),
CONSTRAINT feed_tbl_PK PRIMARY KEY (code)
)
CREATE TABLE data_tbl
(
code varchar(15),
name varchar(40),
status char(1),
update char(1),
CONSTRAINT data_tbl_PK PRIMARY KEY (code)
)
Here is my solution, as a view using three queries joined by unions. The diff_type column specifies how the record needs to be updated: deleted from _data (2), updated in _data (1), or added to _data (0).
CREATE VIEW delta_vw AS (
SELECT feed_tbl.code, feed_tbl.name, feed_tbl.status, feed_tbl.update, 0 as diff_type
FROM feed_tbl LEFT OUTER JOIN
data_tbl ON feed_tbl.code = data_tbl.code
WHERE (data_tbl.code IS NULL)
UNION
SELECT feed_tbl.code, feed_tbl.name, feed_tbl.status, feed_tbl.update, 1 as diff_type
FROM data_tbl RIGHT OUTER JOIN
feed_tbl ON data_tbl.code = feed_tbl.code
where (feed_tbl.name <> data_tbl.name) OR
(data_tbl.status <> feed_tbl.status) OR
(data_tbl.update <> feed_tbl.update)
UNION
SELECT data_tbl.code, data_tbl.name, data_tbl.status, data_tbl.update, 2 as diff_type
FROM data_tbl LEFT OUTER JOIN
feed_tbl ON data_tbl.code = feed_tbl.code
WHERE (feed_tbl.code IS NULL)
)
UNION will remove duplicates, so just UNION the two together, then search for anything with more than one entry. Given "code" as a primary key, you can say:
edit 0: modified to include differences in the PK field itself
edit 1: if you use this in real life, be sure to list the actual column names. Don't use dot-star, since the UNION operation requires result sets to have exactly matching columns. This example would break if you added or removed a column from one of the tables.
select dt.*
from
data_tbl dt
,(
select code
from
(
select * from feed_tbl
union
select * from data_tbl
) u --a derived table needs an alias in SQL Server and PostgreSQL
group by code
having count(*) > 1
) diffs --"diffs" will return all differences *except* those in the primary key itself
where diffs.code = dt.code
union --plus the ones that are only in feed, but not in data
select * from feed_tbl ft where not exists(select code from data_tbl dt where dt.code = ft.code)
union --plus the ones that are only in data, but not in feed
select * from data_tbl dt where not exists(select code from feed_tbl ft where ft.code = dt.code)
I would use a minor variation in the second union:
where (ISNULL(feed_tbl.name, 'NONAME') <> ISNULL(data_tbl.name, 'NONAME')) OR
(ISNULL(data_tbl.status, 'NOSTATUS') <> ISNULL(feed_tbl.status, 'NOSTATUS')) OR
(ISNULL(data_tbl.update, '12/31/2039') <> ISNULL(feed_tbl.update, '12/31/2039'))
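NULL does not equal NULL, which trips up many people: in SQL, any comparison involving NULL evaluates to UNKNOWN rather than true or false, so a plain <> test misses rows where only one side is NULL. This is standard SQL behavior, not something specific to SQL Server.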
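The NULL behaviour is easy to confirm directly. The sketch below assumes SQLite through Python's sqlite3 module and uses COALESCE, the standard counterpart of SQL Server's ISNULL; 'Bob' is just a made-up name, and the 'NONAME' sentinel is taken from the WHERE clause above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with NULL yields NULL (unknown) for both = and <>.
eq, ne = conn.execute("SELECT NULL = NULL, NULL <> NULL").fetchone()

# Replacing NULL with a sentinel makes the comparison definite.
both_null, one_null = conn.execute(
    """
    SELECT COALESCE(NULL, 'NONAME') <> COALESCE(NULL, 'NONAME'),
           COALESCE('Bob', 'NONAME') <> COALESCE(NULL, 'NONAME')
    """
).fetchone()

print(eq, ne, both_null, one_null)
```

With the sentinel in place, two NULL names count as equal and a NULL next to a real name counts as a difference. The sentinel must be a value that can never occur in the real data.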
You could also use a FULL OUTER JOIN with a CASE ... END expression for the diff_type column, along with the aforementioned WHERE clause.
That would probably achieve the same results, but in one query.