CASE WHEN statement across two tables where the referenced column in table 2 has non-unique values

I have a query that I have worked on, and only one section is giving me fits. I am trying to create a column within the query based on the values of two tables. I have tried CASE WHEN and it functions, but because of the non-unique values involved, the row count of the query increases compared with the original query without that column. For example, this is the CASE WHEN that I have written:
Select r.Id,
r.RequiredOn AT TIME ZONE 'UTC' AT TIME ZONE 'Central Standard Time'
as RequiredDate,
Concat(vs.Salutation, ' ',vs.FirstName, ' ', vs.LastName) as Name,
oo.Name as RequestingOrganization,
o.Name as Location,
Case
When r.IntendedOutcome = '1' Then 'T'
When r.IntendedOutcome = '2' Then 'R'
End as RequestType,
etr.TypeRequested,
Case
When etr.Identifier is not null then etr.Identifier
When etr.Identifier is null then ' '
End as Identifier,
f.OfferedOn,
f.OfferResponse,
r.DestinationCountryCodes,
o.Id,
CASE
WHEN o.Id = oir.OrganizationId AND oir.OrganizationRoleId =
'de51c814-f86d-49c9-941b-999a98be4894'
THEN 1
ELSE NULL
END AS Bk1
From [Request] r
Left Join Recovered etr
on etr.DistributionRequestId = r.Id
Left Join [Offer] f
on f.Id = etr.Id
Left Join [dbo].Contact vs
on vs.Id = r.SId
Left Join [dbo].Organization o
on o.Id = r.SLocationId or o.Id = r.RLocationId
Left Join [dbo].Organization oo
on oo.Id = r.RequestingOrganizationId
Left Join dbo.OrganizationInRole oir
on oir.OrganizationId = o.Id
Where f.Response = 'Accepted' or f.Response is NULL
The picture shows that the OrganizationId is not unique in this table. When an OrganizationId matches and the OrganizationRoleId is found, the query brings over every OrganizationRoleId row for that organization and adds them as extra rows, rather than just detecting that the organization has the particular role ID and flagging the one row I need.
The OrganizationRole column is non-unique and every organization can have multiple roles (sometimes 4-5). I need the following: if the OrganizationId is A and the matching OrganizationId in Table 2 has the identifier in the OrganizationRole column, then add a 1.
The Organization table (Table 2) has an OrganizationId column and an OrganizationRole column. The OrganizationId is non-unique: the same OrganizationId could appear in 5 consecutive rows because that organization has 5 roles.
The result I am getting is that the query pulls every role for each organization that has a match in that table. It added about 33% more rows to the query compared with the original.

When you say
... if the OrganizationId is A and the matching OrganizationId in Table 2 has the identifier in the OrganizationRole column, then add a 1.
Do you want a count of the number of times this condition is true? If so, you need to wrap your CASE in an aggregate function and group by the other columns.
Alternatively, as Stu suggests in the comments, you could pre-aggregate the OrganizationInRole table, filtering for the role you are actually interested in; something like
SELECT r.Id,
...
oir.RoleCount AS Bk1
FROM [Request] r
...
LEFT JOIN [dbo].Organization o
ON o.Id = r.SLocationId or o.Id = r.RLocationId
...
LEFT JOIN (
SELECT OrganizationId, COUNT(*) AS RoleCount
FROM dbo.OrganizationInRole
WHERE OrganizationRoleId = 'de51c814-f86d-49c9-941b-999a98be4894'
GROUP BY OrganizationId) AS oir ON oir.OrganizationId = o.Id
...
You can do this for any other table which has multiple related rows, reducing them to a single row to join to and removing the need for aggregation and grouping in the main query.
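The row fan-out and the pre-aggregated join can be reproduced on a toy dataset. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, reduces the schema to two tables, and the role ID 'target-role' and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for the real schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Organization (Id TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE OrganizationInRole (OrganizationId TEXT, OrganizationRoleId TEXT);
    INSERT INTO Organization VALUES ('A', 'Org A'), ('B', 'Org B');
    -- Org A holds three roles (one is the target); Org B holds one other role.
    INSERT INTO OrganizationInRole VALUES
        ('A', 'target-role'), ('A', 'role-2'), ('A', 'role-3'), ('B', 'role-2');
""")

# Plain join: one output row per (organization, role) pair, so Org A appears 3 times.
fanned_out = conn.execute("""
    SELECT o.Id,
           CASE WHEN oir.OrganizationRoleId = 'target-role' THEN 1 END AS Bk1
    FROM Organization o
    LEFT JOIN OrganizationInRole oir ON oir.OrganizationId = o.Id
""").fetchall()

# Pre-aggregated join: the derived table has at most one row per organization.
one_per_org = conn.execute("""
    SELECT o.Id, oir.RoleCount AS Bk1
    FROM Organization o
    LEFT JOIN (
        SELECT OrganizationId, COUNT(*) AS RoleCount
        FROM OrganizationInRole
        WHERE OrganizationRoleId = 'target-role'
        GROUP BY OrganizationId
    ) AS oir ON oir.OrganizationId = o.Id
    ORDER BY o.Id
""").fetchall()

print(len(fanned_out))  # 4 rows for 2 organizations
print(one_per_org)      # [('A', 1), ('B', None)]
```

Because the derived table is filtered and grouped before the join, the outer query keeps one row per organization no matter how many roles each one holds.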

Related

Is it possible to inner join 3 tables in SQL?

I'm trying to fetch certain data by using a column that exists in all 3 tables, I have three tables
[Txn].[TxnPaymentResponse]
[Txn].[Txn]
[Txn].[TxnLineItem]
Right now, if I run the query, I get the correct data, but only the UserID.
I also want to grab another piece of information (call it column X) from the 3rd table (TxnLineItem). That column doesn't exist in the first 2 tables. In this scenario, how can I perform an inner join and show that piece of information in the query?
DECLARE @CompletedTransactionSince DATETIME2(7) = '2022-09-13 00:00:00.000000'
SELECT DISTINCT
t.UserID
FROM
[Txn].[Txn] T WITH(NOLOCK)
INNER JOIN
[Txn].[TxnPaymentResponse] TPR WITH(NOLOCK) ON T.[TxnID] = TPR.[TxnID]
WHERE
TPR.[PaymentResponseType] = 'FINAL'
AND TPR.[AuthorizedAmount] > CONVERT(DECIMAL(9,3), 0)
AND (@CompletedTransactionSince IS NULL OR
T.[CreatedOn] > @CompletedTransactionSince)
Results from my query:
UserID
C1671FDA-70A8-4C07-BBDF-ACD06ADD145F
Table 3:
TxnID                                 | StandardProductCategory
6FE0D0D0-9959-41AA-9BF0-00000003DED8  | Carwash
D1B0EA51-C476-488C-A140-0000C1C7D099  | General
Suppose I add an inner join like this:
INNER JOIN
[Txn].[TxnLineItem] TXL WITH(NOLOCK) ON T.[TxnID] = TXL.[TxnID]
But I want to grab the X column that has the same transaction ID. I want to display UserID that has a Carwash only. I'm not sure whether it's possible to write another clause with an inner join.
As per "I want to display UserID that has a Carwash only", you need to exclude records.
Just add these EXISTS and NOT EXISTS clauses at the end of your query.
The EXISTS part keeps rows that have a Carwash line item.
The NOT EXISTS part excludes rows that have line items in any StandardProductCategory other than Carwash.
......
AND (@CompletedTransactionSince IS NULL OR
T.[CreatedOn] > @CompletedTransactionSince)
AND EXISTS ( SELECT 1 FROM [Txn].[TxnLineItem] TXL
WHERE TXL.[TxnID] = T.[TxnID]
AND TXL.StandardProductCategory = 'Carwash')
AND NOT EXISTS ( SELECT 1 FROM [Txn].[TxnLineItem] TXL
WHERE TXL.[TxnID] = T.[TxnID]
AND TXL.StandardProductCategory <> 'Carwash')
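To see how the EXISTS / NOT EXISTS pair behaves, here is a small sketch. It is an assumption-laden illustration rather than the asker's real schema: it runs on SQLite through Python's sqlite3 module, drops the payment-response table and the date filter, and uses made-up transaction and user IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Txn (TxnID TEXT PRIMARY KEY, UserID TEXT);
    CREATE TABLE TxnLineItem (TxnID TEXT, StandardProductCategory TEXT);
    INSERT INTO Txn VALUES ('t1', 'user-1'), ('t2', 'user-2'), ('t3', 'user-3');
    -- t1: Carwash only, t2: Carwash plus General, t3: General only.
    INSERT INTO TxnLineItem VALUES
        ('t1', 'Carwash'),
        ('t2', 'Carwash'), ('t2', 'General'),
        ('t3', 'General');
""")

# Keep transactions with a Carwash line item and no line item of any other category.
carwash_only_users = conn.execute("""
    SELECT DISTINCT T.UserID
    FROM Txn T
    WHERE EXISTS (SELECT 1 FROM TxnLineItem TXL
                  WHERE TXL.TxnID = T.TxnID
                    AND TXL.StandardProductCategory = 'Carwash')
      AND NOT EXISTS (SELECT 1 FROM TxnLineItem TXL
                      WHERE TXL.TxnID = T.TxnID
                        AND TXL.StandardProductCategory <> 'Carwash')
""").fetchall()

print(carwash_only_users)  # [('user-1',)]
```

Note that both subqueries are correlated on TxnID, so the filter works per transaction: a user with one Carwash-only transaction and another mixed transaction would still be returned for the first one.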

How to prioritise selection of one column value over another on join?

I have two tables, namely offers(containing columns id and user_id) and offer_maps (containing offer_id, user_id). I want to join both of the tables on offer_id, and the final selection should have user_id column populated by prioritising offer_maps' user_id column over offers' user_id column. For example, if offer_maps' user_id column is null and offers' user_id column has a value, the final user_id should have offers' user_id column. But if both are populated, then pick only offer_maps' user_id column value. How can I achieve this through sql query? Here's a sample which I wrote
select concat(offers.user_id, o.user_id) AS user_id
from offers
left join offer_maps o on offers.id::text = o.offer_id
This actually joins both the values of columns, but I need only one in case both exist.
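You can use COALESCE in this case. It returns its first non-null argument, so put offer_maps' column first to give it priority (the ::text cast suggests PostgreSQL, which has no two-argument ISNULL function):
select COALESCE(o.user_id, offers.user_id) AS user_id
from offers
left join offer_maps o on offers.id::text = o.offer_id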
--This assumes O.user_id is not nullable, but OM.user_id is:
SELECT O.id AS offer_id, COALESCE(OM.user_id, O.user_id) AS user_id
FROM offers AS O
LEFT JOIN offer_maps AS OM
ON OM.offer_id = O.id::text --OM.user_id wins when present; otherwise fall back to O.user_id.
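The prioritisation the question asks for can be checked on three sample offers. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 module (so the cast is written CAST(... AS TEXT) instead of ::text), and the user IDs are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offers (id INTEGER PRIMARY KEY, user_id TEXT);
    CREATE TABLE offer_maps (offer_id TEXT, user_id TEXT);
    INSERT INTO offers VALUES (1, 'offer-user-1'), (2, 'offer-user-2'), (3, 'offer-user-3');
    -- Offer 1 has a mapped user, offer 2 has a mapping with a NULL user, offer 3 has no mapping.
    INSERT INTO offer_maps VALUES ('1', 'map-user-1'), ('2', NULL);
""")

# COALESCE returns its first non-null argument, so offer_maps.user_id takes priority.
rows = conn.execute("""
    SELECT offers.id, COALESCE(o.user_id, offers.user_id) AS user_id
    FROM offers
    LEFT JOIN offer_maps o ON CAST(offers.id AS TEXT) = o.offer_id
    ORDER BY offers.id
""").fetchall()

print(rows)  # [(1, 'map-user-1'), (2, 'offer-user-2'), (3, 'offer-user-3')]
```

The argument order is what encodes the priority: swapping the two arguments would make offers.user_id win whenever it is populated.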

Conditional Table Join In SQL Server

I have a table named 'Task' with fields (Id int, TaskName nvarchar, AssigneeType int, AssigneeId int).
AssigneeType can contain 3 int values pointing to specific tables. (0 = User, 1 = Group, 2 = Location)
User, Group, Location are the tables
AssigneeId contains the Id of record in the table pointed by AssigneeType.
Problem Area
I want to extract all tasks by joining task table with the table pointed by AssigneeType.
If AssigneeType contains 0, I need to join Task table with User table.
If AssigneeType contains 1, I need to join Task table with Group table.
If AssigneeType contains 2, I need to join Task table with Location table.
Basically, I need to make a conditional join. I have found this, but I don't know how to incorporate it for my needs. I want to show TaskName and the joined table record's Name field.
Any Help?
This will do a left join to each table and give you the first non-null name it finds, using COALESCE:
SELECT Task.*, COALESCE([User].Name, [Group].Name, Location.Name) AS Name
FROM Task
LEFT JOIN [User]
ON [User].Id = Task.AssigneeId
AND Task.AssigneeType = 0
LEFT JOIN [Group]
ON [Group].Id = Task.AssigneeId
AND Task.AssigneeType = 1
LEFT JOIN Location
ON Location.Id = Task.AssigneeId
AND Task.AssigneeType = 2
You cannot choose per row whether to join a table; you must join all three tables. So outer join the three tables, each on its own AssigneeType, so that at most one of them matches. Then show that match with COALESCE.
select t.*, coalesce(u.name, g.name, l.name) as name
from task t
left join [user] u on t.assigneetype = 0 and t.assigneeid = u.id
left join [group] g on t.assigneetype = 1 and t.assigneeid = g.id
left join location l on t.assigneetype = 2 and t.assigneeid = l.id;
EDIT: I've corrected my answer and replaced backticks with brackets. Different dbms use different symbols in order to use reserved words such as 'GROUP' for table names. In SQL Server this should be brackets rather than backticks. However, it is always a bad idea to use reserved words for table names and columns, so you might want to change this, if you can.
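The three-way outer join can be tried out on a few rows. The sketch below is illustrative only: it uses SQLite through Python's sqlite3 module (SQLite also accepts bracket-quoted identifiers), and the names and the deliberately dangling AssigneeId 99 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE [User] (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE [Group] (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Location (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Task (Id INTEGER PRIMARY KEY, TaskName TEXT,
                       AssigneeType INTEGER, AssigneeId INTEGER);
    -- Every lookup table has a row with Id = 1, so only AssigneeType can disambiguate.
    INSERT INTO [User] VALUES (1, 'Alice');
    INSERT INTO [Group] VALUES (1, 'Admins');
    INSERT INTO Location VALUES (1, 'HQ');
    INSERT INTO Task VALUES
        (1, 'task-user', 0, 1), (2, 'task-group', 1, 1),
        (3, 'task-location', 2, 1), (4, 'task-orphan', 0, 99);
""")

# Each LEFT JOIN only matches when the task's AssigneeType points at that table.
rows = conn.execute("""
    SELECT Task.TaskName,
           COALESCE([User].Name, [Group].Name, Location.Name) AS Name
    FROM Task
    LEFT JOIN [User]   ON [User].Id = Task.AssigneeId   AND Task.AssigneeType = 0
    LEFT JOIN [Group]  ON [Group].Id = Task.AssigneeId  AND Task.AssigneeType = 1
    LEFT JOIN Location ON Location.Id = Task.AssigneeId AND Task.AssigneeType = 2
    ORDER BY Task.Id
""").fetchall()

print(rows)
# [('task-user', 'Alice'), ('task-group', 'Admins'),
#  ('task-location', 'HQ'), ('task-orphan', None)]
```

Putting the AssigneeType test in each ON clause, rather than in WHERE, is what keeps tasks with no matching assignee in the result with a NULL name.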

A messy SQL statement

I have a case where I want to select any database entry that has an invalid Country, Region, or Area ID. By invalid, I mean an ID for a country, region, or area that no longer exists in my tables. I have four tables: Properties, Countries, Regions, Areas.
I was thinking to do it like this:
SELECT * FROM Properties WHERE
Country_ID NOT IN
(
SELECT CountryID FROM Countries
)
OR
RegionID NOT IN
(
SELECT RegionID FROM Regions
)
OR
AreaID NOT IN
(
SELECT AreaID FROM Areas
)
Now, is my query right? And what do you suggest I do to achieve the same result with better performance?
Your query is in fact optimal.
The LEFT JOINs proposed by others are worse, as they select ALL values and then filter them out.
Most probably your subquery will be optimized to this:
SELECT *
FROM Properties p
WHERE NOT EXISTS
(
SELECT 1
FROM Countries i
WHERE i.CountryID = p.CountryID
)
OR
NOT EXISTS
(
SELECT 1
FROM Regions i
WHERE i.RegionID = p.RegionID
)
OR
NOT EXISTS
(
SELECT 1
FROM Areas i
WHERE i.AreaID = p.AreaID
)
, which you should use.
This query selects at most 1 row from each table and jumps to the next iteration as soon as the outcome is decided (i.e. if it does not find a Country for a given Property, it will not even bother checking for a Region).
Again, SQL Server is smart enough to build the same plan for this query and your original one, provided the ID columns are not nullable (NOT IN treats NULLs differently from NOT EXISTS).
Update:
Tested on 512K rows in each table.
All corresponding IDs in dimension tables are CLUSTERED PRIMARY KEYs, all measure fields in Properties are indexed.
For each row in Property, PropertyID = CountryID = RegionID = AreaID, no actual missing rows (worst case in terms of execution time).
NOT EXISTS 00:11 (11 seconds)
LEFT JOIN 01:08 (68 seconds)
You could rewrite it differently as follows:
SELECT p.*
FROM Properties p
LEFT JOIN Countries c ON p.Country_ID = c.CountryID
LEFT JOIN Regions r on p.RegionID = r.RegionID
LEFT JOIN Areas a on p.AreaID = a.AreaID
WHERE c.CountryID IS NULL
OR r.RegionID IS NULL
OR a.AreaID IS NULL
Test the performance difference (if there is any - there should be as NOT IN is a nasty search, especially over a lot of items as it HAS to test every single one).
You can also make this faster by indexing the IDs being searched: in each master table (Countries, Regions, Areas) they should be clustered primary keys.
Since this seems to be cleanup sql, this should be ok. But how about using foreign keys so that it does not bother you next time around?
Well, you could try things like UNION (instead of OR) - but I expect that the optimizer is already doing the best it can given the information available:
SELECT * FROM Properties
WHERE NOT EXISTS (SELECT 1 FROM Areas WHERE Areas.AreaID = Properties.AreaID)
UNION
SELECT * FROM Properties
WHERE NOT EXISTS (SELECT 1 FROM Regions WHERE Regions.RegionID = Properties.RegionID)
UNION
SELECT * FROM Properties
WHERE NOT EXISTS (SELECT 1 FROM Countries WHERE Countries.CountryID = Properties.CountryID)
Subqueries in the conditions can be quite inefficient. Instead, you can do left joins against the related tables. Where there is no matching record, you get a null value. You can use this in the condition to select only the records where a matching record is missing:
select p.*
from Properties p
left join Countries c on c.CountryID = p.Country_ID
left join Regions r on r.RegionID = p.RegionID
left join Areas a on a.AreaID = p.AreaID
where c.CountryID is null or r.RegionID is null or a.AreaID is null
If you're not grabbing the row data from countries/regions/areas you can try using "exists":
SELECT Properties.*
FROM Properties
WHERE Properties.CountryID IS NOT NULL AND NOT EXISTS (SELECT 1 FROM Countries WHERE Countries.CountryID = Properties.CountryID)
OR Properties.RegionID IS NOT NULL AND NOT EXISTS (SELECT 1 FROM Regions WHERE Regions.RegionID = Properties.RegionID)
OR Properties.AreaID IS NOT NULL AND NOT EXISTS (SELECT 1 FROM Areas WHERE Areas.AreaID = Properties.AreaID)
This will typically hint to use the pkey indices of countries et al for the existence check... but whether that is an improvement depends on your data stats, you simply have to plug it into query analyzer and try it.
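The variants discussed above agree on rows whose ID points at a missing parent, but they differ on rows whose ID is NULL. The sketch below illustrates this on a single parent table. It is a simplified assumption of the schema (one Countries table, three sample properties), run on SQLite through Python's sqlite3 module, and it says nothing about SQL Server's performance or plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Countries (CountryID INTEGER PRIMARY KEY);
    CREATE TABLE Properties (PropertyID INTEGER PRIMARY KEY, CountryID INTEGER);
    INSERT INTO Countries VALUES (1);
    -- 10: valid country, 11: country no longer exists, 12: no country recorded.
    INSERT INTO Properties VALUES (10, 1), (11, 2), (12, NULL);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

not_in = ids("""
    SELECT PropertyID FROM Properties
    WHERE CountryID NOT IN (SELECT CountryID FROM Countries)
    ORDER BY PropertyID""")

not_exists = ids("""
    SELECT p.PropertyID FROM Properties p
    WHERE NOT EXISTS (SELECT 1 FROM Countries c WHERE c.CountryID = p.CountryID)
    ORDER BY p.PropertyID""")

left_join = ids("""
    SELECT p.PropertyID FROM Properties p
    LEFT JOIN Countries c ON c.CountryID = p.CountryID
    WHERE c.CountryID IS NULL
    ORDER BY p.PropertyID""")

guarded_not_exists = ids("""
    SELECT p.PropertyID FROM Properties p
    WHERE p.CountryID IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM Countries c WHERE c.CountryID = p.CountryID)
    ORDER BY p.PropertyID""")

print(not_in)              # [11]      NULL NOT IN (...) is unknown, so row 12 is dropped
print(not_exists)          # [11, 12]  no country equals NULL, so row 12 counts as missing
print(left_join)           # [11, 12]  same rows as NOT EXISTS
print(guarded_not_exists)  # [11]      the IS NOT NULL guard skips row 12
```

Which result is the right one depends on whether a property with no country recorded should count as invalid, so it is worth deciding that before picking a variant.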

SQL Logical AND operator for bit fields

I have 2 tables that have a many-to-many relationship: an Individual can belong to many Groups, and a Group can have many Individuals.
Individuals basically just have their Primary Key ID
Groups have a Primary Key ID, IndividualID (same as the ID in the Individual Table), and a bit flag for if that group is the primary group for the individual
In theory, all but one of the entries for any given individual in the group table should have that bit flag set to false, because every individual must have exactly 1 primary group.
I know that for my current dataset, this assumption doesn't hold true, and I have some individuals that have the primary flag for ALL their groups set to false.
I'm having trouble generating a query that will return those individuals to me.
The closest I've gotten is:
SELECT * FROM Individual i
LEFT JOIN [Group] g ON g.IndividualID = i.ID
WHERE g.IsPrimaryGroup = 0
but going further than that with SUM or MAX doesn't work, because the field is a bit field, and not a numeric.
Any suggestions?
I don't know your data, but that LEFT JOIN behaves like an INNER JOIN, because the WHERE clause filters on a column of g.
What happens when you change the WHERE to AND?
SELECT * FROM Individual i
LEFT JOIN [Group] g ON g.IndividualID = i.ID
AND g.IsPrimaryGroup = 0
Here, try running this. It is untested, of course, since you didn't provide any sample data:
SELECT SUM(CONVERT(int, g.IsPrimaryGroup)) AS PrimaryGroupCount, i.ID
FROM Individual i
LEFT JOIN [Group] g ON g.IndividualID = i.ID
GROUP BY i.ID
-- Keep individuals that have groups but none flagged as primary
HAVING SUM(CONVERT(int, g.IsPrimaryGroup)) = 0
Try not using a bit field if you need to do SUM and MAX; use a TINYINT instead. In addition, from what I remember, bit fields cannot be indexed, so you will lose some performance in your joins.
Update: Got it working with a subselect. Select IndividualID from Group where the primary group is false, and individualID NOT IN (select IndividualID from Group where primary group is true)
You need to include the IsPrimaryGroup condition in the JOIN clause. This query finds all individuals with no PrimaryGroup set:
SELECT * FROM Individual i
LEFT OUTER JOIN [Group] g ON g.IndividualID = i.ID AND g.IsPrimaryGroup = 1
WHERE g.ID IS NULL
However, the ideal way to solve your problem (in terms of relational db) is to have a PrimaryGroupID in the Individual table.
SELECT COUNT(bitflag), individualId
FROM Groups
WHERE bitflag = 1
GROUP BY individualId
HAVING COUNT(bitflag) <> 1
ORDER BY COUNT(bitflag)
That will give you each individual whose number of primary groups is not exactly one, and how many they have (individuals with none are filtered out by the WHERE clause).
I don't know if this is optimal from a performance standpoint, but I believe something along these lines should work. I'm using OrgIndividual as the name of the resolution table between the Individual and the Group.
SELECT DISTINCT i.IndividualID
FROM
Individual i INNER JOIN OrgIndividual oi
ON i.IndividualID = oi.IndividualID AND oi.PrimaryOrg = 0
LEFT JOIN OrgIndividual oip
ON oi.IndividualID = oip.IndividualID AND oip.PrimaryOrg = 1
WHERE
oip.IndividualID IS NULL
SELECT IndividualID
FROM [Group] g
WHERE NOT EXISTS (
SELECT NULL FROM [Group]
WHERE PrimaryOrg = 1
AND IndividualID = g.IndividualID)
GROUP BY IndividualID
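The anti-join and NOT EXISTS approaches above can be compared on a tiny dataset. This sketch makes several assumptions: it runs on SQLite through Python's sqlite3 module, stores the bit flag as an INTEGER 0/1 column named IsPrimaryGroup (SQLite has no BIT type), and uses three made-up individuals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Individual (ID INTEGER PRIMARY KEY);
    CREATE TABLE [Group] (ID INTEGER PRIMARY KEY,
                          IndividualID INTEGER,
                          IsPrimaryGroup INTEGER);
    INSERT INTO Individual VALUES (1), (2), (3);
    -- Individual 1 has a primary group, individual 2 has groups but no primary,
    -- individual 3 has no groups at all.
    INSERT INTO [Group] VALUES (1, 1, 1), (2, 1, 0), (3, 2, 0), (4, 2, 0);
""")

# Anti-join from Individual: returns everyone without a primary group row.
anti_join = [row[0] for row in conn.execute("""
    SELECT i.ID
    FROM Individual i
    LEFT OUTER JOIN [Group] g
           ON g.IndividualID = i.ID AND g.IsPrimaryGroup = 1
    WHERE g.ID IS NULL
    ORDER BY i.ID
""")]

# NOT EXISTS from [Group]: only individuals that appear in [Group] can be returned.
not_exists = [row[0] for row in conn.execute("""
    SELECT g.IndividualID
    FROM [Group] g
    WHERE NOT EXISTS (
        SELECT NULL FROM [Group] p
        WHERE p.IsPrimaryGroup = 1
          AND p.IndividualID = g.IndividualID)
    GROUP BY g.IndividualID
    ORDER BY g.IndividualID
""")]

print(anti_join)   # [2, 3]
print(not_exists)  # [2]
```

The difference is individual 3: starting from Individual also reports people with no group rows at all, while starting from [Group] reports only those whose existing groups are all non-primary, which is the case the question describes.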