How to remove duplicating rows from union statement

OK - I have looked and looked and found a lot of examples but nothing quite meeting my need. Maybe I used the wrong words to search with, but I could use your help. I will provide as much detail as I can.
I need to produce a report that merges fields from two tables, or rather a view and a table, into one table for a report. Here is the statement I am using:
SELECT A.ConfInt, A.Conference,
NULL as Ordered,
NULL as Approved,
NULL as PickedUp,
SUM(dbo.Case_Visit_Payments.Qty) AS Qty
FROM dbo.Conferences as A INNER JOIN
dbo.Case_Table ON A.ConfInt = dbo.Case_Table.Conference_ID INNER JOIN
dbo.Case_Visit_Payments ON dbo.Case_Table.Case_ID = dbo.Case_Visit_Payments.Case_ID
WHERE (dbo.Case_Visit_Payments.Item_ID = 15 AND A.ProjectCool = 1)
GROUP BY A.Conference, A.ConfInt
UNION
SELECT B.ConfInt,
B.Conference,
SUM(dbo.Cool_Fan_Order.NumberOfFansRequested) AS Ordered,
SUM(dbo.Cool_Fan_Order.Qty_Fans_Approved) AS Approved,
SUM(dbo.Cool_Fan_Order.Qty_Fans_PickedUp) AS PickedUp,
NULL AS Qty
FROM dbo.Conferences as B LEFT OUTER JOIN
dbo.Cool_Fan_Order ON B.ConfInt = dbo.Cool_Fan_Order.Conference_ID
where B.ProjectCool = 1
GROUP BY B.Conference, B.ConfInt
And here are the results:
4 Our Lady NULL NULL NULL 11
4 Our Lady 40 40 40 NULL
7 Holy Rosary 20 20 20 NULL
11 Little Flower NULL NULL NULL 21
11 Little Flower 5 5 20 NULL
19 Perpetual Help NULL NULL NULL 2
19 Perpetual Help 20 20 20 NULL
What I would strongly prefer is to not have the duplicating rows, such as:
4 Our Lady 40 40 40 11
7 Holy Rosary 20 20 20 NULL
11 Little Flower 5 5 20 21
19 Perpetual Help 20 20 20 2
I hope this question was clear enough. Any Suggestions would be greatly appreciated. And I do mark as answered. :)
Gregory

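You could use your current query as a subquery, apply an aggregate function (MAX or SUM) to the columns that are NULL in one half of the union, and group by the remaining columns. If this is SQL Server, as the dbo schema suggests, the derived table needs an alias:
SELECT ConfInt, Conference, MAX(Ordered) AS Ordered, MAX(Approved) AS Approved, MAX(PickedUp) AS PickedUp, MAX(Qty) AS Qty
FROM (<your actualQuery>) AS u
GROUP BY ConfInt, Conference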
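To see the effect of the outer GROUP BY, here is a minimal sketch that runs the same idea against an in-memory SQLite database through Python's sqlite3 module. SQLite rather than SQL Server is an assumption made only so the example is self-contained, and the union_rows table is a hypothetical stand-in for the output of the UNION query (the rows are the ones posted in the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the rows the UNION query returns.
conn.execute(
    "CREATE TABLE union_rows (ConfInt INTEGER, Conference TEXT, "
    "Ordered INTEGER, Approved INTEGER, PickedUp INTEGER, Qty INTEGER)"
)
conn.executemany(
    "INSERT INTO union_rows VALUES (?, ?, ?, ?, ?, ?)",
    [
        (4, "Our Lady", None, None, None, 11),
        (4, "Our Lady", 40, 40, 40, None),
        (7, "Holy Rosary", 20, 20, 20, None),
        (11, "Little Flower", None, None, None, 21),
        (11, "Little Flower", 5, 5, 20, None),
        (19, "Perpetual Help", None, None, None, 2),
        (19, "Perpetual Help", 20, 20, 20, None),
    ],
)

# MAX ignores NULLs, so each group keeps the one non-NULL value per column.
collapsed = conn.execute(
    """
    SELECT ConfInt, Conference,
           MAX(Ordered) AS Ordered, MAX(Approved) AS Approved,
           MAX(PickedUp) AS PickedUp, MAX(Qty) AS Qty
    FROM (SELECT * FROM union_rows) AS u
    GROUP BY ConfInt, Conference
    ORDER BY ConfInt
    """
).fetchall()

for row in collapsed:
    print(row)
```

Holy Rosary keeps a NULL Qty because every value in that group is NULL, which matches the desired output in the question.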

The quick answer is to wrap your query inside another one,
SELECT ConfInt
, Conference
, SUM(Ordered) AS Ordered
, SUM(Approved) As Approved
, SUM(PickedUp) AS PickedUp
, SUM(Qty) AS Qty
FROM (
<your UNION query here>
) AS u
GROUP BY ConfInt, Conference
This is not the only way to achieve the result set, but it's the quickest fix to meet the specified requirements. (The alias on the derived table is required by SQL Server.)
As an alternative, I believe these queries will return equivalent results:
We could use a correlated subquery in the SELECT list to get Qty:
;WITH q AS
( SELECT B.ConfInt
, B.Conference
, SUM(o.NumberOfFansRequested) AS Ordered
, SUM(o.Qty_Fans_Approved) AS Approved
, SUM(o.Qty_Fans_PickedUp) AS PickedUp
FROM dbo.Conferences as B
LEFT
JOIN dbo.Cool_Fan_Order o ON o.Conference_ID = B.ConfInt
WHERE B.ProjectCool = 1
GROUP BY B.ConfInt, B.Conference
)
SELECT q.ConfInt
, q.Conference
, q.Ordered
, q.Approved
, q.PickedUp
, ( SELECT SUM(v.Qty)
FROM dbo.Case_Table t
JOIN dbo.Case_Visit_Payments v ON v.Case_ID = t.Case_ID
WHERE t.Conference_ID = q.ConfInt
AND v.Item_ID = 15
) AS Qty
FROM q
ORDER BY q.ConfInt, q.Conference
Or, we could use a LEFT JOIN between the two queries, rather than a UNION. We know that the query referencing Cool_Fan_Order can be the LEFT side of the outer join, because it returns at least as many rows as the other query. (Basically, the other query can't return values of ConfInt and Conference that aren't in the Cool_Fan_Order query.)
;WITH p AS
( SELECT A.ConfInt
, A.Conference
, SUM(v.Qty) AS Qty
FROM dbo.Conferences as A
JOIN dbo.Case_Table t ON t.Conference_ID = A.ConfInt
JOIN dbo.Case_Visit_Payments v ON v.Case_ID = t.Case_ID
WHERE A.ProjectCool = 1
AND v.Item_ID = 15
GROUP BY A.ConfInt, A.Conference
)
, q AS
( SELECT B.ConfInt
, B.Conference
, SUM(o.NumberOfFansRequested) AS Ordered
, SUM(o.Qty_Fans_Approved) AS Approved
, SUM(o.Qty_Fans_PickedUp) AS PickedUp
FROM dbo.Conferences as B
LEFT
JOIN dbo.Cool_Fan_Order o ON B.ConfInt = o.Conference_ID
WHERE B.ProjectCool = 1
GROUP BY B.ConfInt, B.Conference
)
SELECT q.ConfInt
, q.Conference
, q.Ordered
, q.Approved
, q.PickedUp
, p.Qty
FROM q
LEFT
JOIN p ON p.ConfInt = q.ConfInt AND p.Conference = q.Conference
ORDER BY q.ConfInt, q.Conference
The choice between those three (they all return an equivalent result set under all conditions) boils down to readability, maintainability, and performance. On large enough rowsets, there may be some observable performance differences between the three statements.
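As a rough check of the LEFT JOIN variant, here is a sketch that runs it on SQLite through Python's sqlite3 module. SQLite instead of SQL Server, the dropped dbo prefix, and every sample row are assumptions for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Conferences (ConfInt INTEGER, Conference TEXT, ProjectCool INTEGER);
    CREATE TABLE Case_Table (Case_ID INTEGER, Conference_ID INTEGER);
    CREATE TABLE Case_Visit_Payments (Case_ID INTEGER, Item_ID INTEGER, Qty INTEGER);
    CREATE TABLE Cool_Fan_Order (Conference_ID INTEGER, NumberOfFansRequested INTEGER,
                                 Qty_Fans_Approved INTEGER, Qty_Fans_PickedUp INTEGER);

    -- Invented sample data: conference 9 is not part of ProjectCool.
    INSERT INTO Conferences VALUES (4, 'Our Lady', 1), (7, 'Holy Rosary', 1), (9, 'Elsewhere', 0);
    INSERT INTO Case_Table VALUES (1, 4), (2, 4);
    INSERT INTO Case_Visit_Payments VALUES (1, 15, 5), (2, 15, 6), (2, 99, 100);
    INSERT INTO Cool_Fan_Order VALUES (4, 30, 30, 30), (4, 10, 10, 10), (7, 20, 20, 20);
    """
)

# Each CTE aggregates to one row per conference before the two are joined.
left_join_rows = conn.execute(
    """
    WITH p AS (
        SELECT A.ConfInt, A.Conference, SUM(v.Qty) AS Qty
        FROM Conferences AS A
        JOIN Case_Table t ON t.Conference_ID = A.ConfInt
        JOIN Case_Visit_Payments v ON v.Case_ID = t.Case_ID
        WHERE A.ProjectCool = 1 AND v.Item_ID = 15
        GROUP BY A.ConfInt, A.Conference
    ),
    q AS (
        SELECT B.ConfInt, B.Conference,
               SUM(o.NumberOfFansRequested) AS Ordered,
               SUM(o.Qty_Fans_Approved) AS Approved,
               SUM(o.Qty_Fans_PickedUp) AS PickedUp
        FROM Conferences AS B
        LEFT JOIN Cool_Fan_Order o ON o.Conference_ID = B.ConfInt
        WHERE B.ProjectCool = 1
        GROUP BY B.ConfInt, B.Conference
    )
    SELECT q.ConfInt, q.Conference, q.Ordered, q.Approved, q.PickedUp, p.Qty
    FROM q
    LEFT JOIN p ON p.ConfInt = q.ConfInt AND p.Conference = q.Conference
    ORDER BY q.ConfInt
    """
).fetchall()

for row in left_join_rows:
    print(row)
```

Because both sides are already one row per conference when they meet, the join cannot multiply rows, whatever the number of payments or orders behind each conference.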

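I'd just put the join to Cool_Fan_Order in the first select and remove the union.
Does this return the same results? One thing to check against your data: if a conference has several payment rows and several fan orders, joining both before aggregating multiplies the rows and inflates the sums.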
select
A.ConfInt,
A.Conference,
sum(dbo.Cool_Fan_Order.NumberOfFansRequested) as Ordered,
sum(dbo.Cool_Fan_Order.Qty_Fans_Approved) as Approved,
sum(dbo.Cool_Fan_Order.Qty_Fans_PickedUp) as PickedUp,
sum(sub.Qty) as Qty
from
dbo.Conferences as A
left outer join
(
select
c.ConfInt,
cvp.Qty
from dbo.Conferences c
inner join dbo.Case_Table ct
on c.ConfInt=ct.Conference_ID
inner join dbo.Case_Visit_Payments cvp
on ct.Case_ID=cvp.Case_ID
where cvp.Item_ID=15
) sub
on a.ConfInt=sub.ConfInt
left outer join dbo.Cool_Fan_Order
on A.ConfInt = dbo.Cool_Fan_Order.Conference_ID
where
(
A.ProjectCool = 1
)
group by
A.Conference,
A.ConfInt
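One caveat with joining both detail tables in a single SELECT before aggregating is row multiplication. The sketch below shows it on SQLite via Python's sqlite3 module; the two tiny tables (payments, orders) and their rows are hypothetical simplifications, not the schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified tables: two payments and two orders for conference 4.
    CREATE TABLE payments (ConfInt INTEGER, Qty INTEGER);
    CREATE TABLE orders (ConfInt INTEGER, Ordered INTEGER);
    INSERT INTO payments VALUES (4, 5), (4, 6);
    INSERT INTO orders VALUES (4, 30), (4, 10);
    """
)

# Join first, aggregate second: 2 x 2 = 4 joined rows, so both sums double.
joined_first = conn.execute(
    """
    SELECT p.ConfInt, SUM(o.Ordered), SUM(p.Qty)
    FROM payments p
    JOIN orders o ON o.ConfInt = p.ConfInt
    GROUP BY p.ConfInt
    """
).fetchall()

# Aggregate each side first, then join the one-row-per-conference results.
aggregated_first = conn.execute(
    """
    SELECT p.ConfInt, o.Ordered, p.Qty
    FROM (SELECT ConfInt, SUM(Qty) AS Qty FROM payments GROUP BY ConfInt) p
    JOIN (SELECT ConfInt, SUM(Ordered) AS Ordered FROM orders GROUP BY ConfInt) o
      ON o.ConfInt = p.ConfInt
    """
).fetchall()

print(joined_first)      # inflated sums
print(aggregated_first)  # sums per table
```

If each conference has at most one row on one of the two sides, the single-SELECT form gives the same totals; otherwise aggregating before joining is the safer shape.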

Related

Horizontal To Vertical Sql Server

I'm stuck with a SQL query (SQL Server) that involves converting horizontal rows to vertical rows
Below is my Query that I am trying
SELECT P AS Amount_Rs
FROM (
Select (F1.D32-F1.D20) As Profit_For_The_Period ,F3.D2 as Current_Libilities,F5.D20 As Capital_Acount,
--M1.Name As Name,
F2.D20 AS Loan_Liabilities,F4.d1 As Opening_Diff --F2.D68 As Loan,
from Folio1 As F1
--inner Join Master1 As m1 on m1.Code like '101' or m1.Code Like '102' or m1.Code Like '106' or m1.Code Like '109' or m1.Code lIke '103'
--And m1.Code=102 And m1.Code=101)
inner Join Folio1 As F2 On (F2.MasterCode=F2.MasterCode)
inner Join Folio1 As F3 On (F3.MasterCode=F3.MasterCode)
inner Join Folio1 As F4 On (F4.MasterCode=F4.MasterCode)
inner Join Folio1 As F5 On (F5.MasterCode=F5.MasterCode)
Where F1.MasterCode=109
and F2.MasterCode =106
and F3.MasterCode=103
and F4.MasterCode=102
And F5.MasterCode=101
) p UNPIVOT
( p FOR value IN
( Profit_For_The_Period,Capital_Acount, Current_Libilities, Loan_Liabilities, Opening_Diff )
) AS unvpt
Current Output:
1 12392
2 0
3 0
4 4000
5 -200
Desired Output:
1 Capital Account 12392
2 Current Assets 0
3 Current Liabilities 0
4 Loans (Liability) 4000
5 Revenue Accounts -200
Thanks !!!

I think you are looking for a pivot. Use a CASE expression inside SUM (or any other aggregate function) in the SELECT list, with a GROUP BY clause where needed; that's how I turn rows into columns in a query when I have to in MySQL. I don't know SQL Server, but I think you can do much the same.
Your conditions below
F1.MasterCode=109
and F2.MasterCode =106
and F3.MasterCode=103
and F4.MasterCode=102
And F5.MasterCode=101
shouldn't be in the WHERE clause but in the CASE expressions in the SELECT list.
Example:
select whatever,
sum(case when F2.MasterCode = 106 then column_name end) as column_alias,
(other columns) from ...
Hope this helps.
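A minimal sketch of that CASE-inside-SUM pattern, run on SQLite through Python's sqlite3 module. The two-column Folio1 table and its rows are invented for illustration (the real table has more columns, and the original query reads Opening_Diff from a different column), so treat this as the shape of the technique rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical cut-down Folio1: one amount column per MasterCode row.
    CREATE TABLE Folio1 (MasterCode INTEGER, D20 INTEGER);
    INSERT INTO Folio1 VALUES (101, 12392), (106, 4000), (106, 500), (102, -200);
    """
)

# Each CASE picks out the rows for one MasterCode; SUM folds them into one column.
pivoted = conn.execute(
    """
    SELECT SUM(CASE WHEN MasterCode = 101 THEN D20 END) AS Capital_Account,
           SUM(CASE WHEN MasterCode = 106 THEN D20 END) AS Loan_Liabilities,
           SUM(CASE WHEN MasterCode = 102 THEN D20 END) AS Opening_Diff
    FROM Folio1
    """
).fetchone()

print(pivoted)
```

With the conditions inside the CASE expressions, a single scan of the table replaces the five self-joins.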

Can alias column used in a view for calculation in some other column?

CREATE VIEW dbo.myview1 As
SELECT
a.Id ,
a.Name ,
a.Age ,
CASE
WHEN b.PmtSched ='Monthly' THEN 12
WHEN b.PmtSched ='Quarterly' THEN 4
WHEN b.PmtSched ='Semi-Annual' THEN 2
WHEN b.PmtSched ='Annually' THEN 1
ELSE 12
END AS ABC,
SUM(a.Amount) *50 as TotalAmount ,
(a.AmtSpent - TotalAmount) * ABC as TOTALSPENDS
FROM dbo.User a join dbo.Details b on a.Id = b.Id
Here ABC and TotalAmount are column aliases that need to be used in another computation in the view, and I am not able to reference them. How can I achieve this? Is there any way to do it, or is it not possible? Please help.

The simple solution to your problem is to repeat the expression, use a subquery, or use a CTE.
However, the more intelligent method is to add a reference table for payment schedules. This would look like:
create table PaymentSchedules (
PaymentScheduleId int identity(1, 1) primary key,
ScheduleName varchar(255),
FrequencyPerYear float -- this could be less often than once per year
);
Then the view would look like:
CREATE VIEW dbo.myview1 As
SELECT a.Id, a.Name, a.Age, ps.FrequencyPerYear,
SUM(a.Amount) * 50 as TotalAmount,
(a.AmtSpent - SUM(a.Amount) * 50) * ps.FrequencyPerYear as TOTALSPENDS
FROM dbo.[User] a join
dbo.Details b
on a.Id = b.Id join
dbo.PaymentSchedules ps
on ps.PaymentScheduleId = a.PaymentScheduleId
GROUP BY a.Id, a.Name, a.Age, a.AmtSpent, ps.FrequencyPerYear;

Yes, you can, and you need neither subqueries nor CTEs for the ABC alias. A simple CROSS APPLY does it. It's quite elegant and doesn't hurt readability. If you need more information, read here.
Please see this example:
CREATE VIEW dbo.myview1
AS
SELECT A.Id
, A.Name
, A.Age
, T.ABC
, SUM(A.Amount) * 50 AS TotalAmount
-- TotalAmount is an aggregate, so its expression is repeated here
, (A.AmtSpent - SUM(A.Amount) * 50) * T.ABC AS TotalSpends
FROM dbo.[User] AS A
INNER JOIN dbo.Details AS B
ON A.Id = B.Id
-- the APPLY comes after the join so that B is in scope
CROSS APPLY (
SELECT CASE B.PmtSched
WHEN 'Monthly' THEN 12
WHEN 'Quarterly' THEN 4
WHEN 'Semi-Annual' THEN 2
WHEN 'Annually' THEN 1
ELSE 12
END) AS T(ABC)
GROUP BY A.Id, A.Name, A.Age, A.AmtSpent, T.ABC;

Simple answer: no.
You can't reference those aliases in the same SELECT list; you need a subquery or a CTE.
The query below should get you what you need. The aliases are defined inside the CTE, and the SELECT that follows it can then use them in the calculation:
;WITH Amount AS
(
SELECT a.Id,a.NAME,a.Age,a.AmtSpent,
CASE WHEN b.PmtSched ='Monthly' THEN 12
WHEN b.PmtSched ='Quarterly' THEN 4
WHEN b.PmtSched ='Semi-Annual' THEN 2
WHEN b.PmtSched ='Annually' THEN 1
ELSE 12
END AS ABC
,SUM(a.Amount) * 50 AS TotalAmount
FROM
dbo.[User] a
INNER JOIN
dbo.Details b ON a.Id = b.Id
GROUP BY a.Id, a.NAME, a.Age, a.AmtSpent, b.PmtSched
)
SELECT Id,NAME,Age,ABC,(AmtSpent - TotalAmount) * ABC AS TOTALSPENDS FROM Amount
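Here is a small runnable sketch of the CTE approach, using SQLite through Python's sqlite3 module instead of SQL Server (an assumption for the sake of a self-contained example). The table is named Users rather than User to sidestep quoting, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical data: Ann has two Amount rows, Bob has an unknown schedule.
    CREATE TABLE Users (Id INTEGER, Name TEXT, Age INTEGER, AmtSpent INTEGER, Amount INTEGER);
    CREATE TABLE Details (Id INTEGER, PmtSched TEXT);
    INSERT INTO Users VALUES (1, 'Ann', 30, 1000, 2), (1, 'Ann', 30, 1000, 3),
                             (2, 'Bob', 40, 100, 1);
    INSERT INTO Details VALUES (1, 'Quarterly'), (2, 'Weekly');
    """
)

# The CTE defines ABC and TotalAmount; the outer SELECT can then use both by name.
spend_rows = conn.execute(
    """
    WITH Amount AS (
        SELECT a.Id, a.Name, a.Age, a.AmtSpent,
               CASE b.PmtSched
                   WHEN 'Monthly' THEN 12
                   WHEN 'Quarterly' THEN 4
                   WHEN 'Semi-Annual' THEN 2
                   WHEN 'Annually' THEN 1
                   ELSE 12
               END AS ABC,
               SUM(a.Amount) * 50 AS TotalAmount
        FROM Users a
        JOIN Details b ON a.Id = b.Id
        GROUP BY a.Id, a.Name, a.Age, a.AmtSpent, b.PmtSched
    )
    SELECT Id, Name, ABC, TotalAmount,
           (AmtSpent - TotalAmount) * ABC AS TotalSpends
    FROM Amount
    ORDER BY Id
    """
).fetchall()

for row in spend_rows:
    print(row)
```

The GROUP BY lists the underlying columns (including b.PmtSched) rather than the aliases, since the aliases only exist once the CTE has been evaluated.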

CTE with Group By returning the same results as basic query

I'm trying to select a count of product reviews by rating, where the rating can be 0 - 5.
The following basic select works but won't give a count of ratings that don't exist in the underlying table.
SELECT Rating, COUNT(*) AS 'Reviews' FROM ProductReviews
WHERE ProductID = 'product1'
GROUP BY Rating
I've tried using a CTE to generate the missing results, joined to Reviews table using an outer join, as soon as I try to include the "group by" expression the results fall back to match the results from the basic query.
(I've checked that the CTE does indeed generate the full range of required values).
BEGIN
DECLARE @START AS INT = 0;
DECLARE @END AS INT = 5;
WITH CTE_Ratings AS
(
SELECT @START as cte_rating
UNION ALL
SELECT 1 + cte_rating
FROM CTE_Ratings
WHERE cte_rating < @END
)
SELECT
cte_rating AS 'ReviewRating'
, ISNULL(COUNT(*), 0) AS 'ReviewCount'
FROM CTE_Ratings
LEFT OUTER JOIN ProductReviews ON ProductReviews.Rating = cte_rating
WHERE ProductReviews.ProductID = 'product1'
AND cte_rating BETWEEN @START AND @END
GROUP BY cte_rating
END
(I also tried building a temporary table containing the required values, joined to the Reviews table, with identical results).
In the case of both of the above queries the results are:
Rating Reviews
0 1
3 3
4 9
5 47
Whereas what I'm trying to get to for the same data is:
Rating Reviews
0 1
1 0
2 0
3 3
4 9
5 47
Can anyone suggest why adding the GROUP BY causes the query to fail, or how it might be improved?

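The WHERE clause turns the OUTER JOIN into an INNER JOIN, because it filters on a column of the optional (outer-joined) table: ratings with no matching review have NULL in that column and are discarded. So, move that filter into the JOIN condition.
Also, the CTE already limits the values to the range between start and end, so the BETWEEN filter is redundant.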
WITH ...
-- CTE here
SELECT
C.cte_rating AS 'ReviewRating'
, ISNULL(COUNT(PR.Rating), 0) AS 'ReviewCount'
FROM
CTE_Ratings C
LEFT OUTER JOIN
ProductReviews PR ON C.cte_rating= PR.Rating AND PR.ProductID = 'product1'
GROUP BY
C.cte_rating
Put another way, that is equivalent to this:
WITH ...
-- CTE here
SELECT
C.cte_rating AS 'ReviewRating'
, ISNULL(COUNT(PR.Rating), 0) AS 'ReviewCount'
FROM
CTE_Ratings C
LEFT OUTER JOIN
(
SELECT Rating
FROM ProductReviews
WHERE ProductID = 'product1'
) PR ON C.cte_rating= PR.Rating
GROUP BY
C.cte_rating
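The difference between filtering in WHERE and filtering in the join condition can be checked with a short sketch. It runs on SQLite through Python's sqlite3 module (an assumption; the question is about SQL Server, so the recursive CTE uses SQLite's WITH RECURSIVE and literal bounds instead of variables), and the review rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical reviews: product1 has ratings 0, 3 and 5 only.
    CREATE TABLE ProductReviews (ProductID TEXT, Rating INTEGER);
    INSERT INTO ProductReviews VALUES
        ('product1', 0), ('product1', 3), ('product1', 3), ('product1', 3),
        ('product1', 5), ('product1', 5),
        ('product2', 1), ('product2', 2);
    """
)

# Recursive CTE producing the ratings 0..5.
RATINGS_CTE = """
WITH RECURSIVE CTE_Ratings(cte_rating) AS (
    SELECT 0
    UNION ALL
    SELECT cte_rating + 1 FROM CTE_Ratings WHERE cte_rating < 5
)
"""

# Filter in WHERE: unmatched ratings have a NULL ProductID and are dropped.
filter_in_where = conn.execute(
    RATINGS_CTE
    + """
    SELECT C.cte_rating, COUNT(PR.Rating)
    FROM CTE_Ratings C
    LEFT OUTER JOIN ProductReviews PR ON PR.Rating = C.cte_rating
    WHERE PR.ProductID = 'product1'
    GROUP BY C.cte_rating
    ORDER BY C.cte_rating
    """
).fetchall()

# Filter in ON: every rating survives, and COUNT(PR.Rating) counts only real matches.
filter_in_on = conn.execute(
    RATINGS_CTE
    + """
    SELECT C.cte_rating, COUNT(PR.Rating)
    FROM CTE_Ratings C
    LEFT OUTER JOIN ProductReviews PR
        ON PR.Rating = C.cte_rating AND PR.ProductID = 'product1'
    GROUP BY C.cte_rating
    ORDER BY C.cte_rating
    """
).fetchall()

print(filter_in_where)
print(filter_in_on)
```

Counting PR.Rating instead of * matters as well: COUNT(*) would report 1 for the ratings that have no review, because the outer join still yields one (all-NULL) row for them.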

Find records with exact matches on a many to many relationship

I have three tables that look like these:
PROD
Prod_ID|Desc
------------
P1|Foo1
P2|Foo2
P3|Foo3
P4|Foo4
...
RAM
Ram_ID|Desc
------------
R1|Bar1
R2|Bar2
R3|Bar3
R4|Bar4
...
PROD_RAM
Prod_ID|Ram_ID
------------
P1|R1
P2|R2
P3|R1
P3|R2
P3|R3
P4|R3
P5|R1
P5|R2
...
Between PROD and RAM there's a Many-To-Many relationship described by the PROD_RAM table.
Given a Ram_ID set like (R1,R3), I would like to find every PROD that has exactly ONE or ALL of the RAM in the given set, and nothing else.
Given (R1,R3), the query should return for example P1, P4 and P5; P3 should not be returned because it has R1 and R3 but also R2.
What's the fastest query to get all the PROD rows that have exactly ONE or ALL of the Ram_IDs of a given RAM set?
EDIT:
The PROD_RAM table could contain relationship bigger than 1->3 so, "hardcoded" checks for count = 1 OR = 2 are not a viable solution.

Another solution you could try for speed would be like this
;WITH CANDIDATES AS (
SELECT pr1.Prod_ID
, pr2.Ram_ID
FROM PROD_RAM pr1
INNER JOIN PROD_RAM pr2 ON pr2.Prod_ID = pr1.Prod_ID
WHERE pr1.Ram_ID IN ('R1', 'R3')
)
SELECT *
FROM CANDIDATES
WHERE CANDIDATES.Prod_ID NOT IN (
SELECT Prod_ID
FROM CANDIDATES
WHERE Ram_ID NOT IN ('R1', 'R3')
)
or if you don't like repeating the set conditions
;WITH SUBSET (Ram_ID) AS (
SELECT 'R1'
UNION ALL SELECT 'R3'
)
, CANDIDATES AS (
SELECT pr1.Prod_ID
, pr2.Ram_ID
FROM PROD_RAM pr1
INNER JOIN PROD_RAM pr2 ON pr2.Prod_ID = pr1.Prod_ID
INNER JOIN SUBSET s ON s.Ram_ID = pr1.Ram_ID
)
, EXCLUDES AS (
SELECT Prod_ID
FROM CANDIDATES
LEFT OUTER JOIN SUBSET s ON s.Ram_ID = CANDIDATES.Ram_ID
WHERE s.Ram_ID IS NULL
)
SELECT *
FROM CANDIDATES
LEFT OUTER JOIN EXCLUDES ON EXCLUDES.Prod_ID = CANDIDATES.Prod_ID
WHERE EXCLUDES.Prod_ID IS NULL

One way to do this would be something like the following:
SELECT PROD.Prod_ID FROM PROD WHERE
(SELECT COUNT(*) FROM PROD_RAM WHERE PROD_RAM.Prod_ID = PROD.Prod_ID) > 0 AND
(SELECT COUNT(*) FROM PROD_RAM WHERE PROD_RAM.Prod_ID = PROD.Prod_ID AND PROD_RAM.Ram_ID <>
ISNULL((SELECT TOP 1 Ram_ID FROM PROD_RAM WHERE PROD_RAM.Prod_ID = PROD.Prod_ID),0)) = 0

SELECT Prod_ID
FROM
( SELECT Prod_ID
, COUNT(*) AS cntAll
, COUNT( CASE WHEN Ram_ID IN ('R1','R3')
THEN 1
ELSE NULL
END
) AS cntGood
FROM PROD_RAM
GROUP BY Prod_ID
) AS grp
WHERE cntAll = cntGood
AND ( cntGood = 1
OR cntGood = 2 --- number of items in list ('R1','R3')
)
Not at all sure if it's the fastest way. You'll have to try different ways to write this query (using JOINs and NOT EXISTS ) and test for speed.
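A variation on the counting query that avoids the hardcoded set size is sketched below. It keeps the wanted Ram_IDs in a small table (the name wanted is hypothetical) and compares against its row count. It runs on SQLite through Python's sqlite3 module, which is an assumption, and it presumes that (Prod_ID, Ram_ID) pairs are unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PROD_RAM (Prod_ID TEXT, Ram_ID TEXT);
    INSERT INTO PROD_RAM VALUES
        ('P1', 'R1'), ('P2', 'R2'),
        ('P3', 'R1'), ('P3', 'R2'), ('P3', 'R3'),
        ('P4', 'R3'), ('P5', 'R1'), ('P5', 'R2');

    -- Hypothetical helper table holding the requested set.
    CREATE TABLE wanted (Ram_ID TEXT PRIMARY KEY);
    INSERT INTO wanted VALUES ('R1'), ('R3');
    """
)

# cntAll counts every RAM of the product; cntGood counts only those in the set.
# cntAll = cntGood means the product has nothing outside the set.
matches = [
    row[0]
    for row in conn.execute(
        """
        SELECT Prod_ID
        FROM (
            SELECT pr.Prod_ID,
                   COUNT(*) AS cntAll,
                   COUNT(w.Ram_ID) AS cntGood
            FROM PROD_RAM pr
            LEFT JOIN wanted w ON w.Ram_ID = pr.Ram_ID
            GROUP BY pr.Prod_ID
        ) AS grp
        WHERE cntAll = cntGood
          AND (cntGood = 1 OR cntGood = (SELECT COUNT(*) FROM wanted))
        ORDER BY Prod_ID
        """
    )
]

print(matches)
```

With the PROD_RAM rows exactly as listed in the question, P5 also holds R2, so this sketch leaves it out along with P3.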

Using (IN operator) OR condition in Where clause as AND condition

Please look at following image, I have explained my requirements in the image.
(Image: http://img30.imageshack.us/img30/5668/shippment.png)
I can't use here WHERE UsageTypeid IN(1,2,3,4) because this will behave as an OR condition and fetch all records.
I just want those records of the first table that are attached to all 4 ShipmentToIDs.
All others, which are attached to 3 or fewer ShipmentToIDs, are not needed in the result set.
Thanks.

if (EntityId, UsageTypeId) is unique:
select s.PrimaryKeyField, s.ShipmentId from shipment s, item a
where s.PrimaryKeyField = a.EntityId and a.UsageTypeId in (1,2,3,4)
group by s.PrimaryKeyField, s.ShipmentId having count(*) = 4
Otherwise, use a 4-way join, one per UsageTypeId:
select distinct s.* from shipment s, item a, item b, item c, item d where
s.PrimaryKeyField = a.EntityId and s.PrimaryKeyField = b.EntityId and
s.PrimaryKeyField = c.EntityId and s.PrimaryKeyField = d.EntityId and
a.UsageTypeId = 1 and b.UsageTypeId = 2 and c.UsageTypeId = 3 and
d.UsageTypeId = 4
You'll want an appropriate index on (EntityId, UsageTypeId) so it doesn't hang...

If there will never be duplicates of the UsageTypeId-EntityId combo in the 2nd table, so you'll never see:
EntityUsageTypeId | EntityId | UsageTypeId
22685 | 4477 | 1
22687 | 4477 | 1
You can count matching EntityIds in that table.
WHERE (SELECT COUNT(*) FROM <tablename> WHERE EntityId = 4477) = 4

DECLARE @numShippingMethods int;
SELECT @numShippingMethods = COUNT(*)
FROM shippedToTable;
SELECT tbl1.shipmentID, COUNT(UsageTypeId) as Usages
FROM tbl2 JOIN tbl1 ON tbl2.EntityId = tbl1.EntityId
GROUP BY tbl1.EntityID, tbl1.shipmentID
HAVING COUNT(UsageTypeId) = @numShippingMethods

This way is preferred to the multiple join against same table method, as you can simply modify the IN clause and the COUNT without needing to add or subtract more tables to the query when your list of IDs changes:
select EntityId, ShipmentId
from (
select EntityId
from (
select EntityId
from EntityUsage eu
where UsageTypeId in (1,2,3,4)
group by EntityId, UsageTypeId
) b
group by EntityId
having count(*) = 4
) a
inner join Shipment s on a.EntityId = s.EntityId
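For comparison, the same "must have all four" check can be sketched with COUNT(DISTINCT ...) in place of the inner GROUP BY. This is a swapped-in variation, not the query above, and it runs on SQLite through Python's sqlite3 module as an assumption. The table names follow the answer; the rows are invented, including a duplicated (EntityId, UsageTypeId) pair to show that duplicates don't inflate the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Shipment (EntityId INTEGER, ShipmentId INTEGER);
    CREATE TABLE EntityUsage (EntityId INTEGER, UsageTypeId INTEGER);

    -- Invented rows: 4477 has all four types (type 1 twice), 4478 has only three.
    INSERT INTO Shipment VALUES (4477, 9001), (4478, 9002);
    INSERT INTO EntityUsage VALUES
        (4477, 1), (4477, 1), (4477, 2), (4477, 3), (4477, 4),
        (4478, 1), (4478, 2), (4478, 3);
    """
)

# Keep only entities whose distinct usage types cover the whole list.
complete = conn.execute(
    """
    SELECT s.EntityId, s.ShipmentId
    FROM Shipment s
    JOIN (
        SELECT EntityId
        FROM EntityUsage
        WHERE UsageTypeId IN (1, 2, 3, 4)
        GROUP BY EntityId
        HAVING COUNT(DISTINCT UsageTypeId) = 4
    ) a ON a.EntityId = s.EntityId
    ORDER BY s.EntityId
    """
).fetchall()

print(complete)
```

As with the nested version, changing the list only means editing the IN clause and the number in HAVING.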
