Showing specific output data based on duplicate rows and null values [postgresql] - sql

I'm using the following SQL (with a union to two similar queries):
SELECT
distinct a.source,
a.p_id,
a.name,
b.prod_count,
b.prod_amt,
'Def' as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
(
SELECT
distinct source,
p_id,
name,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag = 0 AND
loan_flag = 3
GROUP BY
source,
name,
p_id ) as b
ON
a.p_id = b.p_id
UNION
SELECT
distinct a.source,
a.p_id,
a.name,
b.prod_count,
b.prod_amt,
'Other' as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
(
SELECT
distinct source,
p_id,
name,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag = 1 AND
loan_flag = 3
GROUP BY
source,
name,
p_id
ORDER BY
name ) as b
ON
a.p_id = b.p_id
The output I'm getting looks like this:
Essentially since FakeName #2 has one row showing actual numbers (not null), I ONLY want FakeName #2 to show up. This means I also want the null row for FakeName #2. But, since FakeName #1 and #3 have 2 null rows, I don't need them to show. What type of SQL command (or edit to my query) can accomplish this?

Firstly, if I read your query correctly, you can eliminate the need for a UNION by using CASE and IN. You also have a couple of bogus DISTINCTs in there (since you're using GROUP BY anyway). That gives:
SELECT DISTINCT
a.source,
a.p_id,
a.name,
b.prod_count,
b.prod_amt,
Case When default_banner_flag = 0 Then 'Def' Else 'Other' End as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
(
SELECT
source,
p_id,
name,
default_banner_flag,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag in (0, 1) AND
loan_flag = 3
GROUP BY
source,
name,
p_id,
default_banner_flag
) as b
ON
a.p_id = b.p_id
However, what you actually want is information about those p_ids which have at least one row in dwh.prod_count, so I think you can change your whole query around to use that as the sub-select:
SELECT
a.source,
a.p_id,
a.name,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt,
Case When default_banner_flag = 0 Then 'Def' Else 'Other' End as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
dwh.prod_count b
On a.p_id = b.p_id
INNER JOIN
(
SELECT DISTINCT
p_id
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag in (0, 1) AND
loan_flag = 3
) as c
ON a.p_id = c.p_id
WHERE
b.month = 3 AND
b.default_banner_flag in (0, 1) AND
b.loan_flag = 3
GROUP BY
a.source,
a.p_id,
a.name,
b.default_banner_flag
(You could also rewrite this as a WHERE p_id IN ( sub-select ) or with a little fiddling WHERE EXISTS ( ... ), but this seemed the easiest version to demonstrate.)
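As a rough illustration of the WHERE EXISTS variant mentioned above, here is a minimal sketch run through Python's sqlite3 module rather than PostgreSQL. The dwh schema prefix is dropped, the tables carry only the columns the filter needs, and the sample rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribution_product_count (source TEXT, p_id INTEGER, name TEXT);
CREATE TABLE prod_count (p_id INTEGER, month INTEGER, default_banner_flag INTEGER,
                         loan_flag INTEGER, acct_count INTEGER, acct_amt REAL);
-- Invented sample data: only p_id 2 has a row matching the filter.
INSERT INTO attribution_product_count VALUES
  ('web', 1, 'FakeName #1'), ('web', 2, 'FakeName #2'), ('web', 3, 'FakeName #3');
INSERT INTO prod_count VALUES
  (2, 3, 0, 3, 5, 100.0),   -- month 3, loan_flag 3: qualifies
  (3, 4, 0, 3, 7, 200.0);   -- wrong month: does not qualify
""")

# Keep only the p_ids that have at least one qualifying prod_count row.
rows = conn.execute("""
SELECT a.name
FROM attribution_product_count a
WHERE EXISTS (
    SELECT 1
    FROM prod_count c
    WHERE c.p_id = a.p_id
      AND c.month = 3
      AND c.default_banner_flag IN (0, 1)
      AND c.loan_flag = 3
)
ORDER BY a.name
""").fetchall()
print(rows)
```

With this sample data only FakeName #2 survives the filter; the aggregate columns can then be joined on as in the queries above.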
Note that I haven't actually tested any of these queries, but I think they're logically sound.

Related

JOIN and CASE across more than one table

I have 2 tables; the first one ORG contains the following columns:
ORG_REF, ARB_REF, NAME, LEVEL, START_DATE
and the second one WORK contains these columns:
ARB_REF, WORK_STREET, WORK_NUM, WORK_ZIP
I want to write a SELECT query that searches WORK for rows where WORK_STREET and WORK_ZIP are duplicated together and then looks at WORK_NUM: if it is the same, output 'ok', but if WORK_NUM is not the same, output 'not ok'.
I wrote this SQL query:
select
A.ARB_REF, A.WORK_STREET, A.WORK_NUM, A.WORK_ZIP,
case when B.B = 1 then 'OK' else 'not ok' end
from
work A
join
(select
WORK_STREET, WORK_ZIP, count(distinct WORK_NUM) B
from
WORK
group by
WORK_STREET, WORK_ZIP) B on B.WORK_STREET = A.WORK_STREET
and B.WORK_ZIP = A.WORK_ZIP
Now I want to join the ORG table to this result and check whether every address belongs to an org. If it does, a new RESULT column should be set to 'yes' and the NAME column shown; otherwise RESULT should be set to 'no'.
Can anyone help me, please?
While you can accomplish your result by adding a left outer join to the query you've already started, it might be easiest to just use count() over....
with org_data as (
-- do the inner join before the left join later
select * from org1 o1 inner join org2 o2 on o2.orgid = o1.orgid
)
select
*,
count(*) over (partition by WORK_STREET, WORK_ZIP) as cnt,
case when o.ARB_REF is not null then 'Yes' else 'No' end as result
from
WORK w left outer join org_data o on o.ARB_REF = w.ARB_REF
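The other route mentioned above, adding a left outer join to the asker's own count(distinct ...) query, could look roughly like the sketch below. It runs on SQLite through Python's sqlite3 module, ORG is reduced to three of its columns, the aliases d, num_count and status are made up for the example, and all rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WORK (ARB_REF INTEGER, WORK_STREET TEXT, WORK_NUM INTEGER, WORK_ZIP TEXT);
CREATE TABLE ORG (ORG_REF INTEGER, ARB_REF INTEGER, NAME TEXT);
-- Invented sample data: 'Main' has one house number, 'Oak' has two.
INSERT INTO WORK VALUES
  (1, 'Main', 1, '111'), (2, 'Main', 1, '111'),
  (3, 'Oak', 2, '222'),  (4, 'Oak', 3, '222');
INSERT INTO ORG VALUES (10, 1, 'Org A'), (11, 3, 'Org B');
""")

rows = conn.execute("""
SELECT w.ARB_REF,
       CASE WHEN d.num_count = 1 THEN 'OK' ELSE 'not ok' END AS status,
       CASE WHEN o.ARB_REF IS NOT NULL THEN 'Yes' ELSE 'No' END AS result,
       o.NAME
FROM WORK w
JOIN (
    -- number of distinct house numbers per street/zip pair
    SELECT WORK_STREET, WORK_ZIP, COUNT(DISTINCT WORK_NUM) AS num_count
    FROM WORK
    GROUP BY WORK_STREET, WORK_ZIP
) d ON d.WORK_STREET = w.WORK_STREET AND d.WORK_ZIP = w.WORK_ZIP
LEFT JOIN ORG o ON o.ARB_REF = w.ARB_REF   -- keeps addresses without an org
ORDER BY w.ARB_REF
""").fetchall()
print(rows)
```

The LEFT JOIN is what lets addresses with no matching ORG row stay in the output with result 'No' and a NULL name.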

Rewrite a where in / intersect query to a join

Is there a way to rewrite my query as a join?
The question I have to solve is: List the names of green items sold by no department on the first floor. Do not show duplicates.
select distinct itemname from xsale where deptname in (
select deptname from xdept where deptfloor <> 1
)
intersect (
select itemname from xitem where itemcolor= 'green'
)
I have been stuck at this exercise a couple of days now because the join statements don't make much sense to me even after reading about it. I hope someone can help me.
I think the following query shows how to use a join statement for your query:
select distinct xs.itemname from xsale xs
inner join xdept xp on xs.deptname=xp.deptname
where xp.deptfloor <>1
intersect
(select xi.itemname from xitem xi where xi.itemcolor= 'green')
Check if this works for you
Select Distinct a.itemname from xsale a
INNER JOIN xdept b on a.deptName = b.deptName
INNER JOIN xitem c on a.itemName = c.itemname
Where b.deptfloor <> 1 and c.itemcolor = 'green'
I think you can express the logic as an EXISTS and NOT EXISTS query:
SELECT itemname
FROM xitem
WHERE itemcolor = 'green' -- all green items
AND EXISTS (
-- exists a sale for that item
SELECT 1
FROM xsale
WHERE xsale.itemname = xitem.itemname
AND NOT EXISTS (
-- not exists a department in those sales with floor = 1
SELECT 1
FROM xdept
WHERE xdept.deptname = xsale.deptname AND xdept.deptfloor = 1
)
)
It is often best to build your queries up step by step. So if we break the problem down:
List the names of green items sold
SELECT i.ItemName
FROM xitem AS i
WHERE i.ItemColor = 'Green';
Items sold by a department on the first floor
SELECT s.ItemName
FROM xsale AS s
INNER JOIN xdept AS d
ON d.DeptName = s.DeptName
WHERE d.DeptFloor = 1;
So now, you want all the items output by the first query, except for those that appear in the 2nd:
SELECT i.ItemName
FROM xitem AS i
WHERE i.ItemColor = 'Green'
EXCEPT
SELECT s.ItemName
FROM xsale AS s
INNER JOIN xdept AS d
ON d.DeptName = s.DeptName
WHERE d.DeptFloor = 1;
Then the final part:
Do not show duplicates:
SELECT DISTINCT i.ItemName
FROM xitem AS i
WHERE i.ItemColor = 'Green'
EXCEPT
SELECT s.ItemName
FROM xsale AS s
INNER JOIN xdept AS d
ON d.DeptName = s.DeptName
WHERE d.DeptFloor = 1;
An alternative to EXCEPT would be NOT EXISTS. They will almost always result in the same execution plan, but I find NOT EXISTS more flexible (you don't need the same columns in both queries):
SELECT i.ItemName
FROM xitem AS i
WHERE i.ItemColor = 'Green'
AND NOT EXISTS
( SELECT 1
FROM xsale AS s
INNER JOIN xdept AS d
ON d.DeptName = s.DeptName
WHERE d.DeptFloor = 1
AND s.ItemName = i.ItemName
)
GROUP BY i.ItemName;
Again, to show an alternative, I have used GROUP BY rather than DISTINCT. In most cases these are semantically equivalent, but there are scenarios where GROUP BY will perform better (namely when a scalar function is involved: GROUP BY will remove duplicates first and then execute the function on all remaining values, whereas DISTINCT will execute the function first and then remove duplicate results).
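To compare the approaches in this thread side by side, here is a small sketch using Python's sqlite3 module. The sample rows are invented: 'tent' is sold both on floor 1 and on floor 2, which is the case where the readings of the exercise differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xitem (itemname TEXT, itemcolor TEXT);
CREATE TABLE xdept (deptname TEXT, deptfloor INTEGER);
CREATE TABLE xsale (deptname TEXT, itemname TEXT);
-- Invented sample data.
INSERT INTO xitem VALUES ('tent', 'green'), ('boots', 'green'), ('map', 'red');
INSERT INTO xdept VALUES ('camping', 1), ('shoes', 2);
INSERT INTO xsale VALUES ('camping', 'tent'), ('shoes', 'tent'),
                         ('shoes', 'boots'), ('shoes', 'boots'), ('camping', 'map');
""")

def names(sql):
    """Run a query and return the item names as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

except_rows = names("""
SELECT i.itemname FROM xitem AS i WHERE i.itemcolor = 'green'
EXCEPT
SELECT s.itemname FROM xsale AS s
INNER JOIN xdept AS d ON d.deptname = s.deptname
WHERE d.deptfloor = 1
""")

not_exists_rows = names("""
SELECT i.itemname FROM xitem AS i
WHERE i.itemcolor = 'green'
  AND NOT EXISTS (
      SELECT 1 FROM xsale AS s
      INNER JOIN xdept AS d ON d.deptname = s.deptname
      WHERE d.deptfloor = 1 AND s.itemname = i.itemname)
GROUP BY i.itemname
""")

# The plain inner-join version: green items with some sale off floor 1.
join_rows = names("""
SELECT DISTINCT a.itemname FROM xsale a
INNER JOIN xdept b ON a.deptname = b.deptname
INNER JOIN xitem c ON a.itemname = c.itemname
WHERE b.deptfloor <> 1 AND c.itemcolor = 'green'
""")

print(except_rows, not_exists_rows, join_rows)
```

On this data EXCEPT and NOT EXISTS agree and exclude 'tent', while the inner-join version keeps it because it also has a sale on floor 2. Which result is wanted depends on how "sold by no department on the first floor" is read.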
Examples on DB Fiddle

How to use multiple count and where condition sql server 2008?

I have these two queries:
1.
select CL_Clients.cl_id,[CL_Clients].cl_name,COUNT(*) AS number_of_orders
from CL_Clients,CLOI_ClientOrderItems
where CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id
2.
select CL_Clients.cl_id,count(cloi_current_status) as dis
from CLOI_ClientOrderItems,CL_Clients
where [cloi_current_status]='12'
and CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id,CLOI_ClientOrderItems.cloi_current_status
I have this column, to which I need to apply a count function and a where condition:
[cloi_current_status]
166
30
30
30
150
150
150
150
150
150
150
Quite simple, you just encapsulate the queries and give their result sets an alias and then do a JOIN between their aliases on the column that is common. (In the query below I assume you'll be joining by client id)
SELECT *
FROM (
SELECT CL_Clients.cl_id,
[CL_Clients].cl_name,
COUNT(*) AS number_of_orders
FROM CL_Clients,
CLOI_ClientOrderItems
WHERE CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id
) A
INNER JOIN (
SELECT CL_Clients.cl_id,
count(cloi_current_status) AS dis
FROM CLOI_ClientOrderItems,
CL_Clients
WHERE [cloi_current_status] = '12'
AND CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id,
CLOI_ClientOrderItems.cloi_current_status
) B
ON A.cl_id = B.cl_id
WHERE ...
GROUP BY ...
This will be treated as a separate result set, so you can also filter results with a WHERE or just a GROUP BY, just like in a normal SELECT.
UPDATE:
To answer the question in your comments, when you join two tables that have a column with the same value and use
SELECT * FROM A INNER JOIN B the * will show all columns returned by the join, meaning all columns from A and all columns from B, this is why you have duplicate columns.
If you want to filter the columns returned you can specify which columns you want returned. So, in your case, the top SELECT * can be replaced with
SELECT A.cl_id, A.cl_name, A.number_of_orders, B.dis so, your query becomes:
SELECT A.cl_id, A.cl_name, A.number_of_orders, B.dis
FROM (
SELECT CL_Clients.cl_id,
[CL_Clients].cl_name,
COUNT(*) AS number_of_orders
FROM CL_Clients,
CLOI_ClientOrderItems
WHERE CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id
) A
INNER JOIN (
SELECT CL_Clients.cl_id,
count(cloi_current_status) AS dis
FROM CLOI_ClientOrderItems,
CL_Clients
WHERE [cloi_current_status] = '12'
AND CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id,
CLOI_ClientOrderItems.cloi_current_status
) B
ON A.cl_id = B.cl_id
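For comparison, the sketch below runs a condensed form of the joined-subquery approach next to a single-pass variant that uses conditional aggregation. Conditional aggregation is a different technique from the one in this answer, shown only as an alternative. The code uses Python's sqlite3 module instead of SQL Server, explicit JOIN syntax instead of the comma joins, and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CL_Clients (cl_id INTEGER, cl_name TEXT);
CREATE TABLE CLOI_ClientOrderItems (cl_id INTEGER, cloi_current_status TEXT);
-- Invented sample data: client 2 has no order items with status '12'.
INSERT INTO CL_Clients VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO CLOI_ClientOrderItems VALUES (1, '12'), (1, '12'), (1, '30'), (2, '30');
""")

# Two aggregated subqueries joined on the client id.
joined = conn.execute("""
SELECT A.cl_id, A.cl_name, A.number_of_orders, B.dis
FROM (SELECT c.cl_id, c.cl_name, COUNT(*) AS number_of_orders
      FROM CL_Clients c
      JOIN CLOI_ClientOrderItems i ON i.cl_id = c.cl_id
      GROUP BY c.cl_id, c.cl_name) A
INNER JOIN (SELECT i.cl_id, COUNT(*) AS dis
            FROM CLOI_ClientOrderItems i
            WHERE i.cloi_current_status = '12'
            GROUP BY i.cl_id) B
  ON A.cl_id = B.cl_id
ORDER BY A.cl_id
""").fetchall()

# Single pass: count all rows, and count status '12' rows with a CASE.
single_pass = conn.execute("""
SELECT c.cl_id, c.cl_name, COUNT(*) AS number_of_orders,
       SUM(CASE WHEN i.cloi_current_status = '12' THEN 1 ELSE 0 END) AS dis
FROM CL_Clients c
JOIN CLOI_ClientOrderItems i ON i.cl_id = c.cl_id
GROUP BY c.cl_id, c.cl_name
ORDER BY c.cl_id
""").fetchall()

print(joined)
print(single_pass)
```

Note the difference in behaviour: the INNER JOIN drops clients with no status '12' rows, whereas the single-pass version reports them with dis = 0. A LEFT JOIN between A and B would be one way to keep them in the joined form.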
UPDATE #2:
For your last question, you need to GROUP BY at the end of the big query and use a HAVING condition, like this:
GROUP BY A.cl_id, A.cl_name, A.number_of_orders, B.dis
HAVING COUNT(cloi_current_status) > 100
All depends on what data you are trying to get, but you can go about it like this.
SELECT Column_x, Column_y, etc..
FROM CL_Clients a
JOIN (select CL_Clients.cl_id,[CL_Clients].cl_name,COUNT(*) AS number_of_orders
from CL_Clients,CLOI_ClientOrderItems
where CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id) b
on a.cl_id = b.cl_id
JOIN (select CL_Clients.cl_id,count(cloi_current_status) as dis
from CLOI_ClientOrderItems,CL_Clients
where [cloi_current_status]='12'
and CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id,CLOI_ClientOrderItems.cloi_current_status) c
on a.cl_id = c.cl_id
Group by BLAH BLAH
Hope this gets you in the right direction.

Produce result table from multiple tables

SQL Server 2008 R2
I have 3 tables containing data for 3 different types of events,
Type1, Type2, Type3, each with two columns:
DatePoint ValuePoint
I want to produce a result table that looks like this:
DatePoint TotalType1 TotalType2 TotalType3
I started with this:
SELECT [DatePoint]
,SUM(ValuePoint) as TotalType1
FROM [dbo].[Type1]
GROUP BY [DatePoint]
ORDER BY [DatePoint]
SELECT [DatePoint]
,SUM(ValuePoint) as TotalType2
FROM [dbo].[Type2]
GROUP BY [DatePoint]
ORDER BY [DatePoint]
SELECT [DatePoint]
,SUM(ValuePoint) as TotalType3
FROM [dbo].[Type3]
GROUP BY [DatePoint]
ORDER BY [DatePoint]
So I have three results, but I need to produce one (Date TotalType1 TotalType2 TotalType3). What do I need to do next to achieve my goal?
UPDATE
I forgot to mention that a DatePoint which exists in one type may or may not exist in another.
Here's my take. I assume that you don't have the same datetime values in every table (certainly, the stuff I get to work with is never so consistent). There should be an easier way to do this, but once you're past two outer joins things can get pretty tricky.
SELECT
dp.DatePoint
,isnull(t1.TotalType1, 0) TotalType1
,isnull(t2.TotalType2, 0) TotalType2
,isnull(t3.TotalType3, 0) TotalType3
from (-- Without "ALL", UNION will filter out duplicates
select DatePoint
from Type1
union select DatePoint
from Type2
union select DatePoint
from Type3) dp
left outer join (select DatePoint, sum(ValuePoint) TotalType1
from Type1
group by DatePoint) t1
on t1.DatePoint = dp.DatePoint
left outer join (select DatePoint, sum(ValuePoint) TotalType2
from Type2
group by DatePoint) t2
on t2.DatePoint = dp.DatePoint
left outer join (select DatePoint, sum(ValuePoint) TotalType3
from Type3
group by DatePoint) t3
on t3.DatePoint = dp.DatePoint
order by dp.DatePoint
A DISTINCT might help somewhere, but the general idea should be the following:
SELECT
t.[DatePoint],
SUM(t1.ValuePoint) as TotalType1,
SUM(t2.ValuePoint) as TotalType2,
SUM(t3.ValuePoint) as TotalType3
FROM
(
SELECT [DatePoint] FROM [dbo].[Type1]
UNION
SELECT [DatePoint] FROM [dbo].[Type2]
UNION
SELECT [DatePoint] FROM [dbo].[Type3]
) as t
LEFT JOIN
[dbo].[Type1] t1
ON
t1.[DatePoint] = t.[DatePoint]
LEFT JOIN
[dbo].[Type2] t2
ON
t2.[DatePoint] = t.[DatePoint]
LEFT JOIN
[dbo].[Type3] t3
ON
t3.[DatePoint] = t.[DatePoint]
GROUP BY
t.[DatePoint]
ORDER BY
t.[DatePoint]
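One thing worth checking with the join-then-SUM shape above is what happens when a date has several rows in more than one table: the joins pair every row with every other row before the SUM runs. The sketch below uses Python's sqlite3 module with invented sample rows and compares it against joining pre-aggregated subqueries. COALESCE stands in for T-SQL's ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Type1 (DatePoint TEXT, ValuePoint INTEGER);
CREATE TABLE Type2 (DatePoint TEXT, ValuePoint INTEGER);
CREATE TABLE Type3 (DatePoint TEXT, ValuePoint INTEGER);
-- Invented sample data: two rows per table on the same date.
INSERT INTO Type1 VALUES ('2020-01-01', 10), ('2020-01-01', 5);
INSERT INTO Type2 VALUES ('2020-01-01', 1), ('2020-01-01', 2);
INSERT INTO Type3 VALUES ('2020-01-02', 7);
""")

dates = """(SELECT DatePoint FROM Type1
            UNION SELECT DatePoint FROM Type2
            UNION SELECT DatePoint FROM Type3)"""

# Join the raw tables, then SUM: rows from Type1 and Type2 multiply.
raw_join = conn.execute(f"""
SELECT t.DatePoint, SUM(t1.ValuePoint), SUM(t2.ValuePoint), SUM(t3.ValuePoint)
FROM {dates} AS t
LEFT JOIN Type1 t1 ON t1.DatePoint = t.DatePoint
LEFT JOIN Type2 t2 ON t2.DatePoint = t.DatePoint
LEFT JOIN Type3 t3 ON t3.DatePoint = t.DatePoint
GROUP BY t.DatePoint
ORDER BY t.DatePoint
""").fetchall()

# Aggregate each table first, then join one row per date.
pre_aggregated = conn.execute(f"""
SELECT dp.DatePoint,
       COALESCE(t1.Total, 0), COALESCE(t2.Total, 0), COALESCE(t3.Total, 0)
FROM {dates} AS dp
LEFT JOIN (SELECT DatePoint, SUM(ValuePoint) AS Total FROM Type1 GROUP BY DatePoint) t1
       ON t1.DatePoint = dp.DatePoint
LEFT JOIN (SELECT DatePoint, SUM(ValuePoint) AS Total FROM Type2 GROUP BY DatePoint) t2
       ON t2.DatePoint = dp.DatePoint
LEFT JOIN (SELECT DatePoint, SUM(ValuePoint) AS Total FROM Type3 GROUP BY DatePoint) t3
       ON t3.DatePoint = dp.DatePoint
ORDER BY dp.DatePoint
""").fetchall()

print(raw_join)
print(pre_aggregated)
```

On this data the raw join reports 30 and 6 for 2020-01-01 where the true totals are 15 and 3, so the pre-aggregated form looks like the safer choice whenever dates can repeat within a table.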
To avoid all of the JOINs:
SELECT
SQ.DatePoint,
SUM(CASE WHEN SQ.type = 1 THEN SQ.ValuePoint ELSE 0 END) AS TotalType1,
SUM(CASE WHEN SQ.type = 2 THEN SQ.ValuePoint ELSE 0 END) AS TotalType2,
SUM(CASE WHEN SQ.type = 3 THEN SQ.ValuePoint ELSE 0 END) AS TotalType3
FROM (
SELECT
1 AS type,
DatePoint,
ValuePoint
FROM
dbo.Type1
UNION ALL
SELECT
2 AS type,
DatePoint,
ValuePoint
FROM
dbo.Type2
UNION ALL
SELECT
3 AS type,
DatePoint,
ValuePoint
FROM
dbo.Type3
) AS SQ
GROUP BY
DatePoint
ORDER BY
DatePoint
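A condensed, runnable sketch of the UNION ALL approach, using Python's sqlite3 module and invented sample rows, shows how a date missing from one table simply yields 0 for that column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Type1 (DatePoint TEXT, ValuePoint INTEGER);
CREATE TABLE Type2 (DatePoint TEXT, ValuePoint INTEGER);
CREATE TABLE Type3 (DatePoint TEXT, ValuePoint INTEGER);
-- Invented sample data: Type3 has no rows for 2020-01-01.
INSERT INTO Type1 VALUES ('2020-01-01', 10), ('2020-01-01', 5);
INSERT INTO Type2 VALUES ('2020-01-01', 1), ('2020-01-01', 2);
INSERT INTO Type3 VALUES ('2020-01-02', 7);
""")

totals = conn.execute("""
SELECT SQ.DatePoint,
       SUM(CASE WHEN SQ.type = 1 THEN SQ.ValuePoint ELSE 0 END) AS TotalType1,
       SUM(CASE WHEN SQ.type = 2 THEN SQ.ValuePoint ELSE 0 END) AS TotalType2,
       SUM(CASE WHEN SQ.type = 3 THEN SQ.ValuePoint ELSE 0 END) AS TotalType3
FROM (
    -- Stack the three tables and tag each row with its source.
    SELECT 1 AS type, DatePoint, ValuePoint FROM Type1
    UNION ALL
    SELECT 2 AS type, DatePoint, ValuePoint FROM Type2
    UNION ALL
    SELECT 3 AS type, DatePoint, ValuePoint FROM Type3
) AS SQ
GROUP BY SQ.DatePoint
ORDER BY SQ.DatePoint
""").fetchall()
print(totals)
```

UNION ALL matters here: a plain UNION would drop duplicate (type, DatePoint, ValuePoint) rows and could undercount.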
From the little information provided though, it seems like there are some flaws in the database design, which is probably part of the reason that querying the data is so difficult.

Multiple MAX values select using inner join

I have a query that works for me only when the values in StakeValue don't repeat.
Basically, I need to select the maximum values from the SI_STAKES table, with their related rows from two other tables, grouped by internal type.
SELECT a.StakeValue, b.[StakeName], c.[ProviderName]
FROM SI_STAKES AS a
INNER JOIN SI_STAKESTYPES AS b ON a.[StakeTypeID] = b.[ID]
INNER JOIN SI_PROVIDERS AS c ON a.[ProviderID] = c.[ID] WHERE a.[EventID]=6
AND a.[StakeGroupTypeID]=1
AND a.StakeValue IN
(SELECT MAX(d.StakeValue) FROM SI_STAKES AS d
WHERE d.[EventID]=a.[EventID] AND d.[StakeGroupTypeID]=a.[StakeGroupTypeID]
GROUP BY d.[StakeTypeID])
ORDER BY b.[StakeName], a.[StakeValue] DESC
For example, the results must be:
[ID] [MaxValue] [StakeTypeID] [ProviderName]
1 1,5 6 provider1
2 3,75 7 provider2
3 7,6 8 provider3
Thank you for your help
There are two problems to solve here.
1) Finding the max values per type. This will get the Max value per StakeType and make sure that we do the exercise only for the wanted events and group type.
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES
WHERE [EventID]=6
AND [StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
2) Then we need to get only one row back for that value, since it may be present more than once.
Using the max value, we must find a unique row for each type. I usually do this by getting the max ID, which has the added advantage of getting me the most recent entry.
SELECT MAX(SMaxID.ID) AS ID
FROM SI_STAKES AS SMaxID
INNER JOIN (
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES
WHERE [EventID]=6
AND [StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
) AS SMaxVal ON SMaxID.StakeTypeID = SMaxVal.StakeTypeID
AND SMaxID.StakeValue = SMaxVal.MaxStakeValue
AND SMaxID.EventID = SMaxVal.EventID
AND SMaxID.StakeGroupTypeID = SMaxVal.StakeGroupTypeID
-- one ID per stake type rather than a single ID overall
GROUP BY SMaxID.StakeTypeID
3) Now that we have the ID's of the rows that we want, we can just get that information.
SELECT Stakes.ID, Stakes.StakeValue, SType.StakeName, SProv.ProviderName
FROM SI_STAKES AS Stakes
INNER JOIN SI_STAKESTYPES AS SType ON Stakes.[StakeTypeID] = SType.[ID]
INNER JOIN SI_PROVIDERS AS SProv ON Stakes.[ProviderID] = SProv.[ID]
WHERE Stakes.ID IN (
SELECT MAX(SMaxID.ID) AS ID
FROM SI_STAKES AS SMaxID
INNER JOIN (
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES
WHERE [EventID]=6
AND [StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
) AS SMaxVal ON SMaxID.StakeTypeID = SMaxVal.StakeTypeID
AND SMaxID.StakeValue = SMaxVal.MaxStakeValue
AND SMaxID.EventID = SMaxVal.EventID
AND SMaxID.StakeGroupTypeID = SMaxVal.StakeGroupTypeID
-- one ID per stake type rather than a single ID overall
GROUP BY SMaxID.StakeTypeID
)
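The three steps above can be tried end to end with the sketch below, which uses Python's sqlite3 module in place of SQL Server. The aliases are shortened, the subquery groups by StakeTypeID only, and the sample rows are invented, with IDs 1 and 2 deliberately tied on the maximum for stake type 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SI_STAKES (ID INTEGER, EventID INTEGER, StakeGroupTypeID INTEGER,
                        StakeTypeID INTEGER, ProviderID INTEGER, StakeValue REAL);
CREATE TABLE SI_STAKESTYPES (ID INTEGER, StakeName TEXT);
CREATE TABLE SI_PROVIDERS (ID INTEGER, ProviderName TEXT);
-- Invented sample data.
INSERT INTO SI_STAKESTYPES VALUES (6, 'win'), (7, 'draw');
INSERT INTO SI_PROVIDERS VALUES (1, 'provider1'), (2, 'provider2');
INSERT INTO SI_STAKES VALUES
  (1, 6, 1, 6, 1, 1.5), (2, 6, 1, 6, 2, 1.5), (3, 6, 1, 6, 1, 1.2),
  (4, 6, 1, 7, 2, 3.75),
  (5, 7, 1, 7, 1, 9.0);   -- different event, must be ignored
""")

rows = conn.execute("""
SELECT s.ID, s.StakeValue, t.StakeName, p.ProviderName
FROM SI_STAKES s
JOIN SI_STAKESTYPES t ON s.StakeTypeID = t.ID
JOIN SI_PROVIDERS p ON s.ProviderID = p.ID
WHERE s.ID IN (
    -- step 2: highest ID among the rows holding each type's max value
    SELECT MAX(x.ID)
    FROM SI_STAKES x
    JOIN (
        -- step 1: max value per stake type for the wanted event/group
        SELECT StakeTypeID, MAX(StakeValue) AS MaxStakeValue
        FROM SI_STAKES
        WHERE EventID = 6 AND StakeGroupTypeID = 1
        GROUP BY StakeTypeID
    ) m ON x.StakeTypeID = m.StakeTypeID AND x.StakeValue = m.MaxStakeValue
    WHERE x.EventID = 6 AND x.StakeGroupTypeID = 1
    GROUP BY x.StakeTypeID
)
ORDER BY s.ID
""").fetchall()
print(rows)
```

For the tied type the row with the higher ID (2) is returned, and the 9.0 stake from the other event does not leak in.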
You can use the over clause since you're using T-SQL (hopefully 2005+):
select distinct
a.stakevalue,
max(a.stakevalue) over (partition by a.staketypeid) as maxvalue,
a.staketypeid,
c.providername
from
si_stakes a
inner join si_stakestypes b on
a.staketypeid = b.id
inner join si_providers c on
a.providerid = c.id
where
a.eventid = 6
and a.stakegrouptypeid = 1
Essentially, this will find the max a.stakevalue for each a.staketypeid. Using DISTINCT removes duplicate rows. Now, if you wanted to include the min a.id along with it, you could use row_number to accomplish this:
select
s.id,
s.maxvalue,
s.staketypeid,
s.providername
from (
select
row_number() over (partition by a.staketypeid
order by a.stakevalue desc) as rownum,
a.id,
a.stakevalue as maxvalue,
a.staketypeid,
c.providername
from
si_stakes a
inner join si_stakestypes b on
a.staketypeid = b.id
inner join si_providers c on
a.providerid = c.id
where
a.eventid = 6
and a.stakegrouptypeid = 1
) s
where
s.rownum = 1
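A runnable sketch of the row_number approach follows, using Python's sqlite3 module (window functions need SQLite 3.25 or newer). Two things are assumptions added for the example: the join to si_stakestypes is left out for brevity, and a.id is appended to the ORDER BY so that ties on stakevalue resolve to the lowest id. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE si_stakes (id INTEGER, eventid INTEGER, stakegrouptypeid INTEGER,
                        staketypeid INTEGER, providerid INTEGER, stakevalue REAL);
CREATE TABLE si_providers (id INTEGER, providername TEXT);
-- Invented sample data: ids 1 and 2 tie on the max value for type 6.
INSERT INTO si_providers VALUES (1, 'provider1'), (2, 'provider2');
INSERT INTO si_stakes VALUES
  (1, 6, 1, 6, 1, 1.5), (2, 6, 1, 6, 2, 1.5), (3, 6, 1, 6, 1, 1.2),
  (4, 6, 1, 7, 2, 3.75), (5, 7, 1, 7, 1, 9.0);
""")

rows = conn.execute("""
SELECT s.id, s.maxvalue, s.staketypeid, s.providername
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY a.staketypeid
                              ORDER BY a.stakevalue DESC, a.id) AS rownum,
           a.id, a.stakevalue AS maxvalue, a.staketypeid, c.providername
    FROM si_stakes a
    INNER JOIN si_providers c ON a.providerid = c.id
    WHERE a.eventid = 6 AND a.stakegrouptypeid = 1
) s
WHERE s.rownum = 1          -- keep the top-ranked row of each partition
ORDER BY s.staketypeid
""").fetchall()
print(rows)
```

Without a tie-breaker in the ORDER BY, which of the tied rows gets rownum = 1 is not guaranteed.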