Union ALL fetches empty rows - sql

I have two query joined with a union All.
SELECT select 'Finished' AS Status,amount AS amount,units As Date
from table1 WHERE Pdate > cdate AND name =#name
UNION ALL
SELECT select 'Live' AS Live,amount,units
from table1 Where Pdate = cdate And name =#name
Result
Status amount units
Finished 100 20
Live 200 10
When either of the query fetches empty set I get only one row and, if both fetches empty set then I no rows
So how can I get result like this
Status amount Units
Finished 100 20
Live 0 0
OR
Status amount Units
Finished 0 0
Live 200 10
OR
Status amount Units
Finished 0 0
Live 0 0
Thanks.

I would think you can do it using sum? And if sum doesn't return 0 when there are no rows then replace with Coalesce(sum(amount), 0) as amount
SELECT select 'Finished' AS Status,sum(amount) AS amount, sum(units) As Unit
from table1 WHERE Pdate > cdate AND name =#name
UNION ALL
SELECT select 'Live' AS Status, sum(amount) as amount, sum(units) as Unit
from table1 Where Pdate = cdate And name =#name
And if you are not trying to sum the results then just coalesce should work? coalesce(amount, 0) As amount etc...

I just want to point out that your query is needlessly complex, with nested selects and a union all. A better way to write the query is:
select (case when pdate > cdate then 'Finished' else 'Live' end) AS Status,
amount AS amount, units As Date
from table1
WHERE Pdate >= cdate AND name = #name
This query does not produce what you want, since it only produces rows where there is data.
One way to get the additional rows is to augment the original data and then check if it is needed.
select status, amount, units as Date
from (select Status, amount, units,
row_number() over (partition by status order by amount desc, units desc) as seqnum
from (select (case when pdate > cdate then 'Finished' else 'Live' end) AS Status,
amount, units, name
from table1
WHERE Pdate >= cdate AND name = #name
) union all
(select 'Finished', 0, 0, #name
) union all
(select 'Live', 0, 0, #name
)
) t
where (amount > 0 or units > 0) or
(seqnum = 1)
This adds in the extra rows that you want. It then enumerates them, so they would go last in any sequence. They are ignored, unless they are the first in the sequence.

Try something like this
with stscte as
(
select 'Finished' as status
union all
select 'Live'
),
datacte
as(
select 'Finished' AS Status,amount AS amount,units As Date
from table1 WHERE Pdate > cdate AND name =#name
UNION ALL
select 'Live' ,amount,units
from table1 Where Pdate = cdate And name =#name
)
select sc.status,isnull(dc.amount,0) as amount,isnull(dc.unit,0) as unit
from stscte sc left join datacte dc
on sc.status = dc.status

Related

SQL - find row with closest date but different column value

i'm new to SQL and i would need an help.
I have a TAB and I need to find for any item B in the TAB the item A with the closest date. In this case the A with 02.09.2021 04:25:30
Date.
Item
07.09.2021 05:02:05
A
06.09.2021 05:01:02
A
05.09.2021 05:00:02
A
04.09.2021 04:59:01
A
03.09.2021 04:58:03
A
02.09.2021 04:56:55
A
02.09.2021 04:33:56
B
02.09.2021 04:25:30
A
WITH CTE(DATE,ITEM)AS
(
SELECT '20210907 05:02:05' , 'A'UNION ALL
SELECT '20210906 05:01:02' , 'A'UNION ALL
SELECT '20210905 05:00:02' , 'A'UNION ALL
SELECT'20210904 04:59:01' , 'A'UNION ALL
SELECT'20210903 04:58:03' , 'A'UNION ALL
SELECT'20210902 04:56:55' , 'A'UNION ALL
SELECT'20210902 04:33:56' , 'B'UNION ALL
SELECT'20210902 04:25:30' , 'A'
)
SELECT
CAST(C.DATE AS DATETIME)X_DATE,C.ITEM,Q.CLOSEST
FROM CTE AS C
OUTER APPLY
(
SELECT TOP 1 CAST(X.DATE AS DATETIME)CLOSEST
FROM CTE AS X
WHERE X.ITEM='A'AND CAST(X.DATE AS DATETIME)<CAST(C.DATE AS DATETIME)
ORDER BY CAST(X.DATE AS DATETIME) ASC
)Q
WHERE C.ITEM='B'
You can use OUTER APPLY-approach as in the above query.
Please also take a look that datetime-column (DATE)is written in the ISO-compliant form
Your data has only two columns. If you want the only the closest A timestamp, then the fastest way is probably window functions:
select t.*,
(case when prev_a_date is null then next_a_date
when next_a_date is null then prev_a_date
when datediff(second, prev_a_date, date) <= datediff(second, date, next_a_date) then prev_a_date
else next_a_date
end) as a_date
from (select t.*,
max(case when item = 'A' then date end) over (order by date) as prev_a_date,
min(case when item = 'A' then date end) over (order by date desc) as next_a_date
from t
) t
where item = 'B';
This uses seconds to measure the time difference, but you can use a smaller unit if appropriate.
You can also do this using apply if you have more columns from the "A" rows that you want:
select tb.*, ta.*
from t b outer apply
(select top (1) ta.*
from t ta
where item = 'A'
order by abs(datediff(second, a.date, b.date))
) t
where item = 'B';

Group by in columns and rows, counts and percentages per day

I have a table that has data like following.
attr |time
----------------|--------------------------
abc |2018-08-06 10:17:25.282546
def |2018-08-06 10:17:25.325676
pqr |2018-08-05 10:17:25.366823
abc |2018-08-06 10:17:25.407941
def |2018-08-05 10:17:25.449249
I want to group them and count by attr column row wise and also create additional columns in to show their counts per day and percentages as shown below.
attr |day1_count| day1_%| day2_count| day2_%
----------------|----------|-------|-----------|-------
abc |2 |66.6% | 0 | 0.0%
def |1 |33.3% | 1 | 50.0%
pqr |0 |0.0% | 1 | 50.0%
I'm able to display one count by using group by but unable to find out how to even seperate them to multiple columns. I tried to generate day1 percentage with
SELECT attr, count(attr), count(attr) / sum(sub.day1_count) * 100 as percentage from (
SELECT attr, count(*) as day1_count FROM my_table WHERE DATEPART(week, time) = DATEPART(day, GETDate()) GROUP BY attr) as sub
GROUP BY attr;
But this also is not giving me correct answer, I'm getting all zeroes for percentage and count as 1. Any help is appreciated. I'm trying to do this in Redshift which follows postgresql syntax.
Let's nail the logic before presenting:
with CTE1 as
(
select attr, DATEPART(day, time) as theday, count(*) as thecount
from MyTable
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
select t1.attr, t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
From here you can pivot to create a day by day if you feel the need
I am trying to enhance the query #johnHC btw if you needs for 7days then you have to those days in case when
with CTE1 as
(
select attr, time::date as theday, count(*) as thecount
from t group by attr,time::date
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
,
CTE3 as
(
select t1.attr, EXTRACT(DOW FROM t1.theday) as day_nmbr,t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
)
select CTE3.attr,
max(case when day_nmbr=0 then CTE3.thecount end) as day1Cnt,
max(case when day_nmbr=0 then percentofday end) as day1,
max(case when day_nmbr=1 then CTE3.thecount end) as day2Cnt,
max( case when day_nmbr=1 then percentofday end) day2
from CTE3 group by CTE3.attr
http://sqlfiddle.com/#!17/54ace/20
In case that you have only 2 days:
http://sqlfiddle.com/#!17/3bdad/3 (days descending as in your example from left to right)
http://sqlfiddle.com/#!17/3bdad/5 (days ascending)
The main idea is already mentioned in the other answers. Instead of joining the CTEs for calculating the values I am using window functions which is a bit shorter and more readable I think. The pivot is done the same way.
SELECT
attr,
COALESCE(max(count) FILTER (WHERE day_number = 0), 0) as day1_count, -- D
COALESCE(max(percent) FILTER (WHERE day_number = 0), 0) as day1_percent,
COALESCE(max(count) FILTER (WHERE day_number = 1), 0) as day2_count,
COALESCE(max(percent) FILTER (WHERE day_number = 1), 0) as day2_percent
/*
Add more days here
*/
FROM(
SELECT *, (count::float/count_per_day)::decimal(5, 2) as percent -- C
FROM (
SELECT DISTINCT
attr,
MAX(time::date) OVER () - time::date as day_number, -- B
count(*) OVER (partition by time::date, attr) as count, -- A
count(*) OVER (partition by time::date) as count_per_day
FROM test_table
)s
)s
GROUP BY attr
ORDER BY attr
A counting the rows per day and counting the rows per day AND attr
B for more readability I convert the date into numbers. Here I take the difference between current date of the row and the maximum date available in the table. So I get a counter from 0 (first day) up to n - 1 (last day)
C calculating the percentage and rounding
D pivot by filter the day numbers. The COALESCE avoids the NULL values and switched them into 0. To add more days you can multiply these columns.
Edit: Made the day counter more flexible for more days; new SQL Fiddle
Basically, I see this as conditional aggregation. But you need to get an enumerator for the date for the pivoting. So:
SELECT attr,
COUNT(*) FILTER (WHERE day_number = 1) as day1_count,
COUNT(*) FILTER (WHERE day_number = 1) / cnt as day1_percent,
COUNT(*) FILTER (WHERE day_number = 2) as day2_count,
COUNT(*) FILTER (WHERE day_number = 2) / cnt as day2_percent
FROM (SELECT attr,
DENSE_RANK() OVER (ORDER BY time::date DESC) as day_number,
1.0 * COUNT(*) OVER (PARTITION BY attr) as cnt
FROM test_table
) s
GROUP BY attr, cnt
ORDER BY attr;
Here is a SQL Fiddle.

Case statement based in max min dates

I have a columns as Memnumber, activity type, activity date, activity ID. One member can have activities after few days. I want to write a case statement that if the activity date is most initial then INITIAL and if activity is most recent then MR and if there is any activity in between these 2 dates then BETWEEN. They need to be grouped by Memnumber and treatment type.
I wrote query as :
--MR County Tree
SELECT T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T1.MR_CY17,
T1.IN_CY17,
T0.ACTIVITY_DATE,
(T0.ACTIVITYID)
FROM DLA_EXTRACT_FINAL T0
INNER JOIN (
SELECT MEMBERNUMBER,
ACTIVITYTYPE,
MAX(ACTIVITY_DATE) MR_CY17,
MIN(ACTIVITY_DATE) IN_CY17
FROM DLA20_EXTRACT_FINAL
WHERE to_char(ACTIVITY_DATE, 'YYYYMMDD') >= 20170101
AND to_char(ACTIVITY_DATE, 'YYYYMMDD') <= 20171231
GROUP BY MEMBERNUMBER,
ACTIVITYTYPE
) T1 ON T0.MEMBERNUMBER = T1.MEMBERNUMBER
AND T0.ACTIVITYTYPE = T1.ACTIVITYTYPE
AND T0.ACTIVITY_DATE = T1.MR_CY17
--where T0.ACTIVITYTYPE='MT'
WHERE t0.MEMBERNUMBER = 'M500085268'
GROUP BY T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T1.MR_CY17,
T1.IN_CY17,
T0.ACTIVITYID,
T0.ACTIVITY_DATE
ORDER BY T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T1.MR_CY17,
T1.IN_CY17.
Looking for a solution.
You want to use window functions. Something like:
SELECT T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T0.ACTIVITY_DATE,
T0.ACTIVITYID,
case when row_number() over (partition by T0.MEMBERNUMBER, T0.ACTIVITYTYPE
order by T0.ACTIVITY_DATE) = 1 then 1 else 0 end most_initial,
case when row_number() over (partition by T0.MEMBERNUMBER, T0.ACTIVITYTYPE
order by T0.ACTIVITY_DATE desc) = 1 then 1 else 0 end most_recent
FROM DLA_EXTRACT_FINAL T0
Then you can use case statements to label as INITIAL if most_intial = 1, MR if most_recent = 1, or BETWEEN if both are 0.

How to find latest status of the day in SQL Server

I have a SQL Server question that I'm trying to figure out at work:
There is a table with a status field which can contain a status called "Participate." I am only trying to find records if the latest status of the day is "Participate" and only if the status changed on the same day from another status to "Participate."
I don't want any records where the status was already "Participate." It must have changed to that status on the same day. You can tell when the status was changed by the datetime field ChangedOn.
In the sample below I would only want to bring back ID 1880 since the status of "Participated" has the latest timestamp. I would not bring back ID 1700 since the last record is "Other," and I would not bring back ID 1600 since "Participated" is the only status of that day.
ChangedOn Status ID
02/01/17 15:23 Terminated 1880
02/01/17 17:24 Participated 1880
02/01/17 09:00 Other 1880
01/31/17 01:00 Terminated 1700
01/31/17 02:00 Participated 1700
01/31/17 03:00 Other 1700
01/31/17 02:00 Participated 1600
I was thinking of using a Window function, but I'm not sure how to get started on this. It's been a few months since I've written a query like this so I'm a bit out of practice.
Thanks!
You can use window functions for this:
select t.*
from (select t.*,
row_number() over (partition by cast(ChangedOn as date)
order by ChangedOn desc
) as seqnum,
sum(case when status <> 'Participate' then 1 else 0 end) over (partition by cast(ChangedOn as date)) as num_nonparticipate
from t
) t
where (seqnum = 1 and ChangedOn = 'Participate') and
num_nonparticipate > 0;
Can you check this?
WITH sample_table(ChangedOn,Status,ID)AS(
SELECT CONVERT(DATETIME,'02/01/2017 15:23'),'Terminated',1880 UNION ALL
SELECT '02/01/2017 17:24','Participated',1880 UNION ALL
SELECT '02/01/2017 09:00','Other',1880 UNION ALL
SELECT '01/31/2017 01:00','Terminated',1700 UNION ALL
SELECT '01/31/2017 02:00','Participated',1700 UNION ALL
SELECT '01/31/2017 03:00','Other',1700 UNION ALL
SELECT '01/31/2017 02:00','Participated',1600
)
SELECT ID FROM (
SELECT *
,ROW_NUMBER()OVER(PARTITION BY ID,CONVERT(VARCHAR,ChangedOn,112) ORDER BY ChangedOn) AS rn
,COUNT(0)OVER(PARTITION BY ID,CONVERT(VARCHAR,ChangedOn,112)) AS cnt
,CASE WHEN Status<>'Participated' THEN 1 ELSE 0 END AS ss
,SUM(CASE WHEN Status!='Participated' THEN 1 ELSE 0 END)OVER(PARTITION BY ID,CONVERT(VARCHAR,ChangedOn,112)) AS OtherStatusCnt
FROM sample_table
) AS t WHERE t.rn=t.cnt AND t.Status='Participated' AND t.OtherStatusCnt>0
--Return:
1880
try this with other sample data,
declare #t table(ChangedOn datetime,Status varchar(50),ID int)
insert into #t VALUES
('02/01/17 15:23', 'Terminated' ,1880)
,('02/01/17 17:24', 'Participated' ,1880)
,('02/01/17 09:00', 'Other' ,1880)
,('01/31/17 01:00', 'Terminated' ,1700)
,('01/31/17 02:00', 'Participated' ,1700)
,('01/31/17 03:00', 'Other' ,1700)
,('01/31/17 02:00', 'Participated' ,1600)
;
WITH CTE
AS (
SELECT *
,row_number() OVER (
PARTITION BY id
,cast(ChangedOn AS DATE) ORDER BY ChangedOn DESC
) AS seqnum
FROM #t
)
SELECT *
FROM cte c
WHERE seqnum = 1
AND STATUS = 'Participated'
AND EXISTS (
SELECT id
FROM cte c1
WHERE seqnum > 1
AND c.id = c1.id
)
2nd query,this is better
here CTE is same
SELECT *
FROM cte c
WHERE seqnum = 1
AND STATUS = 'Participated'
AND EXISTS (
SELECT id
FROM cte c1
WHERE STATUS != 'Participated'
AND c.id = c1.id
)

Getting rid of grouping field

Is there a safe way to not have to group by a field when using an aggregate in another field? Here is my example
SELECT
C.CustomerName
,D.INDUSTRY_CODE
,CASE WHEN D.INDUSTRY_CODE IN ('003','004','005','006','007','008','009','010','017','029')
THEN 'PM'
WHEN UPPER(CustomerName) = 'ULINE INC'
THEN 'ULINE'
ELSE 'DR'
END AS BU
,ISNULL((SELECT SUM(GrossAmount)
where CONVERT(date,convert(char(8),InvoiceDateID )) between DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 1, 0) and DATEADD(year, -1, GETDATE())),0) [PREVIOUS YEAR GROSS]
FROM factMargins A
LEFT OUTER JOIN dimDate B ON A.InvoiceDateID = B.DateId
LEFT OUTER JOIN dimCustomer C ON A.CustomerID = C.CustomerId
LEFT OUTER JOIN CRCDATA.DBO.CU10 D ON D.CUST_NUMB = C.CustomerNumber
GROUP BY
C.CustomerName,D.INDUSTRY_CODE
,A.InvoiceDateID
order by CustomerName
before grouping I was only getting 984 rows but after grouping by the A.InvoiceDateId field I am getting over 11k rows. The rows blow up since there are multiple invoices per customer. Min and Max wont work since then it will pull data incorrectly. Would it be best to let my application (crystal) get rid of the extra lines? Usually I like to have my base data be as close as possible to how the report will layout if possible.
Try moving the reference to InvoiceDateID to within an aggregate function, rather than within a selected subquery's WHERE clause.
In Oracle, here's an example:
with TheData as (
select 'A' customerID, 25 AMOUNT , trunc(sysdate) THEDATE from dual union
select 'B' customerID, 35 AMOUNT , trunc(sysdate-1) THEDATE from dual union
select 'A' customerID, 45 AMOUNT , trunc(sysdate-2) THEDATE from dual union
select 'A' customerID, 11000 AMOUNT , trunc(sysdate-3) THEDATE from dual union
select 'B' customerID, 12000 AMOUNT , trunc(sysdate-4) THEDATE from dual union
select 'A' customerID, 15000 AMOUNT , trunc(sysdate-5) THEDATE from dual)
select
CustomerID,
sum(amount) as "AllRevenue"
sum(case when thedate<sysdate-3 then amount else 0 end) as "OlderRevenue",
from thedata
group by customerID;
Output:
CustomerID | AllRevenue | OlderRevenue
A | 26070 | 26000
B | 12035 | 12000
This says:
For each customerID
I want the sum of all amounts
and I want the sum of amounts earlier than 3 days ago