Aggregation disables window function capability - sql

I am trying to re-write the query where I am joining the query on itself:
select count(distinct case when cancelled_client_id is null and year(RUM.first_date) = year(date) and RUM.first_date <= RUL.date then user_id
when cancelled_client_id is null and year(coalesce(RUM.first_date,RUR.first_date)) = year(date)
and coalesce(RUM.first_date,RUR.first_date) <= RUL.date then user_id end) as
from RUL
left join
(
select enrolled_client_id, min(date) as first_date
from RUL
where enrolled_client_id is not null
group by enrolled_client_id
) RUR on RUR.enrolled_client_id=RUL.enrolled_client_id
left join
(
select managed_client_id, min(date) as first_date
from RUL
where managed_client_id is not null
group by managed_client_id
) RUM on RUM.managed_client_id=RUL.managed_client_id
Using window functions:
count(distinct case when cancelled_client_id is null
and year(min(case when enrolled_client_id is not null then date end) over(partition by enrolled_client_id)) = year(date)
and min(case when enrolled_client_id is not null then date end) over(partition by enrolled_client_id) <= date
then user_id
when cancelled_client_id_rev is null
and year(coalesce(
min(case when enrolled_client_id is not null then date end) over(partition by enrolled_client_id),
min(case when managed_client_id is not null then date end) over(partition by managed_client_id))) = year(date)
and coalesce(
min(case when enrolled_client_id is not null then date end) over(partition by enrolled_client_id),
min(case when managed_client_id is not null then date end) over(partition by managed_client_id)) <= date
then user_id end)
from RUL
However I am getting an error that "Windowed functions cannot be used in the context of another windowed function or aggregate" due to the count(distinct min). Any work-arounds?

I have no idea what the count(distinct) is supposed to be doing, but you can simplify the code to:
select count(distinct case when cancelled_client_id is null and
                                year(rum_first_date) = year(rul.date) and
                                rum_first_date <= rul.date
                           then user_id
                           when cancelled_client_id is null and
                                year(coalesce(rum_first_date, rur_first_date)) = year(rul.date) and
                                coalesce(rum_first_date, rur_first_date) <= rul.date
                           then user_id
                      end) as . . .
from (select RUL.*,
             -- NULL ids never match in the original joins, so their first date stays NULL
             (case when enrolled_client_id is not null
                   then min(date) over (partition by enrolled_client_id)
              end) as rur_first_date,
             (case when managed_client_id is not null
                   then min(date) over (partition by managed_client_id)
              end) as rum_first_date
      from RUL
     ) RUL
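The pattern in this answer (compute the window function in a derived table, then aggregate over its result in the outer query) can be tried on a toy table. The sketch below is only an illustration under several assumptions: it runs on Python's built-in sqlite3 module rather than SQL Server, the RUL table has just three made-up columns and five invented rows, and strftime('%Y', ...) stands in for year().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table RUL (user_id int, managed_client_id int, date text);
    insert into RUL values
        (1, 10, '2019-01-05'),
        (1, 10, '2019-03-01'),
        (2, 10, '2019-02-01'),
        (4, 20, '2018-12-31'),
        (3, 20, '2019-01-15');
""")

# The window function lives in the derived table; the outer query only
# aggregates over its result, so no window function is nested in count().
query = """
    select count(distinct case when strftime('%Y', rum_first_date) = strftime('%Y', date)
                                and rum_first_date <= date
                               then user_id end) as users
    from (select RUL.*,
                 min(date) over (partition by managed_client_id) as rum_first_date
          from RUL
         ) RUL
"""
users = conn.execute(query).fetchone()[0]
print(users)
```

User 3 is not counted here because client 20 was first seen in 2018 while that user's only row is from 2019.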

Related

SQL QUERY TO SHOW FIRST IN AND FIRST OUT

My query looks like this:
select u.name, (case when IOType=0 then format(Edatetime,'hh:mm tt') end) as 'IN' ,
(case when IOType=1 then format(Edatetime,'hh:mm tt') end) as 'out'
from TestZEOTRA T
inner join Mx_UserMst U on t.UsrRefCode=u.UserID
where UsrRefCode='1506' and CAST(Edatetime as date)='28 OCT 2019'
The output I am getting:
But I need to show the result like this:
Assuming that your ins and outs are interleaved, you can use row_number() and aggregation:
select name, min([in]) as [in], min([out]) as [out]
from (select t.*,
             row_number() over (partition by name, [out] order by [in]) as seqnum_in,
             row_number() over (partition by name, [in] order by [out]) as seqnum_out
      from t
     ) t
group by name, (case when [in] is null then seqnum_out else seqnum_in end);
For your particular query, this is a little simpler:
select u.name,
       min(case when t.IOType = 0 then format(t.Edatetime, 'hh:mm tt') end) as in_time,
       max(case when t.IOType = 1 then format(t.Edatetime, 'hh:mm tt') end) as out_time
from Mx_UserMst u join
     (select t.*,
             row_number() over (partition by t.UsrRefCode, t.IOType order by t.Edatetime) as seqnum
      from TestZEOTRA t
      where t.UsrRefCode = '1506' and
            cast(t.Edatetime as date) = '2019-10-28'
     ) t
     on t.UsrRefCode = u.UserID
group by u.UserID, u.name, t.seqnum
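To see how numbering the rows per user and IO type pairs the n-th IN with the n-th OUT, here is a minimal sketch. It assumes Python's built-in sqlite3 module instead of SQL Server, and a hypothetical events table with invented rows; it also skips the format() call and the join to the user table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table events (user_id text, iotype int, edatetime text);
    insert into events values
        ('1506', 0, '2019-10-28 09:00'),
        ('1506', 1, '2019-10-28 12:00'),
        ('1506', 0, '2019-10-28 13:00'),
        ('1506', 1, '2019-10-28 18:00');
""")

# seqnum restarts for each (user, iotype), so the first IN and the first OUT
# both get seqnum 1 and collapse into one output row.
query = """
    select user_id,
           min(case when iotype = 0 then edatetime end) as in_time,
           max(case when iotype = 1 then edatetime end) as out_time
    from (select e.*,
                 row_number() over (partition by user_id, iotype
                                    order by edatetime) as seqnum
          from events e
         ) e
    group by user_id, seqnum
    order by seqnum
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

The pairing is only meaningful when the ins and outs really are interleaved; a missing OUT would shift every later pair.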

MSSQL Group by and Select rows from grouping

I'm trying to figure out if what I'm trying to do is possible. Instead of resorting to multiple queries on a table, I wanted to group the records by business date and id then group by the id and select one date for a field and another date for the other field.
SELECT
*
{AMOUNT FROM DATE}
{AMOUNT FROM OTHER DATE}
FROM (
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
) AS subquery
GROUP BY id
It seems that you're looking to do a pivot query. I usually use cross tabs for this. Based on the query you posted, it could look like:
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM (
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
)AS subquery
GROUP BY id;
You could also use a CTE.
WITH CTE AS(
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
)
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM CTE
GROUP BY id;
Or even be a rebel and do the operation directly.
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM table
GROUP BY id;
However, some people have benchmarked this and found that pre-aggregating first, as in the subquery and CTE versions, can improve performance.
If I understand you correctly, then you're just trying to pivot, but only with two particular dates:
select id,
date1 = sum(iif(date = '2000-01-01', amount, null)),
date2 = sum(iif(date = '2000-01-02', amount, null))
from [table]
group by id
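The cross-tab (conditional aggregation) approach from these answers can be checked on a few rows. This is a rough sketch, assuming Python's built-in sqlite3 module and a hypothetical amounts table with invented data; the two dates are the ones used in the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table amounts (id int, date text, amount int);
    insert into amounts values
        (1, '2019-09-01', 10),
        (1, '2019-09-01', 5),
        (1, '2019-10-01', 7),
        (2, '2019-10-01', 3);
""")

# Each SUM only sees the rows for its own date; every other row contributes 0.
query = """
    select id,
           sum(case when date = '2019-09-01' then amount else 0 end) as amount_sept01,
           sum(case when date = '2019-10-01' then amount else 0 end) as amount_oct01
    from amounts
    group by id
    order by id
"""
pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

With ELSE 0 an id that has no rows on a date shows 0; the iif(..., amount, null) variant would show NULL instead.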

Hive rolling sum of data over date

I am working on Hive and am facing an issue with rolling counts. The sample data I am working on is as shown below:
and the output I am expecting is as shown below:
I tried using the following query but it is not returning the rolling count:
select event_dt,status, count(distinct account) from
(select *, row_number() over (partition by account order by event_dt
desc)
as rnum from table.A
where event_dt between '2018-05-02' and '2018-05-04') x where rnum =1
group by event_dt, status;
Please help me with this if someone has solved a similar issue.
You seem to just want conditional aggregation:
select event_dt,
sum(case when status = 'Registered' then 1 else 0 end) as registered,
sum(case when status = 'active_acct' then 1 else 0 end) as active_acct,
sum(case when status = 'suspended' then 1 else 0 end) as suspended,
sum(case when status = 'reactive' then 1 else 0 end) as reactive
from table.A
group by event_dt
order by event_dt;
EDIT:
This is a tricky problem. The solution I've come up with does a cross-product of dates and users and then calculates the most recent status as of each date.
So:
select a.event_dt,
sum(case when aa.status = 'Registered' then 1 else 0 end) as registered,
sum(case when aa.status = 'active_acct' then 1 else 0 end) as active_acct,
sum(case when aa.status = 'suspended' then 1 else 0 end) as suspended,
sum(case when aa.status = 'reactive' then 1 else 0 end) as reactive
from (select d.event_dt, ac.account, a.status,
max(case when a.status is not null then a.timestamp end) over (partition by ac.account order by d.event_dt) as last_status_timestamp
from (select distinct event_dt from table.A) d cross join
(select distinct account from table.A) ac left join
(select a.*,
row_number() over (partition by account, event_dt order by timestamp desc) as seqnum
from table.A a
) a
on a.event_dt = d.event_dt and
a.account = ac.account and
a.seqnum = 1 -- get the last one on the date
) a left join
table.A aa
on aa.timestamp = a.last_status_timestamp and
aa.account = a.account
group by a.event_dt
order by a.event_dt;
What this is doing is creating a derived table with rows for all accounts and dates. This has the status on certain days, but not all days.
The cumulative max for last_status_timestamp calculates the most recent timestamp that has a valid status. This is then joined back to the table to get the status on that date. Voila! This is the status used for the conditional aggregation.
The cumulative max and join is a work-around because Hive does not (yet?) support the ignore nulls option in lag().
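The carry-forward idea described above (cross join of dates and accounts, cumulative max of the timestamp, then a join back for the status) can be sketched on a tiny data set. Everything here is an assumption for illustration: it runs on Python's built-in sqlite3 module rather than Hive, the status_log table and its rows are invented, timestamps are plain integers, and each account has at most one row per date, so the row_number() step is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table status_log (account text, event_dt text, ts int, status text);
    insert into status_log values
        ('a1', '2018-05-02', 1, 'Registered'),
        ('a2', '2018-05-03', 2, 'Registered'),
        ('a1', '2018-05-04', 3, 'suspended');
""")

query = """
    select g.event_dt,
           sum(case when s.status = 'Registered' then 1 else 0 end) as registered,
           sum(case when s.status = 'suspended' then 1 else 0 end) as suspended
    from (select d.event_dt, ac.account,
                 -- cumulative max: latest timestamp seen up to this date
                 max(l.ts) over (partition by ac.account
                                 order by d.event_dt) as last_ts
          from (select distinct event_dt from status_log) d
          cross join (select distinct account from status_log) ac
          left join status_log l
                 on l.event_dt = d.event_dt and l.account = ac.account
         ) g
    left join status_log s
           on s.ts = g.last_ts and s.account = g.account
    group by g.event_dt
    order by g.event_dt
"""
rolling = conn.execute(query).fetchall()
for row in rolling:
    print(row)
```

Account a1 keeps counting as Registered on 2018-05-03, a date on which it has no row, because its last known timestamp is carried forward.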

Window Functions

I want to add a window function.
Take the min date where Visit = Y and return the AssocId for that date.
TableA
ID  Date    AssocId  Visit
1   1/1/17  10101    Y
1   1/2/17  10102    Y
End result:
ID  Date    AssocId
1   1/1/17  10101
SQL: this gives me the min date, but I need the AssocId associated with that date.
SELECT MIN(CASE WHEN A.VISIT = 'Y'
THEN A.DATE END) OVER (PARTITION BY ID)
AS MIN_DT,
You can use FIRST_VALUE():
SELECT MIN(CASE WHEN A.VISIT = 'Y' THEN A.DATE END) OVER (PARTITION BY ID) AS MIN_DT,
FIRST_VALUE(CASE WHEN A.VISIT = 'Y' THEN A.ASSOCID END) OVER (PARTITION BY ID ORDER BY A.VISIT DESC, A.DATE ASC),
Note that this is a little tricky with conditional operations. I would be more inclined to use a subquery to nest the query operations. The outer expression would be:
SELECT MAX(CASE WHEN Date = MIN_DT THEN ASSOCID END) OVER (PARTITION BY ID)
If you wanted this per ID, I would suggest:
select id, min(date) as min_dt,
       min(associd) keep (dense_rank first order by date) as associd
from t
where visit = 'Y'
group by id;
That is, use aggregation functions.
You seem to want:
select t.*
from table t
where t.visit = 'Y' and
      t.date = (select min(t1.date)
                from table t1
                where t1.id = t.id and t1.visit = 'Y');
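For a quick check of the FIRST_VALUE() idea on the sample rows, here is a small sketch. It assumes Python's built-in sqlite3 module rather than the asker's database, a hypothetical visits table, ISO-formatted dates so that text ordering matches date ordering, and one extra invented Visit = 'N' row to show the filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table visits (id int, date text, assoc_id int, visit text);
    insert into visits values
        (1, '2016-12-31', 10100, 'N'),
        (1, '2017-01-01', 10101, 'Y'),
        (1, '2017-01-02', 10102, 'Y');
""")

# After filtering to visit = 'Y', the first row per id by date carries both
# the minimum date and the AssocId that belongs to it.
query = """
    select distinct id,
           min(date) over (partition by id) as min_dt,
           first_value(assoc_id) over (partition by id order by date) as assoc_id
    from visits
    where visit = 'Y'
"""
first_visits = conn.execute(query).fetchall()
print(first_visits)
```

The DISTINCT collapses the identical per-row results into one row per id.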

SQL query min and max group by flag

I have a table as below :
How can I craft a SQL select statement so that MIN AND MAX EVENT DATE groups results by FLAG (0,1)?
So the result would be:
Just do conditional aggregation together with a window function:
SELECT card_no, descr_reader,
max(CASE WHEN flag = 0 THEN event_date END) date_in,
max(CASE WHEN flag = 1 THEN event_date END) date_out
FROM
(
SELECT *,
COUNT(flag) OVER (PARTITION BY flag ORDER BY id) Seq
FROM table t
)t
GROUP BY card_no, descr_reader, Seq
An alternative if window functions are not available:
SELECT
t1.card_no, t1.descr_reader,
t1.event_date date_in,
(select top 1 event_date from test t2
where t2.card_no = t1.card_no and
t2.reader_no = t1.reader_no and
t2.descr_reader = t1.descr_reader and
t2.event_date > t1.event_date and
t2.flag = 1
order by t2.event_date ) as date_out
FROM test t1
WHERE t1.flag = 0