How to rewrite a query without a self-join - google-bigquery

I have a view that runs the following query:
SELECT DISTINCT R.Id AS Id
, R.app
, case when R2.EventTypeId IS NOT NULL THEN 1 END AS type
FROM user R
left join user R2 on R.Id = R2.Id
and R2.EventTypeId > 0
and R2.Date > '2022-10-16'
WHERE R.Date between '2022-10-16' and '2022-10-30'
AND R.EventTypeId = 0
Is there any way to rewrite it without the self-join?
Date is the partitioning field, and the table has the require-partition-filter option turned on.

Please use a window function. Please check that all conditions are included.
SELECT Id, app,
max(type) AS type
FROM (
SELECT *,
max(CASE WHEN EventTypeId > 0 AND Date > '2022-10-16' THEN 1 END) OVER (PARTITION BY Id) AS type
FROM user R
WHERE Date >= '2022-10-16' # covers the displayed rows and the later events; the original join puts no upper bound on R2.Date
)
WHERE Date BETWEEN '2022-10-16' AND '2022-10-30' # limits only the displayed rows
AND EventTypeId = 0
GROUP BY 1, 2
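To check that the window version really returns the same rows as the self-join, here is a minimal sketch that runs both against toy data. It uses Python's built-in sqlite3 rather than BigQuery, the table name user_events and the sample rows are made up for illustration, and the inner filter is `Date >= '2022-10-16'` so that events after 2022-10-30 still count, as they do in the original join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the `user` table from the question.
conn.execute(
    "CREATE TABLE user_events (Id INTEGER, app TEXT, EventTypeId INTEGER, Date TEXT)"
)
conn.executemany(
    "INSERT INTO user_events VALUES (?, ?, ?, ?)",
    [
        (1, "a", 0, "2022-10-17"),
        (1, "a", 3, "2022-10-20"),  # later event, so type = 1
        (2, "b", 0, "2022-10-18"),  # no EventTypeId > 0 row, so type is NULL
        (3, "c", 0, "2022-10-19"),
        (3, "c", 2, "2022-11-05"),  # outside the displayed range, still counts
        (4, "d", 5, "2022-10-21"),  # no EventTypeId = 0 row, so not displayed
    ],
)

SELF_JOIN = """
SELECT DISTINCT R.Id, R.app,
       CASE WHEN R2.EventTypeId IS NOT NULL THEN 1 END AS type
FROM user_events R
LEFT JOIN user_events R2
  ON R.Id = R2.Id AND R2.EventTypeId > 0 AND R2.Date > '2022-10-16'
WHERE R.Date BETWEEN '2022-10-16' AND '2022-10-30'
  AND R.EventTypeId = 0
ORDER BY R.Id
"""

WINDOWED = """
SELECT Id, app, MAX(type) AS type
FROM (
  SELECT Id, app, EventTypeId, Date,
         MAX(CASE WHEN EventTypeId > 0 AND Date > '2022-10-16' THEN 1 END)
           OVER (PARTITION BY Id) AS type
  FROM user_events
  WHERE Date >= '2022-10-16'
) AS t
WHERE Date BETWEEN '2022-10-16' AND '2022-10-30'
  AND EventTypeId = 0
GROUP BY Id, app
ORDER BY Id
"""

self_join_rows = conn.execute(SELF_JOIN).fetchall()
windowed_rows = conn.execute(WINDOWED).fetchall()
print(self_join_rows)
print(windowed_rows)
```

Both queries should print the same three rows; only the second one reads the table once.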

Related

CASE WHEN expression in oracle

I have built a CASE WHEN expression to show me when someone has had no system transaction under their user id for anything greater than 15 minutes.
case
when MOD_DATE_TIME > prior MOD_DATE_TIME+(1/96)
and user_id = prior user_id
then round((MOD_DATE_TIME - prior MOD_DATE_TIME)*1440,2)
else null
end as TIME_GAP
The above is only the CASE WHEN expression. The full query is below:
select
user_id, MENU_OPTN_NAME, MOD_DATE_TIME,
case
when MOD_DATE_TIME > prior MOD_DATE_TIME+(1/96) and user_id = prior user_id then round((MOD_DATE_TIME - prior MOD_DATE_TIME)*1440,2)
else null
end as TIME_GAP
from
(select
ptt.user_id, MENU_OPTN_NAME, ptt.MOD_DATE_TIME,
row_number() over (partition by ptt.user_id order by ptt.MOD_DATE_TIME) seq
from PROD_TRKG_TRAN ptt
join cd_master cm on
ptt.cd_master_id = cm.cd_master_id
Where
MENU_OPTN_NAME = 'Cycle Cnt {Reserve}' --CHANGE BASED ON WHAT YOU'RE TRACKING... MENU NAMES AT THE BOTTOM
and ptt.user_id = 'LLEE1' --CHANGE BASED ON WHO YOU WANT TO TRACK FOR A GAP...
and cm.cd_master_id =
(select cd_master_id
from cd_master
where
co = '&CO'
and div = '&DIV')
and ptt.create_date_time >=
/*Today*/ trunc(sysdate)
--/*This Week*/ trunc(sysdate-(to_char(sysdate,'D')-1))
--/*This Month*/ trunc(sysdate)-(to_char(sysdate,'DD')-1)
--/*Date Range*/ '&FromDate' and ptt.create_date_time-1 < '&ToDate'
--group by ptt.user_id
)cc
CONNECT BY
user_id = prior user_id
and seq = prior seq+1
start with
seq = 1
What I want the query to do is in the CASE WHEN clause. I want it to account for the shift start time being 4pm, so if they don't do anything till 4:41PM then I would see that listed as a 41 min gap.

Case statement based in max min dates

I have the columns Memnumber, activity type, activity date and activity ID. One member can have several activities, a few days apart. I want to write a case statement that labels the earliest activity date as INITIAL, the most recent one as MR, and any activity between those two dates as BETWEEN. The rows need to be grouped by Memnumber and activity type.
I wrote this query:
--MR County Tree
SELECT T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T1.MR_CY17,
T1.IN_CY17,
T0.ACTIVITY_DATE,
(T0.ACTIVITYID)
FROM DLA_EXTRACT_FINAL T0
INNER JOIN (
SELECT MEMBERNUMBER,
ACTIVITYTYPE,
MAX(ACTIVITY_DATE) MR_CY17,
MIN(ACTIVITY_DATE) IN_CY17
FROM DLA20_EXTRACT_FINAL
WHERE to_char(ACTIVITY_DATE, 'YYYYMMDD') >= 20170101
AND to_char(ACTIVITY_DATE, 'YYYYMMDD') <= 20171231
GROUP BY MEMBERNUMBER,
ACTIVITYTYPE
) T1 ON T0.MEMBERNUMBER = T1.MEMBERNUMBER
AND T0.ACTIVITYTYPE = T1.ACTIVITYTYPE
AND T0.ACTIVITY_DATE = T1.MR_CY17
--where T0.ACTIVITYTYPE='MT'
WHERE t0.MEMBERNUMBER = 'M500085268'
GROUP BY T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T1.MR_CY17,
T1.IN_CY17,
T0.ACTIVITYID,
T0.ACTIVITY_DATE
ORDER BY T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T1.MR_CY17,
T1.IN_CY17
Looking for a solution.
You want to use window functions. Something like:
SELECT T0.MEMBERNUMBER,
T0.ACTIVITYTYPE,
T0.ACTIVITY_DATE,
T0.ACTIVITYID,
case when row_number() over (partition by T0.MEMBERNUMBER, T0.ACTIVITYTYPE
order by T0.ACTIVITY_DATE) = 1 then 1 else 0 end most_initial,
case when row_number() over (partition by T0.MEMBERNUMBER, T0.ACTIVITYTYPE
order by T0.ACTIVITY_DATE desc) = 1 then 1 else 0 end most_recent
FROM DLA_EXTRACT_FINAL T0
Then you can use case statements to label a row as INITIAL if most_initial = 1, MR if most_recent = 1, or BETWEEN if both are 0.
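A minimal sketch of that labelling step, run through Python's built-in sqlite3 on made-up rows. The table name activities, the sample data and the aliases rn_first and rn_last are assumptions for illustration. A member with a single activity is both earliest and latest; the CASE order below labels it INITIAL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for DLA_EXTRACT_FINAL.
conn.execute(
    "CREATE TABLE activities ("
    "MEMBERNUMBER TEXT, ACTIVITYTYPE TEXT, ACTIVITY_DATE TEXT, ACTIVITYID INTEGER)"
)
conn.executemany(
    "INSERT INTO activities VALUES (?, ?, ?, ?)",
    [
        ("M1", "MT", "2017-01-10", 1),
        ("M1", "MT", "2017-03-05", 2),
        ("M1", "MT", "2017-09-20", 3),
        ("M2", "MT", "2017-02-01", 4),
        ("M2", "MT", "2017-06-15", 5),
        ("M3", "MT", "2017-04-01", 6),  # only activity for this member
    ],
)

LABEL_QUERY = """
SELECT ACTIVITYID,
       CASE WHEN rn_first = 1 THEN 'INITIAL'
            WHEN rn_last = 1 THEN 'MR'
            ELSE 'BETWEEN' END AS label
FROM (
  SELECT ACTIVITYID,
         ROW_NUMBER() OVER (PARTITION BY MEMBERNUMBER, ACTIVITYTYPE
                            ORDER BY ACTIVITY_DATE) AS rn_first,
         ROW_NUMBER() OVER (PARTITION BY MEMBERNUMBER, ACTIVITYTYPE
                            ORDER BY ACTIVITY_DATE DESC) AS rn_last
  FROM activities
) AS ranked
ORDER BY ACTIVITYID
"""

labels = conn.execute(LABEL_QUERY).fetchall()
print(labels)
```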

What's the proper SQL query to find a 'status change' before given date?

I have a table of logged 'status changes'. I need to find the latest status change for a user, and if it was a) a certain 'type' of status change (s.new_status_id), and b) greater than 7 days old (s.change_date), then include it in the results. My current query sometimes returns the second-to-latest status change for a given user, which I don't want -- I only want to evaluate the last one.
How can I modify this query so that it will only include a record if it is the most recent status change for that user?
Query
SELECT DISTINCT ON (s.applicant_id) s.applicant_id, a.full_name, a.email_address, u.first_name, s.new_status_id, s.change_date, a.applied_class
FROM automated_responses_statuschangelogs s
INNER JOIN application_app a on (a.id = s.applicant_id)
INNER JOIN accounts_siuser u on (s.person_who_modified_id = u.id)
WHERE now() - s.change_date > interval '7' day
AND s.new_status_id IN
(SELECT current_status
FROM application_status
WHERE status_phase_id = 'In The Flow'
)
ORDER BY s.applicant_id, s.change_date DESC, s.new_status_id, s.person_who_modified_id;
You can use row_number() to filter one entry per applicant:
select *
from (
select row_number() over (partition by applicant_id
order by change_date desc) rn
, *
from automated_responses_statuschangelogs
) as lc
join application_app a
on a.id = lc.applicant_id
join accounts_siuser u
on lc.person_who_modified_id = u.id
join application_status stat
on lc.new_status_id = stat.current_status
where lc.rn = 1
and stat.status_phase_id = 'In The Flow'
and lc.change_date < now() - interval '7' day
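The point of ranking first and filtering afterwards can be seen on a small example. This is a sketch using Python's built-in sqlite3, not PostgreSQL; the table status_log, the status values and the fixed cutoff date (standing in for now() minus 7 days) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of automated_responses_statuschangelogs.
conn.execute(
    "CREATE TABLE status_log (applicant_id INTEGER, new_status_id TEXT, change_date TEXT)"
)
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [
        (1, "flow", "2024-01-01"),   # old qualifying change...
        (1, "other", "2024-01-14"),  # ...but this is the latest one
        (2, "flow", "2024-01-02"),   # latest, qualifying, older than the cutoff
        (3, "flow", "2024-01-13"),   # latest and qualifying, but too recent
    ],
)
cutoff = "2024-01-08"  # plays the role of now() - interval '7' day

# Filtering before picking the latest row keeps applicant 1 by mistake.
FILTER_FIRST = """
SELECT applicant_id
FROM status_log
WHERE new_status_id = 'flow' AND change_date < ?
GROUP BY applicant_id
ORDER BY applicant_id
"""

# Ranking first means only the most recent change per applicant is evaluated.
RANK_FIRST = """
SELECT applicant_id
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY applicant_id
                            ORDER BY change_date DESC) AS rn
  FROM status_log
) AS lc
WHERE rn = 1 AND new_status_id = 'flow' AND change_date < ?
ORDER BY applicant_id
"""

filter_first = conn.execute(FILTER_FIRST, (cutoff,)).fetchall()
rank_first = conn.execute(RANK_FIRST, (cutoff,)).fetchall()
print(filter_first)
print(rank_first)
```

Applicant 1 shows up only in the filter-first result, which is the second-to-latest problem described in the question.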

SQL Query in CRM Report

A "Case" in CRM has a field called "Status" with four options.
I'm trying to build a report in CRM that fills a table with every week of the year (each row is a different week), and then counts the number of cases that have each Status option (the columns would be each of the Status options).
The table would look like this
        Status 1  Status 2  Status 3
Week 1  3         55        4
Week 2  5         23        5
Week 3  14        11        33
So far I have the following:
SELECT
SUM(case WHEN status = 1 then 1 else 0 end) Status1,
SUM(case WHEN status = 2 then 1 else 0 end) Status2,
SUM(case WHEN status = 3 then 1 else 0 end) Status3,
SUM(case WHEN status = 4 then 1 else 0 end) Status4,
SUM(case WHEN status = 5 then 1 else 0 end) Status5
FROM [DB].[dbo].[Contact]
Which gives me the following:
Status 1  Status 2  Status 3
2         43        53
Now I need to somehow split this into 52 rows for the past year and filter these results by date (columns in the Contact table). I'm a bit new to SQL queries and CRM - any help here would be much appreciated.
Here is a SQLFiddle with my progress and sample data: http://sqlfiddle.com/#!2/85b19/1
Sounds like you want to group by a range. The trick is to create a new field that represents each range (for you, one per week) and group by that.
Since it also seems like you want an infinite range of dates, marc_s has a good summary for how to do the group by trick with dates in a generic way: SQL group by frequency within a date range
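As a rough sketch of the "derive a range field and group by it" idea, the conditional sums from the question can be grouped by a computed week key. This runs on Python's built-in sqlite3 rather than SQL Server; the table cases, the column created_on and the sample rows are hypothetical, and SQLite's strftime('%W') (weeks starting on Monday) stands in for whatever week definition the report needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per case, with a status and a date column.
conn.execute("CREATE TABLE cases (status INTEGER, created_on TEXT)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [
        (1, "2023-01-02"),
        (2, "2023-01-03"),
        (2, "2023-01-05"),
        (1, "2023-01-10"),
        (3, "2023-01-11"),
    ],
)

WEEKLY_COUNTS = """
SELECT strftime('%Y-%W', created_on) AS year_week,  -- the derived range field
       SUM(CASE WHEN status = 1 THEN 1 ELSE 0 END) AS Status1,
       SUM(CASE WHEN status = 2 THEN 1 ELSE 0 END) AS Status2,
       SUM(CASE WHEN status = 3 THEN 1 ELSE 0 END) AS Status3
FROM cases
GROUP BY year_week
ORDER BY year_week
"""

weekly = conn.execute(WEEKLY_COUNTS).fetchall()
for row in weekly:
    print(row)
```

Note that this only returns weeks that contain at least one case; weeks with no activity need a calendar table to join against.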
So, let's break this down:
You want to make a report that shows, for each contact, a breakdown, week by week, of the number of cases registered to that contact, which is divided into three columns, one for each StateCode.
If this is the case, then you would need to have 52 date records (or so) for each contact. For calendar like requests, it's always good to have a separate calendar table that lets you query from it. Dan Guzman has a blog entry that creates a useful calendar table which I'll use in the query.
WITH WeekNumbers AS
(
SELECT
FirstDateOfWeek,
-- order by first date of week, grouping calendar year to produce week numbers
WeekNumber = row_number() OVER (PARTITION BY CalendarYear ORDER BY FirstDateOfWeek)
FROM
master.dbo.Calendar -- created from script
GROUP BY
FirstDateOfWeek,
CalendarYear
), Calendar AS
(
SELECT
WeekNumber =
(
SELECT
WeekNumber
FROM
WeekNumbers WN
WHERE
C.FirstDateOfWeek = WN.FirstDateOfWeek
),
*
FROM
master.dbo.Calendar C
WHERE
CalendarDate BETWEEN '1/1/2012' AND getutcdate()
)
SELECT
C.FullName,
----include the below if the data is necessary
--Cl.WeekNumber,
--Cl.CalendarYear,
--Cl.FirstDateOfWeek,
--Cl.LastDateOfWeek,
'Week: ' + CAST(Cl.WeekNumber AS VARCHAR(20))
+ ', Year: ' + CAST(Cl.CalendarYear AS VARCHAR(20)) WeekNumber
FROM
CRM.dbo.Contact C
-- use a cartesian join to produce a table list
CROSS JOIN
(
SELECT
DISTINCT WeekNumber,
CalendarYear,
FirstDateOfWeek,
LastDateOfWeek
FROM
Calendar
) Cl
ORDER BY
C.FullName,
Cl.WeekNumber
This is different from the solution Ben linked to because Marc's query only returns weeks where there is a matching value, whereas you may or may not want to see even the weeks where there is no activity.
Once you have your core tables of contacts split out week by week as in the above (or altered for your specific time period), you can simply add a subquery for each StateCode to see the breakdown in columns as in the final query below.
WITH WeekNumbers AS
(
SELECT
FirstDateOfWeek,
WeekNumber = row_number() OVER (PARTITION BY CalendarYear ORDER BY FirstDateOfWeek)
FROM
master.dbo.Calendar
GROUP BY
FirstDateOfWeek,
CalendarYear
), Calendar AS
(
SELECT
WeekNumber =
(
SELECT
WeekNumber
FROM
WeekNumbers WN
WHERE
C.FirstDateOfWeek = WN.FirstDateOfWeek
),
*
FROM
master.dbo.Calendar C
WHERE
CalendarDate BETWEEN '1/1/2012' AND getutcdate()
)
SELECT
C.FullName,
--Cl.WeekNumber,
--Cl.CalendarYear,
--Cl.FirstDateOfWeek,
--Cl.LastDateOfWeek,
'Week: ' + CAST(Cl.WeekNumber AS VARCHAR(20)) +', Year: ' + CAST(Cl.CalendarYear AS VARCHAR(20)) WeekNumber,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Active'
) ActiveCases,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Resolved'
) ResolvedCases,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Canceled'
) CancelledCases
FROM
CRM.dbo.Contact C
CROSS JOIN
(
SELECT
DISTINCT WeekNumber,
CalendarYear,
FirstDateOfWeek,
LastDateOfWeek
FROM
Calendar
) Cl
ORDER BY
C.FullName,
Cl.WeekNumber

How to determine if two records are 1 year apart (using a timestamp)

I need to analyze some weblogs and determine if a user has visited once, taken a year break, and visited again. I want to add a flag to every row (Y/N) with a VisitId that meets the above criteria.
How would I go about creating this sql?
Here are the fields I have, that I think need to be used (by analyzing the timestamp of the first page of each visit):
VisitID - each visit has a unique Id (e.g. 12356, 12345, 16459)
UserID - each user has one Id (e.g. steve = 1, ted = 2, mark = 12345, etc.)
TimeStamp - looks like this: 2010-01-01 00:32:30.000
select VisitID, UserID, TimeStamp from page_view_t where pageNum = 1;
Thanks - any help would be greatly appreciated.
You could rank every user's rows, then join the ranked row set to itself to compare adjacent rows:
;
WITH ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY TimeStamp)
FROM page_view_t
),
flagged AS (
SELECT
*,
IsReturnVisit = CASE
WHEN EXISTS (
SELECT *
FROM ranked
WHERE UserID = r.UserID
AND rnk = r.rnk - 1
AND TimeStamp <= DATEADD(YEAR, -1, r.TimeStamp)
)
THEN 'Y'
ELSE 'N'
END
FROM ranked r
)
SELECT
VisitID,
UserID,
TimeStamp,
IsReturnVisit
FROM flagged
Note: the above flags only return visits.
UPDATE
To flag the first visits same as return visits, the flagged CTE could be modified as follows:
…
SELECT
*,
IsFirstOrReturnVisit = CASE
WHEN p.UserID IS NULL OR r.TimeStamp >= DATEADD(YEAR, 1, p.TimeStamp)
THEN 'Y'
ELSE 'N'
END
FROM ranked r
LEFT JOIN ranked p ON r.UserID = p.UserID AND r.rnk = p.rnk + 1
…
References that might be useful:
WITH common_table_expression (Transact-SQL)
Ranking Functions (Transact-SQL)
ROW_NUMBER (Transact-SQL)
The other guy was faster, but since I took the time to do it and it's a completely different approach, I might as well post it :D.
SELECT pv2.VisitID,
pv2.UserID,
pv2.TimeStamp,
CASE WHEN pv1.VisitID IS NOT NULL
AND pv3.VisitID IS NULL
THEN 'YES' ELSE 'NO' END AS IsReturnVisit
FROM page_view_t pv2
LEFT JOIN page_view_t pv1 ON pv1.UserID = pv2.UserID
AND pv1.VisitID <> pv2.VisitID
AND (pv1.TimeStamp <= DATEADD(YEAR, -1, pv2.TimeStamp)
OR pv2.TimeStamp <= DATEADD(YEAR, -1, pv1.TimeStamp))
AND pv1.pageNum = 1
LEFT JOIN page_view_t pv3 ON pv1.UserID = pv3.UserID
AND (pv3.TimeStamp BETWEEN pv1.TimeStamp AND pv2.TimeStamp
OR pv3.TimeStamp BETWEEN pv2.TimeStamp AND pv1.TimeStamp)
AND pv3.pageNum = 1
WHERE pv2.pageNum = 1
Assuming the page_view_t table stores the UserID and TimeStamp of each visit, the following query will return users who have taken a break of at least a year (365 days) between two consecutive visits.
select t1.UserID
from page_view_t t1
where (
select datediff(day, max(t2.[TimeStamp]), t1.[TimeStamp])
from page_view_t t2
where t2.UserID = t1.UserID and t2.[TimeStamp] < t1.[TimeStamp]
group by t2.UserID
) >= 365
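The answers above compare each visit with the previous one through a self-join or a correlated subquery. A different technique, not used in those answers, is the LAG window function, which reads the previous row's timestamp directly. Below is a minimal sketch on Python's built-in sqlite3 with a made-up visits table that holds one row per visit (as if already filtered to pageNum = 1); SQLite's datetime(..., '+1 year') stands in for DATEADD(YEAR, 1, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with one row per visit.
conn.execute("CREATE TABLE visits (VisitID INTEGER, UserID INTEGER, TimeStamp TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (100, 1, "2010-01-01 00:32:30"),
        (101, 1, "2011-02-01 09:00:00"),  # more than a year after visit 100
        (102, 1, "2011-03-01 09:00:00"),
        (200, 2, "2010-05-01 12:00:00"),
        (201, 2, "2010-12-01 12:00:00"),  # only seven months after visit 200
    ],
)

RETURN_VISITS = """
SELECT VisitID,
       CASE WHEN prev_ts IS NOT NULL
             AND TimeStamp >= datetime(prev_ts, '+1 year')
            THEN 'Y' ELSE 'N' END AS IsReturnVisit
FROM (
  SELECT VisitID, TimeStamp,
         -- timestamp of the same user's previous visit, NULL for the first one
         LAG(TimeStamp) OVER (PARTITION BY UserID ORDER BY TimeStamp) AS prev_ts
  FROM visits
) AS v
ORDER BY VisitID
"""

flags = conn.execute(RETURN_VISITS).fetchall()
print(flags)
```

As in the first answer, this flags only the return visit itself, not the visit that preceded the break.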