Snowflake - invalid identifier 'LEVEL' - sql

When running this query, I'm getting an error in Snowflake: invalid identifier 'LEVEL'
Also, I had some help putting parts of the query together, and I'm confused about where the reference to 'Q' comes from. In the two 'from' clauses in the query, I don't understand what Q is pulling from, since I don't have a table called 'Q'.
WITH Q AS (SELECT LEVEL Q_LEVEL FROM DUAL A CONNECT BY PRIOR LEVEL <= 36),
Q1 AS (
select
Q.Q_LEVEL Q_LEVEL
, v_dept_history_adj.associate_id,
v_dept_history_adj.home_department_code,
v_dept_history_adj.position_effective_date
, max(position_effective_date)
OVER(PARTITION BY v_dept_history_adj.associate_id) AS most_recent_record
from datawarehouse.srctable, Q
where v_dept_history_adj.position_effective_date <=
last_day(date_from_parts(year(current_date()),
month(current_date())-Q.Q_LEVEL,1),month))
select
associate_id
, position_effective_date
, home_department_code,
most_recent_record
, (last_day(date_from_parts(year(current_date())
,month(current_date())-Q_LEVEL,1),month)) AS month
FROM Q1
where position_effective_date = most_recent_record
order by month desc, position_effective_date desc

So in Oracle
SELECT LEVEL
FROM DUAL A
CONNECT BY PRIOR LEVEL <= 36
would give the rows 1 to 36. In Snowflake you need to use:
select row_number() over(order by null) as q_level
from table(generator(ROWCOUNT=>36));
last_day(date_from_parts(year(current_date()),
month(current_date())-Q.Q_LEVEL,1),month)
is the same as:
last_day(dateadd(month, -q.q_level, CURRENT_DATE), month)
at which point that should be moved into the Q CTE for readability:
WITH Q AS (
select
row_number() over(order by null) as q_level,
last_day(dateadd(month, -q_level, CURRENT_DATE), month) as last_day_month
from table(generator(ROWCOUNT=>36))
), Q1 AS (
select
q.last_day_month
,v_dept_history_adj.associate_id
,v_dept_history_adj.home_department_code
,v_dept_history_adj.position_effective_date
,max(position_effective_date) OVER (PARTITION BY v_dept_history_adj.associate_id) AS most_recent_record
from datawarehouse.srctable
join Q
on v_dept_history_adj.position_effective_date <= q.last_day_month
)
select
associate_id
,position_effective_date
,home_department_code
,most_recent_record
,last_day_month AS month
FROM Q1
where position_effective_date = most_recent_record
order by month desc, position_effective_date desc
The question also asked where Q comes from. Q is the first of the two CTEs (Common Table Expressions) in the query; each CTE is a named block of the form:
WITH cte_name AS (
<valid sql selection>
)
From a logical perspective, the above SQL could be written as:
with cte_q as (
select
row_number() over(order by null) as q_level,
last_day(dateadd(month, -q_level, CURRENT_DATE), month) as last_day_month
from table(generator(ROWCOUNT=>36))
), cte_q1 as (
select
q.last_day_month
,v_dept_history_adj.associate_id
,v_dept_history_adj.home_department_code
,v_dept_history_adj.position_effective_date
,max(position_effective_date) over(partition by v_dept_history_adj.associate_id) AS most_recent_record
from datawarehouse.srctable
join cte_q as q
on v_dept_history_adj.position_effective_date <= q.last_day_month
)
select
associate_id
,position_effective_date
,home_department_code
,most_recent_record
,last_day_month AS month
from cte_q1
where position_effective_date = most_recent_record
order by month desc, position_effective_date desc
OR cte_q could be just a sub-query:
with cte_q1 as (
select
q.last_day_month
,v_dept_history_adj.associate_id
,v_dept_history_adj.home_department_code
,v_dept_history_adj.position_effective_date
,max(position_effective_date) over(partition by v_dept_history_adj.associate_id) AS most_recent_record
from datawarehouse.srctable
join (
select
row_number() over(order by null) as q_level,
last_day(dateadd(month, -q_level, CURRENT_DATE), month) as last_day_month
from table(generator(ROWCOUNT=>36))
) as q
on v_dept_history_adj.position_effective_date <= q.last_day_month
)
select
associate_id
,position_effective_date
,home_department_code
,most_recent_record
,last_day_month AS month
from cte_q1
where position_effective_date = most_recent_record
order by month desc, position_effective_date desc
and then the cte_q1 can also be just a sub-query, but the "main" query is just filtering results, so that can be pushed into a QUALIFY clause:
select
v_dept_history_adj.associate_id
,v_dept_history_adj.position_effective_date
,v_dept_history_adj.home_department_code
,max(position_effective_date) over(partition by v_dept_history_adj.associate_id) AS most_recent_record
,q.last_day_month as month
from datawarehouse.srctable
join (
select
row_number() over(order by null) as q_level,
last_day(dateadd(month, -q_level, CURRENT_DATE), month) as last_day_month
from table(generator(ROWCOUNT=>36))
) as q
on v_dept_history_adj.position_effective_date <= q.last_day_month
qualify position_effective_date = most_recent_record
order by month desc, position_effective_date desc
The major problem I see is that v_dept_history_adj seems like a view name, but it does not appear in the original tables/cte list, so I am not sure what the real code is really doing.
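To make the join-against-a-month-list idea concrete, here is a minimal sketch that runs on SQLite through Python rather than on Snowflake. Everything in it is an assumption for illustration: the `dept_history` table, its toy rows (taken from the related question's example), the fixed reference date, and the recursive CTE standing in for `generator`. It also ranks rows per month and per associate, which is one reading of what a monthly snapshot should mean, not necessarily what the original query intends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy table; the real source table is not shown in the question.
conn.executescript("""
CREATE TABLE dept_history (
    associate_id TEXT,
    home_department_code TEXT,
    position_effective_date TEXT
);
INSERT INTO dept_history VALUES
    ('A', 'D1', '2020-01-01'),
    ('A', 'D2', '2021-07-01'),
    ('B', 'D1', '2022-10-01');
""")

SNAPSHOT_SQL = """
WITH RECURSIVE q(q_level) AS (
    -- row generator: 1 .. :months
    SELECT 1
    UNION ALL
    SELECT q_level + 1 FROM q WHERE q_level < :months
),
month_ends AS (
    -- last day of the month that lies q_level months before the reference date
    SELECT date(:ref, 'start of month', '-' || q_level || ' months',
                '+1 month', '-1 day') AS last_day_month
    FROM q
),
ranked AS (
    SELECT m.last_day_month,
           h.associate_id,
           h.home_department_code,
           ROW_NUMBER() OVER (
               PARTITION BY m.last_day_month, h.associate_id
               ORDER BY h.position_effective_date DESC
           ) AS rn
    FROM dept_history h
    JOIN month_ends m
      ON h.position_effective_date <= m.last_day_month
)
SELECT last_day_month, associate_id, home_department_code
FROM ranked
WHERE rn = 1
ORDER BY last_day_month DESC, associate_id
"""

# Three monthly snapshots relative to a fixed date, so the result is repeatable.
snapshot_rows = conn.execute(
    SNAPSHOT_SQL, {"ref": "2022-12-15", "months": 3}
).fetchall()
for row in snapshot_rows:
    print(row)
```

With these toy rows, employee B only appears in the snapshots from October 2022 onward, because B's first record is dated 2022-10-01.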

Related

Is there a better way to stack 36 monthly snapshots of employee department assignments in SQL other than 36 blocks unioned together?

I am trying to create a table with monthly snapshots of all employees by department. My source table is a list of transactions:
Employee A - Dept 1 - 1/1/2020
Employee A - Dept 2 - 7/1/2021
Employee B - Dept 1 - 10/1/2022
I've figured out how to do one snapshot, and I'm using UNION ALL to stack them. My next step is to copy and paste my query 36 times, decrementing the month index by one each time, so I have a running 36 months of snapshots.
I'm wondering if there is a more efficient way to write the query to achieve the same end, since I know having the same block of code 36 times with only one element changing per instance is inefficient.
select associate_id, position_effective_date, home_department_code,
most_recent_record, (last_day(date_from_parts(year(current_date()),
month(current_date())-1,1),month)) AS month
from(
select v_dept_history_adj.associate_id,
v_dept_history_adj.home_department_code,
v_dept_history_adj.position_effective_date, max(position_effective_date)
OVER(PARTITION BY v_dept_history_adj.associate_id) AS most_recent_record
from src_table
where v_dept_history_adj.position_effective_date <=
last_day(date_from_parts(year(current_date()),
month(current_date())-1,1),month))
where position_effective_date = most_recent_record
union all
select associate_id, position_effective_date, home_department_code,
most_recent_record, (last_day(date_from_parts(year(current_date()),
month(current_date())-2,1),month)) AS month
from(
select v_dept_history_adj.associate_id,
v_dept_history_adj.home_department_code,
v_dept_history_adj.position_effective_date, max(position_effective_date)
OVER(PARTITION BY v_dept_history_adj.associate_id) AS most_recent_record
from src_table
where v_dept_history_adj.position_effective_date <=
last_day(date_from_parts(year(current_date()),
month(current_date())-2,1),month))
where position_effective_date = most_recent_record
order by month desc, position_effective_date desc
Hope this would give an idea on the approach
select associate_id, position_effective_date, home_department_code,
most_recent_record, (last_day(date_from_parts(year(current_date()),
month(current_date())-Q_LEVEL,1),month)) AS month
from(
WITH Q AS (SELECT LEVEL Q_LEVEL FROM DUAL A CONNECT BY LEVEL <= 36)
select Q.Q_LEVEL Q_LEVEL, v_dept_history_adj.associate_id,
v_dept_history_adj.home_department_code,
v_dept_history_adj.position_effective_date, max(position_effective_date)
OVER(PARTITION BY v_dept_history_adj.associate_id) AS most_recent_record
from src_table, Q
where v_dept_history_adj.position_effective_date <=
last_day(date_from_parts(year(current_date()),
month(current_date())-Q.Q_LEVEL,1),month))
where position_effective_date = most_recent_record
order by month desc, position_effective_date desc
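`CONNECT BY LEVEL` is Oracle syntax, so the row generator above will not run unchanged on other engines. As a hedged illustration only, a recursive CTE produces the same 1 to 36 sequence on engines that support `WITH RECURSIVE`; the sketch below uses SQLite through Python, and the function name `generate_levels` is made up for this example.

```python
import sqlite3

def generate_levels(n):
    """Return the integers 1..n using a recursive CTE as the row generator."""
    conn = sqlite3.connect(":memory:")
    sql = """
    WITH RECURSIVE q(q_level) AS (
        SELECT 1
        UNION ALL
        SELECT q_level + 1 FROM q WHERE q_level < ?
    )
    SELECT q_level FROM q ORDER BY q_level
    """
    return [row[0] for row in conn.execute(sql, (n,))]

levels = generate_levels(36)
print(levels[0], levels[-1], len(levels))
```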

SQL QUERY to get previous date

PERIOD_SERV
PERSON_NUMBER DATE_sTART PERIOD_ID
10 06-JAN-2020 192726
10 04-APR-2019 12827
11 01-FEB-2021 282726
11 09-APR-2018 827266
For each person_number I want to add a column with the previous date_start. When I use the query below, it gives me repeated rows.
I want to get only one row per person, with an additional column holding the date_start just before the most recent one. For example -
PERSON_NUMBER DATE_sTART PERIOD_ID PREVIOUS_DATE
10 06-JAN-2020 192726 04-APR-2019
11 01-FEB-2021 282726 09-APR-2018
I am using the below query but getting two rows,
SELECT person_number,
period_id AS pv_period_id,
LAG(date_start) OVER ( PARTITION BY person_number ORDER BY date_start) AS previous_date
FROM period_serv
You can restrict the set of rows in the outer query
select person_number, pv_period_id, PREVIOUS_DATE
from (
select person_number,
PERIOD_ID pv_period_id,
lag(date_start) OVER ( partition BY person_number order by DATE_sTART ) PREVIOUS_DATE ,
row_number() OVER ( partition BY person_number order by DATE_sTART desc) rn
from period_serv
) t
where rn = 1
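The same pattern can be tried end to end on the sample rows from the question. This is a sketch on SQLite through Python rather than on the asker's database, and the dates are rewritten as ISO strings (an assumption) so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE period_serv (person_number INTEGER, date_start TEXT, period_id INTEGER);
INSERT INTO period_serv VALUES
    (10, '2020-01-06', 192726),
    (10, '2019-04-04', 12827),
    (11, '2021-02-01', 282726),
    (11, '2018-04-09', 827266);
""")

rows = conn.execute("""
SELECT person_number, date_start, period_id, previous_date
FROM (
    SELECT person_number,
           date_start,
           period_id,
           -- previous start date within the same person
           LAG(date_start) OVER (PARTITION BY person_number ORDER BY date_start) AS previous_date,
           -- rn = 1 marks the most recent row per person
           ROW_NUMBER() OVER (PARTITION BY person_number ORDER BY date_start DESC) AS rn
    FROM period_serv
) t
WHERE rn = 1
ORDER BY person_number
""").fetchall()

for row in rows:
    print(row)
```

Both window functions are computed over all rows before the outer filter runs, which is why LAG still sees the older row even though only the newest row per person is returned.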
One option is to use MAX(..) KEEP (DENSE_RANK ..) OVER (PARTITION BY ..) analytic function such as
WITH p AS
(
SELECT MAX(date_start) KEEP (DENSE_RANK FIRST ORDER BY date_start)
OVER (PARTITION BY person_number) AS previous_date,
p.*
FROM period_serv p
)
SELECT p.person_number, p.date_start, p.period_id, p.previous_date
FROM p
JOIN period_serv ps
ON ps.person_number = p.person_number
AND ps.period_id = p.period_id
WHERE ps.date_start != previous_date

Computing session start and end using SQL window functions

I've a table of game logs containing a handDate, like this:
ID  handDate
1   2019-06-30 16:14:02.000
2   2019-07-12 06:18:02.000
3   ...
I'd like to compute game sessions from this table (start and end), given that:
A new session starts when there has been no activity for 1 hour.
A session can span 2 days.
So I'd like results like this:
day         session_start            session_end
2019-06-30  2019-06-15 16:14:02.000  2019-06-15 16:54:02.000
2019-07-02  2019-07-02 16:18:02.000  2019-07-02 17:18:02.000
2019-07-02  2019-07-02 23:18:02.000  2019-07-03 03:18:02.000
2019-07-03  2019-07-03 06:18:02.000  2019-07-03 08:28:02.000
Currently I'm playing with the following code, but cannot achieve what I want:
SELECT *
FROM (
SELECT *,
strftime( '%s', handDate) - strftime( '%s', prev_event) AS inactivity
FROM (
SELECT handDate,
date( handDate) as day,
FIRST_VALUE( handDate) OVER (PARTITION BY date( handDate) ORDER BY handDate) AS first_event,
MIN(handDate) OVER (PARTITION BY date( handDate) ORDER BY handDate),
MAX(handDate) OVER (PARTITION BY date( handDate) ORDER BY handDate),
LAG( handDate) OVER (PARTITION BY date( handDate) ORDER BY handDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS prev_event,
LEAD( handDate) OVER (PARTITION BY date( handDate) ORDER BY handDate) AS next_event
FROM hands
) last
) final
I'm using SQLite.
I found the following solution:
SELECT day,
sessionId,
MIN(handDate) as sessionStart,
MAX(handDate) as sessionEnd
FROM(
SELECT day,
handDate,
sum(is_new_session) over (
order by handDate rows between unbounded preceding and current row
) as sessionId
FROM (
SELECT *,
CASE
WHEN prev_event IS NULL
OR strftime('%s', handDate) - strftime('%s', prev_event) > 3600 THEN true
ELSE false
END AS is_new_session
FROM (
SELECT handDate,
date(handDate) as day,
LAG(handDate) OVER (
PARTITION BY date(handDate)
ORDER BY handDate RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS prev_event
FROM hands
)
)
)
GROUP BY sessionId
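Since the requirement says a session can span two days, one variation worth considering is to compute the gap over the whole table instead of per calendar day. The sketch below is only an illustration of that idea, not the poster's query: it drops the PARTITION BY, uses made-up sample timestamps, and runs on SQLite through Python (window functions need SQLite 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, including activity that crosses midnight.
conn.executescript("""
CREATE TABLE hands (handDate TEXT);
INSERT INTO hands VALUES
    ('2019-07-02 16:18:02'),
    ('2019-07-02 17:10:00'),
    ('2019-07-02 23:18:02'),
    ('2019-07-03 00:05:00'),
    ('2019-07-03 06:18:02');
""")

SESSIONS_SQL = """
WITH gaps AS (
    -- previous event over the whole table, so sessions may cross midnight
    SELECT handDate,
           LAG(handDate) OVER (ORDER BY handDate) AS prev_event
    FROM hands
),
flagged AS (
    -- 1 when more than 3600 seconds passed since the previous event
    SELECT handDate,
           CASE WHEN prev_event IS NULL
                  OR strftime('%s', handDate) - strftime('%s', prev_event) > 3600
                THEN 1 ELSE 0 END AS is_new_session
    FROM gaps
),
numbered AS (
    -- running sum of the flags gives each session its own id
    SELECT handDate,
           SUM(is_new_session) OVER (
               ORDER BY handDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS session_id
    FROM flagged
)
SELECT date(MIN(handDate)) AS day,
       MIN(handDate) AS session_start,
       MAX(handDate) AS session_end
FROM numbered
GROUP BY session_id
ORDER BY session_id
"""

sessions = conn.execute(SESSIONS_SQL).fetchall()
for s in sessions:
    print(s)
```

Here the second session starts on 2019-07-02 and ends on 2019-07-03, and it is reported under the day it started.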
DROP TABLE IF EXISTS hands;
CREATE TABLE hands(handDate TIMESTAMP);
INSERT INTO hands(handDate)
VALUES ('2021-10-29 10:30:00')
, ('2021-10-29 11:35:00')
, ('2021-10-29 11:36:00')
, ('2021-10-29 11:37:00')
, ('2021-10-29 12:38:00')
, ('2021-10-29 12:39:00')
, ('2021-10-29 12:39:10')
;
SELECT start_period, end_period
FROM (
SELECT is_start, handDate AS start_period
, CASE WHEN is_start AND is_end THEN handDate
ELSE LEAD(handDate) OVER (ORDER BY handDate)
END AS end_period
FROM (
SELECT *
FROM (
SELECT *
,CASE WHEN (event-prev_event) * 1440.0 > 60 OR prev_event IS NULL THEN true ELSE FALSE END AS is_start
,CASE WHEN (next_event-event) * 1440.0 > 60 OR next_event IS NULL THEN true ELSE FALSE END AS is_end
FROM (
SELECT handDate
, julianday(handDate) event
, julianday(LAG(handDate) OVER (ORDER BY handDate)) AS prev_event
, julianday(LEAD(handDate) OVER (ORDER BY handDate)) AS next_event
FROM hands
) t
) t
WHERE is_start OR is_end
)t
)t
WHERE is_start

How do I include a days calculation?

We got this to work well, but I want to show a column that will have the days since the last actual_date
I don't know how to code 'day' to be an output column.
WITH
cte_ul_ev AS (
SELECT
ev.full_name,
ev.event_name,
ev.actual_date,
ev.service_provider_name,
datediff(day, actual_date, getdate())
row_num = ROW_NUMBER() OVER (PARTITION BY ev.full_name ORDER BY ev.actual_date DESC) --<<--<<--
FROM
dbo.event_expanded_view ev
WHERE
ev.full_name IS NOT NULL
AND ev.category_code IN ('OTHER_ACT', 'CONTACTS', 'PEOPLEPLANS', 'PEOPLETESTS', 'PERSONREQ')
)
SELECT
ue.full_name,
ue.event_name,
ue.actual_date,
ue.service_provider_name
FROM
cte_ul_ev ue
WHERE
ue.row_num = 1;
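You are just missing a comma after the datediff expression, and that column needs an alias so the outer query can select it. The row_num column is aliased with AS below, and it seems like DISTINCT is an extra thing you are doing: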
;WITH
cte_ul_ev AS (
SELECT
ev.full_name,
ev.event_name,
ev.actual_date,
ev.service_provider_name,
datediff(day, actual_date, getdate()) as DaysDiff,
ROW_NUMBER() OVER (PARTITION BY ev.full_name ORDER BY ev.actual_date DESC) as row_num --<<--<<--
FROM
dbo.event_expanded_view ev
WHERE
ev.full_name IS NOT NULL
AND ev.category_code IN ('OTHER_ACT', 'CONTACTS', 'PEOPLEPLANS', 'PEOPLETESTS', 'PERSONREQ')
)
SELECT
ue.full_name,
ue.event_name,
ue.actual_date,
ue.service_provider_name,
ue.DaysDiff
FROM
cte_ul_ev ue
WHERE
ue.row_num = 1;
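For readers who want to experiment without SQL Server, the same shape can be sketched on SQLite through Python. This is not T-SQL: SQLite has no `datediff`, so the difference of two `julianday` values is used instead, and the `events` table, its rows, and the fixed "today" value are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for dbo.event_expanded_view.
conn.executescript("""
CREATE TABLE events (full_name TEXT, event_name TEXT, actual_date TEXT);
INSERT INTO events VALUES
    ('Ann', 'Call',  '2023-01-10'),
    ('Ann', 'Visit', '2023-02-01'),
    ('Bob', 'Test',  '2023-01-20');
""")

LATEST_SQL = """
SELECT full_name, event_name, actual_date, days_diff
FROM (
    SELECT full_name,
           event_name,
           actual_date,
           -- whole days between the event and the reference date
           CAST(julianday(:today) - julianday(actual_date) AS INTEGER) AS days_diff,
           ROW_NUMBER() OVER (PARTITION BY full_name ORDER BY actual_date DESC) AS row_num
    FROM events
) ue
WHERE row_num = 1
ORDER BY full_name
"""

# A fixed reference date keeps the output repeatable.
latest = conn.execute(LATEST_SQL, {"today": "2023-02-11"}).fetchall()
for row in latest:
    print(row)
```

The point carried over from the answer is that the computed column must have a name inside the CTE or subquery before the outer SELECT can list it.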

Returning max value and value prior

I have a query I run to tell me the latest note for active participants:
select notes.applicant_id,
reg.program_code,
reg.last_name,
reg.first_name,
reg.status_cd,
MAX(notes.service_date) as "Last Note"
from reg inner join notes on reg.applicant_id=notes.applicant_id
where reg.status_cd='AC'
group by notes.applicant_id, reg.program_code,
reg.last_name, reg.first_name, reg.reg_date,
reg.region_code, reg.status_cd
order by MAX(notes.service_date)
But I would also like this query to give me the result of the note.service_date just prior to the max service_date as well.
Results would look like this
notes.applicant_id reg.last_name reg.first_name reg.status_cd Last Note Prior Note
12345 Johnson Lori AC 01-NOV-2011 01-OCT-2011
I am working in oracle.
You can use the lag function, or join it with the same table.
Here is a simpler example (you haven't given us a data sample):
create table t as
(select level as id, mod(level , 3) grp, sysdate - level dt
from dual
connect by level < 100
)
and here are the queries:
select t2.grp,t1.grp, max(t2.dt) mdt, max(t1.dt) pdt
from t t1
join t t2 on t1.dt < t2.dt and t1.grp = t2.grp
group by t2.grp, t1.grp;
or
select grp, max(pdt), max(dt)
from(
select grp, lag(dt) over (partition by grp order by dt) pdt, dt
from t)
group by grp
In your case it could be something like this:
select t.applicant_id, t.program_code,
t.last_name, t.first_name, t.reg_date,
t.region_code, t.status_cd,
max(t.dt) as "Last Note",
max(t.pdt) as "Prev Note"
from (
select notes.applicant_id,
reg.program_code,
reg.last_name,
reg.first_name,
reg.reg_date,
reg.region_code,
reg.status_cd,
notes.service_date as dt,
lag(notes.service_date) over (partition by notes.applicant_id,
reg.program_code,
reg.last_name,
reg.first_name,
reg.reg_date,
reg.region_code,
reg.status_cd order by notes.service_date) as pdt
from reg inner join notes on reg.applicant_id=notes.applicant_id
where reg.status_cd='AC'
) t
group by t.applicant_id, t.program_code,
t.last_name, t.first_name, t.reg_date,
t.region_code, t.status_cd
order by MAX(t.dt)
If I understand you correctly, here's one way to do it:
SELECT *
FROM (select notes.applicant_id,
reg.program_code,
reg.last_name,
reg.first_name,
reg.status_cd,
notes.service_date AS "Last Note",
ROW_NUMBER() OVER (PARTITION BY notes.applicant_id, reg.program_code,
reg.last_name, reg.first_name, reg.reg_date, reg.region_code,
reg.status_cd ORDER BY notes.service_date DESC) rn
from reg inner join notes on reg.applicant_id=notes.applicant_id
where reg.status_cd='AC')
WHERE rn < 3;
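The query above returns the last two notes as two rows per applicant. If a single row with both dates is wanted, as in the expected output of the question, LAG and ROW_NUMBER can be combined. The following is a rough sketch on SQLite through Python, not Oracle; the trimmed-down `reg` and `notes` tables and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced versions of the reg and notes tables.
conn.executescript("""
CREATE TABLE reg (applicant_id INTEGER, last_name TEXT, first_name TEXT, status_cd TEXT);
CREATE TABLE notes (applicant_id INTEGER, service_date TEXT);
INSERT INTO reg VALUES
    (12345, 'Johnson', 'Lori', 'AC'),
    (67890, 'Smith', 'Sam', 'IN');
INSERT INTO notes VALUES
    (12345, '2011-09-01'),
    (12345, '2011-10-01'),
    (12345, '2011-11-01'),
    (67890, '2011-11-05');
""")

LAST_AND_PRIOR_SQL = """
SELECT applicant_id, last_name, first_name, status_cd,
       service_date AS last_note, prior_note
FROM (
    SELECT r.applicant_id, r.last_name, r.first_name, r.status_cd,
           n.service_date,
           -- the note date just before this one, per applicant
           LAG(n.service_date) OVER (
               PARTITION BY r.applicant_id ORDER BY n.service_date
           ) AS prior_note,
           -- rn = 1 marks the latest note per applicant
           ROW_NUMBER() OVER (
               PARTITION BY r.applicant_id ORDER BY n.service_date DESC
           ) AS rn
    FROM reg r
    JOIN notes n ON r.applicant_id = n.applicant_id
    WHERE r.status_cd = 'AC'
) t
WHERE rn = 1
ORDER BY last_note
"""

result = conn.execute(LAST_AND_PRIOR_SQL).fetchall()
for row in result:
    print(row)
```

An applicant with only one note would still be returned, with prior_note as NULL.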