SQL (BigQuery ANSI): Efficiently get the last value that was updated before the 1st of each month

My database has a change-history table that I query to determine each user's status on the 1st of each month. Because changes happen on arbitrary dates, I find the date of the last update on or before the 1st of that month (a user stays in the same status until the table records another change), look up the status recorded at that timestamp, and treat it as the user's status on the 1st of the month. I am doing something like this:
WITH converted_before_time_changed as (
SELECT dch.user_id,
max(CASE WHEN dch.time_changed <= '2022-01-01' THEN dch.time_changed ELSE NULL END) as time_changed_before_jan_1,
max(CASE WHEN dch.time_changed <= '2022-02-01' THEN dch.time_changed ELSE NULL END) as time_changed_before_feb_1,
max(CASE WHEN dch.time_changed <= '2022-03-01' THEN dch.time_changed ELSE NULL END) as time_changed_before_mar_1,
max(CASE WHEN dch.time_changed <= '2022-04-01' THEN dch.time_changed ELSE NULL END) as time_changed_before_apr_1,
max(CASE WHEN dch.time_changed <= '2022-05-01' THEN dch.time_changed ELSE NULL END) as time_changed_before_may_1,
max(CASE WHEN dch.time_changed <= '2022-06-01' THEN dch.time_changed ELSE NULL END) as time_changed_before_jun_1,
max(CASE WHEN dch.time_changed <= '2022-07-01' THEN dch.time_changed ELSE NULL END) as time_changed_before_jul_1,
FROM my_database.defacto_users_changes_history dch
WHERE dch.table = 'all_users' AND dch.column='status'
GROUP BY user_id
),
c2_before_flags as (SELECT
c2b.user_id,
jan_dch.new_value as status_on_jan_1,
feb_dch.new_value as status_on_feb_1,
mar_dch.new_value as status_on_mar_1,
apr_dch.new_value as status_on_apr_1,
may_dch.new_value as status_on_may_1,
jun_dch.new_value as status_on_jun_1,
jul_dch.new_value as status_on_jul_1
FROM
converted_before_time_changed c2b
LEFT JOIN my_database.defacto_users_changes_history jan_dch on jan_dch.time_changed = time_changed_before_jan_1 AND c2b.user_id = jan_dch.user_id
LEFT JOIN my_database.defacto_users_changes_history feb_dch on feb_dch.time_changed = time_changed_before_feb_1 AND c2b.user_id = feb_dch.user_id
LEFT JOIN my_database.defacto_users_changes_history mar_dch on mar_dch.time_changed = time_changed_before_mar_1 AND c2b.user_id = mar_dch.user_id
LEFT JOIN my_database.defacto_users_changes_history apr_dch on apr_dch.time_changed = time_changed_before_apr_1 AND c2b.user_id = apr_dch.user_id
LEFT JOIN my_database.defacto_users_changes_history may_dch on may_dch.time_changed = time_changed_before_may_1 AND c2b.user_id = may_dch.user_id
LEFT JOIN my_database.defacto_users_changes_history jun_dch on jun_dch.time_changed = time_changed_before_jun_1 AND c2b.user_id = jun_dch.user_id
LEFT JOIN my_database.defacto_users_changes_history jul_dch on jul_dch.time_changed = time_changed_before_jul_1 AND c2b.user_id = jul_dch.user_id
)
SELECT * FROM c2_before_flags
This already takes a long time, and the runtime grows with each month added. It is also not scalable, because I have to edit the query to add each month. What would be the ideal way to achieve the same result dynamically and efficiently?
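One possible direction, sketched here rather than offered as a definitive answer, is to put the month starts in their own table and, for every user and month, pick the latest change on or before that date. The output is then one row per user and month instead of one column per month, so adding a month means adding a row, not editing the query. The sketch below runs the pattern in SQLite through Python's `sqlite3` module so it is self-contained; the table `month_starts`, the simplified `changes_history` table, and the sample data are assumptions for illustration, and the `table`/`column` filters from the question are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-in for the change-history table (assumed shape).
CREATE TABLE changes_history (user_id INTEGER, time_changed TEXT, new_value TEXT);
INSERT INTO changes_history VALUES
  (1, '2021-12-15', 'trial'),
  (1, '2022-01-20', 'active'),
  (1, '2022-02-10', 'churned'),
  (2, '2022-01-05', 'active');
-- Hypothetical calendar table: one row per month start to report on.
CREATE TABLE month_starts (month_start TEXT);
INSERT INTO month_starts VALUES ('2022-01-01'), ('2022-02-01'), ('2022-03-01');
""")

query = """
SELECT u.user_id,
       m.month_start,
       -- Latest recorded value on or before the month start (NULL if none).
       (SELECT h.new_value
          FROM changes_history h
         WHERE h.user_id = u.user_id
           AND h.time_changed <= m.month_start
         ORDER BY h.time_changed DESC
         LIMIT 1) AS status_on_month_start
FROM (SELECT DISTINCT user_id FROM changes_history) u
CROSS JOIN month_starts m
ORDER BY u.user_id, m.month_start
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In BigQuery the month list could probably come from `GENERATE_DATE_ARRAY` instead of a table, and the "latest row per user and month" step might be better expressed with a window function such as `ROW_NUMBER()`; both are untested suggestions for that engine.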

Related

Nested CASE statements in SQL

I am running the SQL below and need to add a CASE expression for the svcState column.
Each number in that column maps to a value that I need in my query; for instance, 7 is OK, 4 is down, and so on. I tried adding this to the CASE expression as shown below, but the syntax seems to be incorrect. Any help will be greatly appreciated.
SELECT * FROM
(
SELECT
A.NodeName AS NodeName,
MAX(CASE WHEN Poller_Name='svcServiceName' THEN CAST(Status AS varchar) ELSE ''END) svcServiceName,
MAX(CASE (CASE WHEN Poller_Name='svcState' AND Status ='7' THEN 'OK'
WHEN Poller_Name='svcstate' AND Status ='4' THEN 'OUT OF SERVICE' END)
THEN CAST(Status AS bigint) ELSE '' END) svcState
FROM
(
SELECT
Nodes.Caption AS NodeName, CustomNodePollers_CustomPollers.UniqueName AS Poller_Name, CustomNodePollerStatus_CustomPollerStatus.Status AS Status, CustomNodePollerStatus_CustomPollerStatus.rowid as row, CustomNodePollerStatus_CustomPollerStatus.RawStatus as RawStatus
FROM
((Nodes INNER JOIN CustomPollerAssignment CustomNodePollerAssignment_CustomPollerAssignment ON (Nodes.NodeID = CustomNodePollerAssignment_CustomPollerAssignment.NodeID)) INNER JOIN CustomPollers CustomNodePollers_CustomPollers ON (CustomNodePollerAssignment_CustomPollerAssignment.CustomPollerID = CustomNodePollers_CustomPollers.CustomPollerID)) INNER JOIN CustomPollerStatus CustomNodePollerStatus_CustomPollerStatus ON (CustomNodePollerAssignment_CustomPollerAssignment.CustomPollerAssignmentID = CustomNodePollerStatus_CustomPollerStatus.CustomPollerAssignmentID)
WHERE
(
(CustomNodePollers_CustomPollers.UniqueName = 'svcServiceName') OR
(CustomNodePollers_CustomPollers.UniqueName = 'svcState')
)
AND
(
(CustomNodePollerAssignment_CustomPollerAssignment.InterfaceID = 0)
)
and Nodes.Caption = '101'
)A
GROUP BY NodeName, row
--ORDER BY svcServiceName
) B
Desired Output
MAX(CASE WHEN Poller_Name = 'svcState' THEN (CASE WHEN status = '7' THEN 'OK' ELSE 'DOWN' END) END)
Or...
MAX(CASE WHEN Poller_Name = 'svcState' AND status = '7' THEN 'OK'
WHEN Poller_Name = 'svcState' AND status = '4' THEN 'DOWN' END)
Or...
MAX(CASE WHEN Poller_Name != 'svcState' THEN NULL -- Assumes the poller_name is never NULL
WHEN status = '7' THEN 'OK'
WHEN status = '4' THEN 'DOWN'
END)
Where there is no ELSE specified, it is implicitly ELSE NULL, and NULL values are skipped by the MAX().
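A minimal runnable sketch of the second variant, using SQLite through Python's `sqlite3` module. The `polls` table and its rows are made up for illustration, and the grouping is simplified to one row per node rather than the question's `NodeName, row`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened poller data: one row per node and poller.
CREATE TABLE polls (node_name TEXT, poller_name TEXT, status TEXT);
INSERT INTO polls VALUES
  ('101', 'svcServiceName', 'httpd'),
  ('101', 'svcState', '7'),
  ('102', 'svcServiceName', 'sshd'),
  ('102', 'svcState', '4');
""")

rows = conn.execute("""
SELECT node_name,
       -- No ELSE branch: non-matching rows yield NULL, which MAX() ignores.
       MAX(CASE WHEN poller_name = 'svcServiceName' THEN status END) AS svc_service_name,
       MAX(CASE WHEN poller_name = 'svcState' AND status = '7' THEN 'OK'
                WHEN poller_name = 'svcState' AND status = '4' THEN 'DOWN' END) AS svc_state
FROM polls
GROUP BY node_name
ORDER BY node_name
""").fetchall()
print(rows)
```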

How to filter columns in addition to summing up in sql developer?

I wrote this code to count the risks in each status, but I don't want the result in multiple rows. I only added the C_INSERTTIME field so that I can filter by time later. How can I get the result in one row?
SELECT rsk.C_INSERTTIME inserttime,
SUM(CASE
WHEN rskst.c_code = 'HSE_RISK_STATUS_DRAFT' THEN 1
ELSE 0
END) AS drafted_risk,
SUM(CASE
WHEN rskst.c_code <> 'HSE_RISK_STATUS_DRAFT' THEN 1
ELSE 0
END) AS analyzed_risk,
SUM(CASE
WHEN rskdecsass.c_code = 'HSE_DECISION_ASSESSMENT_APPROVED' THEN 1
ELSE 0
END) AS approved_assessed_risks
FROM T_HSE_RISK rsk
LEFT JOIN t_hse_category_element rskst ON rskst.c_id = rsk.f_category_element_id_rsk_stts
LEFT JOIN t_hse_category_element rskdecsass ON rskdecsass.c_id = rsk.F_CTGRY_ELMNT_ID_DCSN_ASSSSMNT
WHERE rsk.C_INSERTTIME >= TIMESTAMP '2000-01-01 00:00:00'
GROUP BY rsk.C_INSERTTIME
You should remove the GROUP BY and aggregate C_INSERTTIME as follows:
SELECT min(rsk.C_INSERTTIME) inserttime, -- or max
SUM(CASE
WHEN rskst.c_code = 'HSE_RISK_STATUS_DRAFT' THEN 1
ELSE 0
END) AS drafted_risk,
SUM(CASE
WHEN rskst.c_code <> 'HSE_RISK_STATUS_DRAFT' THEN 1
ELSE 0
END) AS analyzed_risk,
SUM(CASE
WHEN rskdecsass.c_code = 'HSE_DECISION_ASSESSMENT_APPROVED' THEN 1
ELSE 0
END) AS approved_assessed_risks
FROM T_HSE_RISK rsk
LEFT JOIN t_hse_category_element rskst ON rskst.c_id = rsk.f_category_element_id_rsk_stts
LEFT JOIN t_hse_category_element rskdecsass ON rskdecsass.c_id = rsk.F_CTGRY_ELMNT_ID_DCSN_ASSSSMNT
WHERE rsk.C_INSERTTIME >= TIMESTAMP '2000-01-01 00:00:00'
-- GROUP BY rsk.C_INSERTTIME -- removed this
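To see the effect of dropping the GROUP BY, here is a small sketch in SQLite via Python's `sqlite3` module. The single `risks` table stands in for the joined tables of the question, and the status code `HSE_RISK_STATUS_ANALYZED` is invented for the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, already-joined view of risks and their status codes.
CREATE TABLE risks (insert_time TEXT, status_code TEXT);
INSERT INTO risks VALUES
  ('2020-01-01 10:00:00', 'HSE_RISK_STATUS_DRAFT'),
  ('2020-02-01 11:00:00', 'HSE_RISK_STATUS_ANALYZED'),
  ('2020-03-01 12:00:00', 'HSE_RISK_STATUS_DRAFT');
""")

sums = """
SUM(CASE WHEN status_code = 'HSE_RISK_STATUS_DRAFT' THEN 1 ELSE 0 END) AS drafted_risk,
SUM(CASE WHEN status_code <> 'HSE_RISK_STATUS_DRAFT' THEN 1 ELSE 0 END) AS analyzed_risk
"""

# Grouping by the timestamp yields one row per distinct insert time.
grouped = conn.execute(
    f"SELECT insert_time, {sums} FROM risks GROUP BY insert_time"
).fetchall()

# Without GROUP BY, the aggregates collapse the filtered rows into one row;
# the WHERE clause still filters on the timestamp.
single = conn.execute(
    f"SELECT MIN(insert_time), {sums} FROM risks "
    "WHERE insert_time >= '2000-01-01 00:00:00'"
).fetchall()

print(len(grouped), single)
```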

Dynamically Naming Columns - Using SQL Server 2008

I have a query that references a dynamic calendar table, which updates aged periods based on the current date. I am using this table in another query that sums sales values by aged quarter. These values will update as time passes. I can assign column names such as 'AGED1Q', but I would like to find a way to use the actual period name from the calendar table. Any suggestions will be greatly appreciated.
The current query is below; I have added a few comment lines where I would like to make the changes. Thanks again.
SELECT dbo.CUSTOMER_ORDER.CUSTOMER_ID AS CUST_ID,
SUM(CASE WHEN dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -4 THEN dbo.CUST_ORDER_LINE.TOTAL_AMT_ORDERED ELSE 0 END) AS '-4', --As MAX(dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTR) WHERE dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -4
SUM(CASE WHEN dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -3 THEN dbo.CUST_ORDER_LINE.TOTAL_AMT_ORDERED ELSE 0 END) AS '-3', --As MAX(dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTR) WHERE dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -3
SUM(CASE WHEN dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -2 THEN dbo.CUST_ORDER_LINE.TOTAL_AMT_ORDERED ELSE 0 END) AS '-2', --As MAX(dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTR) WHERE dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -2
SUM(CASE WHEN dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -1 THEN dbo.CUST_ORDER_LINE.TOTAL_AMT_ORDERED ELSE 0 END) AS '-1', --As MAX(dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTR) WHERE dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED = -1
SUM(CASE WHEN dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED IN(-4, -3, -2, -1) THEN dbo.CUST_ORDER_LINE.TOTAL_AMT_ORDERED ELSE 0 END) AS 'TOTAL'
FROM dbo.CUSTOMER_ORDER LEFT OUTER JOIN
dbo.UFC_Calander LEFT OUTER JOIN
dbo.UCC_CALENDAR_TODAY_BASED_AGING ON dbo.UFC_Calander.DAY = dbo.UCC_CALENDAR_TODAY_BASED_AGING.DATES ON
dbo.CUSTOMER_ORDER.ORDER_DATE = dbo.UFC_Calander.DAY LEFT OUTER JOIN
dbo.CUSTOMER ON dbo.CUSTOMER_ORDER.CUSTOMER_ID = dbo.CUSTOMER.ID RIGHT OUTER JOIN
dbo.PART RIGHT OUTER JOIN
dbo.CUST_ORDER_LINE ON dbo.PART.ID = dbo.CUST_ORDER_LINE.PART_ID ON dbo.CUSTOMER_ORDER.ID = dbo.CUST_ORDER_LINE.CUST_ORDER_ID
WHERE (dbo.CUSTOMER_ORDER.STATUS <> 'x') AND dbo.UCC_CALENDAR_TODAY_BASED_AGING.QTRs_AGED IN(-4, -3, -2, -1)
GROUP BY dbo.CUSTOMER_ORDER.CUSTOMER_ID
HAVING (dbo.CUSTOMER_ORDER.CUSTOMER_ID <> 'UNFOCO') AND (dbo.CUSTOMER_ORDER.CUSTOMER_ID <> 'QUOTE')
ORDER BY TOTAL DESC
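Column aliases are fixed when a statement is written, so a static query cannot take them from table data. One common workaround, sketched below under simplifying assumptions, is to read the period names first and then build the statement text from them. The sketch uses SQLite through Python's `sqlite3` module; the `aging` and `orders` tables, the sample period names, and the `quote_ident` helper are all hypothetical stand-ins for the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for the aging calendar and the order lines.
CREATE TABLE aging (qtrs_aged INTEGER, qtr TEXT);
INSERT INTO aging VALUES (-2, '2023Q3'), (-1, '2023Q4');
CREATE TABLE orders (customer_id TEXT, qtrs_aged INTEGER, amount REAL);
INSERT INTO orders VALUES ('A', -2, 100.0), ('A', -1, 50.0), ('B', -1, 25.0);
""")

# Step 1: look up the current period name for each aged quarter.
periods = conn.execute(
    "SELECT qtrs_aged, MAX(qtr) FROM aging GROUP BY qtrs_aged ORDER BY qtrs_aged"
).fetchall()


def quote_ident(name):
    """Quote a value for use as an identifier (illustrative helper)."""
    return '"' + name.replace('"', '""') + '"'


# Step 2: generate one SUM(CASE ...) column per period, aliased by its name.
columns = ",\n  ".join(
    f"SUM(CASE WHEN qtrs_aged = {int(aged)} THEN amount ELSE 0 END) AS {quote_ident(qtr)}"
    for aged, qtr in periods
)
sql = (
    f"SELECT customer_id,\n  {columns}\n"
    "FROM orders GROUP BY customer_id ORDER BY customer_id"
)

cursor = conn.execute(sql)
headers = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(headers, rows)
```

On SQL Server the same two steps would typically be done inside the database with dynamic SQL, for example by building the string with `QUOTENAME` and running it with `sp_executesql`; that variant is not shown or tested here.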

Cross Join Generating Series of Dates

I am trying to count the number of tenants that renew in a month, based off of a "dtleasefrom" (lease start date). However, I want to attribute each renewal back to the month that the last lease ended, and tie it to a generated series of dates that can be used to pivot off of. However, I cannot seem to figure out why the unit_month_date doesn't equal the dtleaseto date. See pictures and code below.
Select
ten.scode as leasename
,th.sevent as event_type
,th.dtoccurred as date_occurred
,unit.scode as unit
,p.scode as property
,th.dtapply as apply_date
,th.dtapprove as approve_date
,th.dtsigndate as sign_date
,th.dtmovein as move_in_date
,th.dtmoveout as move_out_date
,th.dtleasefrom as lease_from
,th.dtleaseto as lease_to
,th.dtnotice as notice_date
,th.crent as rent
,att.subgroup2 as zone
,(ten.sfirstname || ' ' || ten.slastname) as tenant_name
,pm.property_manager as property_manager
,(case when date_trunc('month',th2.b) = dd.month::date and th2.row_numba = 1 and th.sevent = 'Lease Signed' then 1 else 0 end) as Leases_Ending
,dd.month::date as unit_month
,(case when date_trunc('month',th2.b) = dd.month::date and EXTRACT(day from age(th.dtleasefrom,th2.b)) <= 60 and EXTRACT(day from age(th.dtleasefrom,th2.b)) >= 0 and th2.row_numba = 1 and th.sevent = 'Lease Renewal' then 1 else 0 end) as Renewals
,(case when date_trunc('month',th2.b) = dd.month::date and EXTRACT(day from age(th.dtleasefrom,th2.b)) <= 60 and th2.row_numba = 1 and th.sevent = 'Lease Renewal' then 1 else 0 end) as Renewals_All_In
,(case when th.istatus not in (1,2) and date_trunc('month',th2.b) < dd.month::date and th.sevent = 'Lease Signed' then 1 else 0 end) as MTM_tenant
,th2.row_numba
,th.hmy
FROM
yardi.tenant ten
JOIN yardi.Tenant_History th on ten.hmyperson = th.htent
CROSS JOIN (SELECT generate_series('05-01-2017'::date,'01-01-2020','1 month') as month) dd
JOIN (SELECT th.htent as a , date_trunc('month',th.dtleaseto) as b, Row_Number() over(partition by th.htent order by th.hmy) as Row_Numba FROM yardi.tenant_history th group by th.htent,th.dtleaseto,th.hmy ) th2 on th2.b = dd.month and th.htent = th2.a
JOIN yardi.unit on unit.hmy = th.hunit
JOIN yardi.property p on p.hmy = unit.hproperty
JOIN yardi.attributes att on att.hprop = p.hmy
JOIN yardi.propbut_property_management pm on pm.hcode = p.hmy
WHERE ten.istatus < 6 and th2.row_numba = 1

How to group by month and year and also separate the entries?

I have a table [MY_TABLE] with the following data: a date [DOCUMENT_DATE] and a status [STATUS]. I want to separate and count the three different statuses (open when status < 8, lost when status = 8, won when status > 8) while grouping them by month and year.
The final result would look something like this: year, month, count(won), count(lost), count(open), effectively giving the count of each status for each month.
Some months have no statuses at all (these can be ignored), and some have only some of the statuses (the month and year should still be written correctly).
I have a working query right now, but it is really huge:
SELECT
CASE WHEN "open".year IS NOT NULL
THEN
"open".year
ELSE
(CASE WHEN "lost".year IS NOT NULL
THEN
"lost".year
ELSE
"won".year
END)
END AS "Année",
CASE WHEN "open".month IS NOT NULL
THEN
"open".month
ELSE
(CASE WHEN "lost".month IS NOT NULL
THEN
"lost".month
ELSE
"won".month
END)
END AS "Mois",
"open".count AS "Ouvertes",
"lost".count AS "Perdues",
"won".count AS "Gagnées"
FROM (SELECT
year([DOCUMENT_DATE]) AS "year",
MONTH([DOCUMENT_DATE]) AS "month",
COUNT(*) AS "count"
FROM [MY_TABLE]
WHERE [STATUS] < 8 AND [DOCUMENT_DATE] >= ?1 AND [DOCUMENT_DATE] <= ?2 AND ([SEGMENT] = ?3 OR ?3 IS NULL)
GROUP BY YEAR([DOCUMENT_DATE]), MONTH([DOCUMENT_DATE])) AS "open"
FULL JOIN (SELECT
year([DOCUMENT_DATE]) AS "year",
MONTH([DOCUMENT_DATE]) AS "month",
COUNT(*) AS "count"
FROM [MY_TABLE]
WHERE [STATUS] = 8 AND [DOCUMENT_DATE] >= ?1 AND [DOCUMENT_DATE] <= ?2 AND ([SEGMENT] = ?3 OR ?3 IS NULL)
GROUP BY YEAR([DOCUMENT_DATE]), MONTH([DOCUMENT_DATE])) AS "lost"
ON "open".month = "lost".month AND "open".year = "lost".year
FULL JOIN (SELECT
year([DOCUMENT_DATE]) AS "year",
MONTH([DOCUMENT_DATE]) AS "month",
COUNT(*) AS "count"
FROM [MY_TABLE]
WHERE [STATUS] > 8 AND [DOCUMENT_DATE] >= ?1 AND [DOCUMENT_DATE] <= ?2 AND ([SEGMENT] = ?3 OR ?3 IS NULL)
GROUP BY YEAR([DOCUMENT_DATE]), MONTH([DOCUMENT_DATE])) AS "won"
ON "open".month = "won".month AND "open".year = "won".year
ORDER BY CASE WHEN "open".year IS NOT NULL
THEN
"open".year
ELSE
(CASE WHEN "lost".year IS NOT NULL
THEN
"lost".year
ELSE
"won".year
END)
END,
CASE WHEN "open".month IS NOT NULL
THEN
"open".month
ELSE
(CASE WHEN "lost".month IS NOT NULL
THEN
"lost".month
ELSE
"won".month
END)
END
I'm fairly sure there is a much simpler and cleaner way to do that but I can't figure it out.
I think this may be what you are looking for, based on the description.
SELECT year([DOCUMENT_DATE]) AS "year",
MONTH([DOCUMENT_DATE]) AS "month",
COUNT(case when [STATUS] > 8 then 1 end) win_count,
COUNT(case when [STATUS] = 8 then 1 end) lost_count,
COUNT(case when [STATUS] < 8 then 1 end) open_count
FROM [MY_TABLE]
GROUP BY year([DOCUMENT_DATE]),MONTH([DOCUMENT_DATE])
ORDER BY 1,2
Add WHERE [DOCUMENT_DATE] >= ?1 AND [DOCUMENT_DATE] <= ?2 AND ([SEGMENT] = ?3 OR ?3 IS NULL) if the condition is common across all the counts.
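A small runnable sketch of the same conditional-count pattern, using SQLite through Python's `sqlite3` module. The sample rows are made up, and `strftime` stands in for the `YEAR()`/`MONTH()` functions of the original, which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (document_date TEXT, status INTEGER);
INSERT INTO my_table VALUES
  ('2023-01-05', 3), ('2023-01-20', 8), ('2023-01-25', 9),
  ('2023-02-10', 9), ('2023-02-11', 10);
""")

rows = conn.execute("""
SELECT CAST(strftime('%Y', document_date) AS INTEGER) AS yr,
       CAST(strftime('%m', document_date) AS INTEGER) AS mon,
       -- COUNT ignores NULLs, so each CASE counts only its matching rows.
       COUNT(CASE WHEN status > 8 THEN 1 END) AS win_count,
       COUNT(CASE WHEN status = 8 THEN 1 END) AS lost_count,
       COUNT(CASE WHEN status < 8 THEN 1 END) AS open_count
FROM my_table
GROUP BY yr, mon
ORDER BY yr, mon
""").fetchall()
print(rows)
```

A month in which one status never occurs still appears, with a count of 0 for that status, which is what the three-way FULL JOIN in the question was working around.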