Add daily status value to records ID in bigquery - sql

I have a table with one record per ID for each date on which its value changed. I want to fill in the missing dates for every ID so I can track the daily status of the value. If there is no change on a given day, the value from the previous day should carry over.
Note: the date range should end at CURRENT_DATE(), so I always have the value's status as of today. In the example below, 2023-02-05 is the current date, so an ID might not have a change recorded on the current date.
This is the table:
ID   First_name  Last_name  Date        Value
aaa  Adam        Glen       2023-02-02  Green
aaa  Adam        Glen       2023-02-05  Red
bbb  Daniel      Blue       2023-02-02  Red
bbb  Daniel      Blue       2023-02-04  Green
This is the output I want to have from the query
ID   First_name  Last_name  Date        Value
aaa  Adam        Glen       2023-02-02  Green
aaa  Adam        Glen       2023-02-03  Green
aaa  Adam        Glen       2023-02-04  Green
aaa  Adam        Glen       2023-02-05  Red
bbb  Daniel      Blue       2023-02-02  Red
bbb  Daniel      Blue       2023-02-03  Red
bbb  Daniel      Blue       2023-02-04  Green
bbb  Daniel      Blue       2023-02-05  Green

It's a little verbose, but you might consider the approach below.
SELECT * EXCEPT(Min_date, Value),
       -- keep the recorded value, otherwise carry the last non-null value forward
       COALESCE(Value, LAST_VALUE(Value IGNORE NULLS) OVER w) AS Value
FROM (
    -- first recorded date per ID
    SELECT ID, First_name, Last_name, MIN(Date) Min_date FROM sample_table GROUP BY 1, 2, 3
), UNNEST(GENERATE_DATE_ARRAY(Min_date, '2023-02-05')) Date
LEFT JOIN sample_table USING (ID, First_name, Last_name, Date)
WINDOW w AS (PARTITION BY ID ORDER BY Date);
You can replace '2023-02-05' in GENERATE_DATE_ARRAY(Min_date, '2023-02-05') with CURRENT_DATE().
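The same gap-filling idea can be tried locally. The sketch below is not BigQuery: it assumes Python's built-in sqlite3 module, swaps GENERATE_DATE_ARRAY for a recursive CTE, and swaps LAST_VALUE(... IGNORE NULLS), which SQLite lacks, for a correlated subquery that picks the most recent recorded value. The END_DATE constant is an assumption standing in for CURRENT_DATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample_table (ID TEXT, First_name TEXT, Last_name TEXT, Date TEXT, Value TEXT);
INSERT INTO sample_table VALUES
  ('aaa', 'Adam',   'Glen', '2023-02-02', 'Green'),
  ('aaa', 'Adam',   'Glen', '2023-02-05', 'Red'),
  ('bbb', 'Daniel', 'Blue', '2023-02-02', 'Red'),
  ('bbb', 'Daniel', 'Blue', '2023-02-04', 'Green');
""")

END_DATE = "2023-02-05"  # assumption: stands in for CURRENT_DATE()

QUERY = """
WITH RECURSIVE days(ID, First_name, Last_name, Date) AS (
  -- anchor: first recorded date per ID
  SELECT ID, First_name, Last_name, MIN(Date)
  FROM sample_table
  GROUP BY ID, First_name, Last_name
  UNION ALL
  -- step: add one day until the end date is reached
  SELECT ID, First_name, Last_name, date(Date, '+1 day')
  FROM days
  WHERE Date < :end_date
)
SELECT d.ID, d.First_name, d.Last_name, d.Date,
       -- most recent recorded value on or before this day
       (SELECT s.Value
        FROM sample_table AS s
        WHERE s.ID = d.ID AND s.Date <= d.Date
        ORDER BY s.Date DESC
        LIMIT 1) AS Value
FROM days AS d
ORDER BY d.ID, d.Date
"""

rows = conn.execute(QUERY, {"end_date": END_DATE}).fetchall()
for row in rows:
    print(row)
```

On the sample data this yields one row per ID per day from 2023-02-02 through 2023-02-05, with each missing day inheriting the previous day's value.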

Related

Trying to count unique observations in SQL using Partition By

I have these two datasets:
Conditions: I would like to count the number of Unique Discharge_ID as Total_Discharges in my final dataset.
ICU_ID is a little bit more difficult. For PT_ID 001, what is happening is that PT 001 has 4 of the same discharge dates but 4 unique ICU_IDs. Since all of these ICU_IDs occur within 30 days of the Discharge_DT, I only want to count one of them. That is why total discharges for AZ is 1 and ICU_Admits = 1.
For PT_ID 002, I have 2 different Discharge_IDs but 1 ICU Admit that occurred within 30 days of both of the Discharge_IDs. I would like to count the Discharges as 2, and ICU_admits as 1.
DF1: Dataset of Discharges from hospital and admission to ICU within 30 days of Discharge_DT
City  PT_ID  Hospital_ID  Admit_Dt    Discharge_DT  Discharge_ID                   ICU_ID
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-05-2021,01-06-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-08-2021,01-09-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-11-2021,01-11-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-15-2021,01-16-2021
CA    002    DEF          04-03-2021  04-07-2021    001,ABC,04-03-2021,04-07-2021  002,LMN,04-27-2021,04-27-2021
CA    002    DEF          04-20-2021  04-21-2021    001,ABC,04-20-2021,04-21-2021  002,LMN,04-27-2021,04-27-2021
DF desired:
City  TotalDischarges  ICU_Admit
AZ    1                1
CA    2                1
Current Code:
DROP TABLE IF EXISTS #edit1
WITH CTE_df1 as (
select * from df1
)
select
City,
PT_ID,
Hospital_ID,
Admit_Dt,
Discharge_DT,
Discharge_ID,
count(ICU_ID) over (partition by ICU_ID) as ICU_Pts,
count(distinct Discharge_ID) as Total_Discharges
into #edit1
from CTE_df1
group by City, Discharge_ID, ICU_ID, PT_ID
order by City,
;with CTE_edit1 as (
select * from #edit1
)
select City, sum(ICU_Pts), sum(Total_Discharges)
from CTE_edit1
group by City
order by City
Current Output: PT_ID 001 works great, but PT_ID 002 shows up as 2 in ICU_Admit because it is counting both rows as unique ICU visits.
City  TotalDischarges  ICU_Admit
AZ    1                1
CA    2                2
Any help would be appreciated

How do I select a max date by person in a table

I am not too advanced with SSRS/SQL queries, and I need to write a report that pulls percentage allocations by person, which are then compared to a wage table to allocate the wages. These allocations change quarterly, but all allocations remain stored in the table. If a person's allocation did not change, they do NOT get a new entry in the table. Here is a sample table called Allocations.
First Name  Last Name  Date      Area  Percent
Smith       Bob        01/01/20  A     50.00
Smith       Bob        01/01/20  B     50.00
Doe         Jane       01/01/20  A     25.00
Doe         Jane       01/01/20  B     25.00
Doe         Jane       01/01/20  C     50.00
Doe         Jane       04/01/20  A     35.00
Doe         Jane       04/01/20  C     65.00
Wayne       Bruce      01/01/20  A     100.00
Wayne       Bruce      04/01/20  B     100.00
The results that I would want to have from this sample table when querying it are:
First Name  Last Name  Date      Area  Percent
Smith       Bob        01/01/20  A     50.00
Smith       Bob        01/01/20  B     50.00
Doe         Jane       04/01/20  A     35.00
Doe         Jane       04/01/20  C     65.00
Wayne       Bruce      04/01/20  B     100.00
However, I would also like to filter this by a date that the user inputs, so that they could run this report for any point in time and get the correct "max" dates. For example, if there were also 7/1/20 dates in here but the user's input date was 6/30/20, I would NOT want to pull the 7/1/20 data. In other words, I would like to pull the rows with the maximum date by name without going over the user's input date.
Any idea on the best way to accomplish this?
Thanks in advance for any advice you can provide.
In SQL, ROW_NUMBER can be used to number the records within each group, ordered by a particular field.
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Last_Name, First_Name ORDER BY DATE DESC) AS ROW_NUM
    FROM TABLE
) AS T
WHERE ROW_NUM = 1
The outer query then keeps only ROW_NUM = 1, the latest row per person.
However, I noticed that some people have several rows with the same date, and you want all of them. In this case you'd want to use RANK, which allows ties, so every record sharing the latest date gets rank 1.
SELECT * FROM (
    SELECT *, RANK() OVER (PARTITION BY Last_Name, First_Name ORDER BY DATE DESC) AS ROW_NUM
    FROM TABLE
) AS T
WHERE ROW_NUM = 1
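To cover the user-supplied cutoff date from the question, one option is to filter rows before ranking them. The sketch below is an illustration under assumptions, not the report's actual query: it runs on Python's built-in sqlite3 rather than SQL Server, stores dates as ISO strings so they sort correctly, uses a hypothetical parameter named as_of, and adds one hypothetical 2020-07-01 row to show the cutoff working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Allocations (First_Name TEXT, Last_Name TEXT, Date TEXT, Area TEXT, Percent REAL);
INSERT INTO Allocations VALUES
  ('Smith', 'Bob',   '2020-01-01', 'A',  50.0),
  ('Smith', 'Bob',   '2020-01-01', 'B',  50.0),
  ('Doe',   'Jane',  '2020-01-01', 'A',  25.0),
  ('Doe',   'Jane',  '2020-01-01', 'B',  25.0),
  ('Doe',   'Jane',  '2020-01-01', 'C',  50.0),
  ('Doe',   'Jane',  '2020-04-01', 'A',  35.0),
  ('Doe',   'Jane',  '2020-04-01', 'C',  65.0),
  ('Wayne', 'Bruce', '2020-01-01', 'A', 100.0),
  ('Wayne', 'Bruce', '2020-04-01', 'B', 100.0),
  ('Doe',   'Jane',  '2020-07-01', 'A', 100.0);  -- hypothetical later row
""")

QUERY = """
SELECT First_Name, Last_Name, Date, Area, Percent
FROM (
    SELECT *,
           RANK() OVER (PARTITION BY Last_Name, First_Name ORDER BY Date DESC) AS rnk
    FROM Allocations
    WHERE Date <= :as_of          -- drop rows after the cutoff before ranking
) AS t
WHERE rnk = 1
ORDER BY First_Name, Last_Name, Area
"""

def latest_allocations(as_of):
    """Return each person's allocation rows for their latest date on or before as_of."""
    return conn.execute(QUERY, {"as_of": as_of}).fetchall()

rows = latest_allocations("2020-06-30")
for row in rows:
    print(row)
```

Because the WHERE clause sits inside the subquery, the 2020-07-01 row never receives a rank when the cutoff is 2020-06-30, so Jane's 2020-04-01 rows are the ones returned.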

How Should I handle this Start and End Date for each address changes in Oracle?

I have a request to generate a report with the following data in an Oracle table: Just an example of a member.
MEMBER_ID  START_DATE  END_DATE    ADDRESS1    ADDRESS2  CITY  STATE  LAST_UPDATED
12345      1/1/2019    12/31/9999  1 Test Ave  Apt 111   City  AA     3/4/2020
12345      1/1/2019    12/31/9999  2 Test Dr   Apt 222   City  AA     9/5/2019
12345      1/1/2019    12/31/9999  1 Test Ave  APT 111   City  AA     6/3/2019
12345      1/1/2019    12/31/9999  3 Test TRL            City  AA     3/3/2019
I want this as my output on the report from the data above:
MEMBER_ID  START_DATE  END_DATE    ADDRESS1    ADDRESS2  CITY  STATE  LAST_UPDATED
12345      10/1/2019   12/31/9999  1 Test Ave  Apt 111   City  AA     3/4/2020
12345      7/1/2019    9/30/2019   2 Test Dr   Apt 222   City  AA     9/5/2019
12345      4/1/2019    6/30/2019   1 Test Ave  APT 111   City  AA     6/3/2019
12345      1/1/2019    3/31/2019   3 Test TRL            City  AA     3/3/2019
Would someone be able to help with this? I tried DENSE_RANK but couldn't figure out logic that works correctly. If a member has another address change, I would need to pull the latest change into the report as well.
You seem to want each record to end on the last day of the month of its last_updated value, and the next record to begin on the following day.
This is easily handled using lag():
select t.*,
( lag(last_day(last_updated)) over (partition by member_id order by last_updated) +
interval '1' day
) as new_start_date,
last_day(last_updated) as new_end_date
from t;
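The lag() approach can be checked on the sample member with a small sketch. It assumes Python's built-in sqlite3 instead of Oracle, so last_day() is emulated with SQLite date modifiers and the dates are stored as ISO strings; the table is trimmed to a few columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (member_id INTEGER, address1 TEXT, last_updated TEXT);
INSERT INTO t VALUES
  (12345, '1 Test Ave', '2020-03-04'),
  (12345, '2 Test Dr',  '2019-09-05'),
  (12345, '1 Test Ave', '2019-06-03'),
  (12345, '3 Test TRL', '2019-03-03');
""")

QUERY = """
WITH m AS (
    -- emulate Oracle's last_day(): first of month, plus a month, minus a day
    SELECT member_id, address1, last_updated,
           date(last_updated, 'start of month', '+1 month', '-1 day') AS month_end
    FROM t
),
w AS (
    -- month end of the previous update for the same member
    SELECT m.*,
           LAG(month_end) OVER (PARTITION BY member_id ORDER BY last_updated) AS prev_month_end
    FROM m
)
SELECT member_id, address1, last_updated,
       date(prev_month_end, '+1 day') AS new_start_date,
       month_end AS new_end_date
FROM w
ORDER BY last_updated
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In this sketch the earliest row has no previous record, so its new_start_date is NULL, and the latest row ends at its own month end. To match the desired report, those two values would presumably need to fall back to the original START_DATE and END_DATE columns.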
I think you need the start and end dates of the quarter containing last_updated as the start and end dates.
Select member_id,
Trunc(last_updated,'Q') as start_date,
case
when extract(month from Trunc(last_updated,'Q')) = 12
then end_date
else Add_months(Trunc(last_updated,'Q'), 3) - 1
end as end_date,
.....
From your_table

SQL Join from 2 Tables with Null Values

I have 2 tables and want to pull the results back from them into one.
Now the Name field is a unique ID with multiple data attached to it, i.e. the dates and the times. I've simplified the data somewhat to post here but this is the general gist.
Table 1
Name Date
John 12th
John 13th
John 15th
John 17th
Table 2
Name Colour
John Red
John Blue
John Orange
John Green
Result Needed
Name Date Time
John 12th NULL
John 13th NULL
John 15th NULL
John 17th NULL
John NULL Red
John NULL Blue
John NULL Orange
John NULL Green
I'm currently performing a Left join to pull the data however it is posting the results next to each other like
John 12th Red
You want union all:
select name, date, null as colour
from t1
union all
select name, null, colour
from t2;
I took the liberty of naming the second column colour rather than time, simply because that makes more sense in the context of the question.
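A quick way to see the stacked result is to run the same UNION ALL against the sample rows. This sketch assumes Python's built-in sqlite3 and the table names t1 and t2 used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (name TEXT, date TEXT);
CREATE TABLE t2 (name TEXT, colour TEXT);
INSERT INTO t1 VALUES ('John', '12th'), ('John', '13th'), ('John', '15th'), ('John', '17th');
INSERT INTO t2 VALUES ('John', 'Red'), ('John', 'Blue'), ('John', 'Orange'), ('John', 'Green');
""")

# Each branch pads the column it does not have with NULL,
# so the rows are stacked instead of being paired side by side.
rows = conn.execute("""
SELECT name, date, NULL AS colour FROM t1
UNION ALL
SELECT name, NULL, colour FROM t2
""").fetchall()

for row in rows:
    print(row)
```

Every returned row has exactly one of date or colour filled in, unlike the join, which pairs each date with a colour on the same row.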

Problem with GROUP BY statement (SQL)

I have a table GAMES with this information:
Id_Game Id_Player1 Id_Player2 Week
--------------------------------------
1211 Peter John 2
1215 John Louis 13
1216 Louis Peter 17
I would like to get a list of the last week when each player has played, and the number of games, which should be this:
Id_Player Week numberGames
-----------------------------
Peter 17 2
John 13 2
Louis 17 2
But instead I get this one (notice Peter's week):
Id_Player Week numberGames
-----------------------------
Peter 2 2
John 13 2
Louis 17 2
What I do is this:
SELECT Id_Player,
MAX(Week) AS Week,
COUNT(*) as numberGames
FROM ((SELECT Id_Player1 as Id_Player, Week
FROM Games)
UNION ALL
(SELECT Id_Player2 as Id_Player, Week
FROM Games)) AS g2
GROUP BY Id_Player;
Could anyone help me to find the mistake?
What is the datatype of the Week column? If the datatype of Week is varchar you would get this behavior.
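The effect of a text-typed Week column can be reproduced with a small sketch. It assumes Python's built-in sqlite3 and a Week column declared as TEXT; the CAST variant is one possible workaround, not necessarily the right fix for the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Games (Id_Game INTEGER, Id_Player1 TEXT, Id_Player2 TEXT, Week TEXT);
INSERT INTO Games VALUES
  (1211, 'Peter', 'John',  '2'),
  (1215, 'John',  'Louis', '13'),
  (1216, 'Louis', 'Peter', '17');
""")

# The query from the question, with the expression inside MAX() left as a placeholder.
QUERY = """
SELECT Id_Player, MAX({week_expr}) AS Week, COUNT(*) AS numberGames
FROM (SELECT Id_Player1 AS Id_Player, Week FROM Games
      UNION ALL
      SELECT Id_Player2 AS Id_Player, Week FROM Games) AS g2
GROUP BY Id_Player
ORDER BY Id_Player
"""

# MAX over text compares character by character, so '2' sorts after '17'.
as_text = conn.execute(QUERY.format(week_expr="Week")).fetchall()

# Casting to a number restores numeric ordering.
as_number = conn.execute(QUERY.format(week_expr="CAST(Week AS INTEGER)")).fetchall()

print(as_text)
print(as_number)
```

With text weeks, Peter's maximum comes back as '2' because the string '2' sorts after '17'; with the cast it comes back as 17.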