Getting a snapshot on the first day of each month - sql

My dataset looks like this:
emplid  region  location  sub_dept  dept  start_dt  end_dt      days
------  ------  --------  --------  ----  --------  ----------  ----
123456  East    NY        A         1     7/1/2005  9/30/2005   91
123456  East    NY        B         1     7/1/2012  11/9/2012   131
123456  West    San Jose  C         2     7/1/2013  12/31/2013  183
123457  East    NY        B         1     7/1/2017  9/7/2017    68
123457  East    NY        B         1     7/1/2005  12/31/2005  183
123458  East    NY        B         1     7/1/2017  9/7/2017    68
123458  West    San Jose  C         2     7/1/2010  7/31/2010   30
123459  East    NY        A         1     7/1/2017  9/7/2017    68
123460  East    Boston    F         3     7/1/2007  11/30/2007  152
I need a snapshot for the 1st of each month, starting from the minimum date. In the example the minimum date is 9/30/2005, so I need to know which department/sub_dept/location/region each employee was in on 10/1/2005, 11/1/2005, 12/1/2005, and so on through the max date.

You didn't mention the name of the employee table, so I've called it employee_table. The following query (or something very close to it) should generate what you want:
With report_limits as (
Select Trunc(min(start_dt), 'MONTH') as min_rpt_dt,
Trunc(max(end_dt), 'MONTH') as max_rpt_dt
From employee_table),
report_dates as (
Select add_months(min_rpt_dt, level-1) as rpt_dt
From report_limits
Connect By add_months(min_rpt_dt, level-1) <= max_rpt_dt)
--
Select e.emplid, e.region, e.location, e.sub_dept, e.dept,
e.start_dt, e.end_dt, e.days, r.rpt_dt
From report_dates r
Inner Join employee_table e on r.rpt_dt Between e.start_dt And e.end_dt
Order By r.rpt_dt, e.emplid;
The report_limits query determines the range of report dates, the report_dates query uses a Connect By clause to generate a set of dates within that range, and the main query joins the list of dates to the employee data.
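For anyone who wants to trace the date generation without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the Oracle query itself: SQLite has no Connect By, so a recursive CTE stands in for it, the dates are assumed to be stored as ISO strings, and only two rows and a few columns of the sample data are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_table (emplid INTEGER, dept INTEGER, start_dt TEXT, end_dt TEXT)"
)
# Two rows from the question, with dates rewritten as ISO strings (an assumption)
conn.executemany(
    "INSERT INTO employee_table VALUES (?, ?, ?, ?)",
    [
        (123456, 1, "2005-07-01", "2005-09-30"),
        (123457, 1, "2005-07-01", "2005-12-31"),
    ],
)

query = """
WITH RECURSIVE
report_limits AS (
    -- first of the month for the earliest start and the latest end
    SELECT date(min(start_dt), 'start of month') AS min_rpt_dt,
           date(max(end_dt), 'start of month')   AS max_rpt_dt
    FROM employee_table
),
report_dates(rpt_dt) AS (
    -- recursive CTE used here in place of Oracle's Connect By
    SELECT min_rpt_dt FROM report_limits
    UNION ALL
    SELECT date(rpt_dt, '+1 month')
    FROM report_dates, report_limits
    WHERE date(rpt_dt, '+1 month') <= max_rpt_dt
)
SELECT r.rpt_dt, e.emplid, e.dept
FROM report_dates r
JOIN employee_table e ON r.rpt_dt BETWEEN e.start_dt AND e.end_dt
ORDER BY r.rpt_dt, e.emplid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each employee shows up once for every first-of-month that falls inside their start/end range, which is the snapshot the question asks for.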

Try this query:
Declare @StartDate date='2005-09-29',
@EndDate date='2017-04-01'
Select *,
Dateadd(mm, Datediff(mm, 0, date), 0) AS FirstDateOfMonth from TableName
where date >=@StartDate and date<=@EndDate


Trying to count unique observations in SQL using Partition By

I have these two datasets:
Conditions: I would like to count the number of Unique Discharge_ID as Total_Discharges in my final dataset.
ICU_ID is a little bit more difficult. For PT_ID 001, what is happening is that PT 001 has 4 of the same discharge dates but 4 unique ICU_IDs. Since all of these ICU_IDs occur within 30 days of the Discharge_DT, I only want to count one of them. That is why total discharges for AZ is 1 and ICU_Admits = 1.
For PT_ID 002, I have 2 different Discharge_IDs but 1 ICU Admit that occurred within 30 days of both of the Discharge_IDs. I would like to count the Discharges as 2, and ICU_admits as 1.
DF1: Dataset of Discharges from hospital and admission to ICU within 30 days of Discharge_DT
City  PT_ID  Hospital_ID  Admit_Dt    Discharge_DT  Discharge_ID                   ICU_ID
----  -----  -----------  ----------  ------------  -----------------------------  -----------------------------
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-05-2021,01-06-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-08-2021,01-09-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-11-2021,01-11-2021
AZ    001    ABC          01-01-2021  01-03-2021    001,ABC,01-01-2021,01-03-2021  001,XYZ,01-15-2021,01-16-2021
CA    002    DEF          04-03-2021  04-07-2021    001,ABC,04-03-2021,04-07-2021  002,LMN,04-27-2021,04-27-2021
CA    002    DEF          04-20-2021  04-21-2021    001,ABC,04-20-2021,04-21-2021  002,LMN,04-27-2021,04-27-2021
DF desired:
City  TotalDischarges  ICU_Admit
----  ---------------  ---------
AZ    1                1
CA    2                1
Current Code:
DROP TABLE IF EXISTS #edit1
WITH CTE_df1 as (
select * from df1
)
select
City,
PT_ID,
Hospital_ID,
Admit_Dt,
Discharge_DT,
Discharge_ID,
count(ICU_ID) over (partition by ICU_ID) as ICU_Pts,
count(distinct Discharge_ID) as Total_Discharges
into #edit1
from CTE_df1
group by City, Discharge_ID, ICU_ID, PT_ID
order by City,
;with CTE_edit1 as (
select * from #edit1
)
select City, sum(ICU_Pts), sum(Total_Discharges)
from CTE_edit1
group by City
order by City
Current Output: PT_ID 001 works great, but PT_ID 002 shows up as 2 in ICU_Admit because both rows are counted as unique ICU visits.
City  TotalDischarges  ICU_Admit
----  ---------------  ---------
AZ    1                1
CA    2                2
Any help would be appreciated

How Should I handle this Start and End Date for each address changes in Oracle?

I have a request to generate a report with the following data in an Oracle table: Just an example of a member.
MEMBER_ID  START_DATE  END_DATE    ADDRESS1    ADDRESS2  CITY  STATE  LAST_UPDATED
12345      1/1/2019    12/31/9999  1 Test Ave  Apt 111   City  AA     3/4/2020
12345      1/1/2019    12/31/9999  2 Test Dr   Apt 222   City  AA     9/5/2019
12345      1/1/2019    12/31/9999  1 Test Ave  APT 111   City  AA     6/3/2019
12345      1/1/2019    12/31/9999  3 Test TRL            City  AA     3/3/2019
I want this as my output on the report from the data above:
MEMBER_ID  START_DATE  END_DATE    ADDRESS1    ADDRESS2  CITY  STATE  LAST_UPDATED
12345      10/1/2019   12/31/9999  1 Test Ave  Apt 111   City  AA     3/4/2020
12345      7/1/2019    9/30/2019   2 Test Dr   Apt 222   City  AA     9/5/2019
12345      4/1/2019    6/30/2019   1 Test Ave  APT 111   City  AA     6/3/2019
12345      1/1/2019    3/31/2019   3 Test TRL            City  AA     3/3/2019
Would someone be able to help with this? I tried Dense_rank but couldn't figure out logic that works correctly. If a member has another address change, I would need to pull the latest change into the report as well.
You seem to want each record to end on the last day of the month of its last_updated column. The next record then begins on the following day.
This is easily handled using lag():
select t.*,
( lag(last_day(last_updated)) over (partition by member_id order by last_updated) +
interval '1' day
) as new_start_date,
last_day(last_updated) as new_end_date
from t;
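To see what the lag() query yields for the sample member, here is a small Python sketch that mirrors its logic by hand. It is a re-implementation for tracing values, not Oracle code; the helper name last_day is just chosen to echo Oracle's LAST_DAY.

```python
import calendar
from datetime import date, timedelta


def last_day(d):
    """Last calendar day of d's month (mirrors Oracle's LAST_DAY)."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


# last_updated values for member 12345, taken from the question
last_updated = [date(2020, 3, 4), date(2019, 9, 5), date(2019, 6, 3), date(2019, 3, 3)]

periods = []
prev_end = None  # lag() yields NULL for the first row of the partition
for d in sorted(last_updated):
    new_end = last_day(d)
    new_start = prev_end + timedelta(days=1) if prev_end else None
    periods.append((new_start, new_end))
    prev_end = new_end

for new_start, new_end in periods:
    print(new_start, new_end)
```

Two edge rows differ from the desired report: the earliest row has no previous row, so its new_start_date comes out as NULL, and the latest row gets a computed month end rather than the open-ended 12/31/9999. Both would presumably need extra handling (for example falling back to the original start_date and end_date).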
I think you need the start and end dates of the quarter containing last_updated as the new start and end dates.
Select member_id,
Trunc(last_updated,'Q') as start_date,
case
when extract(month from Trunc(last_updated,'Q')) = 12
then end_date
else Add_months(Trunc(last_updated,'Q'), 3) - 1
end as end_date,
.....
From your_table

Calculate value using previous and current month

I have below three tables
Stock Table
ID GlobalStock Date Country
1 10 2017/01/01 India
1 20 2017/01/01 India
2 5 2017/02/01 Africa
3 6 2017/08/01 Japan
4 7 2017/04/01 Japan
5 89 2017/08/01 Japan
2 10 2017/03/01 Japan
5 8 2017/03/01 Japan
1 20 2017/02/01 India
ShipFile
ID GlobalStock Date Country
2 10 2017/03/01 Africa
3 60 2017/08/01 India
11 70 2017/08/01 India
1 8 2017/02/01 India
1 9 2017/02/01 India
2 4 2017/03/01 Japan
2 5 2017/04/01 Japan
5 3 2017/03/01 Japan
3 8 2017/08/01 Japan
SalesFiles
ID GlobalStock Date Country
2 10 2017/03/01 India
2 20 2017/03/01 Africa
3 30 2017/08/01 Japan
7 5 2017/02/01 Japan
8 8 2018/01/01 Japan
1 9 2017/02/01 India
1 70 2017/02/01 Africa
13 10 2017/08/01 Japan
10 60 2017/11/01 Japan
I want to calculate -> StockTable(Month - 1) + ShipFile (Month) - Sales (Month)
For example
For ID 1, suppose we take the Jan data from the Stock table (GlobalStock -> 10 + 20); then we must take the Feb values from the other tables, and the country should be the same in all tables.
So calculation would be
(10 + 20) + (8 + 9) - (9) = 38
If we take a Feb row of the Stock table, then we must use the March data from the other tables, and so on.
To join the tables I am using ID and Country.
You can query using subquery or cte as below:
;With cte_Stock as (
Select ID, [Date], Country, sum(GlobalStock) Sum_GlobalStock from Stock
group by Id, [Date], Country
), cte_ShipFiles as (
Select ID, [Date], Country, sum(GlobalStock) Sum_GlobalStock from ShipFile
group by Id, [Date], Country
)
, cte_SalesFiles as (
Select ID, [Date], Country, sum(GlobalStock) Sum_GlobalStock from SalesFiles
group by Id, [Date], Country
)
select s.ID, s.[Date], sf.[Date], s.Country,
YourOutput = s.Sum_GlobalStock+sf.Sum_GlobalStock-sales.Sum_GlobalStock
from cte_Stock s
join cte_ShipFiles sf
on s.ID = sf.ID
and s.Country = sf.Country
and s.[Date] = dateadd(mm,-1, sf.[Date])
join cte_SalesFiles sales
on s.ID = sales.ID
and s.Country = sales.Country
and s.[Date] = dateadd(mm,-1, sales.[Date])
Output as below:
+----+------------+------------+---------+------------+
| ID | Date | Date | Country | YourOutput |
+----+------------+------------+---------+------------+
| 1 | 2017-01-01 | 2017-02-01 | India | 38 |
| 2 | 2017-02-01 | 2017-03-01 | Africa | -5 |
+----+------------+------------+---------+------------+
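The two output rows can be reproduced outside SQL Server with a short sketch based on Python's built-in sqlite3 module. This is an approximation of the query above, not a port: SQLite's date(x, '+1 month') replaces dateadd, and the dates are assumed to be stored as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question: (ID, GlobalStock, Date, Country)
data = {
    "Stock": [
        (1, 10, "2017-01-01", "India"), (1, 20, "2017-01-01", "India"),
        (2, 5, "2017-02-01", "Africa"), (3, 6, "2017-08-01", "Japan"),
        (4, 7, "2017-04-01", "Japan"), (5, 89, "2017-08-01", "Japan"),
        (2, 10, "2017-03-01", "Japan"), (5, 8, "2017-03-01", "Japan"),
        (1, 20, "2017-02-01", "India"),
    ],
    "ShipFile": [
        (2, 10, "2017-03-01", "Africa"), (3, 60, "2017-08-01", "India"),
        (11, 70, "2017-08-01", "India"), (1, 8, "2017-02-01", "India"),
        (1, 9, "2017-02-01", "India"), (2, 4, "2017-03-01", "Japan"),
        (2, 5, "2017-04-01", "Japan"), (5, 3, "2017-03-01", "Japan"),
        (3, 8, "2017-08-01", "Japan"),
    ],
    "SalesFiles": [
        (2, 10, "2017-03-01", "India"), (2, 20, "2017-03-01", "Africa"),
        (3, 30, "2017-08-01", "Japan"), (7, 5, "2017-02-01", "Japan"),
        (8, 8, "2018-01-01", "Japan"), (1, 9, "2017-02-01", "India"),
        (1, 70, "2017-02-01", "Africa"), (13, 10, "2017-08-01", "Japan"),
        (10, 60, "2017-11-01", "Japan"),
    ],
}
for table, table_rows in data.items():
    conn.execute(
        f"CREATE TABLE {table} (ID INTEGER, GlobalStock INTEGER, Date TEXT, Country TEXT)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", table_rows)

query = """
WITH s  AS (SELECT ID, Date, Country, SUM(GlobalStock) AS total
            FROM Stock GROUP BY ID, Date, Country),
     sf AS (SELECT ID, Date, Country, SUM(GlobalStock) AS total
            FROM ShipFile GROUP BY ID, Date, Country),
     sa AS (SELECT ID, Date, Country, SUM(GlobalStock) AS total
            FROM SalesFiles GROUP BY ID, Date, Country)
SELECT s.ID, s.Date, s.Country, s.total + sf.total - sa.total AS result
FROM s
JOIN sf ON sf.ID = s.ID AND sf.Country = s.Country
       AND sf.Date = date(s.Date, '+1 month')   -- shipments of the following month
JOIN sa ON sa.ID = s.ID AND sa.Country = s.Country
       AND sa.Date = date(s.Date, '+1 month')   -- sales of the following month
ORDER BY s.ID, s.Date
"""
result = conn.execute(query).fetchall()
print(result)
```

Because both joins are inner joins, a stock row with no matching shipment or no matching sale in the following month drops out entirely, which is why only two rows survive for this sample data.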
Here is an approach with derived tables:
DECLARE @CurrentMonth date = '20180101'
DECLARE @NextMonth date = DATEADD(MONTH,1,@CurrentMonth)
SELECT s.Country, SUM(s.GlobalStock) + ShipSum - SaleSum
FROM stock s
LEFT JOIN (SELECT ISNULL(SUM(GlobalStock),0) ShipSum, Country
FROM ShipFile
WHERE Date >= @NextMonth
AND Date <= EOMONTH(@NextMonth)
GROUP BY Country) sh on s.Country = sh.Country
LEFT JOIN (SELECT ISNULL(SUM(GlobalStock),0) SaleSum, Country
FROM SalesFiles
WHERE Date >= @NextMonth
AND Date <= EOMONTH(@NextMonth)
GROUP BY Country) sa on s.Country = sa.Country
WHERE s.Date >= @CurrentMonth
AND s.Date <= EOMONTH(@CurrentMonth)
GROUP BY s.Country, ShipSum, SaleSum
Notes:
This uses Country for the joins because ID seems to change between tables.
It also uses a date range, assuming that the day portion of your date column is not always the first of the month. If it is always the first, that can be simplified to date = @CurrentMonth or date = @NextMonth.

Records with overlapping dates

I have an addresses table, say:
address_id person_id start_date stop_date address
1 123 01-JAN-15 01-JUN-15 india
2 123 01-MAY-15 null russia
3 321 01-JAN-15 01-JUN-15 us
4 321 10-MAY-15 null india
I want to find all records (address_id values) which have overlapping dates for the same person_id. In this example that would find address_id 2 and 4, as May lies between Jan and Jun.
I then want to update the stop_date to start_date - 1 of the subsequent row belonging to the same person, so that the overlap is removed. For instance, updating stop_date to 09-MAY-2015 at the row with address_id 3.
So I want to end up with:
address_id person_id start_date stop_date address
1 123 01-JAN-15 30-APR-15 india
2 123 01-MAY-15 null russia
3 321 01-JAN-15 09-MAY-15 us
4 321 10-MAY-15 null india
I have tried:
update (
select * from addresses a1,addresses a2
where a1.person_id = a2.person_id
and a2.start_date > a1.start_date and a2.start_date <a1.stop_date
)
set a1.stop_date = a2.start_date - 1;
This worked fine in Microsoft Access, but in Oracle it gives an invalid identifier error for a2.start_date.
How can I perform this update?
You can use a correlated update:
update addresses a
set stop_date = (
select min(start_date) - 1
from addresses
where person_id = a.person_id
and start_date > a.start_date
and start_date <= a.stop_date
)
where exists (
select null
from addresses
where person_id = a.person_id
and start_date > a.start_date
and start_date <= a.stop_date
);
2 rows updated.
select * from addresses;
ADDRESS_ID PERSON_ID START_DATE STOP_DATE ADDRESS
---------- ---------- ---------- --------- ----------
1 123 01-JAN-15 30-APR-15 india
2 123 01-MAY-15 russia
3 321 01-JAN-15 09-MAY-15 us
4 321 10-MAY-15 india
Both the set subquery and the exists subquery look for a row for the same person whose start date is between the start and stop date of the current row (which is the correlated part). The exists means only rows which have a match are updated; without it, any rows which don't have an overlap would be updated to null. (You wouldn't see any difference with the sample data, but you would if you had more data.)
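The same correlated-update pattern can be tried without Oracle using Python's built-in sqlite3 module. This is a sketch under a few assumptions: dates are stored as ISO strings, date(x, '-1 day') replaces Oracle's date arithmetic, and the inner table is aliased as n (a name chosen only for this example) so the outer row can be referenced as addresses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (address_id INTEGER, person_id INTEGER, "
    "start_date TEXT, stop_date TEXT, address TEXT)"
)
# Sample rows from the question, with dates rewritten as ISO strings (an assumption)
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?, ?, ?)",
    [
        (1, 123, "2015-01-01", "2015-06-01", "india"),
        (2, 123, "2015-05-01", None, "russia"),
        (3, 321, "2015-01-01", "2015-06-01", "us"),
        (4, 321, "2015-05-10", None, "india"),
    ],
)

update = """
UPDATE addresses
SET stop_date = (
    -- day before the earliest later row that starts inside this row's range
    SELECT date(min(n.start_date), '-1 day')
    FROM addresses AS n
    WHERE n.person_id = addresses.person_id
      AND n.start_date > addresses.start_date
      AND n.start_date <= addresses.stop_date
)
WHERE EXISTS (
    -- only touch rows that actually have an overlapping successor
    SELECT 1
    FROM addresses AS n
    WHERE n.person_id = addresses.person_id
      AND n.start_date > addresses.start_date
      AND n.start_date <= addresses.stop_date
)
"""
updated = conn.execute(update).rowcount
stops = [
    r[0] for r in conn.execute("SELECT stop_date FROM addresses ORDER BY address_id")
]
print(updated, stops)
```

Rows 2 and 4 have a null stop_date, so the comparison against it is never true and the exists filter leaves them alone.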

Problem with GROUP BY statement (SQL)

I have a table GAMES with this information:
Id_Game Id_Player1 Id_Player2 Week
--------------------------------------
1211 Peter John 2
1215 John Louis 13
1216 Louis Peter 17
I would like to get a list of the last week when each player has played, and the number of games, which should be this:
Id_Player Week numberGames
-----------------------------
Peter 17 2
John 13 2
Louis 17 2
But instead I get this one (notice Peter's week):
Id_Player Week numberGames
-----------------------------
Peter 2 2
John 13 2
Louis 17 2
What I do is this:
SELECT Id_Player,
MAX(Week) AS Week,
COUNT(*) as numberGames
FROM ((SELECT Id_Player1 as Id_Player, Week
FROM Games)
UNION ALL
(SELECT Id_Player2 as Id_Player, Week
FROM Games)) AS g2
GROUP BY Id_Player;
Could anyone help me to find the mistake?
What is the datatype of the Week column? If the datatype of Week is varchar you would get this behavior.
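The general effect is easy to demonstrate: text values are compared character by character, so '2' sorts after '17'. The sketch below uses Python's built-in sqlite3 module with a hypothetical one-column table weeks_text; it only illustrates string versus numeric MAX and is not a reproduction of the GAMES table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding week numbers as text, as a varchar column would
conn.execute("CREATE TABLE weeks_text (week TEXT)")
conn.executemany("INSERT INTO weeks_text VALUES (?)", [("2",), ("17",)])

# MAX over text compares character by character: '2' > '1', so '2' wins
max_text = conn.execute("SELECT MAX(week) FROM weeks_text").fetchone()[0]

# Casting to a number first gives the numeric maximum
max_int = conn.execute(
    "SELECT MAX(CAST(week AS INTEGER)) FROM weeks_text"
).fetchone()[0]

print(max_text, max_int)
```

If Week does turn out to be a character column, casting it to a number inside MAX (or changing the column type) would be the usual way around this.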