I need your assistance finding the missing dates in my records; a sample is below.
Currently I have data for 1, 2, 6 and 10 Jan 2020. This is my query:
select p.effective_date,x.xref_security_id,x.xref_type
from securitydbo.price p
inner join securitydbo.xreference x on x.security_alias = p.security_alias
where p.src_intfc_inst = 253
and p.effective_date between ('01-JAN-2020') and ('10-JAN-2020')
and x.xref_security_id = 'ABC999999999'
Expected Results
Missing_Date Xref_Security_ID Xref_Type Price
1/3/2020 ABC99999999 ISIN 0
1/7/2020 ABC99999999 ISIN 0
1/8/2020 ABC99999999 ISIN 0
1/9/2020 ABC99999999 ISIN 0
I don't have your tables, so I created one that looks like the result you currently have:
SQL> select * From test order by missing_date;
MISSING_DA XREF_S
---------- ------
01/03/2020 ABC999
01/07/2020 ABC999
01/08/2020 ABC999
01/09/2020 ABC999
In order to get the dates that are missing, create a calendar (see the CTE I used, which is just one of many row-generator techniques):
its starting date is the lower date of your period
LEVEL is added to it
the CONNECT BY clause "loops" as many times as there are days in the desired period
XREF_SECURITY_ID is NULL for the missing dates, as there's no match for them in your tables.
SQL> with
2 -- create a calendar for desired period (see CONNECT BY)
3 calendar as
4 (select date '2020-01-01' + level - 1 datum
5 from dual
6 connect by level <= date '2020-01-10' - date '2020-01-01' + 1
7 )
8 -- outer join calendar with your table(s)
9 select c.datum, t.xref_security_id
10 from calendar c left join test t on t.missing_date = c.datum
11 order by c.datum;
DATUM XREF_S
---------- ------
01/01/2020
01/02/2020
01/03/2020 ABC999
01/04/2020
01/05/2020
01/06/2020
01/07/2020 ABC999
01/08/2020 ABC999
01/09/2020 ABC999
01/10/2020
10 rows selected.
SQL>
My guess is that the date format might be the problem here. Without actually knowing what data is in your tables, guessing is the only option:
select p.effective_date,x.xref_security_id,x.xref_type
from securitydbo.price p
inner join securitydbo.xreference x on x.security_alias = p.security_alias
where p.src_intfc_inst = 253
and p.effective_date between to_date('01-JAN-2020','DD-MON-YYYY')
and to_date('10-JAN-2020','DD-MON-YYYY')
and x.xref_security_id = 'ABC999999999'
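If you plug the calendar into your own query, the expected result (missing dates only, with a zero price) could be produced roughly like this. This is just a sketch: it assumes effective_date is a DATE with no time component, and the XREF_SECURITY_ID, XREF_TYPE and PRICE values for the missing days are hard-coded from your expected output, since those rows don't exist in your tables.
with calendar as (
  -- one row per day between 01-JAN-2020 and 10-JAN-2020
  select date '2020-01-01' + level - 1 as datum
  from dual
  connect by level <= date '2020-01-10' - date '2020-01-01' + 1
),
prices as (
  -- the days you do have prices for
  select p.effective_date
  from securitydbo.price p
  inner join securitydbo.xreference x on x.security_alias = p.security_alias
  where p.src_intfc_inst = 253
    and x.xref_security_id = 'ABC999999999'
)
select c.datum        as missing_date,
       'ABC999999999' as xref_security_id,
       'ISIN'         as xref_type,
       0              as price
from calendar c
left join prices pr on pr.effective_date = c.datum
where pr.effective_date is null
order by c.datum;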
Related
I have a table t with:
DATE        LOCATION  PRODUCT_ID  AMOUNT
2021-10-29  1         123         10
2021-10-30  1         123         9
2021-10-31  1         123         8
2021-10-29  1         456         100
2021-10-30  1         456         90
2021-10-31  1         456         80
2021-10-29  2         123         18
2021-10-30  2         123         17
2021-11-29  2         456         18
I need to find the AMOUNT of each PRODUCT_ID on the last day of each month, for each combination of LOCATION + PRODUCT_ID.
If a PRODUCT_ID has no entry for that day, the AMOUNT is NULL.
So the result should look like:
DATE        LOCATION  PRODUCT_ID  AMOUNT
2021-10-31  1         123         8
2021-10-31  1         456         80
2021-10-31  2         123         NULL
2021-11-30  2         456         NULL
Sadly EXASOL has no LAST_DAY() or EOMONTH() function. How can I solve this?
You can get to the last day of the month using a date_trunc function in combination with date_add:
case
when t.date = date_add('day', -1, date_add('month', 1, date_trunc('month', t.date)))
then 'Y' else 'N' end as end_of_month
That being said, if you group your table for all combinations of locations and products, you will not get NULLs for products without sales on the last day of the month as shown in your output table.
When you group your data, any value that does not exist will simply not show up in your output table. If you want to force nulls to show up, you can create a new table that contains all combinations of products, locations, and hard-coded end of month dates.
Then, you can left join your old table with this new hard-coded table by date, location, and product. This method will give you the NULL values you expect.
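A rough sketch of that idea (untested on Exasol, and the names are assumptions: it takes the table to be called t, quotes the reserved column name as "DATE", and simply hard-codes the two month-end dates from your sample):
with month_ends as (
  -- hard-coded end-of-month dates covering the sample data
  -- (add FROM DUAL if your Exasol version requires a FROM clause)
  select date '2021-10-31' as eom_date
  union all
  select date '2021-11-30' as eom_date
),
combos as (
  -- every combination of location and product that appears in the table
  select distinct location, product_id
  from t
)
select m.eom_date  as "DATE",
       c.location,
       c.product_id,
       t.amount    -- NULL when there was no row for that day
from combos c
cross join month_ends m
left join t
  on  t."DATE"     = m.eom_date
  and t.location   = c.location
  and t.product_id = c.product_id;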
I have a table of EMPLOYEES that contains information about the DATE and WORKTIME for each day. For example:
ID | DATE | WORKTIME |
----------------------------------------
1 | 1-Sep-2014 | 4 |
2 | 2-Sep-2014 | 6 |
1 | 3-Sep-2014 | 5.5 |
1 | 4-Sep-2014 | 7 |
2 | 4-Sep-2014 | 4 |
1 | 9-Sep-2014 | 8 |
and so on.
Question: How can I create a query that calculates the amount of time worked per week (HOURS_PERWEEK)? I understand that I need to sum WORKTIME while grouping by both ID and week, but so far my attempts, as well as googling, haven't yielded any results. Any ideas on this? Thank you in advance!
edit:
Got a solution of
select id, sum (worktime), trunc(date, 'IW') week
from employees
group by id, TRUNC(date, 'IW');
But I will need to somehow connect that output back to the dates in the table, by populating a newly created column such as WEEKLY_TIME. Any hints on that?
You can find the start of the ISO week, which will always be a Monday, using TRUNC("DATE", 'IW').
So if, in the query, you GROUP BY the id and the start of the week TRUNC("DATE", 'IW'), then you can SELECT the id and aggregate to find the SUM of the WORKTIME column for each id.
Since this appears to be a homework question and you haven't attempted a query, I'll leave it at this to point you in the correct direction and you can complete the query.
Update
Now I need to create another column (lets call it WEEKLY_TIME) and populate it with values from the current output, so that Sep 1,3,4 (for ID=1) would all contain value 16.5, specifying that on that day (that is within the certain week) that person worked 16.5 in total. And for ID=2 it would then be a value of 10 for both Sep 2 and 4.
For this, if I understand correctly, you appear not to want the aggregate function but rather its analytic version:
select id,
"DATE",
trunc("DATE", 'IW') week,
worktime,
sum (worktime) OVER (PARTITION BY id, trunc("DATE", 'IW'))
AS weekly_time
from employees;
Which, for the sample data:
CREATE TABLE employees (ID, "DATE", WORKTIME) AS
SELECT 1, DATE '2014-09-01', 4 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-02', 6 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-03', 5.5 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-04', 7 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-04', 4 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-09', 8 FROM DUAL;
Outputs:
ID  DATE                 WEEK                 WORKTIME  WEEKLY_TIME
1   2014-09-01 00:00:00  2014-09-01 00:00:00  4         16.5
1   2014-09-03 00:00:00  2014-09-01 00:00:00  5.5       16.5
1   2014-09-04 00:00:00  2014-09-01 00:00:00  7         16.5
1   2014-09-09 00:00:00  2014-09-08 00:00:00  8         8
2   2014-09-04 00:00:00  2014-09-01 00:00:00  4         10
2   2014-09-02 00:00:00  2014-09-01 00:00:00  6         10
db<>fiddle here
edit: this answer was submitted without noticing the "Oracle" tag. Otherwise, the question is answered here: Oracle SQL - Sum and group data by week
Select employee_Id,
DATEPART(week, workday) as [Week],
sum (worktime) as [Weekly Hours]
from WORK
group by employee_id, DATEPART(week, workday)
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=238b229156a383fa3c466b6c3c2dee1e
Suppose I have patient admission/claim-wise data like the sample below. The data type of the patient_id and hosp_id columns is VARCHAR.
Table name: claims
rec_no  patient_id  hosp_id  admn_date   discharge_date
1       1           1        01-01-2020  10-01-2020
2       2           1        31-12-2019  11-01-2020
3       1           1        11-01-2020  15-01-2020
4       3           1        04-01-2020  10-01-2020
5       1           2        16-01-2020  17-01-2020
6       4           2        01-01-2020  10-01-2020
7       5           2        02-01-2020  11-01-2020
8       6           2        03-01-2020  12-01-2020
9       7           2        04-01-2020  13-01-2020
10      2           1        31-12-2019  10-01-2020
I have another table in which the bed strength/maximum occupancy of each hospital is stored.
Table name: beds
hosp_id  bed_strength
1        3
2        4
Expected results: I want to find out, hospital-wise, the dates on which the declared bed strength was exceeded.
Code I have tried: nothing, as I am new to SQL. However, I can solve this in R with the following strategy:
pivot_longer the dates
tidyr::complete() missing dates in between
summarise or aggregate results for each date.
I also want to know whether it can be done without pivoting in SQL, because the claims table has more than 15 million rows and pivoting really slows down the process. Please help.
You can use generate_series() to do something very similar in Postgres. For the occupancy by date:
select c.hosp_id, gs.date, count(*) as occupancy
from claims c cross join lateral
generate_series(admn_date, discharge_date, interval '1 day') gs(date)
group by c.hosp_id, gs.date;
Then use this as a subquery to get the dates that exceed the threshold:
select hd.*, b.bed_strength
from (select c.hosp_id, gs.date, count(*) as occupancy
from claims c cross join lateral
generate_series(c.admn_date, c.discharge_date, interval '1 day') gs(date)
group by c.hosp_id, gs.date
) hd join
beds b
using (hosp_id)
where hd.occupancy > b.bed_strength
I have a table like the one shown below, where I want to use the start and end dates to evenly distribute each row's value over the 3 months of its quarter, for every quarter between the start date and the end date (the last two columns).
I am familiar with generate_series and intervals in Postgres, but I am having a hard time getting what I want.
My table has an ID column that groups rows together, a quarter column that indicates which quarter the row references for the ID, a value column that is the value for the whole quarter (and every quarter in the date range), and start_date and end_date columns indicating the date range. Here is a sample:
ID quarter value start_date end_date
1 2 152 2019-11-07 2050-12-30
1 1 785 2019-11-07 2050-12-30
2 2 152 2019-03-05 2050-12-30
2 1 785 2019-03-05 2050-12-30
3 4 41 2018-06-12 2050-12-30
3 3 50 2018-06-12 2050-12-30
3 2 88 2018-06-12 2050-12-30
3 1 29 2018-06-12 2050-12-30
4 2 1607 2018-12-17 2050-12-30
4 1 4803 2018-12-17 2050-12-30
Here is my desired output (for ID 1):
ID quarter value start_date end_date
1 2 152/3 2020-04-01 2020-07-01
1 1 785/3 2020-01-01 2020-04-01
1 2 152/3 2021-04-01 2021-07-01
1 1 785/3 2021-01-01 2021-04-01
The start_date in the output is the start of the next quarter after the start_date in the first table. I need the series to be generated from the start_date to the end_date of the first table.
You can do this by using the GENERATE_SERIES function and passing in the start and end date for each unique (by ID) row and setting the interval to 3 months. Then join the result back with your original table on both ID and quarter.
Here's an example (note original_data is what I've called your first table):
WITH
quarters_table AS (
SELECT
t.ID,
(EXTRACT('month' FROM t.quarter_date) - 1)::INT / 3 + 1 AS quarter,
t.quarter_date::DATE AS start_date,
COALESCE(
LEAD(t.quarter_date) OVER (PARTITION BY t.ID ORDER BY t.quarter_date),
DATE_TRUNC('quarter', t.original_end_date) + INTERVAL '3 months'
)::DATE AS end_date
FROM (
SELECT
original_record.ID,
original_record.end_date AS original_end_date,
GENERATE_SERIES(
DATE_TRUNC('quarter', original_record.start_date),
DATE_TRUNC('quarter', original_record.end_date),
INTERVAL '3 months'
) AS quarter_date
FROM (
SELECT DISTINCT ON (original_data.ID)
original_data.ID,
original_data.start_date,
original_data.end_date
FROM
original_data
ORDER BY
original_data.ID
) AS original_record
) AS t
)
SELECT
quarters_table.ID,
quarters_table.quarter,
original_data.value::DOUBLE PRECISION / 3 AS value,
quarters_table.start_date,
quarters_table.end_date
FROM
quarters_table
INNER JOIN
original_data
ON
quarters_table.ID = original_data.ID
AND quarters_table.quarter = original_data.quarter;
Sample output:
id | quarter | value | start_date | end_date
----+---------+------------------+------------+------------
1 | 1 | 261.666666666667 | 2020-01-01 | 2020-04-01
1 | 2 | 50.6666666666667 | 2020-04-01 | 2020-07-01
1 | 1 | 261.666666666667 | 2021-01-01 | 2021-04-01
1 | 2 | 50.6666666666667 | 2021-04-01 | 2021-07-01
For completeness, here's the original_data table I've used in testing:
WITH
original_data AS (
SELECT
1 AS ID,
2 AS quarter,
152 AS value,
'2019-11-07'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
UNION ALL
SELECT
1 AS ID,
1 AS quarter,
785 AS value,
'2019-11-07'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
UNION ALL
SELECT
2 AS ID,
2 AS quarter,
152 AS value,
'2019-03-05'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
-- ...
)
This is one way to go about it, showing an example based on the output you've outlined. You can then add more conditions to the CASE/WHEN for additional quarters.
SELECT
ID,
Quarter,
Value/3 AS "Value",
CASE
WHEN Quarter = 1 THEN '2020-01-01'
WHEN Quarter = 2 THEN '2020-04-01'
END AS "Start_Date",
CASE
WHEN Quarter = 1 THEN '2020-04-01'
WHEN Quarter = 2 THEN '2020-07-01'
END AS "End_Date"
FROM
Table
I have this in my table
TempTable
Id Date
1 1-15-2010
2 2-14-2010
3 3-14-2010
4 4-15-2010
I would like to change every record so that they all have the same day (the 15th),
like this
TempTable
Id Date
1 1-15-2010
2 2-15-2010 <--change to 15
3 3-15-2010 <--change to 15
4 4-15-2010
What if I want the 30th instead?
The records should then be:
TempTable
Id Date
1 1-30-2010
2 2-28-2010 <--change to 28 because feb has 28 days only
3 3-30-2010 <--change to 30
4 4-30-2010
thanks
You can play some fun tricks with DATEADD/DATEDIFF:
create table T (
ID int not null,
DT date not null
)
insert into T (ID,DT)
select 1,'20100115' union all
select 2,'20100214' union all
select 3,'20100314' union all
select 4,'20100415'
SELECT ID,DATEADD(month,DATEDIFF(month,'20100101',DT),'20100115')
from T
SELECT ID,DATEADD(month,DATEDIFF(month,'20100101',DT),'20100130')
from T
Results:
ID
----------- -----------------------
1 2010-01-15 00:00:00.000
2 2010-02-15 00:00:00.000
3 2010-03-15 00:00:00.000
4 2010-04-15 00:00:00.000
ID
----------- -----------------------
1 2010-01-30 00:00:00.000
2 2010-02-28 00:00:00.000
3 2010-03-30 00:00:00.000
4 2010-04-30 00:00:00.000
Basically, in the DATEADD/DATEDIFF, you specify the same component to both (i.e. month). Then, the second date constant (i.e. '20100130') specifies the "offset" you wish to apply from the first date (i.e. '20100101'), which will "overwrite" the portion of the date you're not keeping. My usual example is when wishing to remove the time portion from a datetime value:
SELECT DATEADD(day,DATEDIFF(day,'20010101',<date column>),'20010101')
You can also try something like
UPDATE TempTable
SET [Date] = DATEADD(dd,15-day([Date]), DATEDIFF(dd,0,[Date]))
We have a function that calculates the first day of a month, so I just adapted it to calculate the 15th instead...
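For reference, the first-of-month helper that expression was presumably adapted from would be something like this (a guess, since the original function isn't shown):
-- same DATEADD/DATEDIFF trick, but with 1 instead of 15:
-- strips the time portion and moves the day to the 1st of the month
SELECT [Date],
       DATEADD(dd, 1 - DAY([Date]), DATEDIFF(dd, 0, [Date])) AS first_of_month
FROM TempTable;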