SQL get number of hours on previous rows - sql

I am working on a query that extracts information about a store's opening and closing times. This is the result set:
RTL_LOC_ID TRANS_TYPCODE BEGIN_DATETIME
---------- ------------------------------ ---------------------------
2390 WORKSTATION_OPEN 14.10.01 09:53:43,121000000
2390 WORKSTATION_CLOSE 14.10.01 23:51:49,729000000
2390 WORKSTATION_OPEN 14.10.02 09:57:47,768000000
2390 WORKSTATION_CLOSE 14.10.02 23:47:00,120000000
2390 WORKSTATION_OPEN 14.10.03 09:47:38,949000000
2390 WORKSTATION_CLOSE 14.10.03 23:45:42,602000000
6 rows selected
This is the query:
SELECT RTL_LOC_ID,TRANS_TYPCODE, BEGIN_DATETIME
FROM TRN_TRANS
WHERE(trans_typcode = 'WORKSTATION_OPEN' OR trans_typcode='WORKSTATION_CLOSE')
AND BUSINESS_DATE BETWEEN '14.10.01 00:00:00' AND '14.10.03 00:00:00'
ORDER BY BUSINESS_DATE, BEGIN_DATETIME ASC;
So I need to calculate the number of hours between the opening and closing of the store and place that value into a new column. I would also like to put the result for the day in the same row instead of two separate lines for each day.

This answer assumes MySQL since the question was not tagged with Oracle to begin with. I'm leaving this answer here, since it might inspire someone with Oracle skills toward a solution...
Assuming a location always opens before it closes, a quick and dirty solution could look like this:
SELECT RTL_LOC_ID, DATE(BUSINESS_DATE),
MIN(BEGIN_DATETIME) AS OpenTime,
MAX(BEGIN_DATETIME) AS CloseTime
FROM
TRN_TRANS
WHERE(trans_typcode = 'WORKSTATION_OPEN' OR trans_typcode='WORKSTATION_CLOSE')
AND BUSINESS_DATE BETWEEN '14.10.01 00:00:00' AND '14.10.03 00:00:00'
GROUP BY RTL_LOC_ID, DATE(BUSINESS_DATE)
ORDER BY DATE(BUSINESS_DATE)
Or if you want to be pedantic:
SELECT RTL_LOC_ID, DATE(BUSINESS_DATE),
MAX(CASE trans_typcode WHEN 'WORKSTATION_OPEN' THEN BEGIN_DATETIME ELSE NULL END) AS OpenTime,
MAX(CASE trans_typcode WHEN 'WORKSTATION_CLOSE' THEN BEGIN_DATETIME ELSE NULL END) AS CloseTime
FROM
-- rest of query same as above --
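The queries above pair the open and close rows, but the question also asks for the hours between them. Here is a minimal sketch of that last step, run against an in-memory SQLite database from Python so it can be tried without the real schema. The timestamps are the question's values rewritten in ISO format (an assumption about what `14.10.01` means), `hours_open` and `business_day` are illustrative names, and `julianday()` is SQLite-specific; MySQL or Oracle would need their own date arithmetic.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TRN_TRANS (RTL_LOC_ID INTEGER, TRANS_TYPCODE TEXT, BEGIN_DATETIME TEXT)"
)
conn.executemany(
    "INSERT INTO TRN_TRANS VALUES (?, ?, ?)",
    [
        (2390, "WORKSTATION_OPEN", "2014-10-01 09:53:43"),
        (2390, "WORKSTATION_CLOSE", "2014-10-01 23:51:49"),
        (2390, "WORKSTATION_OPEN", "2014-10-02 09:57:47"),
        (2390, "WORKSTATION_CLOSE", "2014-10-02 23:47:00"),
    ],
)

query = """
SELECT RTL_LOC_ID, business_day, open_time, close_time,
       -- julianday() returns fractional days, so multiply by 24 for hours
       ROUND((julianday(close_time) - julianday(open_time)) * 24, 2) AS hours_open
FROM (
    -- one row per location and day, with open and close side by side
    SELECT RTL_LOC_ID,
           DATE(BEGIN_DATETIME) AS business_day,
           MIN(CASE WHEN TRANS_TYPCODE = 'WORKSTATION_OPEN' THEN BEGIN_DATETIME END) AS open_time,
           MAX(CASE WHEN TRANS_TYPCODE = 'WORKSTATION_CLOSE' THEN BEGIN_DATETIME END) AS close_time
    FROM TRN_TRANS
    WHERE TRANS_TYPCODE IN ('WORKSTATION_OPEN', 'WORKSTATION_CLOSE')
    GROUP BY RTL_LOC_ID, DATE(BEGIN_DATETIME)
) AS per_day
ORDER BY business_day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

This sketch assumes a store opens and closes on the same calendar day; a close after midnight would need grouping by the business date instead.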

Is this what you mean?
SELECT
TRN_TRANS.RTL_LOC_ID,
DATE_FORMAT(TRN_TRANS.BEGIN_DATETIME ,'%Y-%m-%d') AS _date,
TIMEDIFF(closing_time.BEGIN_DATETIME , opening_time.BEGIN_DATETIME ) AS _hours
FROM TRN_TRANS
INNER JOIN
(
SELECT RTL_LOC_ID, BEGIN_DATETIME, DATE_FORMAT(BEGIN_DATETIME ,'%Y-%m-%d') as _date
FROM TRN_TRANS
WHERE TRANS_TYPCODE = 'WORKSTATION_OPEN'
) AS opening_time
ON
TRN_TRANS.RTL_LOC_ID = opening_time.RTL_LOC_ID
AND
DATE_FORMAT(TRN_TRANS.BEGIN_DATETIME ,'%Y-%m-%d') = opening_time._date
INNER JOIN
(
SELECT RTL_LOC_ID, BEGIN_DATETIME, DATE_FORMAT(BEGIN_DATETIME ,'%Y-%m-%d') as _date
FROM TRN_TRANS
WHERE TRANS_TYPCODE = 'WORKSTATION_CLOSE'
) AS closing_time
ON
TRN_TRANS.RTL_LOC_ID = closing_time.RTL_LOC_ID
AND
DATE_FORMAT(TRN_TRANS.BEGIN_DATETIME ,'%Y-%m-%d') = closing_time._date
GROUP BY TRN_TRANS.RTL_LOC_ID, _date, _hours

Related

How to solve a nested aggregate function in SQL?

I'm trying to use a nested aggregate function. I know that SQL does not support it, but I really need to do something like the query below. Basically, I want to count the number of users for each day. But I only want to count the users that haven't completed an order within a 15-day window (relative to a specific day) and that have completed an order within a 30-day window (relative to that day). I already know that it is not possible to solve this problem with a regular subquery (it does not allow the subquery values to change for each date). The "id" and "state" attributes relate to the orders. Also, I'm using Fivetran with Snowflake.
SELECT
db.created_at::date as Date,
count(case when
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-15,Date) and dateadd(day,-1,Date)) then db.id end)
= 0) and
(count(case when (db.state = 'finished')
and (db.created_at::date between dateadd(day,-30,Date) and dateadd(day,-16,Date)) then db.id end)
> 0) then db.user end)
FROM
data_base as db
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
In other words, I want to transform the below query in a way that the "current_date" changes for each date.
WITH completed_15_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-15,current_date) and dateadd(day,-1,current_date)
group by User
),
completed_16_days_before AS (
select
db.user as User,
count(case when db.state = 'finished' then db.id end) as Completed
from
data_base as db
where
db.created_at::date between dateadd(day,-30,current_date) and dateadd(day,-16,current_date)
group by User
)
SELECT
date(db.created_at) as Date,
count(distinct case when comp_15.Completed = 0 and comp_16.Completed > 0 then comp_15.user end) as "Total Users Churn",
count(distinct case when comp_15.Completed > 0 then comp_15.user end) as "Total Users Active",
week(Date) as Week
FROM
data_base as db
left join completed_15_days_before as comp_15 on comp_15.user = db.user
left join completed_16_days_before as comp_16 on comp_16.user = db.user
WHERE
db.created_at::date between '2020-01-01' and dateadd(day,-1,current_date)
GROUP BY Date
Does anyone have a clue on how to solve this puzzle? Thank you very much!
The following should give you roughly what you want. It is difficult to test without sample data, but it should be a good enough starting point for you to amend into exactly what you need.
I've commented the code to explain what each section is doing.
-- set parameter for the first date you want to generate the resultset for
set start_date = TO_DATE('2020-01-01','YYYY-MM-DD');
-- calculate the number of days between the start_date and the current date
set num_days = (Select datediff(day, $start_date , current_date()+1));
--generate a list of all the dates from the start date to the current date
-- i.e. every date that needs to appear in the resultset
WITH date_list as (
select
dateadd(
day,
'-' || row_number() over (order by null),
dateadd(day, '+1', current_date())
) as date_item
from table (generator(rowcount => ($num_days)))
)
--Create a list of all the orders that are in scope
-- i.e. 30 days before the start_date up to the current date
-- amend WHERE clause to in/exclude records as appropriate
,order_list as (
SELECT created_at, rt_id
from data_base
where created_at between dateadd(day,-30,$start_date) and current_date()
and state = 'finished'
)
SELECT dl.date_item
,COUNT (DISTINCT ol30.RT_ID) AS USER_COUNT
,COUNT (ol30.RT_ID) as ORDER_COUNT
FROM date_list dl
-- get all orders between -30 and -16 days of each date in date_list
left outer join order_list ol30 on ol30.created_at between dateadd(day,-30,dl.date_item) and dateadd(day,-16,dl.date_item)
-- exclude records that have the same RT_ID as in the ol30 dataset but have a date between 0 and -15 days of the date in date_list
WHERE NOT EXISTS (SELECT ol15.RT_ID
FROM order_list ol15
WHERE ol30.RT_ID = ol15.RT_ID
AND ol15.created_at between dateadd(day,-15,dl.date_item) and dl.date_item)
GROUP BY dl.date_item
ORDER BY dl.date_item;
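The date-list idea above can be tried on a small scale. The following is only a sketch under assumptions: it runs on in-memory SQLite from Python rather than Snowflake, the `orders` table and its `user_id` column are invented for the example, and it uses the question's windows (days -30 to -16 and days -15 to -1). The date list is built with a recursive CTE instead of Snowflake's `generator`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the real column names in the question differ.
conn.execute("CREATE TABLE orders (user_id TEXT, created_at TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("u1", "2020-01-05", "finished"),  # only an old order
        ("u2", "2020-01-05", "finished"),
        ("u2", "2020-01-25", "finished"),  # a recent order as well
    ],
)

query = """
WITH RECURSIVE date_list(d) AS (
    -- every date that must appear in the result
    SELECT DATE(:first_day)
    UNION ALL
    SELECT DATE(d, '+1 day') FROM date_list WHERE d < DATE(:last_day)
)
SELECT dl.d, COUNT(DISTINCT o30.user_id) AS churned_users
FROM date_list AS dl
LEFT JOIN orders AS o30
       ON o30.state = 'finished'
      AND DATE(o30.created_at) BETWEEN DATE(dl.d, '-30 days') AND DATE(dl.d, '-16 days')
      -- drop users who also finished an order in the 15 days before dl.d
      AND NOT EXISTS (
            SELECT 1
            FROM orders AS o15
            WHERE o15.user_id = o30.user_id
              AND o15.state = 'finished'
              AND DATE(o15.created_at) BETWEEN DATE(dl.d, '-15 days') AND DATE(dl.d, '-1 day')
      )
GROUP BY dl.d
ORDER BY dl.d
"""
rows = conn.execute(
    query, {"first_day": "2020-02-03", "last_day": "2020-02-05"}
).fetchall()
for row in rows:
    print(row)
```

Putting the `NOT EXISTS` test in the join condition rather than in a `WHERE` clause is a deliberate choice here: a date on which every candidate user is excluded still appears, with a count of 0.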

COUNT from DISTINCT values in multiple columns

If this has been asked before, I apologize, I wasn't able to find a question/solution like it before breaking down and posting. I have the below query (using Oracle SQL) that works fine in a sense, but not fully what I'm looking for.
SELECT
order_date,
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END AS issue_group,
srt AS srt_level,
COUNT(*) AS total_orders
FROM
database.t_con
WHERE
order_date IN (
'&Enter_Date_YYYYMM'
)
GROUP BY
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END,
srt,
order_date
ORDER BY
p_category,
issue_group,
srt_level,
order_date
Current Return (12 rows):
Needed Return (8 rows without the tan rows being shown):
Here is the logic of total_order column that I'm expecting:
count of order_date where (srt_level = 80 + 100 + Late) ... the 'Late' counts need to be added to the total, just not displayed
I'm eventually adding a filled_orders column that will go before the total_orders column, but I'm just not there yet.
Sorry I wasn't as descriptive earlier. Thanks again!
You don't appear to need a subquery; if you want the count for each combination of values then group by those, and aggregate at that level; something like:
SELECT
t1.order_date,
t1.p_category,
CASE
WHEN ( t1.issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END AS issue_group,
t1.srt AS srt_level,
COUNT(*) AS total_orders
FROM
database.t_con t1
WHERE
t1.order_date = TO_DATE ( '&Enter_Date_YYYYMM', 'YYYYMM' )
GROUP BY
t1.p_category,
CASE
WHEN ( t1.issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END,
t1.srt,
t1.order_date
ORDER BY
p_category,
issue_group,
srt_level,
order_date;
You shouldn't be relying on implicit conversion and NLS settings for your date argument (assuming order_date is actually a date column, not a string), so I've used an explicit TO_DATE() call, using the format suggested by your substitution variable name and prompt.
However, that will give you the first day of the supplied month, since a day number isn't being supplied. It's more likely that you either want to prompt for a full date, or (possibly) just the year/month but want to include all days in that month - which IN() will not do, if that was your intention. It also implies that stored dates all have their time portions set to midnight, as that is all it will match on. If those values have non-midnight times then you need a range to pick those up too.
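To make the range point concrete, here is a hedged sketch of a half-open month range built from a `YYYYMM` value. It runs on in-memory SQLite from Python; the `month_bounds` helper and the sample rows are made up for illustration, and dates are stored as ISO text, which is a SQLite convention rather than anything from the question.

```python
import sqlite3
from datetime import date


def month_bounds(yyyymm):
    """Return (first day of the month, first day of the next month) as ISO strings."""
    year, month = int(yyyymm[:4]), int(yyyymm[4:6])
    start = date(year, month, 1)
    following = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), following.isoformat()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_con (order_date TEXT, p_category TEXT)")
conn.executemany(
    "INSERT INTO t_con VALUES (?, ?)",
    [
        ("2020-03-01 00:00:00", "A"),  # midnight on the first day
        ("2020-03-31 18:45:00", "A"),  # non-midnight time on the last day
        ("2020-04-01 00:00:00", "A"),  # belongs to the next month
    ],
)

start, following = month_bounds("202003")
# >= start and < following picks up every time of day within the month
total_orders = conn.execute(
    "SELECT COUNT(*) FROM t_con WHERE order_date >= ? AND order_date < ?",
    (start, following),
).fetchone()[0]
print(start, following, total_orders)
```

In Oracle the same shape would be `order_date >= TO_DATE(:ym, 'YYYYMM') AND order_date < ADD_MONTHS(TO_DATE(:ym, 'YYYYMM'), 1)`, where `:ym` is a placeholder for the supplied month.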
I got it working to the extent of what my question was. Just needed to nest each column where counts/calculations were happening.
SELECT
order_date,
p_category,
issue_group,
srt_level,
order_count,
SUM(order_count) OVER(
PARTITION BY order_date, issue_group, p_category
) AS total_orders
FROM
(
SELECT
order_date,
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END AS issue_group,
srt AS srt_level,
COUNT(*) AS order_count
FROM
database.t_con
WHERE
order_date IN (
'&Enter_Date_YYYYMM'
)
GROUP BY
p_category,
CASE
WHEN ( issue_grp = 1 ) THEN '1'
ELSE '2/3 '
END,
srt,
order_date
)
ORDER BY
order_date,
p_category,
issue_group

Need to calc start and end date from single effective date

I am trying to write SQL to calculate the start and end date from a single date called effective date for each item. Below is an idea of how my data looks. There are times when the last effective date for an item will be in the past, so I want the end date for that to be a year from today. The other two items in the table example have effective dates in the future, so there is no need to create an end date of a year from today.
I have tried a few ways but always run into bad data. Below is an example of my query and the bad results
select distinct tb1.itemid,tb1.EffectiveDate as startdate
, case
when dateadd(d,-1,tb2.EffectiveDate) < getdate()
or tb2.EffectiveDate is null
then getdate() +365
else dateadd(d,-1,tb2.EffectiveDate)
end as enddate
from #test tb1
left join #test as tb2 on (tb2.EffectiveDate > tb1.EffectiveDate
or tb2.effectivedate is null) and tb2.itemid = tb1.itemid
left join #test tb3 on (tb1.EffectiveDate < tb3.EffectiveDate
and tb3.EffectiveDate < tb2.EffectiveDate or tb2.effectivedate is null)
and tb1.itemid = tb3.itemid
left join #test tb4 on tb1.effectivedate = tb4.effectivedate
and tb1.itemid = tb4.itemid
where tb1.itemID in (62741,62740, 65350)
Results - there is an extra line for 62740
Bad Results
I expect to see below since the first two items have a future end date no need to create an end date of today + 365 but the last one only has one effective date so we have to calculate the end date.
I think I've read your question correctly. If you could provide your expected output it would help a lot.
Test Data
CREATE TABLE #TestData (itemID int, EffectiveDate date)
INSERT INTO #TestData (itemID, EffectiveDate)
VALUES
(62741,'2016-06-25')
,(62741,'2016-06-04')
,(62740,'2016-07-09')
,(62740,'2016-06-25')
,(62740,'2016-06-04')
,(65350,'2016-05-28')
Query
SELECT
a.itemID
,MIN(a.EffectiveDate) StartDate
,MAX(CASE WHEN b.MaxDate > GETDATE() THEN b.MaxDate ELSE CONVERT(date,DATEADD(yy,1,GETDATE())) END) EndDate
FROM #TestData a
JOIN (SELECT itemID, MAX(EffectiveDate) MaxDate FROM #TestData GROUP BY itemID) b
ON a.itemID = b.itemID
GROUP BY a.itemID
Result
itemID StartDate EndDate
62740 2016-06-04 2016-07-09
62741 2016-06-04 2016-06-25
65350 2016-05-28 2017-06-24
This should do it:
SELECT itemid
,effective_date AS "Start"
,(SELECT MIN(effective_date)
FROM effective_date_tbl
WHERE effective_date > edt.effective_date
AND itemid = edt.itemid) AS "End"
FROM effective_date_tbl edt
WHERE effective_date <
(SELECT MAX(effective_date) FROM effective_date_tbl WHERE itemid = edt.itemid)
UNION ALL
SELECT itemid
,effective_date AS "Start"
,(SYSDATE + 365) AS "End"
FROM effective_date_tbl edt
WHERE 1 = ( SELECT COUNT(*) FROM effective_date_tbl WHERE itemid = edt.itemid )
ORDER BY 1, 2, 3;
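As an alternative to the correlated subqueries, the "next effective date" lookup can be written with the `LEAD` window function. This is a different technique from the answer above, sketched under assumptions: each period ends the day before the next effective date for the same item, and, mirroring the CASE in the question, a missing or already-past end date becomes a year from today. It runs on in-memory SQLite (3.25 or later for window functions) with a fixed `today` so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (itemID INTEGER, EffectiveDate TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        (62741, "2016-06-25"),
        (62741, "2016-06-04"),
        (62740, "2016-07-09"),
        (62740, "2016-06-25"),
        (62740, "2016-06-04"),
        (65350, "2016-05-28"),
    ],
)

query = """
WITH ordered AS (
    -- next_date is the following effective date for the same item, or NULL
    SELECT itemID,
           EffectiveDate,
           LEAD(EffectiveDate) OVER (PARTITION BY itemID ORDER BY EffectiveDate) AS next_date
    FROM items
)
SELECT itemID,
       EffectiveDate AS StartDate,
       CASE
           WHEN next_date IS NULL OR DATE(next_date, '-1 day') < :today
               THEN DATE(:today, '+365 days')
           ELSE DATE(next_date, '-1 day')
       END AS EndDate
FROM ordered
ORDER BY itemID, StartDate
"""
# "today" is pinned for the example; in practice it would be the current date
rows = conn.execute(query, {"today": "2016-06-10"}).fetchall()
for row in rows:
    print(row)
```

SQL Server and Oracle both provide `LEAD` as well, with `DATEADD` or plain date arithmetic in place of SQLite's `DATE(..., modifier)`.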
I did this exercise for items that have multiple EffectiveDate values in the table.
You can create this view:
CREATE view [VW_TESTDATA]
AS ( SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY Item,CONVERT(datetime,EffectiveDate,110)) AS ID, Item, DATA
FROM MyTable ) AS Q
)
Then use a select that joins each row to the next row for the same Item:
select * from [VW_TESTDATA] as A inner join [VW_TESTDATA] as B on A.Item = B.Item and A.id = B.id-1
This way each result row always carries the earlier and the later date.
I did not understand how to handle items with only one date, but that seems the simplest part and can be added to this query with a UNION ALL, because the view does not cover single-date items.
You also need to figure out how to deal with an Item that has two equal EffectiveDate values.
You should use a CASE WHEN expression.
[wrong query, due to a misunderstanding of the requirements]
SELECT
ItemID AS Item,
StartDate,
CASE WHEN EndDate < Sysdate THEN Sysdate + 365 ELSE EndDate END AS EndDate
FROM
(
SELECT tabStartDate.ItemID, tabStartDate.EffectiveDate AS StartDate, tabEndDate.EffectiveDate AS EndDate
FROM TableItems tabStartDate
JOIN TableItems tabEndDate on tabStartDate.ItemID = tabEndDate.ItemID
) TableDatesPerItem
WHERE StartDate < EndDate
Update after clarifications in the OP and some comments:
I found a fairly portable solution. It doesn't use partitioning but relies on a kind of indexing rule that matches each date of an item with the next date for the same item, in chronological order.
The portability obviously applies to the "difficult" part of the query; the row-numbering mechanism and the date conversions need to be adapted, but I don't think that is a problem.
I've posted a MySQL version that you can try on SQL Fiddle.
Table
CREATE TABLE ITEMS
(`ItemID` int, `EffectiveDate` Date);
INSERT INTO ITEMS
(`ItemID`, `EffectiveDate`)
VALUES
(62741, DATE(20160625)),
(62741, DATE(20160604)),
(62740, DATE(20160709)),
(62740, DATE(20160625)),
(62740, DATE(20160604)),
(62750, DATE(20160528))
;
Query
SELECT
RESULT.ItemID AS ItemID,
DATE_FORMAT(RESULT.StartDate,'%m/%d/%Y') AS StartDate,
CASE WHEN RESULT.EndDate < CURRENT_DATE
THEN DATE_FORMAT((CURRENT_DATE + INTERVAL 365 DAY),'%m/%d/%Y')
ELSE DATE_FORMAT(RESULT.EndDate,'%m/%d/%Y')
END AS EndDate
FROM
(
SELECT
tabStartDate.ItemID AS ItemID,
tabStartDate.StartDate AS StartDate,
tabEndDate.EndDate
,tabStartDate.IDX,
tabEndDate.IDX AS IDX2
FROM
(
SELECT
tabStartDateIDX.ItemID AS ItemID,
tabStartDateIDX.EffectiveDate AS StartDate,
@rownum:=@rownum+1 AS IDX
FROM ITEMS AS tabStartDateIDX
ORDER BY tabStartDateIDX.ItemID, tabStartDateIDX.EffectiveDate
)AS tabStartDate
JOIN
(
SELECT
tabEndDateIDX.ItemID AS ItemID,
tabEndDateIDX.EffectiveDate AS EndDate,
@rownum:=@rownum+1 AS IDX
FROM ITEMS AS tabEndDateIDX
ORDER BY tabEndDateIDX.ItemID, tabEndDateIDX.EffectiveDate
)AS tabEndDate
ON tabStartDate.ItemID = tabEndDate.ItemID AND (tabEndDate.IDX - tabStartDate.IDX = ((select count(*) from ITEMS)+1) )
,(SELECT @rownum:=0) r
UNION
(
SELECT
tabStartDateSingleItem.ItemID AS ItemID,
tabStartDateSingleItem.EffectiveDate AS StartDate,
tabStartDateSingleItem.EffectiveDate AS EndDate
,0 AS IDX,0 AS IDX2
FROM ITEMS AS tabStartDateSingleItem
Group By tabStartDateSingleItem.ItemID
HAVING Count(tabStartDateSingleItem.ItemID) = 1
)
) AS RESULT
;

Filling in missing dates DB2 SQL

My initial query looks like this:
select process_date, count(*) batchCount
from T1.log_comments
group by process_date
order by process_date asc;
I need to be able to do some quick analysis for weekends that are missing, but wanted to know if there was a quick way to fill in the missing dates not present in process_date.
I've seen the solution here but am curious if there's any magic hidden in db2 that could do this with only a minor modification to my original query.
Note: Not tested, framed it based on my exposure to SQL Server/Oracle. I guess this gives you the idea though:
*now amended and tested on DB2*
WITH MaxDateQry(MaxDate) AS
(
SELECT MAX(process_date) FROM T1.log_comments
),
MinDateQry(MinDate) AS
(
SELECT MIN(process_date) FROM T1.log_comments
),
DatesData(ProcessDate) AS
(
SELECT MinDate from MinDateQry
UNION ALL
SELECT (ProcessDate + 1 DAY) FROM DatesData WHERE ProcessDate < (SELECT MaxDate FROM MaxDateQry)
)
SELECT a.ProcessDate, b.batchCount
FROM DatesData a LEFT JOIN
(
SELECT process_date, COUNT(*) batchCount
FROM T1.log_comments
GROUP BY process_date
) b
ON a.ProcessDate = b.process_date
ORDER BY a.ProcessDate ASC;
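The same recursive-CTE pattern can be checked quickly outside DB2. Below is a small sketch on in-memory SQLite from Python, with invented sample rows and ISO text dates; it also wraps the count in `COALESCE` so the filled-in dates show 0 rather than NULL, which is an addition to the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_comments (process_date TEXT)")
# Sample data with a weekend gap (2012-03-31 and 2012-04-01 are missing)
conn.executemany(
    "INSERT INTO log_comments VALUES (?)",
    [("2012-03-30",), ("2012-03-30",), ("2012-04-02",)],
)

query = """
WITH RECURSIVE dates(d) AS (
    -- start at the earliest date and add one day until the latest date
    SELECT MIN(process_date) FROM log_comments
    UNION ALL
    SELECT DATE(d, '+1 day') FROM dates
    WHERE d < (SELECT MAX(process_date) FROM log_comments)
)
SELECT dates.d, COALESCE(c.batch_count, 0) AS batch_count
FROM dates
LEFT JOIN (
    SELECT process_date, COUNT(*) AS batch_count
    FROM log_comments
    GROUP BY process_date
) AS c ON c.process_date = dates.d
ORDER BY dates.d
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```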

sql db2 select records from either table

I have an order file, with order id and ship date. Orders can only be shipped Monday through Friday. This means there are no records selected for Saturday and Sunday.
I use the same order file to get all order dates, with the date in the same format (yyyymmdd).
I want to select a count of all the records from the order file based on order date, and (I believe) full outer join (or maybe right join?) the date file, because I would like to see
20120330 293
20120331 0
20120401 0
20120402 920
20120403 430
20120404 827
etc...
However, my SQL statement is still not returning a zero record for the 31st and 1st.
with DatesTable as (
select ohordt "Date" from kivalib.orhdrpf
where ohordt between 20120315 and 20120406
group by ohordt order by ohordt
)
SELECT ohscdt, count(OHTXN#) "Count"
FROM KIVALIB.ORHDRPF full outer join DatesTable dts on dts."Date" = ohordt
--/*order status = filled & order type = 1 & date between (some fill date range)*/
WHERE OHSTAT = 'F' AND OHTYP = 1 and ohscdt between 20120401 and 20120406
GROUP BY ohscdt ORDER BY ohscdt
Any ideas what I'm doing wrong?
Thanks!
Because there is no data for those days, they do not show up as rows. You can use a recursive CTE to build a contiguous list of dates between two values that the query can join on.
It will look something like:
WITH dates (val) AS (
SELECT CAST('2012-04-01' AS DATE)
FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT Val + 1 DAYS
FROM dates
WHERE Val < CAST('2012-04-06' AS DATE)
)
SELECT d.val AS "Date", o.ohscdt, COALESCE(COUNT(o.ohtxn#), 0) AS "Count"
FROM dates AS d
LEFT JOIN KIVALIB.ORHDRPF AS o
ON o.ohordt = TO_CHAR(d.val, 'YYYYMMDD')
AND o.ohstat = 'F'
AND o.ohtyp = 1