Reporting task completion status with only create and operation_date params - sql

I have two tables: the first stores task data (task name, create date, assign_to, etc.) and the second stores task history data, e.g. operation_date, task completed, task rejected, etc. (the Task and Task_history tables).
The company creates tasks and assigns them to employees; the employees then accept the tasks and complete them.
The task create_date column specifies the order in which the tasks should be done; the operation_date and completed status columns together specify the order in which the tasks were actually completed.
I need a query for a per-employee report: does an employee complete the tasks in the sequence specified at the beginning? How many tasks were completed in accordance with the given sequence?
I tried a query over completed tasks that orders one table by task creation date and the other by operation_date for an employee on a given day, adds ROWNUM to each select, and then joins the two. If the rownums are equal, the employee completed the task in the given sequence; otherwise not. But the query result was not what I expected. The rownums came out like this: r_h --> 1,2,3 ; r_t --> 1,15,17
SELECT *
FROM (SELECT W.id, w.create_date, ROWNUM as r_t
FROM wfm_task_1 W where W.task_status = 3
ORDER BY W.create_date ASC) TASK_SEQ LEFT OUTER JOIN
( SELECT H.wfm_task, H.record_date, ROWNUM as r_h
FROM wfm_task_history H
WHERE H.task_status = 3
AND H.record_date BETWEEN (TO_DATE ('12.07.2013',
'DD.MM.YYYY')
- 1)
AND (TO_DATE ('12.07.2013',
'DD.MM.YYYY')
+ 1)
ORDER BY H.record_date ASC) HISTORY_SEQ
ON TASK_SEQ.id = HISTORY_SEQ.wfm_task
Sample dataset
wfm_task (ID, CREATION_DATE, TASK_NAME)
49361 | 06.07.2013 11:50:00 | missionx
49404 | 10.07.2013 13:01:00 | missiony
49407 | 11.07.2013 11:02:00 | missiona
49108 | 01.07.2013 21:02:00 | missionb
task_history (ID,WFM_TASK,OP_DATE, STATUS)
98 | 49361 | 12.07.2013 15:19:19 | 3
92 | 49404 | 12.07.2013 11:10:50 | 3
90 | 49407 | 12.07.2013 11:06:58 | 3
78 | 49108 | 03.07.2013 11:02:00 | 1
result (WFM_TASK,RECORD_DATE,R_H,ID,CREATE_DATE,R_T)
49361 | 12.07.2013 15:19:19 | 3 | 49361 | 06.07.2013 11:50:00 | 15
49404 | 12.07.2013 11:10:50 | 2 | 49404 | 10.07.2013 13:01:00 | 17
49407 | 12.07.2013 11:06:58 | 1 | 49407 | 11.07.2013 11:02:00 | 1
Status 3 = completed. I want to find out whether the tasks were completed in order, i.e. whether the task completion order matches the task creation order.

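You'll probably have to use the ROW_NUMBER function instead of ROWNUM. ROWNUM is assigned before the ORDER BY is applied, which is why your r_t values come out as 1, 15, 17.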
SELECT a.id, a.create_date,
row_number() over (order by a.create_date) r_t,
b.record_date,
row_number() over (order by b.record_date) r_h
from wfm_task a left outer join task_history b
on a.id = b.wfm_task
where b.status = 3
and b.record_date between date'2013-07-12' - 1 and date'2013-07-12' + 1
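As a quick sanity check of the ROW_NUMBER idea, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and compares the two rankings. SQLite standing in for Oracle, the ISO date strings, and the use of op_date from the sample dataset (rather than record_date) are all assumptions made for illustration; window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; dates rewritten as ISO strings (assumption)
# so that plain string comparison orders them correctly in SQLite.
conn.executescript("""
CREATE TABLE wfm_task (id INTEGER PRIMARY KEY, create_date TEXT, task_name TEXT);
CREATE TABLE task_history (id INTEGER PRIMARY KEY, wfm_task INTEGER,
                           op_date TEXT, status INTEGER);
INSERT INTO wfm_task VALUES
  (49361, '2013-07-06 11:50:00', 'missionx'),
  (49404, '2013-07-10 13:01:00', 'missiony'),
  (49407, '2013-07-11 11:02:00', 'missiona'),
  (49108, '2013-07-01 21:02:00', 'missionb');
INSERT INTO task_history VALUES
  (98, 49361, '2013-07-12 15:19:19', 3),
  (92, 49404, '2013-07-12 11:10:50', 3),
  (90, 49407, '2013-07-12 11:06:58', 3),
  (78, 49108, '2013-07-03 11:02:00', 1);
""")

# r_t ranks by creation order, r_h by completion order. Both are computed
# over the same filtered row set, so they are directly comparable.
rows = conn.execute("""
SELECT t.id,
       ROW_NUMBER() OVER (ORDER BY t.create_date) AS r_t,
       ROW_NUMBER() OVER (ORDER BY h.op_date)     AS r_h
FROM wfm_task t
JOIN task_history h ON h.wfm_task = t.id
WHERE h.status = 3
  AND h.op_date BETWEEN '2013-07-11' AND '2013-07-13'
ORDER BY r_t
""").fetchall()

# A task was done "in sequence" when both ranks agree.
in_sequence = [task_id for task_id, r_t, r_h in rows if r_t == r_h]
print(rows)
print(len(in_sequence), "task(s) completed in the planned position")
```

With the sample data only task 49404 sits at the same position in both orderings. To report per employee, the same two ROW_NUMBER calls would take a PARTITION BY on the assignee column, which the sample dataset does not show.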

Self join to create a new column with updated records

I am trying to write a SQL query to get the start date for employees in a store. As seen in the first screenshot, employee number 5041 previously had the number A0EH, but when the number was updated, the start date for the employee was updated as well. This affects the metric of total duration in the store.
I am trying to get to the output below but haven't been able to figure out how to get this view.
This is the code I was trying but I am not getting the correct output.
select
esd.employee_number,
(case when esd.old_employee_number is null then es.employee_number else es.old_employee_number end) as old_employee_number,
esd.entity_id,
esd.original_start_date
from earliest_start_date as esd
left join earliest_start_date as es
on (es.employee_number = esd.old_employee_number)
How do I solve this in SQL?
Redshift reportedly supports recursion via the WITH clause, and MariaDB 10.5 has similar support. The example below is a fully working test case run on MariaDB 10.5 (updated).
For details, see the Amazon Redshift documentation on the WITH clause and on window functions.
WITH RECURSIVE cte (employee_number, original_no, entity_id, original_start_date, n) AS (
SELECT employee_number, employee_number, entity_id, original_start_date, 1 FROM earliest_start_date WHERE old_employee_number IS NULL UNION ALL
SELECT new_tbl.employee_number, cte.original_no, cte.entity_id, cte.original_start_date, n+1
FROM earliest_start_date new_tbl
JOIN cte
ON cte.employee_number = new_tbl.old_employee_number
)
, xrows AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY n DESC) AS rn
FROM cte
)
SELECT * FROM xrows WHERE rn = 1
;
Result:
+-----------------+-------------+-----------+---------------------+------+----+
| employee_number | original_no | entity_id | original_start_date | n | rn |
+-----------------+-------------+-----------+---------------------+------+----+
| XXXX | XXXX | 88 | 2021-09-02 | 1 | 1 |
| 5041 | A0EH | 96 | 2021-09-05 | 2 | 1 |
+-----------------+-------------+-----------+---------------------+------+----+
2 rows in set
Raw test data:
SELECT * FROM earliest_start_date;
+-----------------+---------------------+-----------+---------------------+
| employee_number | old_employee_number | entity_id | original_start_date |
+-----------------+---------------------+-----------+---------------------+
| 5041 | A0EH | 96 | 2021-09-10 |
| A0EH | NULL | 96 | 2021-09-05 |
| XXXX | NULL | 88 | 2021-09-02 |
+-----------------+---------------------+-----------+---------------------+
Note that the logic assumes employee_number is unique. In its current form it can't handle cases where an employee_number is reused by the same employee, or used again for a different employee, without adjusting prior data. There may not be enough detail in the current structure to handle those cases.

How to aggregate based on various conditions

Let's say I have a table which stores ItemID, Date and Total_shipped over a period of time:
ItemID | Date | Total_shipped
__________________________________
1 | 1/20/2000 | 2
2 | 1/20/2000 | 3
1 | 1/21/2000 | 5
2 | 1/21/2000 | 4
1 | 1/22/2000 | 1
2 | 1/22/2000 | 7
1 | 1/23/2000 | 5
2 | 1/23/2000 | 6
Now I want to aggregate over several periods of time. For example, I want to know how many of each item were shipped every two days and in total. So the desired output should look something like:
ItemID | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
_____________________________________________
1 | 7 | 6 | 13
2 | 7 | 13 | 20
How do I do that in the most efficient way?
I know I can write three different subqueries, but I think there should be a better way. My real data is large and there are several different time periods to consider, i.e. in my real problem I want the shipped items for current_week, last_week, two_weeks_ago, three_weeks_ago, last_month, two_months_ago, three_months_ago, so I do not think writing 7 different subqueries would be a good idea.
Here is the general idea of what I can already run, but it is very expensive for the database:
WITH
sq1 as (
SELECT ItemID, sum(Total_shipped) sum1
FROM table
WHERE Date BETWEEN '1/20/2000' and '1/21/2000'
GROUP BY ItemID),
sq2 as (
SELECT ItemID, sum(Total_Shipped) sum2
FROM table
WHERE Date BETWEEN '1/22/2000' and '1/23/2000'
GROUP BY ItemID),
sq3 as(
SELECT ItemID, sum(Total_Shipped) sum3
FROM Table
GROUP BY ItemID)
SELECT ItemID, sq1.sum1, sq2.sum2, sq3.sum3
FROM Table
JOIN sq1 on Table.ItemID = sq1.ItemID
JOIN sq2 on Table.ItemID = sq2.ItemID
JOIN sq3 on Table.ItemID = sq3.ItemID
I don't know why you have tagged this question with multiple databases.
Anyway, you can use conditional aggregation, as follows in Oracle:
select
item_id,
sum(case when "date" between date'2000-01-20' and date'2000-01-21' then total_shipped end) as "Jan20-Jan21",
sum(case when "date" between date'2000-01-22' and date'2000-01-23' then total_shipped end) as "Jan22-Jan23",
sum(case when "date" between date'2000-01-20' and date'2000-01-23' then total_shipped end) as "Jan20-Jan23"
from my_table
group by item_id
Cheers!!
Use FILTER:
select
item_id,
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-21') as "Jan20-Jan21",
sum(total_shipped) filter (where date between '2000-01-22' and '2000-01-23') as "Jan22-Jan23",
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-23') as "Jan20-Jan23"
from my_table
group by 1
item_id | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
---------+-------------+-------------+-------------
1 | 7 | 6 | 13
2 | 7 | 13 | 20
(2 rows)
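Since the real problem has seven periods rather than three, one option is to generate the conditional-aggregation columns from a list of periods instead of typing each one. The following is only a sketch under assumptions: it uses Python's sqlite3 with an in-memory table named shipments, a ship_date column holding ISO date strings, and made-up period names, none of which come from the question. It uses SUM(CASE ...) rather than FILTER so it also runs on engines without FILTER support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipments (item_id INTEGER, ship_date TEXT, total_shipped INTEGER)"
)
# Sample rows from the question, dates rewritten as ISO strings (assumption).
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?, ?)",
    [
        (1, "2000-01-20", 2), (2, "2000-01-20", 3),
        (1, "2000-01-21", 5), (2, "2000-01-21", 4),
        (1, "2000-01-22", 1), (2, "2000-01-22", 7),
        (1, "2000-01-23", 5), (2, "2000-01-23", 6),
    ],
)

# Hypothetical period definitions: column name -> (first day, last day).
periods = {
    "jan20_jan21": ("2000-01-20", "2000-01-21"),
    "jan22_jan23": ("2000-01-22", "2000-01-23"),
    "jan20_jan23": ("2000-01-20", "2000-01-23"),
}

# One SUM(CASE ...) column per period. The column names come from the trusted
# dict above; the date bounds are passed as bound parameters.
columns = ", ".join(
    f"SUM(CASE WHEN ship_date BETWEEN ? AND ? THEN total_shipped ELSE 0 END) AS {name}"
    for name in periods
)
params = [bound for bounds in periods.values() for bound in bounds]

sql = f"SELECT item_id, {columns} FROM shipments GROUP BY item_id ORDER BY item_id"
totals = conn.execute(sql, params).fetchall()
print(totals)
```

The table is still scanned only once regardless of how many periods are listed, which is the main advantage over one subquery per period.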

Oracle SQL - Selecting records into groups and filtering based on a comparison of row 1 + row 2

I've got a database that contains data on monitored manufacturing machines that has these fields within (and more) :
ID | WORK_ORDER_ID | WORK_CENTER_ID | MFGNO | ...
The records are real-time data and are entered sequentially whenever work_order_id changes. I want to check, between consecutive work orders, whether the MFGNO is the same, grouped by work_center_id.
For example:
1. 998 | 100 | 205 | TEST_MFG
2. 997 | 100 | 205 | TEST_MFG
This would return true (or 1 row), as the mfgno's are the same.
Currently I'm able to do this for each work_center_id individually like this:
SELECT * FROM
(
select * FROM (select ID, WORKORDER_ID, TIMESTAMP, MFGNO from
HIST_ILLUM_RT where WORK_CENTER_ID = 5237
ORDER BY ID desc) where rownum = 1
)
where MFGNO = (
SELECT mfgno FROM
(
select * FROM (select ID, WORKORDER_ID, TIMESTAMP, MFGNO from
HIST_ILLUM_RT where WORK_CENTER_ID = 5237
ORDER BY ID desc
) where rownum < 3 order by id asc
) where rownum = 1
)
This produces 0 rows if there are currently no back-to-back matching MFGNOs, or 1 row if there are.
This way I have to write this expression for each individual work_center_id (there are about 40). I want an expression that checks the top two rows of each work_center_id group and only returns a row if the MFGNOs match.
For example:
1. 998 | 101 | 205 | TEST_MFG
2. 997 | 098 | 206 | SomethingElse
3. 996 | 424 | 205 | TEST_MFG
4. 995 | 521 | 206 | NotAMatch
5. 994 | 123 | 205 | Doesn'tCompareThis
6. 993 | 664 | 195 | Irrelevant
For this it would return only 1 row, as only work_center_id = 205 has a back-to-back MFGNO match (its rows 1 and 2), whereas 206, for example, does not.
I'm running Oracle 11g, which seems to be limiting me; I am unable to upgrade and have not found a workaround to write this expression in the current version.
I think you want lag() and some logic:
select count(*)
from (select t.*,
lag(MFGNO) over (partition by WORK_CENTER_ID order by id) as prev_mfgno
from t
) t
where prev_mfgno = mfgno
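Note that the query above counts every back-to-back match in the whole history. If only the two most recent rows per work center should be compared, one possible variation is to add ROW_NUMBER() and keep just the newest row of each partition. LAG and ROW_NUMBER are both analytic functions available in Oracle 11g; the sketch below runs the SQL against an in-memory SQLite database from Python purely as a stand-in (an assumption), using the second sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hist_illum_rt ("
    "id INTEGER PRIMARY KEY, work_order_id TEXT, work_center_id INTEGER, mfgno TEXT)"
)
# Second sample from the question: ID | WORK_ORDER_ID | WORK_CENTER_ID | MFGNO
conn.executemany(
    "INSERT INTO hist_illum_rt VALUES (?, ?, ?, ?)",
    [
        (998, "101", 205, "TEST_MFG"),
        (997, "098", 206, "SomethingElse"),
        (996, "424", 205, "TEST_MFG"),
        (995, "521", 206, "NotAMatch"),
        (994, "123", 205, "Doesn'tCompareThis"),
        (993, "664", 195, "Irrelevant"),
    ],
)

# prev_mfgno is the MFGNO of the previous row within the same work center.
# rn = 1 marks the newest row per work center, so the WHERE clause compares
# only the top two rows of each group.
matches = conn.execute("""
    SELECT work_center_id, id, mfgno
    FROM (
        SELECT id, work_center_id, mfgno,
               LAG(mfgno) OVER (PARTITION BY work_center_id ORDER BY id) AS prev_mfgno,
               ROW_NUMBER() OVER (PARTITION BY work_center_id ORDER BY id DESC) AS rn
        FROM hist_illum_rt
    )
    WHERE rn = 1 AND prev_mfgno = mfgno
""").fetchall()
print(matches)
```

Work center 195 has a single row, so its prev_mfgno is NULL and it drops out of the result without any special handling.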

Show last update date

I am new to this forum and also new to SQL. My question is:
I have an Excel sheet linked to a database via "From Microsoft Query". Three tables are joined together: pd_ln, pdcflbrt, pdlbr.
Using the following query I get this data:
SELECT pdcflbrt.lbrcod, pdcflbrt.lbrrat, pd_ln.prdnum, pdcflbrt.begeffdat
FROM velocity.dbo.pd_ln pd_ln, velocity.dbo.pdcflbrt pdcflbrt, velocity.dbo.pdlbr pdlbr
WHERE pdlbr.lbrrattky = pdcflbrt.lbrrattky AND pd_ln.pd_ln_tky = pdlbr.pd_ln_tky
+--------------+--------------+-----------+------------------+
| lbrcod | lbrrat | prdnum | begeffdat |
+--------------+--------------+-----------+------------------+
| FC Braselton | 0.11 | 00236 | 7/15/2012 0:00 |
| FC Braselton | 0.11 | 00236 | 7/15/2012 0:00 |
| FC Braselton | 0.1 | 00236 | 12/10/2012 0:00 |
| Sizing | 0.21 | 03103 | 8/28/2015 0:00 |
| Sizing | 0.2 | 03103 | 10/13/2011 0:00 |
+--------------+--------------+-----------+------------------+
How do I query to get the last begeffdat of each prdnum?
Magood's answer may work in this situation. However, if there was a unique identifier for each edit that you were selecting, it wouldn't work. As far as I know, you would have to get involved with row_number() like so:
SELECT s2.lbrcod, s2.lbrrat, s2.prdnum, s2.begeffdat from
(SELECT pdcflbrt.lbrcod
, pdcflbrt.lbrrat
, pd_ln.prdnum
, pdcflbrt.begeffdat
, row_number() over (partition by pd_ln.prdnum order by pdcflbrt.begeffdat desc) as RN
FROM velocity.dbo.pd_ln pd_ln, velocity.dbo.pdcflbrt pdcflbrt, velocity.dbo.pdlbr pdlbr
WHERE pdlbr.lbrrattky = pdcflbrt.lbrrattky AND pd_ln.pd_ln_tky = pdlbr.pd_ln_tky) s2
where s2.rn = 1
This will return only the top date. The inner portion is the same query, but with the row_number() function added, restarting the numbering for each different prdnum and ordering the rows by date with the newest date first. The outer portion selects only row 1 (that's the last where), which is the newest date.
EDIT: Alternatively, if you only want the OLDEST update, you could change the desc in the row_number() order by clause to asc.
-- Only for name and latest date
select lbrcod, max(begeffdat) begeffdat from #table
group by lbrcod
-- For all columns
select * from (
select *, row_number() over (partition by prdnum order by begeffdat desc) rowNum from #table
) data
where rowNum = 1
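The row_number() approach can be checked against the sample rows from the question with a small sketch. It assumes the three-way join result is already flattened into a single table called labor_rates (a made-up name), stores begeffdat as ISO date strings, and uses SQLite via Python's sqlite3 in place of SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened copy of the joined result shown in the question.
conn.execute(
    "CREATE TABLE labor_rates (lbrcod TEXT, lbrrat REAL, prdnum TEXT, begeffdat TEXT)"
)
conn.executemany(
    "INSERT INTO labor_rates VALUES (?, ?, ?, ?)",
    [
        ("FC Braselton", 0.11, "00236", "2012-07-15"),
        ("FC Braselton", 0.11, "00236", "2012-07-15"),
        ("FC Braselton", 0.1, "00236", "2012-12-10"),
        ("Sizing", 0.21, "03103", "2015-08-28"),
        ("Sizing", 0.2, "03103", "2011-10-13"),
    ],
)

# Number the rows of each prdnum from newest to oldest, then keep row 1.
latest = conn.execute("""
    SELECT prdnum, lbrcod, lbrrat, begeffdat
    FROM (
        SELECT prdnum, lbrcod, lbrrat, begeffdat,
               ROW_NUMBER() OVER (PARTITION BY prdnum ORDER BY begeffdat DESC) AS rn
        FROM labor_rates
    )
    WHERE rn = 1
    ORDER BY prdnum
""").fetchall()
print(latest)
```

If two rows of the same prdnum share the newest date, row_number() keeps an arbitrary one of them; adding a tie-breaker column to the ORDER BY makes the choice deterministic.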

SQL query filtering

Using SQL Server 2005, I have a table where certain events are being logged, and I need to create a query that returns only very specific results. There's an example below:
Log:
Log_ID | FB_ID | Date | Log_Name | Log_Type
7 | 4 | 2007/11/8 | Nina | Critical
6 | 4 | 2007/11/6 | John | Critical
5 | 4 | 2007/11/6 | Mike | Critical
4 | 4 | 2007/11/6 | Mike | Critical
3 | 3 | 2007/11/3 | Ben | Critical
2 | 3 | 2007/11/1 | Ben | Critical
The query should do the following: return ONLY one row per FB_ID, namely the row where Log_Name changes for the first time or, if the name never changes, the first dated row.
In layman's terms I need this to browse through a DB to check for each instance where the responsibility of a case (FB_ID) has been moved to another person, and in case it never has, then just get the original logger's name.
In the example above, I should get rows (Log_ID) 2 and 6.
Is this even possible? Right now there's a discussion going on whether the DB was just made the wrong way. :)
I imagine I need to somehow be able to store the first resulting Log_Name into a variable and then compare it with an IF condition etc. I have no idea how to do such a thing with SQL though.
Edit: Updated the date. And to clarify on this, the correct result would look like this:
Log_ID | FB_ID | Date | Log_Name | Log_Type
6 | 4 | 2007/11/6 | John | Critical
2 | 3 | 2007/11/1 | Ben | Critical
It's not the first date per FB_ID I'm after, but the row where the Log_Name is changed from the original.
Originally FB_ID 4 belongs to Mike, but the query should return the row where it moves on to John. However, it should NOT return the row where it moves further on to Nina, because the first responsibility change already happened when John got it.
In the case of Ben with FB_ID 3, the logger is never changed, so the first row for Ben should be returned.
I guess that there is a better and more performant way, but this one seems to work:
SELECT *
FROM log
WHERE log_id IN
( SELECT MIN(log_id)
FROM log
WHERE
( SELECT COUNT(DISTINCT log_name)
FROM log log2
WHERE log2.fb_id = log.fb_id ) = 1
OR log.log_name <> ( SELECT log_name
FROM log log_3
WHERE log_3.log_id =
( SELECT MIN(log_id)
FROM log log4
WHERE log4.fb_id = log.fb_id ) )
GROUP BY fb_id )
This will efficiently use an index on (fb_id, cdate, id):
SELECT lo4.*
FROM
(
SELECT CASE WHEN ln.log_id IS NULL THEN lo2.log_id ELSE ln.log_id END AS log_id,
ROW_NUMBER() OVER (PARTITION BY lo2.fb_id ORDER BY lo2.cdate) AS rn
FROM (
SELECT
lo.*,
(
SELECT TOP 1 log_id
FROM t_log li
WHERE li.fb_id = lo.fb_id
AND li.cdate >= lo.cdate
AND li.log_id <> lo.log_id
AND li.log_name <> lo.log_name
ORDER BY
cdate, log_id
) AS next_id
FROM t_log lo
) lo2
LEFT OUTER JOIN
t_log ln
ON ln.log_id = lo2.next_id
) lo3, t_log lo4
WHERE lo3.rn = 1
AND lo4.log_id = lo3.log_id
If I've understood the problem correctly, the following SQL should do the trick:
SELECT Log_ID, FB_ID, Date, Log_Name, Log_Type
FROM (SELECT l.*, ROW_NUMBER() OVER (PARTITION BY FB_ID ORDER BY Date, Log_ID) AS rn
FROM Log l) first_rows
WHERE rn = 1
The SQL will select the row with the earliest date for each FB_ID.
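For what the question actually asks (the first row where Log_Name differs from the original logger, or the first row when it never changes), here is a minimal sketch that uses only CTEs and ROW_NUMBER(), both of which SQL Server 2005 provides. It is exercised here against SQLite through Python's sqlite3 rather than SQL Server, and it assumes a log_date column with ISO date strings and that Log_ID increases over time so it can break ties between rows on the same date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (log_id INTEGER PRIMARY KEY, fb_id INTEGER, "
    "log_date TEXT, log_name TEXT, log_type TEXT)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?, ?)",
    [
        (7, 4, "2007-11-08", "Nina", "Critical"),
        (6, 4, "2007-11-06", "John", "Critical"),
        (5, 4, "2007-11-06", "Mike", "Critical"),
        (4, 4, "2007-11-06", "Mike", "Critical"),
        (3, 3, "2007-11-03", "Ben", "Critical"),
        (2, 3, "2007-11-01", "Ben", "Critical"),
    ],
)

# ordered:    number each case's rows chronologically (rn = 1 is the original).
# candidates: rows whose name differs from the original sort first, then by
#             age, so pick = 1 is the first handover, or the original row
#             when the name never changes.
result = conn.execute("""
    WITH ordered AS (
        SELECT log_id, fb_id, log_date, log_name,
               ROW_NUMBER() OVER (PARTITION BY fb_id ORDER BY log_date, log_id) AS rn
        FROM log
    ),
    candidates AS (
        SELECT o.log_id, o.fb_id, o.log_date, o.log_name,
               ROW_NUMBER() OVER (
                   PARTITION BY o.fb_id
                   ORDER BY CASE WHEN o.log_name <> f.log_name THEN 0 ELSE 1 END, o.rn
               ) AS pick
        FROM ordered o
        JOIN ordered f ON f.fb_id = o.fb_id AND f.rn = 1
    )
    SELECT log_id, fb_id, log_date, log_name
    FROM candidates
    WHERE pick = 1
    ORDER BY log_id DESC
""").fetchall()
print(result)
```

On the sample data this yields Log_ID 6 (John taking over case 4) and Log_ID 2 (Ben's first row for case 3), matching the result the question asks for.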