Query for records only where the Project Status has changed - sql

I'm trying to write a query that returns only the records where the status of a project changes. Projects start off "In Progress," can "Pause," go back to "In Progress," and continue until they are "Completed." The table keeps the full history of modifications: any change to the details of a project causes a new record to be created. The table structure looks like the one below, and the highlighted records are the ones I want to query for:
I've tried a combination of ROW_NUMBER() and RANK() with certain elements, but I can't seem to get it right. How do I query for the records where the status changes? This post and this post are similar but don't help.

Thanks to @klin for the LAG() recommendation.
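WITH l AS (
    -- previous status within the same mod_id, ordered by modification date
    SELECT LAG(status, 1) OVER (PARTITION BY mod_id ORDER BY mod_date) lag, *
    FROM projects_table
)
SELECT *
FROM l
-- the first row of each partition has a NULL lag, so IS DISTINCT FROM keeps it too
WHERE lag IS DISTINCT FROM status
ORDER BY mod_date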
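For anyone who wants to try the LAG() idea without the original table, here is a minimal sketch in Python using the standard-library sqlite3 module. The sample rows are made up, the column names simply mirror the query above, and SQLite's NULL-safe IS NOT stands in for IS DISTINCT FROM. It assumes a SQLite build of 3.25 or later, since that is when window functions were added.

```python
import sqlite3

def status_changes(rows):
    """Return the rows whose status differs from the previous row of the same mod_id."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE projects_table (mod_id INTEGER, mod_date TEXT, status TEXT)")
    con.executemany("INSERT INTO projects_table VALUES (?, ?, ?)", rows)
    query = """
        WITH l AS (
            SELECT mod_id, mod_date, status,
                   LAG(status) OVER (PARTITION BY mod_id ORDER BY mod_date) AS prev_status
            FROM projects_table
        )
        SELECT mod_id, mod_date, status
        FROM l
        WHERE prev_status IS NOT status   -- NULL-safe comparison, keeps the first row
        ORDER BY mod_id, mod_date
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical history for one project (dates are ISO strings so they sort correctly).
sample = [
    (1, "2020-01-01", "In Progress"),
    (1, "2020-01-05", "In Progress"),
    (1, "2020-01-09", "Paused"),
    (1, "2020-01-12", "Paused"),
    (1, "2020-01-20", "In Progress"),
    (1, "2020-02-01", "Completed"),
]

for row in status_changes(sample):
    print(row)
```

Only the first row of each run of identical statuses comes back, which is the "status changed" set the question asks for.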

Related

Select largest date from column based on another column in table

I'm new to SQL. Trying to get a certain date for jobs from a table. The only way to get these dates is to look to a massive table where every item for each job is stored with a last transaction date. The date I want is the largest date in the lst_trx_date column for each job.
The data in the table looks something like this:
Each job has a varying number of items. My biggest hurdle and my main question: instead of selecting the entire job table, how can I select only the largest lst_trx_date for each job? I initially brought in the data using Microsoft Query, but I realize my request will probably require modifying the SQL command text directly.
Try something like this. It will give you the max date for a single job:
SELECT MAX (lst_trx_date) AS "Max Date"
FROM table where job = 1234;
To get the latest date for each job, you can use window functions. As an example, try:
select job, item, lst_trx_date
from (select job, item, lst_trx_date,
             -- partition by job alone so that exactly one row per job gets rn = 1
             row_number() over (partition by job order by lst_trx_date desc) rn
      from <table>) t
where rn = 1
I think it would be along these lines:
SELECT job, MAX(lst_trx_date) AS last_transaction_date
FROM table
GROUP BY job
ORDER BY last_transaction_date DESC
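The GROUP BY idea can be checked end to end with a small sketch in Python and the standard-library sqlite3 module. The table name job_items and the sample rows are assumptions made for illustration; only the job, item and lst_trx_date column names come from the question.

```python
import sqlite3

def latest_per_job(rows):
    """Return (job, latest lst_trx_date) pairs, newest job first."""
    con = sqlite3.connect(":memory:")
    # job_items is a hypothetical table name standing in for the real one.
    con.execute("CREATE TABLE job_items (job INTEGER, item TEXT, lst_trx_date TEXT)")
    con.executemany("INSERT INTO job_items VALUES (?, ?, ?)", rows)
    query = """
        SELECT job, MAX(lst_trx_date) AS last_transaction_date
        FROM job_items
        GROUP BY job
        ORDER BY last_transaction_date DESC
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Made-up items; ISO date strings compare correctly as text.
sample_items = [
    (1234, "A", "2019-03-01"),
    (1234, "B", "2019-04-15"),
    (5678, "A", "2019-02-10"),
    (5678, "C", "2019-01-20"),
]

for job, last_date in latest_per_job(sample_items):
    print(job, last_date)
```

This returns one row per job. If the item belonging to that latest date is needed as well, the row_number() variant is probably the better fit.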

Reduce consecutive rows with the same value in a column to a single row

I am trying to create a biometric attendance system that receives data from a biometric device.
The structure of the attendance table received from the device looks something like this.
The table originally has a lot of data with more than one emp_no, but I created a stored procedure that extracts details of one employee on a specific date as seen above.
The challenge I am facing right now is that I need to analyze this table and restructure it (recreate another table) so that it has alternating check-ins and check-outs (each check-in must be followed by a check-out and vice versa). For consecutive check-ins, I should take the earliest one, while for consecutive check-outs, I should take the latest one.
Any ideas on how to go about this will be very much appreciated.
Thank you.
Use the window functions lag() and lead():
select emp_id, att_date, att_time, status
from (
    select
        emp_id, att_date, att_time, status,
        case
            when status = 'checkin' then lag(status) over w is distinct from 'checkin'
            else lead(status) over w is distinct from 'checkout'
        end as visible
    from my_table
    window w as (partition by emp_id, att_date order by att_time)
) s
where visible
Db<>fiddle.
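To see which rows the lag()/lead() filter keeps, here is a rough sketch in Python with the standard-library sqlite3 module. The attendance table name and the sample times are invented, and SQLite's IS NOT replaces IS DISTINCT FROM. It assumes SQLite 3.25 or later for window functions and the WINDOW clause.

```python
import sqlite3

def alternate_checks(rows):
    """Keep the earliest of consecutive check-ins and the latest of consecutive check-outs."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE attendance (emp_id INTEGER, att_date TEXT, att_time TEXT, status TEXT)"
    )
    con.executemany("INSERT INTO attendance VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT att_time, status
        FROM (
            SELECT emp_id, att_date, att_time, status,
                   CASE
                       -- a check-in survives only if the previous row is not a check-in
                       WHEN status = 'checkin'
                           THEN (LAG(status) OVER w) IS NOT 'checkin'
                       -- a check-out survives only if the next row is not a check-out
                       ELSE (LEAD(status) OVER w) IS NOT 'checkout'
                   END AS visible
            FROM attendance
            WINDOW w AS (PARTITION BY emp_id, att_date ORDER BY att_time)
        ) s
        WHERE visible
        ORDER BY att_time
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# One hypothetical employee on one day.
day = [
    (1, "2020-01-01", "08:00", "checkin"),
    (1, "2020-01-01", "08:05", "checkin"),
    (1, "2020-01-01", "12:00", "checkout"),
    (1, "2020-01-01", "12:10", "checkout"),
    (1, "2020-01-01", "13:00", "checkin"),
    (1, "2020-01-01", "17:00", "checkout"),
    (1, "2020-01-01", "17:30", "checkout"),
]

for att_time, status in alternate_checks(day):
    print(att_time, status)
```

The surviving rows alternate strictly between check-in and check-out, so they can be inserted into the restructured table as they are.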

Select the latest record based on certain criteria in PL/SQL

I have a table with real-time scanning data from our employees. As you can see, each box can be scanned multiple times, and an employee can even scan a box multiple times for one status.
I am trying to pull the latest record for each box, but only when the status of that latest record is "Refused".
As you can see from the picture, carton 1234 has a record with status "Refused", but that record is not the last one, so I don't need it. Carton 1235 is what I need.
I don't want to use a window function to rank each record in the table first, because the table has a lot of rows and I think it would be time-consuming.
So is there any better way to achieve my goal?
Supposing that you don't really need a PL/SQL solution, here is a SQL-only solution without window functions:
select *
from mytable
where (carton_id, scantime) in
(
  select carton_id, max(scantime)
  from mytable
  group by carton_id
  having max(status) keep (dense_rank last order by scantime) = 'Refused'
);
But I don't think that this is superior to using a window function. So you can just as well try
select *
from
(
  -- no GROUP BY here: the window function already computes the max per carton
  select mytable.*, max(scantime) over (partition by carton_id) as max_scantime
  from mytable
)
where scantime = max_scantime and status = 'Refused';
Here is one method:
select t.*
from t
where t.status = 'Refused' and
t.scantime = (select max(t2.scantime) from t t2 where t2.carton_id = t.carton_id);
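The correlated-subquery method can be tried on a toy data set with this sketch, written in Python with the standard-library sqlite3 module. The scans table name and the sample rows are assumptions. It says nothing about how the query performs on a large Oracle table.

```python
import sqlite3

def latest_refused(rows):
    """Return the cartons whose most recent scan has status 'Refused'."""
    con = sqlite3.connect(":memory:")
    # scans is a hypothetical stand-in for the real scanning table.
    con.execute("CREATE TABLE scans (carton_id INTEGER, scantime TEXT, status TEXT)")
    con.executemany("INSERT INTO scans VALUES (?, ?, ?)", rows)
    query = """
        SELECT t.carton_id, t.scantime, t.status
        FROM scans t
        WHERE t.status = 'Refused'
          AND t.scantime = (SELECT MAX(t2.scantime)
                            FROM scans t2
                            WHERE t2.carton_id = t.carton_id)
        ORDER BY t.carton_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Carton 1234 was refused and later delivered; carton 1235 ends on a refusal.
scans = [
    (1234, "2020-05-01 09:00", "Refused"),
    (1234, "2020-05-01 10:00", "Delivered"),
    (1235, "2020-05-01 09:30", "Scanned"),
    (1235, "2020-05-01 10:00", "Refused"),
    (1235, "2020-05-01 11:00", "Refused"),
]

for row in latest_refused(scans):
    print(row)
```

Only carton 1235 is returned, and only its last scan, because the earlier "Refused" rows do not match the per-carton maximum scan time.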

SQL select the first match in an ordered list

Using Microsoft SQL Server Management Studio version 14.0.17213.0
I have a list of events that go in order. I want to select the highest precedent acct_no, complete_date and event.
My problem is if I use
select
account_number, event, max(complete_date) as mx_comp
from
mytable
where
event in ('event1','event2'....)
then I get all my acct_numbers, all the events in the list and the max complete date for that event. But I want acct_no listed with the maximum completed date for any item in the list and the associated event.
Furthermore, it's wholly possible that two events occurred on the same date, so I cannot do
select *
from mytable mt
join
(select acct_number, max(complete_date)
from mytable) t on mt.acct_number = t.account_number
and mt.complete_date = t.complete_date
because if two events occurred on the same day then I still get duplicate results.
I have tried to do a similar thing with
row_number() over (order by account_number) as RowNum
but it did not work, because I still get matches to all the events, not just my highest-precedence event.
It really boils down to needing to return the acct_number, event and complete date associated with the highest-importance match from items in an ordered list.
I am sure it is easy, but despite all my Google and Stack searching I simply cannot figure it out.
I have recently been thinking that it might be possible with something like coalesce(mylist), because I would be able to put my list in order, but I cannot figure out how to use coalesce in a meaningful way for this problem.
The real solution would be to create a table with precedence numbers or have a most-recent indicator, but I don't have unlimited access to create any tables I want.
Any help or ideas on how to match to an ordered list would be appreciated.
You seem to want:
select t.*
from (select t.*,
             row_number() over (partition by account_number order by complete_date desc) as seqnum
      from mytable t
      where event in ('event1', 'event2', ....)
     ) t
where seqnum = 1;
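If two events share the latest date, the row_number() above picks one of them arbitrarily. One possible way to break the tie by precedence, without creating a lookup table, is to add a CASE expression to the ORDER BY. The sketch below uses Python with the standard-library sqlite3 module. The precedence order (event1 before event2) and the sample rows are assumptions. The SQL itself sticks to standard ROW_NUMBER() and CASE, so it should carry over to SQL Server, though it was not run there.

```python
import sqlite3

def top_event_per_account(rows):
    """Latest listed event per account; same-day ties go to the higher-precedence event."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (account_number INTEGER, event TEXT, complete_date TEXT)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT account_number, event, complete_date
        FROM (SELECT t.*,
                     ROW_NUMBER() OVER (
                         PARTITION BY account_number
                         ORDER BY complete_date DESC,
                                  -- hypothetical precedence: lower number wins a tie
                                  CASE event WHEN 'event1' THEN 1
                                             WHEN 'event2' THEN 2
                                             ELSE 3 END
                     ) AS seqnum
              FROM mytable t
              WHERE event IN ('event1', 'event2')
             ) ranked
        WHERE seqnum = 1
        ORDER BY account_number
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Account 100 has two events on its latest date; account 300 has no listed event.
events = [
    (100, "event1", "2020-01-05"),
    (100, "event2", "2020-01-05"),
    (100, "event2", "2020-01-01"),
    (200, "event2", "2020-02-01"),
    (200, "event1", "2020-01-15"),
    (300, "event3", "2020-03-01"),
]

for row in top_event_per_account(events):
    print(row)
```

Each account comes back exactly once. To rank by precedence first and date second instead, swap the two ORDER BY terms inside the window.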

Find row number in a sort based on row id, then find its neighbours

Say that I have some SELECT statement:
SELECT id, name FROM people
ORDER BY name ASC;
I have a few million rows in the people table and the ORDER BY clause can be much more complex than what I have shown here (possibly operating on a dozen columns).
I retrieve only a small subset of the rows (say rows 1..11) in order to display them in the UI. Now, I would like to solve following problems:
1. Find the number of a row with a given id.
2. Display the 5 items before and the 5 items after a row with a given id.
Problem 2 is easy to solve once I have solved problem 1, as I can then use something like this if I know that the item I was looking for has row number 1000 in the sorted result set (this is the Firebird SQL dialect):
SELECT id, name FROM people
ORDER BY name ASC
ROWS 995 TO 1005;
I also know that I can find the rank of a row by counting all of the rows which come before the one I am looking for, but this can lead to very long WHERE clauses with tons of OR and AND in the condition. And I have to do this repeatedly. With my test data, this takes hundreds of milliseconds, even when using properly indexed columns, which is way too slow.
Is there some means of achieving this by using some SQL:2003 features (such as row_number, supported in Firebird 3.0)? I am by no means an SQL guru and I need some pointers here. Could I create a cached view where the result would include a rank/dense rank/row index?
Firebird appears to support window functions (called analytic functions in Oracle). So you can do the following:
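To find the "row" number of a row with a given id, number the rows in a subquery and filter afterwards (a where clause in the same query would be applied before row_number() and would always yield 1):
select rn
from (select id, row_number() over (order by name, id) as rn
      from t
     ) s
where id = <id>
This assumes the ids are unique.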
To solve the second problem:
select t.*
from (select id, name, row_number() over (order by name, id) as rn
      from t
     ) t join
     (select id, row_number() over (order by name, id) as rn
      from t
     ) tid
     on t.rn between tid.rn - 5 and tid.rn + 5
-- filter on the id only after every row has been numbered
where tid.id = <id>
order by t.rn
I might suggest something else, though, if you can modify the table structure. Most databases offer the ability to add an auto-increment column when a row is inserted. If your records are never deleted, this can serve as your counter, simplifying your queries.
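As a way to experiment with the row_number() approach, here is a small sketch in Python with the standard-library sqlite3 module rather than Firebird. The sample people and the radius parameter are made up, and the query numbers the rows once in a CTE before joining against the target row. It assumes SQLite 3.25 or later.

```python
import sqlite3

NEIGHBOURS_SQL = """
    WITH ranked AS (
        SELECT id, name, ROW_NUMBER() OVER (ORDER BY name, id) AS rn
        FROM people
    )
    SELECT r.id, r.name, r.rn
    FROM ranked AS r
    JOIN ranked AS target
      ON r.rn BETWEEN target.rn - :radius AND target.rn + :radius
    WHERE target.id = :id          -- applied after every row has been numbered
    ORDER BY r.rn
"""

def neighbours(people, target_id, radius):
    """Return (id, name, row number) for the target row and `radius` rows on each side."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    con.executemany("INSERT INTO people VALUES (?, ?)", people)
    rows = con.execute(NEIGHBOURS_SQL, {"id": target_id, "radius": radius}).fetchall()
    con.close()
    return rows

# Hypothetical data; ids deliberately do not follow the name order.
people = [
    (5, "Ann"), (3, "Bob"), (9, "Cid"), (1, "Dee"),
    (7, "Eve"), (2, "Fay"), (8, "Gus"),
]

for row in neighbours(people, target_id=1, radius=2):
    print(row)
```

Note that the database still has to number every row before it can filter, so on a few million rows this is not guaranteed to be faster than counting the preceding rows. It mainly keeps the query short.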