I have a table in Access that has a record of test results for each person for each day. Some people may have taken more than one test on the same day. I have shown a simplified version of this table below:
| ID | testDate | Person | Pass? | ConsecFailDays |
----------------------------------------------------------
| 01 | 01/08/18 | John | Fail | |
| 02 | 01/08/18 | John | Pass | |
| 03 | 03/08/18 | John | Fail | |
| 04 | 01/08/18 | Mark | Fail | |
| 05 | 02/08/18 | Mark | Pass | |
I have tried to write an SQL UPDATE query that will update the last column with the number of consecutive days on which that person has failed at least one test (not necessarily consecutive calendar days, just the days on which they actually took a test). The final result should look like this:
| ID | testDate | Person | Pass? | ConsecFailDays |
----------------------------------------------------------
| 01 | 01/08/18 | John | Fail | 1 |
| 02 | 01/08/18 | John | Pass | 1 |
| 03 | 03/08/18 | John | Fail | 2 |
| 04 | 01/08/18 | Mark | Fail | 1 |
| 05 | 02/08/18 | Mark | Pass | 0 |
I was really struggling to get this to work and in the end I resorted to using VBA to create a recordset of each unique person and then loop through each day for that person to check if they had a fail on that day. My dataset is quite large and it is taking hours to run.
I expect that a query that operates on the entire set of data would be much quicker. Does anyone know if there is an SQL solution to this after all?
You need several subqueries to get the last date with a passed test and no fail, then count the days with a failed test between that date and testDate:
SELECT t1.ID
,t1.testDate
,t1.Person
,t1.[Pass?]
,(
SELECT count(testDate)
FROM (
SELECT DISTINCT testDate
,Person
,[Pass?]
FROM yourTable
) AS t2
WHERE t2.[Pass?] = false
AND t2.Person = t1.Person
AND t2.testDate >= Nz((
SELECT max(t3.testDate)
FROM (
SELECT t4.testDate
,t4.Person
FROM yourTable AS t4
WHERE (
((t4.[Pass?]) = True)
AND (
(
(
SELECT count(*) AS failed
FROM yourTable AS t5
WHERE t5.testDate = t4.testDate
AND t5.Person = t4.Person
AND t5.[Pass?] = false
)
) = 0
)
)
) AS t3
WHERE t1.Person = t3.Person
AND t3.testDate <= [t1].[testDate]
GROUP BY Person
))
AND t2.testDate <= t1.testDate
) AS ConsecFailDays
FROM yourTable AS t1;
Here t2 provides the distinct days to count (you can drop DISTINCT to speed things up if only one fail per day is possible).
t3 are the days with passed tests and no fail.
t4 are the days with passed tests (and maybe failed tests).
t5 counts the failed tests on a day with a passed test.
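To make the rule behind these subqueries concrete, here is a minimal Python sketch of the same logic as one ordered pass over the data. It is an illustration only, not the Access query: the function name consec_fail_days and the (id, test_date, person, passed) tuple layout are assumptions made for this example, and the sample dates are read as dd/mm/yy.

```python
from collections import defaultdict
from datetime import date


def consec_fail_days(rows):
    """Map each record ID to that person's run of consecutive failed test days.

    rows: (id, test_date, person, passed) tuples (hypothetical layout).
    """
    rows = list(rows)

    # A test day counts as failed if at least one test on it was failed.
    day_failed = defaultdict(bool)
    for _id, day, person, passed in rows:
        day_failed[(person, day)] |= not passed

    # Walk each person's test days in date order; a day without a fail resets the run.
    run = {}
    run_on_day = {}
    for person, day in sorted(day_failed):
        run[person] = run.get(person, 0) + 1 if day_failed[(person, day)] else 0
        run_on_day[(person, day)] = run[person]

    # Every record gets the run length of its own (person, day).
    return {_id: run_on_day[(person, day)] for _id, day, person, _passed in rows}


# Sample rows from the question.
sample = [
    (1, date(2018, 8, 1), "John", False),
    (2, date(2018, 8, 1), "John", True),
    (3, date(2018, 8, 3), "John", False),
    (4, date(2018, 8, 1), "Mark", False),
    (5, date(2018, 8, 2), "Mark", True),
]
result = consec_fail_days(sample)
```

On the sample rows this reproduces the ConsecFailDays column from the question, which makes it a convenient cross-check for the query's output on a small extract of the data.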
As you wanted an UPDATE query, you can save the SELECT above as a query (here called newQuery) and use:
UPDATE yourTable SET ConsecFailDays = DLookUp("ConsecFailDays","newQuery","ID = " & yourTable.ID)
but you should try the SELECT query with indexes first. If its performance is too poor, you can use the UPDATE, but you have to rerun it every time your data changes.
Suggestions:
Don't use special characters such as ? (as in Pass?) in column or table names; that way you are not forced to use square brackets.
Person should be a foreign key to a persons table, as different people can have the same name (e.g. John Smith).
[Pass?] should be a Boolean (true/false). If you want to keep it as a string, you have to replace [Pass?] = false with [Pass?] = 'Fail' and [Pass?] = true with [Pass?] = 'Pass'.
There should be an index on testDate, an index on Person, and a combined index on testDate, Person to improve query performance.
I have a Production Table and a Standing Data table. The relationship of Production to Standing Data is actually Many-To-Many which is different to how this relationship is usually represented (Many-to-One).
The standing data table holds a list of tasks and the score each task is worth. Tasks can appear multiple times with different "ValidFrom" dates for changing the score at different points in time. What I am trying to do is query the Production Table so that the TaskID is looked up in the table and uses the date it was logged to check what score it should return.
Here's an example of how I want the data to look:
Production Table:
+----------+------------+-------+-----------+--------+-------+
| RecordID | Date | EmpID | Reference | TaskID | Score |
+----------+------------+-------+-----------+--------+-------+
| 1 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 2 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 3 | 30/02/2020 | 1 | 123 | 1 | 2 |
| 4 | 31/02/2020 | 1 | 123 | 1 | 2 |
+----------+------------+-------+-----------+--------+-------+
Standing Data
+----------+--------+----------------+-------+
| RecordID | TaskID | DateActiveFrom | Score |
+----------+--------+----------------+-------+
| 1 | 1 | 01/02/2020 | 1.5 |
| 2 | 1 | 28/02/2020 | 2 |
+----------+--------+----------------+-------+
I have tried the below code but unfortunately due to multiple records meeting the criteria, the production data duplicates with two different scores per record:
SELECT p.[RecordID],
p.[Date],
p.[EmpID],
p.[Reference],
p.[TaskID],
s.[Score]
FROM ProductionTable as p
LEFT JOIN StandingDataTable as s
ON s.[TaskID] = p.[TaskID]
AND s.[DateActiveFrom] <= p.[Date];
What is the correct way to return the correct and singular/scalar Score value for this record based on the date?
You can use APPLY:
SELECT p.[RecordID], p.[Date], p.[EmpID], p.[Reference], p.[TaskID], s.[Score]
FROM ProductionTable as p OUTER APPLY
( SELECT TOP (1) s.[Score]
FROM StandingDataTable AS s
WHERE s.[TaskID] = p.[TaskID] AND
s.[DateActiveFrom] <= p.[Date]
ORDER BY S.DateActiveFrom DESC
) s;
You might want the score on a per-record basis; if so, change the WHERE clause inside the APPLY.
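OUTER APPLY is not available in every engine. As a rough, runnable illustration of the same "latest row on or before the date" idea, the sketch below uses Python's sqlite3 module and swaps in a correlated scalar subquery with ORDER BY ... LIMIT 1 instead of APPLY. The dates are made-up ISO-format sample values (text dates only compare correctly in that format), not the rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StandingDataTable (RecordID INTEGER, TaskID INTEGER,
                                DateActiveFrom TEXT, Score REAL);
CREATE TABLE ProductionTable (RecordID INTEGER, [Date] TEXT, EmpID INTEGER,
                              Reference INTEGER, TaskID INTEGER);
-- Hypothetical sample data with ISO dates.
INSERT INTO StandingDataTable VALUES (1, 1, '2020-02-01', 1.5),
                                     (2, 1, '2020-02-28', 2.0);
INSERT INTO ProductionTable VALUES (1, '2020-02-27', 1, 123, 1),
                                   (2, '2020-02-27', 1, 123, 1),
                                   (3, '2020-02-28', 1, 123, 1),
                                   (4, '2020-02-29', 1, 123, 1);
""")

# For each production row, take the score whose DateActiveFrom is the latest
# one that is not after the production date.
rows = conn.execute("""
SELECT p.RecordID,
       p.[Date],
       (SELECT s.Score
        FROM StandingDataTable AS s
        WHERE s.TaskID = p.TaskID
          AND s.DateActiveFrom <= p.[Date]
        ORDER BY s.DateActiveFrom DESC
        LIMIT 1) AS Score
FROM ProductionTable AS p
ORDER BY p.RecordID
""").fetchall()

scores = [score for _record_id, _date, score in rows]
```

Each production row comes back exactly once, because the subquery can return at most one score per row; a row logged before any DateActiveFrom would get NULL, much like the OUTER APPLY version.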
I have a situation which is a little hard to describe. I'll try to explain with an example and the result which I want.
I have three tables like so
Employee
| id | Name |
|----+-------|
| 1 | Alice |
| 2 | Bob |
| 3 | Jane |
| 4 | Jack |
Task
| id | employee_id | description |
|----+-------------+---------------------|
| 1 | 1 | Fix bug |
| 2 | 1 | Implement feature |
| 3 | 1 | Deploy master |
| 4 | 2 | Integrate feature |
| 5 | 2 | Fix cosmetic issues |
Status
| id | task_id | time | details | Terminal |
|----+---------+-------+-----------+----------|
| 1 | 1 | 12:00 | Assigned | false |
| 2 | 1 | 12:30 | Started | false |
| 3 | 1 | 13:00 | Completed | true |
| 4 | 2 | 12:10 | Assigned | false |
| 5 | 2 | 14:00 | Started | false |
| 6 | 3 | 12:15 | Assigned | false |
| 7 | 4 | 12:20 | Assigned | false |
| 8 | 5 | 12:25 | Assigned | false |
| 9 | 4 | 12:30 | Started | false |
(I have also put these into a sqlfiddle page at http://sqlfiddle.com/#!9/728c85/1)
The basic idea is that I have some employees and tasks. The tasks can be assigned to employees and as they work on them, they keep adding "status" rows.
Each status entry has a field "terminal" which can either be true or false.
If the last status entry for a task has terminal set to true, then that task is over and there's nothing more to be done on it.
If all tasks assigned to an employee are over, then the employee is considered free.
I need to get a list of free employees. This would basically mean, given an employee, a list of all his or her tasks with statuses. So, something like this for Alice
| Task | Completed |
|-------------------+-----------|
| Fix bug | true |
| Implement feature | false |
| Deploy master | false |
From which I know that she's not free right now since there are 'false' entries in completed.
How would I do this? If my tables are not constructed properly for this kind of query, I'd very much like some advice on that too.
(I titled the question like this since I want to order the statuses of each task per user and then limit them to the last row).
Update
It was suggested to me that the status field should really go inside the tasks table and the Status table should simply be a log table.
I would go for the idea to have the status in the tasks table. (Please see my comment on your request on this.) However, here are two queries to select free employees:
If tasks cannot be re-opened, it is simple: get all incomplete tasks by checking whether a status record with terminal = true exists for them. Free employees are all those that have no incomplete task.
select *
from employee
where id not in
(
select employee_id
from task
where id not in (select task_id from status where terminal = true)
);
If tasks can be re-opened, however, then you do the same but must find the last status. This can be done with PostgreSQL's DISTINCT ON, for instance.
select *
from employee
where id not in
(
select employee_id
from task
where not
(
select distinct on (task_id) terminal
from status
where task_id = task.id
order by task_id, id desc
)
);
(I am using the ID to find the last entry per task, as a time without a date seems inappropriate. You could use the time column instead only if a task always runs within a single day.)
SQL fiddles:
http://sqlfiddle.com/#!15/f0ea8/2
http://sqlfiddle.com/#!15/f0ea8/1
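For engines without DISTINCT ON, the "last status per task" idea can be approximated with a correlated MAX(id). The sketch below is one possible variant, run through Python's sqlite3 module with the sample data from the question; it assumes that terminal is stored as 0/1 and that every task has at least one status row (a task without any status would not block its employee here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Task (id INTEGER PRIMARY KEY, employee_id INTEGER, description TEXT);
CREATE TABLE Status (id INTEGER PRIMARY KEY, task_id INTEGER, time TEXT,
                     details TEXT, terminal INTEGER);  -- 1 = true, 0 = false
INSERT INTO Employee VALUES (1,'Alice'),(2,'Bob'),(3,'Jane'),(4,'Jack');
INSERT INTO Task VALUES (1,1,'Fix bug'),(2,1,'Implement feature'),
                        (3,1,'Deploy master'),(4,2,'Integrate feature'),
                        (5,2,'Fix cosmetic issues');
INSERT INTO Status VALUES
  (1,1,'12:00','Assigned',0),(2,1,'12:30','Started',0),(3,1,'13:00','Completed',1),
  (4,2,'12:10','Assigned',0),(5,2,'14:00','Started',0),(6,3,'12:15','Assigned',0),
  (7,4,'12:20','Assigned',0),(8,5,'12:25','Assigned',0),(9,4,'12:30','Started',0);
""")

# An employee is free if none of their tasks has a non-terminal latest status.
# "Latest" is taken to be the highest status id for the task.
free = [name for (name,) in conn.execute("""
SELECT e.Name
FROM Employee AS e
WHERE NOT EXISTS (
    SELECT 1
    FROM Task AS t
    JOIN Status AS s ON s.task_id = t.id
    WHERE t.employee_id = e.id
      AND s.id = (SELECT MAX(s2.id) FROM Status AS s2 WHERE s2.task_id = t.id)
      AND s.terminal = 0
)
ORDER BY e.id
""")]
```

With the question's data only Jane and Jack come back, since they have no tasks at all, while Alice and Bob each still have a task whose latest status is not terminal. NOT EXISTS is used rather than NOT IN so that a NULL employee_id in Task cannot empty the result.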
You have to group all the statuses together, and you can then use MAX() to find out whether one of them is true, like this:
SELECT t.description, MAX(s.terminal)
FROM Employee e
INNER JOIN task t ON t.employee_id = e.id
INNER JOIN status s ON s.task_id = t.id
GROUP BY t.id;
If you want this for just one employee, add a filter such as WHERE e.id = 1 before the GROUP BY.
Hope this helps
select T.employee_id, T.description, S.Terminal
from Employee E
INNER JOIN Task T ON E.id=T.employee_id
INNER JOIN (Select task_id, max(id) as status_id FROM Status GROUP BY task_id) as ST on T.id=ST.task_id
INNER JOIN Status S on S.id=ST.status_id
I hope this helps:
select E.Name,T.id as[Task Id],T.description,S.Terminal from Employee E
inner join Task T on E.id=T.employee_id
inner join Status S on S.task_id=T.id
where e.id not in (select employee_id from Task where id in (select task_id from Status where Terminal='true' and details='Completed') )
I have data on approx 1000 individuals, where each individual can have multiple rows, with multiple dates and where the columns indicate the program admitted to and a code number.
I need each row to contain a distinct date, so I need to delete the rows with duplicate dates from my table. Where there are multiple rows with the same date, I need to keep the row that has the lowest code number. If more than one row has both the same date and the same lowest code, I need to keep the row that is in program (prog) B. For example:
| ID | DATE | CODE | PROG|
--------------------------------
| 1 | 1996-08-16 | 24 | A |
| 1 | 1997-06-02 | 123 | A |
| 1 | 1997-06-02 | 123 | B |
| 1 | 1997-06-02 | 211 | B |
| 1 | 1997-08-19 | 67 | A |
| 1 | 1997-08-19 | 23 | A |
So my desired output would look like this;
| ID | DATE | CODE | PROG|
--------------------------------
| 1 | 1996-08-16 | 24 | A |
| 1 | 1997-06-02 | 123 | B |
| 1 | 1997-08-19 | 23 | A |
I'm struggling to come up with a solution to this, so any help greatly appreciated!
Microsoft SQL Server 2012 (X64)
The following works with your test data
SELECT ID, date, MIN(code), MAX(prog) FROM table
GROUP BY ID, date
You can then use the results of this query to create a new table or populate a new table. Or to delete all records not returned by this query.
SQLFiddle http://sqlfiddle.com/#!9/0ebb5/5
You can use the min() function:
select ID, DATE, min(CODE), max(PROG)
from table
group by ID, DATE
I assume that your table has a valid primary key. However, I would recommend making ID the primary key. I hope this helps.
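One caveat with the MIN(code)/MAX(prog) queries is that the two aggregates are computed independently, so on other data they could combine values from different rows (for example, rows 123/A and 211/B on the same date would yield 123/B, which is not an existing row). A possible alternative, sketched here with Python's sqlite3 module and the sample rows from the question, keeps a row only when no "better" row exists for the same ID and date. The table name admissions is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admissions (ID INTEGER, [DATE] TEXT, CODE INTEGER, PROG TEXT);
INSERT INTO admissions VALUES
  (1, '1996-08-16',  24, 'A'),
  (1, '1997-06-02', 123, 'A'),
  (1, '1997-06-02', 123, 'B'),
  (1, '1997-06-02', 211, 'B'),
  (1, '1997-08-19',  67, 'A'),
  (1, '1997-08-19',  23, 'A');
""")

# Keep a row unless another row for the same ID and date has a lower code,
# or the same code but is in program B while this row is not.
kept = conn.execute("""
SELECT a.ID, a.[DATE], a.CODE, a.PROG
FROM admissions AS a
WHERE NOT EXISTS (
    SELECT 1
    FROM admissions AS b
    WHERE b.ID = a.ID
      AND b.[DATE] = a.[DATE]
      AND (b.CODE < a.CODE
           OR (b.CODE = a.CODE AND b.PROG = 'B' AND a.PROG <> 'B'))
)
ORDER BY a.ID, a.[DATE]
""").fetchall()
```

The rows returned are always real rows of the table. Exact duplicates (same date, code and prog) would all be kept by this condition, so they would need a separate tie-breaker such as a unique row key.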
I have a table containing procurement contracts that looks like this:
+------+-----------+------------+---------+------------+-----------+
| type | text | date | company | supplierID | name |
+ -----+-----------+------------+---------+------------+-----------+
| 0 | None | 2004-03-29 | 310 | 227234 | HRC INFRA |
| 0 | None | 2007-09-30 | 310 | 227234 | HRC INFRA |
| 0 | None | 2010-11-29 | 310 | 227234 | HRC INFRA |
| 2 | Strategic | 2011-01-01 | 310 | 227234 | HRC INFRA |
| 0 | None | 2012-04-10 | 310 | 227234 | HRC INFRA |
+------+-----------+------------+---------+------------+-----------+
In this example the contract is the same in the first three rows, so I only want the first one.
The row with type = 2 is a change in procurement contract with the given supplier. I want to select that row as well.
On the last row the contract changes back to 0, so I want to select that row as well.
Basically I want to select the first row and the rows where the contract type changes. So the result should look like this:
+------+-----------+------------+---------+------------+-----------+
| type | text | date | company | supplierID | name |
+ -----+-----------+------------+---------+------------+-----------+
| 0 | None | 2004-03-29 | 310 | 227234 | HRC INFRA |
| 2 | Strategic | 2011-01-01 | 310 | 227234 | HRC INFRA |
| 0 | None | 2012-04-10 | 310 | 227234 | HRC INFRA |
+------+-----------+------------+---------+------------+-----------+
Any suggestions to how I can accomplish this?
;WITH cte AS
(
SELECT ROW_NUMBER() OVER (ORDER BY date) AS Id,
type, text, date, company, supplierId, name
FROM your_table
)
SELECT c1.type, c1.text, c1.date, c1.company,
c1.supplierId, c1.name
FROM cte c1 LEFT JOIN cte c2 ON c1.id = c2.id + 1
WHERE c2.text IS NULL OR c1.text != c2.text
Demo on SQLFiddle
I don't have SQL Server in front of me to test this, so I'm not going to attempt an actual solution right now, but FYI, there are a few things you need:
1) A way to make sure the records are ordered properly. I don't see any kind of an id here which means you have no guarantee that they will appear in that order. I assume there is one so just make sure you order by it
2) You need to do an outer join on the table to itself on whatever the index is, but instead of "table1.index = table2.index" it will look like "table1.index = table2.index + 1". If your indexes aren't sequential then it will make joining them this way more complex than that though.
3) In the where clause you'll specify something like
where table1.type <> table2.type
That will get you most of the way there. It won't pick up the very first record, though, since there is no earlier record to compare it to, so you'll need an OR condition to compensate for that. I'm also assuming that type has no NULL values.
Sorry I couldn't be more help with an actual implementation but maybe someone else will take care of that shortly.
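This might be what you want, presuming you don't have type < 0. The COALESCE wraps the whole subquery so that the first row, which has no predecessor, is compared with -1 and kept.
SELECT *
FROM [TABLE] as ot where ot.type <>
coalesce((select top 1 it.type from [TABLE] as it where it.date < ot.date order by it.date desc), -1)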
Also, take note of Brandon's point about making sure the rows are ordered, since I don't see a primary key.
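The "compare each row with the previous one" idea from these answers can be tried out quickly with Python's sqlite3 module. The sketch below is only an approximation of the SQL Server queries: it uses LIMIT 1 where SQL Server would use TOP 1, the table name contracts is hypothetical, and it assumes dates are unique and that type is never negative, so that -1 can stand in for "no previous row".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contracts ([type] INTEGER, [text] TEXT, [date] TEXT,
                        company INTEGER, supplierID INTEGER, name TEXT);
INSERT INTO contracts VALUES
  (0, 'None',      '2004-03-29', 310, 227234, 'HRC INFRA'),
  (0, 'None',      '2007-09-30', 310, 227234, 'HRC INFRA'),
  (0, 'None',      '2010-11-29', 310, 227234, 'HRC INFRA'),
  (2, 'Strategic', '2011-01-01', 310, 227234, 'HRC INFRA'),
  (0, 'None',      '2012-04-10', 310, 227234, 'HRC INFRA');
""")

# Keep a row when its type differs from the type of the closest earlier row.
# COALESCE turns "no earlier row" into -1 so that the first row is kept too.
changes = conn.execute("""
SELECT ot.[type], ot.[text], ot.[date]
FROM contracts AS ot
WHERE ot.[type] <> COALESCE(
        (SELECT it.[type]
         FROM contracts AS it
         WHERE it.[date] < ot.[date]
         ORDER BY it.[date] DESC
         LIMIT 1),
        -1)
ORDER BY ot.[date]
""").fetchall()
```

If the table held several suppliers, the inner query would also need to match on supplierID (and probably company) so that each contract history is compared only with itself.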
I have records as below:
---------------------------------------------------------------------
| AcntNo  | Date1 | Balance1 | Date2 | Balance2 | Date3 | Balance3 |
|--------------------------------------------------------------------
| 123 | 50282 | 3456 | 45465 | 56557 | 4556 | 324235 |
| 123 | 56757 | 23434 | 234235 | 344324 | 56476 | 5676 |
| 123 | 435 | 2434 | 2343 | 234545 | 24245 | 2423424 |
---------------------------------------------------------------------
For each account number there will be several rows of balance and date data.
I need to get the balance for the largest date.
I'm using PL/SQL Developer and an Oracle database.
If you want the row with the greatest date:
select
*
from
YourTable y
where
greatest(y.date1, y.date2, y.date3) =
(select max(greatest(yx.date1, yx.date2, yx.date3))
from
YourTable yx)
If you do actually need the balance matching the greatest date on that row:
select
greatest(y.date1, y.date2, y.date3) as GreatestDate,
case greatest(y.date1, y.date2, y.date3)
when y.Date1 then
y.balance1
when y.date2 then
y.balance2
when y.date3 then
y.balance3
end as GreatestDateBalance
from
YourTable y
where
greatest(y.date1, y.date2, y.date3) =
(select max(greatest(yx.date1, yx.date2, yx.date3))
from
YourTable yx)
But I think what you really need, is to reconsider your table design. :)
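As a rough illustration of what a redesign could buy, the sketch below unpivots the three date/balance pairs into one row per pair and then picks the pair with the largest date per account. It runs on SQLite through Python's sqlite3 module rather than on Oracle, and it assumes the columns are paired as Date1/Balance1, Date2/Balance2 and Date3/Balance3 with the key named AcntNo, which the question's header does not state clearly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column names and pairing are assumptions made for this example.
CREATE TABLE YourTable (AcntNo INTEGER,
                        Date1 INTEGER, Balance1 INTEGER,
                        Date2 INTEGER, Balance2 INTEGER,
                        Date3 INTEGER, Balance3 INTEGER);
INSERT INTO YourTable VALUES
  (123, 50282,  3456,  45465,  56557,  4556,  324235),
  (123, 56757, 23434, 234235, 344324, 56476,    5676),
  (123,   435,  2434,   2343, 234545, 24245, 2423424);
""")

# Unpivot to (account, date, balance) rows, then keep the largest date per account.
latest = conn.execute("""
WITH pairs AS (
    SELECT AcntNo, Date1 AS d, Balance1 AS b FROM YourTable
    UNION ALL
    SELECT AcntNo, Date2, Balance2 FROM YourTable
    UNION ALL
    SELECT AcntNo, Date3, Balance3 FROM YourTable
)
SELECT p.AcntNo, p.d, p.b
FROM pairs AS p
WHERE p.d = (SELECT MAX(p2.d) FROM pairs AS p2 WHERE p2.AcntNo = p.AcntNo)
""").fetchall()
```

If the data were stored as one (account, date, balance) row per entry in the first place, the pairs step would disappear and only the final SELECT would remain.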
I'm not sure why you have multiple dates/balances in your table; however, the query below should give you something to work from:
SELECT *
FROM YourTable T
WHERE NOT EXISTS (
SELECT *
FROM YourTable T2
WHERE T2.AcntNo = T.AcntNo
AND T2.Date1 > T.Date1
)