Merging multiple rows according to an order - sql

Suppose there are the following rows
| Id | MachineName | WorkerName | MachineState |
|----------------------------------------------|
| 1 | Alpha | Young | RUNNING |
| 1 | Beta | | STOPPED |
| 1 | Gamma | Foo | READY |
| 1 | Zeta | Zatta | |
| 2 | Guu | Niim | RUNNING |
| 2 | Yuu | Jaam | STOPPED |
| 2 | Nuu | | READY |
| 2 | Faah | Siim | |
| 3 | Iem | | RUNNING |
| 3 | Nyt | Fish | READY |
| 3 | Qwe | Siim | |
We want to merge these rows according to the following priority:
STOPPED > RUNNING > READY > (null or empty)
Within each Id, a column's value should come from the row with the highest-priority state, provided that value is not null. If it is null, a value from any other row with the same Id should be used. The rows should be grouped by Id.
The correct output for the above input is:
| Id | MachineName | WorkerName | MachineState |
|----------------------------------------------|
| 1 | Beta | Foo | STOPPED |
| 2 | Yuu | Jaam | STOPPED |
| 3 | Iem | Fish | RUNNING |
What would be a good SQL query to accomplish this? I tried using joins, but could not get them to work.

You can view this as a case of the group-wise maximum problem, provided you can obtain a suitable ordering over your MachineState column—e.g. by using a CASE expression:
SELECT a.Id,
       COALESCE(a.MachineName, t.MachineName) MachineName,
       COALESCE(a.WorkerName , t.WorkerName ) WorkerName,
       a.MachineState
FROM   myTable a JOIN (
         SELECT   Id,
                  MIN(MachineName) AS MachineName,
                  MIN(WorkerName ) AS WorkerName,
                  MAX(CASE MachineState
                        WHEN 'READY'   THEN 1
                        WHEN 'RUNNING' THEN 2
                        WHEN 'STOPPED' THEN 3
                      END) AS MachineState
         FROM     myTable
         GROUP BY Id
       ) t ON t.Id = a.Id AND t.MachineState = CASE a.MachineState
                                                 WHEN 'READY'   THEN 1
                                                 WHEN 'RUNNING' THEN 2
                                                 WHEN 'STOPPED' THEN 3
                                               END
See it on sqlfiddle:
| id | machinename | workername | machinestate |
|----|-------------|------------|--------------|
| 1 | Beta | Foo | STOPPED |
| 2 | Yuu | Jaam | STOPPED |
| 3 | Iem | Fish | RUNNING |
You could save yourself the pain of using CASE if MachineState was an ENUM type column (defined in the appropriate order). It so happens in this case that a simple lexicographic ordering over the string value will yield the same result, but that's a coincidence on which you really shouldn't rely as it's bound to slip under the radar when someone tries to maintain this code in the future.
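To try the CASE-ranking approach without a MySQL server, here is a minimal sketch that runs the same query shape against the question's sample rows using Python's built-in sqlite3 module. The Python harness, the RANK helper string and the TopRank alias are assumptions added for illustration; blank cells from the question are loaded as NULL.

```python
import sqlite3

# In-memory database holding the question's sample rows (blank cells become NULL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (Id INTEGER, MachineName TEXT, WorkerName TEXT, MachineState TEXT)"
)
rows = [
    (1, "Alpha", "Young", "RUNNING"),
    (1, "Beta", None, "STOPPED"),
    (1, "Gamma", "Foo", "READY"),
    (1, "Zeta", "Zatta", None),
    (2, "Guu", "Niim", "RUNNING"),
    (2, "Yuu", "Jaam", "STOPPED"),
    (2, "Nuu", None, "READY"),
    (2, "Faah", "Siim", None),
    (3, "Iem", None, "RUNNING"),
    (3, "Nyt", "Fish", "READY"),
    (3, "Qwe", "Siim", None),
]
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?)", rows)

# Hypothetical helper: the priority mapping written once and reused in both places.
RANK = "CASE {col} WHEN 'READY' THEN 1 WHEN 'RUNNING' THEN 2 WHEN 'STOPPED' THEN 3 END"

query = f"""
SELECT a.Id,
       COALESCE(a.MachineName, t.MachineName) AS MachineName,
       COALESCE(a.WorkerName,  t.WorkerName)  AS WorkerName,
       a.MachineState
FROM myTable a
JOIN (
    SELECT Id,
           MIN(MachineName) AS MachineName,
           MIN(WorkerName)  AS WorkerName,
           MAX({RANK.format(col='MachineState')}) AS TopRank
    FROM myTable
    GROUP BY Id
) t ON t.Id = a.Id
   AND t.TopRank = {RANK.format(col='a.MachineState')}
ORDER BY a.Id
"""

merged = conn.execute(query).fetchall()
for row in merged:
    print(row)
```

Rows whose state is NULL get a NULL rank, so they never satisfy the join condition and only contribute through the MIN() fallbacks.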

This is a prioritization query. One method uses variables. Another uses UNION ALL, which works as long as no state is repeated for a given Id (myTable stands in for your table name):
select t.*
from myTable t
where machinestate = 'STOPPED'
union all
select t.*
from myTable t
where machinestate = 'RUNNING' and
      not exists (select 1 from myTable t2 where t2.id = t.id and t2.machinestate in ('STOPPED'))
union all
select t.*
from myTable t
where machinestate = 'READY' and
      not exists (select 1 from myTable t2 where t2.id = t.id and t2.machinestate in ('STOPPED', 'RUNNING'));
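As a rough check of what the UNION ALL form returns, the sketch below runs it through Python's sqlite3 module on the question's sample data. The harness and the table name myTable are assumptions for illustration. Note that it returns each Id's highest-priority row as stored, so NULL columns in that row are not filled in from other rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (Id INTEGER, MachineName TEXT, WorkerName TEXT, MachineState TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1, "Alpha", "Young", "RUNNING"), (1, "Beta", None, "STOPPED"),
        (1, "Gamma", "Foo", "READY"), (1, "Zeta", "Zatta", None),
        (2, "Guu", "Niim", "RUNNING"), (2, "Yuu", "Jaam", "STOPPED"),
        (2, "Nuu", None, "READY"), (2, "Faah", "Siim", None),
        (3, "Iem", None, "RUNNING"), (3, "Nyt", "Fish", "READY"),
        (3, "Qwe", "Siim", None),
    ],
)

# One branch per state; each lower-priority branch is suppressed by NOT EXISTS
# whenever a higher-priority state exists for the same Id.
picked = conn.execute("""
SELECT t.* FROM myTable t
WHERE t.MachineState = 'STOPPED'
UNION ALL
SELECT t.* FROM myTable t
WHERE t.MachineState = 'RUNNING'
  AND NOT EXISTS (SELECT 1 FROM myTable t2
                  WHERE t2.Id = t.Id AND t2.MachineState IN ('STOPPED'))
UNION ALL
SELECT t.* FROM myTable t
WHERE t.MachineState = 'READY'
  AND NOT EXISTS (SELECT 1 FROM myTable t2
                  WHERE t2.Id = t.Id AND t2.MachineState IN ('STOPPED', 'RUNNING'))
ORDER BY 1
""").fetchall()

for row in picked:
    print(row)
```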

Change MachineState to an ENUM, declared in priority order:
`MachineState` enum('READY','RUNNING','STOPPED') DEFAULT NULL
and the SQL becomes simple:
select t.id, state.machinename, state.workername, t.mstate
from state,
     (select id, max(MachineState) mstate from state group by Id) t
where t.mstate = state.machinestate
  and t.id = state.id;

Related

How can I define IIf parameters across different records in a table?

I've defined a query that filters out records that are null in a specific field. I'd like to also calculate a query field that returns the type of record that follows the record that was filtered out, if it matches the parameters. The way I thought to do this was with an IIf statement with multiple parameters:
Preparing: IIf([tblCustomers!OrderId]=([tblCustomers!OrderId]+1)
AND [tblCustomers!OrderStatus]="Preparing","Preparing","")
This didn't work as I hoped, but I wasn't too surprised, as it would have to return data from the field initially tested. So, the argument that adds 1 is actually doing nothing.
Is there a way to target the next record in the table, test if it matches one of two or three strings, then return which one it is?
Edit: Following @mazoula's solution, it seems a correlated subquery is indeed the answer here. Following the guide on allenbrowne.com (linked by June7), I seemed to be on the right track. Here is my code for retrieving the status of the next record:
SELECT tblCustomers.AccountId,
tblCustomers.OrderId,
tblCustomers.OrderStatus,
tblCustomers.OrderShipped,
tblCustomers.OrderNotes,
(SELECT TOP 1 Dupe.OrderStatus
FROM tblCustomers AS Dupe
WHERE Dupe.AccountId = tblCustomers.AccountId
AND Dupe.OrderId > tblCustomers.OrderId
ORDER BY Dupe.AccountId DESC, Dupe.OrderId) AS NextStatus
FROM tblCustomers
WHERE (((tblCustomers.OrderShipped)="N") AND
((tblCustomers.OrderNotes) Is Null))
ORDER BY tblCustomers.AccountId DESC;
Unfortunately, I am met with the following error:
At most one record can be returned by this subquery
Doing a little more research, I found that incorporating an INNER JOIN expression should solve this.
...
FROM tblCustomers
INNER JOIN OrderStatus Dupe ON Dupe.AccountId = tblCustomers.AccountId
WHERE ...
This is where I've hit another roadblock and, when the syntax is at least correct, I receive the error:
Join expression not supported.
Is this a simple syntax issue, or have I misunderstood the role of a Join expression?
In Access 2016 I do this in two parts, because Access throws a "must use an updateable query" error when I try to update based on a subquery. For instance, suppose I want to replace the Null values in TableA.Field3 with 'a' if the next record's Field3 is 'a'.
TableA:
+----+--------+--------+--------+
| ID | Field1 | Field2 | Field3 |
+----+--------+--------+--------+
| 1  | a      | 1      |        |
| 2  | b      | 2      |        |
| 3  | c      | 3      | a      |
| 4  | d      | 4      | b      |
| 5  | e      | 5      |        |
| 6  | f      | 6      | b      |
+----+--------+--------+--------+
I make a table on which to base the update query:
Replacement: (SELECT TOP 1 Dupe.Field3 FROM [TableA] as Dupe WHERE Dupe.ID > [TableA].[ID])
'SQL PANE'
SELECT TableA.ID, TableA.Field1, TableA.Field2, TableA.Field3, (SELECT TOP 1 Dupe.Field3 FROM [TableA] as Dupe WHERE Dupe.ID > [TableA].[ID]) AS Replacement INTO TempTable
FROM TableA;
TempTable:
+----+--------+--------+--------+-------------+
| ID | Field1 | Field2 | Field3 | Replacement |
+----+--------+--------+--------+-------------+
| 1  | a      | 1      |        |             |
| 2  | b      | 2      |        | a           |
| 3  | c      | 3      | a      | b           |
| 4  | d      | 4      | b      |             |
| 5  | e      | 5      |        | b           |
| 6  | f      | 6      | b      |             |
+----+--------+--------+--------+-------------+
Finally, run the update:
UPDATE TempTable INNER JOIN TableA ON TempTable.ID = TableA.ID SET TableA.Field3 = [TempTable].[Replacement]
WHERE (((TempTable.Replacement)='a'));
TableA after update:
+----+--------+--------+--------+
| ID | Field1 | Field2 | Field3 |
+----+--------+--------+--------+
| 1  | a      | 1      |        |
| 2  | b      | 2      | a      |
| 3  | c      | 3      | a      |
| 4  | d      | 4      | b      |
| 5  | e      | 5      |        |
| 6  | f      | 6      | b      |
+----+--------+--------+--------+
Notes: In the Make Table query, remember to sort TableA and Dupe in the same way. Here we use the default sort of increasing ID for TableA, then grab the first record with a higher ID using the default sort again. The only reason I filtered to 'a' in the update query is that it made the Make Table query simpler.
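For the question's original goal (reading the next order's status per account), the correlated-subquery idea can be sketched as follows. This is only an illustration run through Python's sqlite3 module, not Access: SQLite's ORDER BY ... LIMIT 1 stands in for TOP 1, and the sample rows and status values are made up. Ordering the subquery by a unique key (OrderId) means there are no ties and exactly one row is picked. In Access, as far as I know, TOP 1 also returns tied rows, which may be one cause of the "At most one record" error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCustomers (AccountId INTEGER, OrderId INTEGER, OrderStatus TEXT)"
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO tblCustomers VALUES (?, ?, ?)",
    [
        (10, 1, "Received"),
        (10, 2, "Preparing"),
        (10, 3, "Shipped"),
        (20, 4, "Received"),
        (20, 5, "Preparing"),
    ],
)

# For each order, look up the status of the next order (by OrderId)
# belonging to the same account; NULL when there is no later order.
next_status = conn.execute("""
SELECT c.AccountId, c.OrderId, c.OrderStatus,
       (SELECT d.OrderStatus
        FROM tblCustomers AS d
        WHERE d.AccountId = c.AccountId
          AND d.OrderId > c.OrderId
        ORDER BY d.OrderId
        LIMIT 1) AS NextStatus
FROM tblCustomers AS c
ORDER BY c.AccountId, c.OrderId
""").fetchall()

for row in next_status:
    print(row)
```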

Join three tables by one foreign key

I have three tables:
Task (ID, TaskDescription)
Schedule (TaskID, ID, DueAt)
Audit (TaskID, TestID)
The Schedule table holds the list of scheduled tasks, and the Audit table holds tasks that are already done. So a row first exists in Schedule; when the task is done, it is removed from Schedule and added to Audit.
Tasks table
+----+-----------------+
| ID | TaskDescription |
+----+-----------------+
| 1 | Clean room |
| 2 | Remove trash |
+----+-----------------+
Schedule table
+--------+--------+------------+
| ID | TaskID | DueAt |
+--------+--------+------------+
| 927847 | 1 | 2020-08-01 |
| 777777 | 2 | 2020-08-07 |
+--------+--------+------------+
Audit table
+--------+--------+
| TaskID | TestID |
+--------+--------+
| 1 | 3 |
| 1 | 2 |
| 1 | 1 |
| 2 | 4 |
+--------+--------+
I need to take all planned and already done tasks for one task ID. So for example, what I expect as result:
+---------+-----------------+-------------+----------------+--------+
| Task.ID | TaskDescription | Schedule.ID | Schedule.DueAt | TestID |
+---------+-----------------+-------------+----------------+--------+
| 1 | Clean room | 927847 | 2020-08-01 | NULL |
| 1 | Clean room | NULL | NULL | 3 |
| 1 | Clean room | NULL | NULL | 2 |
| 1 | Clean room | NULL | NULL | 1 |
+---------+-----------------+-------------+----------------+--------+
That means 3 tasks are already done and one is scheduled for 2020-08-01.
What I tried:
SELECT
TaskID = t.ID,
t.TaskDescription,
ScheduleID = s.ID,
ScheduleDueAt = s.DueAt,
a.TestID
FROM Task t
LEFT OUTER JOIN Schedule s
ON (s.TaskID = t.ID)
LEFT OUTER JOIN Audit a
ON (a.TaskID = t.ID)
WHERE t.ID = '1'
But of course, I get the wrong result:
+---------+-----------------+-------------+----------------+--------+
| Task.ID | TaskDescription | Schedule.ID | Schedule.DueAt | TestID |
+---------+-----------------+-------------+----------------+--------+
| 1 | Clean room | 927847 | 2020-08-01 | 3 |
| 1 | Clean room | 927847 | 2020-08-01 | 2 |
| 1 | Clean room | 927847 | 2020-08-01 | 1 |
+---------+-----------------+-------------+----------------+--------+
I'm going to use UNION for that, but first I wanted to ask whether there is a better way to do it.
You need to UNION ALL the schedule and audit tables, selecting NULLs for the columns each one lacks. Then, you can join that result with the task table:
SELECT t.id, t.taskdescription, s.id, s.dueat, s.testid
FROM task t
JOIN (SELECT taskid, id, dueat, NULL AS testid
FROM schedule
UNION ALL
SELECT taskid, NULL, NULL, testid
FROM audit) s ON t.id = s.taskid
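A quick way to see the UNION ALL shape produce the expected rows is to run it on the question's sample data. The sketch below does that with Python's sqlite3 module; the harness, the ScheduleID alias, the filter on task 1 and the ORDER BY (added only to make the output order deterministic) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Task (ID INTEGER, TaskDescription TEXT);
CREATE TABLE Schedule (ID INTEGER, TaskID INTEGER, DueAt TEXT);
CREATE TABLE Audit (TaskID INTEGER, TestID INTEGER);
INSERT INTO Task VALUES (1, 'Clean room'), (2, 'Remove trash');
INSERT INTO Schedule VALUES (927847, 1, '2020-08-01'), (777777, 2, '2020-08-07');
INSERT INTO Audit VALUES (1, 3), (1, 2), (1, 1), (2, 4);
""")

# Stack scheduled and audited rows into one derived table, padding the
# columns each source lacks with NULL, then join that to Task.
result = conn.execute("""
SELECT t.ID, t.TaskDescription, s.ScheduleID, s.DueAt, s.TestID
FROM Task t
JOIN (SELECT TaskID, ID AS ScheduleID, DueAt, NULL AS TestID
      FROM Schedule
      UNION ALL
      SELECT TaskID, NULL, NULL, TestID
      FROM Audit) s ON t.ID = s.TaskID
WHERE t.ID = ?
ORDER BY s.ScheduleID IS NULL, s.TestID DESC
""", (1,)).fetchall()

for row in result:
    print(row)
```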
I agree that using UNION ALL as @Mureinik suggested is probably your best option here, but just for fun, another alternative would be this.
If you added another entry to your audit table for each taskID with a TestID of 0 (sort of as a default whenever a new task is created), then it will allow you to join onto the audit table, without the need for UNION.
So your Audit table would look like this:
+--------+--------+
| TaskID | TestID |
+--------+--------+
| 1 | 0 |
| 2 | 0 |
| 1 | 3 |
| 1 | 2 |
| 1 | 1 |
| 2 | 4 |
+--------+--------+
Then you can modify your query to join the schedule table as normal, but only where the audit table value is 0.
And finally, to keep it tidy, use NULLIF to hide the 0 for that TestID if you wish:
Select
    TaskID = t.ID,
    t.TaskDescription,
    ScheduleID = s.ID,
    ScheduleDueAt = s.DueAt,
    TestID = nullif(a.TestID, 0)
from
    Task t
inner join
    Audit a on
        a.TaskID = t.ID
left join
    Schedule s on
        s.TaskID = t.ID
        and a.TestID = 0
where
    t.ID = 1
UPDATE: You will also need an additional where clause for when there is no scheduled task, to prevent an empty row returning:
where
t.ID = 1
and not (s.TaskID is null and a.TestID = 0)

How can I subtract two row's values within same column using sql query in access?

(query access)
This is the table structure:
+-----+--------+--------+
| id | name | sub1 |
+-----+--------+--------+
| 1 | ABC | 6.27% |
| 2 | ABC | 7.47% |
| 3 | PQR | 3.39% |
| 4 | PQR | 2.21% |
+-----+--------+--------+
I want to subtract the previous row's sub1 from each row's sub1 within the same name.
Output should be:
+-----+--------+---------+------------------------------------+
| id | name | sub1 | |
+-----+--------+---------+------------------------------------+
| 1 | ABC | 6.27% | 0 First Rec no need Subtract |
| 2 | ABC | 7.47% | 1.2% <=(7.47-6.27) |
| 3 | PQR | 3.39% | 0 First Rec no need Subtract |
| 4 | PQR | 2.21% | -1.18% <=(2.21-3.39) |
+-----+--------+---------+------------------------------------+
Thank you so much.
If you can guarantee consecutive id values, then the following presents an alternative:
select t.*, nz(t.sub1-u.sub1,0) as sub2
from YourTable t left join YourTable u on t.name = u.name and t.id = u.id+1
Change YourTable to the name of your table.
This is painful, but you can do:
select t.*,
       (select top 1 t2.sub1
        from t as t2
        where t2.name = t.name and t2.id < t.id
        order by t2.id desc
       ) as prev_sub1
from t;
This gives the previous value or NULL for the first row. You can just use - for the subtraction.
An index on (name, id) would help a bit with performance. However, if you can upgrade to a better database, you can then just use lag().
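For comparison, here is a minimal sketch of the lag() version on a database that has window functions. It uses Python's sqlite3 module (window functions need SQLite 3.25 or later), stores sub1 as a plain number rather than a formatted percentage, and uses the table name YourTable and the column alias diff; all of these are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, name TEXT, sub1 REAL)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, "ABC", 6.27), (2, "ABC", 7.47), (3, "PQR", 3.39), (4, "PQR", 2.21)],
)

# LAG(sub1) reads the previous row's sub1 within the same name, ordered by id.
# The first row of each name has no previous row, so COALESCE turns NULL into 0.
diffs = conn.execute("""
SELECT id, name, sub1,
       ROUND(COALESCE(sub1 - LAG(sub1) OVER (PARTITION BY name ORDER BY id), 0), 2) AS diff
FROM YourTable
ORDER BY id
""").fetchall()

for row in diffs:
    print(row)
```

Unlike the self-join on id + 1, this does not require consecutive id values.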

SQL select multiple values present in multiple columns

I have two tables, DiagnosisCodes and DiagnosisConditions, as shown below. I need to find the members (IDs) who have a combination of Hypertension and Diabetes. The problem is that the diagnosis codes are spread across 10 columns. How do I check whether a member qualifies for both conditions?
DiagnosisCodes
+----+-------+-------+-------+-----+--------+
| ID | Diag1 | Diag2 | Diag3 | ... | Diag10 |
+----+-------+-------+-------+-----+--------+
| A | 2502 | 2593 | NULL | ... | NULL |
| B | 2F93 | 2509 | 2593 | ... | NULL |
| C | C257 | 2509 | C6375 | ... | NULL |
+----+-------+-------+-------+-----+--------+
DiagnosisConditions
+------+--------------+
| Code | Condition |
+------+--------------+
| 2502 | Hypertension |
| 2593 | Diabetes |
| 2509 | Diabetes |
| 2F93 | Hypertension |
| 2673 | HeartFailure |
+------+--------------+
Expected Result
+---------+
| Members |
+---------+
| A |
| B |
+---------+
How do I write a query that checks for multiple values present in multiple columns? Would you suggest using EXISTS?
SELECT DISTINCT id
FROM diagnosiscodes
WHERE ( diag1, diag2...diag10 ) IN (SELECT code
FROM diagnosiscondition
WHERE condition IN ( 'Hypertension','Diabetes' )
)
I would do this using group by and having:
select dc.id
from diagnosiscodes dc join
     diagnosisconditions dcon
     on dcon.code in (dc.diag1, dc.diag2, . . . )
group by dc.id
having sum(case when dcon.condition = 'Diabetes' then 1 else 0 end) > 0 and
       sum(case when dcon.condition = 'Hypertension' then 1 else 0 end) > 0;
Then, you should fix your data structure. Having separate columns with the same information distinguished by a number is usually a sign of a poor data structure. You should have a table, called something like PatientDiagnoses, with one row per patient and diagnosis.
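To illustrate the suggested restructuring, the sketch below stores the sample data with one row per patient and diagnosis, after which the "has both conditions" check becomes an ordinary join with GROUP BY and HAVING. It runs through Python's sqlite3 module; the column names of PatientDiagnoses and the harness itself are assumptions, and only the codes visible in the question's sample are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PatientDiagnoses (ID TEXT, Code TEXT);
CREATE TABLE DiagnosisConditions (Code TEXT, Condition TEXT);
INSERT INTO PatientDiagnoses VALUES
    ('A', '2502'), ('A', '2593'),
    ('B', '2F93'), ('B', '2509'), ('B', '2593'),
    ('C', 'C257'), ('C', '2509'), ('C', 'C6375');
INSERT INTO DiagnosisConditions VALUES
    ('2502', 'Hypertension'), ('2593', 'Diabetes'), ('2509', 'Diabetes'),
    ('2F93', 'Hypertension'), ('2673', 'HeartFailure');
""")

# A member qualifies when their diagnoses map to both distinct conditions.
members = [row[0] for row in conn.execute("""
SELECT pd.ID
FROM PatientDiagnoses pd
JOIN DiagnosisConditions dcon ON dcon.Code = pd.Code
WHERE dcon.Condition IN ('Hypertension', 'Diabetes')
GROUP BY pd.ID
HAVING COUNT(DISTINCT dcon.Condition) = 2
ORDER BY pd.ID
""")]

print(members)
```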
Here is one way, by unpivoting the data and requiring both conditions per id:
SELECT yt.id
FROM yourtable yt
CROSS APPLY (VALUES (Diag1),(Diag2),..(Diag10)) tc(Diag)
JOIN DiagnosisConditions dcon ON dcon.code = tc.Diag
WHERE dcon.condition IN ('Hypertension', 'Diabetes')
GROUP BY yt.id
HAVING COUNT(DISTINCT dcon.condition) = 2

Find and update specific duplicates in MS SQL

Given the table below:
+----+---------+-----------+-------------+-------+
| ID | NAME | LAST NAME | PHONE | STATE |
+----+---------+-----------+-------------+-------+
| 1 | James | Vangohg | 04333989878 | NULL |
| 2 | Ashly | Baboon | 09898788909 | NULL |
| 3 | James | Vangohg | 04333989878 | NULL |
| 4 | Ashly | Baboon | 09898788909 | NULL |
| 5 | Michael | Foo | 02933889990 | NULL |
| 6 | James | Vangohg | 04333989878 | NULL |
+----+---------+-----------+-------------+-------+
I want to use MS SQL to find and update duplicates (based on name, last name and phone), but only the earlier one(s). So the desired result for the above table is:
+----+---------+-----------+-------------+-------+
| ID | NAME | LAST NAME | PHONE | STATE |
+----+---------+-----------+-------------+-------+
| 1 | James | Vangohg | 04333989878 | DUPE |
| 2 | Ashly | Baboon | 09898788909 | DUPE |
| 3 | James | Vangohg | 04333989878 | DUPE |
| 4 | Ashly | Baboon | 09898788909 | NULL |
| 5 | Michael | Foo | 02933889990 | NULL |
| 6 | James | Vangohg | 04333989878 | NULL |
+----+---------+-----------+-------------+-------+
This query uses a CTE to apply a row number, where any number > 1 is a dupe of the row with the highest ID.
;WITH x AS
(
  SELECT ID,NAME,[LAST NAME],PHONE,STATE,
    rn = ROW_NUMBER() OVER (PARTITION BY NAME,[LAST NAME],PHONE ORDER BY ID DESC)
  FROM dbo.YourTable
)
UPDATE x SET STATE = CASE rn WHEN 1 THEN NULL ELSE 'DUPE' END;
Of course, I see no reason to actually update the table with this information; every time the table is touched, this data is stale and the query must be re-applied. Since you can derive this information at run-time, this should be part of a query, not constantly updated in the table. IMHO.
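As a sketch of deriving the flag at query time instead of storing it, the example below computes STATE with a correlated EXISTS: a row is a 'DUPE' when a later row (higher ID) has the same name, last name and phone. It runs through Python's sqlite3 module on the question's sample rows; the harness and the table name YourTable are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER, NAME TEXT, [LAST NAME] TEXT, PHONE TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (1, "James", "Vangohg", "04333989878"),
        (2, "Ashly", "Baboon", "09898788909"),
        (3, "James", "Vangohg", "04333989878"),
        (4, "Ashly", "Baboon", "09898788909"),
        (5, "Michael", "Foo", "02933889990"),
        (6, "James", "Vangohg", "04333989878"),
    ],
)

# STATE is computed on the fly, so it can never go stale.
flags = conn.execute("""
SELECT t.ID,
       CASE WHEN EXISTS (SELECT 1
                         FROM YourTable later
                         WHERE later.NAME = t.NAME
                           AND later.[LAST NAME] = t.[LAST NAME]
                           AND later.PHONE = t.PHONE
                           AND later.ID > t.ID)
            THEN 'DUPE' END AS STATE
FROM YourTable t
ORDER BY t.ID
""").fetchall()

for row in flags:
    print(row)
```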
Try this statement:
update t1
set
    t1.STATE = 'DUPE'
from
    TableName t1
join
(
    select name, [last name], phone, max(id) as id, count(id) as cnt
    from
        TableName
    group by name, [last name], phone
    having count(id) > 1
) t2 on ( t1.name = t2.name and t1.[last name] = t2.[last name] and t1.phone = t2.phone and t1.id < t2.id)
If my understanding of your requirements is correct, you want to update the STATE value to DUPE whenever there exists another row with a higher ID value that has the same NAME, LAST NAME and PHONE. If so, use this:
update t set STATE = (case when sorted.RowNbr = 1 then null else 'DUPE' end)
from yourtable t
join (select
        ID,
        row_number() over
          (partition by name, [last name], phone order by id desc) as RowNbr
      from yourtable) sorted
  on sorted.ID = t.ID