Sql getting MAX and MIN values based on two columns for the ids from two others - sql

I'm having difficulties figuring a query out, would someone be able to assist me with this?
Problem: 4 columns that represent results for the 2 separate tests. One of them taken in UK and another in US. Both of them are the same test and I need to find the highest and lowest score for the test taken in both countries. I also need to avoid using subqueries and temporary tables. Would appreciate theoretical ideas and actual solutions for the problem.
The table looks like this:
ResultID Test_UK Test_US Test_UK_Score Test_US_Score
1 1 2 48 11
2 4 1 21 24
3 3 1 55 71
4 5 6 18 78
5 7 4 19 49
6 1 3 23 69
7 5 2 98 35
8 6 7 41 47
The desired results I'm looking for:
TestID HighestScore LowestScore
1 71 23
2 35 11
3 69 55
4 49 21
5 98 18
6 78 41
7 47 19
I tried implementing a case of comparison, but I still ended up with subquery to pull out the final results. Also tried union, but it ends up in a sub query again. As far as I can think it shoul be a case when then query, but can't really come up with the logic for it, as it requires to match the ID's of the tests.
Thank you!
What I've tried and got the best results (still wrong)
select v.TestID,
max(case when Test_US_Score > Test_UK_Score then Test_UK_Score else null end) MaxS,
min(case when Test_UK_Score > Test_US_Score then Test_US_Score else null end) MinS
FROM ResultsDB rDB CROSS APPLY
(VALUES (Test_UK, 1), (Test_US, 0)
) V(testID, amount)
GROUP BY v.TestID
Extra
The answer provided by M. Kanarkowski is a perfect solution. I'm no expert on CTE, and a bit confused, how would it be possible to adapt this query to return the result ID of the row that min and max were found.
something like this:
TestID Result_ID_Max Result_ID_Min
1 3 6
2 7 1
3 6 3
Extra 2
The desired results of the query would me something like this.
The two last columns represent the IDs of the rows from the original table where the max and min values were found.
TestID HighestScore LowestScore Result_ID_Of_Max Result_ID_Of_Min
1 71 23 3 6
2 35 11 7 1
3 69 55 6 3

For example you can use union to have results from both countries togehter and then just pick the maximum and the minimum for your data.
with cte as (
select Test_UK as TestID, Test_UK_Score as score from yourTable
union all
select Test_US as TestID, Test_US_Score as score from yourTable
)
select
TestID
,max(score) as HighestScore
,min(score) as LowestScore
from cte
group by TestID
order by TestID
Extra:
I assumed that you want to have the additional column with the previous result. If not just take the above select and replace Test_UK_Score and Test_US_Score with ResultID.
with cte as (
select Test_UK as TestID, Test_UK_Score as score, ResultID from yourTable
union all
select Test_US as TestID, Test_US_Score as score, ResultID from yourTable
)
select
TestID
,max(score) as HighestScore
,min(score) as LowestScore
,max(ResultID) as Result_ID_Max
,min(ResultID) as Result_ID_Min
from cte
group by TestID
order by TestID

Related

Delete rows, which are duplicated and follow each other consequently

It's hard to formulate, so i'll just show an example and you are welcome to edit my question and title.
Suppose, i have a table
flag id value datetime
0 b 1 343 13
1 a 1 23 12
2 b 1 21 11
3 b 1 32 10
4 c 2 43 11
5 d 2 43 10
6 d 2 32 9
7 c 2 1 8
For each id i want to squeze the table by flag columns such that all duplicate flag values that follow each other collapse to one row with sum aggregation. Desired result:
flag id value
0 b 1 343
1 a 1 23
2 b 1 53
3 c 2 75
4 d 2 32
5 c 2 1
P.S: I found functions like CONDITIONAL_CHANGE_EVENT, which seem to be able to do that, but the examples of them in docs dont work for me
Use the differnece of row number approach to assign groups based on consecutive row flags being the same. Thereafter use a running sum.
select distinct id,flag,sum(value) over(partition by id,grp) as finalvalue
from (
select t.*,row_number() over(partition by id order by datetime)-row_number() over(partition by id,flag order by datetime) as grp
from tbl t
) t
Here's an approach which uses CONDITIONAL_CHANGE_EVENT:
select
flag,
id,
sum(value) value
from (
select
conditional_change_event(flag) over (order by datetime desc) part,
flag,
id,
value
from so
) t
group by part, flag, id
order by part;
The result is different from your desired result stated in the question because of order by datetime. Adding a separate column for the row number and sorting on that gives the correct result.

Re-Organize Access Table by converting Rows to Columns

I'm pretty new to access and SQL and need some help re-organizing a table. I have the following table (sorry for the table below - having trouble posting):
ID GroupID Distance Code Start_Finish
1 44 7 A S1
2 44 14 A F1
3 45 12 B S1
4 45 16 B F1
5 45 31 C S2
6 45 36 C F2
7 45 81 B S3
8 45 88 B F3
And need for the table to be transformed into:
GroupID Code Start_Distance Finish_Distance
44 A 7 14
45 B 12 16
45 C 31 36
45 B 81 88
try something like this
Select GroupID, Code, min(distance) as Start_distance, max(distance) as Finish_distance
from Table
group by GroupID, Code
If the min and max functions don't give you what you need, try it with First() and Last() instead.
Oops - just noticed you have 2 different entries in the output for GroupID 45 Code B - is that a requirement? With that data structure and requirement, the problem gets much more difficult.
Now I see the final column in the 1st table - I think that can be used to get the output you want:
Select GroupID, Code, mid(start_finish,2) as T, min(distance) as Start_distance, max(distance) as Finish_distance
from Table
group by GroupID, Code, T
You can use conditional aggregation for this.
select GroupID
, CODE
, max(case when Left(Start_Finish, 1) = 'S' then Distance end) as Start_Distance
, max(case when Left(Start_Finish, 1) = 'F' then Distance end) as Finish_Distance
from SomeTable
group by GroupID
, CODE

Query to return all results except for the first record

I have a archive table that has records of transactions per locationID.
A location will have 0, 1 or many rows in this table.
I need a SELECT query that will return rows for any location that has more than 1 row, and to skip the first entry.
e.g.
Transactions table
transactionId locationId amount
1 11 2343
2 11 23434
3 25 342
4 32 234
5 77 234
6 11 38938
7 43 234
8 43 1235
So given the above, since the locationID has multiple rows, I will get back all rows except for the first one (lowest transacton_id):
2 11 23434
6 11 38938
8 43 1235
You can use row_number to do this. This assumes there would be no duplicate transactionid's.
select transactionid,locationid,amount
from
(select t.*, row_number() over(partition by locationid order by transactionid) as rn
from transactions t) t
where rn > 1
The other answer is fine. You could also write it this way, it might give you a little insight into grouping practices:
SELECT Transactions.TransactionID, Transactions.locationID, Transactions.amount
FROM Transactions INNER JOIN
(SELECT locationID, MIN(TransactionID) AS MinTransaction,
COUNT(TransactionID) AS CountTransaction
FROM Transactions
GROUP BY locationID) TableSum ON Transactions.locationID = TableSum.locationID
WHERE (Transactions.TransactionID <> TableSum.MinTransaction) AND
(TableSum.CountTransaction > 1)

Multiple AVG in single SQL query

I've searched for adding multiple AVG calculations and have found a few entries, however I'm having to join another table and the examples of that are scarce.
closest answer I can find is this
but it deals with dates and no joins
here are my tables:
indicators:
StandardScore IndicatorID NID DID
0.033333 7 1 1
0.907723 9 1 1
0.574739 26 1 1
0.917391 21 1 1
.....
indexindicators:
IndexID IndicatorID
1 7
1 26
2 21
3 7
4 9
4 21
4 7
5 9
.......
My goal is to get the average for each IndexID (indexindicators) related to NID/DID (indicators) combination
a query to retrieve a single value would be
SELECT AVG(StandardScore) FROM `indicators` INNER JOIN indexindicators ON indicators.IndicatorId=indexindicators.IndicatorId WHERE nid=1 AND did=1 AND indexindicators.IndexId=1
ultimately there will be 6 (indexID) averages which then have to be rounded then * by 100 (should I do that part with PHP?)
This seems like such a simple query, but i just can't seem wrap my mind around it.
Thanks in advance for your help!
SELECT nid, did, indexid, 100.0 * AVG(StandardScore)
FROM 'indicators'
INNER JOIN 'indexindicators'
ON indicators.IndicatorId=indexindicators.IndicatorId
group by nid, did, indexid

How to track how many times a column changed its value?

I have a table called crewWork as follows :
CREATE TABLE crewWork(
FloorNumber int, AptNumber int, WorkType int, simTime int )
After the table was populated, I need to know how many times a change in apt occurred and how many times a change in floor occurred. Usually I expect to find 10 rows on each apt and 40-50 on each floor.
I could just write a scalar function for that, but I was wondering if there's any way to do that in t-SQL without having to write scalar functions.
Thanks
The data will look like this:
FloorNumber AptNumber WorkType simTime
1 1 12 10
1 1 12 25
1 1 13 35
1 1 13 47
1 2 12 52
1 2 12 59
1 2 13 68
1 1 14 75
1 4 12 79
1 4 12 89
1 4 13 92
1 4 14 105
1 3 12 115
1 3 13 129
1 3 14 138
2 1 12 142
2 1 12 150
2 1 14 168
2 1 14 171
2 3 12 180
2 3 13 190
2 3 13 200
2 3 14 205
3 3 14 216
3 4 12 228
3 4 12 231
3 4 14 249
3 4 13 260
3 1 12 280
3 1 13 295
2 1 14 315
2 2 12 328
2 2 14 346
I need the information for a report, I don't need to store it anywhere.
If you use the accepted answer as written now (1/6/2023), you get correct results with the OP dataset, but I think you can get wrong results with other data.
CONFIRMED: ACCEPTED ANSWER HAS A MISTAKE (as of 1/6/2023)
I explain the potential for wrong results in my comments on the accepted answer.
In this db<>fiddle, I demonstrate the wrong results. I use a slightly modified form of accepted answer (my syntax works in SQL Server and PostgreSQL). I use a slightly modified form of the OP's data (I change two rows). I demonstrate how the accepted answer can be changed slightly, to produce correct results.
The accepted answer is clever but needs a small change to produce correct results (as demonstrated in the above db<>fiddle and described here:
Instead of doing this as seen in the accepted answer COUNT(DISTINCT AptGroup)...
You should do thisCOUNT(DISTINCT CONCAT(AptGroup, '_', AptNumber))...
DDL:
SELECT * INTO crewWork FROM (VALUES
-- data from question, with a couple changes to demonstrate problems with the accepted answer
-- https://stackoverflow.com/q/8666295/1175496
--FloorNumber AptNumber WorkType simTime
(1, 1, 12, 10 ),
-- (1, 1, 12, 25 ), -- original
(2, 1, 12, 25 ), -- new, changing FloorNumber 1->2->1
(1, 1, 13, 35 ),
(1, 1, 13, 47 ),
(1, 2, 12, 52 ),
(1, 2, 12, 59 ),
(1, 2, 13, 68 ),
(1, 1, 14, 75 ),
(1, 4, 12, 79 ),
-- (1, 4, 12, 89 ), -- original
(1, 1, 12, 89 ), -- new , changing AptNumber 4->1->4 ges)
(1, 4, 13, 92 ),
(1, 4, 14, 105 ),
(1, 3, 12, 115 ),
...
DML:
;
WITH groupedWithConcats as (SELECT
*,
CONCAT(AptGroup,'_', AptNumber) as AptCombo,
CONCAT(FloorGroup,'_',FloorNumber) as FloorCombo
-- SQL SERVER doesnt have TEMPORARY keyword; Postgres doesn't understand # for temp tables
-- INTO TEMPORARY groupedWithConcats
FROM
(
SELECT
-- the columns shown in Andriy's answer:
-- https://stackoverflow.com/a/8667477/1175496
ROW_NUMBER() OVER ( ORDER BY simTime) as RN,
-- AptNumber
AptNumber,
ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime) as RN_Apt,
ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime) as AptGroup,
-- FloorNumber
FloorNumber,
ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime) as RN_Floor,
ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime) as FloorGroup
FROM crewWork
) grouped
)
-- if you want to see how the groupings work:
-- SELECT * FROM groupedWithConcats
-- otherwise just run this query to see the counts of "changes":
SELECT
COUNT(DISTINCT AptCombo)-1 as CountAptChangesWithConcat_Correct,
COUNT(DISTINCT AptGroup)-1 as CountAptChangesWithoutConcat_Wrong,
COUNT(DISTINCT FloorCombo)-1 as CountFloorChangesWithConcat_Correct,
COUNT(DISTINCT FloorGroup)-1 as CountFloorChangesWithoutConcat_Wrong
FROM groupedWithConcats;
ALTERNATIVE ANSWER
The accepted-answer may eventually get updated to remove the mistake. If that happens I can remove my warning but I still want leave you with this alternative way to produce the answer.
My approach goes like this: "check the previous row, if the value is different in previous row vs current row, then there is a change". SQL doesn't have idea or row order functions per se (at least not like in Excel for example; )
Instead, SQL has window functions. With SQL's window functions, you can use the window function RANK plus a self-JOIN technique as seen here to combine current row values and previous row values so you can compare them. Here is a db<>fiddle showing my approach, which I pasted below.
The intermediate table, showing the columns which has a value 1 if there is a change, 0 otherwise (i.e. FloorChange, AptChange), is shown at the bottom of the post...
DDL:
...same as above...
DML:
;
WITH rowNumbered AS (
SELECT
*,
ROW_NUMBER() OVER ( ORDER BY simTime) as RN
FROM crewWork
)
,joinedOnItself AS (
SELECT
rowNumbered.*,
rowNumberedRowShift.FloorNumber as FloorShift,
rowNumberedRowShift.AptNumber as AptShift,
CASE WHEN rowNumbered.FloorNumber <> rowNumberedRowShift.FloorNumber THEN 1 ELSE 0 END as FloorChange,
CASE WHEN rowNumbered.AptNumber <> rowNumberedRowShift.AptNumber THEN 1 ELSE 0 END as AptChange
FROM rowNumbered
LEFT OUTER JOIN rowNumbered as rowNumberedRowShift
ON rowNumbered.RN = (rowNumberedRowShift.RN+1)
)
-- if you want to see:
-- SELECT * FROM joinedOnItself;
SELECT
SUM(FloorChange) as FloorChanges,
SUM(AptChange) as AptChanges
FROM joinedOnItself;
Below see the first few rows of the intermediate table (joinedOnItself). This shows how my approach works. Note the last two columns, which have a value of 1 when there is a change in FloorNumber compared to FloorShift (noted in FloorChange), or a change in AptNumber compared to AptShift (noted in AptChange).
floornumber
aptnumber
worktype
simtime
rn
floorshift
aptshift
floorchange
aptchange
1
1
12
10
1
0
0
2
1
12
25
2
1
1
1
0
1
1
13
35
3
2
1
1
0
1
1
13
47
4
1
1
0
0
1
2
12
52
5
1
1
0
1
1
2
12
59
6
1
2
0
0
1
2
13
68
7
1
2
0
0
Note instead of using the window function RANK and JOIN, you could use the window function LAG to compare values in the current row to the previous row directly (no need to JOIN). I don't have that solution here, but it is described in the Wikipedia article example:
Window functions allow access to data in the records right before and after the current record.
If I am not missing anything, you could use the following method to find the number of changes:
determine groups of sequential rows with identical values;
count those groups;
subtract 1.
Apply the method individually for AptNumber and for FloorNumber.
The groups could be determined like in this answer, only there's isn't a Seq column in your case. Instead, another ROW_NUMBER() expression could be used. Here's an approximate solution:
;
WITH marked AS (
SELECT
FloorGroup = ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime),
AptGroup = ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime)
FROM crewWork
)
SELECT
FloorChanges = COUNT(DISTINCT FloorGroup) - 1,
AptChanges = COUNT(DISTINCT AptGroup) - 1
FROM marked
(I'm assuming here that the simTime column defines the timeline of changes.)
UPDATE
Below is a table that shows how the distinct groups are obtained for AptNumber.
AptNumber RN RN_Apt AptGroup (= RN - RN_Apt)
--------- -- ------ ---------
1 1 1 0
1 2 2 0
1 3 3 0
1 4 4 0
2 5 1 4
2 6 2 4
2 7 3 4
1 8 5 => 3
4 9 1 8
4 10 2 8
4 11 3 8
4 12 4 8
3 13 1 12
3 14 2 12
3 15 3 12
1 16 6 10
… … … …
Here RN is a pseudo-column that stands for ROW_NUMBER() OVER (ORDER BY simTime). You can see that this is just a sequence of rankings starting from 1.
Another pseudo-column, RN_Apt contains values produces by the other ROW_NUMBER, namely ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime). It contains rankings within individual groups of identical AptNumber values. You can see that, for a newly encountered value, the sequence starts over, and for a recurring one, it continues where it stopped last time.
You can also see from the table that if we subtract RN from RN_Apt (could be the other way round, doesn't matter in this situation), we get the value that uniquely identifies every distinct group of same AptNumber values. You might as well call that value a group ID.
So, now that we've got these IDs, it only remains for us to count them (count distinct values, of course). That will be the number of groups, and the number of changes is one less (assuming the first group is not counted as a change).
add an extra column changecount
CREATE TABLE crewWork(
FloorNumber int, AptNumber int, WorkType int, simTime int ,changecount int)
increment changecount value for each updation
if want to know count for each field then add columns corresponding to it for changecount
Assuming that each record represents a different change, you can find changes per floor by:
select FloorNumber, count(*)
from crewWork
group by FloorNumber
And changes per apartment (assuming AptNumber uniquely identifies apartment) by:
select AptNumber, count(*)
from crewWork
group by AptNumber
Or (assuming AptNumber and FloorNumber together uniquely identifies apartment) by:
select FloorNumber, AptNumber, count(*)
from crewWork
group by FloorNumber, AptNumber