How to Compare one row with other rows in SQL - sql

I have a data set that consists of team members.
I want to calculate an average for member2 only.
However, the calculation has to meet some conditions.
For example, ID 1 has Anna and Sam working together; I want to calculate only for member2, which is Sam.
For that, I want to sum the scores of the IDs that have
Sam working with himself, like id:2 member1:Sam member2:Sam
Sam working with another member who is not Anna (member1), like id:3 member1:Sam member2:Nihal OR id:4 member1:Nihal member2:Sam
Then divide by the number of distinct IDs.
Input
+----+---------+---------+-------+
| ID | member1 | member2 | score |
+----+---------+---------+-------+
| 1 | Anna | Sam | 10 |
| 2 | Sam | Sam | 30 |
| 3 | Sam | Nihal | 40 |
| 4 | Nihal | Sam | 50 |
| 5 | Sam | Anna | 20 |
| 6 | Anna | Anna | 60 |
| 7 | Nihal | May | 70 |
| 8 | May | May | 80 |
+----+---------+---------+-------+
Output
+----+---------+---------+-------+-----+
| ID | member1 | member2 | score | AVG |
+----+---------+---------+-------+-----+
| 1 | Anna | Sam | 10 | 40 |-->AVG= (30+40+50)/3
| 2 | Sam | Sam | 30 | 30 |-->AVG= score
| 3 | Sam | Nihal | 40 | 70 |-->AVG= 70/1
| 4 | Nihal | Sam | 50 | 20 |-->AVG= (30+10+20)/3
| 5 | Sam | Anna | 20 | 60 |-->AVG= 60/1
| 6 | Anna | Anna | 60 | 60 |-->AVG= score
| 7 | Nihal | May | 70 | 80 |-->AVG= 80/1
| 8 | May | May | 80 | 80 |-->AVG= score
+----+---------+---------+-------+-----+

Try the following:
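select t1.*,
-- when member1 = member2 the subquery matches no rows, so fall back to the row's own score
coalesce(q.avg_score, t1.score) as avg_score
from yourtable t1
cross apply
(
select avg(score) as avg_score
from yourtable t2
where
t1.member2 in (t2.member1,t2.member2)
and t1.member1 not in (t2.member1,t2.member2)
)q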
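To check the rule against the expected output, here is a minimal Python sketch of the same logic. It assumes that a row whose member1 and member2 are the same person keeps its own score, which is what the Output table shows; the function name `member2_average` is only illustrative.

```python
# Sample rows from the question: (id, member1, member2, score).
rows = [
    (1, "Anna", "Sam", 10),
    (2, "Sam", "Sam", 30),
    (3, "Sam", "Nihal", 40),
    (4, "Nihal", "Sam", 50),
    (5, "Sam", "Anna", 20),
    (6, "Anna", "Anna", 60),
    (7, "Nihal", "May", 70),
    (8, "May", "May", 80),
]

def member2_average(row, rows):
    """Average score of rows that involve member2 but not member1."""
    _id, m1, m2, score = row
    if m1 == m2:
        # Assumption taken from the expected output: AVG = score.
        return score
    matches = [s for (_, a, b, s) in rows
               if m2 in (a, b) and m1 not in (a, b)]
    # No matching rows means no average can be computed.
    return sum(matches) / len(matches) if matches else None

result = {r[0]: member2_average(r, rows) for r in rows}
for row_id, avg in result.items():
    print(row_id, avg)
```

For ID 1 this averages the scores of IDs 2, 3 and 4, and for ID 4 the scores of IDs 1, 2 and 5, matching the annotations in the Output table.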

Related

SQL JOIN each id in JSON object

I have a JSON column containing col_values for another table. I want to return rows from that other table for each item in the JSON object.
If this was an INT column, I would use JOIN, but I need to JOIN every entry in the JSON object.
Take:
writers :
| id | name | projects (JSON) |
|:-- |:-----|:------------------|
| 1 | Andy | ["1","2","3","4"] |
| 2 | Hank | ["3","4","5","6"] |
| 3 | Alex | ["1","7","8","9"] |
| 4 | Joe | ["1","5","6","7"] |
| 5 | Ken | ["2","4","5","6"] |
| 6 | Zach | ["2","7","8","9"] |
| 7 | Walt | ["2","5","6","7"] |
| 8 | Mike | ["2","3","4","5"] |
cities :
| id | name | project |
|:-- |:---------|:--------|
| 1 | Boston | 1 |
| 2 | Chicago | 2 |
| 3 | Cisco | 3 |
| 4 | Seattle | 4 |
| 5 | North | 5 |
| 6 | West | 6 |
| 7 | Miami | 7 |
| 8 | York | 8 |
| 9 | Tainan | 9 |
| 10 | Seoul | 1 |
| 11 | South | 2 |
| 12 | Tokyo | 3 |
| 13 | Carlisle | 4 |
| 14 | Fugging | 5 |
| 15 | Turkey | 6 |
| 16 | Paris | 7 |
| 17 | Midguard | 8 |
| 18 | Fugging | 9 |
| 19 | Madrid | 1 |
| 20 | Salvador | 2 |
| 21 | Everett | 3 |
I need every city ordered by name for Mike (id=8).
Desired results:
This is what I'm getting and what I need to get (ORDER BY name).
Output :
| id | name | project |
|:---|:---------|:--------|
| 13 | Carlisle | 4 |
| 2 | Chicago | 2 |
| 3 | Cisco | 3 |
| 21 | Everett | 3 |
| 14 | Fugging | 5 |
| 5 | North | 5 |
| 20 | Salvador | 2 |
| 4 | Seattle | 4 |
| 11 | South | 2 |
| 12 | Tokyo | 3 |
Current query, but this can't be the best way...
SQL >
SELECT c.*
FROM cities c
WHERE EXISTS (
SELECT 1
FROM writers w
WHERE JSON_CONTAINS(
w.projects, CONCAT('\"', c.project, '\"'))
AND w.id = '8'
)
ORDER BY c.name;
DB Fiddle with the above. Is there a better way to do this "properly"?
Background
If it matters, I need to keep using JSON as the datatype because my server-side software that uses this database normally reads that column best if presented as a JSON object.
I would normally just do several database calls and iterate through that JSON object in my server-side language, but that is far too expensive with so many database calls, and multiple database calls are even more costly once pagination is involved.
I need all the results in a single database call. So, I need to JOIN or otherwise loop through each item in the JSON object within SQL.
Start with JOIN
Per a comment from a user, there is a better way...
SQL >
SELECT c.*
FROM writers w
JOIN cities c ON JSON_CONTAINS(w.projects, CONCAT('\"', c.project, '\"'))
WHERE w.id = '8'
ORDER BY c.name;
Output is the same...
Output :
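| id | name | project |
|:---|:---------|:--------|
| 13 | Carlisle | 4 |
| 2 | Chicago | 2 |
| 3 | Cisco | 3 |
| 21 | Everett | 3 |
| 14 | Fugging | 5 |
| 5 | North | 5 |
| 20 | Salvador | 2 |
| 4 | Seattle | 4 |
| 11 | South | 2 |
| 12 | Tokyo | 3 |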
DB Fiddle
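For comparison, here is a small Python sketch of what the JOIN does conceptually: expand the writer's JSON array into individual project IDs, keep the cities whose project is in that set, and sort by name. The data is copied from the tables above (only Mike's row from writers); the function name `cities_for_writer` is an assumption made for illustration and is not part of any database API.

```python
import json

# writers.projects for Mike (id=8), stored as a JSON array of strings.
writers = {8: '["2","3","4","5"]'}

# cities rows from the question: (id, name, project).
cities = [
    (1, "Boston", 1), (2, "Chicago", 2), (3, "Cisco", 3),
    (4, "Seattle", 4), (5, "North", 5), (6, "West", 6),
    (7, "Miami", 7), (8, "York", 8), (9, "Tainan", 9),
    (10, "Seoul", 1), (11, "South", 2), (12, "Tokyo", 3),
    (13, "Carlisle", 4), (14, "Fugging", 5), (15, "Turkey", 6),
    (16, "Paris", 7), (17, "Midguard", 8), (18, "Fugging", 9),
    (19, "Madrid", 1), (20, "Salvador", 2), (21, "Everett", 3),
]

def cities_for_writer(writer_id):
    """Return the cities whose project appears in the writer's JSON list."""
    # The JSON values are strings, so convert them to match cities.project.
    projects = {int(p) for p in json.loads(writers[writer_id])}
    matching = [c for c in cities if c[2] in projects]
    return sorted(matching, key=lambda c: c[1])  # ORDER BY name

mike_cities = cities_for_writer(8)
for city in mike_cities:
    print(city)
```

The string-to-integer conversion plays the same role as the `CONCAT('\"', c.project, '\"')` in the queries above: both sides of the comparison have to use the same representation.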

SQL lag window and custom logic [closed]

I am working on this dataset
https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=447a5d2c33b04346e70dab0a8d098655
Custom logic:
Grouped by name, testcentre, coursename, testtype.
If a retest is taken, the scores are compared: if the retest score is higher, the higher one takes precedence; otherwise the original score is kept.
Lag:
Based on the row selected above, if a lag window exists between the remaining records (for example, within 4 days), then the record with the highest score shall be picked.
Any suggestions are appreciated.
Sample data
+----------+-------+------------+------------+----------+-----------+------------+------------+------------------+-----------------------------------------------+
| recordid | Name | testcentre | coursename | testtype | testscore | testdate | retestflag | Preferred_Output | RejectReason |
+----------+-------+------------+------------+----------+-----------+------------+------------+------------------+-----------------------------------------------+
| 1 | Sam | Paris | English | IELTS | 90 | 01/02/2019 | 0 | 0 | |
| 3 | Sam | Paris | English | IELTS | 95 | 02/02/2019 | 1 | 1 | Better score in retest |
| 4 | Sam | Paris | English | TOEFL | 80 | 04/02/2019 | 0 | 0 | Within 4 days of previous test |
| 21 | Sam | Paris | English | IELTS | 95 | 02/02/2018 | 1 | 1 | marked as retest without base.needs inclusion |
| 5 | Jack | London | English | IELTS | 90 | 01/02/2019 | 0 | 1 | Same or bad score in retest |
| 8 | Jack | London | English | IELTS | 90 | 02/02/2019 | 1 | 0 | Same or bad score in retest |
| 7 | Louis | Brazil | English | IELTS | 70 | 01/02/2019 | 0 | 1 | Same score in retest |
| 11 | Louis | Brazil | English | IELTS | 70 | 02/02/2019 | 1 | 0 | Same score in retest |
| 13 | Louis | Brazil | English | TOEFL | 100 | 04/02/2019 | 0 | 0 | Within 4 days of previous test |
| 55 | Sam | Paris | English | IELTS | 90 | 01/02/2016 | 0 | 1 | Older test, no follow on |
| 60 | Sam | Paris | English | IELTS | 95 | 01/08/2019 | 0 | 1 | same score in retest |
| 61 | Sam | Paris | English | IELTS | 95 | 02/08/2019 | 1 | 0 | |
| 62 | Sam | Paris | English | TOEFL | 80 | 04/01/2020 | 0 | 1 | More than 4 days, included |
+----------+-------+------------+------------+----------+-----------+------------+------------+------------------+-----------------------------------------------+
Desired Output
+----------+-------+------------+------------+----------+-----------+------------+------------+------------------+
| recordid | Name | testcentre | coursename | testtype | testscore | testdate | retestflag | Preferred_Output |
+----------+-------+------------+------------+----------+-----------+------------+------------+------------------+
| 3 | Sam | Paris | English | IELTS | 95 | 02/02/2019 | 1 | 1 |
| 21 | Sam | Paris | English | IELTS | 95 | 02/02/2018 | 1 | 1 |
| 5 | Jack | London | English | IELTS | 90 | 01/02/2019 | 0 | 1 |
| 7 | Louis | Brazil | English | IELTS | 70 | 01/02/2019 | 0 | 1 |
| 55 | Sam | Paris | English | IELTS | 90 | 01/02/2016 | 0 | 1 |
| 60 | Sam | Paris | English | IELTS | 95 | 01/08/2019 | 0 | 1 |
| 62 | Sam | Paris | English | TOEFL | 80 | 04/01/2020 | 0 | 1 |
+----------+-------+------------+------------+----------+-----------+------------+------------+------------------+
Based on your sample data and description, this seems to do what you want:
select t.*
from (select t.*,
row_number() over (partition by name, testcentre, coursename, testtype order by testscore desc) as seqnum,
count(*) over (partition by name, testcentre, coursename, testtype) as cnt
from test t
) t
where seqnum = 1 and cnt >= 2;
This does not include the "within 4 days" condition, because that condition is not clearly explained. What happens if there is a series of 5 tests, each 3 days apart, for instance?
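As a rough illustration of what the row_number()/count(*) query selects, here is a Python sketch using a few rows from the sample data. It keeps the highest-scoring record per (name, testcentre, coursename, testtype) group and only for groups with at least two records. Like the query, it ignores the 4-day rule. One assumption: on tied scores the sketch keeps the first record in input order, whereas row_number() with only `order by testscore desc` may pick either tied row.

```python
from collections import defaultdict

# A subset of the sample data:
# (recordid, name, testcentre, coursename, testtype, testscore)
records = [
    (1, "Sam", "Paris", "English", "IELTS", 90),
    (3, "Sam", "Paris", "English", "IELTS", 95),
    (4, "Sam", "Paris", "English", "TOEFL", 80),
    (5, "Jack", "London", "English", "IELTS", 90),
    (8, "Jack", "London", "English", "IELTS", 90),
]

def best_per_group(records):
    """Top-scoring record per group, for groups with two or more records."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[1:5]].append(rec)  # partition by name..testtype
    # max() returns the first maximal element, so ties keep input order.
    return [max(g, key=lambda r: r[5])
            for g in groups.values() if len(g) >= 2]

best_ids = [r[0] for r in best_per_group(records)]
print(best_ids)
```

Sam's single TOEFL record is dropped by the `cnt >= 2` condition, which is worth checking against the intended rule for tests that have no retest.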
select * from (
select *, row_number() over (partition by name order by testscore desc) rn
from test
) t
where rn = 1

windowing functions ms access

I am working on a class scheduling database in MS Access. There are a variety of classes, each of which is taught multiple times, sometimes multiple times in a day, but not necessarily every day. Each course has a unique set of software and data that is stored on a laptop. There is a set of laptops for each course with that software loaded.
For any given training day I need to assign a range of laptop IDs to the right classes in different rooms, depending on how many people will be taking that class in that room, so that the instructors know which laptops to take to the room with them to teach the class that day.
For example, I have the raw data:
Date Room ClassName HeadCount
---- ---- --------- ---------
11/30 101 Intro 10
11/30 102 Intro 15
11/30 103 Course 2 5
12/1 101 Intro 10
12/1 102 Course 2 15
12/1 103 Course 3 10
I also know the following about the laptops:
ClassName LaptopID
--------- ---------
Intro LT.Intro_1
Intro ...
Intro LT.Intro_30
Course 2 LT.Course 2_1
Course 2 ...
Course 2 LT.Course 2_30
Course 3 LT.Course 3_1
Course 3 ...
Course 3 LT.Course 3_30
Based on the above two tables, I would want to output:
Date Room ClassName HeadCount First Laptop Last Laptop
---- ---- --------- --------- ------------ -----------
11/30 101 Intro 10 LT.Intro_1 LT.Intro_10
11/30 102 Intro 15 LT.Intro_11 LT.Intro_25
11/30 103 Course 2 5 LT.Course 2_1 LT.Course 2_5
12/1 101 Intro 10 LT.Intro_1 LT.Intro_10
12/1 102 Course 2 15 LT.Course 2_1 LT.Course 2_15
12/1 103 Course 3 10 LT.Course 3_1 LT.Course 3_10
I know this is a windowing function, but MS Access doesn't have lead or lag. Is there a workaround?
You might want to change your table definitions for better performance. I have recreated the two tables as you described them.
You know your laptop IDs are in sequence and you know the headcount per class. To find the starting point, you must know the previous headcount,
which is the total attendees on the same date, for the same class, before the current class/event.
x = sum(headCount) where id < currentID & classname = currentClassname & date = currentDate. (Current means the current row.)
Now you know the total laptops used before the current row and the headCount for the current row. The first laptop would be
f = min(laptopid) where laptopid > x (x being the total laptops used before this row)
For the last laptop, you must also add the current headcount.
l = min(laptopid) where laptopid >= currentHeadCount + x
Note that f checks that laptopid is greater than (>) and l checks greater than or equal to (>=).
Here is a working demo which you can improve on:
Table1: tbl_ClassEvents
+----+------------+------+-----------+-----------+
| ID | date | Room | ClassName | HeadCount |
+----+------------+------+-----------+-----------+
| 1 | 30/11/2017 | 101 | Intro | 10 |
| 2 | 30/11/2017 | 102 | intro | 15 |
| 3 | 30/11/2017 | 103 | Course 2 | 5 |
| 4 | 01/12/2017 | 101 | Intro | 10 |
| 5 | 01/12/2017 | 102 | Course 2 | 15 |
| 6 | 01/12/2017 | 103 | Course 3 | 10 |
| 7 | 17/11/2017 | 101 | Intro | 16 |
+----+------------+------+-----------+-----------+
Table2: Tbl_ClassVsLaptop
+----+-----------+----------------+
| Id | ClassName | LaptopId |
+----+-----------+----------------+
| 1 | Intro | LT.Intro_1 |
| 2 | Intro | LT.Intro_2 |
| 3 | Intro | LT.Intro_3 |
| 4 | Intro | LT.Intro_4 |
| 5 | Intro | LT.Intro_5 |
| 6 | Intro | LT.Intro_6 |
| 7 | Intro | LT.Intro_7 |
| 8 | Intro | LT.Intro_8 |
| 9 | Intro | LT.Intro_9 |
| 10 | Intro | LT.Intro_10 |
| 11 | Intro | LT.Intro_11 |
| 12 | Intro | LT.Intro_12 |
| 13 | Intro | LT.Intro_13 |
| 14 | Intro | LT.Intro_14 |
| 15 | Intro | LT.Intro_15 |
| 16 | Intro | LT.Intro_16 |
| 17 | Intro | LT.Intro_17 |
| 18 | Intro | LT.Intro_18 |
| 19 | Intro | LT.Intro_19 |
| 20 | Intro | LT.Intro_20 |
| 21 | Intro | LT.Intro_21 |
| 22 | Intro | LT.Intro_22 |
| 23 | Intro | LT.Intro_23 |
| 24 | Intro | LT.Intro_24 |
| 25 | Intro | LT.Intro_25 |
| 26 | Intro | LT.Intro_26 |
| 27 | Intro | LT.Intro_27 |
| 28 | Intro | LT.Intro_28 |
| 29 | Intro | LT.Intro_29 |
| 30 | Intro | LT.Intro_30 |
| 31 | Course 2 | LT.Course 2_1 |
| 32 | Course 2 | LT.Course 2_2 |
| 33 | Course 2 | LT.Course 2_3 |
| 34 | Course 2 | LT.Course 2_4 |
| 35 | Course 2 | LT.Course 2_5 |
| 36 | Course 2 | LT.Course 2_6 |
| 37 | Course 2 | LT.Course 2_7 |
| 38 | Course 2 | LT.Course 2_8 |
| 39 | Course 2 | LT.Course 2_9 |
| 40 | Course 2 | LT.Course 2_10 |
| 41 | Course 2 | LT.Course 2_11 |
| 42 | Course 2 | LT.Course 2_12 |
| 43 | Course 2 | LT.Course 2_13 |
| 44 | Course 2 | LT.Course 2_14 |
| 45 | Course 2 | LT.Course 2_15 |
| 46 | Course 2 | LT.Course 2_16 |
| 47 | Course 2 | LT.Course 2_17 |
| 48 | Course 2 | LT.Course 2_18 |
| 49 | Course 2 | LT.Course 2_19 |
| 50 | Course 2 | LT.Course 2_20 |
| 51 | Course 2 | LT.Course 2_21 |
| 52 | Course 2 | LT.Course 2_22 |
| 53 | Course 2 | LT.Course 2_23 |
| 54 | Course 2 | LT.Course 2_24 |
| 55 | Course 2 | LT.Course 2_25 |
| 56 | Course 2 | LT.Course 2_26 |
| 57 | Course 2 | LT.Course 2_27 |
| 58 | Course 2 | LT.Course 2_28 |
| 59 | Course 2 | LT.Course 2_29 |
| 60 | Course 2 | LT.Course 2_30 |
| 61 | Course 3 | LT.Course 3_1 |
| 62 | Course 3 | LT.Course 3_2 |
| 63 | Course 3 | LT.Course 3_3 |
| 64 | Course 3 | LT.Course 3_4 |
| 65 | Course 3 | LT.Course 3_5 |
| 66 | Course 3 | LT.Course 3_6 |
| 67 | Course 3 | LT.Course 3_7 |
| 68 | Course 3 | LT.Course 3_8 |
| 69 | Course 3 | LT.Course 3_9 |
| 70 | Course 3 | LT.Course 3_10 |
| 71 | Course 3 | LT.Course 3_11 |
| 72 | Course 3 | LT.Course 3_12 |
| 73 | Course 3 | LT.Course 3_13 |
| 74 | Course 3 | LT.Course 3_14 |
| 75 | Course 3 | LT.Course 3_15 |
| 76 | Course 3 | LT.Course 3_16 |
| 77 | Course 3 | LT.Course 3_17 |
| 78 | Course 3 | LT.Course 3_18 |
| 79 | Course 3 | LT.Course 3_19 |
| 80 | Course 3 | LT.Course 3_20 |
| 81 | Course 3 | LT.Course 3_21 |
| 82 | Course 3 | LT.Course 3_22 |
| 83 | Course 3 | LT.Course 3_23 |
| 84 | Course 3 | LT.Course 3_24 |
| 85 | Course 3 | LT.Course 3_25 |
| 86 | Course 3 | LT.Course 3_26 |
| 87 | Course 3 | LT.Course 3_27 |
| 88 | Course 3 | LT.Course 3_28 |
| 89 | Course 3 | LT.Course 3_29 |
| 90 | Course 3 | LT.Course 3_30 |
+----+-----------+----------------+
Here is the query:
SELECT tbl_classEvents.ID
,tbl_classEvents.DATE
,tbl_classEvents.Room
,tbl_classEvents.ClassName
,tbl_classEvents.HeadCount
,(
SELECT min(laptopId)
FROM tbl_ClassVsLaptop T1
WHERE T1.ClassName = tbl_ClassEvents.ClassNAme
AND Mid([T1.LaptopID], InStrRev([T1.LaptopID], "_") + 1, 3) > (
+ Nz((
SELECT sum(headCount)
FROM tbl_classEvents T2
WHERE T2.ID < Tbl_ClassEvents.ID
AND T2.[DATE] = [Tbl_ClassEvents].[DATE]
AND T2.[ClassName] = [Tbl_ClassEvents].[ClassName]
), 0)
)
) AS FirstLaptop
,(
SELECT min(laptopId)
FROM tbl_ClassVsLaptop T1
WHERE T1.ClassName = tbl_ClassEvents.ClassNAme
AND Mid([T1.LaptopID], InStrRev([T1.LaptopID], "_") + 1, 3) >= (
+ [tbl_classEvents].[HeadCount] + Nz((
SELECT sum(headCount)
FROM tbl_classEvents T2
WHERE T2.ID < Tbl_ClassEvents.ID
AND T2.[DATE] = [Tbl_ClassEvents].[DATE]
AND T2.[ClassName] = [Tbl_ClassEvents].[ClassName]
), 0)
)
) AS LastLaptop
FROM tbl_classEvents
ORDER BY tbl_classEvents.DATE
,tbl_classEvents.Room
,tbl_classEvents.ClassNAme;
And the output:
+----+------------+------+-----------+-----------+---------------+----------------+
| ID | DATE | Room | ClassName | HeadCount | FirstLaptop | LastLaptop |
+----+------------+------+-----------+-----------+---------------+----------------+
| 7 | 17/11/2017 | 101 | Intro | 16 | LT.Intro_1 | LT.Intro_16 |
| 1 | 30/11/2017 | 101 | Intro | 10 | LT.Intro_1 | LT.Intro_10 |
| 2 | 30/11/2017 | 102 | intro | 15 | LT.Intro_11 | LT.Intro_25 |
| 3 | 30/11/2017 | 103 | Course 2 | 5 | LT.Course 2_1 | LT.Course 2_5 |
| 4 | 01/12/2017 | 101 | Intro | 10 | LT.Intro_1 | LT.Intro_10 |
| 5 | 01/12/2017 | 102 | Course 2 | 15 | LT.Course 2_1 | LT.Course 2_15 |
| 6 | 01/12/2017 | 103 | Course 3 | 10 | LT.Course 3_1 | LT.Course 3_10 |
+----+------------+------+-----------+-----------+---------------+----------------+
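The running-total idea behind x, f and l can also be sketched in Python. This version computes the laptop numbers arithmetically (x + 1 and x + headcount) instead of taking a MIN over the text IDs, so it does not depend on how text and numbers are compared. It assumes class names are spelled consistently ("Intro" rather than "intro"), that laptop IDs follow the `LT.<ClassName>_<n>` pattern, and that enough laptops exist; `assign_laptops` is an illustrative name only.

```python
from collections import defaultdict

# Events from the question: (id, date, room, class_name, head_count).
events = [
    (1, "30/11/2017", 101, "Intro", 10),
    (2, "30/11/2017", 102, "Intro", 15),
    (3, "30/11/2017", 103, "Course 2", 5),
    (4, "01/12/2017", 101, "Intro", 10),
    (5, "01/12/2017", 102, "Course 2", 15),
    (6, "01/12/2017", 103, "Course 3", 10),
]

def assign_laptops(events):
    """Return (id, first_laptop, last_laptop) for each event, in id order."""
    used = defaultdict(int)  # (date, class) -> laptops handed out so far
    assignments = []
    for event_id, date, _room, cls, head_count in sorted(events):
        x = used[(date, cls)]  # total headcount of earlier rows
        first = f"LT.{cls}_{x + 1}"
        last = f"LT.{cls}_{x + head_count}"
        assignments.append((event_id, first, last))
        used[(date, cls)] += head_count
    return assignments

assignments = assign_laptops(events)
for a in assignments:
    print(a)
```

The sketch does not check that x + headcount stays within the 30 laptops available per course; a real schedule would need that guard.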

Creating a SSRS report from 2 Tables

I have two tables in a SQL Server database.
Here's my first table, Table1:
+------------+------------------+----------------+
| Project ID | Project Manager | Approved Hours |
+------------+------------------+----------------+
| 1 | Mr.A | 120 |
| 2 | Mr.B | 100 |
+------------+------------------+----------------+
Here's my second table, Table2:
+-----------+-----------------+-----------+----------+---------------+
| ProjectID | Project Manager | Personnel | Week No. | Working Hours |
+-----------+-----------------+-----------+----------+---------------+
| 1 | Mr.A | Tom | 1 | 20 |
| 1 | Mr.A | Tom | 2 | 20 |
| 1 | Mr.A | Tom | 3 | 10 |
| 1 | Mr.A | Harry | 1 | 20 |
| 1 | Mr.A | Harry | 2 | 20 |
| 1 | Mr.A | Harry | 3 | 20 |
| 2 | Mr.B | Tom | 1 | 20 |
| 2 | Mr.B | Tom | 2 | 10 |
| 2 | Mr.B | Tom | 3 | 20 |
| 2 | Mr.B | Harry | 1 | 20 |
| 2 | Mr.B | Harry | 2 | 15 |
+-----------+-----------------+-----------+----------+---------------+
I would like to create an SSRS report that looks like this. I'm using the 2012 version.
Actual Hours is the sum of Working Hours for each project.
+------------+-----------------+----------------+--------------+
| Project ID | Project Manager | Approved Hours | Actual Hours |
+------------+-----------------+----------------+--------------+
| 1 | Mr.A | 120 | 110 |
| 2 | Mr.B | 100 | 85 |
+------------+-----------------+----------------+--------------+
I'm kind of new to SQL. Can I get this done with a single query?
As @jarlh suggests, simply do an INNER JOIN and GROUP BY as below:
SELECT T.[Project ID],
T.[Project Manager],
T.[Approved Hours],
SUM(T1.[Working Hours]) [Actual Hours]
FROM Table1 T
INNER JOIN Table2 T1 ON T.[Project ID] = T1.[ProjectID]
GROUP BY T.[Project ID],
T.[Project Manager],
T.[Approved Hours];
Result :
+------------+-----------------+----------------+--------------+
| Project ID | Project Manager | Approved Hours | Actual Hours |
+------------+-----------------+----------------+--------------+
| 1 | Mr.A | 120 | 110 |
| 2 | Mr.B | 100 | 85 |
+------------+-----------------+----------------+--------------+
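The same join-and-sum can be checked by hand with a short Python sketch over the sample rows. The function name `actual_hours_report` is illustrative only. Projects with no rows in the second table are left out, mirroring the INNER JOIN.

```python
from collections import defaultdict

# Table1: (project_id, project_manager, approved_hours)
approved = [(1, "Mr.A", 120), (2, "Mr.B", 100)]

# Table2: (project_id, personnel, week_no, working_hours)
worked = [
    (1, "Tom", 1, 20), (1, "Tom", 2, 20), (1, "Tom", 3, 10),
    (1, "Harry", 1, 20), (1, "Harry", 2, 20), (1, "Harry", 3, 20),
    (2, "Tom", 1, 20), (2, "Tom", 2, 10), (2, "Tom", 3, 20),
    (2, "Harry", 1, 20), (2, "Harry", 2, 15),
]

def actual_hours_report(approved, worked):
    """Join on project id and sum the working hours per project."""
    totals = defaultdict(int)
    for project_id, _person, _week, hours in worked:
        totals[project_id] += hours
    # INNER JOIN semantics: skip projects without any worked rows.
    return [(pid, manager, hours, totals[pid])
            for pid, manager, hours in approved if pid in totals]

report = actual_hours_report(approved, worked)
for line in report:
    print(line)
```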

Series of conditional table and cell references

I have a reference table as such in Sheet2 of my workbook
|Score 1| | |Score 2 | | |
----------------------------------------------------------
| name | min | max | target | min | max | target |
----------------------------------------------------------
| jeff | 30 | 40 | 35 | 45 | 55 | 50 |
----------------------------------------------------------
| steve | 35 | 45 | 40 | 45 | 65 | 55 |
Then in Sheet1 I have a list of scores for each name, like this:
| jeff | 1 | | | | steve | 3 | | |
------------------------------------------------------------
| jeff | 2 | | | | steve | 2 | | |
------------------------------------------------------------
| jeff | 2 | | | | steve | 3 | | |
------------------------------------------------------------
| jeff | 3 | | | | steve | 3 | | |
------------------------------------------------------------
| jeff | 1 | | | | steve | 2 | | |
------------------------------------------------------------
I am aware of simple lookups and offsetting values, but I can't think of a way to do multiple references on different levels. Is there a way to have a function in Sheet1, next to the scores, that looks up the score, then who the score is for, and then prints the corresponding min, max and target values for that person with that score?
So if it sees 1 and then jeff, it returns 30 | 40 | 35 in the next 3 boxes. I would do this manually but the list is very long and is populated daily by an existing macro.
Use VLOOKUP with the name (jeff) as the lookup value, and use the score (1) to calculate the index of the column to return.
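The column arithmetic can be sketched in Python. It assumes the Sheet2 layout shown above: the name in the first column, followed by one (min, max, target) triple per score level. Under that assumption the triple for score s starts at 0-based position 3 * (s - 1) + 1, which corresponds to 1-based column 3 * (s - 1) + 2 in VLOOKUP terms. The function name `lookup_limits` is hypothetical.

```python
# Reference table from Sheet2: name, then (min, max, target) per score level.
reference = [
    ["jeff", 30, 40, 35, 45, 55, 50],
    ["steve", 35, 45, 40, 45, 65, 55],
]

def lookup_limits(name, score):
    """Return (min, max, target) for a name and score, or None if missing."""
    first_col = 3 * (score - 1) + 1  # 0-based start of the score's triple
    for row in reference:
        if row[0] == name:
            values = tuple(row[first_col:first_col + 3])
            # A score beyond the table width yields fewer than 3 values.
            return values if len(values) == 3 else None
    return None  # name not found

jeff_score_1 = lookup_limits("jeff", 1)
steve_score_2 = lookup_limits("steve", 2)
print(jeff_score_1, steve_score_2)
```

Only Score 1 and Score 2 appear in the reference table shown, so a score of 3 returns None here until the table is extended.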