Calculate number of cells in row above threshold

I have a table like this:
Student_ID | Year | Math Grade | English Grade
----------------------------------------------
1 | 2009 | 90 | 92
2 | 2009 | 80 | 95
1 | 2010 | 75 | 85
I want to calculate the number of grades over 90 that a student got each year. The desired output for the above table is:
Student_ID | Year | Math Grade | English Grade | Grades Above 90
----------------------------------------------------------------
1 | 2009 | 91 | 92 | 2
2 | 2009 | 80 | 95 | 1
1 | 2010 | 75 | 85 | 0

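You should do this using a case expression:
select ((case when math_grade > 90 then 1 else 0 end) + (case when english_grade > 90 then 1 else 0 end)
) as grades_above_90
The problem with using division is that it stops working once a grade can reach twice the threshold (for grades out of 100, a threshold of 50 or less), because integer division then returns 2 or more for a single grade.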

Figured out a pretty simple solution for this. I added to the end of my SELECT statement:
SELECT math_grade/90 + english_grade/90 AS grades_above_90
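To compare the two approaches above side by side, here is a minimal sketch using Python's built-in sqlite3 module. The question does not name a database, so SQLite and the table name `grades` are assumptions for illustration only; the rows are the three from the question's input table.

```python
import sqlite3

# In-memory SQLite database; the table name "grades" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grades (student_id INTEGER, year INTEGER, "
    "math_grade INTEGER, english_grade INTEGER)"
)
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?)",
    [(1, 2009, 90, 92), (2, 2009, 80, 95), (1, 2010, 75, 85)],
)

# CASE approach: counts grades strictly above the threshold.
case_rows = conn.execute(
    """
    SELECT student_id, year,
           (CASE WHEN math_grade > :t THEN 1 ELSE 0 END)
         + (CASE WHEN english_grade > :t THEN 1 ELSE 0 END) AS grades_above
    FROM grades
    ORDER BY year, student_id
    """,
    {"t": 90},
).fetchall()

# Division approach: relies on integer division, so a grade equal to
# the threshold is counted as well (90 / 90 = 1).
div_rows = conn.execute(
    """
    SELECT student_id, year, math_grade / :t + english_grade / :t
    FROM grades
    ORDER BY year, student_id
    """,
    {"t": 90},
).fetchall()

# With a low threshold the division approach over-counts:
# 90 / 40 = 2 and 92 / 40 = 2, so two grades are reported as 4.
low_threshold_div = conn.execute(
    "SELECT math_grade / 40 + english_grade / 40 FROM grades "
    "WHERE student_id = 1 AND year = 2009"
).fetchone()[0]

print(case_rows)
print(div_rows)
print(low_threshold_div)
```

Note that the two approaches already disagree for the grade of exactly 90: the CASE version treats it as not above the threshold, while the division version counts it.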

Related

Generate 'average' column from sub query and ROW_NUMBER window function in SQL SELECT

I have the following SQL Server tables (with sample data):
Questionnaire
id | coachNodeId | youngPersonNodeId | complete
1 | 12 | 678 | 1
2 | 12 | 52 | 1
3 | 30 | 99 | 1
4 | 12 | 678 | 1
5 | 12 | 678 | 1
6 | 30 | 99 | 1
7 | 12 | 52 | 1
8 | 30 | 102 | 1
Answer
id | questionnaireId | score
1 | 1 | 1
2 | 2 | 3
3 | 2 | 2
4 | 2 | 5
5 | 3 | 5
6 | 4 | 5
7 | 4 | 3
8 | 5 | 4
9 | 6 | 1
10 | 6 | 3
11 | 7 | 5
12 | 8 | 5
ContentNode
id | text
12 | Zak
30 | Phil
52 | Jane
99 | Ali
102 | Ed
678 | Chris
I have the following T-SQL query:
SELECT
Questionnaire.id AS questionnaireId,
coachNodeId AS coachNodeId,
coachNode.[text] AS coachName,
youngPersonNodeId AS youngPersonNodeId,
youngPersonNode.[text] AS youngPersonName,
ROW_NUMBER() OVER (PARTITION BY Questionnaire.coachNodeId, Questionnaire.youngPersonNodeId ORDER BY Questionnaire.id) AS questionnaireNumber,
score = (SELECT AVG(score) FROM Answer WHERE Answer.questionnaireId = Questionnaire.id)
FROM
Questionnaire
LEFT JOIN
ContentNode AS coachNode ON Questionnaire.coachNodeId = coachNode.id
LEFT JOIN
ContentNode AS youngPersonNode ON Questionnaire.youngPersonNodeId = youngPersonNode.id
WHERE
(complete = 1)
ORDER BY
coachNodeId, youngPersonNodeId
This query outputs the following example data:
questionnaireId | coachNodeId | coachName | youngPersonNodeId | youngPersonName | questionnaireNumber | score
1 | 12 | Zak | 678 | Chris | 1 | 1
2 | 12 | Zak | 52 | Jane | 1 | 3
3 | 30 | Phil | 99 | Ali | 1 | 5
4 | 12 | Zak | 678 | Chris | 2 | 4
5 | 12 | Zak | 678 | Chris | 3 | 4
6 | 30 | Phil | 99 | Ali | 2 | 2
7 | 12 | Zak | 52 | Jane | 2 | 5
8 | 30 | Phil | 102 | Ed | 1 | 5
To explain what's happening here… There are various coaches whose job is to undertake questionnaires with various young people, and log the scores. A coach might, at a later date, repeat the questionnaire with the same young person several times, hoping that they get a better score. The ultimate goal of what I'm trying to achieve is that the managers of the coaches want to see how well the coaches are performing, so they'd like to see whether the scores for the questionnaires tend to go up or not. The window function represents a way to establish how many times the questionnaire has been undertaken by the same coach/young person combo.
I need to be able to determine the average score based on the questionnaire number. So for example, the coach 'Zak' logged scores of '1' and '3' for his first questionnaires (where questionnaireNumber = 1) so the average would be 2. For his second questionnaires (where questionnaireNumber = 2) the scores were '3' and '5' so the average would be 4. So in analysing this data we know that over time Zak's questionnaire scores have improved from an average of '2' the first time to an average of '4' the second time.
I feel like the query needs to be grouped by the coachNodeId and questionnaireNumber values, so it would output something like this (I've omitted the questionnaireId, youngPersonNodeId, youngPersonName and score columns: they are only used to derive the averageScore and wouldn't be useful the way the results are grouped):
coachNodeId | coachName | questionnaireNumber | averageScore
12 | Zak | 1 | 2 (calculation: (1 + 3) / 2)
12 | Zak | 2 | 4 (calculation: (3 + 5) / 2)
12 | Zak | 3 | 4 (only one value: 4)
30 | Phil | 1 | 5 (calculation: (5 + 5) / 2)
30 | Phil | 2 | 2 (only one value: 2)
Could anyone suggest how I can modify my query to output the average scores based on the score from the sub-query and the ROW_NUMBER window function? I've hit the limits of my SQL skills!
Many thanks.

It is a bit hard to tell without sample data, but I think you are describing aggregation:
SELECT q.coachNodeId AS coachNodeId,
cn.[text] AS coachName,
q.youngPersonNodeId AS youngPersonNodeId,
ypn.[text] AS youngPersonName,
AVG(score)
FROM Questionnaire q JOIN
ContentNode cn
ON q.coachNodeId = cn.id JOIN
ContentNode ypn
ON q.youngPersonNodeId = ypn.id LEFT JOIN
Answer a
ON a.questionnaireId = q.id
WHERE complete = 1
GROUP BY q.coachNodeId, cn.[text],
q.youngPersonNodeId, ypn.[text]
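If the goal is the grouping by coach and questionnaire number described in the question, one possible shape is to compute the ROW_NUMBER and the per-questionnaire score in a CTE first, then aggregate over it. The sketch below is not the answer's query; it runs against SQLite through Python's sqlite3 module (an assumption made so it is runnable here, and it needs SQLite 3.25 or later for window functions), using the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Questionnaire (id INTEGER, coachNodeId INTEGER,
                                youngPersonNodeId INTEGER, complete INTEGER);
    CREATE TABLE Answer (id INTEGER, questionnaireId INTEGER, score INTEGER);
    CREATE TABLE ContentNode (id INTEGER, text TEXT);
    INSERT INTO Questionnaire VALUES
        (1,12,678,1),(2,12,52,1),(3,30,99,1),(4,12,678,1),
        (5,12,678,1),(6,30,99,1),(7,12,52,1),(8,30,102,1);
    INSERT INTO Answer VALUES
        (1,1,1),(2,2,3),(3,2,2),(4,2,5),(5,3,5),(6,4,5),
        (7,4,3),(8,5,4),(9,6,1),(10,6,3),(11,7,5),(12,8,5);
    INSERT INTO ContentNode VALUES
        (12,'Zak'),(30,'Phil'),(52,'Jane'),(99,'Ali'),(102,'Ed'),(678,'Chris');
    """
)

rows = conn.execute(
    """
    WITH numbered AS (
        -- One row per questionnaire: its sequence number within the
        -- coach/young person pair and its average answer score.
        SELECT q.coachNodeId,
               ROW_NUMBER() OVER (
                   PARTITION BY q.coachNodeId, q.youngPersonNodeId
                   ORDER BY q.id
               ) AS questionnaireNumber,
               (SELECT AVG(a.score) FROM Answer a
                 WHERE a.questionnaireId = q.id) AS score
        FROM Questionnaire q
        WHERE q.complete = 1
    )
    SELECT n.coachNodeId, cn.text AS coachName,
           n.questionnaireNumber, AVG(n.score) AS averageScore
    FROM numbered n
    JOIN ContentNode cn ON cn.id = n.coachNodeId
    GROUP BY n.coachNodeId, cn.text, n.questionnaireNumber
    ORDER BY n.coachNodeId, n.questionnaireNumber
    """
).fetchall()

for row in rows:
    print(row)
```

SQLite's AVG returns a floating-point value, whereas SQL Server's AVG over an int column returns a truncated int, so the figures here will not match integer-rounded hand calculations exactly.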

SQL query issue and error regarding GROUP BY clause

I am trying to calculate the total number of projects in each year, and also how many of them are active, how many are cancelled, and so on.
I tried a GROUP BY clause on the project dates to get the total number of projects per year, but I am not sure where to start or what to do.
Select ts.Id as projectid ,
--a.ParentObjectId,
ts.RequestName as ProjectDates,
ts.Type,
ts.Category,
ts.SubType,
ts.status as projectstatus,
Count (ts.ReceptionDate),
cast (ts.ReceptionDate as Date) as ReceptionDate,
from [rpt].[TransmissionServicesRpt] ts
left join [dbo].[AuditHistory] a on a.ParentObjectId = ts.Id
Left join [dbo].[User] u on a.CreatedById = u.id
Group by ts.id, ts.ReceptionDate
The output I am after looks like this:
+ -------------+--------+-----------+------------+----------+-----------------+
| New Projects | Active | Cancelled | Terminated | Inactive | Carried Forward |
+ -------------+--------+-----------+------------+----------+-----------------+
| 2013 | 32 | 45 | 4 | 11 | 30 |
| 2014 | 45 | 75 | 17 | 14 | 44 |
| 2015 | 46 | 90 | 25 | 21 | 44 |
| 2016 | 30 | 74 | 27 | 10 | 37 |
| 2017 | 82 | 119 | 11 | 26 | 82 |
| 2018 | 86 | 168 | 29 | 24 | 115 |
| 2019 | 23 | 138 | 9 | 4 | 125 |
+ -------------+--------+-----------+------------+----------+-----------------+

You want one result row per year. So group by year. You get it via YEAR or DATEPART. Then count conditionally:
select
year(receptiondate) as year,
count(*) as total,
count(case when status = 'Active' then 1 end) as active,
count(case when status = 'Cancelled' then 1 end) as cancelled,
count(case when status = 'Terminated' then 1 end) as terminated,
count(case when status = 'Inactive' then 1 end) as inactive,
count(case when status = 'Carried Forward' then 1 end) as carried_forward
from rpt.transmissionservicesrpt
group by year(receptiondate)
order by year(receptiondate);
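The conditional-count pattern above can be tried out on a few rows. This is a minimal sketch with Python's sqlite3 module; the table name `projects` and all five rows are invented for illustration, and SQLite's `strftime('%Y', ...)` stands in for SQL Server's `YEAR()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and made-up rows, only to exercise the query shape.
conn.execute("CREATE TABLE projects (id INTEGER, status TEXT, receptiondate TEXT)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        (1, "Active", "2018-03-01"),
        (2, "Cancelled", "2018-07-15"),
        (3, "Active", "2019-01-20"),
        (4, "Active", "2019-05-02"),
        (5, "Terminated", "2019-11-30"),
    ],
)

rows = conn.execute(
    """
    SELECT strftime('%Y', receptiondate) AS year,
           COUNT(*) AS total,
           -- CASE without ELSE yields NULL, and COUNT skips NULLs,
           -- so each column counts only the matching status.
           COUNT(CASE WHEN status = 'Active' THEN 1 END) AS active,
           COUNT(CASE WHEN status = 'Cancelled' THEN 1 END) AS cancelled,
           COUNT(CASE WHEN status = 'Terminated' THEN 1 END) AS terminated
    FROM projects
    GROUP BY strftime('%Y', receptiondate)
    ORDER BY strftime('%Y', receptiondate)
    """
).fetchall()

for row in rows:
    print(row)
```

The key point is that COUNT ignores NULLs, which is why the CASE expressions deliberately have no ELSE branch.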

Distribute sequential SQL results evenly based on count

I have SQL results that I need to break into item ranges and the count distributed evenly across a number of tasks. What is a good way to do this?
My data looks like this.
+------+-------+----------+
| Item | Count | ItmGroup |
+------+-------+----------+
| 1A | 100 | 1 |
| 1B | 25 | 1 |
| 1C | 2 | 1 |
| 1D | 6 | 1 |
| 2A | 88 | 2 |
| 2B | 10 | 2 |
| 2C | 122 | 2 |
| 2D | 12 | 2 |
| 3A | 4 | 3 |
| 3B | 103 | 3 |
| 3C | 1 | 3 |
| 3D | 22 | 3 |
| 4A | 55 | 4 |
| 4B | 42 | 4 |
| 4C | 100 | 4 |
| 4D | 1 | 4 |
+------+-------+----------+
Item = the item code.
Count = in this context, a measure of the popularity of the item. This can be used to RANK items if need be.
ItmGroup = the parent value for the Item column. Each Item is contained in a group.
What differentiates this from other similar questions I've viewed is that the ranges I need to determine cannot be taken out of the order they show in this table. We can do an Item range from A1 to B3, in other words, ranges can cross over ItmGroups, but they must remain in alphanumeric order by Item.
The expected result would be item ranges that evenly distribute the total count.
+--------+--------+----------+
| FrItem | ToItem | TotCount |
+--------+--------+----------+
| 1A     | 2D     | 134      |
| 3A     | 3D     | 130      |
(etc)

Provided you're happy with a rough estimate, this will split the data into two groups.
The first group will always have as many records as possible, but no more than half of the total count (and group 2 will have the rest).
WITH
cumulative AS
(
SELECT
*,
SUM([Count]) OVER (ORDER BY Item) AS cumulativeCount,
SUM([Count]) OVER () AS totalCount
FROM
yourData
)
SELECT
MIN(item) AS frItem,
MAX(item) AS toItem,
SUM([Count]) AS TotCount
FROM
cumulative
GROUP BY
CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
ORDER BY
CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
To split the data into 5 portions, it's similar...
GROUP BY
CASE WHEN cumulativeCount <= totalCount * 1/5 THEN 0
WHEN cumulativeCount <= totalCount * 2/5 THEN 1
WHEN cumulativeCount <= totalCount * 3/5 THEN 2
WHEN cumulativeCount <= totalCount * 4/5 THEN 3
ELSE 4 END
Depending on your data, this isn't necessarily ideal:
Item | Count | GroupAsDefinedAbove | IdealGroup
-----+-------+---------------------+-----------
1A   | 4     | 1                   | 1
2A   | 5     | 2                   | 1
3A   | 8     | 2                   | 2
If you want something that can get the two groups as close in size as possible, that's a lot more complex.

Same as the accepted answer, except that it declares a batch count and adds a BatchNo column to the SELECT inside the cumulativeCte CTE to prevent a remainder.
DECLARE @BatchCount NUMERIC(4,2) = 5.00;
WITH
cumulativeCte AS
(
SELECT
*,
SUM(r.[Count]) OVER (ORDER BY Item) AS cumulativeCount,
SUM(r.[Count]) OVER () AS totalCount
,CEILING(SUM(r.[Count]) OVER (ORDER BY r.Item ASC) / (SUM(r.[Count]) OVER () / @BatchCount)) AS BatchNo
FROM
records r
)
SELECT
MIN(c.Item) AS frItem,
MAX(c.Item) AS toItem,
SUM(c.[Count]) AS TotCount,
c.BatchNo
FROM
cumulativeCte c
GROUP BY
c.BatchNo
ORDER BY
c.BatchNo
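To see what the batch-number idea produces on the question's sample data, here is a rough sketch run against SQLite via Python's sqlite3 module (an assumption; the answers target SQL Server). The column is named `Cnt` instead of `Count`, and because CEILING is not available in every SQLite build, the ceiling is computed with integer arithmetic instead. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (Item TEXT, Cnt INTEGER, ItmGroup INTEGER)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("1A", 100, 1), ("1B", 25, 1), ("1C", 2, 1), ("1D", 6, 1),
        ("2A", 88, 2), ("2B", 10, 2), ("2C", 122, 2), ("2D", 12, 2),
        ("3A", 4, 3), ("3B", 103, 3), ("3C", 1, 3), ("3D", 22, 3),
        ("4A", 55, 4), ("4B", 42, 4), ("4C", 100, 4), ("4D", 1, 4),
    ],
)

BATCHES = 5  # number of ranges to produce

rows = conn.execute(
    """
    WITH cumulative AS (
        SELECT Item, Cnt,
               SUM(Cnt) OVER (ORDER BY Item) AS cumulativeCount,
               SUM(Cnt) OVER () AS totalCount
        FROM records
    ),
    batched AS (
        -- Integer form of CEILING(cumulativeCount * n / totalCount).
        SELECT Item, Cnt,
               (cumulativeCount * :n + totalCount - 1) / totalCount AS BatchNo
        FROM cumulative
    )
    SELECT MIN(Item) AS frItem, MAX(Item) AS toItem,
           SUM(Cnt) AS TotCount, BatchNo
    FROM batched
    GROUP BY BatchNo
    ORDER BY BatchNo
    """,
    {"n": BATCHES},
).fetchall()

for row in rows:
    print(row)
```

On this data the five ranges are not equal in size, which matches the earlier caveat that this kind of split is only a rough estimate: each item goes into the batch where its cumulative total lands.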

How to use previous row's column's value for calculating the next row's column's value

I have a table
Id | Aisle | OddEven | Bay | Size | Y-Axis
3 | A1 | Even | 14 | 10 | 100
1 | A1 | Even | 16 | 10 |
6 | A1 | Even | 20 | 10 |
12 | A1 | Even | 26 | 5 | 150
10 | A1 | Even | 28 | 5 |
11 | A1 | Even | 32 | 5 |
2 | A1 | Odd | 13 | 10 | 100
5 | A1 | Odd | 17 | 10 |
4 | A1 | Odd | 19 | 10 |
9 | A1 | Odd | 23 | 5 | 150
7 | A1 | Odd | 25 | 5 |
8 | A1 | Odd | 29 | 5 |
I want it to look like this:
Id | Aisle | OddEven | Bay | Size | Y-Axis
1 | A1 | Even | 14 | 10 | 100
2 | A1 | Even | 16 | 10 | 110
3 | A1 | Even | 20 | 10 | 120
4 | A1 | Even | 26 | 5 | 150
5 | A1 | Even | 28 | 5 | 155
6 | A1 | Even | 32 | 5 | 160
7 | A1 | Odd | 13 | 10 | 100
8 | A1 | Odd | 17 | 10 | 110
9 | A1 | Odd | 19 | 10 | 120
10 | A1 | Odd | 23 | 5 | 150
11 | A1 | Odd | 25 | 5 | 155
12 | A1 | Odd | 29 | 5 | 160
I need a select query and an update query. Some Y-Axis values are already filled in (at the start of each Odd/Even run). For every other row, I need to take the previous row's Y-Axis value and add the current row's Size to get the current row's Y-Axis. This continues until a row that already has a Y-Axis value is reached; that row skips the calculation, and the next row builds on its value.
My thinking process is this:
Id will definitely be used; however, the Id is not sequential, as my example shows,
so I need to have
ROW_NUMBER() OVER (PARTITION BY Aisle,OddEven,Bay ORDER BY Aisle,OddEven,Bay)
Then some kind of JOIN of the table to itself, with ON T1.RN = T2.RN - 1.
Where I am stuck is that the first row has no previous value, but the query will still try to update it.
If anyone has an idea for a SELECT and an UPDATE on SQL Server 2008, it would be greatly appreciated! Thanks.

You seem to want a cumulative sum. This would be easier in SQL Server 2012+. You can do this in SQL Server 2008 using outer apply:
select t.*, cume_value
from t outer apply
(select sum(size) + sum(yaxis) as cume_value
from t t2
where t2.aisle = t.aisle and t2.oddeven = t.oddeven and
t2.bay < t.bay
) t2;

A little more difficult on 2008, but I think this is what you are looking for
Declare @Table table (Id int,Aisle varchar(25),OddEven varchar(25),Bay int,Size int,[Y-Axis] int)
Insert Into @Table values
(3,'A1','Even',14,10 ,100),
(1,'A1','Even',16,10 ,0),
(6,'A1','Even',20,10 ,0),
(12,'A1','Even',26,5,150),
(10,'A1','Even',28,5,0),
(11,'A1','Even',32,5,0),
(2,'A1','Odd',13,10 ,100),
(5,'A1','Odd',17,10 ,0),
(4,'A1','Odd',19,10 ,0),
(9,'A1','Odd',23,5,150),
(7,'A1','Odd',25,5,0),
(8,'A1','Odd',29,5,0)
;with cteBase as (
Select *
,IDNew=Row_Number() over (Order By Aisle,Bay)
,RowNr=Row_Number() over (Order By Aisle,OddEven,Bay)
From @Table
)
, cteGroup as (Select TmpRowNr=RowNr,GrpNr=Row_Number() over (Order By RowNr) from cteBase where [Y-Axis]>0)
, cteFinal as (
Select A.*
,GrpNr = (Select max(GrpNr) from cteGroup Where TmpRowNr<=RowNr)
From cteBase A
)
Select ID=Row_Number() over (Order By A.OddEven,A.Bay)
,A.Aisle
,A.OddEven
,A.Bay
,A.Size
,[Y-Axis] = Sum(case when B.[Y-Axis]>0 then B.[Y-Axis] else B.Size end)
From cteFinal A
Join cteFinal B on (B.RowNr<=A.RowNr and A.GrpNr=B.GrpNr)
Group By
A.IDNew
,A.Aisle
,A.OddEven
,A.Bay
,A.Size
Order By A.OddEven,A.Bay
Returns
ID Aisle OddEven Bay Size Y-Axis
1 A1 Even 14 10 100
2 A1 Even 16 10 110
3 A1 Even 20 10 120
4 A1 Even 26 5 150
5 A1 Even 28 5 155
6 A1 Even 32 5 160
7 A1 Odd 13 10 100
8 A1 Odd 17 10 110
9 A1 Odd 19 10 120
10 A1 Odd 23 5 150
11 A1 Odd 25 5 155
12 A1 Odd 29 5 160

I have to leave my computer, so I'll leave the update query to you; it should be easy to derive from here.
Below is the select query:
select row_number() over (order by oddeven,bay) id,
Aisle,
OddEven,
Bay,
Size,
max(ISNULL([Y-Axis],0)) over (partition by Aisle, OddEven,Size order by bay)
+ sum(CASE WHEN [Y-Axis] is null THEN Size ELSE 0 END) over (partition by Aisle,OddEven,size order by Bay) as [Y-Axis]
from oddseven
order by id
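A variation on the query above, sketched here under stated assumptions, derives the groups from the pre-filled Y-Axis rows themselves instead of partitioning by Size: a running COUNT of non-NULL Y-Axis values gives every seed row and its followers the same group number. It runs against SQLite via Python's sqlite3 module (SQLite 3.25 or later); the table name `bays` and the column name `YAxis` are made up. Windowed aggregates with ORDER BY are not available on SQL Server 2008, so this is only relevant for 2012 and later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bays (Id INTEGER, Aisle TEXT, OddEven TEXT, "
    "Bay INTEGER, Size INTEGER, YAxis INTEGER)"
)
conn.executemany(
    "INSERT INTO bays VALUES (?, ?, ?, ?, ?, ?)",
    [
        (3, "A1", "Even", 14, 10, 100), (1, "A1", "Even", 16, 10, None),
        (6, "A1", "Even", 20, 10, None), (12, "A1", "Even", 26, 5, 150),
        (10, "A1", "Even", 28, 5, None), (11, "A1", "Even", 32, 5, None),
        (2, "A1", "Odd", 13, 10, 100), (5, "A1", "Odd", 17, 10, None),
        (4, "A1", "Odd", 19, 10, None), (9, "A1", "Odd", 23, 5, 150),
        (7, "A1", "Odd", 25, 5, None), (8, "A1", "Odd", 29, 5, None),
    ],
)

rows = conn.execute(
    """
    WITH grouped AS (
        -- COUNT(YAxis) ignores NULLs, so the running count only
        -- increases on rows that already carry a Y-Axis value.
        SELECT *,
               COUNT(YAxis) OVER (
                   PARTITION BY Aisle, OddEven ORDER BY Bay
               ) AS grp
        FROM bays
    )
    SELECT Aisle, OddEven, Bay, Size,
           -- Seed value of the group plus the running total of Size
           -- over the rows that had no Y-Axis of their own.
           MAX(YAxis) OVER (PARTITION BY Aisle, OddEven, grp)
         + SUM(CASE WHEN YAxis IS NULL THEN Size ELSE 0 END)
               OVER (PARTITION BY Aisle, OddEven, grp ORDER BY Bay) AS NewYAxis
    FROM grouped
    ORDER BY OddEven, Bay
    """
).fetchall()

for row in rows:
    print(row)
```

Compared with partitioning by Size, this version does not depend on the Size value changing exactly where a new pre-filled Y-Axis starts.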

SQL Server 2008 - accumulating column

I would like to accumulate my data as shown below. The source table is Table 1.
What is the best query to do this?
Is it possible to do this dynamically, so that it still works when I add more types of terms?
Table 1
ID | term | value
-----------------------
1 | I | 100
2 | I | 200
3 | II | 100
4 | II | 50
5 | II | 75
6 | III | 50
7 | III | 65
8 | IV | 30
9 | IV | 45
And the result should be like below:
YTD | Acc Value
------------------
I-I | 300
I-II | 525
I-III| 640
I-IV | 715
Thanks

select
(select min(term) from yourtable ) +'-'+term,
(select sum(value) from yourtable t1 where t1.term<=t.term)
from yourtable t
group by term
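For reference, here is the same correlated-subquery approach as a runnable sketch on the question's data, using SQLite through Python's sqlite3 module (an assumption; `||` replaces SQL Server's `+` for string concatenation). One caveat worth labelling: `t1.term <= t.term` compares the terms as strings, which happens to order I, II, III, IV correctly but would not hold for every Roman numeral (for example, 'IX' sorts before 'V').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (ID INTEGER, term TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        (1, "I", 100), (2, "I", 200), (3, "II", 100), (4, "II", 50),
        (5, "II", 75), (6, "III", 50), (7, "III", 65), (8, "IV", 30),
        (9, "IV", 45),
    ],
)

rows = conn.execute(
    """
    SELECT (SELECT MIN(term) FROM yourtable) || '-' || t.term AS ytd,
           -- Running total: every value whose term sorts at or
           -- before the current term (string comparison).
           (SELECT SUM(value) FROM yourtable t1
             WHERE t1.term <= t.term) AS acc_value
    FROM yourtable t
    GROUP BY t.term
    ORDER BY t.term
    """
).fetchall()

for row in rows:
    print(row)
```

If more terms are added later, a numeric ordering column for the terms would be a safer basis for the comparison than the term text itself.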
