Counting Only Most Recent Entries Matching Multiple Conditions - sql

I have a database table similar to this (but with many more entries):
PupilId | PeriodId | Assessment
-------------------------------
1 | 10 | 7
1 | 30 | 7
1 | 50 | 7
2 | 20 | 7
3 | 10 | 7
3 | 20 | 8
I want to find the number of pupils (i.e. distinct PupilId) who got a given assessment at some point up to and including a given PeriodId. Only the most recent assessment before or on the given PeriodId should be used.
For instance:
Number of pupils who got 7 on or before PeriodId 100 = 2 (PupilId 1 and 2)
Number of pupils who got a 7 on or before PeriodId 10 = 2 (PupilId 1 and 3)
Number of pupils who got 8 on or before PeriodId 30 = 1 (PupilId 3)
This is for SQL Azure.
Many thanks.

OK, no answers so here's what I came up with after help from another source:
SELECT COUNT(1)
FROM (
SELECT PupilId AS pupil_id, Max(PeriodId) AS max_period
FROM steph1
WHERE PeriodId <= 100
GROUP BY PupilId
) steph2, steph1
WHERE
PupilId=pupil_id AND
max_period = PeriodId AND
Assessment = 7
Hope that helps somebody else with the same issue.
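For anyone who wants to check the approach against the sample rows, here is a minimal sketch. It assumes an in-memory SQLite database standing in for SQL Azure, keeps the steph1 table name from the query above, and rewrites the comma join as an explicit JOIN; count_pupils is a hypothetical helper name used only for illustration.

```python
import sqlite3

# In-memory SQLite is an assumption here; it stands in for SQL Azure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE steph1 (PupilId INTEGER, PeriodId INTEGER, Assessment INTEGER)")
conn.executemany(
    "INSERT INTO steph1 VALUES (?, ?, ?)",
    [(1, 10, 7), (1, 30, 7), (1, 50, 7), (2, 20, 7), (3, 10, 7), (3, 20, 8)],
)


def count_pupils(assessment, max_period):
    """Count pupils whose latest assessment on or before max_period equals assessment."""
    sql = """
        SELECT COUNT(*)
        FROM steph1 AS a
        JOIN (
            -- latest period per pupil, ignoring anything after the cutoff
            SELECT PupilId AS pupil_id, MAX(PeriodId) AS max_period
            FROM steph1
            WHERE PeriodId <= ?
            GROUP BY PupilId
        ) AS latest
          ON a.PupilId = latest.pupil_id
         AND a.PeriodId = latest.max_period
        WHERE a.Assessment = ?
    """
    return conn.execute(sql, (max_period, assessment)).fetchone()[0]


# The three cases from the question.
print(count_pupils(7, 100))
print(count_pupils(7, 10))
print(count_pupils(8, 30))
```

The subquery picks each pupil's latest period first and only then filters on the assessment, which is what keeps pupil 3 (who moved from 7 to 8) out of the first count.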

Related

Find records which have multiple occurrences in another table array (postgres)

I have a table that stores records as arrays, and another table that stores single string records. I want to get the records that occur multiple times in the other table. The tables are as follows:
Vehicle
veh_id | vehicle_types
-------+---------------------------------------
1 | {"byd_tang","volt","viper","laferrari"}
2 | {"volt","viper"}
3 | {"byd_tang","sonata","jaguarxf"}
4 | {"swift","teslax","mirai"}
5 | {"volt","viper"}
6 | {"viper","ferrariff","bmwi8","viper"}
7 | {"ferrariff","viper","viper","volt"}
vehicle_names
id | vehicle_name
-----+-----------------------
1 | byd_tang
2 | volt
3 | viper
4 | laferrari
5 | sonata
6 | jaguarxf
7 | swift
8 | teslax
9 | mirai
10 | ferrariff
11 | bmwi8
I have a query that gives the output I expect, but it's not optimal and may be expensive.
This is the query:
select veh_name
from vehicle_names dsb
where (select count(*) from vehicle dsd
where dsb.veh_name = ANY (dsd.veh_types)) > 1
The output should be:
byd_tang
volt
viper
One option would be an aggregation query:
SELECT
vn.id,
vn.veh_name
FROM vehicle_names vn
INNER JOIN vehicle v
ON vn.veh_name = ANY (v.veh_types)
GROUP BY
vn.id,
vn.veh_name
HAVING
COUNT(*) > 1;
This only counts a vehicle name which appears in two or more records in the other table. It would not pick up, for example, a single vehicle record with the same name appearing two or more times.
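To make the counting rule concrete, here is a small sketch in plain Python rather than Postgres (an assumption made so it runs without a database). It counts each name once per vehicle record, which is what the join on = ANY (...) followed by COUNT(*) does; the variable names are made up for illustration.

```python
from collections import Counter

# Sample rows copied from the Vehicle table in the question.
vehicles = {
    1: ["byd_tang", "volt", "viper", "laferrari"],
    2: ["volt", "viper"],
    3: ["byd_tang", "sonata", "jaguarxf"],
    4: ["swift", "teslax", "mirai"],
    5: ["volt", "viper"],
    6: ["viper", "ferrariff", "bmwi8", "viper"],
    7: ["ferrariff", "viper", "viper", "volt"],
}

# set(types) counts a name once per record, mirroring "name = ANY (array)":
# the predicate is true or false for a record, however often the name repeats.
records_per_name = Counter(
    name for types in vehicles.values() for name in set(types)
)

in_multiple_records = sorted(
    name for name, count in records_per_name.items() if count > 1
)
print(in_multiple_records)
```

On the sample rows as posted, ferrariff also qualifies, since it appears in records 6 and 7, alongside the three names listed in the question.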

How do I Transform / Pivot in Access SQL but without aggregating?

Firstly, thank you to anyone that can help, I hope this is a simple question for those in the know.
I have Data which is of the form:
LeaseID | ChargeID
1 | 1
1 | 2
2 | 3
3 | 4
3 | 5
3 | 6
i.e. LeaseID 1 has 2 ChargeIDs
How can I query this in Access SQL so that the data will be reflected as
LeaseID | ChargeID | ChargeID | ChargeID
1 | 1 | 2
2 | 3
3 | 4 | 5 | 6
I know I am limited to 255 columns, but this is not a problem as there will never be that many. The number of columns should increase with the maximum number of ChargeIDs on a given lease.
I believe it is something to do with Transform / Pivot, but I have been unable to get it working. I keep getting the "too many crosstabs" error.
Thanks,
Consider a two-step process involving a staging table:
Make-Table Query (using correlated subquery with slow performance on very large tables)
SELECT t.LeaseID, t.ChargeID, 'ChargeID' & (SELECT count(*) FROM myTable sub
WHERE sub.LeaseID = t.LeaseID
AND sub.ChargeID <= t.ChargeID) As Rank
INTO myStagingTable
FROM myTable t;
Cross-Tab Query
TRANSFORM MAX(s.ChargeID) As MaxChargeID
SELECT s.LeaseID
FROM myStagingTable s
GROUP BY s.LeaseID
PIVOT s.[Rank]
-- LeaseID ChargeID1 ChargeID2 ChargeID3
-- 1 1 2
-- 2 3
-- 3 4 5 6
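The same two steps can be sketched outside Access. The version below assumes an in-memory SQLite database and does the pivot in Python instead of TRANSFORM, so it only illustrates the ranking idea and is not Access syntax; the rnk alias and the pivot dictionary are names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (LeaseID INTEGER, ChargeID INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 3), (3, 4), (3, 5), (3, 6)],
)

# Step 1: rank each charge within its lease with the same correlated
# subquery as the make-table query (count of charges <= the current one).
ranked = conn.execute("""
    SELECT t.LeaseID,
           t.ChargeID,
           (SELECT COUNT(*) FROM myTable sub
             WHERE sub.LeaseID = t.LeaseID
               AND sub.ChargeID <= t.ChargeID) AS rnk
    FROM myTable t
    ORDER BY t.LeaseID, rnk
""").fetchall()

# Step 2: pivot, so that the rank becomes the column name.
pivot = {}
for lease_id, charge_id, rnk in ranked:
    pivot.setdefault(lease_id, {})["ChargeID%d" % rnk] = charge_id

for lease_id, columns in pivot.items():
    print(lease_id, columns)
```

Because the rank is unique within a lease, each LeaseID/column pair holds exactly one ChargeID, which is why the MAX in the cross-tab query never actually has to choose between values.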

SQL get the time of different rows

I want to write a select that gives me the time an employee took to resolve a ticket.
The problem is that a ticket is divided into actions, so it's not just a matter of reading the time from one row; it can be spread over n rows.
This is an abbreviated version of what I have:
Tickets
TicketID | Days | Hours | Minutes
------------------------------------------------
12 | 0 | 2 | 32
12 | 1 | 0 | 12
12 | 4 | 6 | 0
13 | 2 | 5 | 12
13 | 0 | 2 | 33
And this is what I want to get:
TicketID | Time (in minutes)
------------------------------------------------
12 | 2924
13 | 1425
(Or just one row, with a condition specifying the TicketID.)
This is the select that I'm running right now:
select distinct ((Days*8)*60) + (Hours*60) + Minutes from Tickets where ticketid = 12
But it is not working as I want.
select ticketid, sum((Days*8)*60), sum((Hours*60)), sum (Minutes)
from tickets
group by ticketid
select TicketID, sum((Days*8)*60) + sum(Hours*60) + sum(Minutes) as Time_in_minutes
from Tickets
group by TicketID
Distinct, as you were trying before, takes each row in the source table (Tickets) and filters out all of the duplicate rows. Instead, you are trying to sum up the days, minutes, and hours for each ticket. So sum them up, and group by the ticket number.
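Try this (it treats a day as 24 hours; replace 24 with 8 if a day means an 8-hour working day, as in the question's query):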
SELECT TicketID, (Sum(Minutes)+(Sum(Hours)*60)+(sum(Days)*24*60) ) time
FROM Tickets Group by TicketID
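A quick way to check the arithmetic is to run the grouped sum on the sample rows. The sketch below assumes an in-memory SQLite database and the 8-hour working day from the question's own query; MINUTES_PER_DAY is a made-up constant, so set it to 24 * 60 if a day should mean a calendar day.

```python
import sqlite3

# Assumption: a "day" is an 8-hour working day, as in the question's query.
MINUTES_PER_DAY = 8 * 60

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tickets (TicketID INTEGER, Days INTEGER, Hours INTEGER, Minutes INTEGER)"
)
conn.executemany(
    "INSERT INTO Tickets VALUES (?, ?, ?, ?)",
    [(12, 0, 2, 32), (12, 1, 0, 12), (12, 4, 6, 0), (13, 2, 5, 12), (13, 0, 2, 33)],
)

# Convert every row to minutes first, then add the rows up per ticket.
totals = dict(
    conn.execute(
        """
        SELECT TicketID, SUM(Days * ? + Hours * 60 + Minutes) AS time_in_minutes
        FROM Tickets
        GROUP BY TicketID
        """,
        (MINUTES_PER_DAY,),
    ).fetchall()
)
print(totals)
```

With 8-hour days the sample rows come to 2924 minutes for ticket 12 (2400 + 480 + 44) and 1425 minutes for ticket 13 (960 + 420 + 45).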

Counting Just One Record Per Pupil Though Multiple Are Matched

I've set up a SQL Fiddle to illustrate the question...
I have a database of pupils (referenced by PupilId) who have assessments (AssessmentLevelId) recorded in various subjects (NCSubjectId) at various periods (PeriodId).
Not every possible period may have an assessment in it.
PupilId | PeriodId | NCSubjectId | AssessmentLevelId
-----------------------------------------------------
100 | 1 | 10 | 1
100 | 3 | 10 | 2
200 | 1 | 10 | 1
300 | 1 | 10 | 1
400 | 1 | 10 | 1
100 | 5 | 10 | 2
300 | 7 | 10 | 2
100 | 15 | 10 | 2
I want to find the number of pupils who have a particular assessment level by a particular PeriodId.
So far I have this:
SELECT PupilId, COUNT(1) FROM NCAssessment
WHERE AssessmentLevelId = 2
AND NCSubjectId=10
AND PeriodId <= 10
GROUP BY PupilId
Which finds the pupil ids, but pupil 100 has a count of 2. I guess I need to wrap this in another query but am stumped. Any suggestions?
This is using Azure SQL.
Thanks.
If I understand your question correctly, I think this might be what you are looking for:
AssessmentLevelId = 2 has been removed from the query, because some Periods may not have an assessment.
SELECT AssessmentLevelID, PeriodID, COUNT(DISTINCT PupilID)
FROM NCAssessment
WHERE NCSubjectId=10 AND
PeriodId <= 10
GROUP BY AssessmentLevelID, PeriodID
If this isn't correct, could you please post a sample of the result you are expecting? Thanks!
If you want the number of distinct pupils that match, then use count(distinct):
SELECT COUNT(DISTINCT PupilId) as NumMatchingPupils, COUNT(*) as NumMatchingAssessments
FROM NCAssessment
WHERE AssessmentLevelId = 2 AND NCSubjectId = 10 AND PeriodId <= 10;
COUNT(DISTINCT) will count each pupil once, regardless of the number of matches. COUNT(*) or COUNT(1) will count the number of assessments that match.
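The difference between the two counts shows up on the sample rows from the question. This is a minimal sketch that assumes an in-memory SQLite database in place of Azure SQL; the query itself is the one from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE NCAssessment "
    "(PupilId INTEGER, PeriodId INTEGER, NCSubjectId INTEGER, AssessmentLevelId INTEGER)"
)
conn.executemany(
    "INSERT INTO NCAssessment VALUES (?, ?, ?, ?)",
    [
        (100, 1, 10, 1), (100, 3, 10, 2), (200, 1, 10, 1), (300, 1, 10, 1),
        (400, 1, 10, 1), (100, 5, 10, 2), (300, 7, 10, 2), (100, 15, 10, 2),
    ],
)

# COUNT(DISTINCT PupilId) collapses pupil 100's two matching rows into one;
# COUNT(*) keeps counting every matching assessment row.
num_pupils, num_assessments = conn.execute("""
    SELECT COUNT(DISTINCT PupilId) AS NumMatchingPupils,
           COUNT(*)                AS NumMatchingAssessments
    FROM NCAssessment
    WHERE AssessmentLevelId = 2 AND NCSubjectId = 10 AND PeriodId <= 10
""").fetchone()

print(num_pupils, num_assessments)
```

Pupil 100 matches twice (periods 3 and 5) and pupil 300 once (period 7), so the sample data gives two pupils from three matching assessments.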

Most Efficient Way to Compute Running Value in SQL [duplicate]

Possible Duplicate:
Calculate a Running Total in SqlServer
Consider this data
Day | OrderCount
1 3
2 2
3 11
4 3
5 6
How can I get this accumulated OrderCount (running value) result set using a T-SQL query?
Day | OrderCount | OrderCountRunningValue
1 3 3
2 2 5
3 11 16
4 3 19
5 6 25
I can easily do this by looping in the actual query (using a #table) or in my C# code-behind, but it's slow (considering that I also get the orders per day) when I'm processing thousands of records, so I'm looking for a better, more efficient approach, hopefully without loops: something like a recursive CTE or something else.
Any ideas would be greatly appreciated. TIA
As you seem to need these results in the client rather than for use within another SQL query, you are probably better off not doing this in SQL.
(The linked question in my comment shows 'the best' option within SQL, if that is in fact necessary.)
I would recommend pulling the Day and OrderCount values as one result set (SELECT day, orderCount FROM yourTable ORDER BY day) and then calculating the running total in your C#.
Your C# code will be able to iterate through the dataset efficiently, and will almost certainly outperform the SQL approaches. This does transfer some load from the SQL Server to the web server, but at an overall (and significant) resource saving.
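The client-side idea is a single pass over the ordered rows. Here is a minimal sketch in Python rather than C# (an assumption made purely for illustration); the rows list stands in for the result of the ordered SELECT.

```python
from itertools import accumulate

# Stand-in for the result set of:
#   SELECT day, orderCount FROM yourTable ORDER BY day
rows = [(1, 3), (2, 2), (3, 11), (4, 3), (5, 6)]

# accumulate yields the running sum of the order counts in one pass.
running_totals = accumulate(order_count for _, order_count in rows)

result = [
    (day, order_count, running)
    for (day, order_count), running in zip(rows, running_totals)
]

for row in result:
    print(row)
```

Each row is read once, so the work grows linearly with the number of rows, whereas the self-join and correlated-subquery forms compare each row against all earlier rows.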
SELECT t.Day,
t.OrderCount,
(SELECT SUM(t1.OrderCount) FROM yourTable t1 WHERE t1.Day <= t.Day)
AS OrderCountRunningValue
FROM yourTable t
SELECT
t.day,
t.orderCount,
SUM(t1.orderCount) orderCountRunningValue
FROM
yourTable t INNER JOIN yourTable t1 ON t1.day <= t.day
GROUP BY t.day, t.orderCount
CTEs to the rescue (again):
DROP TABLE tmp.sums;
CREATE TABLE tmp.sums
( id INTEGER NOT NULL
, zdate timestamp not null
, amount integer NOT NULL
);
INSERT INTO tmp.sums (id,zdate,amount) VALUES
(1, '2011-10-24', 1 ),(1, '2011-10-25', 2 ),(1, '2011-10-26', 3 )
,(2, '2011-10-24', 11 ),(2, '2011-10-25', 12 ),(2, '2011-10-26', 13 )
;
WITH RECURSIVE list AS (
-- Terminal part
SELECT t0.id, t0.zdate
, t0.amount AS amount
, t0.amount AS runsum
FROM tmp.sums t0
WHERE NOT EXISTS (
SELECT * FROM tmp.sums px
WHERE px.id = t0.id
AND px.zdate < t0.zdate
)
UNION
-- Recursive part
SELECT p1.id AS id
, p1.zdate AS zdate
, p1.amount AS amount
, p0.runsum + p1.amount AS runsum
FROM tmp.sums AS p1
, list AS p0
WHERE p1.id = p0.id
AND p0.zdate < p1.zdate
AND NOT EXISTS (
SELECT * FROM tmp.sums px
WHERE px.id = p1.id
AND px.zdate < p1.zdate
AND px.zdate > p0.zdate
)
)
SELECT * FROM list
ORDER BY id, zdate;
The output:
DROP TABLE
CREATE TABLE
INSERT 0 6
id | zdate | amount | runsum
----+---------------------+--------+--------
1 | 2011-10-24 00:00:00 | 1 | 1
1 | 2011-10-25 00:00:00 | 2 | 3
1 | 2011-10-26 00:00:00 | 3 | 6
2 | 2011-10-24 00:00:00 | 11 | 11
2 | 2011-10-25 00:00:00 | 12 | 23
2 | 2011-10-26 00:00:00 | 13 | 36
(6 rows)