sum column based on values from another column, grouping by a third column - sql

I'm a bit lost on this one. It seems simple, but I just can't get to the answer, and I have a feeling I might be embarrassed by it. OK, here goes. I'm trying to do this in WebMatrix 3.0; I believe it's SQL Express, but I'm not sure about that.
Table Dlivry
tID cID Boxes Delivered
1 01 5 1
2 03 7 1
3 01 2 0
4 01 3 1
5 03 5 1
6 05 4 1
7 05 10 0
what i want is
Client Delivered NotDelivered
01 8 2
03 12 0
05 4 10
What i have done so far
SELECT D1.cid AS Client,
Sum(D1.boxes) AS Delivered,
Sum(D2.boxes) AS NotDelivered
FROM dlivry D1,
dlivry D2
WHERE D1.delivered = 1
AND D2.delivered = 0
AND D1.cid = D2.cid
GROUP BY cid
OK, there it is. I'm ready to be embarrassed and ready to learn.
I appreciate your help.

Since you're using an inner join, you'll only get rows for a cID that has both delivered and undelivered boxes, so you won't get any row for cID 03. Try this:
SELECT CID,
SUM(CASE WHEN Delivered = 1 THEN boxes ELSE 0 END) AS Delivered,
SUM(CASE WHEN Delivered = 0 THEN boxes ELSE 0 END) AS NotDelivered
FROM Dlivry
GROUP BY CID
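To see the difference between the two queries, here is a small sketch that loads the sample rows into an in-memory SQLite database (an assumption made only so the example runs anywhere; the question targets SQL Server Express) and runs both the conditional aggregation and the self-join. The aliases DeliveredBoxes and NotDeliveredBoxes are illustrative names, not from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dlivry (tID INTEGER, cID TEXT, Boxes INTEGER, Delivered INTEGER)")
conn.executemany(
    "INSERT INTO Dlivry VALUES (?, ?, ?, ?)",
    [(1, "01", 5, 1), (2, "03", 7, 1), (3, "01", 2, 0), (4, "01", 3, 1),
     (5, "03", 5, 1), (6, "05", 4, 1), (7, "05", 10, 0)],
)

# Conditional aggregation: one pass over the table, no self-join.
pivot = conn.execute("""
    SELECT cID,
           SUM(CASE WHEN Delivered = 1 THEN Boxes ELSE 0 END) AS DeliveredBoxes,
           SUM(CASE WHEN Delivered = 0 THEN Boxes ELSE 0 END) AS NotDeliveredBoxes
    FROM Dlivry
    GROUP BY cID
    ORDER BY cID
""").fetchall()

# The self-join from the question, for comparison.
joined = conn.execute("""
    SELECT D1.cID, SUM(D1.Boxes), SUM(D2.Boxes)
    FROM Dlivry D1
    JOIN Dlivry D2 ON D1.cID = D2.cID
    WHERE D1.Delivered = 1 AND D2.Delivered = 0
    GROUP BY D1.cID
    ORDER BY D1.cID
""").fetchall()

print(pivot)   # [('01', 8, 2), ('03', 12, 0), ('05', 4, 10)]
print(joined)  # [('01', 8, 4), ('05', 4, 10)]
```

Besides dropping client 03, the join also inflates client 01's undelivered total to 4, because the single undelivered row is paired with each of the two delivered rows.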

Related

SQL Running total previous 3 months by date and id

This is a simplification of the table q3 I'm working with:
Partno EndOfMonth AA AS EA ES
a 31.5.2017 5 1 0 1
b 31.5.2017 3 1 0 1
c 31.5.2017 2 2 0 1
a 31.6.2017 1 2 2 2
b 31.6.2017 1 0 1 2
c 31.6.2017 2 3 1 4
a 31.7.2017 4 3 2 0
b 31.7.2017 3 0 6 0
c 31.7.2017 4 1 0 0
I need to sum the numbers in the last four columns for each part in Partno so that the sum represents the running total of the last three months at each date in the EndOfMonth column.
The result i'm looking for is:
Partno EndOfMonth AA AS EA ES
a 31.5.2017 5 1 0 1
b 31.5.2017 3 1 0 1
c 31.5.2017 2 2 0 1
a 31.6.2017 6 3 2 3
b 31.6.2017 4 1 1 3
c 31.6.2017 4 5 1 5
a 31.7.2017 10 6 4 3
b 31.7.2017 7 1 7 3
c 31.7.2017 8 6 1 5
So e.g. for partno a at 31.7.2017 the last three months' sum for the 'AA' column is 4+1+5=10.
I'm quite new to SQL and am well and truly stuck with this. I've tried something like the following to just get a simple rolling total (without even specifying the sum range to be the last 3 months). Also, I'm not sure if the database even supports all the functions in the below code, since it's giving me the error "Incorrect Syntax near the keyword 'OVER'"
SELECT
Partno,
EndofMonth,
SUM(q3.AA) OVER (PARTITION BY q3.Partno ORDER BY EndofMonth ROWS UNBOUNDED PRECEDING) as 'AA'
FROM q3
Anyway, any help would be greatly appreciated!
Thanks
EDIT:
Thanks to Benjamin and with a little help from this post: https://dba.stackexchange.com/questions/114403/date-range-rolling-sum-using-window-functions
I was able to find the solution:
SELECT a.Partno, a.EndofMonth, SUM(b.AA) as 'AA', SUM(b.[AS]) as 'AS',...
FROM q3 a, q3 b
WHERE a.Partno = b.Partno AND a.endOfMonth >= b.endOfMonth
AND b.endOfMonth >= DATEADD(month,-2,a.endOfMonth)
GROUP BY a.Partno, a.endOfMonth
Something like this might work:
SELECT a.Partno, a.EndofMonth, SUM(b.AA) as AA
FROM q3 a, q3 b
WHERE a.Partno = b.Partno
AND DATEDIFF(month, b.endOfMonth, a.endOfMonth) BETWEEN 0 AND 2
GROUP BY a.Partno, a.EndofMonth
This assumes that endOfMonth is stored as a datetime; if it is not, you will have to use CONVERT(). Note that you might have to replace DATEDIFF() depending on which implementation you are using.
I haven't tested this, so I might be way off. It has been a while since I worked with SQL. Hopefully you can get it working by messing around with it a bit, and if not then maybe it will inspire you to write something better. Let me know how it goes!
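For anyone who wants to check the self-join idea against the sample data, here is a minimal sketch using an in-memory SQLite database. SQLite has no DATEDIFF, so the helper month_no (a hypothetical name) turns each date into a month count via strftime; the dates are written in ISO form and June is entered as 2017-06-30, both of which are assumptions of this sketch. Only the AA column is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q3 (Partno TEXT, EndOfMonth TEXT, AA INTEGER)")
conn.executemany("INSERT INTO q3 VALUES (?, ?, ?)", [
    ("a", "2017-05-31", 5), ("b", "2017-05-31", 3), ("c", "2017-05-31", 2),
    ("a", "2017-06-30", 1), ("b", "2017-06-30", 1), ("c", "2017-06-30", 2),
    ("a", "2017-07-31", 4), ("b", "2017-07-31", 3), ("c", "2017-07-31", 4),
])


def month_no(col):
    """SQL expression counting months, so a 3-month window is a difference of 0..2."""
    return (f"(CAST(strftime('%Y', {col}) AS INTEGER) * 12"
            f" + CAST(strftime('%m', {col}) AS INTEGER))")


sql = f"""
    SELECT a.Partno, a.EndOfMonth, SUM(b.AA) AS AA
    FROM q3 a
    JOIN q3 b ON a.Partno = b.Partno
    WHERE {month_no('a.EndOfMonth')} - {month_no('b.EndOfMonth')} BETWEEN 0 AND 2
    GROUP BY a.Partno, a.EndOfMonth
    ORDER BY a.EndOfMonth, a.Partno
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The lower bound of 0 matters: without it, each row would also pull in later months for the same part.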

Generate a large data table in SQL from an existing table [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I would be grateful for your help. I have a table like this (sample of the data):
Amount Desc Month SM code ID
$32,323.00 Bla1 1 121 3 2424221
$4,242.00 Bla2 1 A1 3 2424221
$3,535.00 Bla3 1 A3 1 3230824
$4,984.00 Bla4 1 433 1 3230824
$47,984.00 Bla5 1 B1 1 3230824
$3,472.00 Bla6 1 D2 27 2297429
$3,239.00 Bla7 1 124 27 2297429
$4,249.00 Bla8 1 114 24 3434334
$2,492.00 Bla9 1 132 24 3434334
$424.00 Bla10 2 232 3 2424221
$24,242.00 Bla7 2 124 3 2424221
$242,424 Bla4 2 433 1 3230824
$533.00 Bla13 2 235 1 3230824
$4,342.00 Bla14 2 223 1 3230824
$24,242.00 Bla15 2 224 27 2297429
$24,242.00 Bla1 2 121 27 2297429
$4,242.00 Bla17 2 432 24 3434334
$24,224.00 Bla9 2 132 24 3434334
And I would like to generate a table like this:
SM - distinct values
Desc
**TotalCntOfSM - number of times the SM showed up in total**
**TotalCntOfSM_a (same but When code is in group (1,2,3))**
TotalCntofSM_a1 (same but only when code is equal 1)
TotalCntofSM_a2 (same when code is equal 2)
TotalCntofSM_a3 (same but only when code is equal 3)
**TotalCntofSM_b (same but only when code is between 4-27)**
**TotalsumOfAmountForSM(Sum Total amount for the SM)**
**TotalSumOfAmountForSmOfSM_a (Sum Amount for the SM When code is in 1,2,3)**
TotalSumOfAmountForSmofSM_a1 (Sum Amount for the SM when code is equal 1)
TotalSumOfAmountForSmofSM_a2 (Sum Amount for the SM when code is equal 2)
TotalSumOfAmountForSmofSM_a3 (Sum Amount for the SM when code is equal 3)
**TotalSumOfAmountForSmofSM_b (Sum Amount for the SM when code is between 4-27)**
**TotalCntOfSMinJan - number of times the SM showed up in total in month = 1**
**TotalCntOfSM_aIninJan (same but When code is in group (1,2,3))in month = 1**
TotalCntofSM_a1inJan (same but only when code is equal 1) in month = 1
TotalCntofSM_a2inJan (same when code is equal 2) in month = 1
TotalCntofSM_a3inJan (same but only when code is equal 3) in month = 1
**TotalCntofSM_binJan (same but only when code is between 4-27)** in month = 1
**TotalsumOfAmountForSMinJan(Sum Total amount for the SM)** in month = 1
**TotalSumOfAmountForSmOfSM_ainJan (Sum Amount for the SM When code is in 1,2,3)** in month = 1
TotalSumOfAmountForSmofSM_a1inJan (Sum Amount for the SM when code is equal 1) in month = 1
TotalSumOfAmountForSmofSM_a2inJan (Sum Amount for the SM when code is equal 2) in month = 1
TotalSumOfAmountForSmofSM_a3inJan (Sum Amount for the SM when code is equal 3) in month = 1
**TotalSumOfAmountForSmofSM_binJan (Sum Amount for the SM when code is between 4-27)** in month = 1
And same for Month = 2,3,4,5,6,7,8,9 etc.
Any help would be great.
select
sm,
count(*) as TotalCntOfSM,
sum(case when code in (1,2,3) then 1 else 0 end) as TotalCntOfSM_a,
sum(case when code = 1 then 1 else 0 end) as TotalCntofSM_a1,
-- ...etc...
sum(case when code in (1,2,3) then Amount else 0 end) as TotalSumOfAmountForSmOfSM_a,
sum(case when code = 1 then Amount else 0 end) as TotalSumOfAmountForSmofSM_a1
-- ...etc. (add a comma to the line above before appending more columns)
from <table>
group by sm
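Since the same columns are wanted for every month, one option is to generate the per-month CASE expressions in a loop rather than typing them out. The sketch below does that against an in-memory SQLite table; the table name sales, the column name Descr (Desc is a reserved word), the short output names such as Cnt_m1, and the reduced sample rows with plain integer amounts are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE sales (Amount INTEGER, Descr TEXT, Month INTEGER, SM TEXT, Code INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", [
    (32323, "Bla1", 1, "121", 3), (4984, "Bla4", 1, "433", 1), (3239, "Bla7", 1, "124", 27),
    (24242, "Bla7", 2, "124", 3), (242424, "Bla4", 2, "433", 1), (24242, "Bla1", 2, "121", 27),
])


def month_columns(month):
    """Build the per-month conditional aggregates for one month number."""
    cond = f"Month = {int(month)}"
    return [
        f"SUM(CASE WHEN {cond} THEN 1 ELSE 0 END) AS Cnt_m{month}",
        f"SUM(CASE WHEN {cond} AND Code IN (1, 2, 3) THEN 1 ELSE 0 END) AS Cnt_a_m{month}",
        f"SUM(CASE WHEN {cond} THEN Amount ELSE 0 END) AS Sum_m{month}",
    ]


columns = [
    "SM",
    "COUNT(*) AS Cnt",
    "SUM(CASE WHEN Code IN (1, 2, 3) THEN 1 ELSE 0 END) AS Cnt_a",
    "SUM(CASE WHEN Code BETWEEN 4 AND 27 THEN 1 ELSE 0 END) AS Cnt_b",
    "SUM(Amount) AS Sum_total",
]
for m in (1, 2):  # extend the range to cover more months
    columns += month_columns(m)

sql = "SELECT " + ",\n       ".join(columns) + "\nFROM sales GROUP BY SM ORDER BY SM"
by_sm = {row["SM"]: dict(row) for row in conn.execute(sql)}
print(by_sm["121"])
```

The remaining columns from the list (a1, a2, a3, and the b variants per month) would follow the same pattern inside month_columns.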

SQL Aggregate functions with groupings

I need to create some checks to make sure that students are enrolled in the correct courses with the correct number of units. Here is my SQL at the moment.
SELECT StudentID
,AssessmentCode
,BoardCode
,BoardCategory
,BoardUnits
,sum(cast(boardunits as int)) over (partition by studentid,boardcategory) as UnitCount
,Count(boardcategory) over (partition by studentid) as SubjectCount
FROM uvNCStudentSubjectDetails
where fileyear = 2015
and filesemester = 1
and studentyearlevel = 11
and StudentIBFlag = 0
order by Studentnameinternal,BoardCategory
This gives me the following info...
StudentID AssessmentCode BoardCode BoardCategory BoardUnits UnitCount SubjectCount
61687 11TECDAT 11080 A 2 11 7
61687 11PRS1U 11350 A 1 11 7
61687 11MATGEN 11235 A 2 11 7
61687 11LANGRB 11870 A 2 11 7
61687 11ENGSTD 11130 A 2 11 7
61687 11GEOGEO 11190 A 2 11 7
64549 11TECIND 11200 A 2 10 7
64549 11SCIPHY 11310 A 2 10 7
64549 11SCIEAE 11100 A 2 10 7
64549 11MATGEN 11235 A 2 10 7
64549 11ENGSTD 11130 A 2 10 7
64549 11TECHOS 26501 B 2 2 7
64549 11MUSDRS 63212 C 1 1 7
45461 11ECOECO 11110 A 2 13 7
45461 11ENGADV 11140 A 2 13 7
45461 11HISMOD 11270 A 2 13 7
45461 11HISLST 11220 A 2 13 7
45461 11MATMAT 11240 A 2 13 7
45461 11PRS1U 11350 A 1 13 7
45461 11SCIBIO 11030 A 2 13 7
Note that for the first student I have a count of Category A units (11 in total); he is only doing Category A subjects. The second student has 10 units of Category A subjects, plus one Category B subject worth 2 units and one Category C subject worth 1 unit. The final student just has 13 Category A units.
Now what I would really like is something like this...!
StudentID SumAUnits SumBUnits SumCUnits SumA+BUnits SubjectCount
61687 11 0 0 11 7
64549 10 2 1 12 7
45461 13 0 0 13 7
So I would like some aggregated functions with a student grouped onto only 1 row and the sum of his different units as separate fields. I would also like a field which sums the Category A and B Units and also a field which gives a count of the total number of subjects they are doing. I could then use this data to set up some warning messages if a student is not doing the correct number of A or B Units etc
I have played around with common table expressions, subqueries etc but am not really sure what I am doing and am not sure which is the correct way about getting the data in the form I want.
Is anyone able to help?
SELECT
STUDENTID,
-- sum the units (not 1 per row) so the totals are unit counts
SUM(CASE BOARDCATEGORY WHEN 'A' THEN CAST(BOARDUNITS AS int) ELSE 0 END) AS SUM_A_UNITS,
SUM(CASE BOARDCATEGORY WHEN 'B' THEN CAST(BOARDUNITS AS int) ELSE 0 END) AS SUM_B_UNITS,
SUM(CASE BOARDCATEGORY WHEN 'C' THEN CAST(BOARDUNITS AS int) ELSE 0 END) AS SUM_C_UNITS,
SUM(CASE WHEN BOARDCATEGORY IN ('A', 'B') THEN CAST(BOARDUNITS AS int) ELSE 0 END) AS SUM_AB_UNITS,
COUNT(BOARDCODE) AS COUNT_OF_SUBJECTS
FROM (
SELECT StudentID
,AssessmentCode
,BoardCode
,BoardCategory
,BoardUnits
,sum(cast(boardunits as int)) over (partition by studentid,boardcategory) as UnitCount
,Count(boardcategory) over (partition by studentid) as SubjectCount
FROM uvNCStudentSubjectDetails
where fileyear = 2015
and filesemester = 1
and studentyearlevel = 11
and StudentIBFlag = 0
-- no ORDER BY here: it is not allowed inside a derived table
) AS s
GROUP BY STUDENTID;
Wrapped your SQL statement in the solution, so that you can see what the solution does straight away.
Use SUM and CASE (i.e. SUM only when a condition is met).
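Here is a small, self-contained sketch of the SUM and CASE idea, run against an in-memory SQLite table so it can be tried without the real view. The table name subjects and the alias names are assumptions; the rows are the second and third students from the question, with BoardUnits stored as text to mirror the CAST in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subjects (StudentID INTEGER, BoardCode TEXT, BoardCategory TEXT, BoardUnits TEXT)")
conn.executemany("INSERT INTO subjects VALUES (?, ?, ?, ?)", [
    (64549, "11200", "A", "2"), (64549, "11310", "A", "2"), (64549, "11100", "A", "2"),
    (64549, "11235", "A", "2"), (64549, "11130", "A", "2"), (64549, "26501", "B", "2"),
    (64549, "63212", "C", "1"),
    (45461, "11110", "A", "2"), (45461, "11140", "A", "2"), (45461, "11270", "A", "2"),
    (45461, "11220", "A", "2"), (45461, "11240", "A", "2"), (45461, "11350", "A", "1"),
    (45461, "11030", "A", "2"),
])

# One output row per student; each SUM only adds units for its own category.
rows = conn.execute("""
    SELECT StudentID,
           SUM(CASE WHEN BoardCategory = 'A' THEN CAST(BoardUnits AS INTEGER) ELSE 0 END) AS SumAUnits,
           SUM(CASE WHEN BoardCategory = 'B' THEN CAST(BoardUnits AS INTEGER) ELSE 0 END) AS SumBUnits,
           SUM(CASE WHEN BoardCategory = 'C' THEN CAST(BoardUnits AS INTEGER) ELSE 0 END) AS SumCUnits,
           SUM(CASE WHEN BoardCategory IN ('A', 'B') THEN CAST(BoardUnits AS INTEGER) ELSE 0 END) AS SumABUnits,
           COUNT(BoardCode) AS SubjectCount
    FROM subjects
    GROUP BY StudentID
    ORDER BY StudentID
""").fetchall()

print(rows)  # [(45461, 13, 0, 0, 13, 7), (64549, 10, 2, 1, 12, 7)]
```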

Why SELECT 0, ... instead of SELECT

Let's say I have a SQLite database that contains a table:
sqlite> create table person (id integer, firstname varchar, lastname varchar);
Now I want to get every entry which is in the table.
sqlite> select t0.id, t0.firstname, t0.lastname from person t0;
This works fine and this is what I would use. However, I have worked with a framework from Apple (Core Data) that generates SQL. This framework generates a slightly different SQL query:
sqlite> select 0, t0.id, t0.firstname, t0.lastname from person t0;
Every SQL query generated by this framework begins with "select 0,". Why is that?
I tried to use the explain command to see what's going on, but this was inconclusive, at least to me.
sqlite> explain select t0.id, t0.firstname, t0.lastname from person t0;
addr opcode p1 p2 p3 p4 p5 comment
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
0 Trace 0 0 0 00 NULL
1 Goto 0 11 0 00 NULL
2 OpenRead 0 2 0 3 00 NULL
3 Rewind 0 9 0 00 NULL
4 Column 0 0 1 00 NULL
5 Column 0 1 2 00 NULL
6 Column 0 2 3 00 NULL
7 ResultRow 1 3 0 00 NULL
8 Next 0 4 0 01 NULL
9 Close 0 0 0 00 NULL
10 Halt 0 0 0 00 NULL
11 Transactio 0 0 0 00 NULL
12 VerifyCook 0 1 0 00 NULL
13 TableLock 0 2 0 person 00 NULL
14 Goto 0 2 0 00 NULL
And the table for the second query looks like this:
sqlite> explain select 0, t0.id, t0.firstname, t0.lastname from person t0;
addr opcode p1 p2 p3 p4 p5 comment
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
0 Trace 0 0 0 00 NULL
1 Goto 0 12 0 00 NULL
2 OpenRead 0 2 0 3 00 NULL
3 Rewind 0 10 0 00 NULL
4 Integer 0 1 0 00 NULL
5 Column 0 0 2 00 NULL
6 Column 0 1 3 00 NULL
7 Column 0 2 4 00 NULL
8 ResultRow 1 4 0 00 NULL
9 Next 0 4 0 01 NULL
10 Close 0 0 0 00 NULL
11 Halt 0 0 0 00 NULL
12 Transactio 0 0 0 00 NULL
13 VerifyCook 0 1 0 00 NULL
14 TableLock 0 2 0 person 00 NULL
15 Goto 0 2 0 00 NULL
Some frameworks do this in order to tell, without any doubt, whether a row from that table was returned.
Consider
Table A:
+---+
| a |
+---+
| 0 |
+---+
| 1 |
+---+
| 2 |
+---+
Table B:
+---+------+
| a | b    |
+---+------+
| 0 | 1    |
+---+------+
| 1 | NULL |
+---+------+
SELECT A.a, B.b
FROM A
LEFT JOIN B
ON B.a = A.a
Results
+---+------+
| a | b |
+---+------+
| 0 | 1 |
+---+------+
| 1 | NULL |
+---+------+
| 2 | NULL |
+---+------+
In this result set, it is not possible to see that a = 1 exists in table B, but a = 2 does not. To get that information, you need to select a non-nullable expression from table b, and the simplest way to do that is to select a simple constant value.
SELECT A.a, B.x, B.b
FROM A
LEFT JOIN (SELECT 0 AS x, B.a, B.b FROM B) AS B
ON B.a = A.a
Results
+---+------+------+
| a | x | b |
+---+------+------+
| 0 | 0 | 1 |
+---+------+------+
| 1 | 0 | NULL |
+---+------+------+
| 2 | NULL | NULL |
+---+------+------+
There are a lot of situations where these constant values are not strictly required, for example when you have no joins, or when you could choose a non-nullable column from b instead, but they don't cause any harm either, so they can just be included unconditionally.
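The marker-column idea can be tried directly with the two small tables above. This sketch uses Python's sqlite3 module with an in-memory database; the marker name x follows the example, everything else is just scaffolding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a INTEGER)")
conn.execute("CREATE TABLE B (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(0,), (1,), (2,)])
conn.executemany("INSERT INTO B VALUES (?, ?)", [(0, 1), (1, None)])

# x is a constant, so it is NULL only when the LEFT JOIN found no row in B.
rows = conn.execute("""
    SELECT A.a, B.x, B.b
    FROM A
    LEFT JOIN (SELECT 0 AS x, a, b FROM B) AS B
      ON B.a = A.a
    ORDER BY A.a
""").fetchall()

for a, x, b in rows:
    print(a, "matched in B" if x is not None else "no row in B", b)
```

In Python the NULLs come back as None, so a = 1 (x is 0, b is None) and a = 2 (both None) can be told apart.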
When I have code to dynamically generate a WHERE clause, I usually start the clause with a:
WHERE 1 = 1
Then the loop to add additional conditions always adds each condition in the same format:
AND x = y
without having to put conditional logic in place to check if this is the first condition or not: "if this is the first condition then start with the WHERE keyword, else add the AND keyword".
So I can imagine a framework doing this for similar reasons. If you start the statement with a SELECT 0 then the code to add subsequent columns can be in a loop without any conditional statements. Just add , colx each time without any conditional checking along the lines of "if this is the first column, don't put a comma before the column name, otherwise do".
Example pseudo code:
// columnList holds the names of the columns to select
StringBuilder query = new StringBuilder("SELECT 0");
for (String col : columnList) {
    query.append(", ").append(col);
}
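Both tricks (the leading SELECT 0 and the WHERE 1 = 1) can be combined in one small generator. This is only a sketch of the pattern, not what Core Data actually does; build_select is a hypothetical helper, and it interpolates table and column names directly, so it is only safe with trusted identifiers. Values go through ? placeholders.

```python
import sqlite3


def build_select(table, columns, filters):
    """Build a SELECT with no special-casing of the first column or condition."""
    sql = "SELECT 0"
    for col in columns:
        sql += f", {col}"            # every column gets a leading comma
    sql += f" FROM {table} WHERE 1 = 1"
    params = []
    for col, value in filters.items():
        sql += f" AND {col} = ?"     # every condition gets a leading AND
        params.append(value)
    return sql, params


sql, params = build_select("person", ["id", "firstname", "lastname"], {"lastname": "Doe"})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, firstname VARCHAR, lastname VARCHAR)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [(1, "Jane", "Doe"), (2, "John", "Roe")])
rows = conn.execute(sql, params).fetchall()

print(sql)   # SELECT 0, id, firstname, lastname FROM person WHERE 1 = 1 AND lastname = ?
print(rows)  # [(0, 1, 'Jane', 'Doe')]
```

With an empty filters dict the statement is still valid, which is the whole point of the 1 = 1 placeholder.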
Only Apple knows … but I see two possibilities:
1. Inserting a dummy column ensures that the actual output columns are numbered beginning with 1, not 0. If some existing interface already assumed one-based numbering, doing it this way in the SQL backend might have been the easiest solution.
2. If you make a query for multiple objects using multiple subqueries, a value like this could be used to determine from which subquery a record originates:
SELECT 0, t0.firstname, ... FROM PERSON t0 WHERE t0.id = 123
UNION ALL
SELECT 1, t0.firstname, ... FROM PERSON t0 WHERE t0.id = 456
(I don't know if Core Data actually does this.)
Your EXPLAIN output shows that the only difference is (at address 4) that the second program sets the extra column to zero, so there is only a minimal performance difference.
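To illustrate the subquery-tagging possibility, here is a sketch in Python with an in-memory SQLite database. The sample people, the alias src, and the grouping dictionary are made up for the example; whether Core Data does anything like this is, as said, unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, firstname VARCHAR, lastname VARCHAR)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)",
                 [(123, "Ann", "Lee"), (456, "Bob", "Ray"), (789, "Cy", "Poe")])

# The leading constant tags each row with the subquery it came from.
rows = conn.execute("""
    SELECT 0 AS src, t0.firstname FROM person t0 WHERE t0.id = 123
    UNION ALL
    SELECT 1, t0.firstname FROM person t0 WHERE t0.id = 456
    ORDER BY src
""").fetchall()

# Route each record back to its originating subquery.
by_source = {}
for src, firstname in rows:
    by_source.setdefault(src, []).append(firstname)

print(by_source)  # {0: ['Ann'], 1: ['Bob']}
```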

MySQL subquery and bracketing

Here are my tables
respondents:
field sample value
respondentid : 1
age : 2
gender : male
survey_questions:
id : 1
question : Q1
answer : sample answer
answers:
respondentid : 1
question : Q1
answer : 1 --id of survey question
I want to display all respondents who answered a certain survey, list all their answers, total the answers, and group them by age bracket.
I tried using this query:
SELECT
res.Age,
res.Gender,
answer.id,
answer.respondentid,
SUM(CASE WHEN res.Gender='Male' THEN 1 else 0 END) AS males,
SUM(CASE WHEN res.Gender='Female' THEN 1 else 0 END) AS females,
CASE
WHEN res.Age < 1 THEN 'age1'
WHEN res.Age BETWEEN 1 AND 4 THEN 'age2'
WHEN res.Age BETWEEN 4 AND 9 THEN 'age3'
WHEN res.Age BETWEEN 10 AND 14 THEN 'age4'
WHEN res.Age BETWEEN 15 AND 19 THEN 'age5'
WHEN res.Age BETWEEN 20 AND 29 THEN 'age6'
WHEN res.Age BETWEEN 30 AND 39 THEN 'age7'
WHEN res.Age BETWEEN 40 AND 49 THEN 'age8'
ELSE 'age9'
END AS ageband
FROM Respondents AS res
INNER JOIN Answers as answer ON answer.respondentid=res.respondentid
INNER JOIN Questions as question ON answer.Answer=question.id
WHERE answer.Question='Q1' GROUP BY ageband ORDER BY res.Age ASC
I was able to get the data, but the listing of all answers is missing. Do I have to add a subquery to my current SELECT statement to show the answers?
I want to produce something like this:
ex: # of Respondents is 3 ages: 2,3 and 6
Question: what are your favorite subjects?
Ages 1-4:
subject 1: 1
subject 2: 2
subject 3: 2
total respondents for ages 1-4 : 2
Ages 5-10:
subject 1: 1
subject 2: 1
subject 3: 0
total respondents for ages 5-10 : 1
SELECT
res.Age,
res.Gender,
answer.id,
answer.respondentid,
SUM(CASE WHEN res.Gender='Male' THEN 1 else 0 END) AS males,
SUM(CASE WHEN res.Gender='Female' THEN 1 else 0 END) AS females,
group_concat(answer.answer separator '\n') answers,
CASE
WHEN res.Age < 1 THEN 'age1'
WHEN res.Age BETWEEN 1 AND 4 THEN 'age2'
WHEN res.Age BETWEEN 4 AND 9 THEN 'age3'
WHEN res.Age BETWEEN 10 AND 14 THEN 'age4'
WHEN res.Age BETWEEN 15 AND 19 THEN 'age5'
WHEN res.Age BETWEEN 20 AND 29 THEN 'age6'
WHEN res.Age BETWEEN 30 AND 39 THEN 'age7'
WHEN res.Age BETWEEN 40 AND 49 THEN 'age8'
ELSE 'age9'
END AS ageband
FROM Respondents AS res
INNER JOIN Answers as answer ON answer.respondentid=res.respondentid
INNER JOIN Questions as question ON answer.Answer=question.id
WHERE answer.Question='Q1' GROUP BY ageband ORDER BY res.Age ASC;
If the concatenated result comes back truncated, set the group_concat_max_len system variable to a higher value:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_group-concat
The result is truncated to the maximum length that is given by the group_concat_max_len system variable, which has a default value of 1024. The value can be set higher, although the effective maximum length of the return value is constrained by the value of max_allowed_packet.
Depending on your platform, you should also replace the '\n' separator with CHAR(13), CHAR(10), or '<br>'.
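If per-answer counts are wanted rather than one concatenated string, a different approach from GROUP_CONCAT is to group by both the age band and the answer. The sketch below tries that in an in-memory SQLite database (not MySQL); the sample respondents aged 2, 3 and 6, their answers, and the two band labels are assumptions chosen to match the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE respondents (respondentid INTEGER, age INTEGER, gender TEXT)")
conn.execute("CREATE TABLE answers (respondentid INTEGER, question TEXT, answer INTEGER)")
conn.executemany("INSERT INTO respondents VALUES (?, ?, ?)",
                 [(1, 2, "Male"), (2, 3, "Female"), (3, 6, "Male")])
conn.executemany("INSERT INTO answers VALUES (?, ?, ?)", [
    (1, "Q1", 1), (1, "Q1", 2), (1, "Q1", 3),
    (2, "Q1", 2), (2, "Q1", 3),
    (3, "Q1", 1), (3, "Q1", 2),
])

# Shared FROM/WHERE clause; the band expression is repeated in both queries.
band = """CASE WHEN r.age BETWEEN 1 AND 4 THEN '1-4'
               WHEN r.age BETWEEN 5 AND 10 THEN '5-10'
               ELSE 'other' END"""
base = """FROM respondents r
          JOIN answers a ON a.respondentid = r.respondentid
          WHERE a.question = 'Q1'"""

# How many times each answer was chosen within each age band.
per_answer = conn.execute(
    f"SELECT {band} AS ageband, a.answer, COUNT(*) {base} "
    "GROUP BY ageband, a.answer ORDER BY ageband, a.answer"
).fetchall()

# How many distinct respondents fall into each age band.
per_band = conn.execute(
    f"SELECT {band} AS ageband, COUNT(DISTINCT r.respondentid) {base} "
    "GROUP BY ageband ORDER BY ageband"
).fetchall()

print(per_answer)  # [('1-4', 1, 1), ('1-4', 2, 2), ('1-4', 3, 2), ('5-10', 1, 1), ('5-10', 2, 1)]
print(per_band)    # [('1-4', 2), ('5-10', 1)]
```

An answer nobody in a band chose (subject 3 for ages 5-10 here) simply has no row, so a zero would have to be filled in by the presentation layer or by joining against the list of possible answers.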