Confused by a HAVING query in SQL

I am using SQL Server Management Studio 2012 and have to write a query that shows which subject students have failed the most on the first attempt (the condition for failing is Point < 5.0), from this table:
StudentID | SubjectID | First/Second_Time | Point.
1 | 02 | 1 | 5.0
2 | 04 | 2 | 7.0
3 | 03 | 2 | 9
... etc
Here is my teacher's query:
SELECT SubjectID
FROM Result -- Result is the name of the table
WHERE [First/Second_Time] = 1 AND Point < 5
GROUP BY SubjectID
HAVING count(point) >= ALL
(
SELECT count(point)
FROM Result
WHERE [First/Second_Time] = 1 AND point < 5
GROUP BY SubjectID
)
I don't understand the reason for the HAVING clause. Isn't count(point) always >= ALL (SELECT count(point)
FROM Result
WHERE [First/Second_Time] = 1 AND point < 5
GROUP BY SubjectID)?
Also, it doesn't seem to show the subject that the most students failed on the first attempt. Thanks in advance, and sorry for my bad English.

The subquery is returning a list of the number of times a subject was failed (on the first attempt). It might be easier for you to see what it's doing if you run it like this:
SELECT SubjectID, count(point)
FROM Result
WHERE [First/Second_Time] = 1 AND point < 5
GROUP BY SubjectID
So if someone failed math twice and science once, the subquery would return:
2
1
You want to know which subject was failed the most (in this case, which subject was failed 2 or more times, since that is the highest number of failures in your subquery). So you count again (also grouping by subject), and use having to return only subjects with 2 or more failures (greater than or equal to the highest value in your subquery).
SELECT SubjectID
FROM Result
WHERE [First/Second_Time] = 1 AND Point < 5
GROUP BY SubjectID
HAVING count(point)...
See https://msdn.microsoft.com/en-us/library/ms178543.aspx for more examples.

Sounds like you are working on a project for a class, so I'm not even sure I should answer this, but here goes. The question is why the having clause is there. Have you read the descriptions for "having" and "all"?
All "Compares a scalar value with a single-column set of values".
The scalar value in this case is count(point) or the number of occurrences of a subject id with point less than 5. The single-column set in this case is a list of the number of occurrences of every subject that has less than 5 points.
The result of the comparison hinges on the ">=". "All" evaluates to true only if the comparison is true for every value in the subquery. The subquery returns the set of counts for all subjects meeting the <5 and 1st time requirement. If three subjects meet the <5 and 1st time criteria, with frequencies of 1, 2, and 3 respectively, then the main query has three candidate groups for the "having" test: 1, 2, and 3. Each main query count has to be >= every subquery count for that group to be kept. Going through step by step: the first value, 1, is >= 1 but isn't >= 2, so it drops because the "having" is false. The second value, 2, is >= 1 and >= 2 but is not >= 3, so it drops. The third value, 3, is >= 1, 2, and 3, so it evaluates true, and you end up returning the subject with the highest frequency.
This is fairly clear in the "remarks" section of the MSDN discussion of "All" keyword, but not as relates to your specific application.
Remember, MSDN is our friend!
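The step-by-step comparison above can be mimicked outside the database. This is only a minimal Python sketch, assuming three hypothetical subjects with first-attempt failure counts of 1, 2, and 3 (the subject IDs and the helper name keep_max_groups are made up for illustration); it is not how SQL Server evaluates the query internally.

```python
# Hypothetical first-attempt failure counts per SubjectID (illustrative data only).
fail_counts = {"02": 1, "03": 2, "04": 3}


def keep_max_groups(counts):
    """Emulate: HAVING count(point) >= ALL (subquery returning every group's count)."""
    all_counts = list(counts.values())  # what the subquery would return
    # A group survives only if its count is >= every count in the subquery result.
    return [subject for subject, n in counts.items()
            if all(n >= other for other in all_counts)]


top_subjects = keep_max_groups(fail_counts)
print(top_subjects)
```

If two subjects tie for the highest count, both pass the test, which is the same behaviour the ">= ALL" form would give.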

Related

Selecting rows and removing duplicates based on two fields, and limiting to the top ten?

Having this table:
Row Athlete Event Mark Meet
1 1 3 10 A
2 2 2 5 A
3 3 3 3 A
4 4 4 7 A
5 2 2 4 A
6 3 2 5 B
7 1 1 10 C
How can I select all rows but remove duplicate rows that have the same athlete in the same event (fields Athlete and Event), keeping the lowest (or highest) Mark for that athlete? I would also like to limit each event to its top 10 athletes (not shown in the results).
Expected Output (choosing highest mark), (row 5 is removed)
Row Athlete Event Mark Meet
1 1 3 10 A
2 2 2 5 A
3 3 3 3 A
4 4 4 7 A
6 3 2 5 B
7 1 1 10 C
Thanks for the help. The query that did what I wanted (minus the top ten) is:
SELECT [tblPerformanceData-FieldBoys].Eventnum, [tblPerformanceData-FieldBoys].Mark, [tblPerformanceData-FieldBoys].Meet, [tblPerformanceData-FieldBoys].CY, [tblPerformanceData-FieldBoys].AthleteID, [tblPerformanceData-FieldBoys].MeetID
FROM [tblPerformanceData-FieldBoys] INNER JOIN MaxAthleteByEventBoysField ON ([tblPerformanceData-FieldBoys].AthleteID = MaxAthleteByEventBoysField.AthleteID) AND ([tblPerformanceData-FieldBoys].Mark = MaxAthleteByEventBoysField.MaxOfMark) AND ([tblPerformanceData-FieldBoys].Eventnum = MaxAthleteByEventBoysField.Eventnum)
GROUP BY [tblPerformanceData-FieldBoys].Eventnum, [tblPerformanceData-FieldBoys].Mark, [tblPerformanceData-FieldBoys].Meet, [tblPerformanceData-FieldBoys].CY, [tblPerformanceData-FieldBoys].AthleteID, [tblPerformanceData-FieldBoys].MeetID
ORDER BY [tblPerformanceData-FieldBoys].Mark DESC;
You can do it using cascading queries. Try running a group-by query on the main table that includes only the athlete, event, and mark. Apply Max or Min to the mark, depending on the outcome you're looking for. Use this query as the source for a second query that links back to the initial table on the athlete, event, and mark fields.
That solves the first part. I'm not sure how to get the top ten for each event using queries.
I don't own or have access to MS Access, but I can give you SQL, and hope Access will support some basic syntax.
Option 1: it's easier if Row is your primary key but you do not need to return it in the result; in this case you can even get both MIN and MAX of the Mark for the same athlete in the same row using a simple query:
SELECT
Athlete, Event, Meet, MAX(Mark) AS HighestMark, MIN(Mark) AS LowestMark
FROM
MyTable
GROUP BY
Athlete, Event, Meet
Note: I assumed you also want to group by Meet, but if that's not the case, you could remove it from GROUP BY, but then its value loses meaning in the result.
Option 2: Row is primary key, but you do need to return it - obviously in this case min and max cannot be returned in the same row and the query looks quite different:
SELECT
Row, Athlete, Event, Mark, Meet
FROM
MyTable m0
WHERE m0.Row IN
(SELECT MAX(Row)
FROM MyTable m1
WHERE
Athlete = m0.Athlete AND
Event = m0.Event AND
Meet = m0.Meet AND
Mark = (SELECT MAX(Mark)
FROM MyTable
WHERE
Athlete = m1.Athlete AND
Event = m1.Event AND
Meet = m1.Meet)
GROUP BY
Athlete, Event, Meet, Mark)
A few notes:
the above query returns MAX(Mark); change it to MIN(Mark) to return the lowest values
this query could be rewritten with JOINs as well; I'm not sure which method Access likes better (i.e. runs faster)
it has 2 sub-queries; the outer sub-query, MAX(Row), is there to make sure only 1 row is selected if the same athlete in the same meet and event gets the same Mark more than once; in that case, the greater Row is returned
it is possible to return both MIN and MAX with one query (as separate rows) at the expense of additional sub-queries, but you didn't ask for that
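The "group by, then join back" idea from the answers above can be tried on the sample data from the question. This is a rough sketch that uses Python's built-in sqlite3 module rather than Access; the table name results and the lower-case column names (row_id, athlete, event, mark, meet) are assumptions for illustration. It groups by athlete and event only, as the question asks, and does not attempt the top-ten part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    "row_id INTEGER PRIMARY KEY, athlete INTEGER, event INTEGER, "
    "mark INTEGER, meet TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 3, 10, "A"),
        (2, 2, 2, 5, "A"),
        (3, 3, 3, 3, "A"),
        (4, 4, 4, 7, "A"),
        (5, 2, 2, 4, "A"),
        (6, 3, 2, 5, "B"),
        (7, 1, 1, 10, "C"),
    ],
)

# Inner query: best mark per (athlete, event).
# Outer join: recover the full rows that carry that best mark.
best_rows = conn.execute(
    """
    SELECT r.row_id, r.athlete, r.event, r.mark, r.meet
    FROM results AS r
    JOIN (
        SELECT athlete, event, MAX(mark) AS best_mark
        FROM results
        GROUP BY athlete, event
    ) AS b
      ON b.athlete = r.athlete
     AND b.event = r.event
     AND b.best_mark = r.mark
    ORDER BY r.row_id
    """
).fetchall()

for row in best_rows:
    print(row)
```

Note that if an athlete achieves the same best mark twice in one event, this form returns both rows; the MAX(Row) sub-query in Option 2 is one way to break such ties.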

How to combine multiple condition in table join?

Simply put, I have two rule tables: one lists all the rules, the other lists the details of the rules:
Rule_ID Rule_Name
1 "Rule Name 1"
2 "Rule Name 2"
Target_Rule_ID Condition
1 >10
1 <20
1 !=15
1 !=18
2 >30
Meaning: for Rule_ID number 1, the value is more than 10, less than 20, and not equal to 15 or 18.
I need to apply this rule to another data table, like:
ID Value
1 11
2 60
3 15
And make the result like:
ID Value Rule_ID
1 11 1
2 60 2
3 15 null
The only method I can think of is to use a high-level language like Python:
get the rules all out
make the where clause
join the table one by one
But that sounds inefficient, since it means I need to join the rules with the data table X times (X = total number of rules).
Is there a better way to do this directly in SQL Server? Any suggestions?
(Also assume the rules don't conflict with each other, that would make the problem even harder)...
Regardless of where you do it, you need a way to separate the numerical value from its rule expression (condition, like <10 etc). Have you thought about separating expressions and values?
Something like
rule_details table:
t_rule_id rule_type Value
1 > 10
1 < 20
and then join that set to the data to be checked/validated. With a series of case expressions like
case
when rule_type = '>' and value > other_value then true
when rule_type = '>' and value <= other_value then false
...
end as rule_satisfied
you can create a column to validate each number against the criteria set out in the rule details. At that point you can do a logical AND on each group created -> if TRUE then all rules satisfied.
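The approach described above (split each condition into an operator and a number, evaluate each with CASE, then AND the results per group) might look roughly like the following. This is a sketch using Python's built-in sqlite3 module, not SQL Server; the table names rule_details and readings and the columns op and val are assumptions for illustration, and MIN(...) = 1 stands in for the logical AND over each group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rule_details (rule_id INTEGER, op TEXT, val INTEGER)")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")

# Rule conditions from the question, split into operator and number.
conn.executemany(
    "INSERT INTO rule_details VALUES (?, ?, ?)",
    [(1, ">", 10), (1, "<", 20), (1, "!=", 15), (1, "!=", 18), (2, ">", 30)],
)
conn.executemany("INSERT INTO readings VALUES (?, ?)", [(1, 11), (2, 60), (3, 15)])

# For every (reading, rule) pair, score each condition as 1 or 0.
# MIN(...) = 1 means every condition of that rule held (a logical AND).
assigned = conn.execute(
    """
    SELECT d.id, d.value, m.rule_id
    FROM readings AS d
    LEFT JOIN (
        SELECT d2.id AS id, r.rule_id AS rule_id
        FROM readings AS d2
        CROSS JOIN rule_details AS r
        GROUP BY d2.id, r.rule_id
        HAVING MIN(CASE
                     WHEN r.op = '>'  AND d2.value >  r.val THEN 1
                     WHEN r.op = '<'  AND d2.value <  r.val THEN 1
                     WHEN r.op = '!=' AND d2.value <> r.val THEN 1
                     ELSE 0
                   END) = 1
    ) AS m ON m.id = d.id
    ORDER BY d.id
    """
).fetchall()

for row in assigned:
    print(row)
```

This needs only one pass over the rule table instead of one join per rule. If rules overlapped, a value could match more than one rule and would appear once per matching rule.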

Select column based on another column's value - SQL

I have a table of rates and terms from which I need to use the term to select the appropriate rate adjustment. The issue is, each term is its own column as so:
term 12mon 24mon 36mon
----- ----- ----- -----
12 2 4 6
24 2 4 6
I need, for each term, to return the correct adjuster.
Thus for 12 months I would need a "2" and for 24 it would be "4" and so on. While vastly oversimplified it captures the essence - I need to select a column name in a table based upon the value of another column in the same table.
I'm not able to change the source table.
Thanks in advance...
case is your friend
case term
when 12 then [12mon]
when 24 then [24mon]
when 36 then [36mon]
end as rate
if value of term can be between 12 and 24, etc. then write it this way (I'm not sure what your logic needs to be, but you get the idea)
case
when term < 12 then 0
when term < 24 then [12mon]
when term < 36 then [24mon]
when term < 48 then [36mon]
else [48mon]
end as rate
What flavor of SQL is that?
In most of them you can use a CASE WHEN (predicate) THEN x expression to pick a different column per row, and then alias the result.
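To see the CASE approach run end to end, here is a small sketch using Python's built-in sqlite3 module with the two sample rows from the question. The table name rates is an assumption, and SQLite quotes the digit-leading column names with double quotes where SQL Server would use [12mon].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names that start with a digit must be quoted.
conn.execute(
    'CREATE TABLE rates (term INTEGER, "12mon" INTEGER, "24mon" INTEGER, "36mon" INTEGER)'
)
conn.executemany("INSERT INTO rates VALUES (?, ?, ?, ?)", [(12, 2, 4, 6), (24, 2, 4, 6)])

# CASE picks a different source column for each row, based on that row's term.
rows = conn.execute(
    '''
    SELECT term,
           CASE term
               WHEN 12 THEN "12mon"
               WHEN 24 THEN "24mon"
               WHEN 36 THEN "36mon"
           END AS rate
    FROM rates
    ORDER BY term
    '''
).fetchall()

print(rows)
```

A term with no matching WHEN branch yields NULL, so an ELSE branch is worth adding if other terms can occur.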

Finding the maximum value of year difference

I have two tables here
BIODATA
ID NAME
1 A
2 B
YEAR
ID JOIN_YEAR GRADUATE_YEAR
1 1990 1991
2 1990 1993
I already use
select
NAME,
max(year(JOIN_YEAR) - year(GRADUATE_YEAR)) as MAX
from
DATA_DIRI
right join DATA_KARTU
ON BIODATA.ID = YEAR.ID;
but the result became:
+--------+------+
| NAME | MAX |
+--------+------+
| A | 3 |
+--------+------+
I have already tried many different kinds of joins, but I still can't figure out how to get NAME to be "B". Can anyone help me? Thanks a lot in advance.
If you select an aggregate and a non-aggregate column together without grouping by the non-aggregate column, then the row used for the non-aggregate column is essentially arbitrary.
Basically, max works like this: it gathers all rows for each group defined by GROUP BY (if there is no GROUP BY, all rows form one group), calculates the maximum, and puts that in the result.
But since you also selected a non-aggregate column, it needs a value for that, so what the database does here is just pick an arbitrary row. You might think 'well, why doesn't it pick the same row max did?' but what if you used avg or count? These have no single row associated with them, so the best it can do is pick arbitrarily. This is why this behaviour exists in general.
What you need to do is use a subquery. Something like select d1.id from data_diri d1 where d1.graduate_year - d1.join_year = (select max(d2.graduate_year - d2.join_year) from data_diri d2)
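The subquery idea can be checked against the two sample rows from the question. This is a sketch using Python's built-in sqlite3 module; the table names biodata and years and the integer year columns are assumptions based on the tables shown in the question, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE biodata (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE years (id INTEGER, join_year INTEGER, graduate_year INTEGER)")
conn.executemany("INSERT INTO biodata VALUES (?, ?)", [(1, "A"), (2, "B")])
conn.executemany("INSERT INTO years VALUES (?, ?, ?)", [(1, 1990, 1991), (2, 1990, 1993)])

# The subquery computes the single largest difference; the outer query then
# keeps only the row(s) whose own difference equals that maximum.
longest = conn.execute(
    """
    SELECT b.name, y.graduate_year - y.join_year AS diff
    FROM biodata AS b
    JOIN years AS y ON y.id = b.id
    WHERE y.graduate_year - y.join_year = (
        SELECT MAX(graduate_year - join_year) FROM years
    )
    """
).fetchall()

print(longest)
```

Because the name now comes from the same row that satisfies the WHERE condition, it is no longer picked arbitrarily.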

Possible To Run an SQL Loop to Increment The Value Being Selected?

I've looked into Dynamic SQL and the inc() function, but neither are really what I'm after.
Say I have a database like this:
grade name age
9 Bob 9
10 Sue 11
11 Larry 15
9 Joe 8
10 Carrot 10
I want to create a table that first selects all the rows with the lowest grade (9) then displays the oldest. It then goes through and searches for the next highest grade (10) and displays the oldest. Then goes to the next highest grade (11) and displays the oldest.
I'd like for them all to be in the same table and not have to write out a separate SQL call and different PHP variables for each grade.
This is the SQL call I have right now:
$query = "SELECT * FROM horses WHERE grade='1' ORDER BY points DESC LIMIT 1" or die(mysql_error());
Is there a way I can make the grade column increment until it reaches the highest number in the database?
Thanks for any suggestions.
You don't need a loop for this if I understand your request. Instead, you need a MAX() aggregate grouped by grade. The following method should work independently of your RDBMS. It relies on a JOIN against a subquery which returns the greatest age per group to get the age/group pair and join that back against the main table to retrieve the name (and other columns as needed).
SELECT
horses.grade,
horses.name,
horses.age
FROM
horses
JOIN (
SELECT grade, MAX(age) as maxage
FROM horses
GROUP BY grade
) ma ON horses.grade = ma.grade AND horses.age = ma.maxage
ORDER BY horses.grade ASC
Here is an example on SQLFiddle.com
Returns:
GRADE NAME AGE
9 Bob 9
10 Sue 11
11 Larry 15
It is generally far faster and less resource-intensive to do one query instead of multiple queries in a loop, so this should be the approach whenever possible.