How to compare two column values simultaneously in a SQL query

I need to create a query to find the names of drivers whose exam scores get lower as they take more exams. I have the following tables:
branch(branch_id, branch_name, branch_addr, branch_city, branch_phone);
driver(driver_ssn, driver_name, driver_addr, driver_city, driver_birthdate, driver_phone);
license(license_no, driver_ssn, license_type, license_class, license_expiry, issue_date, branch_id);
exam(driver_ssn, branch_id, exam_date, exam_type, exam_score);
**exam_date is a date.
So I am using the driver and exam tables. I would like to somehow check that exam_date > anotherDate while at the same time checking that exam_score < anotherScore.
*EDIT
This is what I came up with but I feel like some of the syntax is illegal. I keep getting a syntax error.
s.executeQuery("SELECT driver_name " +
"FROM driver " +
"WHERE driver.driver_ssn IN " +
"(SELECT e1.driver_ssn" +
"FROM exam e1" +
"WHERE e1.exam_score < " +
"(SELECT e2.exam_score FROM exam e2)" +
"AND e1.exam_date > " +
"(SELECT e2.exam_date FROM exam e2)");
EDIT! I got it working! Thanks for your input everyone!
SELECT driver.driver_name
FROM driver
WHERE driver.driver_ssn IN
(SELECT e1.driver_ssn
FROM exam e1, exam e2, driver d
WHERE e1.exam_score < e2.exam_score
AND e1.exam_date > e2.exam_date
AND e1.driver_ssn=e2.driver_ssn)

You will need to do a self join. See this example and work it to your schema.
select d.name,
es.date_taken as 'prev date',
es.score as 'prev score',
es_newer.date_taken as 'new date',
es_newer.score as 'new score'
from driver d
inner join exam_score es
on d.id = es.driver_id
left outer join exam_score es_newer
on d.id = es_newer.driver_id
and es_newer.date_taken > es.date_taken
and es_newer.score < es.score
where es_newer.id is not null
Here is a SQL Fiddle that I made to demonstrate.

A SELECT returns a set, and you cannot compare a single value against a set. You can try something along these lines. This is similar to yours and doesn't handle the three-exam case:
SELECT driver_name
FROM driver
JOIN exam e1 ON e1.driver_ssn = driver.driver_ssn
JOIN exam e2 ON e2.driver_ssn = driver.driver_ssn
WHERE e1.exam_score < e2.exam_score
AND e1.exam_date > e2.exam_date
The query selects all pairs of exams taken by a driver in which the later exam has the lower score.

A simple take on this problem would be to find drivers who took two exams and whose second score is lower.
To compare columns from the same table, SQL uses a self-join. Your join condition should include:
select e1.driver_ssn, e1.exam_type, e1.exam_score as score_before,
e2.exam_score as score_after
from exam e1 join exam e2 on (e1.driver_ssn = e2.driver_ssn and
e1.exam_type = e2.exam_type and
e1.exam_date < e2.exam_date and
e1.exam_score > e2.exam_score)
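As a rough illustration of the self-join above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The driver names, SSNs, dates, and scores are invented for the demonstration, and the tables are trimmed to the columns the query needs; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE driver (driver_ssn TEXT PRIMARY KEY, driver_name TEXT);
    CREATE TABLE exam (driver_ssn TEXT, exam_date TEXT,
                       exam_type TEXT, exam_score INTEGER);

    -- Hypothetical sample data: Ann's score drops, Bob's improves.
    INSERT INTO driver VALUES ('111', 'Ann'), ('222', 'Bob');
    INSERT INTO exam VALUES
        ('111', '2020-01-10', 'L', 90),
        ('111', '2020-03-15', 'L', 70),
        ('222', '2020-01-12', 'L', 60),
        ('222', '2020-04-01', 'L', 85);
""")

# Pair every exam (e1) with a later exam (e2) of the same driver and type,
# and keep only the pairs where the later score is lower.
rows = conn.execute("""
    SELECT DISTINCT d.driver_name
    FROM driver d
    JOIN exam e1 ON e1.driver_ssn = d.driver_ssn
    JOIN exam e2 ON e2.driver_ssn = e1.driver_ssn
                AND e2.exam_type  = e1.exam_type
                AND e2.exam_date  > e1.exam_date
                AND e2.exam_score < e1.exam_score
    ORDER BY d.driver_name
""").fetchall()

declining_drivers = [name for (name,) in rows]
print(declining_drivers)
```

DISTINCT keeps a driver from being listed once per qualifying pair of exams. The dates are stored as ISO-8601 text here, which is an assumption of this sketch; it makes plain string comparison order them correctly in SQLite.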

Related

Query-ing data to get average grade

I am trying to get the average grade by querying data from different tables and while it seems like everything is as it is supposed to be, the query does not work. Tables are the following:
grade(id_grade, id_subject, student_code, grade)
student(student_code, first_name, last_name, birthdate, average_grade)
subject(id_subject, credits, subject_code, name, teacher)
Average grade is calculated using the following formula: (sum of all matching grades (that correspond to student code in student) * average grade from student table + credits from subject * newest added grade) / (sum of all matching grades + credits from the subject).
The query is the following:
UPDATE student AS s
SET
average_grade = (
(SUM(g.grade) * s.average_grade + sub.credits * g.grade) /
(SUM(g.grade) + sub.credits) )
FROM grade AS g, subject AS sub
WHERE s.student_code = g.student_code AND
g.id_subject = sub.id_subject AND s.student_code = 141837;
Keep in mind that this is PostgreSQL, so some syntax may differ from other SQL dialects, but this is what I was able to come up with.
I also tried the following, but it yielded no results either:
UPDATE student AS s
LEFT JOIN grade AS g ON (s.student_code = g.student_code)
LEFT JOIN subject AS sub ON (g.id_subject = sub.id_subject)
SET average_grade = (
(SUM(g.grade) * s.average_grade + sub.credits * g.grade) /
(SUM(g.grade) + sub.credits))
WHERE s.student_code = 141837;
Any help is highly appreciated!
You need to aggregate as a subquery. I don't think the average calculation is correct; I think you want something more like this:
UPDATE student s
SET average_grade = g.average_grade
FROM (SELECT g.student_code,
SUM(g.grade * s.credits) / SUM(s.credits) as average_grade
FROM grade g JOIN
subject s
ON g.id_subject = s.id_subject
GROUP BY g.student_code
) g
WHERE s.student_code = g.student_code AND
s.student_code = 141837;
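The idea of computing the aggregate in a subquery and only assigning its result in the UPDATE can be tried out with the sketch below, which uses Python's built-in sqlite3 module. It is not the PostgreSQL statement above: it uses a correlated subquery rather than UPDATE ... FROM so that it also runs on older SQLite versions. The credits and grades are invented sample data, and the credit-weighted average is this answer's interpretation, not the formula from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subject (id_subject INTEGER PRIMARY KEY, credits INTEGER);
    CREATE TABLE student (student_code INTEGER PRIMARY KEY, average_grade REAL);
    CREATE TABLE grade (id_grade INTEGER PRIMARY KEY, id_subject INTEGER,
                        student_code INTEGER, grade INTEGER);

    -- Hypothetical sample data.
    INSERT INTO subject VALUES (1, 3), (2, 6);
    INSERT INTO student VALUES (141837, NULL);
    INSERT INTO grade VALUES (1, 1, 141837, 5), (2, 2, 141837, 2);
""")

# The aggregate lives in a subquery; the UPDATE itself only assigns its result.
conn.execute("""
    UPDATE student
    SET average_grade = (
        SELECT SUM(g.grade * sub.credits) * 1.0 / SUM(sub.credits)
        FROM grade g
        JOIN subject sub ON sub.id_subject = g.id_subject
        WHERE g.student_code = student.student_code
    )
    WHERE student_code = 141837
""")

(average,) = conn.execute(
    "SELECT average_grade FROM student WHERE student_code = 141837"
).fetchone()
print(average)
```

With this sample data the weighted sum is 5*3 + 2*6 = 27 over 9 credits, so the stored average is 3.0.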

How to find highest count of result set using multiple tables in SQL (Oracle)

I have four tables. Here are the skeletons...
ACADEMIC_TBL
academic_id
academic_name
AFFILIATION_TBL
academic_id*
institution_id*
joined_date
leave_date
INSTITUTION_TBL
institution_id
institution_name
REVIEW_TBL
academic_id*
institution_id*
date_posted
review_score
Using these tables I need to find the academic (displaying their name, not ID) with the highest number of reviews and the institution name (not ID) they are currently affiliated with. I imagine this will need to be done using multiple sub-select scripts but I'm having trouble figuring out how to structure it.
This will work:
SELECT at.academic_name,
it.institution_name,
Max(rt.review_score)
FROM academic_tbl at,
affiliation_tbl afft,
institution_tbl it,
review_tbl rt
WHERE at.academic_id=afft.academic_id
AND afft.institution_id=it.institution_id
AND afft.academic_id=rt.academic_id
GROUP BY at.academic_name, it.institution_name
You need an aggregated query that JOINs all 4 tables to count how many reviews were performed by each academic.
Query :
SELECT
inst.institution_name,
aca.academic_name,
COUNT(*)
FROM
academic_tbl aca
INNER JOIN affiliation_tbl aff ON aff.academic_id = aca.academic_id
INNER JOIN institution_tbl inst ON inst.institution_id = aff.institution_id
INNER JOIN review_tbl rev ON rev.academic_id = aca.academic_id AND rev.institution_id = aff.institution_id
GROUP BY
inst.institution_name,
aca.academic_name,
inst.institution_id,
aca.academic_id
NB :
added the academic and institution id to the GROUP BY clause to prevent potential academics or institutions having the same name from being (wrongly) grouped together
if the same academic performed reviews for different institutions, then you will find one row for each academic / institution couple, which, if I understood you right, is what you want
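To narrow the counts down to the single academic with the most reviews, together with their current institution, one possible shape is sketched below with Python's built-in sqlite3 module. It assumes that a NULL leave_date marks the current affiliation, which the question does not state, and all names and rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE academic_tbl (academic_id INTEGER PRIMARY KEY, academic_name TEXT);
    CREATE TABLE institution_tbl (institution_id INTEGER PRIMARY KEY,
                                  institution_name TEXT);
    CREATE TABLE affiliation_tbl (academic_id INTEGER, institution_id INTEGER,
                                  joined_date TEXT, leave_date TEXT);
    CREATE TABLE review_tbl (academic_id INTEGER, institution_id INTEGER,
                             date_posted TEXT, review_score INTEGER);

    -- Hypothetical sample data.
    INSERT INTO academic_tbl VALUES (1, 'Dr. Lee'), (2, 'Dr. Kim');
    INSERT INTO institution_tbl VALUES (10, 'North University'), (20, 'South College');
    -- Assumption: a NULL leave_date marks the current affiliation.
    INSERT INTO affiliation_tbl VALUES
        (1, 10, '2010-01-01', '2015-12-31'),
        (1, 20, '2016-01-01', NULL),
        (2, 10, '2012-01-01', NULL);
    INSERT INTO review_tbl VALUES
        (1, 10, '2014-05-01', 4),
        (1, 20, '2017-02-01', 5),
        (1, 20, '2018-03-01', 3),
        (2, 10, '2013-06-01', 5);
""")

# Count reviews per academic first, keep the highest count,
# then attach the name and the current institution.
top = conn.execute("""
    SELECT aca.academic_name, inst.institution_name, cnt.review_count
    FROM (SELECT academic_id, COUNT(*) AS review_count
          FROM review_tbl
          GROUP BY academic_id) AS cnt
    JOIN academic_tbl aca ON aca.academic_id = cnt.academic_id
    JOIN affiliation_tbl aff ON aff.academic_id = aca.academic_id
                            AND aff.leave_date IS NULL
    JOIN institution_tbl inst ON inst.institution_id = aff.institution_id
    WHERE cnt.review_count = (SELECT MAX(n)
                              FROM (SELECT COUNT(*) AS n
                                    FROM review_tbl
                                    GROUP BY academic_id) AS per_academic)
""").fetchall()
print(top)
```

If several academics tie for the highest count, this returns one row for each of them.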
Try this one:
select
inst.institution_name
, aca.academic_name
from
academic_tbl aca
, institution_tbl inst
, affiliation_tbl aff
, review_tbl rev
, (
select
max(rt.review_score) max_score
from
review_tbl rt
, affiliation_tbl aff_inn
where
rt.date_posted >= aff_inn.joined_date
and rt.date_posted <= aff_inn.leave_date
and rt.academic_id = aff_inn.academic_id
and rt.institution_id = aff_inn.institution_id
)
agg
where
aca.academic_id = aff.academic_id
and inst.institution_id = aff.institution_id
and aff.institution_id = rev.institution_id
and aff.academic_id = rev.academic_id
and rev.date_posted >= aff.joined_date
and rev.date_posted <= aff.leave_date
and rev.review_score = agg.max_score
;
It might return more than one academic if several share the same maximum score.

SQLPLUS (Oracle) - WHERE subquery to determine if date expression greater than n Days

I am new to SQL and have had some trouble grasping the use of subqueries and where to place them with respect to the outer query. The queries below are part of a problem I've been working on, and I can't seem to get the desired results.
I need to extract the number of days between a start and end date, then check whether that is greater than 2 and apply that to the outer query. This particular attempt returns "Missing Expression", while other iterations (the second query below) have returned an error stating that the inner query returns multiple rows. (Modifying it to use the "ALL" keyword did not produce the right results either.)
QUERY1
SELECT P.PETNAME,P.PETTYPE
FROM PETTREATMENT PT, EXAMINATION E, PET P, TREATMENT T
WHERE PT.EXAMNO = E.EXAMNO
AND E.PETNO = P.PETNO
AND PT.TREATNO = T.TREATNO
AND (SELECT TO_DATE(PETTREATMENT.ENDDATE,'DD-MON-YYYY') - TO_DATE(PETTREATMENT.STARTDATE,'DD-MON-YYYY') AS TOTALDAYS FROM PETTREATMENT WHERE TOTALDAYS > 2)
AND T.COST >100
ORDER BY P.PETNAME;
QUERY 2
SELECT P.PETNAME,P.PETTYPE
FROM PETTREATMENT PT, EXAMINATION E, PET P, TREATMENT T
WHERE PT.EXAMNO = E.EXAMNO
AND E.PETNO = P.PETNO
AND PT.TREATNO = T.TREATNO
AND 2 < (SELECT TO_DATE(PETTREATMENT.ENDDATE,'DD-MON-YYYY') - TO_DATE(PETTREATMENT.STARTDATE,'DD-MON-YYYY') FROM PETTREATMENT)
AND T.COST >100
ORDER BY P.PETNAME;
Avoid using old-style joins. Because enddate and startdate come from pettreatment table, and you are already selecting from it, you can just specify the condition in the where clause. No need for a select.
SELECT P.PETNAME,P.PETTYPE
FROM PETTREATMENT PT
JOIN EXAMINATION E ON PT.EXAMNO = E.EXAMNO
JOIN PET P ON E.PETNO = P.PETNO
JOIN TREATMENT T ON PT.TREATNO = T.TREATNO
WHERE TO_DATE(PT.ENDDATE,'DD-MON-YYYY') - TO_DATE(PT.STARTDATE,'DD-MON-YYYY') > 2
AND T.COST > 100
ORDER BY P.PETNAME;
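The same shape, filtering on an expression computed from columns of a table that is already joined, can be tried out with the sketch below using Python's built-in sqlite3 module. SQLite has no TO_DATE, so julianday() stands in for Oracle's date subtraction here; the tables are reduced to the columns the query touches and all rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pet (petno INTEGER PRIMARY KEY, petname TEXT, pettype TEXT);
    CREATE TABLE examination (examno INTEGER PRIMARY KEY, petno INTEGER);
    CREATE TABLE treatment (treatno INTEGER PRIMARY KEY, cost REAL);
    CREATE TABLE pettreatment (examno INTEGER, treatno INTEGER,
                               startdate TEXT, enddate TEXT);

    -- Hypothetical sample data.
    INSERT INTO pet VALUES (1, 'Rex', 'Dog'), (2, 'Tom', 'Cat');
    INSERT INTO examination VALUES (100, 1), (200, 2);
    INSERT INTO treatment VALUES (7, 150.0);
    INSERT INTO pettreatment VALUES
        (100, 7, '2021-03-01', '2021-03-05'),  -- lasts 4 days
        (200, 7, '2021-03-01', '2021-03-02');  -- lasts 1 day
""")

# No subquery needed: the day difference is computed per joined row.
# julianday() is SQLite's stand-in here for Oracle's date subtraction.
rows = conn.execute("""
    SELECT p.petname, p.pettype
    FROM pettreatment pt
    JOIN examination e ON pt.examno = e.examno
    JOIN pet p ON e.petno = p.petno
    JOIN treatment t ON pt.treatno = t.treatno
    WHERE julianday(pt.enddate) - julianday(pt.startdate) > 2
      AND t.cost > 100
    ORDER BY p.petname
""").fetchall()
print(rows)
```

Only the treatment lasting more than two days survives the filter, so only Rex is returned.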

Paid Amount Greater than Total Amount Paid

I have an Access DB I am maintaining for a client.
I have 4 tables. Claims, Eligibility, Pharmacy, and Codes.
The Primary Key I am using is PHID + SID = MemberID which I am linking to each table, and then the Codes table is merely used for a description. See queries below for better visualization of that...
Query 1: Member_Claims_Query
SELECT
Eligibility.GROUPID,
Eligibility.PHID & '-' & Eligibility.SID AS MemberID,
[Eligibility].[DOB] AS DOB,
Eligibility.GENDER,
Eligibility.RELATIONSHIP_CODE,
MaxDiagDollars.HighestDiagPaid/SUM(Claims.PAID_AMT) AS ['%'],
MaxDiagDollars.HighestDiagPaid/SUM(Claims.PAID_AMT) as 'Percent',
ROUND(SUM(Claims.PAID_AMT)) AS TOTALPAID,
ROUND(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2011',Claims.PAID_AMT,0))) AS 2011TOTALPAID,
ROUND(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2012',Claims.PAID_AMT,0))) AS 2012TOTALPAID,
ROUND(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2013',Claims.PAID_AMT,0))) AS 2013TOTALPAID
FROM (Claims
INNER JOIN Eligibility
ON (Claims.[SID] = Eligibility.[SID]) AND (Claims.[PHID] = Eligibility.[PHID]))
INNER JOIN (SELECT PHID, SID, MAX(TotalPaid) AS HighestDiagPaid
FROM (SELECT [PHID], [SID], DIAG_CODE1, SUM(PAID_AMT) AS TotalPaid FROM Claims GROUP BY [PHID], [SID], [DIAG_CODE1]) AS [%$###_Alias] GROUP BY PHID, SID) AS MaxDiagDollars ON ( MaxDiagDollars.[PHID]=Eligibility.[PHID] ) AND ( MaxDiagDollars.[SID] = Eligibility.[SID] )
WHERE Eligibility.DOB < DateAdd( 'y', -2, DATE())
GROUP BY
Eligibility.GROUPID, Eligibility.PHID & '-' & Eligibility.SID, [Eligibility].[DOB], Eligibility.GENDER, Eligibility.RELATIONSHIP_CODE, MaxDiagDollars.HighestDiagPaid
HAVING SUM(Claims.PAID_AMT)>10000 and MaxDiagDollars.HighestDiagPaid/SUM(Claims.PAID_AMT) <= 0.80;
This query is supposed to take the total amount paid per member and give a total amount paid, plus yearly breakouts.
Query 2: Member_By_Diag
SELECT
Eligibility.PHID & '-' & Eligibility.SID AS MemberID,
Claims.Diag_Code1,
ROUND(Sum(Claims.PAID_AMT)) AS TotalPaid,
ROUND(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2011',Claims.PAID_AMT,0))) AS 2011TotalPaid,
ROUND(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2012',Claims.PAID_AMT,0))) AS 2012TotalPaid,
ROUND( Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2013',Claims.PAID_AMT,0))) AS 2013TotalPaid
FROM
(Claims
INNER JOIN Eligibility
ON (Claims.[SID] = Eligibility.[SID]) AND (Claims.[PHID] = Eligibility.[PHID]))
INNER JOIN Pharmacy
ON (Eligibility.SID = Pharmacy.SID) AND (Eligibility.PHID = Pharmacy.PHID)
GROUP BY
Eligibility.PHID & '-' & Eligibility.SID, Claims.Diag_Code1
HAVING count( [Pharmacy].[NDC] ) >4 and count(IIF(Claims.REV_CODE= '450',1,0) ) > 1
ORDER BY Eligibility.PHID & '-' & Eligibility.SID;
The second query is essentially supposed to take the codes for each member and break out their amounts paid by diagnosis code.
Query 3: combined_query
SELECT *
FROM (Member_Claims_Query AS a INNER JOIN Member_by_Diag AS b ON a.MemberID=b.MemberID) INNER JOIN Codes AS c ON c.DxCode = b.Diag_Code1;
ISSUE
My client sent me an e-mail stating that the Total Paid in the Member_By_Diag query is sometimes higher than the Total Paid in the Member_Claims_Query, yet they are being computed the same way.
I opened up the DB and wrote a simple query to see how many records were returned where b.Total_Paid (Member_By_Diag.Total_Paid) is greater than Member_Claims_Query.Total_Paid.
It returned 262/1278 records where this was the case.
SELECT * FROM Combined_Query WHERE b_TotalPaid > a_TotalPaid
This picture accurately describes what my client and I are seeing.
As you can see, a_TotalPaid > b_TotalPaid. But if you look at my queries, they are computed the same way. Is this a GROUP BY issue or a join issue? Any help would be much appreciated.
There are a couple of differences between the queries that could be contributing to this. The main culprits are your INNER JOIN statements and your HAVING statement. Those could easily be excluding records that will have an effect on your TotalPaid field. Without the original dataset, there's not much I can tell you, but you may want to run those queries and play with removing and inserting the various INNER JOIN and HAVING clauses to see which one is deleting the records that are causing your totals to not be equal.
I appreciate the answers, everyone. You didn't exactly answer the question, but the inner join on Pharmacy was causing the issue. I was using it specifically for the HAVING clause, and when I added a COUNT(*) I noticed it was actually multiplying my results. For example, if a member had 7 claims and 6 Pharmacy records, the join produced 42 records, making my total paid amounts extremely high and unrelated to the claims themselves, hence the ultimate issue. Here is the solution in the Member_By_Diag query:
SELECT Eligibility.PHID & '-' & Eligibility.SID AS MemberID, Claims.Diag_Code1, Round(Sum(Claims.PAID_AMT)) AS TotalPaid, Round(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2011',Claims.PAID_AMT,0))) AS 2011TotalPaid, Round(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2012',Claims.PAID_AMT,0))) AS 2012TotalPaid, Round(Sum(IIf(Format(Serv_Beg_Date,'yyyy')='2013',Claims.PAID_AMT,0))) AS 2013TotalPaid, Count(*) AS Expr1
FROM (Claims INNER JOIN Eligibility ON (Claims.[SID] = Eligibility.[SID]) AND (Claims.[PHID] = Eligibility.[PHID])) INNER JOIN (SELECT PHID, SID, COUNT(NDC) AS RXCount FROM Pharmacy GROUP BY PHID, SID ORDER BY PHID, SID) AS Pharmacy ON (Eligibility.SID = Pharmacy.SID) AND (Eligibility.PHID = Pharmacy.PHID)
GROUP BY Eligibility.PHID & '-' & Eligibility.SID, Claims.Diag_Code1
HAVING Count(IIf([Claims].[REV_CODE]='450',1,0))>1
ORDER BY Eligibility.PHID & '-' & Eligibility.SID;
This made the dollars look much more reasonable. Thanks everyone.
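The row multiplication described above is easy to reproduce. The sketch below uses Python's built-in sqlite3 module rather than Access, with an invented member who has 7 claims of 100 each and 6 pharmacy records; the lowercase table and column names are simplified stand-ins for the real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claims (phid TEXT, sid TEXT, paid_amt REAL);
    CREATE TABLE pharmacy (phid TEXT, sid TEXT, ndc TEXT);
""")
# Hypothetical member with 7 claims of 100 each and 6 pharmacy records.
conn.executemany("INSERT INTO claims VALUES ('P1', '01', ?)", [(100.0,)] * 7)
conn.executemany("INSERT INTO pharmacy VALUES ('P1', '01', ?)",
                 [(f"NDC{i}",) for i in range(6)])

# Joining the raw tables pairs every claim with every pharmacy row.
fanned_out = conn.execute("""
    SELECT COUNT(*), SUM(c.paid_amt)
    FROM claims c
    JOIN pharmacy p ON p.phid = c.phid AND p.sid = c.sid
""").fetchone()

# Collapsing pharmacy to one row per member first keeps the claim total intact.
pre_aggregated = conn.execute("""
    SELECT COUNT(*), SUM(c.paid_amt), MAX(rx.rx_count)
    FROM claims c
    JOIN (SELECT phid, sid, COUNT(ndc) AS rx_count
          FROM pharmacy
          GROUP BY phid, sid) AS rx
      ON rx.phid = c.phid AND rx.sid = c.sid
""").fetchone()

print(fanned_out)      # (42, 4200.0)
print(pre_aggregated)  # (7, 700.0, 6)
```

The pharmacy count is still available as rx_count after pre-aggregation, so a filter on the number of pharmacy records can be applied to that column instead of counting joined rows.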

Joining 3 tables in Google bigquery

The example below stops at the first JOIN with an error message
Encountered " "JOIN" "JOIN "" at line 13, column 4. Was expecting: ")"
Am I missing something obvious with multiple joins in Bigquery?
SELECT type.CourseType AS CourseType,
SUM(joined.assign.StudentCount) AS StudentN
FROM
(
SELECT assign.StateCourseCode,
assign.StateCourseName,
assign.MatchType,
assign.Term,
assign.StudentCount
FROM [Assignment.AssignmentExtract5] AS assign
JOIN SELECT wgt.Term,
wgt.Weight
FROM [Crosswalk.TermWeights] AS wgt
ON wgt.Term = assign.Term
) AS joined
JOIN SELECT type.CourseCode,
type.CourseDescription,
type.CourseType,
type.CourseCategory
FROM [Crosswalk.CourseTypeDescription] AS type
ON joined.assign.StateCourseCode = type.CourseCode
GROUP BY CourseType
Thanks Ryan, your help was much appreciated. For anyone who might be interested, here is a query that worked.
SELECT type.CourseCategory AS CourseCategory,
SUM(joined.assign.StudentCount) AS StudentN
FROM
(
SELECT assign.StateCourseCode,
assign.StateCourseName,
assign.MatchType,
assign.Term,
assign.StudentCount
FROM [Assignment.AssignmentExtract5] AS assign
JOIN (SELECT Term,
Weight
FROM [Crosswalk.TermWeights]) AS wgt
ON wgt.Term = assign.Term
) AS joined
JOIN (SELECT CourseCode,
CourseDescription,
CourseType,
CourseCategory
FROM [Crosswalk.CourseTypeDescription]) AS type
ON (joined.assign.StateCourseCode = type.CourseCode)
GROUP BY CourseCategory;
I think you're just missing a parenthesis on line 13.
This:
JOIN SELECT wgt.Term,
wgt.Weight
FROM [Crosswalk.TermWeights] AS wgt
ON wgt.Term = assign.Term
Should be:
JOIN (SELECT Term,
Weight
FROM [Crosswalk.TermWeights]) AS wgt
ON wgt.Term = assign.Term
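The parenthesization rule is not specific to BigQuery. As a small sketch, the same mistake and fix can be reproduced with Python's built-in sqlite3 module; the assignment and term_weights tables and their rows are hypothetical stand-ins for the BigQuery tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assignment (term TEXT, student_count INTEGER);
    CREATE TABLE term_weights (term TEXT, weight REAL);
    -- Hypothetical sample data.
    INSERT INTO assignment VALUES ('Fall', 30), ('Spring', 20);
    INSERT INTO term_weights VALUES ('Fall', 0.5), ('Spring', 1.0);
""")

# A bare SELECT after JOIN is rejected as a syntax error.
try:
    conn.execute("""
        SELECT a.term FROM assignment AS a
        JOIN SELECT term, weight FROM term_weights AS wgt
        ON wgt.term = a.term
    """)
    bare_select_failed = False
except sqlite3.OperationalError:
    bare_select_failed = True

# Wrapping the subquery in parentheses and aliasing it makes the join valid.
rows = conn.execute("""
    SELECT a.term, a.student_count * wgt.weight AS weighted_count
    FROM assignment AS a
    JOIN (SELECT term, weight FROM term_weights) AS wgt
      ON wgt.term = a.term
    ORDER BY a.term
""").fetchall()
print(bare_select_failed, rows)
```

Note that the alias goes outside the closing parenthesis, so columns inside the subquery are referenced without the wgt prefix.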
More info:
https://developers.google.com/bigquery/docs/query-reference#multiplejoinsexample
FYI - JOINs are not as fast as we'd like yet. We're working on improving the performance.