PostgreSQL show latest date and one value of a column from many - sql

I have a problem with a query that I can't figure out. I have been trying for some time without success, so any help would be appreciated. I have 4 tables:
cars - ID, make, model, plate_number, price, type, year, owner_ID
persons - ID, name, surname, pers_code
insurance_data - company_ID, car_ID, first_date, last_date
companies - ID, title
My query so far is..
SELECT cars.plate_number, persons.name, persons.surname, insurance_data.last_date
FROM cars,persons,insurance_data
WHERE cars.owner_ID = persons.ID AND cars.ID = insurance_data.car_ID
This outputs each car's plate number, the owner of the car, and the last date of the car's insurance. The problem is that two cars have two insurance end dates each, so the output contains two entries for the same car, one per end date. What I need is only one entry for each car, and the corresponding insurance end date should be the latest one.
I know this is pretty basic, but I'm a first-year student of databases, and this is one of my first assignments. Thanks in advance.

(1) Never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
(2) Use table aliases!
The answer to your question is DISTINCT ON:
SELECT DISTINCT ON (c.plate_number) c.plate_number, p.name, p.surname, id.last_date
FROM cars c
JOIN persons p ON c.owner_ID = p.ID
JOIN insurance_data id ON c.ID = id.car_ID
ORDER BY c.plate_number, id.last_date DESC;
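DISTINCT ON is specific to PostgreSQL. As a rough, portable alternative, the same "latest date per car" result can be obtained with GROUP BY and MAX. The sketch below runs that variant against an in-memory SQLite database from Python; the sample rows are made up for illustration, and only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE persons (ID INTEGER PRIMARY KEY, name TEXT, surname TEXT);
CREATE TABLE cars (ID INTEGER PRIMARY KEY, plate_number TEXT, owner_ID INTEGER);
CREATE TABLE insurance_data (company_ID INTEGER, car_ID INTEGER,
                             first_date TEXT, last_date TEXT);
INSERT INTO persons VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO cars VALUES (10, 'AA-111', 1), (20, 'BB-222', 2);
-- Car 10 has two insurance periods, car 20 has one.
INSERT INTO insurance_data VALUES
    (1, 10, '2019-01-01', '2019-12-31'),
    (2, 10, '2020-01-01', '2020-12-31'),
    (1, 20, '2020-03-01', '2021-02-28');
""")

# One row per car: MAX() picks the latest end date within each group.
latest_insurance = conn.execute("""
SELECT c.plate_number, p.name, p.surname, MAX(i.last_date) AS last_date
FROM cars c
JOIN persons p ON c.owner_ID = p.ID
JOIN insurance_data i ON c.ID = i.car_ID
GROUP BY c.plate_number, p.name, p.surname
ORDER BY c.plate_number
""").fetchall()

for row in latest_insurance:
    print(row)
```

The GROUP BY form is enough when only the date itself is needed. DISTINCT ON is more convenient when other columns from the same insurance row (for example company_ID) must come along with the latest date.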

Related

Grouping by country and discipline in SQLServer

I've got a problem I cannot solve in SQL Server. There are 3 tables with data about the Olympics.
Table 1: Dysciplines - contains DisciplineID (int,PK) and Discipline (varchar)
Table 2: Athletes - contains AthleteID (int, FK), Athlete (varchar, PK), Nationality (varchar) and DisciplineID(int,FK)
Table3: Medals - contains AthleteID(int, PK), Year(int) and Medals(int)
I want to extract all the countries that won more medals in their best discipline than in all the others combined. However, I am having problems with it.
I joined all the tables, but I'm not sure how to continue. I tried:
WHERE MAX(SUM(dbo.Medals.Medals))>SUM(dbo.Medals.Medals)-MAX(SUM(dbo.Medals.Medals))
GROUP BY tab1.Dyscypline
But this is clearly wrong. I will be grateful for any help.
You would use aggregation and having. Start with the number of medals in each discipline in each country:
select a.nationality, a.DisciplineID, sum(m.Medals) as num_medals
from athletes a join
medals m
on a.AthleteID = m.AthleteID
group by a.nationality, a.DisciplineID;
Then aggregate again:
select nationality
from (select a.nationality, a.DisciplineID, sum(m.Medals) as num_medals
from athletes a join
medals m
on a.AthleteID = m.AthleteID
group by a.nationality, a.DisciplineID
) am
group by nationality
having max(num_medals) > sum(num_medals) * 0.5;
That is, the maximum number of medals for a single discipline accounts for more than half of the country's medals, which is the same as saying the best discipline beats all the others combined. Note that the medals are summed from the Medals column rather than counted as rows, since each row of the Medals table holds a medal count.
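To see the two-level aggregation at work, here is a small sketch run against an in-memory SQLite database from Python. The rows are invented, and it assumes that the Medals column holds a medal count per athlete and year, so the inner query sums that column.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Athletes (AthleteID INTEGER, Athlete TEXT,
                       Nationality TEXT, DisciplineID INTEGER);
CREATE TABLE Medals (AthleteID INTEGER, Year INTEGER, Medals INTEGER);
INSERT INTO Athletes VALUES
    (1, 'A', 'Norway', 1), (2, 'B', 'Norway', 2),
    (3, 'C', 'Kenya', 1), (4, 'D', 'Kenya', 2);
-- Norway: 5 medals in discipline 1, 2 in discipline 2.
-- Kenya: 3 medals in each discipline (no strict majority).
INSERT INTO Medals VALUES
    (1, 2000, 5), (2, 2000, 2),
    (3, 2000, 3), (4, 2004, 3);
""")

# Inner query: medals per country and discipline.
# Outer query: keep countries whose best discipline exceeds half the total.
dominant_countries = conn.execute("""
SELECT nationality
FROM (SELECT a.Nationality AS nationality, a.DisciplineID,
             SUM(m.Medals) AS num_medals
      FROM Athletes a
      JOIN Medals m ON a.AthleteID = m.AthleteID
      GROUP BY a.Nationality, a.DisciplineID) am
GROUP BY nationality
HAVING MAX(num_medals) > SUM(num_medals) * 0.5
""").fetchall()

print(dominant_countries)
```

Kenya is excluded because its best discipline only ties the rest; the strict comparison matches "more medals than all the others combined".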

find an average of a column using group with inner join and then filtering through the groups

I've been trying to solve an sqlite question where I have two tables: Movies and movie_cast.
Movies has the columns: id, movie_title, and score. Here is a sample of the data:
11|Star Wars|76.496
62|2001:Space Odyssey|39.064
152|Start Trek|26.551
movie_cast has the columns: movie_id, cast_id, cast_name, birthday, popularity. Here is a sample.
11|2|Mark Hamill|9/25/51|15.015
11|3|Harrison Ford|10/21/56|8.905
11|5|Peter Cushing|05/26/13|6.35
In this case movies.id and movie_cast.movie_id are the same.
The question is to find the top ten cast members who have the highest average movie scores.
▪ Do not include movies with score <25 in the average score calculation.
▪ Exclude cast members who have appeared in two or fewer movies.
My query is below, but it doesn't seem to give me the right answer.
SELECT movie_cast.cast_id,
movie_cast.cast_name,
printf("%.2f",CAST(AVG(movies.score) as float)),
COUNT(movie_cast.cast_name)
FROM movies
INNER JOIN movie_cast ON movies.id = movie_cast.movie_id
WHERE movies.score >= 25
GROUP BY movie_cast.cast_id
HAVING COUNT(movie_cast.cast_name) > 2
ORDER BY AVG(movies.score ) DESC, movie_cast.cast_name ASC
LIMIT 10
The answers I get are in the format cast_id, cast_name, avg score.
An example is: 3 Harrison Ford 52.30
I've analyzed and re-analyzed my logic but to no avail. I'm not sure where I'm going wrong. Any help would be great!
Thank you!
This is how I would write the query:
SELECT mc.cast_id,
mc.cast_name,
PRINTF('%.2f', AVG(m.score)) avg_score
FROM movie_cast mc INNER JOIN movies m
ON m.id = mc.movie_id
WHERE m.score >= 25
GROUP BY mc.cast_id, mc.cast_name
HAVING COUNT(*) > 2
ORDER BY AVG(m.score) DESC, mc.cast_name ASC
LIMIT 10;
I use aliases for the tables to shorten the code and make it more readable.
There is no need to cast the average to a float because the average in SQLite is always a real number.
Both COUNT(movie_cast.cast_name) calls can be simplified to COUNT(*). The one in the SELECT list is not required by the task, so I left it out; add it back if you need it.
The function PRINTF() returns a string. If you want a number returned, use ROUND() instead:
ROUND(AVG(m.score), 2) avg_score
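The difference between the two return types can be checked with TYPEOF(). This is a minimal sketch using Python's built-in sqlite3 module and a made-up literal value instead of the real AVG(m.score):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Compare the value and storage class produced by PRINTF() and ROUND().
formatted, rounded, formatted_type, rounded_type = conn.execute("""
SELECT PRINTF('%.2f', 52.304),
       ROUND(52.304, 2),
       TYPEOF(PRINTF('%.2f', 52.304)),
       TYPEOF(ROUND(52.304, 2))
""").fetchone()

print(repr(formatted), formatted_type)  # a text value
print(repr(rounded), rounded_type)      # a real (floating point) value
```

PRINTF() keeps the trailing zero because the result is text, while ROUND() yields a number that the client formats as it likes.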

Oracle SQL query, getting a maximum of a sum

Hey, guys. I'm struggling to solve one query and just can't get my head around it.
Basically, I have some tables from a data mart:
DimTheatre(TheatreId(PK), TheatreNo, Name, Address, MainTel);
DimTrow(TrowId(PK), TrowNo, RowName, RowType);
DimProduction(ProductionId(PK), ProductionNo, Title, ProductionDir, PlayAuthor);
DimTime(TimeId(PK), Year, Month, Day, Hour);
TicketPurchaseFact( TheatreId(FK), TimeId(FK), TrowId(FK),
PId(FK), TicketAmount);
What I'm trying to achieve in Oracle: I need to retrieve the most popular row type in each theatre by value of ticket sales.
What I'm doing now is:
SELECT dthr.theatreid, dthr.name, max(tr.rowtype) keep(dense_rank last order
by tpf.ticketamount), sum(tpf.ticketamount) TotalSale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid
GROUP BY dthr.theatreid, dthr.name;
It does give me output, but the 'TotalSale' column is way off; the numbers are much higher than they should be. How could I approach this issue?
I am not sure how MAX() KEEP () would help here, if I understand the problem correctly. Your totals are inflated because DimTrow is never joined to the fact table, so every purchase row is paired with every row of DimTrow. The approach below should work:
SELECT x.theatreid, x.name, x.rowtype, x.total_sale
FROM
(SELECT z.theatreid, z.name, z.rowtype, z.total_sale, DENSE_RANK() OVER (PARTITION BY z.theatreid, z.name ORDER BY z.total_sale DESC) as popular_row_rank
FROM
(SELECT dthr.theatreid, dthr.name, tr.rowtype, SUM(tpf.ticketamount) as total_sale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid AND tr.trowid = tpf.trowid
GROUP BY dthr.theatreid, dthr.name, tr.rowtype) z
) x
WHERE x.popular_row_rank = 1;
You want the row type per theatre with the highest ticket amount. So join purchases and rows and then aggregate to get the total per rowtype. Use RANK to rank your row types per theatre and keep the best-ranked ones. Finally, join with the theatre table to get the theatre name.
select
theatreid,
t.name,
tr.rowtype
from
(
select
p.theatreid,
r.rowtype,
rank() over (partition by p.theatreid order by sum(p.ticketamount) desc) as rn
from ticketpurchasefact p
join dimtrow r using (trowid)
group by p.theatreid, r.rowtype
) tr
join dimtheatre t using (theatreid)
where tr.rn = 1;
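The aggregate-then-rank pattern can be tried outside Oracle as well. The sketch below uses Python's sqlite3 module with invented sample rows; it assumes an SQLite build with window function support (3.25 or later) and splits the aggregation and the ranking into two nested subqueries to keep each step visible.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimTheatre (TheatreId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE DimTrow (TrowId INTEGER PRIMARY KEY, RowType TEXT);
CREATE TABLE TicketPurchaseFact (TheatreId INTEGER, TrowId INTEGER,
                                 TicketAmount INTEGER);
INSERT INTO DimTheatre VALUES (1, 'Globe'), (2, 'Lyric');
INSERT INTO DimTrow VALUES (1, 'Stalls'), (2, 'Balcony');
INSERT INTO TicketPurchaseFact VALUES
    (1, 1, 100), (1, 1, 50), (1, 2, 120),
    (2, 1, 30), (2, 2, 80);
""")

# Step 1 (totals): total sales per theatre and row type.
# Step 2 (ranked): rank row types within each theatre by total sales.
# Step 3: keep rank 1 and join the theatre name.
top_row_types = conn.execute("""
SELECT t.TheatreId, t.Name, ranked.RowType, ranked.total_sale
FROM (
    SELECT totals.TheatreId, totals.RowType, totals.total_sale,
           RANK() OVER (PARTITION BY totals.TheatreId
                        ORDER BY totals.total_sale DESC) AS rn
    FROM (
        SELECT p.TheatreId, r.RowType, SUM(p.TicketAmount) AS total_sale
        FROM TicketPurchaseFact p
        JOIN DimTrow r ON r.TrowId = p.TrowId
        GROUP BY p.TheatreId, r.RowType
    ) totals
) ranked
JOIN DimTheatre t ON t.TheatreId = ranked.TheatreId
WHERE ranked.rn = 1
ORDER BY t.TheatreId
""").fetchall()

for row in top_row_types:
    print(row)
```

With RANK(), two row types that tie for the highest total in a theatre are both returned; ROW_NUMBER() would pick only one of them.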

SQL - Finding the AVG, MIN, MAX and SUM with ID

The task I have been given requires me to find the average, minimum, maximum and total cost of visits made by Tiddles the cat 'P0001' to vet Trevor McCafferty 'V04'. This will be drawn from two tables, pet and visit.
Pet table structure:
pet_id, Name, Type, Breed, Gender, Born, owner_id, Notes
Visit table structure:
visit_id, pet_id, vet_id, Visit_Date, Basic_Cost, Symptom, Treatment
Below is the command I have written so far, but I'm not sure whether it is correct, which is why I need help.
SELECT Name, Type, AVG(Basic_Cost), MIN(Basic_Cost), MAX(Basic_Cost), SUM(Basic_Cost)
FROM visit, pet
WHERE pet_id = 'P0001' AND vet_id = 'V04';
Any questions just ask and any help is appreciated as I'm stumped.
SELECT FIRST(pet.Name) AS PetName,
FIRST(pet.Type) AS PetType,
AVG(Basic_Cost) AS AverageCost,
MIN(Basic_Cost) AS MinCost,
MAX(Basic_Cost) AS maxCost,
SUM(Basic_Cost) AS TotalCost
FROM visit
INNER JOIN pet ON visit.pet_id = pet.pet_id
WHERE visit.pet_id = 'P0001'
AND visit.vet_id = 'V04'
I think you are missing GROUP BY
SELECT pet.Name, pet.Type, AVG(Basic_Cost), MIN(Basic_Cost), MAX(Basic_Cost), SUM(Basic_Cost)
FROM visit, pet
where visit.pet_id = pet.pet_id
and visit.pet_id = 'P0001'
and visit.vet_id = 'V04'
GROUP BY pet.Name, pet.Type;
Also you probably have to join visit and pet tables
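As a quick check of the join plus GROUP BY version, here is a sketch that runs it against an in-memory SQLite database from Python. The visit rows and costs are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pet (pet_id TEXT PRIMARY KEY, Name TEXT, Type TEXT);
CREATE TABLE visit (visit_id INTEGER PRIMARY KEY, pet_id TEXT,
                    vet_id TEXT, Basic_Cost REAL);
INSERT INTO pet VALUES ('P0001', 'Tiddles', 'Cat'), ('P0002', 'Rex', 'Dog');
INSERT INTO visit VALUES
    (1, 'P0001', 'V04', 20.0),
    (2, 'P0001', 'V04', 40.0),
    (3, 'P0001', 'V01', 100.0),  -- same pet, different vet: filtered out
    (4, 'P0002', 'V04', 10.0);   -- same vet, different pet: filtered out
""")

# Join visit to pet explicitly, filter to one pet and one vet,
# then aggregate the costs of the remaining visits.
cost_summary = conn.execute("""
SELECT pet.Name, pet.Type,
       AVG(visit.Basic_Cost) AS AverageCost,
       MIN(visit.Basic_Cost) AS MinCost,
       MAX(visit.Basic_Cost) AS MaxCost,
       SUM(visit.Basic_Cost) AS TotalCost
FROM visit
JOIN pet ON visit.pet_id = pet.pet_id
WHERE visit.pet_id = 'P0001'
  AND visit.vet_id = 'V04'
GROUP BY pet.Name, pet.Type
""").fetchone()

print(cost_summary)
```

Without the join condition, every visit row would be paired with every pet row, and the unqualified pet_id in the original query would be ambiguous.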

Postgresql - retrieve rows within criteria within 30 day span

I have the following tables
AdmittedPatients(pid, workerid, admitted, discharged)
Patients(pid, firstname, lastname, admitted, discharged)
DiagnosticHistory(diagnosisID, workerid, pid, timeofdiagnosis)
Diagnosis(diagnosisID, description)
Here is an SQL Fiddle: http://sqlfiddle.com/#!15/e7403
Things to note:
AdmittedPatients is a history of all admissions/discharges of patients at the hospital.
Patients contain all patients who have records at the hospital. Patients also lists who are currently staying at the hospital (i.e. discharged is NULL).
DiagnosticHistory contains all diagnosis made.
Diagnosis has the description of the diagnosis made
Here is my task: list patients who were admitted to the hospital within 30 days of their last discharge date. For each patient list their patient identification number, name, diagnosis, and admitting doctor.
This is what I've cooked up so far:
select pid, firstname, lastname, admittedpatients.workerid, patients.admitted, admittedpatients.discharged
from patients
join admittedpatients using (pid)
group by pid, firstname, lastname, patients.admitted, admittedpatients.workerid, admittedpatients.discharged
having patients.admitted <= admittedpatients.discharged;
This returns pids 0, 1, and 4 when it should return 0, 1, 2, and 4.
Not sure why you need GROUP BY or HAVING here, since there is no aggregate.
SELECT A.pid, firstname, lastname, A.workerid, P.admitted, A.discharged
FROM patients P
INNER JOIN admittedpatients A
on P.pID = A.pID
WHERE date_add(a.discharged, interval 30 day)>=p.admitted
and p.admitted >=a.discharged
updated fiddle: http://sqlfiddle.com/#!2/dc33c/30/0
I didn't get into returning all the fields you need, but since this gets the desired result set, I imagine it's just a series of joins from here.
Updated to postgresql:
SELECT A.pid, firstname, lastname, A.workerid, P.admitted, A.discharged
FROM patients P
INNER JOIN admittedpatients A
on P.pID = A.pID
WHERE a.discharged + interval '30 day' >= p.admitted
and p.admitted >= a.discharged
http://sqlfiddle.com/#!15/e7403/1/0
I didn't see any diagnostic info in the fiddle, so I didn't return any.
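The 30-day window condition can be tested on a small data set. The sketch below is an SQLite adaptation run from Python, not the PostgreSQL query itself: it assumes dates are stored as ISO 'YYYY-MM-DD' text and uses SQLite's date(x, '+30 days') in place of the PostgreSQL interval arithmetic. All rows are invented.

```python
import sqlite3

# Hypothetical sample data; dates are ISO text so they compare correctly as strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patients (pid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT,
                       admitted TEXT, discharged TEXT);
CREATE TABLE AdmittedPatients (pid INTEGER, workerid INTEGER,
                               admitted TEXT, discharged TEXT);
-- Patient 0 came back 15 days after discharge, patient 1 after about two months.
INSERT INTO Patients VALUES
    (0, 'Ann', 'Lee', '2014-01-25', NULL),
    (1, 'Bob', 'Ray', '2014-03-15', NULL);
INSERT INTO AdmittedPatients VALUES
    (0, 7, '2014-01-01', '2014-01-10'),
    (0, 8, '2014-01-25', NULL),
    (1, 7, '2014-01-01', '2014-01-10');
""")

# Keep a patient when the current admission falls between an earlier
# discharge and that discharge plus 30 days.
readmitted = conn.execute("""
SELECT A.pid, P.firstname, P.lastname, A.workerid, P.admitted, A.discharged
FROM Patients P
JOIN AdmittedPatients A ON P.pid = A.pid
WHERE date(A.discharged, '+30 days') >= P.admitted
  AND P.admitted >= A.discharged
ORDER BY A.pid
""").fetchall()

print(readmitted)
```

History rows with a NULL discharge date drop out on their own, because comparisons against NULL are never true.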
select pid
,p.lastname,p.firstname
,ad.lastname,ad.firstname
from AdmittedPatients as a
join AdmittedPatients as d using (pid)
join Patients as p using (pid)
join AdminDoctors as ad on ad.workerid=a.workerid
where d.discharged between a.admitted-30 and a.admitted
You have a rather basic WHERE clause error here:
Admitted cannot be both before discharged AND after discharged+30
Also you have an extra semicolon before your whole query is ended, probably throwing out the last line altogether.
I think you're looking for admitted=discharged