SQL query: average peaks climbed, and the pair that has climbed the most peaks

My database tables look like this:
PEAK (NAME, ELEV, DIFF, MAP, REGION)
CLIMBER (NAME, SEX)
PARTICIPATED (TRIP_ID, NAME)
CLIMBED (TRIP_ID, PEAK, WHEN)
PEAK gives info about the mountain peaks that the user is interested in. The table lists the name of each peak, its elevation (in ft), its difficulty level (on a scale of 1-5), the map that it is located on, and the region of the Sierra Nevada that it is located in.
CLIMBER lists the members of the club, and gives their name and gender.
PARTICIPATED gives the set of climbers who participated in each of the various climbing trips. The number of participants in each trip varies.
CLIMBED tells which peaks were climbed on each climbing trip, along with the date on which each peak was climbed.
I need help with writing a query for the following:
Compute the average number of peaks scaled by the men in the club and by the women in the club.
Which pair of climbers have climbed the most peaks together, and how many peaks is that?
Who has climbed more than 20 peaks in some 60 day span?
For the first query, so far I have found a way to compute the total number of peaks climbed by either gender, for men:
SELECT SUM(C)
FROM
(SELECT CD.PEAK, COUNT(*) C
FROM CLIMBED CD
WHERE CD.TRIP_ID IN
(SELECT TRIP_ID
FROM PARTICIPATED PA
WHERE PA.NAME IN
(SELECT NAME
FROM CLIMBER
WHERE SEX = 'M'))
GROUP BY CD.PEAK) T;
For the second query, I have the following which I'm fairly sure isn't correct:
SELECT TEMP2.TRIP_ID, COUNT (*)
FROM
(SELECT P1.NAME, P2.NAME, P1.TRIP_ID
FROM PARTICIPATED P1, PARTICIPATED P2
WHERE P1.NAME <> P2.NAME AND
P1.TRIP_ID = P2.TRIP_ID) TEMP1,
(SELECT *
FROM CLIMBED) TEMP2
WHERE TEMP2.TRIP_ID = TEMP1.TRIP_ID
GROUP BY TEMP2.TRIP_ID;

Question 1:
For the total number of climbs (counting every time a peak was climbed):
SELECT t1.sex, AVG(1.0 * t1.peak_count) AS average -- 1.0 * avoids integer averaging
FROM
(SELECT c.sex, COUNT(cl.peak) AS peak_count
FROM climber c LEFT JOIN participated p ON c.name = p.name
LEFT JOIN climbed cl ON cl.trip_id = p.trip_id
GROUP BY c.name, c.sex) t1
GROUP BY t1.sex
For each time a UNIQUE peak was climbed:
SELECT t1.sex, AVG(1.0 * t1.peak_count) AS average -- 1.0 * avoids integer averaging
FROM
(SELECT c.sex, COUNT(DISTINCT cl.peak) AS peak_count
FROM climber c LEFT JOIN participated p ON c.name = p.name
LEFT JOIN climbed cl ON cl.trip_id = p.trip_id
GROUP BY c.name, c.sex) t1
GROUP BY t1.sex
Question 2:
SELECT p1.name, p2.name, COUNT(DISTINCT cl.peak) AS peaks
FROM participated p1 INNER JOIN participated p2 ON p1.trip_id = p2.trip_id
INNER JOIN climbed cl ON cl.trip_id = p1.trip_id -- count peaks, not trips
WHERE p1.name > p2.name -- > instead of <> gets only one of the pairs
GROUP BY p1.name, p2.name
ORDER BY peaks DESC
Question 3:
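SELECT p.name, cl.when AS span_begin_date, DATEADD(day, 60, cl.when) AS span_end_date, COUNT(DISTINCT c2.peak) AS peaks
FROM participated p
INNER JOIN climbed cl ON cl.trip_id = p.trip_id -- each climb by this climber starts a candidate span
INNER JOIN participated p2 ON p2.name = p.name
INNER JOIN climbed c2 ON c2.trip_id = p2.trip_id -- the same climber's climbs inside that span
AND c2.when BETWEEN cl.when AND DATEADD(day, 60, cl.when)
GROUP BY p.name, cl.when, DATEADD(day, 60, cl.when)
HAVING COUNT(DISTINCT c2.peak) > 20
ORDER BY peaks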

Here is my solution. If you provide sample data, it can be verified. For question 3, the meaning of "some 60 day span" is not clear. Could you specify it more precisely?
Question 1
select x.sex, avg(x.peaks_escalated) as peaks
from (
select u.name, u.sex, count(distinct c.peak) as peaks_escalated
from t1_climbed c
inner join t1_participated p on c.trip_id = p.trip_id
inner join t1_climber u on p.name = u.name
group by u.name, u.sex ) x
group by x.sex
Question 2
with list1 as (
select u.name as member, c.trip_id, c.peak, c.when
from t1_climbed c
inner join t1_participated p on c.trip_id = p.trip_id
inner join t1_climber u on p.name = u.name
)
select a.member as m1, b.member as m2, count(distinct a.peak) as total
from list1 a inner join list1 b
on a.trip_id = b.trip_id
and a.peak = b.peak
and a.when = b.when
and a.member < b.member -- < rather than <> so each pair is listed only once
group by a.member, b.member
order by total desc

Oracle Setup:
CREATE TABLE PEAK (
NAME VARCHAR2(50) PRIMARY KEY,
ELEV INT,
DIFF INT,
MAP VARCHAR2(10),
REGION VARCHAR2(10)
);
CREATE TABLE CLIMBER (
NAME VARCHAR2(50) PRIMARY KEY,
SEX CHAR(1) CHECK ( SEX IN ( 'M', 'F' ) )
);
-- Created this to have a primary key
CREATE TABLE TRIPS (
TRIP_ID INT PRIMARY KEY
);
CREATE TABLE PARTICIPATED (
TRIP_ID INT REFERENCES TRIPS( TRIP_ID ),
NAME VARCHAR2(50) REFERENCES CLIMBER( NAME ),
PRIMARY KEY ( TRIP_ID, NAME )
);
CREATE TABLE CLIMBED (
TRIP_ID INT REFERENCES TRIPS( TRIP_ID ),
PEAK VARCHAR2(50) REFERENCES PEAK ( NAME ),
"WHEN" DATE
);
Question 1
SELECT sex,
AVG( num_peaks ) AS avg_peaks
FROM (
SELECT c.*,
COUNT( DISTINCT l.peak ) num_peaks
FROM CLIMBED l
INNER JOIN
PARTICIPATED p
ON ( p.trip_id = l.trip_id )
RIGHT OUTER JOIN
CLIMBER c
ON ( p.name = c.name )
GROUP BY c.name, c.sex
)
GROUP BY sex;
You need to OUTER JOIN the climbers because some of them may not have participated in any trips (and so have climbed 0 peaks), and they still need to be counted in the average. A person may also have climbed the same peak several times: to count the number of distinct peaks per person, use COUNT( DISTINCT ... ) (or a similar technique); to count every climb, remove the DISTINCT keyword.
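The effect of the outer join on the average can be checked on a tiny data set. The following is only a sketch using Python's built-in sqlite3 module rather than Oracle; the climber names and rows are made-up sample data, and LEFT JOINs starting from the climber table stand in for the RIGHT OUTER JOIN.

```python
import sqlite3

def average_peaks_by_sex(conn):
    """Average number of distinct peaks per climber, grouped by sex."""
    query = """
        SELECT sex, AVG(num_peaks) AS avg_peaks
        FROM (
            SELECT c.name, c.sex, COUNT(DISTINCT l.peak) AS num_peaks
            FROM climber c
            LEFT JOIN participated p ON p.name = c.name
            LEFT JOIN climbed l ON l.trip_id = p.trip_id
            GROUP BY c.name, c.sex
        )
        GROUP BY sex
        ORDER BY sex
    """
    return conn.execute(query).fetchall()

# Hypothetical sample data: Bea has never been on a trip.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE climber (name TEXT PRIMARY KEY, sex TEXT);
    CREATE TABLE participated (trip_id INTEGER, name TEXT);
    CREATE TABLE climbed (trip_id INTEGER, peak TEXT, "when" TEXT);
    INSERT INTO climber VALUES ('Ann', 'F'), ('Bea', 'F'), ('Carl', 'M');
    INSERT INTO participated VALUES (1, 'Ann'), (2, 'Ann'), (1, 'Carl');
    INSERT INTO climbed VALUES
        (1, 'Whitney', '2020-07-01'),
        (1, 'Russell', '2020-07-02'),
        (2, 'Whitney', '2020-08-01');
""")
averages = average_peaks_by_sex(conn)
print(averages)
```

Ann has climbed two distinct peaks and Bea none, so the average for 'F' is 1.0; with inner joins Bea would drop out and the average would be 2.0.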
Question 2:
SELECT *
FROM (
SELECT name1,
name2,
COUNT( DISTINCT c.peak ) AS num_peaks_climbed
FROM (
SELECT p1.name AS name1,
p2.name AS name2,
p1.trip_id
FROM PARTICIPATED p1
INNER JOIN
PARTICIPATED p2
ON ( p1.trip_id = p2.trip_id AND p1.name < p2.name )
) p
INNER JOIN
climbed c
ON ( p.trip_id = c.trip_id )
GROUP BY name1, name2
ORDER BY num_peaks_climbed DESC
)
WHERE ROWNUM = 1;
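To try the pair self-join on a small data set, here is a sketch that uses Python's built-in sqlite3 module instead of Oracle, so LIMIT 1 replaces the ROWNUM filter. The names and rows are invented sample data, and the extra sort keys are an assumption added only to make the tie-break deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participated (trip_id INTEGER, name TEXT);
    CREATE TABLE climbed (trip_id INTEGER, peak TEXT, "when" TEXT);
    INSERT INTO participated VALUES
        (1, 'Ann'), (1, 'Bea'), (1, 'Carl'),
        (2, 'Ann'), (2, 'Bea'),
        (3, 'Carl');
    INSERT INTO climbed VALUES
        (1, 'Whitney', '2020-07-01'),
        (2, 'Russell', '2020-08-01'),
        (2, 'Whitney', '2020-08-02'),
        (3, 'Langley', '2020-09-01');
""")

# p1.name < p2.name keeps each unordered pair exactly once.
top_pair = conn.execute("""
    SELECT p1.name, p2.name, COUNT(DISTINCT c.peak) AS num_peaks
    FROM participated p1
    JOIN participated p2
      ON p1.trip_id = p2.trip_id AND p1.name < p2.name
    JOIN climbed c ON c.trip_id = p1.trip_id
    GROUP BY p1.name, p2.name
    ORDER BY num_peaks DESC, p1.name, p2.name
    LIMIT 1
""").fetchone()
print(top_pair)
```

Ann and Bea share trips 1 and 2, which cover two distinct peaks, so they come out on top.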
Question 3:
SELECT *
FROM (
SELECT p.name,
COUNT( c.peak ) OVER ( PARTITION BY p.name
ORDER BY c."WHEN"
RANGE BETWEEN INTERVAL '60' DAY PRECEDING
AND CURRENT ROW
) AS num_peaks_in_60_days,
c."WHEN" AS last_date_of_range
FROM PARTICIPATED p
INNER JOIN
climbed c
ON ( p.trip_id = c.trip_id )
)
WHERE num_peaks_in_60_days > 20;
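As a cross-check on the windowed count, the same 60-day logic can be sketched outside the database for a single climber. The function name and the sample dates are hypothetical; the window is treated as inclusive at both ends, like a RANGE frame from 60 days preceding up to the current row.

```python
from datetime import date, timedelta

def max_climbs_in_window(climb_dates, days=60):
    """Largest number of climbs whose dates fit in one window of `days` days."""
    dates = sorted(climb_dates)
    best = 0
    start = 0
    for end, current in enumerate(dates):
        # Move the left edge forward until the window spans at most `days` days.
        while current - dates[start] > timedelta(days=days):
            start += 1
        best = max(best, end - start + 1)
    return best

# Made-up climb dates for one climber.
sample_dates = [
    date(2020, 1, 1),
    date(2020, 1, 20),
    date(2020, 2, 25),
    date(2020, 3, 1),   # exactly 60 days after 2020-01-01
    date(2020, 6, 1),
]
print(max_climbs_in_window(sample_dates))
print(max_climbs_in_window(sample_dates, days=30))
```

A climber qualifies for Question 3 when this value exceeds 20 for their climb dates.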

Related

Table Joining Issues

I have difficulties joining the tables below for the desired query as stated.
Theatre (Theatre#, Name, Address, MainTel);
Production (P#, Title, ProductionDirector, PlayAuthor);
Performance (Per#, P#, Theatre#, pDate, pHour, pMinute, Comments);
Client (Client#, Title, Name, Street, Town, County, telNo, e-mail);
Ticket Purchase (Purchase#, Client#, Per#, PaymentMethod, DeliveryMethod, TotalAmount)
Required Query
The theater name (for each theater) and the names of clients who have the highest spending in that theater
SELECT T.NAME, C.NAME, SUM(TOTALAMOUNT)
FROM TICKETPURCHASE TP,
THEATRE T,
CLIENT C,
PERFORMANCE PER
WHERE TOTALAMOUNT = (SELECT MAX (TOTALAMOUNT)
FROM TICKETPURCHASE TP2,
THEATRE T2,
PERFORMANCE PER2,
CLIENT C2,
PRODUCTION P2
WHERE T2.NAME = T.NAME
AND T2.THEATRE# = PER2.THEATRE#
AND TP2.CLIENT# = C2.CLIENT#
AND TP2.PER# =PER2.PER#
AND PER2.P# = P2.P# )
AND C.CLIENT# = TP.CLIENT#
AND T.THEATRE# = PER.THEATRE#
AND TP.PER# = PER.PER#
AND PER.P# = P.P#
GROUP BY T.NAME, C.NAME, TOTALAMOUNT
From Oracle 12, you can do it all in a single query: aggregate by the primary keys of the theatre and the client to get the total spent, rank the spending in the ORDER BY clause, and use the row limiting clause to keep only the first-ranked rows:
SELECT MAX(T.NAME) AS theatre_name,
MAX(C.NAME) AS client_name,
SUM(TP.TOTALAMOUNT) AS amount_spent
FROM TICKETPURCHASE TP
INNER JOIN PERFORMANCE PER
ON (TP.PER# = PER.PER#)
INNER JOIN THEATRE T
ON (T.THEATRE# = PER.THEATRE#)
INNER JOIN CLIENT C
ON (C.CLIENT# = TP.CLIENT#)
GROUP BY
T.THEATRE#,
C.CLIENT#
ORDER BY
DENSE_RANK() OVER (
PARTITION BY T.THEATRE#
ORDER BY SUM(TP.TOTALAMOUNT) DESC
)
FETCH FIRST ROW WITH TIES;
In earlier versions, you can use the same technique and filter in an outer query:
SELECT theatre_name,
client_name,
amount_spent
FROM (
SELECT MAX(T.NAME) AS theatre_name,
MAX(C.NAME) AS client_name,
SUM(TP.TOTALAMOUNT) AS amount_spent,
DENSE_RANK() OVER (
PARTITION BY T.THEATRE#
ORDER BY SUM(TP.TOTALAMOUNT) DESC
) As rnk
FROM TICKETPURCHASE TP
INNER JOIN PERFORMANCE PER
ON (TP.PER# = PER.PER#)
INNER JOIN THEATRE T
ON (T.THEATRE# = PER.THEATRE#)
INNER JOIN CLIENT C
ON (C.CLIENT# = TP.CLIENT#)
GROUP BY
T.THEATRE#,
C.CLIENT#
)
WHERE rnk = 1;
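The rank-and-filter technique can be tried on a toy data set. This is a sketch with Python's built-in sqlite3 module, not Oracle: the `#` column names are replaced by hypothetical `theatre_id`, `client_id` and `per_id`, the rows are invented, and window functions require SQLite 3.25 or later. The totals are computed in a CTE first and ranked in a second step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE theatre (theatre_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE client (client_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE performance (per_id INTEGER PRIMARY KEY, theatre_id INTEGER);
    CREATE TABLE ticketpurchase (client_id INTEGER, per_id INTEGER, totalamount REAL);
    INSERT INTO theatre VALUES (1, 'Globe'), (2, 'Lyric');
    INSERT INTO client VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO performance VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO ticketpurchase VALUES
        (1, 10, 30), (1, 11, 30), (2, 10, 50),  -- Globe: Ann 60, Bob 50
        (2, 20, 40), (1, 20, 40);               -- Lyric: Ann 40, Bob 40 (tie)
""")

top_spenders = conn.execute("""
    WITH totals AS (
        SELECT t.theatre_id, t.name AS theatre_name, c.name AS client_name,
               SUM(tp.totalamount) AS amount_spent
        FROM ticketpurchase tp
        JOIN performance per ON tp.per_id = per.per_id
        JOIN theatre t ON t.theatre_id = per.theatre_id
        JOIN client c ON c.client_id = tp.client_id
        GROUP BY t.theatre_id, c.client_id, t.name, c.name
    ),
    ranked AS (
        SELECT theatre_name, client_name, amount_spent,
               DENSE_RANK() OVER (PARTITION BY theatre_id
                                  ORDER BY amount_spent DESC) AS rnk
        FROM totals
    )
    SELECT theatre_name, client_name, amount_spent
    FROM ranked
    WHERE rnk = 1
    ORDER BY theatre_name, client_name
""").fetchall()
print(top_spenders)
```

Because DENSE_RANK gives tied totals the same rank, both clients are returned for the theatre where they spent the same amount.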

PostgreSQL Select Join Not in List

The project is using Postgres 9.3
I have tables (that I have simplified) as follows:
t_person (30 million records)
- id
- first_name
- last_name
- gender
t_city (70,000 records)
- id
- name
- country_id
t_country (20 records)
- id
- name
t_last_city_visited (over 200 million records)
- person_id
- city_id
- country_id
- There is a unique constraint on person_id, country_id to
ensure that each person only has one last city per country
What I need to do are variations on the following:
Get the ids of Person who are female who have visited country 'UK'
but have never visited country 'USA'
I have tried the following, but it is too slow.
select t_person.id from t_person
join t_last_city_visited
on (
t_last_city_visited.person_id = t_person.id
and country_id = (select id from t_country where name = 'UK')
)
where gender = 'female'
except
(
select t_person.id from t_person
join t_last_city_visited
on (
t_last_city_visited.person_id = t_person.id
and country_id = (select id from t_country where name = 'USA')
)
)
I would really appreciate any help.
Hint: What you want to do here is to find the females for whom there EXISTS a visit to the UK, but where NOT EXISTS a visit to the US.
Something like:
select ...
from t_person
where ...
and exists (select null
from t_last_city_visited join
t_country on (...)
where t_country.name = 'UK')
and not exists (select null
from t_last_city_visited join
t_country on (...)
where t_country.name = 'USA')
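One way the elided parts of that hint could be filled in is sketched below, run against SQLite through Python's sqlite3 module rather than Postgres. The function name, the correlation on `person_id`, and the sample rows are assumptions made for illustration; the table and column names follow the question.

```python
import sqlite3

def visited_one_not_other(conn, gender, visited, not_visited):
    """Ids of people of `gender` who visited `visited` but never `not_visited`."""
    query = """
        SELECT p.id
        FROM t_person p
        WHERE p.gender = ?
          AND EXISTS (SELECT 1
                      FROM t_last_city_visited v
                      JOIN t_country c ON c.id = v.country_id
                      WHERE v.person_id = p.id AND c.name = ?)
          AND NOT EXISTS (SELECT 1
                          FROM t_last_city_visited v
                          JOIN t_country c ON c.id = v.country_id
                          WHERE v.person_id = p.id AND c.name = ?)
        ORDER BY p.id
    """
    return [row[0] for row in conn.execute(query, (gender, visited, not_visited))]

# Hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_person (id INTEGER PRIMARY KEY, gender TEXT);
    CREATE TABLE t_country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t_last_city_visited (
        person_id INTEGER, city_id INTEGER, country_id INTEGER,
        UNIQUE (person_id, country_id));
    INSERT INTO t_person VALUES (1, 'female'), (2, 'female'), (3, 'male'), (4, 'female');
    INSERT INTO t_country VALUES (1, 'UK'), (2, 'USA'), (3, 'France');
    INSERT INTO t_last_city_visited VALUES
        (1, 100, 1), (2, 100, 1), (2, 200, 2), (3, 100, 1), (4, 300, 3);
""")
print(visited_one_not_other(conn, 'female', 'UK', 'USA'))
```

Person 2 is excluded because of the USA visit, person 3 by gender, and person 4 has no UK visit.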
Another approach, to find the people who have visited the UK and not the US, which you can then join to the people to filter by gender:
select person_id
from t_last_city_visited join
t_country on t_last_city_visited.country_id = t_country.id
where t_country.name in ('USA','UK')
group by person_id
having max(t_country.name) = 'UK'
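The join back to the people for the gender filter might look like the following sketch, again using SQLite via Python's sqlite3 module with invented rows. The MAX trick relies on 'UK' sorting before the other country name, so a group whose maximum is 'UK' contains no visit to the second country.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_person (id INTEGER PRIMARY KEY, gender TEXT);
    CREATE TABLE t_country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t_last_city_visited (
        person_id INTEGER, city_id INTEGER, country_id INTEGER,
        UNIQUE (person_id, country_id));
    INSERT INTO t_person VALUES
        (1, 'female'), (2, 'female'), (3, 'male'), (4, 'female'), (5, 'female');
    INSERT INTO t_country VALUES (1, 'UK'), (2, 'USA'), (3, 'France');
    INSERT INTO t_last_city_visited VALUES
        (1, 100, 1),               -- UK only
        (2, 100, 1), (2, 200, 2),  -- UK and USA
        (3, 100, 1),               -- UK only, but male
        (4, 300, 3),               -- France only
        (5, 100, 1), (5, 300, 3);  -- UK and France
""")

rows = conn.execute("""
    SELECT p.id
    FROM t_person p
    JOIN (
        SELECT v.person_id
        FROM t_last_city_visited v
        JOIN t_country c ON v.country_id = c.id
        WHERE c.name IN ('USA', 'UK')
        GROUP BY v.person_id
        HAVING MAX(c.name) = 'UK'   -- no 'USA' row in the group
    ) uk_only ON uk_only.person_id = p.id
    WHERE p.gender = 'female'
    ORDER BY p.id
""").fetchall()
uk_only_females = [row[0] for row in rows]
print(uk_only_females)
```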
Could you please run analyze and execute this query?
-- females who visited UK
with uk_person as (
select distinct person_id
from t_last_city_visited t
inner join t_person p on t.person_id = p.id and 'female' = p.gender
where country_id = (select id from t_country where name = 'UK')
),
-- females who visited US
us_person as (
select distinct person_id
from t_last_city_visited t
inner join t_person p on t.person_id = p.id and 'female' = p.gender
where country_id = (select id from t_country where name = 'USA')
)
-- females who visited UK but not US
select uk.person_id
from uk_person uk
left join us_person us on uk.person_id = us.person_id
where us.person_id is null
This is one of the many ways this query can be written. You might have to run them to find out which one works best and what indexing tweaks you need to make them run faster.
This is the way I would approach it; you can later replace the inner queries with WITH aliases, as @zedfoxus said:
select
v.id
from
(SELECT
p.id id
FROM
t_person p JOIN t_last_city_visited lcv
ON(lcv.person_id = p.id)
JOIN t_country c
ON(lcv.country_id = c.id and c.name = 'UK')
WHERE
p.gender = 'female') v LEFT JOIN
-- people who have visited the USA; they must be excluded
(SELECT
lcv2.person_id id
FROM
t_last_city_visited lcv2
JOIN t_country c2
ON(lcv2.country_id = c2.id and c2.name = 'USA')) nv
ON(v.id = nv.id)
WHERE
nv.id IS NULL

Select one record from two tables in Oracle

There are three tables:
A table about students: s41071030(sno, sname, ssex, sage, sdept)
A table about course: c41071030(cno, cname, cpno, credit)
A table about selecting courses: sc41071030(sno, cno, grade)
Now, I want to select the details of the student whose sdept='CS' and who has selected the most courses among the students in department 'CS'.
As with any modestly complex SQL statement, it is best to do 'TDQD' — Test Driven Query Design. Start off with simple parts of the question and build them into a more complex answer.
To find out how many courses each student in the CS department is taking, we write:
SELECT S.Sno, COUNT(*) NumCourses
FROM s41071030 S
JOIN sc41071030 SC ON S.Sno = SC.Sno
WHERE S.Sdept = 'CS'
GROUP BY S.Sno;
We now need to find the largest value of NumCourses:
SELECT MAX(NumCourses) MaxCourses
FROM (SELECT S.Sno, COUNT(*) NumCourses
FROM s41071030 S
JOIN sc41071030 SC ON S.Sno = SC.Sno
WHERE S.Sdept = 'CS'
GROUP BY S.Sno
);
Now we need to join that result with the sub-query, so it is time for a CTE (Common Table Expression):
WITH N AS
(SELECT S.Sno, COUNT(*) NumCourses
FROM s41071030 S
JOIN sc41071030 SC ON S.Sno = SC.Sno
WHERE S.Sdept = 'CS'
GROUP BY S.Sno
)
SELECT N.Sno
FROM N
JOIN (SELECT MAX(NumCourses) MaxCourses FROM N) M
ON M.MaxCourses = N.NumCourses;
And we need to get the student details, so we join that with the student table:
WITH N AS
(SELECT S.Sno, COUNT(*) NumCourses
FROM s41071030 S
JOIN sc41071030 SC ON S.Sno = SC.Sno
WHERE S.Sdept = 'CS'
GROUP BY S.Sno
)
SELECT S.*
FROM s41071030 S
JOIN N ON N.Sno = S.Sno
JOIN (SELECT MAX(NumCourses) MaxCourses FROM N) M
ON M.MaxCourses = N.NumCourses;
Lightly tested SQL: you were warned. To test, run the component queries, making sure you get reasonable results each time. Don't move on to the next query until the previous one is working correctly.
Note that the courses table turns out to be immaterial to the query you are solving.
Also note that this may return several rows if it turns out there are several students all taking the same number of courses and that number is the largest number that any student is taking. (So, if there are 3 students taking 7 courses each, and no student taking more than 7 courses, then you will see 3 rows in the result set.)
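The tie behaviour is easy to confirm on a small data set. The sketch below runs the final query shape against SQLite through Python's sqlite3 module (not Oracle); the student rows are made-up sample data, and only sno and sname are selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s41071030 (sno TEXT PRIMARY KEY, sname TEXT, ssex TEXT,
                            sage INTEGER, sdept TEXT);
    CREATE TABLE sc41071030 (sno TEXT, cno TEXT, grade INTEGER);
    INSERT INTO s41071030 VALUES
        ('s1', 'Ann', 'F', 20, 'CS'),
        ('s2', 'Bob', 'M', 21, 'CS'),
        ('s3', 'Cy',  'M', 22, 'CS'),
        ('s4', 'Dee', 'F', 20, 'MA');
    INSERT INTO sc41071030 VALUES
        ('s1', 'c1', 90), ('s1', 'c2', 85),
        ('s2', 'c1', 70), ('s2', 'c3', 75),
        ('s3', 'c1', 60),
        ('s4', 'c1', 80), ('s4', 'c2', 80), ('s4', 'c3', 80);
""")

top_cs_students = conn.execute("""
    WITH N AS (
        SELECT S.sno, COUNT(*) AS NumCourses
        FROM s41071030 S
        JOIN sc41071030 SC ON S.sno = SC.sno
        WHERE S.sdept = 'CS'
        GROUP BY S.sno
    )
    SELECT S.sno, S.sname
    FROM s41071030 S
    JOIN N ON N.sno = S.sno
    JOIN (SELECT MAX(NumCourses) AS MaxCourses FROM N) M
      ON M.MaxCourses = N.NumCourses
    ORDER BY S.sno
""").fetchall()
print(top_cs_students)
```

Two CS students are tied on two courses each, so both rows come back; the student from the other department has more courses but is filtered out before the maximum is taken.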
1. Aggregate sc41071030 rows to get the counts.
2. Join the results to s41071030 to:
- filter rows on sdept;
- get student details;
- RANK() the joined rows on the count values.
3. Select rows with the ranking of 1.
WITH
aggregated AS (
SELECT
sno,
COUNT(*) AS coursecount
FROM
sc41071030
GROUP BY
sno
),
ranked AS (
SELECT
s.*,
RANK() OVER (ORDER BY agg.coursecount DESC) AS rnk
FROM
s41071030 s
INNER JOIN aggregated agg ON s.sno = agg.sno
WHERE
s.sdept = 'CS'
)
SELECT
sno,
sname,
ssex,
sage,
sdept
FROM
ranked
WHERE
rnk = 1
;

SQL query assistance

I have this schema:
Hotel (**hotelNo**, hotelName, city)
Room (**roomNo, hotelNo**, type, price)
Booking (**hotelNo, guestNo, dateFrom**, dateTo, roomNo)
Guest (**guestNo**, guestName, guestAddress)
** denotes primary keys
I have to complete this query:
Display each hotel and its most common room.
I have this query, which isn't quite correct:
SELECT r.hotelno, type, count(*)
FROM Hotel h, room r
WHERE h.hotelNo = r.hotelno
GROUP BY r.hotelNo, type;
This is what it outputs:
What am I doing wrong?
It looks as though you're seeking, for each hotel, the room type with the maximum number of bookings: an aggregate (maximum) of another aggregate (count of bookings per room type).
Build it up piece-wise. The number of bookings of rooms of each type at each hotel:
SELECT r.hotelno, r.type, count(*) AS num_bookings
FROM Booking AS b
JOIN Room AS r ON b.hotelNo = r.hotelno AND b.roomNo = r.roomNo
GROUP BY r.hotelNo, r.type;
Now, you need to know which room type has the maximum at each hotel. That has to be done in two stages:
1. Find the maximum number of bookings at the hotel for any type.
2. Find the room types with that maximum number.
The first stage is:
SELECT s.hotelno, MAX(num_bookings) AS max_bookings
FROM (SELECT r.hotelno, r.type, count(*) AS num_bookings
FROM Booking AS b
JOIN Room AS r ON b.hotelNo = r.hotelno AND b.roomNo = r.roomNo
GROUP BY r.hotelNo, r.type
) AS s
GROUP BY s.hotelno;
The second stage uses both the previous results for a final answer:
SELECT t.hotelno, t.type
FROM (SELECT r.hotelno, r.type, count(*) AS num_bookings
FROM Booking AS b
JOIN Room AS r ON b.hotelNo = r.hotelno AND b.roomNo = r.roomNo
GROUP BY r.hotelNo, r.type) AS t
JOIN (SELECT s.hotelno, MAX(num_bookings) AS max_bookings
FROM (SELECT r.hotelno, r.type, count(*) AS num_bookings
FROM Booking AS b
JOIN Room AS r ON b.hotelNo = r.hotelno AND b.roomNo = r.roomNo
GROUP BY r.hotelNo, r.type
) AS s
GROUP BY s.hotelno) AS m
ON t.hotelno = m.hotelno AND t.num_bookings = m.max_bookings;
If your DBMS supports WITH clauses, you can write that more succinctly.
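A WITH-clause version might look like the sketch below, which names the per-type counts once and reuses them. It is run against SQLite via Python's sqlite3 module; the CTE names `counts` and `maxima` and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Room (roomNo INTEGER, hotelNo INTEGER, type TEXT, price REAL,
                       PRIMARY KEY (roomNo, hotelNo));
    CREATE TABLE Booking (hotelNo INTEGER, guestNo INTEGER, dateFrom TEXT,
                          dateTo TEXT, roomNo INTEGER);
    INSERT INTO Room VALUES
        (1, 1, 'single', 50), (2, 1, 'double', 80),
        (1, 2, 'double', 90), (2, 2, 'suite', 150);
    INSERT INTO Booking VALUES
        (1, 1, '2020-01-01', '2020-01-03', 1),
        (1, 2, '2020-02-01', '2020-02-02', 1),
        (1, 3, '2020-03-01', '2020-03-02', 2),
        (2, 1, '2020-04-01', '2020-04-02', 1),
        (2, 2, '2020-05-01', '2020-05-02', 2);
""")

most_booked = conn.execute("""
    WITH counts AS (
        -- bookings per room type at each hotel
        SELECT r.hotelNo, r.type, COUNT(*) AS num_bookings
        FROM Booking b
        JOIN Room r ON b.hotelNo = r.hotelNo AND b.roomNo = r.roomNo
        GROUP BY r.hotelNo, r.type
    ),
    maxima AS (
        -- highest count at each hotel
        SELECT hotelNo, MAX(num_bookings) AS max_bookings
        FROM counts
        GROUP BY hotelNo
    )
    SELECT c.hotelNo, c.type
    FROM counts c
    JOIN maxima m
      ON c.hotelNo = m.hotelNo AND c.num_bookings = m.max_bookings
    ORDER BY c.hotelNo, c.type
""").fetchall()
print(most_booked)
```

Hotel 2 has one booking for each of its two room types, so both types are reported for it.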
If you are looking for popularity, you need to take the Booking table into account. Add the Booking table to your FROM clause, join it on hotelNo and roomNo, and count the Booking rows. This should give you the counts you want.
Edit:
Here is some sample code for you (tested):
SELECT TOP (100) PERCENT dbo.Hotel.hotelName, dbo.Room.type, COUNT(*) AS Count
FROM dbo.Booking INNER JOIN
dbo.Room ON dbo.Booking.roomNo = dbo.Room.roomNo AND dbo.Booking.hotelNo = dbo.Room.hotelNo
INNER JOIN dbo.Hotel ON dbo.Room.hotelNo = dbo.Hotel.hotelNo
GROUP BY dbo.Hotel.hotelName, dbo.Room.type
ORDER BY Count DESC
I think you're going to have to use an inner query to get this one working:
SELECT dbo.Hotel.hotelName, pop.type, pop.Count
FROM dbo.Hotel
INNER JOIN (
-- number the room types within each hotel, most booked first
SELECT dbo.Room.hotelNo, dbo.Room.type, COUNT(*) AS Count,
ROW_NUMBER() OVER (PARTITION BY dbo.Room.hotelNo ORDER BY COUNT(*) DESC, dbo.Room.type) AS rn
FROM dbo.Room
INNER JOIN dbo.Booking ON dbo.Booking.roomNo = dbo.Room.roomNo AND dbo.Booking.hotelNo = dbo.Room.hotelNo
GROUP BY dbo.Room.hotelNo, dbo.Room.type
) AS pop ON pop.hotelNo = dbo.Hotel.hotelNo AND pop.rn = 1
ORDER BY dbo.Hotel.hotelName

How to count number of different items in SQL

Database structure:
Clubs: ID, ClubName
Teams: ID, TeamName, ClubID
Players: ID, Name
Registrations: PlayerID, TeamID, Start_date, End_date, SeasonID
Clubs own several teams. Players may get registered into several teams (inside same club or into different club) during one year.
I have to generate a query to list all players that have been registered into DIFFERENT CLUBS during one season. So if player swapped teams that were owned by the same club then it doesn't count.
My attempts so far:
SELECT
c.short_name,
p.surname,
r.start_date,
r.end_date,
(select count(r2.id) from ejl_registration as r2
where r2.player_id=r.player_id and r2.season=r.season) as counter
FROM
ejl_registration AS r
left Join ejl_players AS p ON p.id = r.player_id
left Join ejl_teams AS t ON r.team_id = t.id
left Join ejl_clubs AS c ON t.club_id = c.id
WHERE
r.season = '2008'
having counter >1
I can't figure out how to count and show only different clubs... (It's getting too late for clear thinking). I use MySQL.
Report should be like: Player name, Club name, Start_date, End_date
This is a second try at this answer, simplifying it to merely count the distinct clubs, not report a list of club names.
SELECT p.surname, r.start_date, r.end_date, COUNT(DISTINCT c.id) AS counter
FROM ejl_players p
JOIN ejl_registration r ON (r.player_id = p.id)
JOIN ejl_teams t ON (r.team_id = t.id)
JOIN ejl_clubs c ON (t.club_id = c.id)
WHERE r.season = '2008'
GROUP BY p.id
HAVING counter > 1;
Note that since you're using MySQL, you can be pretty flexible with respect to columns in the select-list not matching columns in the GROUP BY clause. Other brands of RDBMS are more strict about the Single-Value Rule.
There's no reason to use a LEFT JOIN as in your example.
Okay, here's the first version of the query:
You have a chain of relationships like the following:
club1 <-- team1 <-- reg1 --> player <-- reg2 --> team2 --> club2
Such that club1 must not be the same as club2.
SELECT p.surname,
CONCAT_WS(',', GROUP_CONCAT(DISTINCT t1.team_name),
GROUP_CONCAT(DISTINCT t2.team_name)) AS teams,
CONCAT_WS(',', GROUP_CONCAT(DISTINCT c1.short_name),
GROUP_CONCAT(DISTINCT c2.short_name)) AS clubs
FROM ejl_players p
-- Find a club where this player is registered
JOIN ejl_registration r1 ON (r1.player_id = p.id)
JOIN ejl_teams t1 ON (r1.team_id = t1.id)
JOIN ejl_clubs c1 ON (t1.club_id = c1.id)
-- Now find another club where this player is registered in the same season
JOIN ejl_registration r2 ON (r2.player_id = p.id AND r1.season = r2.season)
JOIN ejl_teams t2 ON (r2.team_id = t2.id)
JOIN ejl_clubs c2 ON (t2.club_id = c2.id)
-- But the two clubs must not be the same (use < to prevent duplicates)
WHERE c1.id < c2.id
GROUP BY p.id;
Here's a list of players for one season.
SELECT sub.PlayerId
FROM
(
SELECT
r.PlayerId,
(SELECT t.ClubID FROM Teams t WHERE r.TeamID = t.ID) as ClubID
FROM Registrations r
WHERE r.Season = '2008'
) as sub
GROUP BY PlayerId
HAVING COUNT(DISTINCT sub.ClubID) > 1
Here's a list of players and seasons, for all seasons.
SELECT PlayerId, Season
FROM
(
SELECT
r.PlayerId,
r.Season,
(SELECT t.ClubID FROM Teams t WHERE r.TeamID = t.ID) as ClubID
FROM Registrations r
) as sub
GROUP BY PlayerId, Season
HAVING COUNT(DISTINCT sub.ClubID) > 1
By the way, this works in MS SQL.
SELECT p.Name, x.PlayerID, x.SeasonID
FROM (SELECT DISTINCT r.PlayerID, r.SeasonID, t.ClubID
FROM Registrations r
JOIN Teams t ON t.ID = r.TeamID) x
JOIN Players p ON p.ID = x.PlayerID
GROUP BY p.Name, x.PlayerID, x.SeasonID
HAVING COUNT(*) > 1
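To see that team swaps inside one club are not counted, the distinct-club test can be run on a few rows. This sketch uses SQLite through Python's sqlite3 module with the table and column names from the question's schema; the players, teams and registrations are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams (ID INTEGER PRIMARY KEY, TeamName TEXT, ClubID INTEGER);
    CREATE TABLE Players (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Registrations (PlayerID INTEGER, TeamID INTEGER,
                                Start_date TEXT, End_date TEXT, SeasonID INTEGER);
    INSERT INTO Teams VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
    INSERT INTO Players VALUES (1, 'Kim'), (2, 'Lee');
    INSERT INTO Registrations VALUES
        (1, 1, '2008-01-01', '2008-06-30', 2008),  -- Kim: two teams, same club
        (1, 2, '2008-07-01', '2008-12-31', 2008),
        (1, 3, '2009-01-01', '2009-12-31', 2009),  -- Kim: other club, other season
        (2, 1, '2008-01-01', '2008-06-30', 2008),  -- Lee: two clubs in one season
        (2, 3, '2008-07-01', '2008-12-31', 2008);
""")

club_switchers = conn.execute("""
    SELECT p.Name, r.SeasonID, COUNT(DISTINCT t.ClubID) AS clubs
    FROM Registrations r
    JOIN Teams t ON t.ID = r.TeamID
    JOIN Players p ON p.ID = r.PlayerID
    GROUP BY p.ID, p.Name, r.SeasonID
    HAVING COUNT(DISTINCT t.ClubID) > 1
    ORDER BY p.Name, r.SeasonID
""").fetchall()
print(club_switchers)
```

Kim changed teams within one club in 2008 and joined a different club only in 2009, so only Lee is listed.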