SQL Server : multiple transactions - sql

select
STUDIO.NAME, MOVIE.TITLE, MOVIE.YEAR
from
STUDIO
join
MOVIE on STUDIO.NAME = MOVIE.STUDIONAME
where
MOVIE.YEAR >= ALL (select MOVIE.YEAR from MOVIE)
I have this query, which returns the year of the most recent film, its title, and the name of the studio that made it.
How can I rewrite it so that I get the most recently produced movie for each studio, not just for one?

Try:
SELECT a.NAME,
a.TITLE,
a.YEAR
FROM (select s.NAME,
m.TITLE,
m.YEAR,
ROW_NUMBER() OVER (PARTITION BY s.NAME ORDER BY m.YEAR DESC) as rnk
from STUDIO s
join MOVIE m on s.NAME = m.STUDIONAME) a
WHERE a.rnk = 1
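
A minimal sketch for trying the ROW_NUMBER() approach above on a few rows. It assumes SQLite 3.25 or later (reached through Python's sqlite3 module) as a stand-in for SQL Server, and the sample studios and movies are made up:

```python
import sqlite3

# Hypothetical sample data; SQLite (3.25+) stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDIO (NAME TEXT PRIMARY KEY);
CREATE TABLE MOVIE (TITLE TEXT, YEAR INTEGER, STUDIONAME TEXT);
INSERT INTO STUDIO VALUES ('Fox'), ('MGM');
INSERT INTO MOVIE VALUES
  ('Movie A', 1977, 'Fox'),
  ('Movie B', 1980, 'Fox'),
  ('Movie C', 1939, 'MGM');
""")

# Number each studio's movies from newest to oldest, then keep row 1.
latest_per_studio = conn.execute("""
SELECT a.NAME, a.TITLE, a.YEAR
FROM (SELECT s.NAME, m.TITLE, m.YEAR,
             ROW_NUMBER() OVER (PARTITION BY s.NAME ORDER BY m.YEAR DESC) AS rnk
      FROM STUDIO s
      JOIN MOVIE m ON s.NAME = m.STUDIONAME) a
WHERE a.rnk = 1
ORDER BY a.NAME
""").fetchall()

print(latest_per_studio)
```

Each studio appears once, with its newest movie.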

SELECT newTable.NAME, newTable.TITLE, newTable.YEAR
FROM (
SELECT STUDIO.NAME, MOVIE.TITLE, MOVIE.YEAR
from
STUDIO
join
MOVIE on STUDIO.NAME = MOVIE.STUDIONAME
) AS newTable
WHERE newTable.YEAR = (SELECT MAX(m2.YEAR)
FROM MOVIE m2
WHERE m2.STUDIONAME = newTable.NAME)

You might try something like
select
studio.name,
movie.title,
movie.year
from
studio
inner join movie on
studio.name = movie.studioname
inner join
(
select
movie.studioname,
max(movie.year) year
from
movie
group by
movie.studioname
) t1 on
t1.studioname = studio.name and
t1.year = movie.year
Note that there are limitations to the way you have written your query:
There could be multiple movies in the same year - which one is the latest? You would need a date column for that, or something to that effect.
Also, you are joining on varchar/char columns, which is generally poor practice because every match requires a string comparison. Joining on integer id columns usually performs better.
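
To illustrate the tie problem mentioned above, here is a small sketch with made-up rows, using SQLite through Python's sqlite3 as a stand-in for SQL Server. The column alias max_year is my own choice. When two movies share a studio's latest year, the MAX join returns both of them:

```python
import sqlite3

# Hypothetical data: two Fox movies share the latest year.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MOVIE (TITLE TEXT, YEAR INTEGER, STUDIONAME TEXT);
INSERT INTO MOVIE VALUES
  ('A', 1990, 'Fox'),
  ('B', 1990, 'Fox'),
  ('C', 1985, 'Fox');
""")

# Join each movie to its studio's maximum year.
tied = conn.execute("""
SELECT m.STUDIONAME, m.TITLE, m.YEAR
FROM MOVIE m
JOIN (SELECT STUDIONAME, MAX(YEAR) AS max_year
      FROM MOVIE
      GROUP BY STUDIONAME) t1
  ON t1.STUDIONAME = m.STUDIONAME AND t1.max_year = m.YEAR
ORDER BY m.TITLE
""").fetchall()

print(tied)
```

Both 'A' and 'B' come back for Fox, so a finer-grained column (a release date, say) would be needed to pick exactly one.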

The answer above is fine for finding the most recent year in which a studio published a movie. If you need the title too, that method needs some form of subselect to get the largest (and so most recent) year by studio name, and then brings in the title based on it.
The query below does both fairly succinctly using CROSS APPLY:
SELECT
S.NAME,
MM.TITLE,
MM.YEAR
FROM STUDIO S
CROSS APPLY
(SELECT TOP 1 TITLE, YEAR FROM MOVIE M
WHERE M.STUDIONAME = S.NAME
ORDER BY M.YEAR DESC) MM

What do you mean "not only by one"?
Normally you do such queries using aggregate functions:
select STUDIO.NAME, MOVIE.TITLE, MAX( MOVIE.YEAR )
from STUDIO
join MOVIE on STUDIO.NAME = MOVIE.STUDIONAME
Group by STUDIO.NAME, MOVIE.TITLE
Is this what you need?
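
A quick check on made-up rows (SQLite through Python's sqlite3, standing in for SQL Server) suggests that grouping by both NAME and TITLE gives one row per movie, because every title forms its own group. Grouping by NAME alone gives one row per studio, but then the title is no longer available:

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDIO (NAME TEXT PRIMARY KEY);
CREATE TABLE MOVIE (TITLE TEXT, YEAR INTEGER, STUDIONAME TEXT);
INSERT INTO STUDIO VALUES ('Fox'), ('MGM');
INSERT INTO MOVIE VALUES
  ('A', 1977, 'Fox'), ('B', 1980, 'Fox'), ('C', 1939, 'MGM');
""")

# Grouping by studio AND title: each movie is its own group.
by_name_and_title = conn.execute("""
SELECT STUDIO.NAME, MOVIE.TITLE, MAX(MOVIE.YEAR)
FROM STUDIO
JOIN MOVIE ON STUDIO.NAME = MOVIE.STUDIONAME
GROUP BY STUDIO.NAME, MOVIE.TITLE
""").fetchall()

# Grouping by studio only: one row per studio, but no title.
by_name_only = conn.execute("""
SELECT STUDIO.NAME, MAX(MOVIE.YEAR)
FROM STUDIO
JOIN MOVIE ON STUDIO.NAME = MOVIE.STUDIONAME
GROUP BY STUDIO.NAME
ORDER BY STUDIO.NAME
""").fetchall()

print(len(by_name_and_title), by_name_only)
```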

Related

Two statements in one query

I want to select these pieces of information with 1 query and a subquery. How could I mix these statements?
SELECT
name, movie_id, type
FROM
Movie, Oscar
WHERE
rating = 'G'
AND type = 'BEST-PICTURE'
SELECT *
FROM Oscar
WHERE year IN (SELECT MAX(year) FROM Oscar)
Are you just trying to get all the oscars from the most recent year?
Without you showing us your table structures, I'm just guessing here - try something like this:
SELECT
m.name, m.movie_id, m.type
FROM
Movie m
INNER JOIN
Oscar o ON m.Movie_Id = o.Movie_Id
WHERE
m.rating = 'G'
AND m.type = 'BEST-PICTURE'
AND o.Year = (SELECT MAX(Year) FROM Oscar)
You can just use a subquery, which is like having two statements in one query.
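
Here is a small sketch of the subquery idea, run in SQLite through Python's sqlite3. The table structures were never shown, so the layout below (type and year on Oscar, rating on Movie) and all the rows are assumptions for illustration only:

```python
import sqlite3

# Assumed schema: the question did not show the real table structures.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (movie_id INTEGER PRIMARY KEY, name TEXT, rating TEXT);
CREATE TABLE Oscar (movie_id INTEGER, type TEXT, year INTEGER);
INSERT INTO Movie VALUES (1, 'Alpha', 'G'), (2, 'Beta', 'R'), (3, 'Gamma', 'G');
INSERT INTO Oscar VALUES
  (1, 'BEST-PICTURE', 2010),
  (2, 'BEST-SOUND',   2011),
  (3, 'BEST-PICTURE', 2011);
""")

# The scalar subquery finds the latest year once; the outer query filters on it.
latest_g_best_picture = conn.execute("""
SELECT m.name, m.movie_id, o.type
FROM Movie m
INNER JOIN Oscar o ON m.movie_id = o.movie_id
WHERE m.rating = 'G'
  AND o.type = 'BEST-PICTURE'
  AND o.year = (SELECT MAX(year) FROM Oscar)
""").fetchall()

print(latest_g_best_picture)
```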

SQL Collect duplicates to one place? PostgreSQL

Sorry I'm new here and I'm also new with SQL and can't really explain my problem in the title...
So I have a TV show database, and there I have a Genre column, but for a TV show there are multiple Genres stored, so when I'm selecting all my TV Shows how can I combine them?
It needs to look like this:
https://i.stack.imgur.com/3EhBj.png
So I have to combine the strings together. Here is the code I have written so far:
SELECT title,
year,
runtime,
MIN(name) as name,
ROUND(rating, 1) as rating,
trailer,
homepage
FROM shows
JOIN show_genres
on shows.id = show_genres.show_id
JOIN genres
on show_genres.genre_id = genres.id
GROUP BY title,
year,
runtime,
rating,
trailer,
homepage
ORDER BY rating DESC
LIMIT 15;
I also have some other stuff in there; that's for my exercise tasks! Thanks!
Also here is the relationship model:
https://i.stack.imgur.com/M89ho.png
Basically you need string aggregation - in Postgres, you can use string_agg() for this.
For efficiency, I would recommend moving the aggregation to a correlated subquery or a lateral join rather than aggregating in the outer query, so:
SELECT
s.title,
s.year,
s.runtime,
g.genre_names,
ROUND(s.rating, 1) as rating,
s.trailer,
s.homepage
FROM shows s
LEFT JOIN LATERAL (
SELECT string_agg(g.name, ', ') genre_names
FROM show_genres sg
INNER JOIN genres g ON g.id = sg.genre_id
WHERE sg.show_id = s.id
) g ON 1 = 1
ORDER BY s.rating DESC
LIMIT 15
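
A minimal sketch of the same idea that can be run without a Postgres server. It uses SQLite through Python's sqlite3, where group_concat() plays the role of string_agg() and a correlated subquery replaces the lateral join. The rows are made up, and the order of names inside the concatenated string is not guaranteed:

```python
import sqlite3

# Hypothetical rows; group_concat() is SQLite's counterpart to string_agg().
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT, rating REAL);
CREATE TABLE genres (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE show_genres (show_id INTEGER, genre_id INTEGER);
INSERT INTO shows VALUES (1, 'Show A', 9.26), (2, 'Show B', 8.0);
INSERT INTO genres VALUES (1, 'Drama'), (2, 'Crime'), (3, 'Comedy');
INSERT INTO show_genres VALUES (1, 1), (1, 2), (2, 3);
""")

# One row per show; the subquery collapses that show's genres into one string.
rows = conn.execute("""
SELECT s.title,
       (SELECT group_concat(g.name, ', ')
        FROM show_genres sg
        INNER JOIN genres g ON g.id = sg.genre_id
        WHERE sg.show_id = s.id) AS genre_names,
       ROUND(s.rating, 1) AS rating
FROM shows s
ORDER BY s.rating DESC
LIMIT 15
""").fetchall()

for row in rows:
    print(row)
```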

How to join two SQL queries into one?

I'm new to SQL and I'm currently trying to learn how to make reports in Visual Studio. I need to make a table, graph and few other things. I decided to do matrix as the last part and now I'm stuck. I write my queries in SQL Server.
I have two tables: Staff (empID, StaffLevel, Surname) and WorkOfArt (artID, name, curator, helpingCurator). In the columns Curator and HelpingCurator I used numbers from empID.
I'd like my matrix to show every empID, the number of paintings where they're acting as a Curator, and the number of paintings where they're acting as a Helping Curator (so I want three columns: empID, count(curator), count(helpingCurator)).
Select Staff.empID, count(WorkOfArt.Curator) as CuratorTotal
FROM Staff, WorkOfArt
WHERE Staff.empID=WorkOfArt.Curator
and Staff.StaffLevel<7
group by Staff.empID;
Select Staff.empID, count(WorkOfArt.HelpingCurator) as HelpingCuratorTotal
FROM Staff, WorkOfArt
WHERE Staff.empID=WorkOfArt.HelpingCurator
and Staff.StaffLevel<7
group by Staff.empID;
I created those two queries and they work perfectly fine, but I need it in one query.
I tried:
Select Staff.empID, count(WorkOfArt.Curator) as CuratorTotal,
COUNT(WorkOfArt.HelpingCurator) as HelpingCuratorTotal
FROM Staff FULL OUTER JOIN WorkOfArt on Staff.empID=WorkOfArt.Curator
and Staff.empID=WorkOfArt.HelpingCurator
WHERE Staff.StaffLevel<7
group by Staff.empID;
(as well as using left or right outer join)
- this one gives me a table with empID, but in both count columns there are only 0s - and:
Select Staff.empID, count(WorkOfArt.Curator) as CuratorTotal,
COUNT(WorkOfArt.HelpingCurator) as HelpingCuratorTotal
FROM Staff, WorkOfArt
WHERE Staff.empID=WorkOfArt.Curator
and Staff.empID=WorkOfArt.HelpingCurator
and Staff.StaffLevel<7
group by Staff.empID;
And this one gives me just the names of the columns.
I have no idea what to do next. I tried to find the answer in google, but all explanations I found were far more advanced for me, so I couldn't understand them... Could you please help me? Hints are fine as well.
The easiest way to do this is most likely with scalar subqueries in the select clause, something like this:
Select
S.empID,
(select count(*) from WorkOfArt C where C.Curator = S.empID)
as CuratorTotal,
(select count(*) from WorkOfArt H where H.HelpingCurator = S.empID)
as HelpingCuratorTotal
FROM Staff S
WHERE S.StaffLevel<7
group by S.empID;
This way, rows for the different roles don't interfere with each other's counts. If the tables are really large or you have many different roles, a more complex query that groups the rows in the WorkOfArt table first might perform better, since this one has to read the table twice.
From a performance perspective, the following query is probably a little more efficient. Note that the inner joins only return staff who appear in both roles:
select s.EmpId, CuratorForCount, HelpingCuratorForCount
from Staff s
inner join ( select Curator, count(*) as CuratorForCount
from WorkOfArt
group by Curator) mainCurator on s.EmpId = mainCurator.Curator
inner join ( select HelpingCurator, count(*) as HelpingCuratorForCount
from WorkOfArt
group by HelpingCurator) secondaryCurator on s.EmpId = secondaryCurator.HelpingCurator
One method that can be useful if you want more than one aggregated value from the WorkOfArt table is to pre-aggregate the results:
Select s.empID, COALESCE(woac.cnt, 0) as CuratorTotal,
COALESCE(woahc.cnt, 0) as HelpingCuratorTotal
FROM Staff s LEFT JOIN
(SELECT woa.Curator, COUNT(*) as cnt
FROM WorkOfArt woa
GROUP BY woa.Curator
) woac
ON s.empID = woac.Curator LEFT JOIN
(SELECT woa.HelpingCurator, COUNT(*) as cnt
FROM WorkOfArt woa
GROUP BY woa.HelpingCurator
) woahc
ON s.empID = woahc.HelpingCurator
WHERE s.StaffLevel < 7;
Notice that the aggregation on the outer level is not needed.
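
A small runnable sketch of the pre-aggregation approach, using SQLite through Python's sqlite3 as a stand-in for SQL Server. The staff and artwork rows are made up:

```python
import sqlite3

# Hypothetical rows: employee 3 has StaffLevel 9 and is filtered out.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (empID INTEGER PRIMARY KEY, StaffLevel INTEGER, Surname TEXT);
CREATE TABLE WorkOfArt (artID INTEGER PRIMARY KEY, name TEXT,
                        Curator INTEGER, HelpingCurator INTEGER);
INSERT INTO Staff VALUES (1, 3, 'A'), (2, 5, 'B'), (3, 9, 'C');
INSERT INTO WorkOfArt VALUES (10, 'P1', 1, 2), (11, 'P2', 1, 2), (12, 'P3', 2, 3);
""")

# Count each role separately, then LEFT JOIN both counts onto Staff.
totals = conn.execute("""
SELECT s.empID,
       COALESCE(woac.cnt, 0)  AS CuratorTotal,
       COALESCE(woahc.cnt, 0) AS HelpingCuratorTotal
FROM Staff s
LEFT JOIN (SELECT Curator, COUNT(*) AS cnt
           FROM WorkOfArt GROUP BY Curator) woac
       ON s.empID = woac.Curator
LEFT JOIN (SELECT HelpingCurator, COUNT(*) AS cnt
           FROM WorkOfArt GROUP BY HelpingCurator) woahc
       ON s.empID = woahc.HelpingCurator
WHERE s.StaffLevel < 7
ORDER BY s.empID
""").fetchall()

print(totals)
```

Employee 1 never acts as a helping curator, and COALESCE turns the missing count into 0 instead of NULL.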

SQL Server 2012 - Improve query performance

I'm looking for a way to improve the following query.
It collects members of organizations that have a membership of any organization in 2013.
I've been able to determine that the sub-query in this query is the real performance killer, but I can't find a way to remove the subquery and keep the resulting table correct.
The query simply collects all "PersonID" and "MemberId" for people that have a membership in this calendar year. BUT, it is possible to have two memberships in one calendar year. If that should happen, then we only want to select the last membership you have in that calendar year: that's what the subquery is for.
A "WorkingYear" is not the same as a calendar year. A workingyear can be an entire year, but it can also run from september 2013 to september 2014, for example. That's why I specify that the workingyear has to start or end in 2013.
This is the query:
SELECT DISTINCT PersonID,
m.id AS MemberId
FROM Members AS m
INNER JOIN WorkingYears AS w
ON m.WorkingYearID = w.ID
AND ( YEAR(w.StartDate) = 2013
OR YEAR(w.EndDate) = 2013 )
WHERE m.Id = (SELECT TOP 1 m2.id
FROM DBA_Member m2
WHERE personid = m.PersonID
AND ( ( droppedOut = 'false' )
OR ( droppedOut = 'true'
AND ( yeardropout = 2013 ) ) )
ORDER BY m.StartDate DESC)
This query should collect about 50.000 rows for me, so obviously it also executes the sub query at least 50.000 times and I'm looking for a way to avoid this. Does anyone have any ideas that could point me in the right direction?
All fields that are used in JOINs should be indexed correctly. There are also separate indexes on 'droppedOut' (bit) and 'yeardropout' (int). I also created a single index on both fields together, to no avail.
In the execution plan, I see that an "eager spool" is occurring, that takes up 60% of the query time. It has an outputlist of Member.ID, Member.DroppedOut, Member.YearDropout, which are indeed all the fields that I'm using in my subquery. Also, it gets 50.500 rebinds.
Does anyone have any advice?
You only need to run the sub-query once if you use a CTE:
WITH subQall AS
(
select id, personID,
ROW_NUMBER() OVER (PARTITION BY personID ORDER BY StartDate DESC) as rnum
from DBA_Member
WHERE (droppedOut='false') OR (droppedOut='true' AND (yeardropout = 2013))
), subQ AS
(
select id, personID
from subQall
where rnum = 1
)
SELECT DISTINCT m.PersonID, m.id as MemberId
FROM Members AS m
INNER JOIN WorkingYears AS w ON m.WorkingYearID = w.ID
JOIN subQ ON m.ID = subQ.ID and m.personID = subQ.personID
WHERE w.StartDate BETWEEN '20130101' AND '20131231'
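
To see how the ROW_NUMBER() step picks one membership per person, here is a sketch on made-up rows. It runs in SQLite 3.25+ through Python's sqlite3 as a stand-in for SQL Server, and stores droppedOut as 0/1 because SQLite has no bit type:

```python
import sqlite3

# Hypothetical memberships; droppedOut is stored as 0/1 in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DBA_Member (id INTEGER PRIMARY KEY, personID INTEGER,
                         StartDate TEXT, droppedOut INTEGER, yeardropout INTEGER);
INSERT INTO DBA_Member VALUES
  (1, 100, '2013-01-01', 0, NULL),
  (2, 100, '2013-06-01', 0, NULL),
  (3, 200, '2013-03-01', 1, 2012),
  (4, 200, '2013-02-01', 1, 2013);
""")

# Rank each person's qualifying memberships once, newest first, and keep rank 1.
latest_membership = conn.execute("""
WITH subQall AS (
  SELECT id, personID,
         ROW_NUMBER() OVER (PARTITION BY personID ORDER BY StartDate DESC) AS rnum
  FROM DBA_Member
  WHERE droppedOut = 0 OR (droppedOut = 1 AND yeardropout = 2013)
)
SELECT personID, id
FROM subQall
WHERE rnum = 1
ORDER BY personID
""").fetchall()

print(latest_membership)
```

Person 200's membership 3 is filtered out before ranking (it was dropped in 2012), so membership 4 is the one kept.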
You could try a join instead of the subquery. Because the derived table refers to the outer row (m.PersonID), SQL Server needs CROSS APPLY rather than a plain JOIN, like this:
SELECT DISTINCT m.PersonID, m.id as MemberId
FROM Members AS m
INNER JOIN WorkingYears AS w ON m.WorkingYearID = w.ID
AND (year(w.StartDate) = 2013 OR year(w.EndDate) = 2013)
CROSS APPLY (select top 1 m2.id ID from DBA_Member m2 where m2.personid = m.PersonID
and ((droppedOut='false') OR (droppedOut='true' AND (yeardropout = 2013)))
order by m2.StartDate desc) Member
WHERE m.Id = Member.ID

How do I select the Max in this query? Help for exam

So, I'm going through a lot of exercises for a final SQL exam I have on Thursday, and I came across another query I'm having doubts about.
The tables in the exercise are supposed to be from a hotel DB. You have three tables involved:
STAY              ROOM                ROOM_TYPE
===========       ============        ============
PK ID_STAY        PK ID_ROOM          PK ID_ROOM_TYPE
   DAYS_QUANT        ID_ROOM_TYPE FK     DESCRIPTION
   DATE              PRICE
   ID_ROOM FK
The query they are asking me to do is "Show all data for the Room that has been rented for the highest amount of days (in total) in 2011, by room type (you have to show ID Room Type and Description)"
This is the way I solved it, I don't know if it's ok:
SELECT RT.ID_ROOM_TYPE, RT.DESCRIPTON, R.*, SUM(S.DAYS_QUANT)
FROM STAY S, ROOM R, ROOM_TYPE RT
WHERE YEAR(S.DATE) = '2011'
GROUP BY RT.ID_ROOM_TYPE, RT.DESCRIPTON, R.*
ORDER BY SUM(S.DAYS_QUANT) DESC
LIMIT 1
So, the first thing I'm not sure about, is that R.* I included. Can I put it like that in a SELECT? Can it also be included like that in a GROUP BY?
The other thing I'm not sure about is whether I will be allowed to use LIMIT or SELECT TOP 1 in the exam. Can anyone think of a way to solve this without using those, like with MAX() or something?
I believe you are not allowed to use CTEs, so I expanded the last part of Steve Kass's answer. You can get the desired results without TOP or LIMIT by comparing the total days a room was occupied with the maximum total days any room of the same type was occupied. To do so, first sum the days by room, and then, using an identical derived table, get the maximum days per room type. Joining the two on room type and days isolates the most-used rooms. Then join the starting tables to show all the data. Unlike TOP or LIMIT, this returns more than one record in case of a tie.
P.S. This is NOT tested. I believe it will work, but there might be a typo.
select r.*, rt.*, roomDays.TotalDays
from Room r inner join Room_type rt
on r.id_room_type = rt.id_room_type
inner join
(select Room.id_room, Room.id_room_type, sum(Stay.days_quant) TotalDays
from Stay
inner join Room
on Stay.id_room = Room.id_room
where year(Stay.Date) = 2011
group by Room.id_room, Room.id_room_type) roomDays
on r.id_room = roomDays.id_room
inner join
(select id_room_type, max(TotalDays) TotalDays
from
(select Room.id_room, Room.id_room_type, sum(Stay.days_quant) TotalDays
from Stay
inner join Room
on Stay.id_room = Room.id_room
where year(Stay.Date) = 2011
group by Room.id_room, Room.id_room_type) roomDaysHelper
group by id_room_type) roomTypeDays
on r.id_room_type = roomTypeDays.id_room_type
and roomDays.TotalDays = roomTypeDays.TotalDays
select r.*, t.*
from room r
join room_type t on t.id_room_type = r.id_room_type
where r.id_room in
(select
(select r2.id_room
from room r2
join stay s on s.id_room = r2.id_room
where year(s.date) = 2011
and r2.id_room_type = t2.id_room_type
group by r2.id_room
order by sum(s.days_quant) desc
limit 1) room_id
from room_type t2)
It's always possible to avoid LIMIT 1 or SELECT TOP. One way is to express the top row as the row for which there is no higher row. WHERE NOT EXISTS expresses the idea of "for which there is no."
One way to think of this is as follows: Select those rooms (along with their total days and type information) for which there is no room of the same type with a greater number of total days. That gives you this query (not carefully proofread):
with StayTotals as (
select
STAY.ID_ROOM,
ROOM_TYPE.ID_ROOM_TYPE,
ROOM_TYPE.DESCRIPTION,
SUM(STAY.DAYS_QUANT) AS TotalDays2011
from STAY join ROOM on STAY.ID_ROOM = ROOM.ID_ROOM
join ROOM_TYPE on ROOM.ID_ROOM_TYPE = ROOM_TYPE.ID_ROOM_TYPE
where YEAR(STAY.DATE) = 2011
group by STAY.ID_ROOM, ROOM_TYPE.ID_ROOM_TYPE, ROOM_TYPE.DESCRIPTION
)
select *
from StayTotals as T1
where not exists (
select *
from StayTotals as T2
where T2.ID_ROOM_TYPE = T1.ID_ROOM_TYPE
and T2.TotalDays2011 > T1.TotalDays2011
);
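
A cut-down sketch of the NOT EXISTS idea that can be run as is, using SQLite through Python's sqlite3. The rows are made up, the column STAY_DATE stands in for DATE, strftime() replaces YEAR(), and the ROOM_TYPE table is left out to keep it short:

```python
import sqlite3

# Hypothetical rows: 2011 totals are room 1 = 7, room 2 = 5, room 3 = 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ROOM (ID_ROOM INTEGER PRIMARY KEY, ID_ROOM_TYPE INTEGER);
CREATE TABLE STAY (ID_STAY INTEGER PRIMARY KEY, DAYS_QUANT INTEGER,
                   STAY_DATE TEXT, ID_ROOM INTEGER);
INSERT INTO ROOM VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO STAY VALUES
  (1, 3,  '2011-02-01', 1),
  (2, 4,  '2011-05-01', 1),
  (3, 5,  '2011-03-01', 2),
  (4, 2,  '2011-07-01', 3),
  (5, 30, '2010-07-01', 2);
""")

# Keep a room only if no room of the same type has a larger 2011 total.
top_rooms = conn.execute("""
WITH StayTotals AS (
  SELECT STAY.ID_ROOM, ROOM.ID_ROOM_TYPE,
         SUM(STAY.DAYS_QUANT) AS TotalDays2011
  FROM STAY JOIN ROOM ON STAY.ID_ROOM = ROOM.ID_ROOM
  WHERE strftime('%Y', STAY.STAY_DATE) = '2011'
  GROUP BY STAY.ID_ROOM, ROOM.ID_ROOM_TYPE
)
SELECT T1.ID_ROOM, T1.ID_ROOM_TYPE, T1.TotalDays2011
FROM StayTotals AS T1
WHERE NOT EXISTS (
  SELECT 1
  FROM StayTotals AS T2
  WHERE T2.ID_ROOM_TYPE = T1.ID_ROOM_TYPE
    AND T2.TotalDays2011 > T1.TotalDays2011
)
ORDER BY T1.ID_ROOM_TYPE
""").fetchall()

print(top_rooms)
```

The 30-day stay from 2010 is ignored, so room 1 wins room type 1 with 7 days.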
If you can't use CTEs (the WITH clause), you can rewrite it using subqueries, but it's awkward.
Ranking functions have been part of the SQL standard for quite a while. If you can use them, this may also work:
with StayTotals as (
select
STAY.ID_ROOM,
ROOM_TYPE.ID_ROOM_TYPE,
ROOM_TYPE.DESCRIPTION,
SUM(STAY.DAYS_QUANT) AS TotalDays2011
from STAY join ROOM on STAY.ID_ROOM = ROOM.ID_ROOM
join ROOM_TYPE on ROOM.ID_ROOM_TYPE = ROOM_TYPE.ID_ROOM_TYPE
where YEAR(STAY.DATE) = 2011
group by STAY.ID_ROOM, ROOM_TYPE.ID_ROOM_TYPE, ROOM_TYPE.DESCRIPTION
), StayTotalsRankedByType as (
select
ID_ROOM,
ID_ROOM_TYPE,
DESCRIPTION,
TotalDays2011,
RANK() OVER (
PARTITION BY ID_ROOM_TYPE
ORDER BY TotalDays2011 DESC
) as RankInRoomType
from StayTotals
)
select
ID_ROOM,
ID_ROOM_TYPE,
DESCRIPTION,
TotalDays2011
from StayTotalsRankedByType
where RankInRoomType = 1;
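
One thing worth checking with ranking functions is how ties behave. In this sketch (SQLite 3.25+ through Python's sqlite3, with a made-up, already aggregated StayTotals table), RANK() keeps both tied rooms, while ROW_NUMBER() would keep only one of them:

```python
import sqlite3

# Hypothetical pre-aggregated totals: rooms 1 and 2 tie within room type 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StayTotals (ID_ROOM INTEGER, ID_ROOM_TYPE INTEGER, TotalDays2011 INTEGER);
INSERT INTO StayTotals VALUES (1, 1, 7), (2, 1, 7), (3, 2, 2);
""")

query = """
SELECT ID_ROOM
FROM (SELECT ID_ROOM,
             {func}() OVER (PARTITION BY ID_ROOM_TYPE
                            ORDER BY TotalDays2011 DESC) AS rk
      FROM StayTotals) ranked
WHERE rk = 1
ORDER BY ID_ROOM
"""

# RANK gives tied rows the same rank; ROW_NUMBER always numbers them 1, 2, ...
with_rank = [r[0] for r in conn.execute(query.format(func="RANK"))]
with_row_number = [r[0] for r in conn.execute(query.format(func="ROW_NUMBER"))]

print(with_rank, len(with_row_number))
```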
Finally, one other way to pull in additional columns to describe the grouped MAX results is to use a "carryalong" sort, which was a handy technique before ranking functions were available. Adam Machanic gives an example here, and there are useful threads on the topic from Usenet, such as this one.
How about this?
select room.id_room, room_type.description, room.price
from room inner join room_type
on room.id_room_type = room_type.id_room_type
where room.id_room in (select id_room from stay
where year (date) = 2011
group by id_room
having sum (days_quant) >= all (select sum (days_quant) from stay
where year (date) = 2011
group by id_room));
Unfortunately, this query (as it is now) doesn't show for how many days the most popular room was rented. But there's no 'limit 1'!
Thank you all! With all the ideas you gave me I came up with this. Please let me know if you think it's OK!
SELECT R.ID_ROOM, R.ID_ROOM_TYPE, T.DESCRIPTION, SUM(S.DAYS_QUANT) AS TOTALDAYS
FROM ROOM R
JOIN ROOM_TYPE T ON T.ID_ROOM_TYPE = R.ID_ROOM_TYPE
JOIN STAY S ON S.ID_ROOM = R.ID_ROOM
WHERE YEAR(S.DATE) = 2011
GROUP BY R.ID_ROOM, R.ID_ROOM_TYPE, T.DESCRIPTION
HAVING SUM(S.DAYS_QUANT) >= ALL
(SELECT SUM(S2.DAYS_QUANT)
FROM STAY S2
JOIN ROOM R2 ON R2.ID_ROOM = S2.ID_ROOM
WHERE YEAR(S2.DATE) = 2011
AND R2.ID_ROOM_TYPE = R.ID_ROOM_TYPE
GROUP BY S2.ID_ROOM)