I am trying to select the highest-revenue book per publisher from the result of joining two tables. I guess I should use a nested query, but I'm not sure how.
I also came across a similar question that seems a bit less complex, but I am struggling to adapt it to my problem.
Similar question: How to select max timestamp from each currency pair SQL?
My tables:
Book:
title  publisher  price  sold
book1  A          5      300
book2  B          15     150
book3  A          8      350
Publisher:
code  name
A     ABook
B     BBook
C     CBook
My query:
SELECT b.title, p.name, max(b.price*b.sold) as "Revenue"
FROM publisher p, book b
WHERE p.code = b.publisher
Gives:
title  publisher  Revenue
book1  ABook      1500
book2  BBook      2250
book3  ABook      2800
Desired output:
title  publisher
book2  BBook
book3  ABook
How can I alter my query to get the book with the highest revenue for each publisher, along with the publisher's name?
You can use this query:
SELECT b.title, p.name
FROM publisher p, book b
WHERE p.code = b.publisher
order by b.price*b.sold desc
Or:
select abc.title, abc.name FROM (
select b.title, p.name, max(b.price*b.sold) as balance
FROM publisher p, book b
WHERE p.code = b.publisher
group by b.title, p.name ) abc order by abc.balance desc
You can use row_number window function to select the appropriate row for each group.
Your desired results don't align with your description (do you want a revenue column or not?), but this produces your desired output. Note the use of modern (for 30 years) ANSI join syntax:
with sold as (
select *, Row_Number() over(partition by publisher order by (price * sold) desc) rn
from book b join publisher p on p.code=b.publisher
)
select title, name Publisher
from sold
where rn=1
I solved it by using a nested query and the max function:
select b.title, p.name
from book b join publisher p on p.code = b.publisher
where b.sold*b.price = (select max(sold*price)
from book t2
where t2.publisher = b.publisher
)
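For anyone who wants to try the nested-query approach locally, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. The in-memory database and the variable name top_books are my own assumptions for illustration, and SQLite stands in for whatever database the asker is actually using.

```python
import sqlite3

# Hypothetical in-memory copy of the question's sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publisher (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE book (title TEXT, publisher TEXT, price INTEGER, sold INTEGER);
    INSERT INTO publisher VALUES ('A', 'ABook'), ('B', 'BBook'), ('C', 'CBook');
    INSERT INTO book VALUES ('book1', 'A', 5, 300),
                            ('book2', 'B', 15, 150),
                            ('book3', 'A', 8, 350);
""")

# Keep a book only if its revenue equals the maximum revenue
# among all books of the same publisher (correlated subquery).
top_books = conn.execute("""
    SELECT b.title, p.name
    FROM book b
    JOIN publisher p ON p.code = b.publisher
    WHERE b.price * b.sold = (SELECT MAX(t2.price * t2.sold)
                              FROM book t2
                              WHERE t2.publisher = b.publisher)
    ORDER BY b.title
""").fetchall()

print(top_books)
```

If two books of one publisher tie on revenue, this form returns both of them, whereas a row_number-based query returns only one.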
I think I have ended up in a bit of a dead end.
Let's say I have a fairly simple dataset with two columns, person_id and book_id. It is essentially a fact table that says person X bought books A, B and C.
I know how to find out how many people have bought Book X and Book Y together:
select a.book_id as B1, b.book_id as B2, count(b.person_id) as
Bought_Together
from dbo.data a
cross join dbo.data b
where a.book_id != b.book_id and a.person_id = b.person_id
group by a.book_id, b.book_id
This is where I got stuck. I think I need something like
count(b.person_id) / all the people that bought book A * 100
but I'm not entirely sure.
I hope I was clear enough.
EDIT1: I'm currently using SQL Server 2017, so I think the answer should be in T-SQL.
In the end, the output should look similar to this. Also, there are no cases where a person bought multiple copies of the same book.
Book1 Book2 HowManyPeopleBoughtBook2
1 2 50%
1 3 7%
2 3 15%
2 1 40%
3 1 60%
3 2 20%
EDIT2: There are hundreds of thousands of rows in the database. Yes, this is related to a data science course I am taking, hence the large amount of data.
You can extend your logic to do this:
select a.book_id as B1, b.book_id as B2,
count(b.book_id) as bought_second_book,
count(b.book_id) * 1.0 / book_cnt as ratio_Bought_Together
from (select a.*, count(*) over (partition by a.book_id) as book_cnt
from dbo.data a
) a left join
dbo.data b
on a.person_id = b.person_id and a.book_id <> b.book_id
group by a.book_id, b.book_id, a.book_cnt;
This assumes that people buy a book only once. If there are duplicates, then count(distinct) would adjust for that.
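To see the ratio logic on concrete numbers, here is a small sketch in Python with sqlite3. The table name purchases and its rows are made up for illustration. It is not the exact query above: I swapped the window function for a grouped subquery that counts buyers per book, and used an inner join, so books that were never bought together with another book do not appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (person_id INTEGER, book_id INTEGER);
    -- Hypothetical data: each person buys a given book at most once.
    INSERT INTO purchases VALUES (1, 1), (1, 2),
                                 (2, 1), (2, 2), (2, 3),
                                 (3, 1),
                                 (4, 2), (4, 3);
""")

# For every ordered pair (b1, b2): how many buyers of b1 also bought b2,
# as a count and as a percentage of all buyers of b1.
pair_stats = conn.execute("""
    SELECT a.book_id AS b1,
           b.book_id AS b2,
           COUNT(*) AS bought_both,
           COUNT(*) * 100.0 / t.buyers AS pct_of_b1_buyers
    FROM purchases a
    JOIN purchases b
      ON a.person_id = b.person_id AND a.book_id <> b.book_id
    JOIN (SELECT book_id, COUNT(*) AS buyers
          FROM purchases
          GROUP BY book_id) t
      ON t.book_id = a.book_id
    GROUP BY a.book_id, b.book_id, t.buyers
    ORDER BY a.book_id, b.book_id
""").fetchall()

for b1, b2, both, pct in pair_stats:
    print(f"{b1} {b2} {both} {pct:.0f}%")
```

Note that the percentage is not symmetric: the row for (3, 2) can differ from the row for (2, 3) because the denominators are the buyers of different books.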
If you would like to generate all possible combinations of the pairs of books bought together along with the percentage of the persons who bought that combination the following can help
create table data1(book_id int, person_id int);
insert into data1
select *
from (values(1,300)
,(2,300)
,(2,301)
,(1,301)
,(3,301)
)t(book_id,person_id);
with books
as (select distinct book_id
from data1 a
)
,tot_persons
as (select count(distinct person_id) as tot_cnt
from data1
)
,pairs
as (
select a.book_id as col1 /* This block generates all possible pair combinations of books*/
,b.book_id as col2
from books a
join books b
on a.book_id<b.book_id
)
select a.col1,a.col2
,count(b.person_id)*100/(select tot_cnt from tot_persons) as percent_of_persons_buying_both
from pairs a
join data1 b
on a.col1=b.book_id
where exists(select 1
from data1 b1
where b.person_id=b1.person_id
and a.col2=b1.book_id)
group by a.col1,a.col2
On my phone, apologies for typo's
SELECT
SUM(bought_b) * 100.0 / COUNT(*)
FROM
(
SELECT
person_id,
MAX(CASE WHEN book_id = 'A' THEN 1 END) AS bought_a,
MAX(CASE WHEN book_id = 'B' THEN 1 END) AS bought_b
FROM
data
WHERE
book_id IN ('A', 'B')
GROUP BY
person_id
)
person_stats
WHERE
bought_a = 1
EDIT: I just saw that you want all combinations, not just one fixed pair.
WITH
book AS
(
SELECT DISTINCT book_id FROM data
)
SELECT
book_a_id,
book_b_id,
bought_b * 100.0 / bought_a
FROM
(
SELECT
book_a.book_id AS book_a_id,
book_b.book_id AS book_b_id,
-- everyone who bought book a
COUNT(DISTINCT data_a.person_id) AS bought_a,
-- buyers of book a who also bought book b
COUNT(DISTINCT data_b.person_id) AS bought_b
FROM
book AS book_a
CROSS JOIN
book AS book_b
INNER JOIN
data AS data_a
ON data_a.book_id = book_a.book_id
LEFT JOIN
data AS data_b
ON data_b.book_id = book_b.book_id
AND data_b.person_id = data_a.person_id
GROUP BY
book_a.book_id,
book_b.book_id
)
stats
I've got 2 tables, one with sales and one with companies:
Sales Table
Transaction_Id Shop_id Sale_date Client_ID
92356 24234 11.09.2018 12356
92345 32121 11.09.2018 32121
94323 24321 11.09.2018 21231
94278 45321 11.09.2018 42123
Company table
Client_ID Company_name
12345 ABC
13322 ABC
32321 BCD
22221 BCD
What I want to achieve is a distinct count of clients from each company for each pair of shops (clients who had at least one transaction in both shops):
Shop_Id_1 Shop_id_2 Company_name Count(distinct Client_id)
12356 12345 ABC 31
12345 14278 ABC 23
14323 12345 BCD 32
14278 12345 BCD 43
I think that I have to use a self-join, but my queries, even when filtered to one week, are killing the DB. Any thoughts on that? I'm using Microsoft SQL Server 2012.
Thanks
I think this is a self-join and aggregation, with a twist. The twist is that you want to include the company in each sales record, so it can be used in the self-join:
with sc as (
select s.*, c.company_name
from sales s join
companies c
on s.client_id = c.client_id
)
select sc1.shop_id, sc2.shop_id, sc1.company_name, count(distinct sc1.client_id)
from sc sc1 join
sc sc2
on sc1.client_id = sc2.client_id and
sc1.company_name = sc2.company_name
group by sc1.shop_id, sc2.shop_id, sc1.company_name;
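Here is a small runnable sketch of that self-join idea using Python's sqlite3 module. The rows are invented, and I added one thing that is not in the query above: the condition sc1.shop_id < sc2.shop_id, which is my own assumption that each pair should be listed once and that a shop should not be paired with itself. Drop it if you want both orderings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (transaction_id INTEGER, shop_id INTEGER, client_id INTEGER);
    CREATE TABLE companies (client_id INTEGER, company_name TEXT);
    -- Hypothetical sample rows; client 1 has two transactions in shop 10.
    INSERT INTO companies VALUES (1, 'ABC'), (2, 'ABC'), (3, 'BCD');
    INSERT INTO sales VALUES (100, 10, 1), (101, 20, 1), (102, 10, 1),
                             (103, 10, 2), (104, 20, 2), (105, 30, 2),
                             (106, 10, 3), (107, 30, 3);
""")

# Attach the company to every sale, then self-join on the client so that
# each output row is a pair of shops the same client has visited.
shop_pairs = conn.execute("""
    WITH sc AS (
        SELECT s.shop_id, s.client_id, c.company_name
        FROM sales s
        JOIN companies c ON s.client_id = c.client_id
    )
    SELECT sc1.shop_id, sc2.shop_id, sc1.company_name,
           COUNT(DISTINCT sc1.client_id) AS clients
    FROM sc sc1
    JOIN sc sc2
      ON sc1.client_id = sc2.client_id
     AND sc1.shop_id < sc2.shop_id
    GROUP BY sc1.shop_id, sc2.shop_id, sc1.company_name
    ORDER BY sc1.shop_id, sc2.shop_id, sc1.company_name
""").fetchall()

for row in shop_pairs:
    print(row)
```

COUNT(DISTINCT ...) matters here because a client with several transactions in the same shop produces several joined rows for the same pair. On a large table it would probably also help to reduce sales to distinct (shop_id, client_id) rows before the self-join.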
I think there are some issues with your question. I interpreted it to mean that the company table contains shop IDs, not client IDs.
First you can create a solution to get the shops as columns for each company. Here I chose a maximum of 5 shops per company. Remember to terminate the preceding statement with a semicolon before the CTEs.
WITH CTE_Comp AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY CompanyName ORDER BY ShopID) AS RowNumb
FROM Company AS C
)
SELECT C1.ShopID,
C2.ShopID AS ShopID_2,
C3.ShopID AS ShopID_3,
C4.ShopID AS ShopID_4,
C5.ShopID AS ShopID_5,
C1.CompanyName
INTO ShopsByCompany
FROM CTE_Comp AS C1
LEFT JOIN CTE_Comp AS C2 ON C1.CompanyName = C2.CompanyName AND C2.RowNumb = 2
LEFT JOIN CTE_Comp AS C3 ON C1.CompanyName = C3.CompanyName AND C3.RowNumb = 3
LEFT JOIN CTE_Comp AS C4 ON C1.CompanyName = C4.CompanyName AND C4.RowNumb = 4
LEFT JOIN CTE_Comp AS C5 ON C1.CompanyName = C5.CompanyName AND C5.RowNumb = 5
WHERE C1.RowNumb = 1
After that, in a few steps, I think you could get the desired result:
WITH ClientsPerShop AS
(
SELECT ShopID,
COUNT (DISTINCT ClientID) AS TotalClients
FROM Sales
GROUP BY ShopID
)
, ClientsPerCompany AS
(
SELECT CompanyName,
SUM (TotalClients) AS ClientsPerComp
FROM Company AS C
INNER JOIN ClientsPerShop AS CPS ON C.ShopID = CPS.ShopID
GROUP BY CompanyName
)
SELECT *
FROM ClientsPerCompany AS CPA
INNER JOIN ShopsByCompany AS SBC ON SBC.CompanyName = CPA.CompanyName
Hopefully this will bring you closer to your solution, best of luck!
This is my query:
SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name
Which gives me this table:
NAME NUM_BOOKS
-------------------------------------------------- ----------
Dyremann 2
Nam mann 1
Thomas 1
Asgeir 1
Tullemann 5
Plantemann 1
Beste forfatter 1
Fagmann 5
Lars 1
Hans 1
Svein Arne 1
How could I easily alter the query to display only the author with the highest number of released books? (Keep in mind that I'm rather new to SQL.)
Oracle, and as far as I know - only Oracle, allows you to nest two aggregate functions.
SELECT max (f.name) keep (dense_rank last order by count (*)) as name
from author f
JOIN book b on b.tittle = f.book
Group by f.name
In order to get ALL top authors:
select name
from (SELECT f.name,rank () over (order by count(*) desc) as rnk
from author f
JOIN book b on b.tittle = f.book
Group by f.name
)
where rnk = 1
Since Oracle 12c:
SELECT f.name
from author f
JOIN book b on b.tittle = f.book
Group by f.name
order by count (*) desc
fetch first row only /* or: fetch first row with ties, in order to get all top authors */
A simple way to do this (Oracle 12c and later) is:
SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name
Order by num_books DESC
FETCH FIRST ROW ONLY
This will order the results from biggest to smallest and return the first result.
1) Oracle-specific (using ROWNUM; for Postgres/MySQL use LIMIT):
select * from
(SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name order by num_books desc )
where ROWNUM = 1
2) A more portable query (the derived table needs an alias in most databases):
select f.name, count(*) as max_num_books from author f
JOIN book b on b.tittle = f.book
Group by f.name
having count(*) =
(select max(num_books)
from
(SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name) t
);
I am not sure why you need a join in the first place. It appears that the author table has a column book - why is it not enough to count(book) from that table, grouping by name? This arrangement is very strange - the author table should only have author properties, the author name should be in the title table, but you do join on author.book = book.title which seems to suggest that you do, in fact, have that strange arrangement (and therefore you don't need a join). Also, having a table and a column (in another table) share the same name, book, is a practice best to be avoided.
The most elementary solution (not the most efficient though), in this case, is
select name, count(book) as max_num_books
from author
group by name
having count(book) = (select max(count(book)) from author group by name);
The subquery groups by name, and then it selects the max over all group counts. The outer query selects the names that have a book count equal to this maximum. The subquery returns a single row in a single column - a single value. Such a query is called a "scalar" subquery and can be used wherever a single value is needed, such as the HAVING clause of the outer query. (It's in the HAVING clause and not a WHERE clause, since it refers to group properties - count(book) - and not to individual row properties).
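As a rough illustration of the scalar-subquery-in-HAVING idea, here is a sketch in Python with sqlite3. SQLite does not accept nested aggregates like max(count(...)), so the maximum is taken over a derived table instead; that substitution, the alias per_author, and the sample rows are my own assumptions, not part of the Oracle query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (name TEXT, book TEXT);
    -- Hypothetical rows: two authors are tied with three books each.
    INSERT INTO author VALUES ('Tullemann', 't1'), ('Tullemann', 't2'), ('Tullemann', 't3'),
                              ('Fagmann', 'f1'), ('Fagmann', 'f2'), ('Fagmann', 'f3'),
                              ('Lars', 'l1');
""")

# The inner SELECT yields one count per author, MAX(ct) collapses that to a
# single value, and HAVING keeps only the groups whose count equals it.
top_authors = conn.execute("""
    SELECT name, COUNT(book) AS max_num_books
    FROM author
    GROUP BY name
    HAVING COUNT(book) = (SELECT MAX(ct)
                          FROM (SELECT COUNT(book) AS ct
                                FROM author
                                GROUP BY name) per_author)
    ORDER BY name
""").fetchall()

print(top_authors)
```

Because the comparison is against the maximum value rather than "the first row", tied authors are all returned.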
The more efficient solution is as Dudu showed:
select name, ct as max_num_books
from ( select name, count(*) as ct, rank() over (order by count(*) desc) rnk
from author
group by name
)
where rnk = 1;
I'm having a problem using GROUP BY with multiple selected columns. I want to select the minimum bid price for each auction, but I also need the name of the user who made that bid. As you can see in the results, I get multiple rows for each auction, and if I remove columns from the GROUP BY clause I get an error message. How can I group by ID_AUCTION and still show the name of the user? Thanks for the help.
SELECT A.ID_AUCTION,MIN(B.PRICE) AS PRICE,U.NAME
FROM AUCTION A,BIDS B,USERS U,PRODUCT P
WHERE P.TYPE='CocaCola'
--Joins
and A.ID_AUCTION=B.ID_AUCTION
and B.ID_USER=U.ID_USER
and A.ID_PRODUCT=P.ID_PRODUCT
GROUP BY A.ID_AUCTION,U.NAME;
ID_AUCTION PRICE NAME
---------- ---------- -------------- ------------------------------------------
27 25 Andrew
28 40 John
27 30 Michael
28 35 Peter
The Output I Desire :
ID_AUCTION PRICE NAME
---------- ---------- -------------- ------------------------------------------
27 25 Andrew
28 35 Peter
Your query should probably look more like this:
SELECT A.ID_AUCTION, U.NAME, B.PRICE
from AUCTION A
-- lowest bid price per auction
inner join (SELECT ID_AUCTION, MIN(PRICE) as MIN_PRICE
from BIDS
group by ID_AUCTION) M
ON A.ID_AUCTION = M.ID_AUCTION
-- the bid row(s) that carry that lowest price
inner join BIDS B
ON B.ID_AUCTION = M.ID_AUCTION and B.PRICE = M.MIN_PRICE
inner join USERS U
ON B.ID_USER = U.ID_USER
inner join PRODUCT P
ON A.ID_PRODUCT = P.ID_PRODUCT
WHERE P.TYPE = 'CocaCola'
I've never seen joins written the way you did it, but that might be an Oracle thing... actually, I think comma-separated joins are considered bad practice in most SQL circles.
Give this a try, this will do the trick. It is ok with the coding style. Some old systems still work well with the ANSI-89 SQL style:
SELECT A.ID_AUCTION,
B.PRICE,
U.NAME
FROM AUCTION A,
BIDS B,
USERS U,
PRODUCT P
WHERE P.TYPE='CocaCola' --Joins
AND A.ID_AUCTION=B.ID_AUCTION
AND B.ID_USER=U.ID_USER
AND A.ID_PRODUCT=P.ID_PRODUCT
AND B.PRICE = (SELECT MIN(bids.PRICE) FROM BIDS --get only the bid with MIN price
WHERE bids.ID_AUCTION = A.ID_AUCTION);
select * from
(
SELECT A.ID_AUCTION, B.PRICE, U.NAME,
Dense_Rank() over (partition by A.ID_AUCTION order by B.PRICE asc) AS rnk
FROM
AUCTION A
JOIN
BIDS B
ON A.ID_AUCTION=B.ID_AUCTION
JOIN
USERS U
ON B.ID_USER=U.ID_USER
JOIN
PRODUCT P
ON A.ID_PRODUCT=P.ID_PRODUCT
WHERE P.TYPE='CocaCola')
where rnk = 1
First, you should group only by A.ID_AUCTION; second, in your select you could try min(U.NAME). Be aware that min(U.NAME) returns the alphabetically first bidder name in each auction, which is not necessarily the user who placed the lowest bid.
SELECT A.ID_AUCTION,MIN(B.PRICE) AS PRICE,min(U.NAME)
FROM AUCTION A,BIDS B,USERS U,PRODUCT P
WHERE P.TYPE='CocaCola'
--Joins
and A.ID_AUCTION=B.ID_AUCTION
and B.ID_USER=U.ID_USER
and A.ID_PRODUCT=P.ID_PRODUCT
GROUP BY A.ID_AUCTION;
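The answers above differ in whether the name comes from the same row as the minimum price. Here is a small sketch in Python with sqlite3 that contrasts the two on the bids from the question. To keep it short I flattened everything into a single hypothetical bids table with the user name inline, which is a simplification of the asker's four-table schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bids (id_auction INTEGER, name TEXT, price INTEGER);
    -- The four bids shown in the question's output.
    INSERT INTO bids VALUES (27, 'Andrew', 25), (27, 'Michael', 30),
                            (28, 'John', 40), (28, 'Peter', 35);
""")

# Correlated subquery: keep the bid row whose price is the auction's minimum,
# so the name belongs to the user who actually placed that bid.
lowest_bids = conn.execute("""
    SELECT b.id_auction, b.price, b.name
    FROM bids b
    WHERE b.price = (SELECT MIN(b2.price)
                     FROM bids b2
                     WHERE b2.id_auction = b.id_auction)
    ORDER BY b.id_auction
""").fetchall()

# Two independent aggregates: MIN(name) is simply the alphabetically first
# bidder in the auction, unrelated to who bid MIN(price).
independent_mins = conn.execute("""
    SELECT id_auction, MIN(price), MIN(name)
    FROM bids
    GROUP BY id_auction
    ORDER BY id_auction
""").fetchall()

print(lowest_bids)
print(independent_mins)
```

For auction 28 the second query pairs the price 35 with John, although Peter placed that bid, which is why the row-based approaches (correlated subquery or a ranking function) are the safer choice here.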
I have two tables: Car and CarBorrowed.
The Car table contains all cars in the car pool with an ID and a group the car belongs to. For example:
ID 1, Car 1, Group Renault
ID 2, Car 3, Group Renault
ID 3, Car 4, Group VW
ID 4, Car 6, Group BMW
ID 5, Car 7, Group BMW
The CarBorrowed table contains all cars which are borrowed on a particular day:
Car 1, Borrowed on 23.08.2012
Car 3, Borrowed on 23.08.2012
Car 5, Borrowed on 23.08.2012
Now I want all groups, where no cars are left (today= 23.08.2012). So I should get "Group Renault"
First, join the tables so that each car is paired with its borrow record for the day (if any).
select c.id, c.GroupName, cb.day
from car c
left join (select * from CarBorrow where day = '23 Aug 2012') cb
on (c.id = cb.id)
All cars that are not borrowed will have null in day.
Next, we select all groups that have no nulls.
Below is a trick to get that:
select GroupName
FROM(
select c.id, c.GroupName, cb.day
from car c
left join (select * from CarBorrow where day = '23 Aug 2012') cb
on (c.id = cb.id)
)
group by GroupName
having count(day) = count(*)
(Days that are null are not counted by COUNT)
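The COUNT(day) = COUNT(*) trick can be tried out with a short Python and sqlite3 sketch. The table and column names (car, car_borrow, car_id, groupname), the ISO date strings, and the helper fully_borrowed_groups are assumptions of mine for illustration. I also moved the day filter into the join condition instead of a subquery, which should be equivalent for a left join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (id INTEGER PRIMARY KEY, groupname TEXT);
    CREATE TABLE car_borrow (car_id INTEGER, day TEXT);
    -- Hypothetical pool: both Renaults and one BMW are out on 2012-08-23.
    INSERT INTO car VALUES (1, 'Renault'), (2, 'Renault'), (3, 'VW'),
                           (4, 'BMW'), (5, 'BMW');
    INSERT INTO car_borrow VALUES (1, '2012-08-23'), (2, '2012-08-23'),
                                  (4, '2012-08-23'), (3, '2012-08-22');
""")

def fully_borrowed_groups(day):
    """Return the groups in which every car is borrowed on the given day."""
    rows = conn.execute("""
        SELECT c.groupname
        FROM car c
        LEFT JOIN car_borrow cb
          ON cb.car_id = c.id AND cb.day = ?
        GROUP BY c.groupname
        -- COUNT(cb.day) skips the NULLs produced for cars that are not borrowed
        HAVING COUNT(cb.day) = COUNT(*)
        ORDER BY c.groupname
    """, (day,)).fetchall()
    return [r[0] for r in rows]

print(fully_borrowed_groups('2012-08-23'))
print(fully_borrowed_groups('2012-08-22'))
```

A group qualifies only when no joined row has a NULL day, that is, when no car in the group is still available.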
SELECT distinct(D1.CARGROUP)
FROM den_car d1
MINUS
(SELECT D.CARGROUP
FROM den_car d
WHERE d.id IN (SELECT c.ID
FROM den_car c
MINUS
SELECT b.id
FROM den_car_borrow b
WHERE B.DATE_BORROW = TRUNC (SYSDATE)))
This could be optimized, but the idea is simple: find the borrowed cars and subtract them from all cars. Then subtract the groups that still have cars left from all groups.
Hope it helps. (By the way of course there are lots of other ways to do it.)
Hmmm . . . One way to approach this query is to count the cars in a group and also count the cars on a particular day, then take the groups where the borrowed equals the available:
select borrowed.BorrowedOn, available.CarGroup
from (select c.CarGroup, count(*) as cnt
from car c
group by c.CarGroup
) available left outer join
(select c.CarGroup, cb.BorrowedOn, count(*) as cnt
from CarBorrowed cb join
Car c
on cb.CarId = c.CarId
group by c.CarGroup, cb.BorrowedOn
) borrowed
on available.CarGroup = borrowed.CarGroup
where available.cnt = borrowed.cnt
By the way, "Group" is a bad name for a column, since it is a SQL reserved word. I've renamed it to CarGroup.
If the same car can be borrowed more than once on a given day, then change the count(*) in the second subquery to count(distinct cb.carId).
If you want just one day, you can add a clause to the WHERE clause.
I think I have a solution now:
select x.groupname from
(select a.groupname, count(*) as cnt from car a group by a.groupname) x
inner join
(
select b.groupname, count(*) as cnt from car b where b.carid in (select caraid from modavail where day = '23.08.2012')
group by b.groupname
) y
on x.groupname = y.groupname
where x.cnt = y.cnt and y.cnt <> 0 order by x.groupname;
Thanks for your help!!!!