SQL to calculate author with most books - sql

I have a table of books, a table of authors, and a "linker" table (many to many links between authors/books).
How do I find the authors with the highest number of books?
This is my schema:
books : rowid, name
authors : rowid, name
book_authors : rowid, book_id, author_id
This is what I came up with: (but it doesn't work)
SELECT count(*) IN book_authors
WHERE (SELECT count(*) IN book_authors
WHERE author_id = author_id)
And ideally I would like a report of the top 100 authors, something like:
author_name book_count
-----------------------------------
Johnny 25
Kelly 12
Ramboz 10
Do I need some kind of join? What is the fastest approach?

I'd join the three tables (via the book_authors table), group by the author, count occurrences and limit it to the top 100 rows:
SELECT a.name, COUNT(*)
FROM authors a
JOIN book_authors ba ON a.rowid = ba.author_id
JOIN books b ON ba.book_id = b.rowid
GROUP BY a.name
ORDER BY 2 DESC
LIMIT 100
EDIT:
Actually, we aren't using any data from books, only the fact that the book exists, which can be inferred from book_authors. The query can therefore be improved by dropping the second join:
SELECT a.name, COUNT(*)
FROM authors a
JOIN book_authors ba ON a.rowid = ba.author_id
GROUP BY a.name
ORDER BY 2 DESC
LIMIT 100
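
To try the single-join query without setting up a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the helper names (make_sample_db, top_authors) are made up for illustration, and the tables rely on SQLite's implicit rowid. Grouping by a.rowid as well as a.name is an addition here, so that two different authors who share a name are not merged.

```python
import sqlite3


def make_sample_db():
    """Build an in-memory database with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE books (name TEXT);
        CREATE TABLE authors (name TEXT);
        CREATE TABLE book_authors (book_id INTEGER, author_id INTEGER);
        INSERT INTO authors (name) VALUES ('Johnny'), ('Kelly'), ('Ramboz');
        INSERT INTO books (name) VALUES ('A'), ('B'), ('C'), ('D');
        INSERT INTO book_authors (book_id, author_id) VALUES
            (1, 1), (2, 1), (3, 1),
            (3, 2), (4, 2),
            (4, 3);
        """
    )
    return conn


def top_authors(conn, limit=100):
    """Return (author_name, book_count) rows, most prolific authors first."""
    return conn.execute(
        """
        SELECT a.name, COUNT(*) AS book_count
        FROM authors a
        JOIN book_authors ba ON ba.author_id = a.rowid
        GROUP BY a.rowid, a.name
        ORDER BY book_count DESC, a.name
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


for name, count in top_authors(make_sample_db()):
    print(name, count)
```

An index on book_authors(author_id) is the usual first thing to try if the real table is large, though whether it helps depends on the database and the data.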

Couldn't you just
select count(1) , Author_ID from Book_Authors group by Author_ID order by count(1) desc limit 100
The authors with the most books (or at least their author_id values) would be at the top.
The LIMIT clause restricts the result to the top 100 rows.

SELECT TOP 3 authors.author_name, authors.book_name, books.sold_copies,
(SELECT SUM(books.sold_copies) FROM books WHERE authors.book_name = books.book_name ) AS Total
FROM authors
INNER JOIN books
ON authors.book_name = books.book_name
ORDER BY sold_copies desc

Related

How to perform max on an inner join with 2 different counts on columns?

How to find the user with the most referrals that have at least three blue shoes using PostgreSQL?
table 1 - users
name (matches shoes.owner_name)
referred_by (foreign keyed to users.name)
table 2 - shoes
owner_name (matches users.name)
shoe_name
shoe_color
What I have so far is separate queries returning parts of what I want above:
(SELECT count(*) as shoe_count
FROM shoes
GROUP BY owner_name
WHERE shoe_color = “blue”
AND shoe_count>3) most_shoes
INNER JOIN
(SELECT count(*) as referral_count
FROM users
GROUP BY referred_by
) most_referrals
ORDER BY referral_count DESC
LIMIT 1
Two subqueries seem like the way to go. They would look like:
SELECT s.owner_name, s.shoe_count, r.referral_count
FROM (SELECT owner_name, count(*) as shoe_count
FROM shoes
WHERE shoe_color = 'blue'
GROUP BY owner_name
HAVING count(*) >= 3
) s JOIN
(SELECT referred_by, count(*) as referral_count
FROM users
GROUP BY referred_by
) r
ON s.owner_name = r.referred_by
ORDER BY r.referral_count DESC
LIMIT 1 ;
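
The two-subquery join can be checked on a small data set. The sketch below runs it with Python's sqlite3 module rather than PostgreSQL, and the users and shoes rows are invented for illustration: ann has the most referrals but only two blue shoes, so bob should win. The HAVING clause repeats the aggregate instead of using the column alias, since PostgreSQL does not accept an output alias there.

```python
import sqlite3

QUERY = """
SELECT s.owner_name, s.shoe_count, r.referral_count
FROM (SELECT owner_name, COUNT(*) AS shoe_count
      FROM shoes
      WHERE shoe_color = 'blue'
      GROUP BY owner_name
      HAVING COUNT(*) >= 3) s
JOIN (SELECT referred_by, COUNT(*) AS referral_count
      FROM users
      GROUP BY referred_by) r
  ON s.owner_name = r.referred_by
ORDER BY r.referral_count DESC
LIMIT 1
"""


def make_sample_db():
    """Build an in-memory database with hypothetical users and shoes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (name TEXT PRIMARY KEY, referred_by TEXT);
        CREATE TABLE shoes (owner_name TEXT, shoe_name TEXT, shoe_color TEXT);
        INSERT INTO users VALUES
            ('ann', NULL), ('bob', 'ann'), ('cat', 'ann'), ('dan', 'ann'),
            ('eve', 'bob'), ('fay', 'bob');
        INSERT INTO shoes VALUES
            ('ann', 'a1', 'blue'), ('ann', 'a2', 'blue'), ('ann', 'a3', 'red'),
            ('bob', 'b1', 'blue'), ('bob', 'b2', 'blue'), ('bob', 'b3', 'blue');
        """
    )
    return conn


def top_referrer(conn):
    """Return (owner_name, shoe_count, referral_count) or None if nobody qualifies."""
    return conn.execute(QUERY).fetchone()


print(top_referrer(make_sample_db()))
```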

How to select the highest value after a count() | Sql Oracle

This is my query:
SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name
Which gives me this table:
NAME NUM_BOOKS
-------------------------------------------------- ----------
Dyremann 2
Nam mann 1
Thomas 1
Asgeir 1
Tullemann 5
Plantemann 1
Beste forfatter 1
Fagmann 5
Lars 1
Hans 1
Svein Arne 1
How could I easily alter the query to display only the author with the highest number of released books? (Keep in mind that I'm rather new to SQL.)
Oracle, and as far as I know - only Oracle, allows you to nest two aggregate functions.
SELECT max (f.name) keep (dense_rank last order by count (*)) as name
from author f
JOIN book b on b.tittle = f.book
Group by f.name
In order to get ALL top authors:
select name
from (SELECT f.name,rank () over (order by count(*) desc) as rnk
from author f
JOIN book b on b.tittle = f.book
Group by f.name
)
where rnk = 1
Since Oracle 12c:
SELECT f.name
from author f
JOIN book b on b.tittle = f.book
Group by f.name
order by count (*) desc
fetch first row only /* or: fetch first row with ties, in order to get all top authors */
The best way to do this is to use:
SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name
Order by num_books DESC
FETCH FIRST ROW ONLY
This will order the results from biggest to smallest and return the first result.
1) Oracle-specific (using ROWNUM; for Postgres/MySQL use LIMIT):
select * from
(SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name order by num_books desc )
where ROWNUM = 1
2) General query for all databases:
select f.name,count(*) as max_num_books from author f
JOIN book b on b.tittle = f.book
Group by f.name
having count(*) =
(select max(num_books)
from
(SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name) t
);
I am not sure why you need a join in the first place. The author table appears to have a book column, so why is it not enough to count(book) from that table, grouping by name? This arrangement is very strange: the author table should hold only author properties, and the author name should be in the title table. Your join on author.book = book.tittle, however, suggests that you do in fact have that strange arrangement (and therefore you don't need a join). Also, giving a table and a column in another table the same name, book, is a practice best avoided.
The most elementary solution (not the most efficient though), in this case, is
select name, count(book) as max_num_books
from author
group by name
having count(book) = (select max(count(book)) from author group by name);
The subquery groups by name, and then it selects the max over all group counts. The outer query selects the names that have a book count equal to this maximum. The subquery returns a single row in a single column - a single value. Such a query is called a "scalar" subquery and can be used wherever a single value is needed, such as the HAVING clause of the outer query. (It's in the HAVING clause and not a WHERE clause, since it refers to group properties - count(book) - and not to individual row properties).
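
The scalar-subquery pattern described above can be sketched outside Oracle as well. The example below uses Python's sqlite3 module; because SQLite does not allow nested aggregates such as max(count(book)), the maximum is taken over a derived table instead, which is a substitution made here and not part of the original answer. The sample rows and helper names are hypothetical.

```python
import sqlite3

QUERY = """
SELECT name, COUNT(book) AS max_num_books
FROM author
GROUP BY name
HAVING COUNT(book) = (
    -- scalar subquery: one row, one column (the largest per-author count)
    SELECT MAX(ct)
    FROM (SELECT COUNT(book) AS ct FROM author GROUP BY name) AS per_author
)
ORDER BY name
"""


def make_sample_db():
    """One row per (author name, book title), as in the question's author table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (name TEXT, book TEXT);
        INSERT INTO author VALUES
            ('Tullemann', 'T1'), ('Tullemann', 'T2'),
            ('Fagmann', 'F1'), ('Fagmann', 'F2'),
            ('Lars', 'L1');
        """
    )
    return conn


def top_authors_with_ties(conn):
    """Return every author whose book count equals the maximum count."""
    return conn.execute(QUERY).fetchall()


print(top_authors_with_ties(make_sample_db()))
```

Because the comparison is against the maximum itself, every author tied for first place is returned, not just one of them.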
The more efficient solution is as Dudu showed:
select name, ct as max_num_books
from ( select name, count(*) as ct, rank() over (order by count(*) desc) rnk
from author
group by name
)
where rnk = 1;

SQL Query needed to get information from TWO separate tables

I am trying to create a query that will list all books by the same author. Most authors in the list have only one book, but for any author with multiple books in the db I want to display all of that author's books.
I have two tables:
Book - AuthorID, BkTitle, etc
Author - AuthorID, AuthFName, AuthLName
I want the result to be sorted by AuthLName and the report to consist of any books in db that have same authorid.
Example result wanted:
AUTHORID BKTITLE AUTHFNAME AUTHLNAME
--------- ----------------- ------------ -----------
504 KNIGHT FREEDOM Chris Feehan
504 KNIGHT SHOWDOWN Chris Feehan
Currently, I have the following code:
select AUTHORID, BKTITLE
from BOOK
where AUTHORID in
(select AUTHORID from
(select AUTHORID,
count(*) as BOOK_COUNT
from BOOK
group by AUTHORid
order by AUTHORid )
where BOOK_COUNT >= 2);
which gives:
AUTHORID BKTITLE
---------- --------------------
504 KNIGHT FREEDOM
504 KNIGHT SHOWDOWN
I need to find a way to get the information from the Author Table and add it in this.
This should do:
SELECT b.AUTHORID, b.BKTITLE, a.AUTHFNAME, a.AUTHLNAME
FROM BOOK b
INNER JOIN AUTHOR a
ON b.AUTHORID = a.AUTHORID
AND b.AUTHORID IN
(
SELECT AUTHORID
FROM BOOK
GROUP BY AUTHORID
HAVING COUNT(AUTHORID) > 1
)
ORDER BY a.AUTHLNAME, a.AUTHFNAME
How about this - updated to use a CTE (Common Table Expression) first to figure out which authors have more than one book in the database table BOOK, and then listing only those authors and their books:
;WITH AuthorsWithMoreThanOneBook AS
(
SELECT AUTHORID, BOOK_COUNT = COUNT(*)
FROM BOOK
GROUP BY AUTHORID
HAVING COUNT(*) > 1
)
SELECT
b.AUTHORID, b.BKTITLE, a.AuthFName, a.AuthLName
FROM
BOOK b
INNER JOIN
AUTHOR a ON b.AuthorID = a.AuthorID
INNER JOIN
AuthorsWithMoreThanOneBook A2 ON a.AuthorID = A2.AuthorID
ORDER BY
a.AuthLName, a.AuthFName
Update: OK, you're using Oracle .... not sure (haven't used it in ages) - but can't you just extend your original query something like this:
select bk.AUTHORID, bk.BKTITLE, a.AUTHFNAME, a.AUTHLNAME
from BOOK bk
INNER JOIN AUTHOR a ON a.AUTHORID = bk.AUTHORID
where bk.AUTHORID in
(select AUTHORID from
(select AUTHORID,
count(*) as BOOK_COUNT
from BOOK
group by AUTHORid
order by AUTHORid )
where BOOK_COUNT >= 2);
Note that Oracle accepts table aliases only without the AS keyword (BOOK bk, not BOOK AS bk).
You can do this with only one access to each table:
SELECT * FROM
(
SELECT AuthorID, BkTitle, AuthFName, AuthLName
,COUNT(*) OVER (PARTITION BY AuthorID)
AS c
FROM BOOK
JOIN AUTHOR USING (AuthorID)
)
WHERE c > 1;
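
As a rough check of the window-function approach, the sketch below runs the same idea with Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the second author and the third book are invented rows, and the derived-table alias and ORDER BY are additions for portability and stable output.

```python
import sqlite3

QUERY = """
SELECT AuthorID, BkTitle, AuthFName, AuthLName
FROM (
    SELECT AuthorID, BkTitle, AuthFName, AuthLName,
           COUNT(*) OVER (PARTITION BY AuthorID) AS c  -- books per author
    FROM Book
    JOIN Author USING (AuthorID)
) AS counted
WHERE c > 1
ORDER BY AuthLName, BkTitle
"""


def make_sample_db():
    """Hypothetical rows: one author with two books, one with a single book."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Author (AuthorID INTEGER PRIMARY KEY,
                             AuthFName TEXT, AuthLName TEXT);
        CREATE TABLE Book (AuthorID INTEGER, BkTitle TEXT);
        INSERT INTO Author VALUES (504, 'Chris', 'Feehan'), (601, 'Pat', 'Solo');
        INSERT INTO Book VALUES
            (504, 'KNIGHT FREEDOM'), (504, 'KNIGHT SHOWDOWN'), (601, 'LONE BOOK');
        """
    )
    return conn


def books_by_repeat_authors(conn):
    """Return the books of every author who has more than one book."""
    return conn.execute(QUERY).fetchall()


for row in books_by_repeat_authors(make_sample_db()):
    print(row)
```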

Text search of a many-to-many data relation

I know this must have been answered before here, but I simply can't find a matching question.
Using LIKE '%keyword%', I want to search a many-to-many data relationship in an MSSQL database and reduce it to a one-to-one result set. The two tables are joined through a linking table. Here's a very simplified version of what I'm talking about:
Books:
book_id title
1 Treasure Island
2 Poe Collected Stories
3 Invest in Treasure Islands
Categories:
category_id name
1 Children
2 Adventure
3 Horror
4 Classic
5 Money
BookCategory:
book_id category_id
1 1
1 2
1 4
2 3
2 4
3 5
What I want to do is search for a phrase in the title (e.g. '%treasure island%') and get matching Books records that contain the search string and the single highest matching Categories record that goes with each book -- I want to discard the lesser category records. In other words, I'm looking for this:
book_id title category_id name
1 Treasure Island 4 Classic
3 Invest in Treasure Islands 5 Money
Any suggestions?
Try this. Filter your lookup table, then join:
With maxCategories AS
(select book_id, max(category_id) as category_id from BookCategory group by book_id)
select Books.book_id, Books.Title, Categories.category_id, Categories.name
from Books
inner join maxCategories on (Books.book_id = maxCategories.book_id)
inner join Categories on (Categories.category_id = maxCategories.category_id)
where Books.title like '%treasure island%'
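
The filter-then-join idea can be run against the question's own sample data. This is a sketch using Python's sqlite3 module instead of MSSQL; the function name and the ORDER BY are additions for illustration. SQLite's LIKE is case-insensitive for ASCII by default, which roughly mirrors a case-insensitive MSSQL collation.

```python
import sqlite3

QUERY = """
WITH maxCategories AS (
    SELECT book_id, MAX(category_id) AS category_id
    FROM BookCategory
    GROUP BY book_id
)
SELECT b.book_id, b.title, c.category_id, c.name
FROM Books b
JOIN maxCategories m ON m.book_id = b.book_id
JOIN Categories c ON c.category_id = m.category_id
WHERE b.title LIKE ?
ORDER BY b.book_id
"""


def make_sample_db():
    """Load the sample rows shown in the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Books (book_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE Categories (category_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE BookCategory (book_id INTEGER, category_id INTEGER);
        INSERT INTO Books VALUES
            (1, 'Treasure Island'), (2, 'Poe Collected Stories'),
            (3, 'Invest in Treasure Islands');
        INSERT INTO Categories VALUES
            (1, 'Children'), (2, 'Adventure'), (3, 'Horror'),
            (4, 'Classic'), (5, 'Money');
        INSERT INTO BookCategory VALUES
            (1, 1), (1, 2), (1, 4), (2, 3), (2, 4), (3, 5);
        """
    )
    return conn


def search_titles(conn, keyword):
    """Return matching books, each with its highest-numbered category."""
    return conn.execute(QUERY, (f"%{keyword}%",)).fetchall()


for row in search_titles(make_sample_db(), "treasure island"):
    print(row)
```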
Try:
select * from
(select b.*,
c.*,
row_number() over (partition by bc.book_id
order by bc.category_id desc) rn
from Books b
join BookCategory bc on b.book_id = bc.book_id
join Categories c on bc.category_id = c.category_id
where b.title like '%treasure island%') sq
where rn=1

I need a query that retrieves this result

I have a table of 3 columns:
AuthorID (id can be repeated)
JournalName (name can be repeated)
AuthorScore
I need a query that gets JournalName and the count of all authors having their maximum score in this journal.
Thank you in advance.
select
maxscorejournalinstances.journalname,
COUNT(*) as maxscorecount
from
(
select
journalname
from
foo inner join
(
select
authorid,
MAX(authorscore) as maxscore
from
foo
group by
authorid
) maxauthorscores
on foo.AuthorId = maxauthorscores.AuthorId
and foo.AuthorScore = maxauthorscores.maxscore
) maxscorejournalinstances
group by
maxscorejournalinstances.JournalName
Note that if an author has the same high score in two or more journals, each of those journals will be included in the resultset.
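
The tie behaviour mentioned in the note can be seen on a small example. The sketch below runs a compact form of the same join with Python's sqlite3 module; the table name foo comes from the answer, while the rows are hypothetical. Author 1 scores 9 in both Nature and Science, so both journals are counted for that author.

```python
import sqlite3

QUERY = """
SELECT f.JournalName, COUNT(*) AS maxscorecount
FROM foo f
JOIN (SELECT AuthorID, MAX(AuthorScore) AS maxscore
      FROM foo
      GROUP BY AuthorID) m
  ON f.AuthorID = m.AuthorID
 AND f.AuthorScore = m.maxscore
GROUP BY f.JournalName
ORDER BY f.JournalName
"""


def make_sample_db():
    """Hypothetical scores; author 1 has the same top score in two journals."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE foo (AuthorID INTEGER, JournalName TEXT, AuthorScore INTEGER);
        INSERT INTO foo VALUES
            (1, 'Nature', 9), (1, 'Science', 9), (1, 'Cell', 4),
            (2, 'Nature', 7), (2, 'Cell', 3),
            (3, 'Cell', 5);
        """
    )
    return conn


def max_score_counts(conn):
    """Return (journal, number of authors whose best score is in that journal)."""
    return conn.execute(QUERY).fetchall()


for row in max_score_counts(make_sample_db()):
    print(row)
```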
SELECT AuthorID, MAX(AuthorScore) as AuthorScore,
(
SELECT JournalName
FROM tab t2
WHERE t1.AuthorID = t2.AuthorID AND t2.AuthorScore = MAX(t1.AuthorScore)
) as JournalName
FROM tab t1
GROUP BY AuthorID
select
x.journalname, count(x.authorid)
from tableX x
inner join
(
select authorid, max(authorscore) max_authorscore
from tableX
group by authorid
) tmp on x.authorid=tmp.authorid and x.authorscore=tmp.max_authorscore
group by journalname
Something like this may work.
select authorid, journalname, authorscore, max(authorscore) over(partition by authorid)
from <table>
order by journalname
Doing some research on SQL OLAP (window) functions should point you in the right direction if this doesn't work.
It sounds like you need two queries, because the two results don't fit into a single record set with a layout that is easy to read.
The count of authors per journal:
select JournalName, count(distinct AuthorID)
from table
group by JournalName
Each author's max score per journal:
select JournalName, AuthorID, max(AuthorScore)
from table
group by JournalName, AuthorID