SQL: Select count of a record in right table with joins - sql

I have two tables, one for mobiles and one for reviews. The reviews table stores the reviews of a specific mobile against its mobile id.
Structure of mobiles table.
mobile_id | mobile_name
Structure of reviews table.
review_id | mobile_id | review_body
So far I have written this query.
SELECT c.*, p.review_body
FROM ((select mobile_id, mobile_name from mobiles
WHERE brand_id=1 limit 0,5) c)
left JOIN
(
SELECT mobile_id,
MAX(review_id) MaxDate
FROM reviews
GROUP BY mobile_id
) MaxDates ON c.mobile_id = MaxDates.mobile_id left JOIN
reviews p ON MaxDates.mobile_id = p.mobile_id
AND MaxDates.MaxDate = p.review_id
This query returns the first 5 mobiles from mobile table and their latest (one) review from review table. This is the result it returns.
mobile_id | mobile_name | review_body
Question: I also want a review_count column. review_count should equal the total number of reviews a mobile has in the reviews table for its mobile_id.
So please tell me how this can be done by extending the single query I already have. Any help would be appreciated, as I have been trying to do this for 24 hours.

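I think this would work: add count(review_id) to the derived table that already groups reviews by mobile_id. Because of the left join, mobiles with no reviews get NULL for review_count.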
SELECT c.*, p.review_body, MaxDates.review_count
FROM ((select mobile_id, mobile_name from mobiles
WHERE brand_id=1 limit 0,5) c)
left JOIN
(
SELECT mobile_id,count(review_id) review_count,
MAX(review_id) MaxDate
FROM reviews
GROUP BY mobile_id
) MaxDates ON c.mobile_id = MaxDates.mobile_id left JOIN
reviews p ON MaxDates.mobile_id = p.mobile_id
AND MaxDates.MaxDate = p.review_id
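To check the idea on a small data set, here is a minimal sketch that runs the same shape of query in SQLite through Python's sqlite3 module. The sample rows, the alias names agg and max_review_id, and the COALESCE that turns a missing count into 0 are my own assumptions, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mobiles (mobile_id INTEGER PRIMARY KEY, mobile_name TEXT, brand_id INTEGER);
CREATE TABLE reviews (review_id INTEGER PRIMARY KEY, mobile_id INTEGER, review_body TEXT);
-- Made-up sample data: 'Alpha' has three reviews, 'Beta' has none.
INSERT INTO mobiles VALUES (1, 'Alpha', 1), (2, 'Beta', 1), (3, 'Gamma', 2);
INSERT INTO reviews VALUES (1, 1, 'ok'), (2, 1, 'good'), (3, 1, 'great');
""")

query = """
SELECT c.mobile_id, c.mobile_name, p.review_body,
       COALESCE(agg.review_count, 0) AS review_count
FROM (SELECT mobile_id, mobile_name FROM mobiles
      WHERE brand_id = 1 ORDER BY mobile_id LIMIT 5) c
LEFT JOIN (SELECT mobile_id,
                  COUNT(review_id) AS review_count,   -- total reviews per mobile
                  MAX(review_id)   AS max_review_id   -- id of the latest review
           FROM reviews
           GROUP BY mobile_id) agg ON c.mobile_id = agg.mobile_id
LEFT JOIN reviews p ON agg.mobile_id = p.mobile_id
                   AND agg.max_review_id = p.review_id
ORDER BY c.mobile_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The count and the latest review id come from the same grouped subquery, so the reviews table is aggregated only once.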

Related

SQL multiple Join Question, can't join 5 tables, problem with max

I got 6 tables:
Albums
id_album | title | id_band | year |
Bands
id_band | name |style | origin
composers
id_musician | id_song
members
id_musician | id_band | instrument
musicians
id_musician | name | birth | death | gender
songs
id_song | title | duration | id_album
I need to write a query that gets the six bands with the most members and, for each of those bands, the longest song duration and its title.
So far, I can get the biggest bands:
SELECT bands.name, COUNT(id_musician) AS numberMusician
FROM bands
INNER JOIN members USING (id_band)
GROUP BY bands.name
ORDER BY numberMusician DESC
LIMIT 6;
I can also get the longest songs:
SELECT MAX(duration), songs.title, id_album, id_band
FROM SONGs
INNER JOIN albums USING (id_album)
GROUP BY songs.title, id_album, id_band
ORDER BY MAX(duration) DESC
The problem occurs when I am trying to write a subquery to get the band with the corresponding song and its duration. Trying to do it with inner joins also gets me undesired results. Could someone help me?
I have tried to put the subquery in the where, but I can't find how to do it due to MAX.
Thanks
I find that using lateral joins makes the query easier to write. You already have the join logic right, so we just need to correlate the bands with the musicians and the songs.
So:
select b.name, m.*, s.*
from bands b
cross join lateral (
select count(*) as cnt_musicians
from members m
where m.id_band = b.id_band
) m
cross join lateral (
select s.title, s.duration
from songs s
inner join albums a using (id_album)
where a.id_band = b.id_band
order by s.duration desc limit 1
) s
order by m.cnt_musicians desc
limit 6
For each band, subquery m counts the number of musicians in that band (its where clause correlates to the outer query), while s retrieves the longest song, using correlation, order by and limit. The outer query just combines the information, then orders the rows and selects the top 6 bands.
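If your database has no lateral joins, the same correlation can be written with scalar subqueries in the select list. The sketch below is a swapped-in variant of that kind, not the lateral query itself; it runs in SQLite (which has no LATERAL) via Python's sqlite3 module, and the sample bands, songs and the column aliases are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bands   (id_band INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE members (id_musician INTEGER, id_band INTEGER);
CREATE TABLE albums  (id_album INTEGER PRIMARY KEY, id_band INTEGER);
CREATE TABLE songs   (id_song INTEGER PRIMARY KEY, title TEXT, duration INTEGER, id_album INTEGER);
-- Made-up sample data: band A has 3 members, band B has 1.
INSERT INTO bands VALUES (1, 'A'), (2, 'B');
INSERT INTO members VALUES (10, 1), (11, 1), (12, 1), (13, 2);
INSERT INTO albums VALUES (100, 1), (200, 2);
INSERT INTO songs VALUES (1, 'Short', 100, 100), (2, 'Long', 300, 100), (3, 'Only', 200, 200);
""")

query = """
SELECT b.name,
       -- each subquery is correlated to the outer row through b.id_band
       (SELECT COUNT(*) FROM members m
         WHERE m.id_band = b.id_band) AS cnt_musicians,
       (SELECT s.title FROM songs s
          JOIN albums a ON a.id_album = s.id_album
         WHERE a.id_band = b.id_band
         ORDER BY s.duration DESC LIMIT 1) AS longest_title,
       (SELECT MAX(s.duration) FROM songs s
          JOIN albums a ON a.id_album = s.id_album
         WHERE a.id_band = b.id_band) AS longest_duration
FROM bands b
ORDER BY cnt_musicians DESC, b.name
LIMIT 6
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The lateral version reads title and duration from one subquery; here they need two, which is the main cost of this workaround.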

INNER JOIN of pageviews, contacts and companies - duplicated entries

In short: 3 table inner join duplicates records
I have data in BigQuery in 3 tables:
Pageviews with columns:
timestamp
user_id
title
path
Contacts with columns:
website_user_id
email
company_id
Companies with columns:
id
name
I want to display all recorded pageviews and, if user and/or company is known, display this data next to pageview.
First, I join contact and pageviews data (SQL is generated by Metabase business intelligence tool):
SELECT
`analytics.pageviews`.`timestamp` AS `timestamp`,
`analytics.pageviews`.`title` AS `title`,
`analytics.pageviews`.`path` AS `path`,
`Contacts`.`email` AS `email`
FROM `analytics.pageviews`
INNER JOIN `analytics.contacts` `Contacts` ON `analytics.pageviews`.`user_id` = `Contacts`.`website_user_id`
ORDER BY `timestamp` DESC
It works as expected and I can see pageviews attributed to known contacts.
Next, I'd like to show pageviews of contacts with a known company, together with the company name:
SELECT
`analytics.pageviews`.`timestamp` AS `timestamp`,
`analytics.pageviews`.`title` AS `title`,
`analytics.pageviews`.`path` AS `path`,
`Contacts`.`email` AS `email`,
`Companies`.`name` AS `name`
FROM `analytics.pageviews`
INNER JOIN `analytics.contacts` `Contacts` ON `analytics.pageviews`.`user_id` = `Contacts`.`website_user_id`
INNER JOIN `analytics.companies` `Companies` ON `Contacts`.`company_id` = `Companies`.`id`
ORDER BY `timestamp` DESC
With this query I would expect to see only pageviews where associated contact AND company are known (just another column for company name). The problem is, I get duplicate rows for every pageview (sometimes 5, sometimes 20 identical rows).
I want to avoid selecting DISTINCT timestamps because it can lead to excluding valid pageviews from different users but with identical timestamp.
How to approach this?
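Your description sounds like you have duplicates in companies. An inner join returns one row per matching company row, so a repeated id multiplies every pageview that points at it. This is easy to test for: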
select c.id, count(*)
from `analytics.companies` c
group by c.id
having count(*) >= 2;
You can get the details using window functions:
select c.*
from (select c.*, count(*) over (partition by c.id) as cnt
from `analytics.companies` c
) c
where cnt >= 2
order by cnt desc, id;
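To see the effect in isolation, here is a small sketch in SQLite through Python's sqlite3 module rather than BigQuery. The tables are cut down to a few columns, the data is invented, and the SELECT DISTINCT workaround at the end assumes the duplicate company rows are exact copies; cleaning the source table would be the better fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pageviews (ts TEXT, user_id INTEGER, title TEXT);
CREATE TABLE contacts  (website_user_id INTEGER, email TEXT, company_id INTEGER);
CREATE TABLE companies (id INTEGER, name TEXT);
INSERT INTO pageviews VALUES ('2020-01-01 10:00', 7, 'Home');
INSERT INTO contacts  VALUES (7, 'a@example.com', 1);
-- The same company row loaded three times (hypothetical bad import).
INSERT INTO companies VALUES (1, 'Acme'), (1, 'Acme'), (1, 'Acme');
""")

# One pageview joined to a company id that occurs three times gives three rows.
dup_rows = conn.execute("""
SELECT p.ts, p.title, c.email, co.name
FROM pageviews p
JOIN contacts  c  ON p.user_id = c.website_user_id
JOIN companies co ON c.company_id = co.id
""").fetchall()

# The test from the answer: which ids occur more than once?
dup_ids = conn.execute("""
SELECT id, COUNT(*) FROM companies GROUP BY id HAVING COUNT(*) >= 2
""").fetchall()

# Possible workaround: join against a de-duplicated view of companies.
dedup_rows = conn.execute("""
SELECT p.ts, p.title, c.email, co.name
FROM pageviews p
JOIN contacts c ON p.user_id = c.website_user_id
JOIN (SELECT DISTINCT id, name FROM companies) co ON c.company_id = co.id
""").fetchall()

print(len(dup_rows), dup_ids, len(dedup_rows))
```

The same check is worth running on contacts.website_user_id, since duplicates there would multiply rows in the same way.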

Select all customers loyal to one company?

I've got tables:
TABLE | COLUMNS
----------+----------------------------------
CUSTOMER | C_ID, C_NAME, C_ADDRESS
SHOP | S_ID, S_NAME, S_ADDRESS, S_COMPANY
ORDER | S_ID, C_ID, O_DATE
I want to select id of all customers who made order only from shops of one company - 'Samsung' ('LG', 'HP', ... doesn't really matter, it's dynamic).
I've come only with one solution, but I consider it ugly:
( SELECT DISTINCT c_id FROM order JOIN shop USING(s_id) WHERE s_company = "Samsung" )
EXCEPT
( SELECT DISTINCT c_id FROM order JOIN shop USING(s_id) WHERE s_company != "Samsung" );
Same SQL queries, but reversed operator. Isn't there any aggregate method which solves such query better?
I mean, there could be millions of orders(I don't really have orders, I've got something that occurs more often).
Is it efficient to select thousands of orders and then compare them to hundreds of thousands of orders from other companies? I know that it compares sorted things, so it's O( m + n + sort(n) + sort(m) ). But that's still large for millions of records, isn't it?
And one more question. How could I select all customer values (name, address). How can I join them, can I do just
SELECT CUSTOMER.* FROM CUSTOMER JOIN ( (SELECT...) EXCEPT (SELECT...) ) USING (C_ID);
Disclaimer: This question isn't homework. It's preparation for an exam and comes from a desire to do things more efficiently. My solution would be accepted at the exam, but I like efficient programming.
I like to approach this type of question using group by and a having clause. You can get the list of customers using:
select o.c_id
from orders o join
shops s
on o.s_id = s.s_id
group by o.c_id
having min(s.s_company) = max(s.s_company);
If you care about the particular company, then:
having min(s.s_company) = max(s.s_company) and
max(s.s_company) = 'Samsung'
If you want full customer information, you can join the customers table back in.
Whether this works better than the except version is something that would have to be tested on your system.
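Here is a minimal sketch of the min/max trick that you can run in SQLite through Python's sqlite3 module. The table names orders and shops follow the query above (ORDER itself is a reserved word), and the sample customers and shops are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shops  (s_id INTEGER PRIMARY KEY, s_name TEXT, s_company TEXT);
CREATE TABLE orders (s_id INTEGER, c_id INTEGER, o_date TEXT);
INSERT INTO shops VALUES (1, 'Shop 1', 'Samsung'), (2, 'Shop 2', 'Samsung'), (3, 'Shop 3', 'LG');
-- Customer 1: two Samsung shops. Customer 2: Samsung and LG. Customer 3: LG only.
INSERT INTO orders VALUES (1, 1, '2020-01-01'), (2, 1, '2020-01-02'),
                          (1, 2, '2020-01-03'), (3, 2, '2020-01-04'),
                          (3, 3, '2020-01-05');
""")

# Customers whose orders all come from a single company:
# if the smallest and largest company name are equal, there is only one.
loyal_any = conn.execute("""
SELECT o.c_id
FROM orders o
JOIN shops s ON o.s_id = s.s_id
GROUP BY o.c_id
HAVING MIN(s.s_company) = MAX(s.s_company)
ORDER BY o.c_id
""").fetchall()

# Same idea, restricted to one particular company passed as a parameter.
loyal_samsung = conn.execute("""
SELECT o.c_id
FROM orders o
JOIN shops s ON o.s_id = s.s_id
GROUP BY o.c_id
HAVING MIN(s.s_company) = MAX(s.s_company)
   AND MAX(s.s_company) = ?
ORDER BY o.c_id
""", ("Samsung",)).fetchall()

print(loyal_any, loyal_samsung)
```

Customer 2 drops out because MIN gives 'LG' and MAX gives 'Samsung'.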
How about a query that does not rely on Min and Max? Start with the distinct customer/company pairs (table names as in the answer above):
select o.C_ID, s.S_COMPANY
from orders o
join shops s
on s.S_ID = o.S_ID
group by o.C_ID, s.S_COMPANY;
Now we have a distinct list of customers and all the companies they shopped at. The loyal customers will be the ones who only appear once in the list (Q1 stands for the query above).
select C_ID
from Q1
group by C_ID
having count(*) = 1;
Join back to the first query to get the company:
with
Q1 as(
select o.C_ID, s.S_COMPANY
from orders o
join shops s
on s.S_ID = o.S_ID
group by o.C_ID, s.S_COMPANY
),
Q2 as(
select C_ID
from Q1
group by C_ID
having count(*) = 1
)
select Q1.C_ID, Q1.S_COMPANY
from Q1
join Q2
on Q2.C_ID = Q1.C_ID;
Now you have a list of loyal customers and the one company each is loyal to.

SQL SUM and COUNT returning wrong values

I found a bunch of similar questions but nothing worked for me, or I am too stupid to get how to do it right.
The visit count works fine if I use COUNT(DISTINCT visits.id) but then the vote count goes totally wrong - it displays a value 3 to 4 times larger than it should be.
So this is the query
SELECT SUM(votes.rating), COUNT(visits.id)
FROM topics
LEFT JOIN visits ON ( visits.content_id = topics.id )
LEFT JOIN votes ON ( votes.content_id = topics.id )
WHERE topics.id='1'
GROUP BY topics.id
The votes table looks like this
id int(11) | rating tinyint(4) | content_id int(11) | uid int(11)
visits table
id int(11) | content_id int(11) | uid int(11)
topics table
id int(11) | name varchar(128) | message varchar(512) | uid int(11)
help?
Basically, you're summing or counting the total number of rows potentially returned. So, if there are three visits and four votes for each id, then the visits will be multiplied by four and the votes by three.
I think what you want can most easily be accomplished by using subqueries:
SELECT (SELECT SUM(v.rating) FROM votes v WHERE v.content_id = t.id),
(SELECT COUNT(vi.id) FROM visits vi WHERE vi.content_id = t.id)
FROM topics t
WHERE t.id=1
GROUP BY t.id
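The multiplication is easy to reproduce. The sketch below uses SQLite through Python's sqlite3 module with invented data (one topic, three visits, four votes whose ratings add up to 10) and compares the double join with the subquery version; the tables are trimmed to the columns the queries touch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE visits (id INTEGER PRIMARY KEY, content_id INTEGER);
CREATE TABLE votes  (id INTEGER PRIMARY KEY, rating INTEGER, content_id INTEGER);
INSERT INTO topics VALUES (1, 'first topic');
INSERT INTO visits VALUES (1, 1), (2, 1), (3, 1);                    -- 3 visits
INSERT INTO votes  VALUES (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 4, 1); -- ratings sum to 10
""")

# Both joins hang off topics.id, so every visit pairs with every vote: 3 x 4 = 12 rows.
joined = conn.execute("""
SELECT SUM(votes.rating), COUNT(visits.id)
FROM topics
LEFT JOIN visits ON visits.content_id = topics.id
LEFT JOIN votes  ON votes.content_id  = topics.id
WHERE topics.id = 1
GROUP BY topics.id
""").fetchone()

# Each subquery aggregates its own table, so nothing is multiplied.
fixed = conn.execute("""
SELECT (SELECT SUM(v.rating) FROM votes v   WHERE v.content_id  = t.id),
       (SELECT COUNT(vi.id)  FROM visits vi WHERE vi.content_id = t.id)
FROM topics t
WHERE t.id = 1
""").fetchone()

print(joined, fixed)
```

The joined sum is three times too large (once per visit) and the joined count four times too large (once per vote), which matches the 3 to 4 times inflation described in the question.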
I suspect the problem is in the join with the votes table.
If votes has more than one row for the topic, every visit row is repeated once per vote, so the count includes those duplicated rows.
Using DISTINCT skips the duplicated visit ids (duplicated by the join with votes).
As a first trial, I would temporarily disable the join with votes and see what happens.
Hope it helps
Without seeing the data it is a bit tough to debug, but I would guess it is because there are more visits than votes. The following should work for you:
SELECT (SELECT SUM(rating) FROM votes WHERE votes.content_id = topics.id),
(SELECT COUNT(1) FROM visits WHERE visits.content_id = topics.id)
FROM topics
WHERE topics.id = 1
You need to do this as two separate subqueries:
SELECT sumrating, numvisits
FROM (select visits.content_id, count(*) as numvisits
from visits
group by visits.content_id
) tvisit left outer join
(select votes.content_id, SUM(votes.rating) as sumrating
from votes
group by votes.content_id
) v
ON ( v.content_id = tvisit.content_id )
WHERE tvisit.content_id='1'
As it turns out, you don't need to join in the topics table at all.

complex sql query from 4 tables

I am developing an online travel guide with a lot of hotels. Each hotel belongs to a specific category and has many room types, and each hotel room has a different price per season. I want to write a query over 4 tables that returns the total number of hotels per hotel category where the minimum room price of each hotel is between two values, which are set with a slider.
My tables look like:
Categories
id_category
category_name
Hotels
id_hotel
hotel_name
category_id
......
hotels_room_types
id_hotels_room_type
hotel_id
room_type_id
......
hotels_room_types_seasons
hotels_room_types_id
season_id
price
......
for example some values of category_name are: Hotels, apartments, hostels
I would like my results table to have two fields like the following:
Hotels 32
apartments 0
hostels 5
I tried the following query but it returns the total number of all hotels per category, not the number of hotels where the minimum price of their rooms is between the price range.
SELECT c.category_name, count( DISTINCT id_hotel ) , min( price ) min_price
FROM categories c
LEFT JOIN hotels w ON ( c.id_category = w.category_id )
LEFT JOIN (
hotels_room_types
INNER JOIN hotels_room_types_seasons ON hotels_room_types.id_hotels_room_types = hotels_room_types_seasons.hotels_room_types_id)
ON w.id_hotel = hotels_room_types.hotel_id
GROUP BY c.category_name
HAVING min_price >=10 AND min_price <=130
Could anyone help me how to write the appropriate query?
Thanks!!!
SELECT Categories.category_name, COUNT(DISTINCT ID_Hotel) [Count]
FROM Hotels
INNER JOIN Categories
ON Category_ID = ID_Category
INNER JOIN
( SELECT Hotel_ID, MIN(Price) [LowestPrice]
FROM hotels_room_types
INNER JOIN hotels_room_types_seasons
ON id_hotels_room_type = hotels_room_types_id
-- CONSIDER FILTERING BY SEASON HERE
GROUP BY Hotel_ID
) price
ON price.Hotel_ID = Hotels.ID_Hotel
WHERE LowestPrice BETWEEN 10 AND 130 -- OR WHATEVER YOUR PARAMETERS ARE
GROUP BY Categories.category_name
I have no idea what RDBMS you are using but I do not know any where your query would work. The problem you were having with the Min Price (I assume) is because you are applying the logic after grouping by category, so you are counting all hotels where the category has a lowest price between 10 and 130, not where the hotel has a room with the lowest price between 10 and 130.
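Here is a rough sketch of the "minimum per hotel first, then count per category" idea, run in SQLite through Python's sqlite3 module. The sample hotels and prices are invented, and starting from categories with a LEFT JOIN is my own variation so that categories with no qualifying hotel still show up with 0, as in the desired output from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id_category INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE hotels (id_hotel INTEGER PRIMARY KEY, hotel_name TEXT, category_id INTEGER);
CREATE TABLE hotels_room_types (id_hotels_room_type INTEGER PRIMARY KEY, hotel_id INTEGER);
CREATE TABLE hotels_room_types_seasons (hotels_room_types_id INTEGER, season_id INTEGER, price REAL);
INSERT INTO categories VALUES (1, 'Hotels'), (2, 'apartments');
INSERT INTO hotels VALUES (1, 'h1', 1), (2, 'h2', 1), (3, 'h3', 2);
INSERT INTO hotels_room_types VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3);
-- Cheapest room: h1 = 50 (in range), h2 = 5 (too cheap), h3 = 500 (too expensive).
INSERT INTO hotels_room_types_seasons VALUES
    (1, 1, 50), (2, 1, 200), (3, 1, 5), (4, 1, 100), (5, 1, 500);
""")

query = """
SELECT c.category_name, COUNT(q.hotel_id) AS num_hotels
FROM categories c
LEFT JOIN (
    -- one row per hotel whose cheapest room falls inside the slider range
    SELECT h.id_hotel AS hotel_id, h.category_id
    FROM hotels h
    JOIN hotels_room_types rt ON rt.hotel_id = h.id_hotel
    JOIN hotels_room_types_seasons s ON s.hotels_room_types_id = rt.id_hotels_room_type
    GROUP BY h.id_hotel, h.category_id
    HAVING MIN(s.price) BETWEEN ? AND ?
) q ON q.category_id = c.id_category
GROUP BY c.id_category, c.category_name
ORDER BY c.id_category
"""
rows = conn.execute(query, (10, 130)).fetchall()
for row in rows:
    print(row)
```

Hotel h2 has a room at 100, but its minimum price is 5, so it is not counted. Filtering on the per-hotel minimum is what distinguishes this from a plain "any price in range" filter.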
select
c.Category_name,
count(*) NumHotels
from
( select distinct
byRoomType.hotel_id
from
hotels_room_types_seasons bySeason
join hotels_room_types byRoomType
on bySeason.hotels_room_types_id = byRoomType.id_hotels_room_type
where
bySeason.Price between LowPriceParameter and HighPriceParameter
) QualifiedHotels
join Hotels
on QualifiedHotels.hotel_id = Hotels.id_hotel
join Categories c
on Hotels.category_id = c.id_category
group by
c.Category_name