I'm learning advanced SQL queries little by little and I'm fairly stumped with a problem:
I have three tables: news, author, and images. Each row in the news table (newsID) is a news story, which has an associated author in the author table (authorID) and can have any number of associated images in the images table. Each image has an associated newsID. So each story has one author but can have several images.
I want to make a list of all news stories and use just one of the images as a thumbnail image. The problem is that any SQL query I try returns one row per image in the images table rather than one row per news item.
I don't know where to go from here. Any help would be greatly appreciated.
If the 3 tables in question are [news], [author] and [image] with appropriate columns, then
Derived Table approach
You can use a derived table that returns one image per news item and then join it with the news and author tables, as shown.
This has been written and tested in SQL Server.
SELECT
N.[newsStoryTitle]
,A.authorName
,I.imageData1
FROM [news] N
LEFT OUTER JOIN author A ON A.newsID = N.newsID
LEFT OUTER JOIN
(
SELECT newsID, MAX(imageData) AS imageData1 FROM [image]
GROUP BY newsID
) AS I ON I.newsID = N.newsID
ORDER BY N.newsID
You could replace the LEFT OUTER JOINs with INNER JOINs if you do not need news without any images.
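To see the derived-table approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server. The schema and sample rows are made up for illustration; only the column names follow the query above.

```python
import sqlite3

# Hypothetical schema and sample data; column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news   (newsID INTEGER PRIMARY KEY, newsStoryTitle TEXT);
    CREATE TABLE author (authorID INTEGER PRIMARY KEY, newsID INTEGER, authorName TEXT);
    CREATE TABLE image  (imageID INTEGER PRIMARY KEY, newsID INTEGER, imageData TEXT);
    INSERT INTO news   VALUES (1, 'Story A'), (2, 'Story B');
    INSERT INTO author VALUES (1, 1, 'Ann'), (2, 2, 'Bob');
    INSERT INTO image  VALUES (1, 1, 'a1.png'), (2, 1, 'a2.png'), (3, 1, 'a3.png');
""")

# The derived table I collapses the images to one row per newsID before joining,
# so the outer query returns one row per news story.
rows = conn.execute("""
    SELECT N.newsStoryTitle, A.authorName, I.imageData1
    FROM news N
    LEFT OUTER JOIN author A ON A.newsID = N.newsID
    LEFT OUTER JOIN (
        SELECT newsID, MAX(imageData) AS imageData1
        FROM image
        GROUP BY newsID
    ) AS I ON I.newsID = N.newsID
    ORDER BY N.newsID
""").fetchall()

for row in rows:
    print(row)
```

Story A has three images but appears once, and Story B (no images) is kept with a NULL thumbnail because of the LEFT OUTER JOIN.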
Correlated Subquery approach (as suggested by Marcelo Cantos)
If imageData is stored as a text or image type, then the MAX in the derived table won't work. In that case, you can use a correlated subquery like this:
SELECT N.newsStoryTitle ,
A.authorName ,
I.imageData
FROM dbo.news N
INNER JOIN dbo.author A ON N.newsID = A.newsID
INNER JOIN dbo.image I ON N.newsID = I.newsID
WHERE imageID = ( SELECT MAX(imageID)
FROM dbo.image
WHERE newsID = N.newsID
)
ORDER BY N.newsID
One option is to add the following predicate:
FROM news JOIN images ...
...
WHERE imageID = (SELECT MAX(imageID)
FROM images
WHERE newsID = news.newsID)
Note that this excludes news items without an image. If you don't want this, you'll need a left join on images and an additional condition on the WHERE:
FROM news LEFT JOIN images ...
...
WHERE imageID IS NULL
OR imageID = (SELECT MAX(imageID)
FROM images
WHERE newsID = news.newsID)
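A small sketch of both variants, run against SQLite through Python's sqlite3 module. The title and url columns and the sample rows are assumptions for illustration; the inner copy of the table is aliased as i2 to keep the correlation explicit.

```python
import sqlite3

# Hypothetical schema: one story with two images, one story with none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news   (newsID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE images (imageID INTEGER PRIMARY KEY, newsID INTEGER, url TEXT);
    INSERT INTO news   VALUES (1, 'Story A'), (2, 'Story B');
    INSERT INTO images VALUES (1, 1, 'a1.png'), (2, 1, 'a2.png');
""")

# Inner join: stories without an image disappear.
inner_rows = conn.execute("""
    SELECT news.title, images.url
    FROM news JOIN images ON images.newsID = news.newsID
    WHERE images.imageID = (SELECT MAX(i2.imageID)
                            FROM images i2
                            WHERE i2.newsID = news.newsID)
    ORDER BY news.newsID
""").fetchall()

# Left join plus the IS NULL branch: stories without an image are kept.
left_rows = conn.execute("""
    SELECT news.title, images.url
    FROM news LEFT JOIN images ON images.newsID = news.newsID
    WHERE images.imageID IS NULL
       OR images.imageID = (SELECT MAX(i2.imageID)
                            FROM images i2
                            WHERE i2.newsID = news.newsID)
    ORDER BY news.newsID
""").fetchall()

print(inner_rows)
print(left_rows)
```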
You can modify the ORDER BY in the subselect to choose which image you get for each news row:
select
....
from news n
left outer join images i on i.imageID = (
select top 1 i2.imageID
from images i2
where i2.newsID = n.newsID
order by --??
)
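Here is a sketch of the same idea in SQLite via Python's sqlite3 module. SQLite has no TOP, so LIMIT 1 is used in its place. The uploadedAt column used for the ordering is a made-up example, since the original leaves the ORDER BY open.

```python
import sqlite3

# Hypothetical schema; uploadedAt is an assumed column used only to pick an image.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news   (newsID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE images (imageID INTEGER PRIMARY KEY, newsID INTEGER,
                         url TEXT, uploadedAt TEXT);
    INSERT INTO news   VALUES (1, 'Story A'), (2, 'Story B');
    INSERT INTO images VALUES (1, 1, 'old.png', '2020-01-01'),
                              (2, 1, 'new.png', '2021-01-01'),
                              (3, 1, 'mid.png', '2020-06-01');
""")

# The correlated subquery in the ON clause picks exactly one imageID per story;
# the ORDER BY decides which one (here: the most recently uploaded).
rows = conn.execute("""
    SELECT n.title, i.url
    FROM news n
    LEFT OUTER JOIN images i ON i.imageID = (
        SELECT i2.imageID
        FROM images i2
        WHERE i2.newsID = n.newsID
        ORDER BY i2.uploadedAt DESC
        LIMIT 1
    )
    ORDER BY n.newsID
""").fetchall()

print(rows)
```

Changing the ORDER BY (for example to i2.imageID ASC) changes which image is chosen without changing the number of rows.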
If you have three tables in MySQL and you want to join them together, for example these three tables:
1 student
2 subject
3 score
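and you want to join student with subject and score, you chain the joins. Without ON conditions MySQL treats each join as a cross join, so give each join an ON clause for your own key columns:
select * from student inner join subject on ... inner join score on ...;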
I have a column ContentID in a table that identifies content that exists in other tables. For example, it could refer to a pageID, productID, etc. My plan is to use that to pull through other information I need, such as a page name or product name. This is what I have so far:
SELECT TL.ID, TL.TableName, TL.FileName, TL.ContentID, p.PageName AS Content
FROM TranslationLog TL
LEFT JOIN Pages P ON TL.ContentID = P.PageID
LEFT JOIN Categories C ON TL.ContentID = C.CategoryID
LEFT JOIN ProductDescriptions PD ON TL.ContentID = PD.SKU
The idea is for each row, I want to get the data for the specified content using the TableName and ContentID fields. Currently, I'm able to get PageName by selecting p.PageName AS Content. However, I'd like to do this for each of the tables; if the row corresponds to the pages table, then query that table - same for categories and product descriptions. I also need it to have the alias "Content", regardless of which field from another table we're using, such as PageName or ProductName.
Is it possible to do this?
EDIT:
The solution posted by rd_nielsen was almost perfect, but it turned out there was actually a bit of overlap with the ContentID. Here's what I ended up with to fix it:
SELECT TL.ID, TL.TableName, TL.FileName, TL.ContentID, coalesce(P.PageName, C.CategoryName, PD.ProductName) AS Content
FROM TranslationLog TL
LEFT JOIN Pages P ON TL.ContentID = P.PageID AND TL.TableName = 'Pages'
LEFT JOIN Categories C ON TL.ContentID = C.CategoryID AND TL.TableName = 'Categories'
LEFT JOIN ProductDescriptions PD ON TL.ContentID = PD.SKU AND TL.TableName = 'Products'
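To illustrate why the extra TableName condition matters when IDs overlap, here is a small sketch in SQLite via Python's sqlite3 module. The sample rows are invented so that the same ContentID (1) exists in all three tables.

```python
import sqlite3

# Hypothetical data: ContentID 1 exists as a page, a category and a product.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TranslationLog (ID INTEGER PRIMARY KEY, TableName TEXT,
                                 FileName TEXT, ContentID INTEGER);
    CREATE TABLE Pages               (PageID INTEGER PRIMARY KEY, PageName TEXT);
    CREATE TABLE Categories          (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE ProductDescriptions (SKU INTEGER PRIMARY KEY, ProductName TEXT);
    INSERT INTO Pages               VALUES (1, 'Home');
    INSERT INTO Categories          VALUES (1, 'Shoes');
    INSERT INTO ProductDescriptions VALUES (1, 'Red sneaker');
    INSERT INTO TranslationLog VALUES (10, 'Pages',      'p.xml', 1),
                                      (11, 'Categories', 'c.xml', 1),
                                      (12, 'Products',   'd.xml', 1);
""")

# Without the TableName condition every join matches, so COALESCE always
# returns the first non-NULL value, which is the page name.
naive = conn.execute("""
    SELECT TL.ID, COALESCE(P.PageName, C.CategoryName, PD.ProductName) AS Content
    FROM TranslationLog TL
    LEFT JOIN Pages P ON TL.ContentID = P.PageID
    LEFT JOIN Categories C ON TL.ContentID = C.CategoryID
    LEFT JOIN ProductDescriptions PD ON TL.ContentID = PD.SKU
    ORDER BY TL.ID
""").fetchall()

# With the TableName condition only the intended table can match per row.
fixed = conn.execute("""
    SELECT TL.ID, COALESCE(P.PageName, C.CategoryName, PD.ProductName) AS Content
    FROM TranslationLog TL
    LEFT JOIN Pages P ON TL.ContentID = P.PageID AND TL.TableName = 'Pages'
    LEFT JOIN Categories C ON TL.ContentID = C.CategoryID AND TL.TableName = 'Categories'
    LEFT JOIN ProductDescriptions PD ON TL.ContentID = PD.SKU AND TL.TableName = 'Products'
    ORDER BY TL.ID
""").fetchall()

print(naive)
print(fixed)
```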
You should use cross-apply. For example:
select
T.*
from
table_stored_data u
cross apply dbo.created_function(u.contentid,u.tablename,u.search_column) T
You need to write a table-valued function for it (created_function in the example above).
If the values of TranslationLog.ContentID can appear in only one of the related tables, then you can coalesce the values from those tables:
SELECT
TL.ID,
TL.TableName,
TL.FileName,
TL.ContentID,
coalesce(p.PageName, C.CategoryName, PD.ProductName) AS Content
FROM
...
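Try the concat function (in SQL Server it treats NULL as an empty string; in MySQL it returns NULL if any argument is NULL):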
select concat(p.PageName, C.CategoryName, PD.ProductName) AS Content from ...
I want to show all the photos that have a specific tag, but the query returns duplicated photos. If I choose another tag, it doesn't show duplicated photos.
For the tag "Natur" there should be only 2 photos, and for the tag "Berg" there should be only 1 photo.
SQL
SELECT *
FROM photos AS p
JOIN tags_photos AS tp
JOIN tags_names AS tn
ON tp.id_tag = tn.id
WHERE tn.data_name_seo = :name_seo
ORDER BY p.datetime_taken DESC
Table: tags_photos
id
id_photo
id_tag
Table: tags_names
id
data_name
data_name_seo
Table: photos
id
data_file_name
datetime_taken
Have I missed something or what's the problem?
You are missing join conditions for the first two tables. This is probably the cause of your problem:
SELECT *
FROM photos AS p JOIN
tags_photos AS tp
ON tp.id_photo = p.id JOIN
tags_names AS tn
ON tp.id_tag = tn.id
WHERE tn.data_name_seo = :name_seo
ORDER BY p.datetime_taken DESC
In most databases, the missing on clause would generate an error. In MySQL, the JOIN is treated as a CROSS JOIN, which likely would result in duplicates.
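The effect is easy to reproduce. SQLite also accepts a JOIN without an ON clause and treats it as a cross join, so this sketch (Python's sqlite3 module, made-up sample rows) shows the row counts with and without the missing condition.

```python
import sqlite3

# Hypothetical data: photos 1 and 2 are tagged "Natur", photo 3 is tagged "Berg".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos      (id INTEGER PRIMARY KEY, data_file_name TEXT, datetime_taken TEXT);
    CREATE TABLE tags_names  (id INTEGER PRIMARY KEY, data_name TEXT, data_name_seo TEXT);
    CREATE TABLE tags_photos (id INTEGER PRIMARY KEY, id_photo INTEGER, id_tag INTEGER);
    INSERT INTO photos      VALUES (1, 'a.jpg', '2020-01-01'),
                                   (2, 'b.jpg', '2020-02-01'),
                                   (3, 'c.jpg', '2020-03-01');
    INSERT INTO tags_names  VALUES (1, 'Natur', 'natur'), (2, 'Berg', 'berg');
    INSERT INTO tags_photos VALUES (1, 1, 1), (2, 2, 1), (3, 3, 2);
""")

# No condition between photos and tags_photos: every photo is paired
# with every matching tag row.
BROKEN = """
    SELECT p.id
    FROM photos AS p
    JOIN tags_photos AS tp
    JOIN tags_names AS tn ON tp.id_tag = tn.id
    WHERE tn.data_name_seo = ?
"""

# With the join condition, only photos that carry the tag are returned.
FIXED = """
    SELECT p.id
    FROM photos AS p
    JOIN tags_photos AS tp ON tp.id_photo = p.id
    JOIN tags_names AS tn ON tp.id_tag = tn.id
    WHERE tn.data_name_seo = ?
    ORDER BY p.datetime_taken DESC
"""

broken_rows = conn.execute(BROKEN, ("natur",)).fetchall()
fixed_rows = conn.execute(FIXED, ("natur",)).fetchall()

print(len(broken_rows), "rows without the join condition")
print(len(fixed_rows), "rows with the join condition")
```

Without the condition, the two "Natur" tag rows are combined with all three photos; with it, only the two tagged photos come back.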
I'm building a system in which there are the following tables:
Song
Broadcast
Station
Follow
User
A user follows stations, which have songs on them through broadcasts.
I'm building a "feed" of songs for a user based on the stations they follow.
Here's the query:
SELECT DISTINCT ON ("broadcasts"."created_at", "songs"."id") songs.*
FROM "songs"
INNER JOIN "broadcasts" ON "songs"."shared_id" = "broadcasts"."song_id"
INNER JOIN "stations" ON "broadcasts"."station_id" = "stations"."id"
INNER JOIN "follows" ON "stations"."id" = "follows"."station_id"
WHERE "follows"."user_id" = 2
ORDER BY broadcasts.created_at desc
LIMIT 18
Note: shared_id is the same as id.
As you can see I'm getting duplicate results, which I don't want. I found out from a previous question that this was due to selecting distinct on broadcasts.created_at.
My question is: How do I modify this query so it will return only unique songs based on their id but still order by broadcasts.created_at?
Try this solution:
SELECT a.maxcreated, b.*
FROM
(
SELECT bb.song_id, MAX(bb.created_at) AS maxcreated
FROM follows aa
INNER JOIN broadcasts bb ON aa.station_id = bb.station_id
WHERE aa.user_id = 2
GROUP BY bb.song_id
) a
INNER JOIN songs b ON a.song_id = b.id
ORDER BY a.maxcreated DESC
LIMIT 18
The subselect in the FROM clause retrieves the distinct song_ids broadcast by any station the user follows, together with the latest broadcast date for each song. It has to be a subquery because the GROUP BY must cover only song_id: we want one row per unique song_id with its max date, regardless of the station.
We then join that result to the songs table in the outer query to get the song information for each unique song_id.
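As a rough check of this approach, here is a sketch in SQLite via Python's sqlite3 module. The table contents and the title column are invented; song 1 is broadcast by two followed stations, and song 3 only by a station the user does not follow.

```python
import sqlite3

# Hypothetical data for user 2, who follows stations 10 and 11.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE songs      (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE follows    (user_id INTEGER, station_id INTEGER);
    CREATE TABLE broadcasts (id INTEGER PRIMARY KEY, song_id INTEGER,
                             station_id INTEGER, created_at TEXT);
    INSERT INTO songs      VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO follows    VALUES (2, 10), (2, 11);
    INSERT INTO broadcasts VALUES (1, 1, 10, '2024-01-01'),
                                  (2, 1, 11, '2024-01-05'),
                                  (3, 2, 10, '2024-01-03'),
                                  (4, 3, 99, '2024-01-09');
""")

# The derived table yields one row per song with its latest broadcast date;
# the outer query then attaches the song details and sorts by that date.
rows = conn.execute("""
    SELECT b.id, b.title, a.maxcreated
    FROM (
        SELECT bb.song_id, MAX(bb.created_at) AS maxcreated
        FROM follows aa
        INNER JOIN broadcasts bb ON aa.station_id = bb.station_id
        WHERE aa.user_id = 2
        GROUP BY bb.song_id
    ) a
    INNER JOIN songs b ON a.song_id = b.id
    ORDER BY a.maxcreated DESC
    LIMIT 18
""").fetchall()

print(rows)
```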
You can use a Common Table Expression (CTE) if you want a cleaner query (nested queries make things harder to read).
It would look like this:
WITH a as (
SELECT bb.song_id, MAX(bb.created_at) AS maxcreated
FROM follows aa
INNER JOIN broadcasts bb ON aa.station_id = bb.station_id
INNER JOIN songs cc ON bb.song_id = cc.shared_id
WHERE aa.user_id = 2
GROUP BY bb.song_id
)
SELECT
a.maxcreated,
b.*
FROM a INNER JOIN
songs b ON a.song_id = b.id
ORDER BY
a.maxcreated DESC
LIMIT 18
Using a CTE offers the advantages of improved readability and ease in maintenance of complex queries. The query can be divided into separate, simple, logical building blocks. These simple blocks can then be used to build more complex, interim CTEs until the final result set is generated.
Try adding GROUP BY songs.id
I had a very similar query I was doing between listens, tracks and albums and it took me a long while to figure it out (hours).
If you use GROUP BY songs.id, you can get it to work by ordering by MAX(broadcasts.created_at) DESC.
Here's what the full SQL looks like:
SELECT songs.* FROM "songs"
INNER JOIN "broadcasts" ON "songs"."shared_id" = "broadcasts"."song_id"
INNER JOIN "stations" ON "broadcasts"."station_id" = "stations"."id"
INNER JOIN "follows" ON "stations"."id" = "follows"."station_id"
WHERE "follows"."user_id" = 2
GROUP BY songs.id
ORDER BY MAX(broadcasts.created_at) desc
LIMIT 18;
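A quick sketch of this variant in SQLite via Python's sqlite3 module, with invented sample rows. SQLite allows songs.* together with GROUP BY songs.id; as far as I know PostgreSQL also accepts it when songs.id is the primary key, but treat that as an assumption to verify on your version.

```python
import sqlite3

# Hypothetical data: song 1 is broadcast twice, song 2 once but most recently.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE songs      (id INTEGER PRIMARY KEY, shared_id INTEGER, title TEXT);
    CREATE TABLE stations   (id INTEGER PRIMARY KEY);
    CREATE TABLE follows    (user_id INTEGER, station_id INTEGER);
    CREATE TABLE broadcasts (id INTEGER PRIMARY KEY, song_id INTEGER,
                             station_id INTEGER, created_at TEXT);
    INSERT INTO songs      VALUES (1, 1, 'A'), (2, 2, 'B');
    INSERT INTO stations   VALUES (10), (11);
    INSERT INTO follows    VALUES (2, 10), (2, 11);
    INSERT INTO broadcasts VALUES (1, 1, 10, '2024-01-01'),
                                  (2, 1, 11, '2024-01-02'),
                                  (3, 2, 10, '2024-01-03');
""")

# Grouping by songs.id collapses the joined rows to one per song;
# ordering by the aggregate keeps the most recently broadcast song first.
rows = conn.execute("""
    SELECT songs.*
    FROM songs
    INNER JOIN broadcasts ON songs.shared_id = broadcasts.song_id
    INNER JOIN stations ON broadcasts.station_id = stations.id
    INNER JOIN follows ON stations.id = follows.station_id
    WHERE follows.user_id = 2
    GROUP BY songs.id
    ORDER BY MAX(broadcasts.created_at) DESC
    LIMIT 18
""").fetchall()

titles = [row[2] for row in rows]
print(titles)
```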
I'm not much of a database guru so I need some help on a query I'm working on. In my photo community project I want to richly visualize tags by not only showing the tag name and counter (# of images inside them), I also want to show a thumb of the most popular image inside the tag (most karma).
The table setup is as follows:
Image table holds basic image metadata, important is the karma field
Imagefile table holds multiple entries per image, one for each format
Tag table holds tag definitions
Tag_map table maps tags to images
In my usual trial and error query authoring I have come this far:
SELECT * FROM
(SELECT tag.name, tag.id, COUNT(tag_map.tag_id) as cnt
FROM tag INNER JOIN tag_map ON (tag.id = tag_map.tag_id)
INNER JOIN image ON tag_map.image_id = image.id
INNER JOIN imagefile on image.id = imagefile.image_id
WHERE imagefile.type = 'smallthumb'
GROUP BY tag.name
ORDER BY cnt DESC)
as T1 WHERE cnt > 0 ORDER BY cnt DESC
[column clause of inner query snipped for the sake of simplicity]
This query gives me somewhat what I need. The outer query makes sure that only tags are returned for which there is at least 1 image. The inner query returns the tag details, such as its name, count (# of images) and the thumb. In addition, I can sort the inner query as I want (by most images, alphabetically, most recent, etc)
So far so good. The problem however is that this query does not match the most popular image (most karma) of the tag, it seems to always take the most recent one in the tag.
How can I make sure that the most popular image is matched with the tag?
You are looking for the group by 'having' clause, not nested selects!
SELECT tag.name, tag.id, COUNT(tag_map.tag_id) as cnt
FROM tag
INNER JOIN tag_map
ON (tag.id = tag_map.tag_id)
INNER JOIN image
ON tag_map.image_id = image.id
INNER JOIN imagefile
on image.id = imagefile.image_id
WHERE imagefile.type = 'smallthumb'
GROUP BY tag.name HAVING COUNT(tag_map.tag_id) > 0
ORDER BY cnt DESC
This should be pretty close:
SELECT
tag.id,
tag.name,
tag_group.cnt,
tag_group.max_karma,
image.id,
imagefile.filename
/* ... */
FROM
tag
/* join against a list of max karma values (per tag) */
INNER JOIN (
SELECT MAX(image.karma) AS max_karma, COUNT(*) AS cnt, tag_map.tag_id
FROM image
INNER JOIN tag_map ON tag_map.image_id = image.id
GROUP BY tag_map.tag_id
) AS tag_group ON tag_group.tag_id = tag.id
/* join against a list of image ids (per max karma value and tag) */
INNER JOIN (
SELECT MAX(image.id) id, tag_map.tag_id, image.karma
FROM image
INNER JOIN tag_map ON tag_map.image_id = image.id
GROUP BY tag_map.tag_id, image.karma /* collapse >1 imgs with same karma */
) AS pop_img ON pop_img.tag_id = tag.id AND pop_img.karma = tag_group.max_karma
/* join against actual base data (per popular image id) */
INNER JOIN
image ON image.id = pop_img.id
INNER JOIN
imagefile ON imagefile.image_id = pop_img.id AND imagefile.type = 'smallthumb'
Basically, this is the ever-recurring "max-per-group" problem: How can I select the record that corresponds to the maximum/minimum value of a group?
And the general answer is always along the lines of: select your group (tag_id, MAX(image.karma)) and then join your base data against these characteristics. Window functions such as ROW_NUMBER() with PARTITION BY take a different approach. They are part of standard SQL, but not every DBMS or version supports them, which may leave you scratching your head when working with one that does not.
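The "select the group, then join back" pattern can be sketched as follows in SQLite via Python's sqlite3 module. This is a condensed variant of the idea, not the exact query above: it leaves out imagefile and resolves karma ties with MAX(image.id). The sample rows are invented.

```python
import sqlite3

# Hypothetical data: tag "nature" has images 1, 2, 3 (images 2 and 3 tie on karma 9).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE image   (id INTEGER PRIMARY KEY, karma INTEGER);
    CREATE TABLE tag_map (tag_id INTEGER, image_id INTEGER);
    INSERT INTO tag     VALUES (1, 'nature'), (2, 'city');
    INSERT INTO image   VALUES (1, 5), (2, 9), (3, 9), (4, 2);
    INSERT INTO tag_map VALUES (1, 1), (1, 2), (1, 3), (2, 4);
""")

rows = conn.execute("""
    SELECT tag.name, g.cnt, g.max_karma, MAX(image.id) AS image_id
    FROM tag
    -- step 1: one row per tag with its image count and highest karma
    INNER JOIN (
        SELECT tm.tag_id, MAX(i.karma) AS max_karma, COUNT(*) AS cnt
        FROM tag_map tm
        INNER JOIN image i ON i.id = tm.image_id
        GROUP BY tm.tag_id
    ) AS g ON g.tag_id = tag.id
    -- step 2: join back to the images of that tag that have exactly that karma
    INNER JOIN tag_map ON tag_map.tag_id = tag.id
    INNER JOIN image ON image.id = tag_map.image_id AND image.karma = g.max_karma
    -- step 3: collapse ties so each tag yields a single image id
    GROUP BY tag.id, tag.name, g.cnt, g.max_karma
    ORDER BY tag.name
""").fetchall()

print(rows)
```

Each tag comes back once with its image count, its highest karma, and the id of one image that has that karma.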
Basically, what I want to do is join 4 tables together and return 1 row for each boat.
Table Layouts
[Boats]
id, date, section, raft
[Photos]
id, boatid, pthurl, purl
[River_Company]
id, sort, company, company_short
[River_Section]
id, section
It's very simple as far as structure goes; however, I'm having the time of my life trying to get it to return only 1 row per boat. No two boats will ever be on the same day; the only thing that's messing this up is the photo table.
If you know a better way to return the record for all the boats and only 1 photo from the photo table, please, please post it!!
Desired Format
boats.id, boats.date, river_company.company, river_section.section, photos.purl, photos.pthurl
That's basically how joins work. Since boats and photos are in a one-to-many relationship and you want a one-to-one-like query, you need to express that explicitly with a predicate. For example:
select b.*
from
boats b
inner join photos p
on b.id = p.boatid
where p.id = (select max(id) from photos where boatid = b.id)
Assuming your ID column is the relation that you have designed:
SELECT Boats.* FROM Boats
LEFT OUTER JOIN Photos on Photos.ID =
(
SELECT TOP 1 P2.ID FROM Photos P2
WHERE P2.BoatID = Boats.ID
ORDER BY P2.ID DESC
)
INNER JOIN River_Company on River_Company.ID = Boats.ID
INNER JOIN River_Section on River_Section.ID = Boats.ID
So basically, this will:
Guarantee at most one photo row per boat. (It's a bit dirty, but otherwise, if a boat has more than one photo, one row per photo will be returned.)
If there are no photos, the boat will still be returned.