mysql group_concat - sql

I have two tables.
products
id title image_ids
---------------------
1 myproduct 1,2,3
images
id title file_name
-------------------------
1 myimage myimage.jpg
2 myimage2 myimage2.jpg
3 myimage3 myimage3.jpg
I want to query so that the names of the images are concatenated into a single field for each product reference.
This query doesn't work
SELECT products.title,
products.image_ids,
GROUP_CONCAT(images.file_name)
FROM products
LEFT JOIN images ON images.id IN (products.image_ids)
WHERE products.id = 1
GROUP BY products.id
This one does:
SELECT products.title,
products.image_ids,
GROUP_CONCAT(images.file_name)
FROM products
LEFT JOIN images ON images.id IN (1,2,3)
WHERE products.id = 1
GROUP BY products.id
And produces the desired results
title image_ids file_names
--------------------------------------------------------------
myproduct 1,2,3 myimage.jpg,myimage2.jpg,myimage3.jpg
Why doesn't the first query work when it is asking the same thing as the second query and how can I make it work.

IN does not work for comma-separated lists of values.
Basically, you are not comparing integers, you are comparing strings:
SELECT 1 IN (1, 2, 3) -- True
SELECT 1 IN ('1, 2, 3') -- False ('1' <> '1, 2, 3')
Use FIND_IN_SET instead:
SELECT products.title,products.image_ids, GROUP_CONCAT(images.file_name)
FROM products
LEFT JOIN
images
ON FIND_IN_SET(images.id, products.image_ids)
WHERE products.id = 1
GROUP BY
products.id
This, however, is not the best solution performance-wise since FIND_IN_SET is non-sargable. It will require a full table scan on images.
If you have some reasonable limit on the number of values in products.image_ids (say, no more than 5 images per product), you can use this query instead:
SELECT products.title,products.image_ids, GROUP_CONCAT(images.file_name)
FROM (
SELECT 1 AS n
UNION ALL
SELECT 2 AS n
UNION ALL
SELECT 3 AS n
UNION ALL
SELECT 4 AS n
UNION ALL
SELECT 5 AS n
) q
CROSS JOIN
products
LEFT JOIN
images
ON SUBSTRING_INDEX(SUBSTRING_INDEX(image_ids, ',', n), ',', 1)
WHERE products.id = 1
AND SUBSTRING_INDEX(image_ids, ',', n) <> SUBSTRING_INDEX(image_ids, ',', n - 1)
GROUP BY
products.id

Related

How to get all rows from one table which have all relations?

I have 3 tables:
companies (id, name)
union_products (id, name)
products (id, company_id, union_product_id, price_per_one_product)
I need to get all companies which have products with union_product_id in (1,2) and total price of products (per company) is less than 100.
What I am trying to do now:
select * from "companies" where exists
(
select id from "products"
where "companies"."id" = "products"."company_id"
and "union_product_id" in (1, 2)
group by id
having COUNT(distinct union_product_id) = 2 AND SUM(price_per_one_product) < 100
)
The problem I stuck with is that I'm getting 0 rows from the query above, but it works if I'll change COUNT(distinct union_product_id) = 2 to 1.
DB fiddle: https://www.db-fiddle.com/f/iRjfzJe2MTmnwEcDXuJoxn/0
Try to join the three tables as the following:
SELECT C.id, C.name FROM
products P JOIN union_products U
ON P.union_product_id=U.id
JOIN companies C
ON P.company_id=C.id
WHERE P.union_product_id IN (1, 2)
GROUP BY C.id, C.name
HAVING COUNT(DISTINCT P.union_product_id) = 2 AND
SUM(P.price_for_one_product) < 100
ORDER BY C.id
See a demo.
SELECT c.name FROM "companies" c
JOIN "products" p ON c.id = p.company_id
WHERE union_product_id IN (1, 2) AND price_for_one_product < 100
GROUP BY c.name
HAVING COUNT(DISTINCT p.name) =2
This would provide you all the company(s) name(s) which has provides both union_product_id 1 and 2 and the price_for_one_product/ price_per_one_product is less than 100.
Note: You might need to change price_for_one_product with price_per_one_product, as in question you have used price_per_one_product but db-fiddle link table defination has used price_for_one_product.

Fetch the data based on min(sort_order)

I have a table Product which have many Images.
Product
id name
1 Sofa
2 Bed
Images
id product_id image_url sort_order
1 1 "url5" 521
2 1 "url1" 200
3 1 "url1" 100
4 2 "url1" 1
5 2 "url2" 2
I want to fetch images where sort_order have minimum value for 100 products like below:
id product_id image_url sort_order
3 1 "url1" 100
4 2 "url1" 1
I know I need to use min(sort_order) for images but don't find the correct syntax.
I am trying the below. but no luck
select i.*
from images i
join product p on p.id = i.product_id
where p.id in (1, 2, ....)
and i.sort_order = min(sort_order)
Any help to build the correct query?
Number your images per product and keep the rows numbered 1.
select id, product_id, image_url, sort_order
from
(
select
i.*,
row_number() over (partition by product_id order by sort_order) as rn
from images i
) numbered
where rn = 1;
An alternative is to query the table twice:
select *
from images
where (product_id, sort_order) in
(
select product_id, min(sort_order)
from images
group by product_id
);
why you need to use min ?? correct me if i am wrong but you said you need images where sort_order = 1, so you just need to change your where and add limit.
WHERE i.sort_order = 1 LIMIT 100
This is pretty bad code, but maybe it'll give you an idea of how you can do this.
select p.id,p.name,*
From product p
inner join (
select i.product_id, min(sort_order) as minSortOrder
from images i
group by i.product_id
) i on i.product_id = p.id
inner join images ii on ii.product_id = p.id and ii.sort_order = i.minSortOrder
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=ccae44d199f9c9ac0edf4b75dd1b973b
edit: another way to do this is to create a row_number ordered by Sort_Order, so the Row_number = 1 will always be the lowest Sort_order, then we wrap that query in an outer query to apply the where clause
SELECT *
FROM (
SELECT p.id
,p.name
,ii.image_url
,ii.sort_order
,row_number() OVER (
PARTITION BY ii.product_id ORDER BY sort_order
) AS RowNumber
FROM product p
INNER JOIN images ii ON ii.product_id = p.id
) s
WHERE s.rownumber = 1
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=ca31c7650bc87174110bf9aa09b230d9

SQL | List all all tuples(a, b, c) if there exists another tuple with equal (b,c)

I have three tables where the bold attribute(s) is the primary key
Restaurants(restaurant_ID, name, ...)
resturant_ID, name, ...
1, Macdonalds
2, Hubert
3, Dorsia
... ...
Identifier(restaurant_ID, food_ID)
restaurant_ID, food_ID, ...
1, 1
1, 4
2, 1
2, 7
... ...
Food(food_ID, name, ...)
food_ID food_name
1 Chips
2 Burgers
3 Salmon
... ...
Using postgres I want to list out all restaurants (restaurant_id and name - 1 row per restaurant) that have share the exact same set of foods with at least one other restaurant.
For example, let's say
Restaurant with ID "1" has only associated food_id's 1 and 4 as shown in Identifier
Restaurant with ID "3" has only associated food_id's 4 and 1 as shown in Identifier
Restaurant with ID "7" has only associated food_id's 6 as shown in Identifier
Restaurant with ID "9" has only associated food_id's 6 as shown in Identifier
Then output
Restaurant_id name
1 name1
3 name3
7 ...
9 ...
Any help would be greatly appreciated!
Thank you
Use the aggregate function string_agg() to get the full list of foods for each restaurant:
with cte as (
select restaurant_ID,
string_agg(food_ID::varchar(10),',' order by food_ID) foods
from identifier
group by restaurant_ID
)
select r.*
from Restaurants r inner join cte c
on c.restaurant_ID = r.restaurant_ID
where exists (select 1 from cte where restaurant_ID <> c.restaurant_ID and foods = c.foods)
But I would prefer to group restaurants based on matching foods:
with cte as (
select restaurant_ID,
string_agg(food_ID::varchar(10),',' order by food_ID) foods
from identifier
group by restaurant_ID
)
select string_agg(r.name, ',') restaurants
from Restaurants r inner join cte c
on c.restaurant_ID = r.restaurant_ID
group by foods
having count(*) > 1
See the demo.
Here is a way to get the unique set of resturants having exactly same food items. This uses array_agg() and array_to_string() functions
With cte as
(select T.restaurant_id, array_to_string(array_agg(food_id), ',') as food_list
from
(select *
from Identifier t1
order by restaurant_id, food_id) T
group by T.restaurant_id)
select
concat(r1.name,',',r2.name) as resturant_names,
t1.restaurant_id as restaurant_id1,
r1.name as restaurant_1,
t2.restaurant_id as restaurant_id2,
r2.name as restaurant_2,
t1.food_list as common_food_ids
from cte t1
join cte t2
on t1.restaurant_id < t2.restaurant_id
and t1.food_list = t2.food_list
left join Restaurants r1
on t1.restaurant_id = r1.restaurant_id
left join Restaurants r2
on t2.restaurant_id = r2.restaurant_id;
EDIT : Here is a dB fiddle - https://dbfiddle.uk/?rdbms=postgres_12&fiddle=e2de05edfbe036cc0d81c64d60f0b599 . Also, just for reference, solution to the same problem in Oracle using listagg function - https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=12785c3d5abbca97be5d44dd45a6da4a
Update : Below query addresses the update output format of the question.
With cte as
(select T.restaurant_id, array_to_string(array_agg(food_id), ',') as food_list
from
(select *
from Identifier t1
order by restaurant_id, food_id) T
group by T.restaurant_id)
select
--concat(r1.name,',',r2.name) as resturant_names,
t1.restaurant_id as restaurant_id,
r1.name as restaurant--,
--t2.restaurant_id as restaurant_id2,
--r2.name as restaurant_2,
--t1.food_list as common_food_ids
from cte t1
join cte t2
on t1.restaurant_id = t2.restaurant_id
and t1.food_list = t2.food_list
left join Restaurants r1
on t1.restaurant_id = r1.restaurant_id
left join Restaurants r2
on t2.restaurant_id = r2.restaurant_id;
As I understand your question, you want all restaurants that have the same list of foods as restaurant 1.
If so, that's a relation division problem. Here is an approach using joins and aggregation:
select r.name
from identifier i1
inner join identifier i2 on i2.food_id = i1.food_id
inner join restaurant r on r.restaurant_id = i2.restaurant_id
where i1.restaurant_id = 1
group by r.restaurant_id
having count(*) = (select count(*) from identifier i3 where i3.restaurant_id = 1)

How to grab data from joined tables in a list of values condition?

I have three tables: Images table, Product table, and ProductGallery table. My Images table have two columns
In this way:
ID Path
1 JeffAtwood.jpg
2 GeoffDalgas.jpg
3 Jarrod Dixon.jpg
4 JoelSpolsky.jpg
My Products table have two columns :
ID Image ID
1 1
2 2
3 3
4 4
My ProductGallery has three columns
ID ImageID ProductID
1 1 1
2 2 1
3 3 1
4 4 1
5 1 2
6 2 2
7 3 2
8 4 2
I need to create a stored procedure that return for me all the paths for list of products
So I have a parameter called #ids NVARCHAR(MAX)
in which all the products ids that i need is there with a comma separating each value.
so I have :
#ids = '1' + ','+ '2'+',' ..... and so on
Any one can help me to write the sql statement does the job for me ?
So far I have this
Declare #Ids VARCHAR(MAX)
select #Ids = '1';
select P.ID ,I.Path from Images I
left outer join Products P on (I.ID IN(select ImageID from Products where ID in (SELECT Items FROM Split(#Ids,','))))
left outer join ProductGallery PG on (PG.ImageID = I.ID)
WHERE PG.ProductID IN(SELECT Items FROM Split(#Ids,',')) AND P.ID IN
(SELECT Items FROM Split(#Ids,','))
GROUP BY P.ID, I.Path
Notice #Ids parameter is already filled with values and I will pass it from my C# code.
Not at a machine to test but you could try
select I.Path from Images I
left outer join Products P on P.ImageID = I.ID
where CHARINDEX(convert(varchar, P.ID), #ids) > 0
By looking at your query, I assume you already have a function to split a delimited string.
Try this (to get images from product table, as per comment):
Select p.Id ProductId, i.Path imagePath
From Products p
Join Images i on p.imageId = i.Id
Where p.Id in (
--your Split() function here
SELECT Items FROM dbo.Split(#Ids,',')
)

MySQL query - possible to include this clause?

I have the following query, which retrieves 4 adverts from certain categories in a random order.
At the moment, if a user has more than 1 advert, then potentially all of those ads might be retrieved - I need to limit it so that only 1 ad per user is displayed.
Is this possible to achieve in the same query?
SELECT a.advert_id, a.title, a.url, a.user_id,
FLOOR(1 + RAND() * x.m_id) 'rand_ind'
FROM adverts AS a
INNER JOIN advert_categories AS ac
ON a.advert_id = ac.advert_id,
(
SELECT MAX(t.advert_id) - 1 'm_id'
FROM adverts t
) x
WHERE ac.category_id IN
(
SELECT category_id
FROM website_categories
WHERE website_id = '8'
)
AND a.advert_type = 'text'
GROUP BY a.advert_id
ORDER BY rand_ind
LIMIT 4
Note: The solution is the last query at the bottom of this answer.
Test Schema and Data
create table adverts (
advert_id int primary key, title varchar(20), url varchar(20), user_id int, advert_type varchar(10))
;
create table advert_categories (
advert_id int, category_id int, primary key(category_id, advert_id))
;
create table website_categories (
website_id int, category_id int, primary key(website_id, category_id))
;
insert website_categories values
(8,1),(8,3),(8,5),
(1,1),(2,3),(4,5)
;
insert adverts (advert_id, title, user_id) values
(1, 'StackExchange', 1),
(2, 'StackOverflow', 1),
(3, 'SuperUser', 1),
(4, 'ServerFault', 1),
(5, 'Programming', 1),
(6, 'C#', 2),
(7, 'Java', 2),
(8, 'Python', 2),
(9, 'Perl', 2),
(10, 'Google', 3)
;
update adverts set advert_type = 'text'
;
insert advert_categories values
(1,1),(1,3),
(2,3),(2,4),
(3,1),(3,2),(3,3),(3,4),
(4,1),
(5,4),
(6,1),(6,4),
(7,2),
(8,1),
(9,3),
(10,3),(10,5)
;
Data properties
each website can belong to multiple categories
for simplicity, all adverts are of type 'text'
each advert can belong to multiple categories. If a website has multiple categories that are matched multiple times in advert_categories for the same user_id, this causes the advert_id's to show twice when using a straight join between 3 tables in the next query.
This query joins the 3 tables together (notice that ids 1, 3 and 10 each appear twice)
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
inner join adverts a on a.advert_id = ac.advert_id and a.advert_type = 'text'
where wc.website_id='8'
order by a.advert_id
To make each website show only once, this is the core query to show all eligible ads, each only once
select *
from adverts a
where a.advert_type = 'text'
and exists (
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
where wc.website_id='8'
and a.advert_id = ac.advert_id)
The next query retrieves all the advert_id's to be shown
select advert_id, user_id
from (
select
advert_id, user_id,
#r := #r + 1 r
from (select #r:=0) r
cross join
(
# core query -- vvv
select a.advert_id, a.user_id
from adverts a
where a.advert_type = 'text'
and exists (
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
where wc.website_id='8'
and a.advert_id = ac.advert_id)
# core query -- ^^^
order by rand()
) EligibleAdsAndUserIDs
) RowNumbered
group by user_id
order by r
limit 2
There are 3 levels to this query
aliased EligibleAdsAndUserIDs: core query, sorted randomly using order by rand()
aliased RowNumbered: row number added to core query, using MySQL side-effecting #variables
the outermost query forces mysql to collect rows as numbered randomly in the inner queries, and group by user_id causes it to retain only the first row for each user_id. limit 2 causes the query to stop as soon as two distinct user_id's have been encountered.
This is the final query which takes the advert_id's from the previous query and joins it back to table adverts to retrieve the required columns.
only once per user_id
feature user's with more ads proportionally (statistically) to the number of eligible ads they have
Note: Point (2) works because the more ads you have, the more likely you will hit the top placings in the row numbering subquery
select a.advert_id, a.title, a.url, a.user_id
from
(
select advert_id
from (
select
advert_id, user_id,
#r := #r + 1 r
from (select #r:=0) r
cross join
(
# core query -- vvv
select a.advert_id, a.user_id
from adverts a
where a.advert_type = 'text'
and exists (
select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
where wc.website_id='8'
and a.advert_id = ac.advert_id)
# core query -- ^^^
order by rand()
) EligibleAdsAndUserIDs
) RowNumbered
group by user_id
order by r
limit 2
) Top2
inner join adverts a on a.advert_id = Top2.advert_id;
I'm thinking through something but don't have MySQL available.. can you try this query to see if it works or crashes...
SELECT
PreQuery.user_id,
(select max( tmp.someRandom ) from PreQuery tmp where tmp.User_ID = PreQuery.User_ID ) MaxRandom
from
( select adverts.user_id,
rand() someRandom
from adverts, advert_categories
where adverts.advert_id = advert_categories.advert_id ) PreQuery
If the "tmp" alias is recognized as a temp buffer of the preliminary query as defined by the OUTER FROM clause, I might have something that will work... I think the field as a select statement from a queried from WONT work, but if it does, I know I'll have something solid for you.
Ok, this one might make the head hurt a bit, but lets get the logical thing going... The inner most "Core Query" is a basis that gets all unique and randomly assigned QUALIFIED Users that have a qualifying ad base on the category chosen, and type = 'text'. Since the order is random, I don't care what the assigned sequence is, and order by that. The limit 4 will return the first 4 entries that qualify. This is regardless of one user having 1 ad vs another having 1000 ads.
Next, join to the advertisements, reversing the table / join qualifications... but by having a WHERE - IN SUB-SELECT, the sub-select will be on each unique USER ID that was qualified by the "CoreQuery" and will ONLY be done 4 times based on ITs inner limit. So even if 100 users with different advertisements, we get 4 users.
Now, the Join to the CoreQuery is the Advert Table based on the same qualifying user. Typically this would join ALL records against the core query given they are for the same user in question... This is correct... HOWEVER, the NEXT WHERE clause is what filters it down to only ONE ad for the given person.
The Sub-Select is making sure its "Advert_ID" matches the one selected in the sub-select. The sub-select is based ONLY on the current "CoreQuery.user_ID" and gets ALL the qualifying category / ads for the user (wrong... we don't want ALL ads)... So, by adding an ORDER BY RAND() will randomize only this one person's ads in the result set... then Limiting THAT by 1 will only give ONE of their qualified ads...
So, the CoreQuery restricts down to 4 users. Then for each qualified user ID, gets only 1 of the qualified ads (by its inner order by RAND() and LIMIT 1 )...
Although I don't have MySQL to try, the queries are COMPLETELY legit and hope it works for you.... man, I love brain teasers like this...
SELECT
ad1.*
from
( SELECT ad.user_id,
count(*) as UserAdCount,
RAND() as ANYRand
from
website_categories wc
inner join advert_categories ac
ON wc.category_id = ac.category_id
inner join adverts ad
ON ac.advert_id = ad.advert_id
AND ad.advert_type = 'text'
where
wc.website_id = 8
GROUP BY
1
order by
3
limit
4 ) CoreQuery,
adverts ad1
WHERE
ad1.advert_type = 'text'
AND CoreQuery.User_ID = ad1.User_ID
AND ad1.advert_id in
( select
ad2.advert_id
FROM
adverts ad2,
advert_categories ac2,
website_categories wc2
WHERE
ad2.user_id = CoreQuery.user_id
AND ad2.advert_id = ac2.advert_id
AND ac2.category_id = wc2.category_id
AND wc2.website_id = 8
ORDER BY
RAND()
LIMIT
1 )
I like to suggest that you do the random with php. This is way faster than doing it in mySQL.
"However, when the table is large (over about 10,000 rows) this method of selecting a random row becomes increasingly slow with the size of the table and can create a great load on the server. I tested this on a table I was working that contained 2,394,968 rows. It took 717 seconds (12 minutes!) to return a random row."
http://www.greggdev.com/web/articles.php?id=6
set #userid = -1;
select
a.id,
a.title,
case when #userid = a.userid then
0
else
1
end as isfirst,
(#userid := a.userid)
from
adverts a
inner join advertcategories ac on ac.advertid = a.advertid
inner join categories c on c.categoryid = ac.categoryid
where
c.website = 8
order by
a.userid,
rand()
having
isfirst = 1
limit 4
Add COUNT(a.user_id) as owned in the main select directive and add HAVING owned < 2 after Group By
http://dev.mysql.com/doc/refman/5.5/en/select.html
I think this is the way to do it, if the one user has more than one advert then we will not select it.