Am I using GROUP_CONCAT properly? - sql

I'm selecting properties and joining them to mapping tables where they get mapped to filters such as location, destination, and property type.
My goal is to grab all the properties and then LEFT JOIN them to the tables, and then basically get data that shows all the locations, destinations a property is attached to and the property type itself.
Here's my query:
SELECT p.slug AS property_slug,
p.name AS property_name,
p.founder AS founder,
IF (p.display_city != '', display_city, city) AS city,
d.name AS state,
type
GROUP_CONCAT( CONVERT(subcategories_id, CHAR(8)) ) AS foo,
GROUP_CONCAT( CONVERT(categories_id, CHAR(8)) ) AS bah
FROM properties AS p
LEFT JOIN destinations AS d ON d.id = p.state
LEFT JOIN regions AS r ON d.region_id = r.id
LEFT JOIN properties_subcategories AS sc ON p.id = sc.properties_id
LEFT JOIN categories_subcategories AS c ON c.subcategory_id = sc.subcategories_id
WHERE 1 = 1
AND p.is_active = 1
GROUP BY p.id
Before I do the GROUP BY and GROUP_CONCAT my data looks like this:
id name type category_id subcategory_id state
--------------------------------------------------------------------------
1 The Hilton Hotel 1 1 2 7
1 The Hilton Hotel 1 1 3 7
1 The BlaBla Resort 2 2 5 7
After the GROUP BY and GROUP_CONCAT it becomes...
id name type category_id subcategory_id state
--------------------------------------------------------------------------
1 The Hilton Hotel 1 1, 1 2, 3 7
1 The BlaBla Resort 2 1 3 7
Is this the preferred way of grabbing all the possible mappings for the property in one go, with GROUP_CONCAT into a CSV like this?
Using this data, I can render something like...
<div class="property" categories="1" subcategories="2,3">
<h2>{property_name}</h2>
<span>{property_location}</span>
</div>
Then use Javascript to show/hide based on if the user clicks on an anchor which has say, a subcategory="2" attribute it would hide each .property that doesn't have 2 inside of its subcategories attribute value.

I believe you want something like this:
CREATE TABLE property (id INT NOT NULL PRIMARY KEY, name TEXT);
INSERT
INTO property
VALUES
(1, 'Hilton'),
(2, 'Astoria');
CREATE TABLE category (id INT NOT NULL PRIMARY KEY, property INT NOT NULL);
INSERT
INTO category
VALUES
(1, 1),
(2, 1),
(3, 2);
CREATE TABLE subcategory (id INT NOT NULL PRIMARY KEY, category INT NOT NULL);
INSERT
INTO subcategory
VALUES
(1, 1),
(2, 1),
(3, 2),
(5, 3),
(6, 3),
(7, 3);
SELECT id, name,
CONCAT(
'{',
(
SELECT GROUP_CONCAT(
'"', c.id, '": '
'[',
(
SELECT GROUP_CONCAT(sc.id ORDER BY sc.id SEPARATOR ', ' )
FROM subcategory sc
WHERE sc.category = c.id
),
']' ORDER BY c.id SEPARATOR ', ')
FROM category c
WHERE c.property = p.id
), '}')
FROM property p;
which would output this:
1 Hilton {"1": [1, 2], "2": [3]}
2 Astoria {"3": [5, 6, 7]}
The last field is a properly formed JSON which maps category id's to the arrays of subcategory id's.

You should add DISTINCT, and possibly ORDER BY:
GROUP_CONCAT(DISTINCT CONVERT(subcategories_id, CHAR(8))
ORDER BY subcategories_id) AS foo,
GROUP_CONCAT(DISTINCT CONVERT(categories_id, CHAR(8))
ORDER BY categories_id) AS bah
It's "de-normalized" if you want to call it like this. If that's the best representation to be used for rendering is another question, I think it's fine. Some may say it's hack, but I guess it's not too bad.
By the way, a comma seems to be missing after the "type".

Related

Count all existing combinations of groupings of records

I have these db tables
questions: id, text
answers: id, text, question_id
answer_tags: id, answer_id, tag_id
tags: id, text
question has has many answers
answer has many tags through answer_tags, belongs to question
tag has many answers through answer_tags
An answer has an unlimited number of tags
I would like to show all combinations of groupings of tags that exist ordered by count
Examples data
Question 1, Answer 1, tag1, tag2, tag3, tag4
Question 2, Answer 2, tag2, tag3, tag4
Question 3, Answer 3, tag3, tag4
Question 4, Answer 4, tag4
Question 5, Answer 5, tag3, tag4, tag5
Question 1, Answer 6, <no tags>
How can I solve this using SQL?
I'm not sure if this is possible with SQL but if it does I think it would need RECURSIVE method.
Expected results:
tag3, tag4 occur 4 times
tag2, tag3, tag4 occur 2 times
tag2, tag3 occur 2 times
We would only return results with groupings greater than 1. No single tag is ever returned, it must be at least 2 tags together to be counted.
Building on #filiprem's answer and using a slightly modified function from the answer here you get:
--test data
create table questions (id int, text varchar(100));
create table answers (id int, text varchar(100), question_id int);
create table answer_tags (id int, answer_id int, tag_id int);
create table tags (id int, text varchar(100));
insert into questions values (1, 'question1'), (2, 'question2'), (3, 'question3'), (4, 'question4'), (5, 'question5');
insert into answers values (1, 'answer1', 1), (2, 'answer2', 2), (3, 'answer3', 3), (4, 'answer4', 4), (5, 'answer5', 5), (6, 'answer6', 1);
insert into tags values (1, 'tag1'), (2, 'tag2'), (3, 'tag3'), (4, 'tag4'), (5, 'tag5');
insert into answer_tags values
(1,1,1), (2,1,2), (3,1,3), (4,1,4),
(5,2,2), (6,2,3), (7,2,4),
(8,3,3), (9,3,4),
(10,4,4),
(11,5,3), (12,5,4), (13,5,5);
--end test data
--function to get all possible combinations from an array with at least 2 elements
create or replace function get_combinations(source anyarray) returns setof anyarray as $$
with recursive combinations(combination, indices) as (
select source[i:i], array[i] from generate_subscripts(source, 1) i
union all
select c.combination || source[j], c.indices || j
from combinations c, generate_subscripts(source, 1) j
where j > all(c.indices) and
array_length(c.combination, 1) <= 2
)
select combination from combinations
where array_length(combination, 1) >= 2
$$ language sql;
--expected results
SELECT tags, count(*) FROM (
SELECT q.id, get_combinations(array_agg(DISTINCT t.text)) AS tags
FROM questions q
JOIN answers a ON a.question_id = q.id
JOIN answer_tags at ON at.answer_id = a.id
JOIN tags t ON t.id = at.tag_id
GROUP BY q.id
) t1
GROUP BY tags
HAVING count(*)>1;
Note: this gives tag2,tag4 occurs 2 times which was missed in the expected results (from questions 1 and 2)
You can indeed use a recursive CTE to produce the possible combinations. First select all tag IDs as an array of one element. Then UNION ALL a JOIN of the CTE and the tag IDs appending the tag ID to the array if it is larger than the largest ID in the array.
To the CTE join an aggregation getting the tag IDs for every answer as an array. In the ON clause check that the answer's array contains the array from the CTE with the array contains operator #>.
Exclude the combinations from the CTE with only one tag in a WHERE clause as you're not interested in those.
Now GROUP BY the combination of tags an exclude all the combinations which occur less than twice in a HAVING clause -- you're not interested in them too. If you want you also "translate" the IDs to the names of the tags in the SELECT list.
WITH RECURSIVE "cte"
AS
(
SELECT ARRAY["t"."id"] "id"
FROM "tags" "t"
UNION ALL
SELECT "c"."id" || "t"."id" "id"
FROM "cte" "c"
INNER JOIN "tags" "t"
ON "t"."id" > (SELECT max("un"."e")
FROM unnest("c"."id") "un" ("e"))
)
SELECT "c"."id" "id",
(SELECT array_agg("t"."text")
FROM unnest("c"."id") "un" ("e")
INNER JOIN "tags" "t"
ON "t"."id" = "un"."e") "text",
count(*) "count"
FROM "cte" "c"
INNER JOIN (SELECT array_agg("at"."tag_id" ORDER BY "at"."tag_id") "id"
FROM "answer_tags" "at"
GROUP BY at.answer_id) "x"
ON "x"."id" #> "c"."id"
WHERE array_length("c"."id", 1) > 1
GROUP BY "c"."id"
HAVING count(*) > 1;
Result:
id | text | count
---------+------------------+-------
{2,3} | {tag2,tag3} | 2
{3,4} | {tag3,tag4} | 4
{2,4} | {tag2,tag4} | 2
{2,3,4} | {tag2,tag3,tag4} | 2
db<>fiddle
Try this:
SELECT tags, count(*) FROM (
SELECT q.id, array_agg(DISTINCT t.text) AS tags
FROM questions q
JOIN answers a ON a.question_id = q.id
JOIN answer_tags at ON at.answer_id = a.id
JOIN tags t ON t.id = at.tag_id
GROUP BY q.id
) t1
GROUP BY tags
HAVING count(*)>1;

How to populate column with list from another table?

Suppose I have three tables: STORES, STATES, and STORES_STATES.
STORES contains records of individual stores
STATES just contains a list of all 50 US states
STORES_STATES is a join table that contains pairs of stores and states to show which stores are available in which states
Consider the following tables (can't figure out how to format an ASCII table here):
What I need is a SELECT statement that will return each store in one column, and a list of states in another:
How do I combine the results of a subquery into a single column like this?
I would imagine it would be similar to this query:
SELECT
STORE_NAME,
(SELECT STATE_ABV FROM STORES_STATES WHERE STORES_STATES.STORE_ID = STORES.ID)
FROM STORES;
But this obviously fails since the subquery returns more than one result.
You can use APPLY :
SELECT s.store_name, STUFF(ss.state_abv, 1, 1, '') AS States
FROM stores s CROSS APPLY
( SELECT ', '+ss.state_abv
FROM stores_state ss
WHERE ss.store_id = s.id
FOR XML PATH('')
) ss(state_abv);
your two options are either the STRING_AGG function in the newest versions of SQL Server, or using an XML concatenation technique as described in this answer:
How to concatenate text from multiple rows into a single text string in SQL server?
The XML method is messy-looking in your code and difficult to remember - I always have to look up the syntax - but it's actually quite fast.
You can use STUFF() function along side with FOR XML PATH('') and join both tables on StoreID
CREATE TABLE Stores
(
ID INT,
StoreName VARCHAR(45)
);
CREATE TABLE States
(
StoreID INT,
StateABV VARCHAR(45)
);
INSERT INTO Stores VALUES
(1, 'Wolmart'), (2, 'Costco'), (3, 'Croegers');
INSERT INTO States VALUES
(1, 'NY'),
(1, 'WY'),
(1, 'MI'),
(2, 'AL'),
(2, 'GA'),
(2, 'TX'),
(3, 'FL');
SELECT SR.StoreName,
STUFF(
(
SELECT ',' + ST.StateABV
FROM States ST
WHERE ST.StoreID = SR.ID
FOR XML PATH('')
), 1, 1, ''
) State
FROM Stores SR;
Returns:
+-----------+----------+
| StoreName | State |
+-----------+----------+
| Wolmart | NY,WY,MI |
| Costco | AL,GA,TX |
| Croegers | FL |
+-----------+----------+

How to separate group by if value exists in multiple categories

I have a table like this:
CREATE TEMP TABLE users_category
(
user_id int,
category_id int
);
insert into users_category (user_id, category_id)
values (100, 1), (100, 2), (103, 4), (105, 4), (106, 2), (107, 1)
Then I'm trying to calculate how much users use categories like :
select category_id,count(*)
from users_category
group by category_id
Response is :
category_id count
4 2
1 2
2 2
How to add logic if users exists in more than one category , calculate it in other category.
For example user 100 exists in category 1 and 2 and i will not calculate it for both category-s . I want add new category ex. 99 and must add users there.
Response must be like :
category_id count
4 2
1 1
2 1
99 1 (user who was in both category)
How to do it ?
You need to aggregate first at the user level and then at the category level:
select category_id, count(*)
from (select user_id,
(case when min(category_id) = max(category_id) then min(category_id)
else 99
end) as category_id
from users_category uc
group by user_id
) uc
group by category_id;

To find infinite recursive loop in CTE

I'm not a SQL expert, but if anybody can help me.
I use a recursive CTE to get the values as below.
Child1 --> Parent 1
Parent1 --> Parent 2
Parent2 --> NULL
If data population has gone wrong, then I'll have something like below, because of which CTE may go to infinite recursive loop and gives max recursive error. Since the data is huge, I cannot check this bad data manually. Please let me know if there is a way to find it out.
Child1 --> Parent 1
Parent1 --> Child1
or
Child1 --> Parent 1
Parent1 --> Parent2
Parent2 --> Child1
With Postgres it's quite easy to prevent this by collecting all visited nodes in an array.
Setup:
create table hierarchy (id integer, parent_id integer);
insert into hierarchy
values
(1, null), -- root element
(2, 1), -- first child
(3, 1), -- second child
(4, 3),
(5, 4),
(3, 5); -- endless loop
Recursive query:
with recursive tree as (
select id,
parent_id,
array[id] as all_parents
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id,
p.all_parents||c.id
from hierarchy c
join tree p
on c.parent_id = p.id
and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
)
select *
from tree;
To do this for multiple trees at the same time, you need to carry over the ID of the root node to the children:
with recursive tree as (
select id,
parent_id,
array[id] as all_parents,
id as root_id
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id,
p.all_parents||c.id,
p.root_id
from hierarchy c
join tree p
on c.parent_id = p.id
and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
and c.root_id = p.root_id
)
select *
from tree;
Update for Postgres 14
Postgres 14 introduced the (standard compliant) CYCLE option to detect cycles:
with recursive tree as (
select id,
parent_id
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id
from hierarchy c
join tree p
on c.parent_id = p.id
)
cycle id -- track cycles for this column
set is_cycle -- adds a boolean column is_cycle
using path -- adds a column that contains all parents for the id
select *
from tree
where not is_cycle
You haven't specified the dialect or your column names, so it is difficult to make the perfect example...
-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2
(3, 2, 'SubChild')
-- End random data
;WITH RecursiveCTE (StartingID, Level, Parents, Loop, ID, ParentID, Description) AS
(
SELECT ID, 1, '|' + CAST(ID AS VARCHAR(MAX)) + '|', 0, * FROM #MyTable
UNION ALL
SELECT R.StartingID, R.Level + 1,
R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
MT.*
FROM #MyTable MT
INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)
SELECT StartingID, Level, Parents, MAX(Loop) OVER (PARTITION BY StartingID) Loop, ID, ParentID, Description
FROM RecursiveCTE
ORDER BY StartingID, Level
Something like this will show if/where there are loops in the recursive cte. Look at the column Loop. With the data as is, there is no loops. In the comments there are examples on how to change the values to cause a loop.
In the end the recursive cte creates a VARCHAR(MAX) of ids in the form |id1|id2|id3| (called Parents) and then checks if the current ID is already in that "list". If yes, it sets the Loop column to 1. This column is checked in the recursive join (the ABD R.Loop = 0).
The ending query uses a MAX() OVER (PARTITION BY ...) to set to 1 the Loop column for a whole "block" of chains.
A little more complex, that generates a "better" report:
-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2
(3, 3, 'SubChild')
-- End random data
-- The "terminal" childrens (that are elements that don't have childrens
-- connected to them)
;WITH WithoutChildren AS
(
SELECT MT1.* FROM #MyTable MT1
WHERE NOT EXISTS (SELECT 1 FROM #MyTable MT2 WHERE MT1.ID != MT2.ID AND MT1.ID = MT2.ParentID)
)
, RecursiveCTE (StartingID, Level, Parents, Descriptions, Loop, ParentID) AS
(
SELECT ID, -- StartingID
1, -- Level
'|' + CAST(ID AS VARCHAR(MAX)) + '|',
'|' + CAST(Description AS VARCHAR(MAX)) + '|',
0, -- Loop
ParentID
FROM WithoutChildren
UNION ALL
SELECT R.StartingID, -- StartingID
R.Level + 1, -- Level
R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
R.Descriptions + CAST(MT.Description AS VARCHAR(MAX)) + '|',
CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
MT.ParentID
FROM #MyTable MT
INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)
SELECT * FROM RecursiveCTE
WHERE ParentID IS NULL OR Loop = 1
This query should return all the "last child" rows, with the full parent chain. The column Loop is 0 if there is no loop, 1 if there is a loop.
Here's an alternate method for detecting cycles in adjacency lists (parent/child relationships) where nodes can only have one parent which can be enforced with a unique constraint on the child column (id in the table below). This works by computing the closure table for the adjacency list via a recursive query. It starts by adding every node to the closure table as its own ancestor at level 0 then iteratively walks the adjacency list to expand the closure table. Cycles are detected when a new record's child and ancestor are the same at any level other than the original level zero (0):
-- For PostgreSQL and MySQL 8 use the Recursive key word in the CTE code:
-- with RECURSIVE cte(ancestor, child, lev, cycle) as (
with cte(ancestor, child, lev, cycle) as (
select id, id, 0, 0 from Table1
union all
select cte.ancestor
, Table1.id
, case when cte.ancestor = Table1.id then 0 else cte.lev + 1 end
, case when cte.ancestor = Table1.id then cte.lev + 1 else 0 end
from Table1
join cte
on cte.child = Table1.PARENT_ID
where cte.cycle = 0
) -- In oracle uncomment the next line
-- cycle child set isCycle to 'Y' default 'N'
select distinct
ancestor
, child
, lev
, max(cycle) over (partition by ancestor) cycle
from cte
Given the following adjacency list for Table1:
| parent_id | id |
|-----------|----|
| (null) | 1 |
| (null) | 2 |
| 1 | 3 |
| 3 | 4 |
| 1 | 5 |
| 2 | 6 |
| 6 | 7 |
| 7 | 8 |
| 9 | 10 |
| 10 | 11 |
| 11 | 9 |
The above query which works on SQL Sever (and Oracle, PostgreSQL and MySQL 8 when modified as directed) rightly detects that nodes 9, 10, and 11 participate in a cycle of length 3.
SQL(/DB) Fiddles demonstrating this in various DBs can be found below:
Oracle 11gR2
SQL Server 2017
PostgeSQL 9.5
MySQL 8
You can use the same approach described by Knuth for detecting a cycle in a linked list here. In one column, keep track of the children, the children's children, the children's children's children, etc. In another column, keep track of the grandchildren, the grandchildren's grandchildren, the grandchildren's grandchildren's grandchildren, etc.
For the initial selection, the distance between Child and Grandchild columns is 1. Every selection from union all increases the depth of Child by 1, and that of Grandchild by 2. The distance between them increases by 1.
If you have any loop, since the distance only increases by 1 each time, at some point after Child is in the loop, the distance will be a multiple of the cycle length. When that happens, the Child and the Grandchild columns are the same. Use that as an additional condition to stop the recursion, and detect it in the rest of your code as an error.
SQL Server sample:
declare #LinkTable table (Parent int, Child int);
insert into #LinkTable values (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7), (7, 1);
with cte as (
select lt1.Parent, lt1.Child, lt2.Child as Grandchild
from #LinkTable lt1
inner join #LinkTable lt2 on lt2.Parent = lt1.Child
union all
select cte.Parent, lt1.Child, lt3.Child as Grandchild
from cte
inner join #LinkTable lt1 on lt1.Parent = cte.Child
inner join #LinkTable lt2 on lt2.Parent = cte.Grandchild
inner join #LinkTable lt3 on lt3.Parent = lt2.Child
where cte.Child <> cte.Grandchild
)
select Parent, Child
from cte
where Child = Grandchild;
Remove one of the LinkTable records that causes the cycle, and you will find that the select no longer returns any data.
Try to limit the recursive result
WITH EMP_CTE AS
(
SELECT
0 AS [LEVEL],
ManagerId, EmployeeId, Name
FROM Employees
WHERE ManagerId IS NULL
UNION ALL
SELECT
[LEVEL] + 1 AS [LEVEL],
ManagerId, EmployeeId, Name
FROM Employees e
INNER JOIN EMP_CTE c ON e.ManagerId = c.EmployeeId
AND s.LEVEL < 100 --RECURSION LIMIT
)
SELECT * FROM EMP_CTE WHERE [Level] = 100
Here is the solution for SQL Server:
Table Insert script:
CREATE TABLE MyTable
(
[ID] INT,
[ParentID] INT,
[Name] NVARCHAR(255)
);
INSERT INTO MyTable
(
[ID],
[ParentID],
[Name]
)
VALUES
(1, NULL, 'A root'),
(2, NULL, 'Another root'),
(3, 1, 'Child of 1'),
(4, 3, 'Grandchild of 1'),
(5, 4, 'Great grandchild of 1'),
(6, 1, 'Child of 1'),
(7, 8, 'Child of 8'),
(8, 7, 'Child of 7'), -- This will cause infinite recursion
(9, 1, 'Child of 1');
Script to find the exact records which are the culprit:
;WITH RecursiveCTE
AS (
-- Get all parents:
-- Any record in MyTable table could be an Parent
-- We don't know here yet which record can involve in an infinite recursion.
SELECT ParentID AS StartID,
ID,
CAST(Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
FROM MyTable
UNION ALL
-- Recursively try finding all the childrens of above parents
-- Keep on finding it until this child become parent of above parent.
-- This will bring us back in the circle to parent record which is being
-- keep in the StartID column in recursion
SELECT RecursiveCTE.StartID,
t.ID,
CAST(RecursiveCTE.[ParentChildRelationPath] + ' -> ' + t.Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
FROM RecursiveCTE
INNER JOIN MyTable AS t
ON t.ParentID = RecursiveCTE.ID
WHERE RecursiveCTE.StartID != RecursiveCTE.ID)
-- FInd the ones which causes the infinite recursion
SELECT StartID,
[ParentChildRelationPath],
RecursiveCTE.ID
FROM RecursiveCTE
WHERE StartID = ID
OPTION (MAXRECURSION 0);
Output of above query:

query to count number of unique relations

I have 3 tables:
t_user (id, name)
t_user_deal (id, user_id, deal_id)
t_deal (id, title)
multiple user can be linked to the same deal. (I'm using oracle but it should be similar, I can adapt it)
How can I get all the users (name) with the number of unique user he made a deal with.
let's explain with some data:
t_user:
id, name
1, joe
2, mike
3, John
t_deal:
id, title
1, deal number 1
2, deal number 2
t_user_deal:
id, user_id, deal_id
1, 1, 1
2, 2, 1
3, 1, 2
4, 3, 2
the result I expect:
user_name, number of unique user he made a deal with
Joe, 2
Mike, 1
John, 1
I've try this but I didn't get the expected result:
SELECT tu.name,
count(tu.id) AS nbRelations
FROM t_user tu
INNER JOIN t_user_deal tud ON tu.id = tud.user_id
INNER JOIN t_deal td ON tud.deal_id = td.id
WHERE
(
td.id IN
(
SELECT DISTINCT td.id
FROM t_user_deal tud2
INNER JOIN t_deal td2 ON tud2.deal_id = td2.id
WHERE tud.id <> tud2.user_id
)
)
GROUP BY tu.id
ORDER BY nbRelations DESC
thanks for your help
This should get you the result
SELECT id1, count(id2),name
FROM (
SELECT distinct tud1.user_id id1 , tud2.user_id id2
FROM t_user_deal tud1, t_user_deal tud2
WHERE tud1.deal_id = tud2.deal_id
and tud1.user_id <> tud2.user_id) as tab, t_user tu
WHERE tu.id = id1
GROUP BY id1,name
Something like
select name, NVL (i.ud, 0) ud from t_user join (
SELECT user_id, count(*) ud from t_user_deal group by user_id) i on on t_user.id = i.user_id
where i.ud > 0
Unless I'm missing somethig here. It actually sounds like your question references having a second user in the t_user_deal table. The model you've described here doesn't include that.
PostgreSQL example:
create table t_user (id int, name varchar(255)) ;
create table t_deal (id int, title varchar(255)) ;
create table t_user_deal (id int, user_id int, deal_id int) ;
insert into t_user values (1, 'joe'), (2, 'mike'), (3, 'john') ;
insert into t_deal values (1, 'deal 1'), (2, 'deal 2') ;
insert into t_user_deal values (1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 3, 2) ;
And the query.....
SELECT
name, COUNT(DISTINCT deal_id)
FROM
t_user INNER JOIN t_user_deal ON (t_user.id = t_user_deal.user_id)
GROUP BY
user_id, name ;
The DISTINCT might not be necessary (in the COUNT(), that is). Depends on how clean your data is (e.g., no duplicate rows!)
Here's the result in PostgreSQL:
name | count
------+-------
joe | 2
mike | 1
john | 1
(3 rows)