Optimizing query for finding friends - sql

I have a table which represents relations between pairs of people.
user_relating | user_related | relation_type
--------------------------------------------
1 | 2 | 1 --> means user 1 follows user 2
I need to find all friends for a specific user X.
Friendship means that two people follow each other.
So if user A follows users B and C, and users B and C both follow A, then B and C are friends of A.
I've written this query :
SELECT users.* -- user's followers
FROM users
JOIN user_relations rel
ON users.id = rel.user_relating AND user_related = 2
INTERSECT
SELECT users.* -- who the user follows
FROM users
JOIN user_relations rel
ON users.id = rel.user_related AND user_relating = 2;
But I think it's inefficient.
Is there a more optimized way to get this done?
I've tried doing something like this :
SELECT DISTINCT f.*
FROM users f
JOIN user_relations u1
on f.id = u1.user_related
JOIN user_relations u2
on f.id = u2.user_relating
WHERE u1.user_related = 2
or u2.user_related = 2;
It seems much more efficient judging by EXPLAIN ANALYZE (though my table is tiny, about 10 rows, so I'm not sure it's a meaningful measurement).
But the problem here is that it also returns the user in question. That is, if I want the friends of user B, this query returns user B together with all of B's friends. Can I somehow exclude user B from the query result?
And, as mentioned above, I would be glad to hear ideas for the most efficient way to write this kind of query.

Probably the most efficient method is:
select ur.*
from user_relations ur
where ur.user_relating < ur.user_related and
ur.relation_type = 1 and
exists (select 1
from user_relations ur2
where ur2.user_relating = ur.user_related and
ur2.user_related = ur.user_relating and
ur2.relation_type = 1
);
And for performance, you want an index on user_relations(user_relating, user_related, relation_type).
That said, this is similar to your version with the join, but does not require removing duplicates.
If you have lots of different relationship types, then an index on that column could also help.
EDIT:
If you have a particular user and want their friends:
select (case when user_relating = 2 then user_related else user_relating end) as user_friend
from user_relations ur
where ur.user_relating < ur.user_related and
ur.relation_type = 1 and
exists (select 1
from user_relations ur2
where ur2.user_relating = ur.user_related and
ur2.user_related = ur.user_relating and
ur2.relation_type = 1
) and
2 in (ur.user_relating, ur.user_related);
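As a rough check of the per-user case, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the index name `idx_rel`, the helper `friends_of`, and the use of SQLite itself are assumptions for illustration (the question does not name a database). It is a variation on the query above rather than the same statement: it starts from the rows where the given user is the follower and uses EXISTS to keep only those who follow back, so the user in question never appears in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_relations (
    user_relating INTEGER NOT NULL,
    user_related  INTEGER NOT NULL,
    relation_type INTEGER NOT NULL
);
-- Composite index suggested in the answer (index name is made up).
CREATE INDEX idx_rel
    ON user_relations (user_relating, user_related, relation_type);
-- Hypothetical sample data: 1<->2 and 2<->3 are mutual, the rest are one-way.
INSERT INTO user_relations VALUES
    (1, 2, 1), (2, 1, 1),
    (2, 3, 1), (3, 2, 1),
    (4, 2, 1),
    (2, 5, 1);
""")

FRIENDS_SQL = """
SELECT ur.user_related AS friend_id
FROM user_relations ur
WHERE ur.user_relating = :uid
  AND ur.relation_type = 1
  AND EXISTS (SELECT 1
              FROM user_relations back
              WHERE back.user_relating = ur.user_related
                AND back.user_related  = :uid
                AND back.relation_type = 1)
ORDER BY friend_id
"""

def friends_of(uid):
    """Return the ids of users who follow `uid` and are followed by `uid`."""
    return [row[0] for row in conn.execute(FRIENDS_SQL, {"uid": uid})]

print(friends_of(2))
```

With this sample data, users 4 and 5 are left out for user 2 because the follow only goes one way.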

SELECT from 2 different tables

I would like to select all possible brands for different products where the fk_category_id is for example equal to "2".
produits :
id titre fk_category_id fk_marque_id is_active is_delete
1 Swoke 2 1 1 0
2 Café 2 2 1 0
3 Fraise 2 3 1 0
4 Fruits 2 4 1 0
manufacturers :
id name
1 Swoke
2 Liqua
3 Alfaliquid
4 TJuice
5 otherBrands
I have already tried a lot of things, for example this query:
SELECT m.name, m.id, p.fk_category_id
FROM produits p
INNER JOIN manufacturers m
WHERE p.fk_category_id = 2
AND p.is_active = 1
AND p.fk_marque_id = m.id
AND p.is_delete = 0;
But it doesn't work.
The expected result is :
result :
id name
1 Swoke
2 Liqua
3 Alfaliquid
4 TJuice
It's the same as the table "manufacturers", but I have to filter by fk_category_id because I only want the brands with fk_category_id = 2.
So could someone explain or help me understand how to solve my problem? Thank you in advance; I'll continue researching on my side :).
If you need anything else, I can provide it.
I think what you need is a condition on your join that says how the data from the two tables should be joined together.
On the assumption that fk_marque_id is your reference in produits to an item in manufacturers (assumed from looking at your where clause), your SQL could look like this:
SELECT
p.id, p.titre, m.name, m.id, p.fk_category_id
FROM
produits p
INNER JOIN manufacturers m ON p.fk_marque_id = m.id
WHERE p.fk_category_id = 2
AND p.is_active = 1
AND p.is_delete = 0;
The naming convention of your fields in produits is a little unusual, however, if one of the FK fields is a link to an ID in manufacturers. You'd normally expect to see something like FK_manufacturers_Id, so it's clear that this column references that field (Id) in that table (manufacturers).
If you are just looking to join the produits and manufacturers tables where fk_category_id = 2, it can be done like this:
SELECT m.id, m.name
FROM manufacturers m
INNER JOIN produits p
ON m.id = p.fk_marque_id
WHERE p.fk_category_id = 2
AND p.is_active = 1
AND p.is_delete = 0;
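A small sketch of that join, run against an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the two extra product rows ('Menthe' and 'Vanille') are assumptions added to show an edge case: as soon as a brand has more than one product in the category, the plain join repeats it, so this version adds DISTINCT to return each brand once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manufacturers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE produits (
    id INTEGER PRIMARY KEY,
    titre TEXT,
    fk_category_id INTEGER,
    fk_marque_id INTEGER,
    is_active INTEGER,
    is_delete INTEGER
);
INSERT INTO manufacturers VALUES
    (1, 'Swoke'), (2, 'Liqua'), (3, 'Alfaliquid'), (4, 'TJuice'), (5, 'otherBrands');
INSERT INTO produits VALUES
    (1, 'Swoke',   2, 1, 1, 0),
    (2, 'Café',    2, 2, 1, 0),
    (3, 'Fraise',  2, 3, 1, 0),
    (4, 'Fruits',  2, 4, 1, 0),
    (5, 'Menthe',  2, 1, 1, 0),  -- hypothetical: second product of brand 1
    (6, 'Vanille', 3, 5, 1, 0);  -- hypothetical: product in another category
""")

# DISTINCT collapses brands that have several matching products.
brands = conn.execute("""
    SELECT DISTINCT m.id, m.name
    FROM manufacturers m
    INNER JOIN produits p ON m.id = p.fk_marque_id
    WHERE p.fk_category_id = 2
      AND p.is_active = 1
      AND p.is_delete = 0
    ORDER BY m.id
""").fetchall()

print(brands)
```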

Join Tables to return 1 or 0 based on multiple conditions

I am working on a project management website and have been asked for a new feature in a review meeting section.
A meeting is held to determine whether to proceed to the next phase, and I need to maintain a list of who attended each phase review meeting. I need to write an SQL query that returns all people, with an additional column stating whether they have already been added.
There are two tables involved to get my desired result, with the relevant columns listed below:
Name: PersonList
ID | Name | Division
Name: reviewParticipants
ProjectID | PersonID | GateID
The query I am looking for is something that returns all people in PersonList, with an additional "hasAttended" bit that is TRUE if reviewParticipants.ProjectID = 5 AND reviewParticipants.CurrentPhase = 'G0' ELSE FALSE.
PersonName | PersonID | hasAttended
Mr Smith | 1 | 1
Mr Jones | 2 | 0
I am not sure how to structure such a query with multiple conditions in a (left?) join, that would return as a different column name and data type, so I would appreciate if anybody can point me in the right direction?
With the result of this query I am going to add a series of checkboxes, and use this additional bit to mark it checked, or not, for page refreshes.
You can use LEFT JOIN as well:
SELECT DISTINCT p.*
,CASE WHEN rp.personid IS NOT NULL THEN 1 ELSE 0 END AS hasAttended
FROM personlist p
LEFT JOIN reviewParticipants rp ON rp.personid = p.id
AND rp.projectid = 5
AND rp.currentphase = 'G0';
I agree with Gordon Linoff: I would prefer an int or tinyint over a bit value.
You can use exists to see if there is a matching row.
select p.*,
(case when exists (select 1
from reviewParticipants rp
where rp.personid = p.id and
rp.projectid = 5 and
rp.currentphase = 'G0'
)
then 1 else 0 end) as hasAttended
from personlist p;
I see no reason to prefer a bit over an integer, but you can return a bit if you really prefer.
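The EXISTS approach can be tried out with a minimal sketch like the following, using Python's sqlite3 module. The sample rows, the Division values, and the presence of a CurrentPhase column on reviewParticipants are assumptions (the question's column list shows GateID, but its condition refers to CurrentPhase). The project id and phase are passed as named parameters so the same statement can serve any review meeting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PersonList (ID INTEGER PRIMARY KEY, Name TEXT, Division TEXT);
-- CurrentPhase is an assumed column name, taken from the question's condition.
CREATE TABLE reviewParticipants (ProjectID INTEGER, PersonID INTEGER, CurrentPhase TEXT);
INSERT INTO PersonList VALUES (1, 'Mr Smith', 'A'), (2, 'Mr Jones', 'B');
-- Mr Jones attended other reviews, but not project 5 / phase G0.
INSERT INTO reviewParticipants VALUES (5, 1, 'G0'), (5, 2, 'G1'), (7, 2, 'G0');
""")

ATTENDANCE_SQL = """
SELECT p.Name AS PersonName,
       p.ID   AS PersonID,
       CASE WHEN EXISTS (SELECT 1
                         FROM reviewParticipants rp
                         WHERE rp.PersonID = p.ID
                           AND rp.ProjectID = :project
                           AND rp.CurrentPhase = :phase)
            THEN 1 ELSE 0 END AS hasAttended
FROM PersonList p
ORDER BY p.ID
"""

rows = conn.execute(ATTENDANCE_SQL, {"project": 5, "phase": "G0"}).fetchall()
print(rows)
```

Every person is returned exactly once, whether or not a matching participant row exists, which is what a checkbox list needs.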
This returns only the people who attended that review (without the hasAttended column):
select a.* from PersonList a where
a.Id in (select b.PersonId from reviewParticipants b
where b.ProjectID = 5 and b.CurrentPhase = 'G0'
)

SQL - Updating rows in table based on multiple conditions

I'm fairly new to SQL and I have to write a query that updates multiple rows in a table based on multiple conditions.
From this example:
id  email                  organisationid  principaluserid  role
1   john@smith.com         MULT            null             100
2   john@smith.C-100.com   C-100           1                25
3   john@doe.com           MULT            null             100
4   john@doe.C-101.com     C-101           3                50
5   john@doe.C-102.com     C-102           3                25
6   jessica@smith.com      C-102           null             25
The goal is to update all the entries in the User table whose organisationid equals 'MULT' and that have exactly one principaluserid match.
From the example above, the first 2 entries match my conditions.
I then need to replace the id=2 email (john@smith.C-100.com) with the id=1 email (john@smith.com).
To do the job step by step, I tried to retrieve all the entries that match my condition with this query:
Edit, following @The_impaler's answer:
SELECT * FROM User a1 WHERE a1.organisationid = 'MULT' AND (
SELECT COUNT(*) FROM User a2 WHERE a2.principaluserid = a1.id
) = 1;
But I'm still stuck on how to update the entries. Any help is appreciated!
If I understand correctly, an update should do the trick:
update user u
set email = um.email
from user um
where um.id = u.principaluserid and
um.organisationid = 'MULT' and
not exists (select 1
from user up2
where up2.principaluserid = u.principaluserid and
up2.id <> u.id
);
Based on @The_impaler's advice, I wrote this query, which seems to meet my need:
UPDATE user u1
SET organisationid = (SELECT u2.organisationid FROM user u2 WHERE u1.id = u2.principaluserid)
WHERE u1.organisationid = 'MULT' AND
(SELECT COUNT(*) FROM user u2 WHERE u2.principaluserid = u1.id) = 1;
You could use an update based on a join:
UPDATE user u1
SET email = u2.email
FROM user u2
WHERE u2.organisationid = 'MULT'
AND u1.principaluserid = u2.id;
And if you only want the principal users that are referenced by exactly one row, then you could use:
UPDATE user u1
SET email = u2.email
FROM user u2
INNER JOIN
(
select principaluserid, count(*)
from user
group by principaluserid
having count(*) = 1
) t2 ON t2.principaluserid = u2.id
WHERE u2.organisationid = 'MULT'
AND u1.principaluserid = u2.id;
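Since UPDATE ... FROM syntax differs between databases, here is a hedged sketch of the same idea written with correlated subqueries only, run through Python's sqlite3 module. SQLite is an assumption (the question does not say which database is used), and this is a swapped-in formulation rather than any of the statements above. It copies the email of each 'MULT' user onto its dependent row, but only when exactly one row references that 'MULT' user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE "user" (
    id INTEGER PRIMARY KEY,
    email TEXT,
    organisationid TEXT,
    principaluserid INTEGER,
    role INTEGER
)""")
# Sample rows adapted from the question.
conn.executemany(
    'INSERT INTO "user" VALUES (?, ?, ?, ?, ?)',
    [
        (1, "john@smith.com", "MULT", None, 100),
        (2, "john@smith.C-100.com", "C-100", 1, 25),
        (3, "john@doe.com", "MULT", None, 100),
        (4, "john@doe.C-101.com", "C-101", 3, 50),
        (5, "john@doe.C-102.com", "C-102", 3, 25),
        (6, "jessica@smith.com", "C-102", None, 25),
    ],
)

conn.execute("""
UPDATE "user"
SET email = (SELECT p.email
             FROM "user" AS p
             WHERE p.id = "user".principaluserid)
WHERE principaluserid IN (
    -- 'MULT' users referenced by exactly one other row
    SELECT p.id
    FROM "user" AS p
    WHERE p.organisationid = 'MULT'
      AND (SELECT COUNT(*) FROM "user" AS c
           WHERE c.principaluserid = p.id) = 1
)
""")

emails = dict(conn.execute('SELECT id, email FROM "user"'))
print(emails[2])
```

Only row 2 changes here: user 3 is referenced by two rows (4 and 5), so those are left alone.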

Oracle SQL - selective filtering causes cartesian

Oracle 12.2
I have a SQL statement that is causing me issues. I am retrieving data from a table called BURNDOWN. If the user is an admin, they get to see all the data. If the user is NOT an admin, they are restricted to what they can see, based on some join conditions.
The issue I am running into is when the user is an ADMIN, I don’t need the other tables… subsequently, the JOIN condition is not relevant, so Oracle is deciding to do a cartesian join across everything…
How do I get around this so that if the user is an admin, I only look at one table, and otherwise I look at all the tables and include the join conditions?
The example SQL is a contrived example, but it shows the issue.
Select
BURNDOWN.NAME,
BURNDOWN.ADDRESS,
BURNDOWN.STATE
from BURNDOWN, FILTER_A, FILTER_B, FILTER_C
Where
(
:ISAdmin = 1
Or
(
BURNDOWN.x=FILTER_A.x and
FILTER_A.y=FILTER_B.y and
FILTER_B.z=FILTER_C.z and
FILTER_C.user = :ThisUser
)
)
Use an EXISTS to see if the data exists in the FILTER tables without joining them in to the results.
select bd.*
from burndown bd
where ( :isadmin = 1 or
exists ( select 1
from filter_a a
inner join filter_b b on b.y = a.y
inner join filter_c c on c.z = b.z
where a.x = bd.x
and c.user = :ThisUser )
)
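The effect of moving the filter tables into an EXISTS subquery can be illustrated with a small sketch. It runs on SQLite through Python's sqlite3 module rather than Oracle, so it only demonstrates the shape of the result, not Oracle's execution plan. The sample rows, the helper `visible_rows`, and the column name `username` (used instead of `user` to avoid a reserved word) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE burndown (name TEXT, x INTEGER);
CREATE TABLE filter_a (x INTEGER, y INTEGER);
CREATE TABLE filter_b (y INTEGER, z INTEGER);
CREATE TABLE filter_c (z INTEGER, username TEXT);
INSERT INTO burndown VALUES ('row1', 1), ('row2', 2), ('row3', 3);
-- The duplicate (1, 10) row would multiply results in a plain join.
INSERT INTO filter_a VALUES (1, 10), (1, 10), (2, 20);
INSERT INTO filter_b VALUES (10, 100), (20, 200);
INSERT INTO filter_c VALUES (100, 'alice'), (100, 'bob'), (200, 'bob');
""")

VISIBLE_SQL = """
SELECT bd.name
FROM burndown bd
WHERE :isadmin = 1
   OR EXISTS (SELECT 1
              FROM filter_a a
              INNER JOIN filter_b b ON b.y = a.y
              INNER JOIN filter_c c ON c.z = b.z
              WHERE a.x = bd.x
                AND c.username = :thisuser)
ORDER BY bd.name
"""

def visible_rows(is_admin, this_user):
    """Names of the burndown rows the given user may see."""
    params = {"isadmin": is_admin, "thisuser": this_user}
    return [row[0] for row in conn.execute(VISIBLE_SQL, params)]

print(visible_rows(1, "nobody"))
print(visible_rows(0, "alice"))
```

Because the filter tables never appear in the outer FROM clause, an admin gets each burndown row exactly once, and a non-admin's rows are not multiplied by duplicate filter rows.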
Presumably, you want:
select bd.*
from burndown bd
where :ISAdmin = 1 or
(exists (select 1 from FILTER_A a where bd.x = a.x) or
exists (select 1 from FILTER_B b where bd.y = b.y) or
exists (select 1 from FILTER_C c where bd.z = c.z)
);

SQL SELECT criteria in another table

I have 2 related tables:
messages
--------
mid subject
--- -----------------
1 Hello world
2 Bye world
3 The third message
4 Last one
properties
----------
pid mid name value
--- --- ---------------- -----------
1 1 read false
2 1 importance high
3 2 read false
4 2 importance low
5 3 read true
6 3 importance low
7 4 read false
8 4 importance high
And I need to select from messages using criteria on the properties table.
E.g., for criteria like "return unread (read=false), high-priority (importance=high) messages", it should return
mid subject
--- -----------------
1 Hello world
4 Last one
How could I get this with a SELECT clause (MySQL dialect)?
In SQL, any expression in a WHERE clause can only reference one row at a time. So you need some way of getting multiple rows from your properties table onto one row of result. You do this with self-joins:
SELECT ...
FROM messages AS m
JOIN properties AS pRead
ON m.mid = pRead.mid AND pRead.name = 'read'
JOIN properties AS pImportance
ON m.mid = pImportance.mid AND pImportance.name = 'importance'
WHERE pRead.value = 'false' AND pImportance.value = 'high';
This shows how awkward it is to use the EAV antipattern. Compare with using conventional attributes, where one attribute belongs in one column:
SELECT ...
FROM messages AS m
WHERE m.read = 'false' AND m.importance = 'high';
By the way, both answers from @Abe Miessler and @Thomas match more mids than you want. They match all mids where read=false OR where importance=high. You need to combine these properties with the equivalent of AND.
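The difference between the self-join (AND semantics) and a single join with OR can be checked on the question's sample data. This is a minimal sketch using Python's sqlite3 module instead of MySQL, which is an assumption; the two statements use only syntax common to both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (mid INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE properties (pid INTEGER PRIMARY KEY, mid INTEGER, name TEXT, value TEXT);
INSERT INTO messages VALUES
    (1, 'Hello world'), (2, 'Bye world'), (3, 'The third message'), (4, 'Last one');
INSERT INTO properties VALUES
    (1, 1, 'read', 'false'), (2, 1, 'importance', 'high'),
    (3, 2, 'read', 'false'), (4, 2, 'importance', 'low'),
    (5, 3, 'read', 'true'),  (6, 3, 'importance', 'low'),
    (7, 4, 'read', 'false'), (8, 4, 'importance', 'high');
""")

# One join per property: both conditions must hold for the same message.
AND_SQL = """
SELECT m.mid, m.subject
FROM messages AS m
JOIN properties AS pRead
  ON m.mid = pRead.mid AND pRead.name = 'read'
JOIN properties AS pImportance
  ON m.mid = pImportance.mid AND pImportance.name = 'importance'
WHERE pRead.value = 'false' AND pImportance.value = 'high'
ORDER BY m.mid
"""

# Single join with OR: a message qualifies if either property matches.
OR_SQL = """
SELECT DISTINCT m.mid
FROM messages AS m
JOIN properties AS p ON m.mid = p.mid
WHERE (p.name = 'read' AND p.value = 'false')
   OR (p.name = 'importance' AND p.value = 'high')
ORDER BY m.mid
"""

and_rows = conn.execute(AND_SQL).fetchall()
or_mids = [row[0] for row in conn.execute(OR_SQL)]
print(and_rows)
print(or_mids)
```

On this data the OR version also returns message 2, which is unread but of low importance.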
I believe the query below will work.
UPDATE: @Gratzy is right, this query won't work; take a look at the structure changes I suggested.
SELECT DISTINCT m.mid, m.subject
FROM messages as m
INNER JOIN properties as p
ON m.mid = p.mid
where (p.name = 'read' and p.value = 'false') or (p.name = 'importance' AND p.value = 'high')
The structure of your properties table seems a little off to me though...
Would it be possible to structure the table like this:
messages
--------
mid subject Read Importance
--- ----------------- --------- ------------
1 Hello world false 3
2 Bye world false 1
3 The third message true 1
4 Last one false 3
importance
----------
iid importanceName
--- --------------
1 low
2 medium
3 high
and use this query:
SELECT m.mid, m.subject
FROM messages as m
where m.read = false AND m.importance = 3
Clearly, you are using an EAV (Entity-Attribute-Value) schema. One of the many reasons for avoiding such a structure is that it makes queries more difficult. However, for the example you gave, you could do something like:
Select ...
From messages As M
Where Exists (
Select 1
From Properties As P1
Where P1.mid = M.mid
And P1.name = 'read' And P1.value = 'false'
)
And Exists (
Select 1
From Properties As P2
Where P2.mid = M.mid
And P2.name = 'importance' And P2.value = 'high'
)
A more succinct solution would be:
Select ...
From messages As M
Where Exists (
Select 1
From Properties As P1
Where P1.mid = M.mid
And ((P1.name = 'read' And P1.value = 'false')
Or (P1.name = 'importance' And P1.value = 'high'))
Having Count(*) = 2
)
Select m.mid, m.subject
from properties p
inner join properties p1 on p.mid = p1.mid
inner join messages m on p.mid = m.mid
where
p.name = 'read'
and p.value = 'false'
and p1.name = 'importance'
and p1.value = 'high'
I prefer to put my filter criteria in the where clause and limit my joins to the columns that are in both tables and are the actual criteria for the join.
Another way (untested) might be to use a derived table to hold the criteria that all messages must meet, then use the standard relational division technique of double NOT EXISTS:
SELECT mid,
subject
FROM messages m
WHERE NOT EXISTS
( SELECT *
FROM ( SELECT 'read' AS name,
'false' AS value
UNION ALL
SELECT 'importance' AS name,
'high' AS value
)
c
WHERE NOT EXISTS
(SELECT *
FROM properties P
WHERE p.mid = m.mid
AND p.name =c.name
AND p.value=c.value
)
)
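The double NOT EXISTS reads as "there is no criterion that this message fails to satisfy". A minimal sketch of it follows, using Python's sqlite3 module. As an illustrative variation it stores the criteria in a real helper table, hypothetically named `criteria`, instead of the inline UNION ALL derived table, so the set of required properties can change without touching the query. SQLite and the helper `matching_mids` are assumptions as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (mid INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE properties (pid INTEGER PRIMARY KEY, mid INTEGER, name TEXT, value TEXT);
CREATE TABLE criteria (name TEXT, value TEXT);  -- hypothetical helper table
INSERT INTO messages VALUES
    (1, 'Hello world'), (2, 'Bye world'), (3, 'The third message'), (4, 'Last one');
INSERT INTO properties VALUES
    (1, 1, 'read', 'false'), (2, 1, 'importance', 'high'),
    (3, 2, 'read', 'false'), (4, 2, 'importance', 'low'),
    (5, 3, 'read', 'true'),  (6, 3, 'importance', 'low'),
    (7, 4, 'read', 'false'), (8, 4, 'importance', 'high');
""")

# Keep a message only if no criterion exists that the message does not meet.
DIVISION_SQL = """
SELECT m.mid
FROM messages m
WHERE NOT EXISTS (
    SELECT 1
    FROM criteria c
    WHERE NOT EXISTS (
        SELECT 1
        FROM properties p
        WHERE p.mid = m.mid
          AND p.name = c.name
          AND p.value = c.value))
ORDER BY m.mid
"""

def matching_mids(required):
    """Return mids of messages having every (name, value) pair in `required`."""
    conn.execute("DELETE FROM criteria")
    conn.executemany("INSERT INTO criteria VALUES (?, ?)", required)
    return [row[0] for row in conn.execute(DIVISION_SQL)]

print(matching_mids([("read", "false"), ("importance", "high")]))
```

One consequence worth knowing: with an empty criteria set the condition is vacuously true, so every message is returned.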
If you want to keep your existing data model, then go with Bill Karwin's first suggestion. Run it with this select clause to understand what it's doing:
select m.*, r.value as read, i.value as importance
from messages m
join properties r
on r.mid = m.mid and r.name = 'read'
join properties i
on i.mid = m.mid and i.name = 'importance'
where r.value = 'false' and i.value = 'high';
But if you go this way, there are a few constraints you should put in place to avoid storing and retrieving bad data:
A unique index on messages(mid) and a unique index on properties(pid), both of which I'm sure you have already.
A unique index on properties(mid, name) so that each property can only be defined once for a message -- otherwise you may get duplicate results from your query. This will also help your query performance by allowing an index access for both joins.
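A short sketch of what the second constraint buys you, using Python's sqlite3 module. SQLite and the index name `ux_properties_mid_name` are assumptions; in MySQL the CREATE UNIQUE INDEX statement has the same shape, but the duplicate would surface as a MySQL error rather than `sqlite3.IntegrityError`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE properties (
    pid INTEGER PRIMARY KEY,
    mid INTEGER NOT NULL,
    name TEXT NOT NULL,
    value TEXT
);
-- Each property name may be defined only once per message.
CREATE UNIQUE INDEX ux_properties_mid_name ON properties (mid, name);
""")

conn.execute("INSERT INTO properties (mid, name, value) VALUES (1, 'read', 'false')")
# A different property on the same message is fine.
conn.execute("INSERT INTO properties (mid, name, value) VALUES (1, 'importance', 'high')")

try:
    # Second 'read' row for message 1 violates the unique index.
    conn.execute("INSERT INTO properties (mid, name, value) VALUES (1, 'read', 'true')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM properties").fetchone()[0]
print(duplicate_rejected, row_count)
```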