Efficient SQL to calculate # of shared affiliations - sql

So I have a table that stores asymmetrical connections between two persons (like a Twitter follow; not like a Facebook friend) and a table that stores a person's affiliations to various groups.
My task is to find, for each asymmetrical relationship, the number of affiliations shared between the "from person" and the "to person".
I made this brute force solution, but I'm wondering if brighter minds could come up with something more efficient.
select frm01.from_person_id, frm01.to_person_id, count(*) num_affl
from
(
select lnk.from_person_id, lnk.to_person_id, ga.grp_id from_grp_id
from links lnk
left outer join grp_affl ga on lnk.from_person_id = ga.person_id
group by lnk.from_person_id, lnk.to_person_id, grp_id
) frm01
inner join
(
select lnk.from_person_id, lnk.to_person_id, ga.grp_id to_grp_id
from links lnk
left outer join grp_affl ga on lnk.to_person_id = ga.person_id
group by lnk.from_person_id, lnk.to_person_id, grp_id
) to01
on (
frm01.from_person_id = to01.from_person_id
and frm01.to_person_id = to01.to_person_id
and frm01.from_grp_id = to01.to_grp_id
)
group by frm01.from_person_id, frm01.to_person_id;
Using ANSI SQL on Netezza (which doesn't allow correlated subqueries).
TIA!
Edited to add table schema:
table links:
from_person_id to_person_id
1 4
2 5
3 6
4 2
5 3
table grp_affl:
person_id grp_id
1 A
1 B
1 C
2 A
3 B
4 C
5 A
5 B
5 C
6 A
expected output:
from_person_id to_person_id num_affl
1 4 1
2 5 1
3 6 0
4 2 0
5 3 1
Persons 1 & 4 have 1 affiliation in common (C), 2 & 5 have A in common, 5 & 3 have B in common. 3 & 6 have nothing in common. Likewise 4 & 2.

You can do this with aggregation and the right joins:
select pairs.from_person_id, pairs.to_person_id, count(*) as num_affl
from links pairs join
grp_affl fromga
on fromga.person_id = pairs.from_person_id join
grp_affl toga
on toga.person_id = pairs.to_person_id and
toga.grp_id = fromga.grp_id
group by pairs.from_person_id, pairs.to_person_id;
The joins bring in the groups. The last condition only brings in matching groups between the two persons. The final group by counts them.
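To check this against the sample data, here is a minimal sketch that loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Netezza here, so dialect details may differ. One thing to be aware of: inner joins drop pairs that share no group, while the expected output lists them with a count of 0. The LEFT JOIN plus COUNT(toga.grp_id) form below is one assumed way to keep those rows; it is a variation on the answer, not the answer's exact query.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Netezza (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (from_person_id INTEGER, to_person_id INTEGER);
    CREATE TABLE grp_affl (person_id INTEGER, grp_id TEXT);
    INSERT INTO links VALUES (1,4),(2,5),(3,6),(4,2),(5,3);
    INSERT INTO grp_affl VALUES
        (1,'A'),(1,'B'),(1,'C'),(2,'A'),(3,'B'),
        (4,'C'),(5,'A'),(5,'B'),(5,'C'),(6,'A');
""")

# LEFT JOINs keep every link; COUNT(toga.grp_id) ignores the NULLs
# produced when the "to" person has no matching group.
shared_counts = conn.execute("""
    SELECT l.from_person_id, l.to_person_id, COUNT(toga.grp_id) AS num_affl
    FROM links l
    LEFT JOIN grp_affl fromga
           ON fromga.person_id = l.from_person_id
    LEFT JOIN grp_affl toga
           ON toga.person_id = l.to_person_id
          AND toga.grp_id = fromga.grp_id
    GROUP BY l.from_person_id, l.to_person_id
    ORDER BY l.from_person_id
""").fetchall()

for row in shared_counts:
    print(row)
# (1, 4, 1)
# (2, 5, 1)
# (3, 6, 0)
# (4, 2, 0)
# (5, 3, 1)
```

With plain inner joins the rows for 3 & 6 and 4 & 2 would simply be missing, which may or may not matter depending on how the result is consumed.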

Related

Find rows that contain the same value in different columns

I need to find which rows of a table share a value across two different columns, i.e. the value in one row's left column also appears in another row's right column. Here is a small sample out of 2k+ rows.
id left right
1 3 4
2 4 1
3 1 9
4 2 6
5 2 5
6 9 8
7 0 7
In the above case, I need to get rows 1, 2, 3 and 6: 4 appears in two rows in different columns (id=1 & 2), 1 appears in two rows in different columns (id=2 & 3), and 9 appears in two rows in different columns (id=3 & 6).
My thoughts:
I thought of many things, for example a cross join on the left and right columns, group by and count, etc.
with Final as (With OuterTable as (WITH Alias AS (SELECT id as left_id , left FROM Test)
SELECT DISTINCT id, left_id FROM Alias
INNER JOIN Test ON Alias.left = Test.right)
SELECT id from OuterTable
UNION ALL
SELECT left_id from OuterTable)
SELECT DISTINCT * from Final;
It's messy, but it works.
You can do it with EXISTS:
SELECT t1.*
FROM tablename t1
WHERE EXISTS (
SELECT 1 FROM tablename t2
WHERE t1.id <> t2.id AND (t2.left = t1.right OR t1.left = t2.right)
)
Results:
id left right
1 3 4
2 4 1
3 1 9
6 9 8
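As a quick check of the EXISTS approach, the sketch below runs it on the sample rows in an in-memory SQLite database via Python's sqlite3 module. The choice of SQLite is an assumption, since the question does not name a database. The column names left and right are keywords in SQL, so they are double-quoted here to be safe.

```python
import sqlite3

# In-memory SQLite database; the question does not name its DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (id INTEGER, "left" INTEGER, "right" INTEGER);
    INSERT INTO tablename VALUES
        (1,3,4),(2,4,1),(3,1,9),(4,2,6),(5,2,5),(6,9,8),(7,0,7);
""")

# Keep a row if some OTHER row has its left equal to this row's right,
# or its right equal to this row's left.
matching_rows = conn.execute("""
    SELECT t1.*
    FROM tablename t1
    WHERE EXISTS (
        SELECT 1 FROM tablename t2
        WHERE t1.id <> t2.id
          AND (t2."left" = t1."right" OR t1."left" = t2."right")
    )
    ORDER BY t1.id
""").fetchall()

print(matching_rows)
# [(1, 3, 4), (2, 4, 1), (3, 1, 9), (6, 9, 8)]
```

The t1.id <> t2.id condition is what stops a row such as (7, 7, 7) from matching itself if one ever appears in the data.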

SQL Query left join not filtering data

This is my query
SELECT
tblDiseases.disease
FROM
tblRel
LEFT JOIN
tblDiseases ON tblRel.diseaseID = tblDiseases.diseaseID
WHERE
tblRel.symptomID = '1' AND tblRel.symptomID = '2' AND tblRel.symptomID = '3'
and here are my tables
#tblDiseases - holds all disease names
######################################
diseaseID | disease
-----------------------
1 Tifoyd
2 Jondis
3 Malarya
4 Pneomonia
5 Dengu
#tblSymptoms - holds all symptoms
#################################
symptomID | symptom
-------------------------
1 Headache
2 Temparature
3 Less Pain
4 Sever Pain
5 Mussle Pain
#tblRel - holds relation between diseases and symptoms
######################################################
relID | diseaseID | symptomID
-----------------------------
1 1 1
2 1 2
3 3 1
4 3 2
5 3 3
I want the disease that has the symptoms "headache", "temperature" and "less pain", so the query should give "Malarya", but instead it gives nothing.
A single column in a single row cannot have three different values. What you want is aggregation, to compare values in different rows:
SELECT d.disease
FROM tblRel r JOIN
tblDiseases d
ON r.diseaseID = d.diseaseID
WHERE r.symptomID IN (1, 2, 3)
GROUP BY d.disease
HAVING COUNT(*) = 3; -- has all three symptoms
Note that LEFT JOIN is not necessary, because you need a match to name the disease. (Presumably, the disease ids match between the tables, as they would in a well-formed database.)
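The aggregation can be tried on the question's tables with a short sketch like the following, which uses an in-memory SQLite database through Python's sqlite3 module (the question does not say which database is in use, so SQLite is an assumption). Note that COUNT(*) = 3 relies on tblRel holding each disease and symptom pair at most once; if duplicates are possible, COUNT(DISTINCT r.symptomID) = 3 would be the safer test.

```python
import sqlite3

# In-memory SQLite database standing in for the asker's DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblDiseases (diseaseID INTEGER, disease TEXT);
    CREATE TABLE tblRel (relID INTEGER, diseaseID INTEGER, symptomID INTEGER);
    INSERT INTO tblDiseases VALUES
        (1,'Tifoyd'),(2,'Jondis'),(3,'Malarya'),(4,'Pneomonia'),(5,'Dengu');
    INSERT INTO tblRel VALUES (1,1,1),(2,1,2),(3,3,1),(4,3,2),(5,3,3);
""")

# Filter to the wanted symptoms, then keep diseases that matched all three.
diseases = conn.execute("""
    SELECT d.disease
    FROM tblRel r
    JOIN tblDiseases d ON r.diseaseID = d.diseaseID
    WHERE r.symptomID IN (1, 2, 3)
    GROUP BY d.disease
    HAVING COUNT(*) = 3
""").fetchall()

print(diseases)
# [('Malarya',)]
```

Disease 1 matches only two of the three symptoms, so the HAVING clause filters it out.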

Get max value from a joined list paired with another column in DB2

I have the following tables:
Table I:
etu | nr |
1 2
2 2
2 3
2 1
3 4
3 9
Table A:
etu | rsp | nr
2 8 2
2 7 3
2 3 1
3 2 4
3 6 9
Now what I want to have as a result table is
etu | nr | rsp
2.. 3 7
3.. 9 6
So etu and nr are linked together: if table I has several rows with the same etu, only the one with the highest nr is taken, and the rsp value for that etu and nr is looked up in table A and added to the result. In addition, if table I has more than one row for an etu, .. is appended to the etu value.
To explain the 3 9 6 row: table I has two rows with etu 3, and 9 is the highest nr among them. So we take 3 9, look up its rsp value (6) in table A, and add the row to the result table. The 2 row works the same way, with 2 3 being the highest row for etu 2 in table I.
I got something like:
select x.etu, x.rsp, y.nr from(
select i.etu etu, max(i.nr) maxnr, a.rsp from i left join a on
i.etu=a.etu and i.nr=a.nr group by etu)t
inner join a x on x.etu=t.etu and x.nr=t.nr inner join y on y.etu=t.etu
and y.nr=t.nr
or
select i.etu, max(i.nr) a.rsp from i left join a on i.etu=a.etu and
i.nr=a.nr grounp by
Neither gets me close to the result I want, let alone adds the .. after the etu value once the result is right.
The system is DB2 10.5 on Windows.
Thank you for all your help in advance.
Viking
I would use a CTE here like this:
with tmp as (
select i.etu, max(i.nr) as nr, count(*) as cnt
from i
group by i.etu)
select case
when tmp.cnt = 1 then char(a.etu)
else concat(rtrim(char(a.etu)), '..')
end as etu,
a.nr,
a.rsp
from tmp
left outer join a
on a.etu = tmp.etu
and a.nr = tmp.nr
The CTE provides the information necessary to join with a to get the correct response, and append the .. as necessary.
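Here is a rough sketch of the same CTE idea run on the sample rows in an in-memory SQLite database via Python's sqlite3 module. This is not DB2: SQLite's char() does something different, so the sketch substitutes CAST(... AS TEXT) and the || operator. It also uses an inner join (an assumption on my part), so that etu values with no row in a, such as etu 1 in the sample, drop out as in the expected result.

```python
import sqlite3

# In-memory SQLite database as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i (etu INTEGER, nr INTEGER);
    CREATE TABLE a (etu INTEGER, rsp INTEGER, nr INTEGER);
    INSERT INTO i VALUES (1,2),(2,2),(2,3),(2,1),(3,4),(3,9);
    INSERT INTO a VALUES (2,8,2),(2,7,3),(2,3,1),(3,2,4),(3,6,9);
""")

# tmp holds, per etu, the highest nr and how many rows table i has for it.
result_rows = conn.execute("""
    WITH tmp AS (
        SELECT i.etu, MAX(i.nr) AS nr, COUNT(*) AS cnt
        FROM i
        GROUP BY i.etu
    )
    SELECT CASE WHEN tmp.cnt = 1 THEN CAST(a.etu AS TEXT)
                ELSE CAST(a.etu AS TEXT) || '..'
           END AS etu,
           a.nr,
           a.rsp
    FROM tmp
    JOIN a ON a.etu = tmp.etu AND a.nr = tmp.nr
    ORDER BY a.etu
""").fetchall()

print(result_rows)
# [('2..', 3, 7), ('3..', 9, 6)]
```

With a left outer join instead, etu 1 would come back as a row of NULLs, because table a has no entry for it.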

sql query to collect users having common items

I'm facing a problem with Postgres. Here is the example:
I have 3 tables: users, items and boxes.
boxes table:
user_id | item_id
1 | 3
1 | 4
1 | 6
1 | 7
2 | 5
2 | 10
2 | 11
3 | 5
3 | 6
3 | 7
Given this boxes table, I would like to retrieve the items shared between users who have at least 2 items in common. So the expected result of the SQL query is
item_id: 6, 7
because user 1 and user 3 share items 6 and 7.
But user 2 and 3 share only one item: the item 5 so item 5 is not in result.
I'm trying so many ways without success. I wonder if someone can help me.
Try this. It returns 6 and 7 (and 5,6,7 if you add a record "1,5"), but I haven't tested it extensively.
-- The Outer query gets all the item_ids matching the user_ids returned from the subquery
SELECT DISTINCT c.item_id FROM boxes c -- need DISTINCT because we get 1,3 and 3,1...
INNER JOIN boxes d ON c.item_id = d.item_id
INNER JOIN
--- the subquery gets all the combinations of user ids which have more than one shared item_id
(SELECT a.user_id as first_user,b.user_id as second_user FROM
boxes a
INNER JOIN boxes b ON a.item_id = b.item_id AND a.user_id <> b.user_id -- don't pair a user with themselves
GROUP BY a.user_id,b.user_id
HAVING count(*) > 1) s
ON s.first_user = c.user_id AND s.second_user = d.user_id
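To check both claims (6 and 7 on the original data, then 5, 6 and 7 after adding the record "1,5"), the query can be run on the sample rows. The sketch below uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for Postgres, which is an assumption; the helper name shared_items is made up for illustration. It also assumes each (user_id, item_id) pair appears only once in boxes.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boxes (user_id INTEGER, item_id INTEGER);
    INSERT INTO boxes VALUES
        (1,3),(1,4),(1,6),(1,7),(2,5),(2,10),(2,11),(3,5),(3,6),(3,7);
""")

QUERY = """
    SELECT DISTINCT c.item_id
    FROM boxes c
    JOIN boxes d ON c.item_id = d.item_id
    JOIN (
        -- user pairs that have more than one item in common
        SELECT a.user_id AS first_user, b.user_id AS second_user
        FROM boxes a
        JOIN boxes b ON a.item_id = b.item_id AND a.user_id <> b.user_id
        GROUP BY a.user_id, b.user_id
        HAVING COUNT(*) > 1
    ) s ON s.first_user = c.user_id AND s.second_user = d.user_id
    ORDER BY c.item_id
"""

def shared_items(connection):
    """Hypothetical helper: item_ids common to any user pair sharing 2+ items."""
    return [row[0] for row in connection.execute(QUERY)]

before = shared_items(conn)
conn.execute("INSERT INTO boxes VALUES (1, 5)")  # the extra record "1,5"
after = shared_items(conn)

print(before)  # [6, 7]
print(after)   # [5, 6, 7]
```

Item 5 only shows up in the second run because users 1 and 3 then share three items, not because users 2 and 3 share it.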

How to compare two columns in SQL for multiple rows?

I have a data set with four columns (author, document, rating 1, rating 2)
How do I pick authors who have written a document that is rated higher in rating 1 than in rating 2, and who have also written another document that is rated higher in rating 2 than in rating 1?
Basically:
AUTHOR DOCUMENT RATING 1 RATING 2
A 1 1 2
B 2 1 2
B 3 3 1
C 4 2 2
C 5 3 4
C 6 1 3
D 7 1 2
D 8 1 2
So my desired query will give me B and C because each has written docs that are rated both higher and lower in rating 1 than in rating 2.
What I have:
SELECT DISTINCT author
FROM(
(SELECT author
FROM table_name
WHERE rating1 < rating2)
UNION
(SELECT author
FROM table_name
WHERE rating1 > rating2)
)
AS a
What I can't figure out is how to group the authors, test whether rating 1 and rating 2 are both higher and lower, output the name and then move on to the next group of authors. What the above prints is just the set of distinct names with either higher or lower numbers. So this one would print D as well, for example.
What is my SQL code missing that would satisfy the criteria mentioned above?
Try this,
select distinct t1.author
from table_name as t1
inner join table_name as t2
on t1.author = t2.author
and t2.rating1 < t2.rating2
where t1.rating1 > t1.rating2
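Below is a minimal sketch of the self-join idea with the two comparisons pointing in opposite directions: one side of the join finds a document rated higher in rating 1, the other side a document by the same author rated higher in rating 2. It runs on the question's rows in an in-memory SQLite database via Python's sqlite3 module (the question does not name a database, so that part is an assumption). On the rows exactly as posted, only B comes back, because none of C's three documents has rating 1 above rating 2.

```python
import sqlite3

# In-memory SQLite database; the asker's DBMS is not stated (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (
        author TEXT, document INTEGER, rating1 INTEGER, rating2 INTEGER
    );
    INSERT INTO table_name VALUES
        ('A',1,1,2),('B',2,1,2),('B',3,3,1),('C',4,2,2),
        ('C',5,3,4),('C',6,1,3),('D',7,1,2),('D',8,1,2);
""")

# t1 supplies a document with rating1 > rating2, t2 a document by the
# same author with rating1 < rating2; DISTINCT collapses repeat matches.
authors = [row[0] for row in conn.execute("""
    SELECT DISTINCT t1.author
    FROM table_name AS t1
    INNER JOIN table_name AS t2
            ON t1.author = t2.author
           AND t2.rating1 < t2.rating2
    WHERE t1.rating1 > t1.rating2
    ORDER BY t1.author
""")]

print(authors)
# ['B']
```

An equivalent formulation groups by author and checks in a HAVING clause that both kinds of document exist; the self-join is shown here because it stays closest to the answer.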