select unique combination in two overlap datasets

For example, in SAS, I have 5 IDs in dataset A (below left). Dataset B (below right) could potentially contain some of A's IDs. What I need is to find one unique combination (A is the primary outcome dataset) where A and B have the same sex, an age within 5, and an income within 10000. It is possible that a lot of b.id values could merge with a single a.id. But here's the kicker: I can only use each b.id once. In this case, 101 merges with 106, 102 merges with 111, 103 merges with 112, and 105 merges with 110. Sorry, I have a hard time describing my question. Hopefully it is clear enough. Thanks!
Dataset A                  Dataset B
ID   sex  age  income      ID   sex  age  income
101  F    30   20000       106  F    26   21000
102  M    20   10000       102  M    20   10000
103  F    38   30000       110  M    45   44000
104  M    55   35000       111  M    19   14000
105  M    43   45000       112  F    33   34000
outcome
ID_a  sex_a  age_a  income_a   ID_b  sex_b  age_b  income_b
101   F      30     20000      106   F      26     21000
102   M      20     10000      111   M      19     14000
103   F      38     30000      112   F      33     34000
104   M      55     35000
105   M      43     45000      110   M      45     44000

select a.Id, b.Id from SetA a
left join SetB b on a.sex = b.sex and
a.age between b.age - 5 and b.age + 5 and
a.income between b.income - 10000 and b.income + 10000

You should be able to use the techniques used in the answers to this question to do your fuzzy matching while making sure that each b.id is only used once.
The idea is to load your B dataset into a temporary array / hash object, so you can keep track of which b.ids have already been used while matching it onto A.
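
The idea above can be sketched outside SAS as well. The following is a minimal Python sketch, not the linked SAS solution: a plain set stands in for the hash object that remembers which b.ids are taken. The function name `greedy_match` and its parameters are made up for illustration, and skipping a B row that has the same ID as the A row is an assumption inferred from the expected outcome (102 pairs with 111, not with itself).

```python
# Rows are (id, sex, age, income), copied from the question's sample data.
A = [
    (101, "F", 30, 20000),
    (102, "M", 20, 10000),
    (103, "F", 38, 30000),
    (104, "M", 55, 35000),
    (105, "M", 43, 45000),
]
B = [
    (106, "F", 26, 21000),
    (102, "M", 20, 10000),
    (110, "M", 45, 44000),
    (111, "M", 19, 14000),
    (112, "F", 33, 34000),
]


def greedy_match(a_rows, b_rows, max_age_gap=5, max_income_gap=10000):
    """Pair each A row with the first unused B row that satisfies the rules."""
    used_b_ids = set()  # stands in for the hash object tracking used b.ids
    pairs = {}
    for a_id, a_sex, a_age, a_income in a_rows:
        pairs[a_id] = None  # A is the primary dataset, so every a.id is kept
        for b_id, b_sex, b_age, b_income in b_rows:
            # Assumption: a B row carrying the same ID as the A row is skipped.
            if b_id in used_b_ids or b_id == a_id:
                continue
            if (a_sex == b_sex
                    and abs(a_age - b_age) <= max_age_gap
                    and abs(a_income - b_income) <= max_income_gap):
                pairs[a_id] = b_id
                used_b_ids.add(b_id)  # this b.id can no longer be reused
                break
    return pairs


matches = greedy_match(A, B)
for a_id, b_id in matches.items():
    print(a_id, b_id)
```

On the sample data this reproduces the pairs listed in the question, with 104 left unmatched. A greedy pass takes the first acceptable partner, so on other data it may leave an A row unmatched that a different assignment could have paired.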

SELECT
A.ID,
B.ID
FROM lefttable A
INNER JOIN righttable B
ON (A.sex = B.sex) AND (A.age BETWEEN B.age - 5 AND B.age + 5) AND (A.income BETWEEN B.income - 10000 AND B.income + 10000)

Related

How do I change my SQL SELECT GROUP BY query to show me which records are missing a value?

I have a list of codes by area and type. I need to get the unique codes for each type, which I can do with a simple SELECT query with a GROUP BY. I now need to know which areas do not have one of the codes. So how do I run a query that groups by unique values and tells me which areas do not have one of the values?
ID Area Type Code
1 10 A 123
2 10 A 456
3 10 B 789
4 10 B 987
5 10 C 654
6 10 C 321
7 20 A 123
8 20 B 789
9 20 B 987
10 20 C 654
11 20 C 321
12 30 A 137
13 30 A 456
14 30 B 579
15 30 B 789
16 30 B 987
17 30 C 654
18 30 C 321
I can run this query to group them by type and get the unique codes:
SELECT tblExample.Type, tblExample.Code
FROM tblExample
GROUP BY tblExample.Type, tblExample.Code
This gives me this:
Type Code
A 123
A 137
A 456
B 579
B 789
B 987
C 321
C 654
Now I need to know which areas do not have a given code. For example, code 123 does not appear for area 30, and code 137 does not appear for areas 10 and 20. How do I write a query that tells me which areas are missing a code? The format of the output doesn't matter, I just need to get the results. I'm thinking the results could be in one column or spread out over multiple columns:
Type  Code  Missing Areas    or    Missing1  Missing2
A     123   30                     30
A     137   10, 20                 10        20
A     456   20                     20
B     579   10, 20                 10        20
B     789
B     987
C     321
C     654

You can get a list of the missing code/areas by first generating all combinations and then filtering out the ones that exist:
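-- pair every area with every (type, code), then keep the pairs that have no row
select c.type, c.code, a.area
from (select distinct area from tblExample) a cross join
(select distinct type, code from tblExample) c left join
tblExample e
on e.area = a.area and e.type = c.type and e.code = c.code
where e.area is null;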
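
To check the "generate all combinations, then filter out the ones that exist" approach against the sample data, here is a small sketch that runs such a query in an in-memory SQLite database through Python's standard sqlite3 module. SQLite is only a stand-in chosen for testing; the original database is not stated, and the query pairs every distinct area with every distinct (type, code), which is one reading of "all combinations".

```python
import sqlite3

# Sample rows from the question: (ID, Area, Type, Code)
rows = [
    (1, 10, "A", 123), (2, 10, "A", 456), (3, 10, "B", 789), (4, 10, "B", 987),
    (5, 10, "C", 654), (6, 10, "C", 321), (7, 20, "A", 123), (8, 20, "B", 789),
    (9, 20, "B", 987), (10, 20, "C", 654), (11, 20, "C", 321), (12, 30, "A", 137),
    (13, 30, "A", 456), (14, 30, "B", 579), (15, 30, "B", 789), (16, 30, "B", 987),
    (17, 30, "C", 654), (18, 30, "C", 321),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblExample (ID INTEGER, Area INTEGER, Type TEXT, Code INTEGER)"
)
conn.executemany("INSERT INTO tblExample VALUES (?, ?, ?, ?)", rows)

# Every area x every (type, code); the LEFT JOIN finds no row for missing pairs.
missing = conn.execute("""
    SELECT c.Type, c.Code, a.Area
    FROM (SELECT DISTINCT Area FROM tblExample) a
    CROSS JOIN (SELECT DISTINCT Type, Code FROM tblExample) c
    LEFT JOIN tblExample e
           ON e.Area = a.Area AND e.Type = c.Type AND e.Code = c.Code
    WHERE e.ID IS NULL
    ORDER BY c.Type, c.Code, a.Area
""").fetchall()

for type_, code, area in missing:
    print(type_, code, area)
```

Each output row is one missing (type, code, area) combination; collapsing them into a single "10, 20" style column would need a string aggregate, which differs between databases.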

how to write this sql

I have one table : SCORE
name score
mike 97
tom 86
lucy 44
and another table : RANK
low up rank
90 100 A
80 90 B
70 80 C
60 70 D
0 60 E
and I want the result like that
name score rank
mike 97 A
tom 86 B
lucy 44 E
How to write sql

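Try the query below. It joins the two tables on the condition that the score falls between low and up. Because the comparison is score > low AND score <= up, a score that sits exactly on a boundary (for example 90) gets the lower rank, and a score of 0 matches no row.
SELECT s.name, s.score, r.rank
FROM score s
JOIN rank r ON s.score > r.low AND s.score <= r.up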
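
As a quick check of the range join, the sketch below loads the two sample tables into an in-memory SQLite database using Python's sqlite3 module. SQLite is an assumption made for testing only; the table and column named rank are quoted because RANK is a reserved word in some SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE score (name TEXT, score INTEGER);
    CREATE TABLE "rank" (low INTEGER, up INTEGER, "rank" TEXT);
    INSERT INTO score VALUES ('mike', 97), ('tom', 86), ('lucy', 44);
    INSERT INTO "rank" VALUES (90, 100, 'A'), (80, 90, 'B'), (70, 80, 'C'),
                              (60, 70, 'D'), (0, 60, 'E');
""")

# Non-equi join: each score matches the single band with low < score <= up.
graded = conn.execute("""
    SELECT s.name, s.score, r."rank"
    FROM score s
    JOIN "rank" r ON s.score > r.low AND s.score <= r.up
    ORDER BY s.score DESC
""").fetchall()

for name, score, grade in graded:
    print(name, score, grade)
```

The half-open comparison matters because the bands share their endpoints (90 is both the top of B and the bottom of A); using BETWEEN on both ends would return two rows for a score of 90.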

Join on second table if value not found in first table

I would like to join on a second table only if the results of the first join are blank. Below is a subsection of Table A data:
ID Metro Submarket
1 NYC Manhattan
2 NYC Brooklyn
3 NYC Queens
4 NYC Bronx
5 NYC Newark
The tables I'm using for the joins are:
Table B                                          Table C
Metro  Submarket  A.Price  B.Price  C.Price      Metro    A.Price  B.Price  C.Price
NYC    Manhattan  54       32       48           NYC      50       49       69
NYC    Queens     35       39       59           Philly   49       48       37
NYC    Brooklyn   20       49       58           Chicago  20       48       36
NYC    Bronx      49       30       20
NYC    Newark     49       50       -
I'm adding the Price columns from Table B to Table A based on a Metro and Submarket match. However, Table B doesn't have all the prices. If I can't find a match in Table B then I want to look into Table C for a match only on Metro.
For ID 5, we can find the A and B prices in Table B. However, the C price is blank. In that case, I want it to retrieve the C price from Table C (69 is what it would choose).
I'm using SAS 9.4. SQL, macros, or anything else SAS can handle is welcome!

You can left join both tables to the main table and simply use COALESCE(). This will give you the value if present in Table B, otherwise it will give you the value in Table C:
PROC SQL;
CREATE TABLE Output AS
SELECT
ta.ID,
ta.Metro,
ta.Submarket,
COALESCE(tb.A_Price,tc.A_Price) AS A_Price,
COALESCE(tb.B_Price,tc.B_Price) AS B_Price,
COALESCE(tb.C_Price,tc.C_Price) AS C_Price
FROM
tablea ta
LEFT JOIN
tableb tb
ON (tb.Metro = ta.Metro)
AND (tb.Submarket = ta.Submarket)
LEFT JOIN
tablec tc
ON (tc.Metro = ta.Metro);
QUIT;
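
The same double LEFT JOIN with COALESCE() can be tried outside SAS. The sketch below is a rough check in an in-memory SQLite database via Python's sqlite3 module, not PROC SQL itself; the column names A_Price, B_Price and C_Price follow the answer's query, and the blank C price for Newark is stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (ID INTEGER, Metro TEXT, Submarket TEXT);
    CREATE TABLE tableb (Metro TEXT, Submarket TEXT,
                         A_Price INTEGER, B_Price INTEGER, C_Price INTEGER);
    CREATE TABLE tablec (Metro TEXT,
                         A_Price INTEGER, B_Price INTEGER, C_Price INTEGER);
    INSERT INTO tablea VALUES
        (1, 'NYC', 'Manhattan'), (2, 'NYC', 'Brooklyn'), (3, 'NYC', 'Queens'),
        (4, 'NYC', 'Bronx'), (5, 'NYC', 'Newark');
    INSERT INTO tableb VALUES
        ('NYC', 'Manhattan', 54, 32, 48), ('NYC', 'Queens', 35, 39, 59),
        ('NYC', 'Brooklyn', 20, 49, 58), ('NYC', 'Bronx', 49, 30, 20),
        ('NYC', 'Newark', 49, 50, NULL);
    INSERT INTO tablec VALUES
        ('NYC', 50, 49, 69), ('Philly', 49, 48, 37), ('Chicago', 20, 48, 36);
""")

# COALESCE returns the Table B price when present, else the Table C price.
prices = conn.execute("""
    SELECT ta.ID, ta.Metro, ta.Submarket,
           COALESCE(tb.A_Price, tc.A_Price) AS A_Price,
           COALESCE(tb.B_Price, tc.B_Price) AS B_Price,
           COALESCE(tb.C_Price, tc.C_Price) AS C_Price
    FROM tablea ta
    LEFT JOIN tableb tb ON tb.Metro = ta.Metro AND tb.Submarket = ta.Submarket
    LEFT JOIN tablec tc ON tc.Metro = ta.Metro
    ORDER BY ta.ID
""").fetchall()

for row in prices:
    print(row)
```

For ID 5 the A and B prices come from Table B and the C price falls back to 69 from Table C. The fallback works per column, so it also covers a submarket that has no row in Table B at all, provided Table C has exactly one row per metro.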

how to get the unique records with min and max for each user

I have the following table:
id gender age highest weight lowest weight abc
a f 30 90 70 1.3
a f 30 90 65 null
a f 30 null null 1.3
b m 40 100 86 2.5
b m 40 null 80 2.5
c f 50 105 95 6.4
I need this result in sql server. What I need is the minimum of the weight and maximum of the weight and one record per user.
id gender age highest weight lowest weight abc
a f 30 90 65 1.3
b m 40 100 80 2.5
c f 50 105 95 6.4

Just do a grouping:
select id,
max(gender) as gender,
max(age) as age,
max([highest weight]) as [highest weight],
min([lowest weight]) as [lowest weight],
max(abc) as abc
from SomeTable
group by id
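
A small sketch of this grouping on the sample rows, run in an in-memory SQLite database through Python's sqlite3 module rather than SQL Server. The column names highest_weight and lowest_weight (underscores instead of spaces) are an assumption made to keep the example portable. It relies on MAX() and MIN() ignoring NULLs, which is what lets the partial rows collapse into one complete row per id.

```python
import sqlite3

# Sample rows from the question; None stands for null.
rows = [
    ("a", "f", 30, 90, 70, 1.3),
    ("a", "f", 30, 90, 65, None),
    ("a", "f", 30, None, None, 1.3),
    ("b", "m", 40, 100, 86, 2.5),
    ("b", "m", 40, None, 80, 2.5),
    ("c", "f", 50, 105, 95, 6.4),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SomeTable (id TEXT, gender TEXT, age INTEGER,
                            highest_weight INTEGER, lowest_weight INTEGER,
                            abc REAL)
""")
conn.executemany("INSERT INTO SomeTable VALUES (?, ?, ?, ?, ?, ?)", rows)

# One row per id: aggregates skip NULLs, so gaps in single rows do not matter.
per_user = conn.execute("""
    SELECT id,
           MAX(gender)         AS gender,
           MAX(age)            AS age,
           MAX(highest_weight) AS highest_weight,
           MIN(lowest_weight)  AS lowest_weight,
           MAX(abc)            AS abc
    FROM SomeTable
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in per_user:
    print(row)
```

Taking MAX() of gender, age and abc is only safe here because those values are constant per id in the sample; if they can vary, a rule for picking one has to be decided first.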

You can do this using grouping:
select id, gender, max(highest_weight), min(lowest_weight) from student
group by id, gender
But you need to define the rule for the other fields whose values vary, like abc.
Can you post more information?

Please help me to solve this [duplicate]

This question already has an answer here:
How can I perform this aggregate?
(1 answer)
Closed 9 years ago.
I have created two tables: one is customers and the other one is ord.
select * from customers;
Customer table
1 101 jun 23 yyyy 15000
2 102 jas 24 zzzz 10000
3 103 fat 20 kkkk 20000
4 104 jini 40 llll 30000
5 105 michael 30 dddd 25000
6 106 das 25 hhhh 10000
7 107 vijay 26 mmmm 12000
8 108 thanku 31 jjjj 26000
9 109 vishnu 34 gggg 24000
10 110 vas 28 ffff 18000
select * from ord;
This is order table
1 12/11/2013 1:00:00 AM 102 2500
2 202 12/11/2013 4:14:17 AM 102 3000
3 203 12/9/2013 9:18:16 PM 103 2000
4 204 12/8/2013 12:00:00 PM 102 1000
5 205 12/24/2013 107 2000
This is the join command that I have used:
select c.name,c.salary,o.amount
from CUSTOMERS c
inner join ord o
on c.id=o.customer_id;
then the resulting table is
1 jas 10000 1000
2 jas 10000 3000
3 jas 10000 2500
4 fat 20000 2000
5 vijay 12000 2000
I want resulting table like this
1 jas 10000 6500
2 fat 20000 2000
3 vijay 12000 2000
Please help me solve this.

Grouping by c.name, c.salary with sum(o.amount) is what you want:
select c.name, c.salary, sum(o.amount )
from CUSTOMERS c
inner join ord o on c.id=o.customer_id
group by c.name, c.salary;
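
To confirm the result on the sample data, here is a short sketch using an in-memory SQLite database through Python's sqlite3 module. The question does not name its columns, so the reduced schema (id, name, salary for customers; customer_id, amount for ord) is an assumption based on the join in the question, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT, salary INTEGER);
    CREATE TABLE ord (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES
        (101, 'jun', 15000), (102, 'jas', 10000), (103, 'fat', 20000),
        (104, 'jini', 30000), (105, 'michael', 25000), (106, 'das', 10000),
        (107, 'vijay', 12000), (108, 'thanku', 26000), (109, 'vishnu', 24000),
        (110, 'vas', 18000);
    INSERT INTO ord VALUES
        (102, 2500), (102, 3000), (103, 2000), (102, 1000), (107, 2000);
""")

# One row per customer with orders; amounts are summed within each group.
totals = conn.execute("""
    SELECT c.name, c.salary, SUM(o.amount) AS total_amount
    FROM customers c
    INNER JOIN ord o ON c.id = o.customer_id
    GROUP BY c.name, c.salary
    ORDER BY MIN(c.id)
""").fetchall()

for row in totals:
    print(row)
```

Grouping by name and salary alone would merge two different customers who happen to share both values; adding c.id to the GROUP BY avoids that.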

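Try this. Note that grouping by column position (group by 1,2) is not supported by every database; SQL Server, for example, rejects it.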
select c.name,c.salary,sum(o.amount)
from CUSTOMERS c
inner join ord o
on c.id=o.customer_id
group by 1,2;
Thanks.

select c.name,c.salary,SUM(o.amount )
from CUSTOMERS c
inner join ord o
on c.id=o.customer_id
GROUP BY c.name,c.salary
I think this will work

Use Left Join or RIGHT JOIN
select c.name,c.salary,o.amount
from CUSTOMERS c
left join ord o
on c.id=o.customer_id;
