Filling the NULL parts of the tables in the left join

I have two tables with a similar structure, and I need to obtain all the information related to each zip code.
tableA
zip_code location
1 A
2 C
2 D
3 E
4 F
5 G
tableB
zip_code location n
2 A 1
2 C 2
2 D 3
3 A 4
3 E 5
4 F 6
4 H 7
6 Y 8
As you can see, one location can have multiple zip_codes, so I have to use both zip_code and location in the join condition. When I applied a left join, I couldn't manage to fill the NULL parts. My strategy for filling them is:
If the zip_code in tableA is not in tableB, I search for the location name and choose n based on the minimum difference between zip_codes.
If neither the zip_code nor the location in tableA is in tableB, I want to choose n based on the minimum difference between zip_codes, and if there are multiple candidates I want the minimum n.
NOTE: When comparing the differences between zip_codes, if two candidates are equally close I want the smaller zip_code. For example, 5 is equally close to 4 and 6, but I want to go with 4 and then apply the other conditions.
The resulting table should be something like this:
zip_code location n
1 A 1
2 C 2
2 D 3
3 E 5
4 F 6
5 G 6
I know it is a bit complicated, but I can explain the fuzzy parts in more detail.

This sounds like an OUTER APPLY:
select a.*, b.n
from tableA a outer apply
     (select top (1) b.*
      from tableB b
      order by (case when b.zip_code = a.zip_code and b.location = a.location
                     then -1
                     when b.location = a.location
                     then abs(b.zip_code - a.zip_code)
                     else abs(b.zip_code - a.zip_code)
                end),
               b.n
     ) b;

Related

SQL query - Cumulatively concatenate strings in consecutive rows

I'm a data analyst, so I write SQL queries to retrieve data from a database. I'm not sure what kind of SQL exactly, just assume the most standard (also not things like 'DECLARE #tbl', and no create functions etc.)
Here is my problem.
Given the following table:
name number letter
A 1 a
A 2 b
A 3 c
A 4 d
B 1 a
B 2 b
B 3 c
B 4 d
I want the following result (concatenate letter cumulatively, ordered by number):
name number letter result
A 1 a a
A 2 b a,b
A 3 c a,b,c
A 4 d a,b,c,d
B 1 a a
B 2 b a,b
B 3 c a,b,c
B 4 d a,b,c,d
Any help is highly appreciated. Thanks very much.
This answers the original version of the question which was tagged MySQL.
MySQL doesn't support group_concat() as a window function. So a subquery may be your best alternative:
select t.*,
       (select group_concat(t2.letter order by t2.number)
        from t t2
        where t2.name = t.name and t2.number <= t.number
       ) as letters
from t;
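For comparison, some engines do accept group_concat() as a window function. The sketch below assumes SQLite 3.25 or newer, driven through Python's sqlite3 module, and is not the MySQL approach from the answer; it only shows the same cumulative result produced with a running window per name.

```python
import sqlite3

# In-memory SQLite database with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (name TEXT, number INTEGER, letter TEXT);
INSERT INTO t VALUES ('A',1,'a'),('A',2,'b'),('A',3,'c'),('A',4,'d'),
                     ('B',1,'a'),('B',2,'b'),('B',3,'c'),('B',4,'d');
""")

# With ORDER BY in the window, the default frame runs from the start of the
# partition to the current row, so each row sees all letters up to its number.
cumulative = conn.execute("""
SELECT name, number, letter,
       group_concat(letter) OVER (PARTITION BY name ORDER BY number) AS result
FROM t
ORDER BY name, number
""").fetchall()

for row in cumulative:
    print(row)
```

SQLite's group_concat() uses a comma as its default separator, which matches the format requested in the question.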

Creating a VIEW to get a connection count

I have the table below, which stores connections between two people:
TABLE (CONNECTION)
ID | REQUEST_PERSON | REQUESTEE_PERSON
I would like to build a VIEW that returns REQUEST_PERSON, REQUESTEE_PERSON and MUTUAL_CONNECTION_COUNT (the number of other connections the two people have in common). Any help is appreciated.
For example, if we have table data as below:
ID | REQUEST_PERSON | REQUESTEE_PERSON
1 A B
2 A C
3 B C
4 D B
5 D A
6 A E
7 B E
8 A F
9 C G
I need the VIEW to display the following:
ID | REQUEST_PERSON | REQUESTEE_PERSON | MUTUAL_CONNECTION_COUNT
1 A B 3
2 A C 1
3 B C 1
4 D B 1
5 D A 1
6 A E 1
7 B E 1
8 A F 0
9 C G 0
This is rather tricky. Here is code that does what you want:
select c.*,
       (select count(*)
        from (select v.person2
              from connections c2 cross apply
                   (values (c2.REQUESTEE_PERSON, c2.REQUEST_PERSON),
                           (c2.REQUEST_PERSON, c2.REQUESTEE_PERSON)
                   ) v(person1, person2)
              where v.person1 IN (c.Request_Person, c.Requestee_Person)
              group by v.person2
              having count(*) = 2
             ) v
       ) in_common
from connections c
order by id;
The essence of the problem is finding people who are connected to both people in each row. Your connections are unidirectional, which makes the logic hard to express -- C could be either the first or second person in either connection.
Arrgh!
So, the innermost subquery adds reverse links to the graph. Then, it can focus on filtering by the first person -- who has to match the persons in the outer query. The second person is the one that might be in common.
The inner aggregation is just summarizing by the second person. It filters using having count(*) = 2 to indicate that both people in the outer query need to be connected to the second person in the inner query. The count(*) assumes that you have no duplicates.
Then, these are counted, which is the value you want.
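The reverse-link idea described above can also be sketched without CROSS APPLY. The example below assumes SQLite via Python's sqlite3 module; the view names edges and connection_mutual are hypothetical, and the self-join on the shared neighbour is a swapped-in alternative to the HAVING count(*) = 2 filter, not the answer's exact query.

```python
import sqlite3

# In-memory SQLite database with the sample connections from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE connection (id INTEGER, request_person TEXT, requestee_person TEXT);
INSERT INTO connection VALUES
  (1,'A','B'),(2,'A','C'),(3,'B','C'),(4,'D','B'),(5,'D','A'),
  (6,'A','E'),(7,'B','E'),(8,'A','F'),(9,'C','G');

-- Hypothetical helper view: every connection listed in both directions.
CREATE VIEW edges AS
  SELECT request_person AS person1, requestee_person AS person2 FROM connection
  UNION ALL
  SELECT requestee_person, request_person FROM connection;

-- Hypothetical view name. For each row, count the people who are linked
-- to both the requester and the requestee.
CREATE VIEW connection_mutual AS
  SELECT c.id, c.request_person, c.requestee_person,
         (SELECT COUNT(*)
          FROM edges e1
          JOIN edges e2 ON e2.person2 = e1.person2
          WHERE e1.person1 = c.request_person
            AND e2.person1 = c.requestee_person) AS mutual_connection_count
  FROM connection c;
""")

counts = conn.execute(
    "SELECT id, mutual_connection_count FROM connection_mutual ORDER BY id"
).fetchall()
print(counts)
```

Like the original query, this assumes the table holds no duplicate connections; duplicates would inflate the count.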

Query problem: getting the wrong result when a condition or a set of data is included

I have a bug in my query, but I have no idea what is happening.
There are 3 tables.
table 1
name grade min max
a
b
c
d
e
table 2
fullname name min max
a a123 1 10
bbbb b 2 20
c cccc 3 30
d dd 1 10
E Ed 2 20
table 3
value grade
25 A
15 B
5 C
My goal is to show, for each name, the grade from table 3 whose value is below that name's max (the highest such grade).
For example, c has a max of 30, so it should get grade A rather than B or C.
Also, the name usually matches fullname in table 2, but sometimes it matches name in table 2 (like b); this is where one of the bugs is. That's how the tables look, and I can't change them.
If I do not include the check table1.name = table2.name, there is no bug at all, but I cannot get a grade for b.
If I include table1.name = table2.name, then there is a problem.
The query that matches the grade looks like this (assume min and max have already been fetched from table 2):
update table1
set table1.grade = table3.grade
from table1 inner join table3
on table1.max > table3.value
All the cases below include the check table1.name = table2.name.
case 1:
The grade will be C for all rows if certain data is included.
For example, if I do not include e in table1, everything is fine,
but if I include e, all records get grade C.
case 2:
If I run the query for all the data at once, the result is wrong.
It works fine if I update the records one by one.
For example, I add one more condition to the update query:
update table1
set table1.grade = table3.grade
from table1 inner join table3
on table1.max > table3.value and fullname='c'
After getting the wrong result, I add the condition and run it again;
then c gets grade 'A' instead of 'C'. But if I remove the condition and run the query again,
c gets grade 'C' again.
case 3:
There is no problem when I run only the set of data that causes the case 1 problem on its own,
but if I put the data together, it causes the problem.
Those are all the cases. I don't know what causes the problem. Please help.
The result should be:
table 1
name grade min max
a C 1 10
b B 2 20
c A 3 30
d C 1 10
e B 2 20
If I remove table1.name = table2.name, the result will be:
table 1
name grade min max
a C 1 10
b null null null
c A 3 30
d C 1 10
e B 2 20
With table1.name = table2.name, the result will be:
table 1
name grade min max
a C 1 10
b C 2 20
c C 3 30
d C 1 10
e C 2 20
With table1.name = table2.name but with e removed, the result will be:
table 1
name grade min max
a C 1 10
b B 2 20
c A 3 30
d C 1 10
With table1.name = table2.name but only for e, the result will be:
name grade min max
e B 2 20
These situations happen when I run the update query for the whole table.
There is no problem with table1.name = table2.name if I update each row one by one.
At the very least, you should avoid the way you update the table. You should be careful with joins in an update: if the join produces more than one matching value for the same row, the result is not deterministic. Better to use this form:
update table1
set table1.grade = (SELECT TOP 1 table3.grade
                    FROM table3
                    WHERE table3.value < table1.max
                    ORDER BY table3.value DESC)
I am not sure about the expected results, but you may have traced the problem already: incorrect criteria.
Before doing an update, just try a select with the same criteria, e.g.:
select *
from table1 inner join table3
on table1.max > table3.value
and see what you get.
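To see that the scalar-subquery form picks exactly one grade per row, here is a minimal sketch on the sample data. It assumes SQLite via Python's sqlite3 module, so LIMIT 1 stands in for TOP 1, and the columns are renamed min_val and max_val (hypothetical names) to avoid clashing with the min and max functions.

```python
import sqlite3

# In-memory SQLite database; min/max values are assumed to be already
# copied from table 2, as the question states.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (name TEXT, grade TEXT, min_val INTEGER, max_val INTEGER);
CREATE TABLE table3 (value INTEGER, grade TEXT);
INSERT INTO table1 VALUES ('a',NULL,1,10),('b',NULL,2,20),('c',NULL,3,30),
                          ('d',NULL,1,10),('e',NULL,2,20);
INSERT INTO table3 VALUES (25,'A'),(15,'B'),(5,'C');

-- For each row, take the grade with the largest threshold below max_val.
-- Exactly one value is chosen per row, so the update is deterministic.
UPDATE table1
SET grade = (SELECT t3.grade
             FROM table3 t3
             WHERE t3.value < table1.max_val
             ORDER BY t3.value DESC
             LIMIT 1);
""")

grades = conn.execute("SELECT name, grade FROM table1 ORDER BY name").fetchall()
print(grades)
```

With the join form, a row such as c matches all three thresholds (25, 15 and 5), and which grade ends up stored depends on the order the engine happens to process the matches.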

Using multiple joins (e.g left join)

I would like to understand the logic behind multiple joins (example below):
SELECT * FROM B returns 100 rows
SELECT B.* FROM B LEFT JOIN C ON B.ID = C.ID returns 120 rows
As far as I know, a left join returns all the rows from the left table, which is B, plus any matching data from the other table. But how come the left join returns more rows than table B itself?
What am I doing wrong or misunderstanding here? Any guidance is much appreciated. Thanks in advance.
Let table B be:
id
----
1
2
3
Let table C be:
id name
------------
1 John
2 Mary
2 Anne
3 Stef
Every id from B is matched with the ids from C, so id=2 will be matched twice. A left join on id will therefore return 4 rows even though base table B has 3 rows.
Now look at a more evil example:
Table B
id
----
1
2
2
3
4
table C
id name
------------
1 John
2 Mary
2 Anne
3 Stef
Every id from B is matched with the ids from C, so the first id=2 will be matched twice and the second id=2 will also be matched twice. The result of
select b.id, c.name
from b left join c on (b.id = c.id)
will be
id name
------------
1 John
2 Mary
2 Mary
2 Anne
2 Anne
3 Stef
4 (null)
The id=4 is not matched but still appears in the result because this is a left join.
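The row counts in the second example can be checked directly. This is a minimal sketch assuming SQLite through Python's sqlite3 module, with the same data as above.

```python
import sqlite3

# In-memory SQLite database with the "more evil" example: id=2 appears
# twice in b and twice in c, and id=4 has no match in c.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE b (id INTEGER);
CREATE TABLE c (id INTEGER, name TEXT);
INSERT INTO b VALUES (1),(2),(2),(3),(4);
INSERT INTO c VALUES (1,'John'),(2,'Mary'),(2,'Anne'),(3,'Stef');
""")

joined = conn.execute(
    "SELECT b.id, c.name FROM b LEFT JOIN c ON b.id = c.id"
).fetchall()

# Number of result rows contributed by each id.
per_id = conn.execute("""
SELECT b.id, COUNT(*)
FROM b LEFT JOIN c ON b.id = c.id
GROUP BY b.id
ORDER BY b.id
""").fetchall()

print(len(joined))  # total rows in the left join
print(per_id)       # rows per id
```

Each id contributes (rows in b with that id) times (matching rows in c), or one row per b row when nothing matches, which is why id=2 accounts for four of the seven rows.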
Look at the following example :
B = {1,2}
C = {(1,a),(1,b),(1,c),(1,d),(1,e)}
The result of B left join C will be :
1 | a
1 | b
1 | c
1 | d
1 | e
2 | null
The number of rows in the result is definitely larger than rows in B (2).
In general, the number of rows in the result of B left join C is at least B.size and can be as large as B.size * C.size (when every row of B matches every row of C), so it is not bounded by B.size as you assumed.
Your query joins table B with C, and since B is the left table, it displays all the records of B along with the related records from C.

comparing rows in sql on two different columns

id address retailer
1 A 11
2 A 11
3 A 11
4 A 12
5 A 13
6 B 12
7 B 12
8 B 13
My output should be
id address retailer
1 A 11
4 A 12
5 A 13
6 B 12
8 B 13
i.e. my query should return the ids of rows that have the same address but not the same retailer.
How can I get this?
Try using a group by clause as below:
select min(id), address, retailer
from tab
group by address, retailer
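A quick check of the GROUP BY approach on the sample data, as a minimal sketch assuming SQLite through Python's sqlite3 module and the table name tab used in the query above:

```python
import sqlite3

# In-memory SQLite database with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (id INTEGER, address TEXT, retailer INTEGER);
INSERT INTO tab VALUES (1,'A',11),(2,'A',11),(3,'A',11),(4,'A',12),
                       (5,'A',13),(6,'B',12),(7,'B',12),(8,'B',13);
""")

# One row per (address, retailer) pair, keeping the smallest id of each pair.
first_ids = conn.execute("""
SELECT MIN(id) AS id, address, retailer
FROM tab
GROUP BY address, retailer
ORDER BY MIN(id)
""").fetchall()

for row in first_ids:
    print(row)
```

Using MIN(id) makes the choice of representative row explicit, so the result does not depend on how the engine orders rows inside a group.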
Assuming you're joining on columns with no duplicates, which is by far the most common case:
An inner join of A and B gives the result of A intersect B, i.e. the inner part of a venn diagram intersection.
An outer join of A and B gives the results of A union B, i.e. the outer parts of a venn diagram union.
Examples:
Suppose you have two Tables, with a single column each, and data as follows:
A B
- -
1 3
2 4
3 5
4 6
Note that (1,2) are unique to A, (3,4) are common, and (5,6) are unique to B.
Inner join:
An inner join using either of the equivalent queries gives the intersection of the two tables, i.e. the two rows they have in common.
select *
from a
INNER JOIN b on a.a = b.b;
select a.*,b.*
from a,b
where a.a = b.b;
a | b
--+--
3 | 3
4 | 4
Left outer join:
A left outer join will give all rows in A, plus any common rows in B.
select *
from a
LEFT OUTER JOIN b on a.a = b.b;
select a.*,b.*
from a,b
where a.a = b.b(+);
a | b
--+-----
1 | null
2 | null
3 | 3
4 | 4
Full outer join:
A full outer join will give you the union of A and B, i.e. All the rows in A and all the rows in B. If something in A doesn't have a corresponding datum in B, then the B portion is null, and vice versa.
select *
from a
FULL OUTER JOIN b on a.a = b.b;
a | b
-----+-----
1 | null
2 | null
3 | 3
4 | 4
null | 6
null | 5
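The three results above can be reproduced with a short script. This is a sketch assuming SQLite through Python's sqlite3 module; because older SQLite versions lack FULL OUTER JOIN, the full join is emulated here with a LEFT JOIN plus the unmatched rows of b, which is a substitute technique and not the query shown above.

```python
import sqlite3

# In-memory SQLite database with the two single-column tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (a INTEGER);
CREATE TABLE b (b INTEGER);
INSERT INTO a VALUES (1),(2),(3),(4);
INSERT INTO b VALUES (3),(4),(5),(6);
""")

inner = conn.execute(
    "SELECT a.a, b.b FROM a INNER JOIN b ON a.a = b.b ORDER BY a.a"
).fetchall()

left = conn.execute(
    "SELECT a.a, b.b FROM a LEFT JOIN b ON a.a = b.b ORDER BY a.a"
).fetchall()

# Emulated full outer join: all rows of a (with matches where they exist),
# plus the rows of b that found no partner in a.
full = conn.execute("""
SELECT a.a, b.b FROM a LEFT JOIN b ON a.a = b.b
UNION ALL
SELECT a.a, b.b FROM b LEFT JOIN a ON a.a = b.b WHERE a.a IS NULL
""").fetchall()

print(inner)
print(left)
print(full)
```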
select min(id) as id,address, retailer
from table1
group by address, retailer
order by id
The query you need is:
SELECT min(id), address, retailer
FROM table1 AS t1
group by address, retailer
order by address
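Use this; it works in MySQL, but only when ONLY_FULL_GROUP_BY is disabled, and the id returned for each group is then arbitrary: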
SELECT * FROM `sampletable` GROUP BY address, retailer