pad database out with NULL criteria - sql

If I have the following sample table (order by ID)
ID Date Type
-- ---- ----
1 01/01/2000 A
2 22/04/1995 A
2 14/02/2001 B
Where you can immediate see that ID=1 does not have a Type=B, but ID=2 does. What I want to do, if fill in a line to show this:
ID Date Type
-- ---- ----
1 01/01/2000 A
1 NULL B
2 22/04/1995 A
2 14/02/2001 B
where there could potentially be 100's of different types, (so may need to end up inserting 100's rows per person if they lack 100's Types!)
Is there a general solution to do this?
Could I possibly outer join the table on itself and do it that way?

You can do this with a cross join to generate all the rows and a left join to get the actual data values:
select i.id, s.date, t.type
from (select distinct id from sample) i cross join
(select distinct type from sample) t left join
sample s
on s.id = i.id and
s.type = t.type;

Related

How to join three tables and set blank fields to null?

I used full join and left join to join the Person, Tasks and Task tables. The result shown on the screen resulted in a number of lines greater than six. Unset fields have the value NULL and that's good.
The expected output can be obtained using such joins, however it is necessary to use clauses that allow the common fields to be joined. How can I do this?
A CROSS JOIN produces all the combinations you want. Then a simple outer join can retrieve the related rows (should they exist).
You don't mention the database you are using so a faily standard query will do. For example (in PostgreSQL):
select
row_number() over(order by p.id, t.id) as id,
p.name,
case when x.st is not null then t.hr end,
x.st
from person p
cross join tasks t
left join task x on x.personid_fk = p.id and x.taskid_fk = t.id
order by p.id, t.id;
Result:
id name case st
--- ----- ------ -----
1 Anna null null
2 Anna null null
3 Luo 13:00 true
4 Luo 14:00 false
5 John null null
6 John null null
See running example at DB Fiddle.

Optimize a complex PostgreSQL Query

I am attempting to make a complex SQL join on several tables: as shown below. I have included an image of the dB schema also.
Consider table_1 -
e_id name
1 a
2 b
3 c
4 d
and table_2 -
e_id date
1 1/1/2019
1 1/1/2020
2 2/1/2019
4 2/1/2019
The issue here is performance. From the tables 2 - 4 we only want the most recent entry for a given e_id but because these tables contain historical data (~ >3.5M rows) it's quite slow. I've attached an example of how we're currently trying to achieve this but it only includes one join of 'table_1' with 'table_x'. We group by e_id and get the max date for it. The other way we've thought about doing this is creating a Materialized View and pulling data from that and refreshing it after some period of time. Any improvements welcome.
from fds.region as rg
inner join (
select e_id, name, p_id
from fds.table_1
where sec_type = 'S' AND active_flag = 1
) as table_1 on table_1.e_id = rg.e_id
inner join fds.table_2 table_2 on table_2.e_id = rg.e_id
inner join fds.sec sec on sec.p_id = table_1.p_id
inner join fds.entity ent on ent.int_entity_id = sec.int_entity_id
inner join (
SELECT int_1.e_id, int_1.date, int_1.int_price
FROM fds.table_4 int_1
INNER JOIN (
SELECT e_id, MAX(date) date
FROM fds.table_2
GROUP BY e_id
) int_2 ON int_1.e_id = int_2.fsym_id AND int_1.date = int_2.date
) as table_4 on table_4.e_id = rg.e_id
where rg.region_str like '%US' and ent.sec_type = 'P'
order by table_2.int_price
limit 500;
You can simplify this logic:
(
SELECT int_1.e_id, int_1.date, int_1.int_price
FROM fds.table_4 int_1
INNER JOIN (
SELECT e_id, MAX(date) date
FROM fds.table_2
GROUP BY e_id
) int_2 ON int_1.e_id = int_2.fsym_id AND int_1.date = int_2.date
) as table_4
To:
(SELECT DISTINCT ON (int_1.e_id) int_1.*
FROM fds.table_4 int_1
ORDER BY int_1.e_id, int_1.date DESC
) table_4
This can take advantage of an index on fds.table_4(e_id, date desc) -- and might be wicked fast with such an index.
You also want appropriate indexes for the joins and filtering. However, it is hard to be more specific without an execution plan.

JOIN query, SQL Server is dropping some rows of my first table

I have two tables customer_details and address_details. I want to display customer details with their corresponding address, so I was using a LEFT JOIN, but when I'm executing this query, SQL Server drops rows where street_no of customer_details table doesn't match with the street_no in address_detials table and displays only rows where `street_no' of customer_detials = street_no of address_details table. I need to display a complete customer_details table and in case if street_no doesn't matches it should display empty string or anything. Am I doing anything wrong in my SQL join?
Table customer_details:
case_id customer_name mob_no street_no
-------------------------------------------------
1 John 242342343 4324234234234
1 Rohan 343233333 43332
1 Ankit 234234233 2342332423433
1 Suresh 234234324 2342342342342
1 Ranjeet 343424323 32233
1 Ramu 234234333 2342342342343
Table address_details:
s_no streen_no address city case_id
------------------------------------------------------
1 4324234234234 Roni road Delhi 1
2 2342332423433 Natan street Lucknow 1
3 2342342342342 Koliko road Herdoi 1
SQL JOIN query:
select
a.*, b.address
from
customer_details a
left join
address_details b on a.street_no = b.street_no
where
b.case_id = 1
Now that it became clear that you used b.case_id=1, I will explain why it filters:
The LEFT JOIN itself returns some rows that contain all NULL values for table b in the result set, which is what you want and expect.
But by using WHERE b.case_id=1, the rows containing NULL values for table b are filtered out because none of them matches the condition (all those rows have b.case_id=NULL so they don't match).
It might work to instead use WHERE a.case_id=1, but we don't know if a.case_id and b.case_id are always the same value for matching rows (they might not be; and if they are always the same, then we just identified a potential redundancy).
There are two ways to fix this for sure.
(1) Move b.case_id = 1 into the left join condition:
left join address_details b on a.street_no = b.street_no and b.case_id = 1
(2) Keep b.case_id = 1 in the WHERE but also allow for NULLED-out b values:
left join address_details b on a.street_no = b.street_no
where b.case_id = 1
or b.street_no IS NULL
Personally I'd go for (1) because that is the most clear way to express that you want to filter b on two conditions, without affecting the rows of a that are being returned.
I do think that Wilhelm Poggenpohl answer is kind of right. You just need to change the last join condition a.case_id=1 to b.case_id=1
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and b.case_id=1
This query will show every row from customer_details and the corresponding adress if there is a match of street_no and the adress meets the condition case_id=1.
This is because of the where clause. Try this:
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and a.case_id=1

SQL query construction: checking if query result is subset of another

Hi Guys I have a table relation which works like this (legacy)
A has many B and B has many C;
A has many C as well
Now I am having trouble coming up with a SQL which will help me to get all B (Id of B to make it simple) mapped to certain A(by Id) AND any B which has a collection of C that's a subset of Cs of that A.
I have failed to come up with a decent sql specially for the second part and was wondering if I can get any tips / suggestions re how I can do that.
Thanks
EDIT:
Table A
Id |..
------------
1 |..
Table B
Id |..
--------------
2 |..
Table A_B_rel
A_id | B_id
-----------------
1 | 2
C is a strange table. The data of C (single column) is actually just duped in 2 rel table for A and B. so its like this
Table B_C_Table
B_Id| C_Value
-----------------
2 | 'Somevalue'
Table A_C_Table
A_Id| C_Value
-------------
1 | 'SomeValue'
So I am looking for Bs the C_Values of which are subset of certain A_C_Values.
Yes, the second part of your problem is a bit tricky. We've got B_C_Table on the one hand, and a subset of A_C_Table where A_ID is a specific ID, on the other.
Now, if we use an outer join, we'll be able to see which rows in B_C_Table have no match in A_C_Table:
SELECT *
FROM B_C_Table bc
LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = #A_ID
Note that it is important to put the ac.A_ID = #A_ID into the ON clause rather than into WHERE, because in the latter case we would be filtering out non-matching rows of #A_ID, which is not what we want.
The next step (to achieving the final query) would be to group rows by B and count rows. Now, we will calculate both the total number of rows and the number of matching rows.
SELECT
bc.B_ID,
COUNT(*) AS TotalCount,
COUNT(ac.A_ID) AS MatchCount
FROM B_C_Table bc
LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = #A_ID
GROUP BY bc.B_ID
As you can see, to count matches, we simply count ac.A_ID values: in case of no match the corresponding column will be NULL and thus not counted. And if indeed some rows in B_C_Table do not match any rows in the subset of A_C_Table, we will see different values of TotalCount and MatchCount.
And that logically leads us towards the final step: comparing those counts. (For, obviously, if we can obtain values, we can also compare them.) But not in the WHERE clause, of course, because aggregate functions aren't allowed in WHERE. It's the HAVING clause that is used to compare values of grouped rows, including aggregated values too. So...
SELECT
bc.B_ID,
COUNT(*) AS TotalCount,
COUNT(ac.A_ID) AS MatchCount
FROM B_C_Table bc
LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = #A_ID
GROUP BY bc.B_ID
HAVING COUNT(*) = COUNT(ac.A_ID)
The count values aren't really needed, of course, and when you drop them you will be able to UNION the above query with the one selecting B_ID from A_B_rel:
SELECT B_ID
FROM A_B_rel
WHERE A_ID = #A_ID
UNION
SELECT bc.B_ID
FROM B_C_Table bc
LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = #A_ID
GROUP BY bc.B_ID
HAVING COUNT(*) = COUNT(ac.A_ID)
Sounds like you need to think in terms of double negation, i.e. there should not exist any B_C that does not have a matching A_C (and I'm guessing there should be at least one B_C).
So, try something like
select B.B_id
from Table_B B
where exists (select 1 from B_C_Table BC
where BC.B_id = B.B_id)
and not exists (select 1 from B_C_Table BC
where BC.B_id = B.B_id
and not exists(select 1 from B_C_Table AC
join A_B_Rel ABR on AC.A_id = ABR.A_id
where ABR.B_id = B.B_id
and BC.C_Value = AC.C_Value))
Perhaps this is what you're looking for:
SELECT B_id
FROM A_B_rel
WHERE A_id = <A ID>
UNION
SELECT a.B_Id
FROM B_C_Table a
LEFT JOIN A_C_Table b ON a.C_Value = b.C_Value AND b.A_Id = <A ID>
GROUP BY a.B_Id
HAVING COUNT(CASE WHEN b.A_Id IS NULL THEN 1 END) = 0
The first SELECT gets all B's which are mapped to a particular A (<A ID> being the input parameter for the A ID), then we tack onto that result set any additional B's whose entire set of C_Value's are within the subset of the C_Value's of the particular A (again, <A ID> being the input parameter).

Left Outer Join with one result per match and "priority" of match set by a different field in match SQL Server 2005

I am trying to get a single result (PHONE) per match (CONTACT_ID) in a Left Outer Join. I imagine that there is a way to accomplish this with the preference (or order) being set by another column/field- the phone type (TYPE), but I haven't been able to figure it out. Below is a list of facts to help better explain what I am trying to accomplish and then following is an example Table A and B with the desired result. I've looked at min() and group by, but I don't know how to make those work here. As a side note, after this is working, I will be joining it to more tables to the left of it in a simpler fashion.
The student can have an unlimited number of CONTACT_ID.
A contact does not always have all phone types.
The preferred order of phone types (TYPE) is C,H,W (which, fortunately, happens to be alphabetical)
ignore match and go to the next in priority if PHONE is null
TableA:
STUDENT_ID CONTACT_ID
---------- ----------
X 1
X 2
Y 3
Y 4
TableB:
CONTACT_ID TYPE PHONE
---------- ---- -----
1 H 21
1 C
1 W 44
2 H 78
2 C 92
2 W 11
Desired Result:
STUDENT_ID CONTACT_ID TYPE PHONE
---------- ---------- ---- -----
X 1 H 21
X 2 C 92
Y 3
Y 4
Here is the query that I have that will make a join with all phone matches (minus all of my crazy attempts at getting what I want).
SELECT *
FROM Table TableA T1
LEFT OUTER JOIN TableB T2 ON T1.CONTACT_ID = T2.CONTACT_ID
All help greatly appreciated!
Edited code from Stefan Onofrei's solution:
(results in some duplicate entries)
SELECT
T1.STUDENT_ID,
T1.CONTACT_ID,
T2.PHONE_TYPE,
T3.PHONE
FROM REG_STU_CONTACT T1
INNER JOIN
(SELECT MIN(PHONE_TYPE) AS PHONE_TYPE, CONTACT_ID
FROM REG_CONTACT_PHONE
WHERE PHONE IS NOT NULL
GROUP BY CONTACT_ID) T2 ON T1.CONTACT_ID = T2.CONTACT_ID
INNER JOIN REG_CONTACT_PHONE T3 ON T2.CONTACT_ID = T3.CONTACT_ID AND T2.PHONE_TYPE = T3.PHONE_TYPE
ORDER BY T1.STUDENT_ID
Select A.STUDENT_ID A.CONTACT_ID B.TYPE c.PHONE
from TableA A
inner join
(select MIN(type ) as type, Contact_ID
from Tableb
where phone is not null
group by contactid) B
on A.contactid = b.contactid
inner join Tableb C
on B.contactid = c.conatctid and b.type = c.type