Left Join SQL based on criteria that may be missing

I've got several tables that I want to merge for one big query. The criteria for the search are based on two tables: section_types and sections. I want to pull all section_types and their associated sections where the sections match certain criteria, PLUS any section_types that are active but don't have any sections associated. It seems like a basic LEFT JOIN, but I think that because some of my filter criteria are based on sections, I can't get section_types that have no associated sections.
`section_types`
id | name | active
---+------+-------
1 | a | 1
2 | b | 0
3 | c | 1
`sections`
type | issue | location
-----+-------+----------
1 | 0611 | 1
2 | 0611 | 1
1 | 0511 | 1
Say I want to pull all sections for issue 0611 at location 1, plus any empty section types. Like so:
(edited. see below)
But I'm only getting section_types that have corresponding sections. So in this query, section_types row 3 would not show up. What am I doing wrong?
EDIT:
I'm getting all the section_types now, but not all the sections I need. I guess LEFT JOIN will do that. There can be many sections for each section_type, or none. My query is at this point now:
SELECT * FROM `section_types` st
RIGHT JOIN `sections` s
ON s.type=st.id
AND s.issue='0611'
AND s.location=1
WHERE st.active OR s.issue IS NOT NULL
ORDER BY st.id
which gets me:
id | name | active | issue | location
---+------+--------+-------+---------
1 | a | 1 | 0611 | 1
2 | b | 0 | 0611 | 1
3 | c | 1 | |
but I still need that second type-1 section

EDIT
I deleted this, but based on the conversation, I think it accomplishes what you're looking for.
ORIGINAL
Feels like a hack... but I think it works.
CREATE TABLE #tmp (
id int,
name varchar(50),
active int,
type int,
issue varchar(4), -- kept as text so the leading zero in '0611' survives
location int
)
Insert Into #tmp
SELECT * FROM section_types st
LEFT JOIN sections s
ON st.id=s.type
AND s.issue='0611'
AND s.location=1
WHERE st.active = 1 OR s.issue IS NOT NULL
ORDER BY st.id
Select * FROM #tmp
UNION
Select
*, NULL, NULL, NULL
From
section_types
WHERE
id NOT IN ( SELECT id FROM #tmp)
AND active = 0

Is this what you need?
All section_types and ALL their related sections where at least one section has issue '0611' and location 1, plus all remaining section_types that are active:
SELECT *
FROM section_types st
JOIN sections s
ON s.type = st.id
WHERE EXISTS
( SELECT *
FROM sections s2
WHERE s2.type = st.id
AND s2.issue = '0611'
AND s2.location = 1
)
UNION ALL
SELECT *, NULL, NULL, NULL
FROM section_types st
WHERE st.active
AND NOT EXISTS
( SELECT *
FROM sections s2
WHERE s2.type = st.id
AND s2.issue = '0611'
AND s2.location = 1
)
ORDER BY id

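You just have your tables reversed. A LEFT OUTER JOIN keeps every row of the left table and fills the right table's columns with NULL when no row satisfies the ON condition, so the table whose rows must all appear has to be on the left. Either swap the tables or use a RIGHT OUTER JOIN.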
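The effect described in the question can be reproduced with a small script. This is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the real database (the question does not name one); the tables and rows are the sample data from the question. It contrasts filtering on sections in WHERE with filtering in the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE section_types (id INTEGER, name TEXT, active INTEGER);
CREATE TABLE sections (type INTEGER, issue TEXT, location INTEGER);
INSERT INTO section_types VALUES (1, 'a', 1), (2, 'b', 0), (3, 'c', 1);
INSERT INTO sections VALUES (1, '0611', 1), (2, '0611', 1), (1, '0511', 1);
""")

# Filter in WHERE: the NULL-extended row for type 3 fails the test and is dropped.
where_ids = [row[0] for row in conn.execute("""
    SELECT st.id
    FROM section_types st
    LEFT JOIN sections s ON s.type = st.id
    WHERE s.issue = '0611' AND s.location = 1
    ORDER BY st.id
""")]

# Filter in ON: every section_types row survives the join; the WHERE then
# keeps active types and any type that has a matching section.
on_rows = conn.execute("""
    SELECT st.id, s.issue
    FROM section_types st
    LEFT JOIN sections s
      ON s.type = st.id AND s.issue = '0611' AND s.location = 1
    WHERE st.active = 1 OR s.issue IS NOT NULL
    ORDER BY st.id
""").fetchall()

print(where_ids)
print(on_rows)
```

With the filter in ON, type 3 comes back with NULL in the sections columns, which is the row the question was missing.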

Related

After joining two queries (each having different columns) with UNION I'm getting only one column

I have joined two queries with the UNION keyword (Access 2016). It looks like this:
SELECT ITEM.IName, Sum(STOCK_IN.StockIn) AS SumOfIN
FROM ITEM INNER JOIN STOCK_IN ON ITEM.IName = STOCK_IN.IName
GROUP BY ITEM.IName
UNION SELECT ITEM.IName, Sum(STOCK_OUT.StockOut) AS SumOfOut
FROM ITEM INNER JOIN STOCK_OUT ON ITEM.IName = STOCK_OUT.IName
GROUP BY ITEM.IName
I get the following result:
IName | SumOfIN
----------------
Abis Nig | 3
Abrotanum | 1
Acid Acet | 2
Aconite Nap | 2
Aconite Nap | 3
Antim Crud | 3
Antim Tart | 1
But I want the following result:
IName | SumOfIN | SumOfOut
----------------
Abis Nig | 3 | 0
Abrotanum | 1 | 0
Acid Acet | 2 | 0
Aconite Nap | 2 | 3
Antim Crud | 0 | 3
Antim Tart | 0 | 1
Can anyone tell me what changes I should make here?
You need to add dummy values for the third column where they don't exist in the table you are UNIONing. In addition, you need an overall SELECT/GROUP BY since you can have values for both StockIn and StockOut:
SELECT IName, SUM(SumOfIN), Sum(SumOfOut)
FROM (SELECT ITEM.IName, Sum(STOCK_IN.StockIn) AS SumOfIN, 0 AS SumOfOut
FROM ITEM INNER JOIN STOCK_IN ON ITEM.IName = STOCK_IN.IName
GROUP BY ITEM.IName
UNION ALL
SELECT ITEM.IName, 0, Sum(STOCK_OUT.StockOut)
FROM ITEM INNER JOIN STOCK_OUT ON ITEM.IName = STOCK_OUT.IName
GROUP BY ITEM.IName) s
GROUP BY IName
Note that column names in the result are all taken from the first query in the UNION, so we must name SumOfOut in that query.
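To see the padded UNION ALL plus outer GROUP BY at work, here is a minimal sketch using Python's built-in sqlite3 module instead of Access. The individual stock rows are invented for illustration; they were chosen only so that the totals match the desired result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ITEM (IName TEXT);
CREATE TABLE STOCK_IN (IName TEXT, StockIn INTEGER);
CREATE TABLE STOCK_OUT (IName TEXT, StockOut INTEGER);
INSERT INTO ITEM VALUES ('Abis Nig'), ('Abrotanum'), ('Acid Acet'),
                        ('Aconite Nap'), ('Antim Crud'), ('Antim Tart');
-- Assumed rows: only their per-item sums come from the question.
INSERT INTO STOCK_IN VALUES ('Abis Nig', 1), ('Abis Nig', 2), ('Abrotanum', 1),
                            ('Acid Acet', 2), ('Aconite Nap', 2);
INSERT INTO STOCK_OUT VALUES ('Aconite Nap', 3), ('Antim Crud', 3), ('Antim Tart', 1);
""")

# Each branch pads the column it does not have with 0; the outer query then
# collapses the two rows an item may have into a single row.
totals = conn.execute("""
    SELECT IName, SUM(SumOfIN) AS SumOfIN, SUM(SumOfOut) AS SumOfOut
    FROM (SELECT ITEM.IName, SUM(STOCK_IN.StockIn) AS SumOfIN, 0 AS SumOfOut
          FROM ITEM INNER JOIN STOCK_IN ON ITEM.IName = STOCK_IN.IName
          GROUP BY ITEM.IName
          UNION ALL
          SELECT ITEM.IName, 0, SUM(STOCK_OUT.StockOut)
          FROM ITEM INNER JOIN STOCK_OUT ON ITEM.IName = STOCK_OUT.IName
          GROUP BY ITEM.IName) s
    GROUP BY IName
    ORDER BY IName
""").fetchall()

for row in totals:
    print(row)
```

'Aconite Nap' appears in both branches, which is why the outer SUM is needed to merge its two rows into one.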
You can do this query without UNION at all:
select i.iname, si.sumofin, so.sumofout
from (item as i left join
(select si.iname, sum(si.stockin) as sumofin
from stock_in as si
group by si.iname
) as si
on si.iname = i.iname
) left join
(select so.iname, sum(so.stockout) as sumofout
from stock_out as so
group by so.iname
) as so
on so.iname = i.iname;
This will include items that have no stock in or stock out. That might be a good thing, or a bad thing. If a bad thing, then add:
where si.sumofin > 0 or so.sumofout > 0
If you are going to use union all, then you can dispense with the join to items entirely:
SELECT IName, SUM(SumOfIN), Sum(SumOfOut)
FROM (SELECT si.IName, Sum(si.StockIn) AS SumOfIN, 0 AS SumOfOut
FROM STOCK_IN as si
GROUP BY si.INAME
UNION ALL
SELECT so.IName, 0, Sum(so.StockOut)
FROM STOCK_OUT as so
GROUP BY so.IName
) s
GROUP BY IName;
The JOIN would only be necessary if you had stock items that are not in the items table. That would be a sign of bad data modeling.

Comparing different columns in SQL for each row

After some transformation I have a result from a cross join (of tables a and b) that I want to analyze. The table looks like this:
+-----+------+------+------+------+-----+------+------+------+------+
| id | 10_1 | 10_2 | 11_1 | 11_2 | id | 10_1 | 10_2 | 11_1 | 11_2 |
+-----+------+------+------+------+-----+------+------+------+------+
| 111 | 1 | 0 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
| 111 | 1 | 0 | 1 | 0 | 333 | 0 | 0 | 0 | 0 |
| 111 | 1 | 0 | 1 | 0 | 444 | 1 | 0 | 1 | 1 |
| 112 | 0 | 1 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
+-----+------+------+------+------+-----+------+------+------+------+
The ids in the first column are different from the ids in the sixth column.
Each row always contains two different IDs that are matched with each other. The other columns always hold either 0 or 1.
I am now trying to find out how many values two IDs have in common on average (meaning both have "1" in 10_1, 10_2, etc.), but I don't really know how to do so.
I was trying something like this as a start:
SELECT SUM(CASE WHEN a.10_1 = 1 AND b.10_1 = 1 then 1 end)
But this would obviously only count how often two ids have 10_1 in common. I could write something like this to cover several columns:
SELECT SUM(CASE WHEN (a.10_1 = 1 AND b.10_1 = 1)
OR (a.10_2 = 1 AND b.10_2 = 1) OR [...] then 1 end)
This counts how often two IDs have at least one thing in common, but it does not distinguish pairs that have two or more things in common. I would also like to know how often two IDs have two things, three things, etc. in common.
Another "problem" in my case is that I have ~30 columns to look at, so I can hardly write out every possible combination.
Does anyone know how I can approach my problem in a better way?
Thanks in advance.
Edit:
A possible result could look like this:
+-----------+---------+
| in_common | count |
+-----------+---------+
| 0 | 100 |
| 1 | 500 |
| 2 | 1500 |
| 3 | 5000 |
| 4 | 3000 |
+-----------+---------+
With the codes as column names, you're going to have to write some code that explicitly references each column name. To keep that to a minimum, you could write those references in a single union statement that normalizes the data, such as the following (src is a placeholder for your source table):
select id, '10_1' as code from src where "10_1" = 1
union
select id, '10_2' from src where "10_2" = 1
union
select id, '11_1' from src where "11_1" = 1
union
select id, '11_2' from src where "11_2" = 1;
This needs to be modified to include whatever additional columns you need to link up different IDs. For the purpose of this illustration, I assume the following data model
create table p (
id integer not null primary key,
sex character(1) not null,
age integer not null
);
create table t1 (
id integer not null,
code character varying(4) not null,
constraint pk_t1 primary key (id, code)
);
Though your data evidently does not currently resemble this structure, normalizing your data into a form like this would allow you to apply the following solution to summarize your data in the desired form.
select
in_common,
count(*) as count
from (
select
count(*) as in_common
from (
select
a.id as a_id, a.code,
b.id as b_id, b.code
from
(select p.*, t1.code
from p left join t1 on p.id=t1.id
) as a
inner join (select p.*, t1.code
from p left join t1 on p.id=t1.id
) as b on b.sex <> a.sex and b.age between a.age-10 and a.age+10
where
a.id < b.id
and a.code = b.code
) as c
group by
a_id, b_id
) as summ
group by
in_common;
The proposed solution requires first to take one step back from the cross-join table, as the identical column names are super annoying. Instead, we take the ids from the two tables and put them in a temporary table. The following query gets the result wanted in the question. It assumes table_a and table_b from the question are the same and called tbl, but this assumption is not needed and tbl can be replaced by table_a and table_b in the two sub-SELECT queries. It looks complicated and uses the JSON trick to flatten the columns, but it works here:
WITH idtable AS (
SELECT a.id as id_1, b.id as id_2 FROM
-- put cross join of table a and table b here
)
SELECT in_common,
count(*)
FROM
(SELECT idtable.*,
sum(CASE
WHEN meltedR.value::text=meltedL.value::text THEN 1
ELSE 0
END) AS in_common
FROM idtable
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_a
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedL ON (idtable.id_1 = meltedL.id)
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_b
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedR ON (idtable.id_2 = meltedR.id
AND meltedL.key = meltedR.key)
GROUP BY idtable.id_1,
idtable.id_2) tt
GROUP BY in_common ORDER BY in_common;
The output here looks like this:
in_common | count
-----------+-------
2 | 2
3 | 1
4 | 1
(3 rows)
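As a different route from the two answers above, the per-column comparison can also be generated rather than typed by hand. The following is a minimal sketch, assuming Python's built-in sqlite3 module and the hypothetical table names table_a and table_b; the rows are the ones visible in the question. It builds one expression that adds up a 0/1 test per column, then groups the pairs by that sum.

```python
import sqlite3

# Column names as given in the question; with ~30 columns only this list grows.
cols = ["10_1", "10_2", "11_1", "11_2"]

conn = sqlite3.connect(":memory:")
col_defs = ", ".join(f'"{c}" INTEGER' for c in cols)
conn.execute(f"CREATE TABLE table_a (id INTEGER, {col_defs})")
conn.execute(f"CREATE TABLE table_b (id INTEGER, {col_defs})")
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?, ?)",
                 [(111, 1, 0, 1, 0), (112, 0, 1, 1, 0)])
conn.executemany("INSERT INTO table_b VALUES (?, ?, ?, ?, ?)",
                 [(222, 1, 0, 1, 0), (333, 0, 0, 0, 0), (444, 1, 0, 1, 1)])

# In SQLite a comparison evaluates to 0 or 1, so the tests can simply be added up.
in_common_expr = " + ".join(f'(a."{c}" = 1 AND b."{c}" = 1)' for c in cols)

# How many pairs share 0, 1, 2, ... columns.
histogram = conn.execute(f"""
    SELECT in_common, COUNT(*) AS pair_count
    FROM (SELECT a.id AS id_a, b.id AS id_b, {in_common_expr} AS in_common
          FROM table_a AS a CROSS JOIN table_b AS b) AS pairs
    GROUP BY in_common
    ORDER BY in_common
""").fetchall()

# Average number of shared columns over all pairs.
average = conn.execute(f"""
    SELECT AVG({in_common_expr})
    FROM table_a AS a CROSS JOIN table_b AS b
""").fetchone()[0]

print(histogram)
print(average)
```

The script uses the full cross join of the five ids (six pairs), not only the four rows excerpted in the question. In a database where a comparison does not evaluate to a number, each term would need to be wrapped in CASE WHEN ... THEN 1 ELSE 0 END.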

Hive / SQL - Left join with fallback

In Apache Hive I have two tables that I would like to left-join, keeping all the data from the left table and adding data from the right table where possible.
The join uses two conditions, because it is based on two fields (a material_id and a location_id).
This works fine with a traditional left join:
SELECT
a.*,
b.*
FROM a
LEFT JOIN (some more complex select) b
ON a.material_id=b.material_id
AND a.location_id=b.location_id;
For the location_id the database only contains two distinct values, say 1 and 2.
We now have the requirement that if there is no "perfect match", meaning that the material_id can be joined but the b-table has no row with the same combination of material_id and location_id (e.g. material_id=100 and location_id=1), the join should "default" or "fall back" to the other possible value of the location_id (e.g. material_id=100 and location_id=2), and vice versa. This fallback should apply only to the location_id.
We have already looked into all possible answers, including CASE etc., but to no avail. A setup like
...
ON a.material_id=b.material_id AND a.location_id=
CASE WHEN a.location_id = b.location_id THEN b.location_id ELSE ...;
is something we tried, but we could not figure out how to express it in Hive Query Language.
Thank you for your help! Maybe somebody has a smart idea.
Here is some sample data:
Table a
| material_id | location_id | other_column_a |
| 100 | 1 | 45 |
| 101 | 1 | 45 |
| 103 | 1 | 45 |
| 103 | 2 | 45 |
Table b
| material_id | location_id | other_column_b |
| 100 | 1 | 66 |
| 102 | 1 | 76 |
| 103 | 2 | 88 |
Left - Join Table
| material_id | location_id | other_column_a | other_column_b
| 100 | 1 | 45 | 66
| 101 | 1 | 45 | NULL (mat. not in b)
| 103 | 1 | 45 | DEFAULT TO where location_id=2 (88)
| 103 | 2 | 45 | 88
PS: As stated here exists etc. does not work in the sub-query ON.
The solution is to left join without the a.location_id = b.location_id condition and to number the joined rows in order of preference, then filter by row number. In the code below, the join first produces one row for every pair with a matching material_id. For each row of a (the partition is a.material_id, a.location_id), row_number() then assigns 1 to the b row with the same location_id if one exists; otherwise 1 goes to a b row with a different location_id. b.location_id is added to the ORDER BY so that, when there is no exact match, the row with the lowest b.location_id is preferred.
select * from
(
SELECT
a.*,
b.other_column_b, -- list the b columns you need; a.* plus b.* would duplicate material_id and location_id
row_number() over(partition by a.material_id, a.location_id
order by CASE WHEN a.location_id = b.location_id THEN 1 ELSE 2 END, b.location_id ) as rn
FROM a
LEFT JOIN (some more complex select) b
ON a.material_id=b.material_id
)s
where rn=1
;
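The row_number() approach can be tried out on the sample data from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of Hive (window functions need SQLite 3.25 or newer) and a plain table b in place of the "more complex select". It partitions by both a.material_id and a.location_id so that every row of a keeps exactly one joined row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (material_id INTEGER, location_id INTEGER, other_column_a INTEGER);
CREATE TABLE b (material_id INTEGER, location_id INTEGER, other_column_b INTEGER);
INSERT INTO a VALUES (100, 1, 45), (101, 1, 45), (103, 1, 45), (103, 2, 45);
INSERT INTO b VALUES (100, 1, 66), (102, 1, 76), (103, 2, 88);
""")

rows = conn.execute("""
    SELECT material_id, location_id, other_column_a, other_column_b
    FROM (
        SELECT a.material_id, a.location_id, a.other_column_a, b.other_column_b,
               ROW_NUMBER() OVER (
                   -- one partition per row of a
                   PARTITION BY a.material_id, a.location_id
                   -- exact location match first, then the lowest other location
                   ORDER BY CASE WHEN a.location_id = b.location_id THEN 1 ELSE 2 END,
                            b.location_id
               ) AS rn
        FROM a
        LEFT JOIN b ON a.material_id = b.material_id
    ) s
    WHERE rn = 1
    ORDER BY material_id, location_id
""").fetchall()

for row in rows:
    print(row)
```

Material 103 at location 1 has no exact match in b, so it picks up the value 88 from location 2, while material 101 stays NULL because it does not occur in b at all.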
Maybe this is helpful for somebody in the future:
We also came up with a different approach.
First, we create another table that holds averages from table b per material_id over all (!) locations.
Second, in the join table we create three columns:
c1 - the value where material_id and location_id both match (the result of a left join of table a with table b). This column is NULL if there is no perfect match.
c2 - the value from the averages (fallback) table for this material_id, regardless of the location.
c3 - the "actual value" column, where a CASE statement uses c2 (the average over the locations for the material) when c1 is NULL (there is no perfect match of material and location), and c1 otherwise. This column is used for the further calculations.
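The three columns described above could look roughly like this. It is a minimal sketch, assuming Python's built-in sqlite3 module in place of Hive, a common table expression avg_b in place of the separate averages table, and the sample data from the question; the names avg_b and avg_val are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (material_id INTEGER, location_id INTEGER, other_column_a INTEGER);
CREATE TABLE b (material_id INTEGER, location_id INTEGER, other_column_b INTEGER);
INSERT INTO a VALUES (100, 1, 45), (101, 1, 45), (103, 1, 45), (103, 2, 45);
INSERT INTO b VALUES (100, 1, 66), (102, 1, 76), (103, 2, 88);
""")

rows = conn.execute("""
    WITH avg_b AS (
        -- fallback value: average per material over all locations
        SELECT material_id, AVG(other_column_b) AS avg_val
        FROM b
        GROUP BY material_id
    )
    SELECT a.material_id,
           a.location_id,
           b.other_column_b AS c1,  -- perfect match, NULL if there is none
           f.avg_val AS c2,         -- fallback, regardless of location
           CASE WHEN b.other_column_b IS NULL THEN f.avg_val
                ELSE b.other_column_b END AS c3  -- the value actually used
    FROM a
    LEFT JOIN b ON a.material_id = b.material_id
               AND a.location_id = b.location_id
    LEFT JOIN avg_b f ON a.material_id = f.material_id
    ORDER BY a.material_id, a.location_id
""").fetchall()

for row in rows:
    print(row)
```

Unlike the row_number() variant, this keeps working if location_id ever takes more than two values, at the price of returning an average rather than a value from one specific location.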

How to query 2 tables in sql server with many to many relationship to identify differences

I have two tables with a many-to-many relationship and I am trying to merge the two tables in a select statement. I want to see all of the records from both tables, but match each record from table A to at most one record in table B, so null values are OK.
For example, table A has 20 records that match only 15 records from table B. I want to see all 20 records; the 5 that cannot be matched can show null.
Table 1
Something | Code#
apple | 75
pizza | 75
orange | 6
Ball | 75
green | 4
red | 6
Table 2
date | id#
Feb-15 | 75
Feb-11 | 75
Jan-10 | 6
Apr-08 | 4
The result I need is
Something | Date | Code# | ID#
apple | Feb-15 | 75 | 75
pizza | Feb-11 | 75 | 75
orange | Jan-10 | 6 | 6
Ball | NULL | 75 | NULL
green | Apr-08 | 4 | 4
red | NULL | 6 | NULL
I'm imagining something like this. You want to pair off the rows side by side, but one side is going to have more rows than the other.
select * /* change to whatever you need */
from
(
select *, row_number() over (partition by "code#" order by "something") as rn
from tableA
) as a
full outer join /* sounds like maybe left outer join will work too */
(
select *, row_number() over (partition by "id#" order by "date" desc) as rn
from tableB
) as b
on b."id#" = a."code#" and b.rn = a.rn
Actually I don't know how you're going to get "Ball" to come after "apple" and "pizza" without some other column to sort on. Rows in SQL tables don't have any inherent ordering, so you can't rely on the default listing from select * or assume that the order of insertion is significant.
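The pairing by row number can be tried on the sample data once an explicit sort column exists. This is a minimal sketch, assuming Python's built-in sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or newer), a hypothetical seq column on both tables that records the intended order, and the made-up names table_a, table_b and dt. It uses a left join, which is enough here because every id in table B also occurs in table A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- seq is an assumed column: it supplies the ordering the raw tables lack
CREATE TABLE table_a (seq INTEGER, something TEXT, code INTEGER);
CREATE TABLE table_b (seq INTEGER, dt TEXT, id INTEGER);
INSERT INTO table_a VALUES (1, 'apple', 75), (2, 'pizza', 75), (3, 'orange', 6),
                           (4, 'Ball', 75), (5, 'green', 4), (6, 'red', 6);
INSERT INTO table_b VALUES (1, 'Feb-15', 75), (2, 'Feb-11', 75),
                           (3, 'Jan-10', 6), (4, 'Apr-08', 4);
""")

pairs = conn.execute("""
    SELECT a.something, b.dt, a.code, b.id
    FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY code ORDER BY seq) AS rn
          FROM table_a) AS a
    LEFT JOIN
         (SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq) AS rn
          FROM table_b) AS b
      -- the n-th row for a code is paired with the n-th row for the same id
      ON b.id = a.code AND b.rn = a.rn
    ORDER BY a.seq
""").fetchall()

for row in pairs:
    print(row)
```

'Ball' is the third row for code 75 and 'red' the second row for code 6; table B has no row with those positions, so both come back with NULLs, as in the desired result.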
A regular Left-join should do it for you.
select tableA.*
, tableB.*
from tableA
left join tableB
on tableB.PrimaryKey = tableA.PrimaryKey
We would need to see the table structure to tell you for sure, but essentially you join on the full key (if possible):
SELECT * FROM TABLEA A
JOIN TABLEB B ON
A.FULLKEY = B.FULLKEY
Use a left outer join.
Since the question changed, make that a full outer join:
select table1.*, table2.*
from table1
full outer join table2
on table1.Code# = table2.id#
This is probably not a true many to many but I think this is what you are asking for

Subtracting value from parent table with SUM(value from child table)

I have 2 tables, tblBasicInfo and tblPayment.
Relationship is 1 to many, where tblBasicInfo is on the 1 side, and tblPayment is on the many side.
Relationship is optional and that is the problem.
I need to subtract value of certain field from parent table with sum of certain fields from child table that match certain criteria.
If there are no records in child table that fulfill the criteria then this should be represented with zero ( data from parent table - 0 ).
I apologize if this is not crystal clear, English is not my native and I am not experienced enough to know how to properly describe the problem.
It would be best to demonstrate what I mean with a small example:
We shall start from table schema:
tblBasicInfo: #ID, TotalPrice (double)
tblPayment: #P_ID, $ID, Amount (double), IsPaid (bool)
Here is the content for parent table tblBasicInfo:
ID | TotalPrice
1 | 100
2 | 150
3 | 200
4 | 250
Here is the content for child table tblPayment:
P_ID | ID | IsPaid | Amount
1 | 1 | true | 50
2 | 1 | false | 25
3 | 2 | false | 100
4 | 2 | false | 25
5 | 3 | true | 200
This is what I have accomplished on my own:
SELECT tblBasicInfo.ID,
( tblBasicInfo.TotalPrice - sum(tblPayment.Amount) ) AS [Difference]
FROM tblBasicInfo, tblPayment
WHERE ( tblBasicInfo.ID = tblPayment.ID )
GROUP BY tblBasicInfo.TotalPrice, tblPayment.IsPaid
HAVING ( tblPayment.IsPaid = TRUE ) --this is the criteria I talked above
ORDER BY tblBasicInfo.ID;
This is what I get from the above query:
ID | Difference
1 | 50
3 | 0
.
.
.
I need to get the following result:
ID | Difference
1 | 50
2 | 150 -- does not meet the criteria ( IsPayed = false )
3 | 0
4 | 250 -- no records in child table
.
.
.
I apologize for imperfect title of the question, but I really did not know how to describe this problem.
I tried this on SQL Server, but you can achieve the same in other RDBMSs, and probably in more than one way. Here are two solutions; I found that the first performs better than the second.
SELECT ti.id,MAX(totalprice) - ISNULL(SUM(CASE WHEN is_payed = ((0)) THEN 0 ELSE amount END),0) amount
FROM tblbasicinfo ti LEFT OUTER JOIN tblpayment tp ON ti.id = tp.p_id
GROUP BY ti.id
--OR
SELECT id,totalprice-ISNULL((SELECT SUM(amount)
FROM tblpayment tp
WHERE ti.id = tp.p_id AND is_payed = ((1))
GROUP BY tp.p_id),0) AS reconsile
FROM tblbasicinfo ti
CREATE TABLE tblBasicInfo (id INT IDENTITY(1,1),totalprice MONEY)
CREATE TABLE tblPayment (id INT IDENTITY(1,1), P_ID INT ,is_payed BIT,amount MONEY)
INSERT INTO tblbasicinfo
VALUES(100),(150),(200),(250)
INSERT INTO tblpayment(p_id,is_payed,amount)
VALUES(1,((1)),50),(1,((0)),25),(2,((0)),100),(2,((0)),25),(3,((1)),200)
try this
select a.Id,(a.TotalPrice-coalesce(payment.paid,0)) as Difference from tblBasicInfo a
left join
(
select sum(Amount) as paid,Id
from
tblPayment
where IsPaid =1 -- WHERE has to come before GROUP BY
group by Id)payment
on a.Id=payment.Id
(minor correction - IsPaid rather than IsPayed)
This isn't tested or anything it is just to point you in the right direction hopefully.
You want to use a left join and then check to see if amount is null in your calculation of difference
SELECT
bi.ID,
( bi.TotalPrice - sum(IIF(p.Amount is null,0,p.Amount)) ) AS [Difference]
FROM tblBasicInfo bi
left join tblPayment p
on p.id = bi.id
and p.IsPaid = 1
GROUP BY bi.ID, bi.TotalPrice
ORDER BY bi.ID;
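The left-join idea can be checked against the sample data from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module rather than the asker's database, with IsPaid stored as 1/0 and COALESCE used as the portable way to turn a missing sum into zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblBasicInfo (ID INTEGER PRIMARY KEY, TotalPrice REAL);
CREATE TABLE tblPayment (P_ID INTEGER PRIMARY KEY, ID INTEGER,
                         IsPaid INTEGER, Amount REAL);
INSERT INTO tblBasicInfo VALUES (1, 100), (2, 150), (3, 200), (4, 250);
INSERT INTO tblPayment VALUES (1, 1, 1, 50), (2, 1, 0, 25), (3, 2, 0, 100),
                              (4, 2, 0, 25), (5, 3, 1, 200);
""")

# The IsPaid test sits in the ON clause, so parents without a paid payment
# still produce a row; COALESCE turns their NULL sum into 0.
differences = conn.execute("""
    SELECT bi.ID,
           bi.TotalPrice - COALESCE(SUM(p.Amount), 0) AS Difference
    FROM tblBasicInfo bi
    LEFT JOIN tblPayment p
      ON p.ID = bi.ID AND p.IsPaid = 1
    GROUP BY bi.ID, bi.TotalPrice
    ORDER BY bi.ID
""").fetchall()

for row in differences:
    print(row)
```

Moving the IsPaid test into WHERE or HAVING would bring back the original problem, because IDs 2 and 4 have no paid rows to test.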