SQL Full join on two columns

I'm trying to join two tables on emp_id and scheme_id, but a given emp_id and scheme_id combination may be populated in only one of the tables. If it is populated in either table, I need to return the total pen_ee for that employee for each scheme (further table descriptions below). I can't edit the table structure and have to work with what I have.
I've been trying to use a full join to do this, but I don't know whether you can do a full join on two fields (emp_id and scheme_id) to get the required result.
Table PAYAUDPEN
This is the first two months of the year.
- Employee A has given 44.06 to scheme BMAL.
- Employee B has given 98.06 to scheme BMAL.
- Employee B has given 98.06 to scheme CLFL.
emp_id, period_id, scheme_id, pen_ee
A, 201601, BMAL, 22.03
A, 201602, BMAL, 22.03
B, 201601, BMAL, 98.06
B, 201602, CLFL, 98.06
Table PAYISPEN
This is the third and current month of the year. (The system always puts the current month into this table.)
- Employee A gave 22.03.
- Employee B gave 98.06.
(Note employee B did not contribute to the BMAL scheme again in month 3 which is part of the issue).
emp_id, scheme_id, pen_ee
A, BMAL, 22.03
B, CLFL, 98.06
Required Result
The SQL statement needs to return the 3 periods added together, for each employee for each scheme that they contributed to.
- Employee A would be 44.06 + 22.03=66.09 for scheme BMAL.
- Employee B would be 98.06 + NULL =98.06 for scheme BMAL.
- Employee B would be 98.06 + 98.06=196.12 for scheme CLFL.
A, BMAL, 66.09
B, BMAL, 98.06
B, CLFL, 196.12
To create the basics of the two tables and populate with the example data above run the following queries.
CREATE TABLE [dbo].[payaudpen](
[emp_id] [char](10) NOT NULL,
[period_id] [char](6) NOT NULL,
[scheme_id] [char](10) NOT NULL,
[pen_ee] [numeric](15, 2) NULL)
CREATE TABLE [dbo].[payispen](
[emp_id] [char](10) NOT NULL,
[scheme_id] [char](10) NOT NULL,
[pen_ee] [numeric](15, 2) NULL )
INSERT INTO payaudpen VALUES ('A','201601','BMAL','22.03'), ('A','201602','BMAL','22.03'), ('B','201601','BMAL','98.06'), ('B','201602','CLFL','98.06')
INSERT INTO payispen VALUES ('A','BMAL','22.03'), ('B','CLFL','98.06')
Current statement that I'm using:
SELECT a.emp_id,
a.scheme_id,
SUM(a.pen_ee)+AVG(b.pen_ee)
FROM payaudpen a
FULL JOIN payispen b
ON a.emp_id=b.emp_id
GROUP BY a.scheme_id, a.emp_id
Incorrect result
Doesn't return the correct value for employee B for each scheme.
A, BMAL, 66.09
B, BMAL, 196.12
B, CLFL, 196.12

You are trying to sum across two tables. Use UNION ALL to combine the tables into one relation with more rows, instead of a join, which combines them into a relation with more columns:
WITH all_records AS (SELECT emp_id
, scheme_id
, pen_ee
FROM payispen
UNION ALL
SELECT emp_id
, scheme_id
, pen_ee FROM payaudpen)
SELECT emp_id, scheme_id, SUM(pen_ee)
FROM all_records
GROUP BY emp_id, scheme_id
Results:
emp_id scheme_id (No column name)
A BMAL 66.09
B BMAL 98.06
B CLFL 196.12
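For anyone who wants to try the UNION ALL approach without a SQL Server instance, here is a minimal sketch that replays the question's sample data through Python's built-in sqlite3 module. The column types (TEXT, REAL) are a simplification of the original char/numeric definitions, and the function name total_contributions is just an illustrative choice; the query itself is the one from the answer, with an ORDER BY added so the row order is deterministic.

```python
import sqlite3


def total_contributions():
    """Sum pen_ee per employee and scheme across both tables via UNION ALL."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE payaudpen (emp_id TEXT, period_id TEXT, scheme_id TEXT, pen_ee REAL);
        CREATE TABLE payispen  (emp_id TEXT, scheme_id TEXT, pen_ee REAL);
        INSERT INTO payaudpen VALUES
            ('A','201601','BMAL',22.03), ('A','201602','BMAL',22.03),
            ('B','201601','BMAL',98.06), ('B','201602','CLFL',98.06);
        INSERT INTO payispen VALUES ('A','BMAL',22.03), ('B','CLFL',98.06);
    """)
    rows = con.execute("""
        WITH all_records AS (
            SELECT emp_id, scheme_id, pen_ee FROM payispen
            UNION ALL
            SELECT emp_id, scheme_id, pen_ee FROM payaudpen
        )
        SELECT emp_id, scheme_id, SUM(pen_ee)
        FROM all_records
        GROUP BY emp_id, scheme_id
        ORDER BY emp_id, scheme_id
    """).fetchall()
    con.close()
    # REAL values are binary floats here, so round to 2 decimals for display.
    return [(emp, scheme, round(total, 2)) for emp, scheme, total in rows]


if __name__ == "__main__":
    for row in total_contributions():
        print(row)
```

Because every contribution becomes its own row before grouping, an employee and scheme that appear in only one table need no special NULL handling.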

Apparently you want to join only rows that have both the same emp_id and the same scheme_id. This is possible in outer joins, just as it is in inner joins. I infer that you also want to merge the emp_id and scheme_id columns from the two tables so that when a does not provide them, they come from b, instead. This will do it:
SELECT
COALESCE(a.emp_id, b.emp_id) AS emp_id,
COALESCE(a.scheme_id, b.scheme_id) AS scheme_id,
COALESCE(SUM(a.pen_ee), 0) + COALESCE(AVG(b.pen_ee), 0) AS pen_ee
FROM
payaudpen a
FULL JOIN payispen b
ON a.emp_id = b.emp_id AND a.scheme_id = b.scheme_id
WHERE
COALESCE(a.emp_id, b.emp_id) in ('A','B')
AND (a.period_id IS NULL OR a.period_id in ('201601','201602'))
GROUP BY COALESCE(a.scheme_id, b.scheme_id), COALESCE(a.emp_id, b.emp_id)
Note the use of COALESCE() to handle the cases where table a does not provide emp_id or scheme_id; with SQL Server you could also use ISNULL() in its place. The two aggregates are wrapped in COALESCE() as well, because adding NULL to a number yields NULL, which would otherwise blank out the total for any employee and scheme present in only one table (employee B on scheme BMAL in the sample data). Note also the allowance for a.period_id IS NULL in the WHERE condition: this is necessary (in conjunction with the COALESCE()ing) to include data from b rows that do not have corresponding a rows.
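One detail worth checking when adding aggregates from the two sides of an outer join is how NULL behaves in arithmetic. The sketch below is an illustration only: it uses Python's sqlite3 module and a LEFT JOIN (enough to produce an unmatched row, and supported by every SQLite version) rather than the FULL JOIN above, and it loads just employee B's rows from the question. It compares an unguarded sum with one wrapped in COALESCE().

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE payaudpen (emp_id TEXT, period_id TEXT, scheme_id TEXT, pen_ee REAL);
    CREATE TABLE payispen  (emp_id TEXT, scheme_id TEXT, pen_ee REAL);
    -- Employee B paid into BMAL earlier in the year but only into CLFL this month.
    INSERT INTO payaudpen VALUES ('B','201601','BMAL',98.06);
    INSERT INTO payispen  VALUES ('B','CLFL',98.06);
""")

query = """
    SELECT a.emp_id, a.scheme_id, {expr}
    FROM payaudpen a
    LEFT JOIN payispen b
      ON a.emp_id = b.emp_id AND a.scheme_id = b.scheme_id
    GROUP BY a.emp_id, a.scheme_id
"""

# No matching b row: AVG(b.pen_ee) is NULL, and number + NULL is NULL.
unguarded = con.execute(
    query.format(expr="SUM(a.pen_ee) + AVG(b.pen_ee)")
).fetchall()

# COALESCE replaces the missing side with 0 before the addition.
guarded = con.execute(
    query.format(expr="COALESCE(SUM(a.pen_ee), 0) + COALESCE(AVG(b.pen_ee), 0)")
).fetchall()
con.close()

print(unguarded)  # the total comes back as None (SQL NULL)
print(guarded)    # the total is the 98.06 from payaudpen
```

The same three-valued logic applies in SQL Server, so the guard is a reasonable precaution there too.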

The key thing you need to change in your solution is to use ISNULL() to make sure the key appears when table b has data but not table a. Otherwise, you'll get rows which look like:
NULL | NULL | 98.06
I recommend:
SELECT ISNULL(a.emp_id, b.emp_id) AS emp_id,
ISNULL(a.scheme_id, b.scheme_id) AS scheme_id,
ISNULL(SUM(a.pen_ee), 0) + ISNULL(AVG(b.pen_ee), 0) AS pen_ee
FROM payaudpen a
FULL JOIN payispen b
ON a.emp_id = b.emp_id
AND a.scheme_id = b.scheme_id
WHERE ISNULL(a.emp_id, b.emp_id) IN ('A','B')
AND (a.period_id IS NULL OR a.period_id IN ('201601','201602'))
GROUP BY ISNULL(a.emp_id, b.emp_id), ISNULL(a.scheme_id, b.scheme_id)

Related

Multiple JOINS in SQL with dataset

I have 3 tables that I need to JOIN in SQL.
Table A: Contains Employee ID
Table B: Contains Employee ID AND fica number
Table C: Contains fica number
Table A and B both contain an employee ID column. Table B and C both contain a column that has an employee's fica number. I need to find all of the employees that are present in table C but not in table A. I wanted to do that by joining table B and table C on the fica number, then matching up the employee ID that corresponds to each fica number and joining that with table A. I tried writing the syntax like this:
select distinct(a.employee_id), c.FICA_NBR, b.first_name, b.last_name from benefit A
join B on b.employee_id = a.employee_id
left join C on c.FICA_NBR = b.FICA_NBR
where c.FICA_NBR IS NULL
order by a.employee_id;
How would I go about editing this syntax?
select distinct b.employee_id, c.FICA_NBR, b.first_name, b.last_name from C
join B on c.FICA_NBR = b.FICA_NBR
left join benefit A on b.employee_id = a.employee_id
where A.employee_id IS NULL
order by b.employee_id;
The above query should do it: start from table C, match each fica number to its employee in table B, then left join to table A. When an employee has no row in table A, a.employee_id is NULL, so the WHERE clause keeps exactly those employees. The employee ID is selected from table B because the one from table A is always NULL in the rows that remain.
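As a rough check of the "in C but not in A" pattern, here is a small sketch using Python's sqlite3 module. The table names table_b and table_c, the column types, and the three sample rows are all made up for illustration (the question does not show its data); only benefit and the column names come from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE benefit (employee_id INTEGER);                    -- table A
    CREATE TABLE table_b (employee_id INTEGER, fica_nbr TEXT,
                          first_name TEXT, last_name TEXT);        -- table B
    CREATE TABLE table_c (fica_nbr TEXT);                          -- table C
    INSERT INTO benefit VALUES (1);
    INSERT INTO table_b VALUES (1, '111', 'Ann', 'Lee'), (2, '222', 'Bob', 'Ray');
    INSERT INTO table_c VALUES ('111'), ('222');
""")

# Start from C, resolve the employee through B, then keep only the rows
# where the left join to A found nothing (an anti-join).
missing = con.execute("""
    SELECT DISTINCT b.employee_id, c.fica_nbr, b.first_name, b.last_name
    FROM table_c c
    JOIN table_b b ON c.fica_nbr = b.fica_nbr
    LEFT JOIN benefit a ON b.employee_id = a.employee_id
    WHERE a.employee_id IS NULL
    ORDER BY b.employee_id
""").fetchall()
con.close()

print(missing)  # only employee 2, who is in C (via B) but not in A
```

The IS NULL test has to be on a column from the left-joined table, since that is the side that comes back empty when there is no match.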

Oracle "NOT IN" not returning correct result?

I'm comparing two tables that share unique values, using NOT IN in Oracle, but I'm getting an unexpected result.
select count(distinct CHARGING_ID) from BILLINGDB201908 where CDR_TYPE='GPRSO'
The output is 521254, which is the total number of unique charging IDs in BILLINGDB201908.
Now I want to find the IDs in table BILLINGDB201908 that also exist in table CBS_CHRG_ID_AUG:
select count(distinct CHARGING_ID) from BILLINGDB201908 where CDR_TYPE='GPRSO'
AND charging_id IN (select CHARGINGID from CBS_CHRG_ID_AUG);
-- The result is 315567: that many charging IDs exist in both BILLINGDB201908 and CBS_CHRG_ID_AUG.
Now I want to find the charging IDs that exist in BILLINGDB201908 but do not exist in CBS_CHRG_ID_AUG:
select count(distinct CHARGING_ID) from prmdb.CDR_TAPIN_201908@prmdb where CDR_TYPE='GPRSO'
AND charging_id NOT IN (select CHARGINGID from CBS_CHRG_ID_AUG);
-- The result is 0! I should get exactly 205687, because 521254 - 315567 = 205687.
NOT IN returns no rows if any value from the subquery is NULL. Hence, I strongly, strongly recommend NOT EXISTS:
SELECT count(distinct CHARGING_ID)
FROM prmdb.CDR_TAPIN_201908@prmdb ct
WHERE CDR_TYPE = 'GPRSO' AND
NOT EXISTS (SELECT 1
FROM CBS_CHRG_ID_AUG ccia
WHERE ccia.chargingid = ct.charging_id
);
I also recommend changing your first query to EXISTS. In fact, just don't use IN and NOT IN with subqueries, and you won't have this problem.
The missing records are caused by rows where CHARGINGID is null.
Try running the select once with CHARGINGID is null and once with CHARGINGID is not null to compare.
I would recommend not exists rather than not in; it is null-safe, and usually more efficient:
select count(distinct charging_id)
from billingdb201908 b
where
b.cdr_type = 'GPRSO'
and not exists (select 1 from cbs_chrg_id_aug a where a.chargingid = b.charging_id)
You can get this list using LEFT OUTER JOIN.
SQL to count the charging ids that do not exist in CBS_CHRG_ID_AUG but do exist in BILLINGDB201908:
select count(distinct CHARGING_ID)
from prmdb.CDR_TAPIN_201908@prmdb a
left join CBS_CHRG_ID_AUG b on a.CHARGING_ID = b.CHARGINGID
where a.CDR_TYPE='GPRSO' and b.CHARGINGID is null;
There are two dangers with not in when the subquery key may contain nulls:
1. If there actually is a null value, you may not get the result you were expecting (as you have found). The database is actually correct, even though nobody in the history of SQL has ever expected this result.
2. Even if all key values are populated, if it is possible for the key column to be null (if it is not defined as not null) then the database has to check in case there is a null value, so queries are limited to inefficient row by row filter operations, which can perform disastrously for large volumes. (This was true historically, although these days there is a Null-aware anti-join and so the performance issue may not be so disastrous.)
create table demo (id) as select 1 from dual;
select * from demo;
ID
----------
1
create table table_with_nulls (id) as (
select 2 from dual union all
select null from dual
);
select * from table_with_nulls;
ID
----------
2
select d.id
from demo d
where d.id not in
( select id from table_with_nulls );
no rows selected
select d.id
from demo d
where d.id not in
( select id from table_with_nulls
where id is not null );
ID
----------
1
The reason is that 1 <> null is null, not false. If you substitute a fixed list for the not in subquery, it would be:
select d.id
from demo d
where d.id not in (2, null);
which is really the same thing as
select d.id
from demo d
where d.id <> 2 and d.id <> null;
Obviously d.id <> null will never be true. This is why your not in query returned no rows.
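The same demo can be replayed outside Oracle. The following sketch uses Python's sqlite3 module, which applies the same three-valued logic for NOT IN, and adds a NOT EXISTS query for comparison. The INTEGER column types and the plain CREATE TABLE / INSERT statements are substitutions for the Oracle create table ... as select syntax above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE demo (id INTEGER);
    CREATE TABLE table_with_nulls (id INTEGER);
    INSERT INTO demo VALUES (1);
    INSERT INTO table_with_nulls VALUES (2), (NULL);
""")

# 1 NOT IN (2, NULL) evaluates to NULL, not true, so the row is filtered out.
not_in = con.execute("""
    SELECT d.id FROM demo d
    WHERE d.id NOT IN (SELECT id FROM table_with_nulls)
""").fetchall()

# NOT EXISTS only asks whether a matching row exists; the NULL row never
# matches t.id = d.id, so it cannot poison the result.
not_exists = con.execute("""
    SELECT d.id FROM demo d
    WHERE NOT EXISTS (SELECT 1 FROM table_with_nulls t WHERE t.id = d.id)
""").fetchall()
con.close()

print(not_in)      # []
print(not_exists)  # [(1,)]
```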

SQL query to select records WHERE NOT EXISTS

I was having some problems writing a SQL query to filter out certain data. My table design is that 1 ward can have many beds, and 1 bed can have many enrollments.
My ward table has w_id as PK, the bed table has b_id as PK and w_id as FK, and the enrollment table has e_id as PK and b_id as FK.
What I am trying to do now is get the list of beds, together with their ward details, that do not exist in the enrollment table. I tried this SQL query in an Oracle database:
SELECT * FROM bed b
INNER JOIN ward w ON b.WARD_ID = w.ID
WHERE NOT EXISTS ( SELECT * FROM bed b2
INNER JOIN enroll e ON e.BED_ID = b2.ID
WHERE b2.ID = b.ID );
It did manage to return the desired result. However, when I put the above query into Spring Boot as a native query, I get this error message:
Encountered a duplicated sql alias [ID] during auto-discovery of a native-sql query; nested exception is org.hibernate.loader.custom.NonUniqueDiscoveredSqlAliasException: Encountered a duplicated sql alias [ID] during auto-discovery of a native-sql query
Any ideas? Thanks!
SELECT * FROM bed b
INNER JOIN ward w ON b.WARD_ID = w.ID
It appears the bed and ward tables both have columns named id. By doing select * you are implicitly including all the columns from both the bed and the ward table, so you are including two columns named id. The not exists part is a distraction. Some SQL clients will allow this, but Hibernate is stricter. I don't have an environment to test on right now, but something like the following should fix it, if this is the issue.
SELECT b.id, b.ward_id, w.ward_name FROM bed b
INNER JOIN ward w ON b.WARD_ID = w.ID
WHERE NOT EXISTS ( SELECT b2.id FROM bed b2
INNER JOIN enroll e ON e.BED_ID = b2.ID
WHERE b2.ID = b.ID );
I don't know if this is related to your problem, but you don't need the JOIN in the subquery. A simpler version is:
SELECT *
FROM bed b INNER JOIN
ward w
ON b.WARD_ID = w.ID
WHERE NOT EXISTS (SELECT 1
FROM enroll e
WHERE e.BED_ID = b.ID
);
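The duplicate-name effect is easy to see without Hibernate. This sketch uses Python's sqlite3 module purely to list the column names each query returns; it does not reproduce the Hibernate error itself. The schema (including the ward_name column and the sample rows) is assumed for illustration, following the id / WARD_ID / BED_ID names used in the queries above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ward   (id INTEGER PRIMARY KEY, ward_name TEXT);
    CREATE TABLE bed    (id INTEGER PRIMARY KEY, ward_id INTEGER REFERENCES ward(id));
    CREATE TABLE enroll (id INTEGER PRIMARY KEY, bed_id INTEGER REFERENCES bed(id));
    INSERT INTO ward VALUES (1, 'North');
    INSERT INTO bed VALUES (10, 1), (11, 1);
    INSERT INTO enroll VALUES (100, 10);   -- bed 10 is taken, bed 11 is free
""")

# SELECT * over the join exposes both bed.id and ward.id under the name "id".
star = con.execute("SELECT * FROM bed b INNER JOIN ward w ON b.ward_id = w.id")
star_names = [col[0] for col in star.description]

# Listing the columns and giving each a distinct alias removes the clash.
explicit = con.execute("""
    SELECT b.id AS bed_id, w.id AS ward_id, w.ward_name
    FROM bed b
    INNER JOIN ward w ON b.ward_id = w.id
    WHERE NOT EXISTS (SELECT 1 FROM enroll e WHERE e.bed_id = b.id)
""")
explicit_names = [col[0] for col in explicit.description]
free_beds = explicit.fetchall()
con.close()

print(star_names)      # "id" appears twice
print(explicit_names)  # every name is unique
print(free_beds)       # only the bed with no enrollment
```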

SQL Inner join in a nested select statement

I'm trying to do an inner join in a nested select statement. There are first and last reason IDs that each hold a number (e.g. 200). In another table, there are definitions for the IDs. I'm trying to pull the last ID along with its corresponding comment (e.g. 200 - Patient Cancelled), then the first ID along with its comment.
This is what I have so far:
Select BUSN_ID
AREA_NAME
DATE
AREA_STATUS
(Select B.REASON_ID
A.LAST_REASON_ID
FROM BUSN_INFO A, BUSN_REASONS B
WHERE A.LAST_REASON _ID=B.REASON_ID,
(Select B.REASON_ID
A. FIRST_REASON_ID
FROM BUSN_INFO A, BUSN_REASONS B
WHERE A_FIRST_REASON_ID = B.REASON_ID)
FROM BUSN_INFO
I believe an inner join is best, but I'm stuck on how it would actually work.
Required result would look like (this is example dummy data):
First ID -- Busn Reason -- Last ID -- Busn Reason
1 Patient Sick 2 Patient Cancelled
2 Patient Cancelled 2 Patient Cancelled
3 Patient No Show 1 Patient Sick
Justin_Cave's SECOND example is the way I used to solve this problem.
If you want to use inline select statements, your inline select has to select a single column and should just join back to the table that is the basis of your query. In the query you posted, you're selecting the same numeric identifier multiple times. My guess is that you really want to query a string column from the lookup table-- I'll assume that the column is called reason_description
Select BUSN_ID,
AREA_NAME,
DATE,
AREA_STATUS,
a.last_reason_id,
(Select B.REASON_description
FROM BUSN_REASONS B
WHERE A.LAST_REASON_ID=B.REASON_ID),
a.first_reason_id,
(Select B.REASON_description
FROM BUSN_REASONS B
WHERE A.FIRST_REASON_ID = B.REASON_ID)
FROM BUSN_INFO A
More conventionally, though, you'd just join to the busn_reasons table twice
SELECT i.busn_id,
i.area_name,
i.date,
i.area_status,
i.last_reason_id,
last_reason.reason_description,
i.first_reason_id,
first_reason.reason_description
FROM busn_info i
JOIN busn_reasons first_reason
ON( i.first_reason_id = first_reason.reason_id )
JOIN busn_reasons last_reason
ON( i.last_reason_id = last_reason.reason_id )
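To check the double join against the dummy data from the question, here is a sketch using Python's sqlite3 module. It keeps the answer's assumed column name reason_description, trims busn_info down to the columns that matter, and assumes busn_id values 1 to 3 for the three sample rows; the column aliases first_reason and last_reason are added so the two descriptions can be told apart.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE busn_reasons (reason_id INTEGER PRIMARY KEY, reason_description TEXT);
    CREATE TABLE busn_info (busn_id INTEGER, first_reason_id INTEGER, last_reason_id INTEGER);
    INSERT INTO busn_reasons VALUES
        (1, 'Patient Sick'), (2, 'Patient Cancelled'), (3, 'Patient No Show');
    INSERT INTO busn_info VALUES (1, 1, 2), (2, 2, 2), (3, 3, 1);
""")

# The lookup table is joined twice under two aliases, one per reason column.
rows = con.execute("""
    SELECT i.first_reason_id, first_reason.reason_description AS first_reason,
           i.last_reason_id,  last_reason.reason_description  AS last_reason
    FROM busn_info i
    JOIN busn_reasons first_reason ON i.first_reason_id = first_reason.reason_id
    JOIN busn_reasons last_reason  ON i.last_reason_id  = last_reason.reason_id
    ORDER BY i.busn_id
""").fetchall()
con.close()

for row in rows:
    print(row)
```

If either reason ID can be missing, LEFT JOIN in place of JOIN would keep those busn_info rows with a NULL description.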

How to have Counts in a Join query

I have to create a join between two tables on a column and display the counts for the values on which they are joined.
For example, here 'business' is the key on which I want to join.
The first query is
select
[business], count(*) as total from dimhexpand group by [business]
and I get a result as:
DA 54100
Dual 6909
ECM 1508
Flex 15481
Another query is :
select business, count (*) from LODG
group by business order by business
the result of the query is :
DA 100
Dual 909
ECM 508
Flex 15481
I want to return the data by joining these two tables to show something like
**dimhexpand.business dimhexpand.count LODG.Count**
DA 54100 100
Dual 6909 909
ECM 1508 508
Flex 15481 15481
You can JOIN the two tables on the business column:
select d.business,
count(d.business) as dimCount,
l.lodgCount
from dimhexpand d
left join
(
select business, count (*) lodgCount
from LODG
group by business
) l
on d.business = l.business
group by d.business, l.lodgCount;
If you might have different business values in each table, then you can use a FULL OUTER JOIN between both queries, similar to this:
select coalesce(d.business, l.business),
coalesce(d.dimCount, 0) dimCount,
coalesce(l.lodgCount, 0) lodgCount
from
(
select business, count(*) as dimCount
from dimhexpand
group by business
) d
full outer join
(
select business, count (*) lodgCount
from LODG
group by business
) l
on d.business = l.business
A few questions need answering before an accurate answer is possible. Can a business exist in dimhexpand and not in LODG, or vice versa?
The subquery answer provided above will work if every business always appears in both tables. If not, you will lose values from one table or the other unless you use a full join.
If a business can be unique to either table, can you work with table variables?
Declare @tblDimhExpand TABLE (
business varchar(50) null,
CountDimHExpand int null
)
Declare @tblLoDG TABLE (
business varchar(50) null,
CountLodG int null
)
Insert into @tblDimhExpand select business, count(*) from DimhExpand Group By Business
Insert into @tblLoDG select business, count(*) from LODG Group By Business
Select coalesce(dim.business, lodg.business) as Business, dim.countDimhExpand, lodg.countlodg
From @tblDimhExpand dim
FULL JOIN @tblLodG lodg on dim.Business = lodg.Business
This will return all business records from both tables regardless of whether they are present on the other table, and will join the results when present in both, and give NULL for the table missing that value when they are present in only one.
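For a portable way to try the side-by-side counts, the sketch below uses Python's sqlite3 module. Note that it swaps in a different technique from the answers above: rather than a FULL JOIN (which SQLite only supports in newer versions), it stacks the two tables with UNION ALL, tags each row with its source, and counts per source. The src tags and the handful of sample rows are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dimhexpand (business TEXT);
    CREATE TABLE lodg (business TEXT);
    -- ECM exists only in dimhexpand, Flex only in lodg.
    INSERT INTO dimhexpand VALUES ('DA'), ('DA'), ('ECM');
    INSERT INTO lodg VALUES ('DA'), ('Flex'), ('Flex');
""")

rows = con.execute("""
    SELECT business,
           SUM(CASE WHEN src = 'dim'  THEN 1 ELSE 0 END) AS dimCount,
           SUM(CASE WHEN src = 'lodg' THEN 1 ELSE 0 END) AS lodgCount
    FROM (
        SELECT business, 'dim'  AS src FROM dimhexpand
        UNION ALL
        SELECT business, 'lodg' AS src FROM lodg
    )
    GROUP BY business
    ORDER BY business
""").fetchall()
con.close()

print(rows)  # [('DA', 2, 1), ('ECM', 1, 0), ('Flex', 0, 2)]
```

Unlike the table-variable version, this reports 0 rather than NULL for a business that is missing from one table, which matches the coalesce(..., 0) variant of the full outer join.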