SQL Full Outer Join without duplicates with like condition - sql

Below are two tables.
I want to find and join records between the two tables where the amounts match within -1 or +1.
But my query also returns duplicate records.
How can I get the result without duplicate records?
drop table if exists #A
CREATE TABLE #A(ID float, Category varchar(10), Amount float )
insert into #A values
(1,'A',150.4),
(2,'A',151.0),
(3,'A',149.8),
(4,'A',165.0),
(5,'A',165.0)
drop table if exists #B
CREATE TABLE #B(BID float, BCategory varchar(10), BAmount float )
insert into #B values
(95,'A',151),
(101,'A',150),
(115,'A',165.0),
(118,'A',165.0)
I have tried the following query, which returns duplicates.
select *
from
(select
ID, category, Amount,
row_number() over (partition by category, Amount order by category, Amount) as Sr
from #A) A
full outer join
(select
BID, Bcategory, BAmount,
row_number() over (partition by Bcategory, BAmount order by Bcategory, BAmount) as Sr
from #B) B on a.category = b.bCategory
and a.amount between b.bamount - 1 and b.BAmount + 1
and a.sr = b.sr
Logic: how the user currently does the mapping manually:
First, try an exact match (category and amount) against the other table.
Then try to match whatever is left on category and on amount within +/- 1 of TableB.
So both 149.8 and 150.4 of A could join with 150 of B. Since only one row (150) is left in TableB, and 151 is already assigned to 151 by the exact match, one record of A will be joined to null.
Let's say that, since 150.4 appears first in TableA, it goes with 150 of TableB.
Then 149.8 remains unmatched. In practice the user does not mind which of the two is matched. What matters is that one of the rows (150.4 or 149.8) is matched to null. The problem with a left or full outer join is that 150 of B is assigned to both (149.8 and 150.4).

If you want unique records, you need a mapped relation between the tables, for example a column in #B that stores the ID of the matching #A row.
Table Definition
drop table if exists #A
CREATE TABLE #A(ID float, Category varchar(10), Amount float )
insert into #A values
(1,'A',150.4),
(2,'A',151.0),
(3,'A',149.8),
(4,'A',165.0),
(5,'A',165.0)
drop table if exists #B
CREATE TABLE #B(AID float,BID float, BCategory varchar(10), BAmount float )
insert into #B values
(2,95,'A',151),
(1,101,'A',150),
(4,115,'A',165.0),
(5,118,'A',165.0)
Query
SELECT * FROM #A A
LEFT JOIN #B B ON A.ID = B.AID
Other way
select *
from
(select
ID, category, Amount,
row_number() over (partition by category, Amount order by category, Amount) as Sr
from #A) A
full outer join
(select
AID,BID, Bcategory, BAmount,
row_number() over (partition by Bcategory, BAmount order by Bcategory, BAmount) as Sr
from #B) B on a.category = b.bCategory
and a.amount between b.bamount - 1 and b.BAmount + 1
and a.sr = b.sr AND A.ID = B.AID
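The manual logic described in the question (exact match first, then +/- 1 on whatever is left, each row used at most once) can also be sketched without adding a mapping column. This is only a sketch in T-SQL against the #A and #B tables from the question; #Exact and #Near are hypothetical helper tables, and the second pass keeps a pair only when it is the closest remaining candidate for both sides, which is one possible tie-breaking rule, not the only one.

```sql
-- Sketch only: #Exact and #Near are hypothetical helper tables, not from the thread.
DROP TABLE IF EXISTS #Exact;
DROP TABLE IF EXISTS #Near;

-- Pass 1: exact (category, amount) matches, paired one-to-one by numbering duplicates.
WITH A1 AS (
    SELECT ID, Category, Amount,
           ROW_NUMBER() OVER (PARTITION BY Category, Amount ORDER BY ID) AS Sr
    FROM #A
), B1 AS (
    SELECT BID, BCategory, BAmount,
           ROW_NUMBER() OVER (PARTITION BY BCategory, BAmount ORDER BY BID) AS Sr
    FROM #B
)
SELECT a.ID, b.BID
INTO #Exact
FROM A1 a
JOIN B1 b
  ON a.Category = b.BCategory AND a.Amount = b.BAmount AND a.Sr = b.Sr;

-- Pass 2: among the rows left over, list candidate pairs within +/- 1 and keep a pair
-- only when it is the closest candidate for both its A row and its B row.
WITH Cand AS (
    SELECT a.ID, b.BID,
           ROW_NUMBER() OVER (PARTITION BY a.ID  ORDER BY ABS(a.Amount - b.BAmount), b.BID) AS RnA,
           ROW_NUMBER() OVER (PARTITION BY b.BID ORDER BY ABS(a.Amount - b.BAmount), a.ID) AS RnB
    FROM #A a
    JOIN #B b
      ON a.Category = b.BCategory
     AND a.Amount BETWEEN b.BAmount - 1 AND b.BAmount + 1
    WHERE NOT EXISTS (SELECT 1 FROM #Exact e WHERE e.ID = a.ID)
      AND NOT EXISTS (SELECT 1 FROM #Exact e WHERE e.BID = b.BID)
)
SELECT ID, BID
INTO #Near
FROM Cand
WHERE RnA = 1 AND RnB = 1;

-- Final result: every A row and every B row appears once; unmatched rows pair with NULL.
SELECT a.ID, a.Category, a.Amount, b.BID, b.BCategory, b.BAmount
FROM #A a
LEFT JOIN (SELECT ID, BID FROM #Exact
           UNION ALL
           SELECT ID, BID FROM #Near) m ON m.ID = a.ID
FULL OUTER JOIN #B b ON b.BID = m.BID
ORDER BY COALESCE(a.ID, b.BID);
```

With the sample data, 151 and the two 165 rows pair exactly, and 150 of #B goes to 149.8 (the closer of the two leftovers), so 150.4 is joined to NULL. A single round of pass 2 does not guarantee the largest possible number of pairs on more tangled data; repeating it on the remaining rows would be needed there.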

Related

Join and select column with max value

I am working on two tables in Oracle, TABLE_A containing unique IDs and Balance (number). TABLE_B shows, for the same IDs, the specific transactions and contains the following fields: IDs (not unique), BAL, Sequence_number.
I want to check that TABLE_A.Balance is always equal to TABLE_B.Balance having Max(Sequence_number).
So I expect to have just one row for each ID.
I've tried the following, yet it does not return a unique row for each ID but multiples. Why is that?
Select a.ID, a.Balance, b.Balance, b.sequence_number
From TABLE_A a
Inner join (
select ID, Balance, max(sequence_number) as sequence_number
from TABLE_B
group by ID, Balance
) b On a.ID = b.ID
Group by a.ID, a.Balance, b.Balance, b.sequence_number
TABLE_A
ID_______Balance
1_______10
2_______15
3_______50
TABLE_B
ID____Balance____Sequence_number
1_______19_______1
1_______75_______2
1_______10_______3
2_______39_______1
2_______15_______2
3_______120_______1
3_______89_______2
3_______57_______3
3_______50_______4
You can use window functions:
select a.*, b.balance
from a left join
(select b.*,
row_number() over (partition by id order by sequence_number desc) as seqnum
from b
) b
on b.id = a.id and b.seqnum = 1;
I'm not quite sure what you want to compare, but this returns every row in a with the row in b that has the highest sequence number.
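If the goal is the check stated in the question (TABLE_A.Balance should equal the TABLE_B balance at the highest Sequence_number), the same window-function idea can be turned into a query that lists only the IDs that fail the check. A minimal sketch, assuming the table and column names from the question; the aliases a_balance and latest_b_balance are made up for illustration:

```sql
-- Returns one row per ID whose TABLE_A balance differs from the latest TABLE_B balance,
-- or which has no TABLE_B rows at all. An empty result means every ID passes the check.
SELECT a.ID,
       a.Balance AS a_balance,
       b.Balance AS latest_b_balance
FROM TABLE_A a
LEFT JOIN (
    SELECT ID, Balance,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Sequence_number DESC) AS seqnum
    FROM TABLE_B
) b
  ON b.ID = a.ID AND b.seqnum = 1
WHERE b.ID IS NULL
   OR a.Balance <> b.Balance;
```

The original query returned several rows per ID because it grouped by both ID and Balance, so max(sequence_number) was computed once per distinct balance rather than once per ID.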
You can use the row_number() over() window function to get, for each ID, the row with the highest sequence_number.
Schema:
create table TABLE_A(ID int, Balance int);
insert into TABLE_A values(1,10);
insert into TABLE_A values(2,15);
insert into TABLE_A values(3,50);
create table TABLE_B (ID int , Balance int,Sequence_number int);
insert into TABLE_B values(1,19,1);
insert into TABLE_B values(1,75,2);
insert into TABLE_B values(1,10,3);
insert into TABLE_B values(2,39,1);
insert into TABLE_B values(2,15,2);
insert into TABLE_B values(3,120,1);
insert into TABLE_B values(3,89,2);
insert into TABLE_B values(3,57,3);
insert into TABLE_B values(3,50,4);
GO
Query:
with cte as
(
select id, balance, sequence_number, row_number()over (partition by id order by sequence_number desc)rnk from table_b
)
Select a.ID, a.Balance,b.Balance, b.sequence_number From TABLE_A a
Inner join cte b on a.id=b.id and rnk=1;
GO
Output:
ID | Balance | Balance | sequence_number
1 | 10 | 10 | 3
2 | 15 | 15 | 2
3 | 50 | 50 | 4

Compare the results of a ROW COUNT

I have 2 databases on the same server and I need to compare the records in each one, since one of the databases is not importing all the information.
I was trying to do a row count but it's not working.
Currently I am exporting batches of approximately 100,000 rows and looking them up in Excel.
Let's say I want a query that counts the rows for each ID in TABLE A and then compares the result with the count for each ID in TABLE B. Since the IDs are the same, the counts should be the same, and I want the query to return the IDs for which the counts do not match.
--this table will contain the count of occurrences of each ID in tableA
declare @TableA_Results table(
ID bigint,
Total bigint
)
--three-part name; the dbo schema is assumed here
insert into @TableA_Results
select ID,count(*) from database1.dbo.TableA
group by ID
--this table will contain the count of occurrences of each ID in tableB
declare @TableB_Results table(
ID bigint,
Total bigint
);
insert into @TableB_Results
select ID,count(*) from database2.dbo.TableB
group by ID
--this table will contain the IDs that don't have the same count in both tables
declare @Discordances table(
ID bigint,
TotalA bigint,
TotalB bigint
)
insert into @Discordances
select TA.ID,TA.Total,TB.Total
from @TableA_Results TA
inner join @TableB_Results TB on TA.ID=TB.ID and TA.Total!=TB.Total
--the final output
select * from @Discordances
The question is vague, but maybe this SQL Code might help nudge you in the right direction.
It grabs the IDs and Counts of each ID from database one, the IDs and counts of IDs from database two, and compares them, listing out all the rows where the counts are DIFFERENT.
WITH DB1Counts AS (
SELECT ID, COUNT(ID) AS CountOfIDs
FROM DatabaseOne.dbo.TableOne
GROUP BY ID
), DB2Counts AS (
SELECT ID, COUNT(ID) AS CountOfIDs
FROM DatabaseTwo.dbo.TableTwo
GROUP BY ID
)
SELECT a.ID, a.CountOfIDs AS DBOneCount, b.CountOfIDs AS DBTwoCount
FROM DB1Counts a
INNER JOIN DB2Counts b ON a.ID = b.ID
WHERE a.CountOfIDs <> b.CountOfIDs
This SQL selects from the specific IDs using the "Database.Schema.Table" notation. So replace "DatabaseOne" and "DatabaseTwo" with the names of your two databases. And of course replace TableOne and TableTwo with the names of your tables (I'm assuming they're the same). This sets up two selects, one for each database, that groups by ID to get the count of each ID. It then joins these two selects on ID, and returns all rows where the counts are different.
You could full outer join two aggregate queries and pull out ids that are either missing in one table, or for which the record count is different:
select coalesce(ta.id, tb.id), ta.cnt, tb.cnt
from
(select id, count(*) cnt from tableA group by id) ta
full outer join (select id, count(*) cnt from tableB group by id) tb
on ta.id = tb.id
where
coalesce(ta.cnt, -1) <> coalesce(tb.cnt, -1)
You seem to want aggregation and a full join:
select coalesce(a.id, b.id) as id, a.cnt, b.cnt
from (select id, count(*) as cnt
from a
group by id
) a full join
(select id, count(*) as cnt
from b
group by id
) b
on a.id = b.id
where coalesce(a.cnt, 0) <> coalesce(b.cnt, 0);
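To see what the full-join version reports that an inner join would miss, here is a small self-contained T-SQL sketch. The table variables @a and @b and their contents are made-up sample data standing in for the two real tables.

```sql
-- Hypothetical sample data; @a and @b stand in for the two tables being compared.
DECLARE @a TABLE (id int);
DECLARE @b TABLE (id int);
INSERT INTO @a VALUES (1), (1), (2), (3), (5);
INSERT INTO @b VALUES (1), (2), (2), (4), (5);

SELECT COALESCE(a.id, b.id) AS id, a.cnt AS cnt_a, b.cnt AS cnt_b
FROM (SELECT id, COUNT(*) AS cnt FROM @a GROUP BY id) a
FULL JOIN (SELECT id, COUNT(*) AS cnt FROM @b GROUP BY id) b
  ON a.id = b.id
WHERE COALESCE(a.cnt, 0) <> COALESCE(b.cnt, 0)
ORDER BY COALESCE(a.id, b.id);
-- ids 1 and 2 have different counts on each side,
-- id 3 exists only in @a and id 4 only in @b (the missing side shows NULL),
-- id 5 has the same count in both and is filtered out.
```

An inner join on id would report only ids 1 and 2 here, because ids 3 and 4 have no partner row to join to.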

SQL Hide/Show rows based on row count from another table

I have a SQL question. I have two tables: tableA has 4 records, and tableB has 0 records right now but will go over 200 total records. I was wondering if there is a way to hide the last two records of tableA if tableB is under 200 records?
What I got so far is very simple
SELECT
id, dateSlot, timeSlot
FROM
tableA a
INNER JOIN
tableB b ON a.id = b.dateTimeSlotId;
I just don't know how to hide records based on another table's total records.
Can anyone help?
This is only an idea. If the tables are not symmetric, you will need to improve the logic.
declare @tableA table (id int)
declare @tableB table (dateTimeSlotId int , dateSlot date, timeSlot time)
insert @tableA values (1),(2),(3),(4),(5),(6),(7)
insert @tableB values
(1,'20170801', '00:00'),
(2,'20170802', '00:01'),
(3,'20170803', '00:02'),
(4,'20170804', '00:03'),
(5,'20170805', '00:04'),
(6,'20170806', '00:05'),
(7,'20170807', '00:06')
;with cte as(
SELECT ROW_NUMBER() over (order by id) rNumber, id, dateSlot, timeSlot
FROM @tableA a INNER JOIN
@tableB b
ON a.id = b.dateTimeSlotId)
SELECT id, dateSlot, timeSlot
FROM cte where rNumber <= (SELECT case when Count(1) >= 200 then Count(1) -2 else Count(1) end from @tableB)
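Read literally, the question asks for the last two rows of tableA to stay hidden while tableB has fewer than 200 rows. A minimal T-SQL sketch of that reading follows. It assumes that tableA itself holds id, dateSlot and timeSlot (the question does not say which table owns those columns) and that "last" means highest id; @tableA, @tableB and the fromEnd alias are stand-ins for illustration.

```sql
-- Assumption: tableA holds id, dateSlot, timeSlot; @tableA/@tableB are stand-ins.
DECLARE @tableA TABLE (id int, dateSlot date, timeSlot time);
DECLARE @tableB TABLE (dateTimeSlotId int);
INSERT INTO @tableA VALUES
(1, '20170801', '09:00'),
(2, '20170801', '10:00'),
(3, '20170802', '09:00'),
(4, '20170802', '10:00');

SELECT id, dateSlot, timeSlot
FROM (
    SELECT id, dateSlot, timeSlot,
           ROW_NUMBER() OVER (ORDER BY id DESC) AS fromEnd  -- 1 = last row
    FROM @tableA
) a
WHERE fromEnd > 2                              -- all rows except the last two are always shown
   OR (SELECT COUNT(*) FROM @tableB) >= 200    -- the last two appear once tableB reaches 200 rows
ORDER BY id;
```

With @tableB empty, as in the question, only ids 1 and 2 come back. No join to tableB is needed for the filter, which matters here because an inner join against an empty tableB would return nothing at all.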

How to query total when I have a join table

Hello,
I have a join of two tables, say tableA and tableB. tableA has a column called Amount. tableB has a column called refID. I would like to total the Amount column for rows whose refID has the same value. I was using SUM in my query, but it throws an error:
ORA-30483: window functions are not allowed here
30483. 00000 - "window functions are not allowed here"
*Cause: Window functions are allowed only in the SELECT list of a query.
And, window function cannot be an argument to another window or group
function.
Here is my query for your reference:
select *
from (
select SUM(A.Amount), B.refId, Rank() over (partition by B.refID order by B.id desc) as ranking
from table A
left outer join table B on A.refID = B.refID
)
where ranking=1;
Is there an alternative way for me to SUM the Amount?
Thanks!
select
SUM(A.Amount),
B.refId
from table A
left outer join table B on A.refID = B.refID
GROUP BY
B.refId
SELECT *
FROM (
SELECT A.Amount, B.refId,
Rank() over (partition by A.refID order by B.id desc) as ranking,
SUM(amount) OVER (PARTITION BY a.refId) AS asum
FROM tableA A
LEFT JOIN
tableB B
ON B.refID = A.refID
)
WHERE ranking = 1
Declare @T table(id int)
insert into @T values (1),(2)
Declare @T1 table(Tid int,fkid int,Amount int)
insert into @T1 values (1,1,200),(2,1,250),(3,2,100),(4,2,25)
Select SUM(t1.Amount) as amount,t1.fkid as id from @T t
left outer join @T1 t1 on t1.fkid = t.id group by t1.fkid
SELECT refid, sum(a.amount)
FROM table AS a LEFT JOIN table AS b USING (refid)
GROUP BY refid;
I'm a little confused. The query you posted did not have a SUM function anywhere, and performed a self-join of a table named "TABLE" to itself. I'm going to guess that you actually have two tables (I'll call them TABLE_A and TABLE_B), in which case the following should do it:
SELECT a.REFID, SUM(a.AMOUNT)
FROM TABLE_A a
INNER JOIN TABLE_B b
ON (b.REFID = a.REFID)
GROUP BY a.REFID;
If I understood your question you only wanted results when you have a TABLE_B.REFID which matches a TABLE_A.REFID, so an INNER JOIN would be appropriate.
Share and enjoy.
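One thing the join-then-SUM queries above do not guard against: if tableB holds several rows per refID (which the question's ranking by B.id suggests), the join repeats each tableA row and the total is inflated. A hedged sketch that totals first and then attaches only the highest-id tableB row, assuming tableA(refID, Amount) and tableB(id, refID) as the question implies; the aliases total_amount and latest_b_id are made up:

```sql
-- Sum per refID in a derived table first, so the later join cannot multiply Amount rows.
SELECT s.refID,
       s.total_amount,
       b.id AS latest_b_id
FROM (
    SELECT refID, SUM(Amount) AS total_amount
    FROM tableA
    GROUP BY refID
) s
LEFT JOIN (
    -- Keep only the tableB row with the highest id for each refID.
    SELECT id, refID,
           ROW_NUMBER() OVER (PARTITION BY refID ORDER BY id DESC) AS rn
    FROM tableB
) b
  ON b.refID = s.refID AND b.rn = 1;
```

Because the aggregate and the window function live in separate derived tables, neither is nested inside the other, so this form should not run into ORA-30483.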

SQL to resequence items by groups

Let's say I have a database that looks like this:
tblA:
ID, Name, Sequence, tblBID
1 a 5 14
2 b 3 15
3 c 3 16
4 d 3 17
tblB:
ID, Group
14 1
15 1
16 2
17 3
I would like to sequence A so that the sequences go 1...n for each group of B.
So in this case, the sequences going down should be 1,2,1,1.
The ordering needs to be consistent with the current ordering, but there are no guarantees as to the current ordering.
I am not exactly a sql master and I am sure there is a fairly easy way to do this, but I really don't know the right route to take. Any hints?
If you are using SQL Server 2005 or higher, you can use a ranking function:
Select tblA.Id, tblA.Name
, Row_Number() Over ( Partition By tblB.[Group] Order By tblA.Id ) As Sequence
, tblA.tblBID
From tblA
Join tblB
On tblA.tblBID = tblB.ID
Row_Number ranking function.
Here's another solution that would work in SQL Server 2000 and prior.
Select A.Id, A.Name
, (Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID) + 1 As Sequence
, A.tblBID
From tblA As A
Join tblB As B
On B.Id = A.tblBID
EDIT
Also want to make it clear that I want to actually update tblA to reflect the proper sequences.
In SQL Server, you can use their proprietary From clause in an Update statement like so:
Update tblA
Set Sequence = (
Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID
) + 1
From tblA As A
Join tblB As B
On B.Id = A.tblBID
The Hoyle ANSI solution might be something like:
Update tblA
Set Sequence = (
Select (Select Count(*)
From tblB As B1
Where B1.[Group] = B.[Group]
And B1.Id < B.ID) + 1
From tblA As A
Join tblB As B
On B.Id = A.tblBID
Where A.Id = tblA.Id
)
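On SQL Server 2005 or higher, the update can also be driven by Row_Number instead of the correlated count. This is a sketch under that assumption, using the tblA and tblB definitions from the question; N, A2 and NewSeq are illustrative names.

```sql
-- Number the rows of tblA within each tblB group, then write the numbers back.
UPDATE A
SET A.Sequence = N.NewSeq
FROM tblA AS A
JOIN (
    SELECT A2.ID,
           ROW_NUMBER() OVER (PARTITION BY B.[Group]
                              ORDER BY A2.ID) AS NewSeq  -- use A2.Sequence, A2.ID to keep the old ordering
    FROM tblA AS A2
    JOIN tblB AS B
      ON B.ID = A2.tblBID
) AS N
  ON N.ID = A.ID;
```

With the sample rows this sets Sequence to 1, 2, 1, 1 for IDs 1 to 4. Unlike the count-based versions, the numbering stays gap-free and unique even when two rows in a group share the same ordering value, as long as ID is included as a tie-breaker.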
EDIT
Can we do that [the inner group] comparison based on A.Sequence instead of B.ID?
Select A1.*
, (Select Count(*)
From tblB As B2
Join tblA As A2
On A2.tblBID = B2.Id
Where B2.[Group] = B1.[Group]
And A2.Sequence < A1.Sequence) + 1
From tblA As A1
Join tblB As B1
On B1.Id = A1.tblBID
Because it's SQL 2000, we can't use a windowing function. That's okay.
Thomas's queries are good and will work. However, they will get worse and worse as the number of rows increases—with different characteristics depending on how wide (the number of groups) and how deep (the number of items per group). This is because those queries use a partial cross-join, perhaps we could call it a "pyramidal cross-join" where the crossing part is limited to right side values less than left side values rather than left crossing to all right values.
What to do?
I think you will be surprised to find that the following long and painful-looking script will outperform the pyramidal join at a certain size of data (which may not be all that big) and eventually, with really large data sets must be considered a screaming performer:
CREATE TABLE #tblA (
ID int identity(1,1) NOT NULL,
Name varchar(1) NOT NULL,
Sequence int NOT NULL,
tblBID int NOT NULL,
PRIMARY KEY CLUSTERED (ID)
)
INSERT #tblA VALUES ('a', 5, 14)
INSERT #tblA VALUES ('b', 3, 15)
INSERT #tblA VALUES ('c', 3, 16)
INSERT #tblA VALUES ('d', 3, 17)
CREATE TABLE #tblB (
ID int NOT NULL PRIMARY KEY CLUSTERED,
GroupID int NOT NULL
)
INSERT #tblB VALUES (14, 1)
INSERT #tblB VALUES (15, 1)
INSERT #tblB VALUES (16, 2)
INSERT #tblB VALUES (17, 3)
CREATE TABLE #seq (
seq int identity(1,1) NOT NULL,
ID int NOT NULL,
GroupID int NOT NULL,
PRIMARY KEY CLUSTERED (ID)
)
INSERT #seq
SELECT
A.ID,
B.GroupID
FROM
#tblA A
INNER JOIN #tblB B ON A.tblBID = b.ID
ORDER BY B.GroupID, A.Sequence
UPDATE A
SET A.Sequence = S.seq - X.MinSeq + 1
FROM
#tblA A
INNER JOIN #seq S ON A.ID = S.ID
INNER JOIN (
SELECT GroupID, MinSeq = Min(seq)
FROM #seq
GROUP BY GroupID
) X ON S.GroupID = X.GroupID
SELECT * FROM #tblA
DROP TABLE #seq
DROP TABLE #tblB
DROP TABLE #tblA
If I understood you correctly, then ORDER BY B.GroupID, A.Sequence is correct. If not, you can switch A.Sequence to B.ID.
Also, my index on the temp table should be experimented with. For a certain quantity of rows, and also the width and depth characteristics of those rows, clustering on one of the other two columns in the #seq table could be helpful.
Last, a different data organization is possible: leaving GroupID out of the #seq table and joining again. I suspect it would be worse, but am not 100% sure.
Something like:
SELECT a.id, a.name, row_number() over (partition by b.[Group] order by a.id)
FROM tblA a
JOIN tblB b on a.tblBID = b.ID;