Eliminate duplicate rows when outer joining two tables in Oracle 11i

I have the following two tables with sample data.
PLACED_PERSON_INFO
PLACED_PERSON_INFO_GUID CPR
P1 0201026157
P2 0309929493
P3 0002170000
P4 0000011037
P5 1201006694
P6 1201009887
P7 1110007144
P8 0309906353
P9 0101002420
PLACED_PERSON_PLACES
PP_ID PLACEMENT_DATE PLACEMENT_STOP PLACED_PERSON_INFO_GUID
1 01-01-2014 31-12-2014 P1
2 01-01-2014 31-12-2014 P1
3 01-01-2013 31-12-2013 P2
4 01-06-2014 30-10-2014 P3
5 01-02-2014 30-10-2014 P3
6 01-01-2013 01-01-2015 P4
7 01-01-2013 30-05-2013 P4
8 01-01-2012 30-03-2013 P5
I have written the following SQL Query to get the result combining these two tables.
SQL Query :
SELECT
PPI.PLACED_PERSON_INFO_GUID, PPI.CPR
FROM PLACED_PERSON_PLACES PPP, PLACED_PERSON_INFO PPI
WHERE (PPP.PLACEMENT_DATE <= SYSDATE OR PPP.PLACEMENT_DATE IS NULL)
AND (PPP.PLACEMENT_STOP >= SYSDATE OR PPP.PLACEMENT_STOP IS NULL)
AND PPP.PLACED_PERSON_INFO_GUID (+) = PPI.PLACED_PERSON_INFO_GUID
ORDER BY PPI.CPR;
Query Result:
PLACED_PERSON_INFO_GUID CPR
P1 0201026157
P1 0201026157
P3 0002170000
P3 0002170000
P4 0000011037
P6 1201009887
P7 1110007144
P8 0309906353
P9 0101002420
But I want the following result, where duplicate rows are not shown. I do not want to use the DISTINCT keyword. Can anyone help me get this result? I am using Oracle 11i.
Expected Result:
PLACED_PERSON_INFO_GUID CPR
P1 0201026157
P3 0002170000
P4 0000011037
P6 1201009887
P7 1110007144
P8 0309906353
P9 0101002420

First, you should write your query using explicit join syntax:
SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR
FROM PLACED_PERSON_INFO PPI LEFT JOIN
PLACED_PERSON_PLACES PPP
ON PPP.PLACEMENT_DATE <= SYSDATE AND
PPP.PLACEMENT_STOP >= SYSDATE AND
PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
ORDER BY PPI.CPR;
If you only want one row, then you can use row_number():
SELECT PLACED_PERSON_INFO_GUID, CPR
FROM (SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR,
ROW_NUMBER() OVER (PARTITION BY PPI.PLACED_PERSON_INFO_GUID, PPI.CPR ORDER BY PPI.CPR) as seqnum
FROM PLACED_PERSON_INFO PPI LEFT JOIN
PLACED_PERSON_PLACES PPP
ON PPP.PLACEMENT_DATE <= SYSDATE AND
PPP.PLACEMENT_STOP >= SYSDATE AND
PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
) p
WHERE seqnum = 1
ORDER BY CPR;
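You can add additional columns and still get only one row per pair.
Note that moving the date conditions into the ON clause changes the result: a person whose placements all fall outside the current date (P2 and P5 in the sample data) is still returned, just without a matching placement. To exclude such persons, as the expected result does, keep the conditions in the WHERE clause together with the IS NULL checks.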

The solution is:
SELECT PLACED_PERSON_INFO_GUID, CPR
FROM (SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR,
ROW_NUMBER() OVER (PARTITION BY PPI.PLACED_PERSON_INFO_GUID, PPI.CPR ORDER BY PPI.CPR) AS SEQNUM
FROM PLACED_PERSON_INFO PPI LEFT JOIN PLACED_PERSON_PLACES PPP
ON PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
WHERE (PPP.PLACEMENT_DATE <= SYSDATE OR PPP.PLACEMENT_DATE IS NULL)
AND (PPP.PLACEMENT_STOP >= SYSDATE OR PPP.PLACEMENT_STOP IS NULL)
) P
WHERE SEQNUM = 1
ORDER BY CPR
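The query above can be tried out with a small script. This is only a sketch under stated assumptions: an in-memory SQLite database stands in for Oracle, dates are stored as ISO strings, and a hypothetical fixed date (`TODAY`) replaces SYSDATE so the outcome does not depend on when it is run.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here (an assumption for illustration).
# Window functions such as ROW_NUMBER() need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PLACED_PERSON_INFO (PLACED_PERSON_INFO_GUID TEXT, CPR TEXT);
CREATE TABLE PLACED_PERSON_PLACES (
    PP_ID INTEGER, PLACEMENT_DATE TEXT, PLACEMENT_STOP TEXT,
    PLACED_PERSON_INFO_GUID TEXT);
INSERT INTO PLACED_PERSON_INFO VALUES
    ('P1','0201026157'), ('P2','0309929493'), ('P3','0002170000'),
    ('P4','0000011037'), ('P5','1201006694'), ('P6','1201009887'),
    ('P7','1110007144'), ('P8','0309906353'), ('P9','0101002420');
INSERT INTO PLACED_PERSON_PLACES VALUES
    (1,'2014-01-01','2014-12-31','P1'), (2,'2014-01-01','2014-12-31','P1'),
    (3,'2013-01-01','2013-12-31','P2'), (4,'2014-06-01','2014-10-30','P3'),
    (5,'2014-02-01','2014-10-30','P3'), (6,'2013-01-01','2015-01-01','P4'),
    (7,'2013-01-01','2013-05-30','P4'), (8,'2012-01-01','2013-03-30','P5');
""")

TODAY = "2014-07-01"  # hypothetical fixed date used in place of SYSDATE

rows = conn.execute("""
SELECT PLACED_PERSON_INFO_GUID, CPR
FROM (SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR,
             ROW_NUMBER() OVER (PARTITION BY PPI.PLACED_PERSON_INFO_GUID, PPI.CPR
                                ORDER BY PPI.CPR) AS SEQNUM
      FROM PLACED_PERSON_INFO PPI
      LEFT JOIN PLACED_PERSON_PLACES PPP
        ON PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
      WHERE (PPP.PLACEMENT_DATE <= :today OR PPP.PLACEMENT_DATE IS NULL)
        AND (PPP.PLACEMENT_STOP >= :today OR PPP.PLACEMENT_STOP IS NULL)
     ) P
WHERE SEQNUM = 1
ORDER BY CPR
""", {"today": TODAY}).fetchall()

# One row per person; P2 and P5 are filtered out because their only
# placements ended before TODAY.
guids = [guid for guid, _ in rows]
print(guids)
```

With the sample data and that date, each person appears once, and the set of GUIDs matches the expected result in the question (the rows come back sorted by CPR, as the query requests).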

Related

Dynamically selecting the column to select from the row itself in SQL

I have a SQL Server table with some data as follows. The number of P columns is fixed, but there are many of them. There will also be multiple columns named in the same fashion, such as S1, S2, etc.
Id  SelectedP  P1  P2  P3  P4  P5
1   P2         3   8   4   15  7
2   P1         0   2   6   0   3
3   P3         1   15  2   1   11
4   P4         3   4   6   2   4
I need to write a SQL statement that produces the result below. Basically, the column to select from each row depends on the SelectedP value in that same row: SelectedP holds the name of the column to select.
Id  SelectedP  Selected-P-Value
1   P2         8
2   P1         0
3   P3         2
4   P4         2
Thanks in advance.

You just need a CASE expression...
SELECT
id,
SelectedP,
CASE SelectedP
WHEN 'P1' THEN P1
WHEN 'P2' THEN P2
WHEN 'P3' THEN P3
WHEN 'P4' THEN P4
WHEN 'P5' THEN P5
END
AS SelectedPValue
FROM
yourTable
This will return NULL for anything not mentioned in the CASE expression.
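As a quick check of the CASE approach, here is a minimal sketch that runs it against the sample data. It assumes an in-memory SQLite database in place of SQL Server, and it adds one hypothetical row (Id 5, with SelectedP = 'P9') to show the NULL behaviour described above.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourTable (Id INTEGER, SelectedP TEXT,
                        P1 INTEGER, P2 INTEGER, P3 INTEGER,
                        P4 INTEGER, P5 INTEGER);
-- Rows 1-4 are the sample data from the question.
-- Row 5 is hypothetical: its SelectedP names no existing column.
INSERT INTO yourTable VALUES
    (1, 'P2', 3,  8, 4, 15,  7),
    (2, 'P1', 0,  2, 6,  0,  3),
    (3, 'P3', 1, 15, 2,  1, 11),
    (4, 'P4', 3,  4, 6,  2,  4),
    (5, 'P9', 1,  1, 1,  1,  1);
""")

rows = conn.execute("""
SELECT Id, SelectedP,
       CASE SelectedP
           WHEN 'P1' THEN P1
           WHEN 'P2' THEN P2
           WHEN 'P3' THEN P3
           WHEN 'P4' THEN P4
           WHEN 'P5' THEN P5
       END AS SelectedPValue
FROM yourTable
ORDER BY Id
""").fetchall()

for row in rows:
    print(row)
```

Rows 1 to 4 reproduce the requested result, and the hypothetical row 5 comes back with `None` (SQL NULL) as its value.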
EDIT:
An option with just a little less typing...
SELECT
id, SelectedP, val
FROM
yourTable AS pvt
UNPIVOT
(
val FOR P IN
(
P1,
P2,
P3,
P4,
P5
)
)
AS unpvt
WHERE
SelectedP = P
NOTE: If the value of SelectedP doesn't exist in the UNPIVOT, then the row will not appear at all (unlike the CASE expression which will return a NULL)
Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=b693738aac0b594cf37410ee5cb15cf5
EDIT 2:
I don't know if this will perform much worse than the 2nd option, but this preserves the NULL behaviour.
(The preferred option is still to fix your data-structure.)
SELECT
id, SelectedP, MAX(CASE WHEN SelectedP = P THEN val END) AS val
FROM
yourTable AS pvt
UNPIVOT
(
val FOR P IN
(
P1,
P2,
P3,
P4,
P5
)
)
AS unpvt
GROUP BY
id, SelectedP
Demo : https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=f3f64d2fb6e11fd24d1addbe1e50f020

How to perform Complex SQL join with multiple approximate matches and return only the first match

I am trying to perform a Left join in SQL where I need to check multiple match criteria and only retain the first match in the right table after a certain sort operation on the right table.
Below is my Left table.
(No Null values)
Date      Customer  Shop  Product  Customer_Score
1/1/2020  C1        S1    P1       2
1/2/2020  C2        S1    P2       8
1/5/2020  C3        S2    P1       6
1/6/2020  C4        S2    P2       10
1/7/2020  C1        S2    P3       2
1/8/2020  C2        S2    P4       4
And this is the right Table
(Null values allowed only in Product column)
Shop  Product  Min_Customer_Score  Valid_From  Valid_To  Percent_Discount
S1    P1       4                   1/1/2020    1/5/2020  10
S1    P1       5                   1/1/2020    1/5/2020  11
S1    P1       7                   1/1/2020    1/5/2020  12
S1    null     5                   1/1/2020    1/5/2020  13
S2    P1       4                   1/1/2020    1/5/2020  14
S2    P2       4                   1/1/2020    1/5/2020  15
S2    null     6                   1/1/2020    1/5/2020  16
S2    null     9                   1/1/2020    1/5/2020  17
S2    P1       4                   1/6/2020    1/8/2020  18
S2    P2       4                   1/6/2020    1/8/2020  19
S2    null     6                   1/6/2020    1/8/2020  20
S2    null     9                   1/6/2020    1/8/2020  21
I want to sort the right table first by Product (nulls last) and then by Min_Customer_Score (ascending).
Then I want to pull the Min_Customer_Score and Percent_Discount values from the first row matching the conditions below:
Left.Date >= Right.Valid_From
Left.Date <= Right.Valid_To
Left.Shop = Right.Shop
Left.Product = Right.Product OR Right.Product IS NULL
Left.Customer_Score >= Right.Min_Customer_Score
My final result should look like below.
Date      Customer  Shop  Product  Customer_Score  Min_Customer_Score  Percent_Discount
1/1/2020  C1        S1    P1       2               null                null
1/2/2020  C2        S1    P2       8               5                   13
1/5/2020  C3        S2    P1       6               4                   14
1/6/2020  C4        S2    P2       10              4                   19
1/7/2020  C1        S2    P3       2               null                null
1/8/2020  C2        S2    P4       4               null                null
Basically, I want to find the right discount for each purchase, treating a null value in Right.Product as a default discount that applies to all other products.
I am familiar with left joins and with subqueries in SQL, but I couldn't work out where to start with a query this complex. I have also looked at other answers that suggest using ROW_NUMBER() OVER (PARTITION BY ...), but I couldn't make that work for this case.
Edit:
This is what I was able to work out so far.
SELECT left_table.*, right_table.Percent_Discount, right_table.Min_Customer_Score
, ROW_NUMBER() OVER (
PARTITION BY left_table.Date, left_table.Customer, left_table.Shop, left_table.Product
ORDER BY right_table.Product DESC right_table.Min_Customer_Score ASC) as row_num
LEFT JOIN right_table
ON left_table.Date >= right_table.Valid_From
AND left_table.Date <= right_table.Valid_To
AND left_table.Shop>= right_table.Shop
AND (left_table.Product = right_table.Product OR right_table.Product is NULL)
AND left_table.Customer_Score >= right_table.Min_Customer_Score
WHERE row_num = 1
But it gives me the error below:
ERROR: column "row_num" does not exist
LINE: WHERE row_num = 1

Use OUTER APPLY (this is SQL Server syntax):
select l.*, r.*
from left_table l outer apply
(select top (1) r.*
from right_table r
where l.Date >= r.Valid_From and
l.Date <= r.Valid_To and
l.Shop = r.Shop and
(l.Product = r.Product or r.Product is null) and
(l.Customer_Score >= r.Min_Customer_Score)
-- prefer a product-specific row; fall back to the default (NULL Product) row
order by (case when r.Product is not null then 1 else 2 end),
r.Min_Customer_Score asc
) r

Finally, I was able to solve it as below. Thanks to @iamdave for your comment.
SELECT Date, Customer, Shop, Product, Customer_Score, Min_Customer_Score, Percent_Discount
FROM
(
SELECT left_table.*, right_table.Percent_Discount, right_table.Min_Customer_Score
, ROW_NUMBER() OVER (
PARTITION BY left_table.Date, left_table.Customer, left_table.Shop, left_table.Product
-- Product DESC is meant to put product-specific rows before the NULL defaults;
-- where NULLs sort first under DESC (e.g. PostgreSQL), add NULLS LAST
ORDER BY right_table.Product DESC, right_table.Min_Customer_Score ASC) as row_num
FROM left_table
LEFT JOIN right_table
ON left_table.Date >= right_table.Valid_From
AND left_table.Date <= right_table.Valid_To
AND left_table.Shop = right_table.Shop
AND (left_table.Product = right_table.Product OR right_table.Product is NULL)
AND left_table.Customer_Score >= right_table.Min_Customer_Score
) as sub_query
WHERE row_num = 1
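The ROW_NUMBER() approach can be checked against the sample data with a short script. This is a sketch under a few assumptions: SQLite replaces the original database, the dates are read as month/day/year and stored as ISO strings, and the ordering uses an explicit CASE on NULL instead of relying on how a given database sorts NULLs under DESC.

```python
import sqlite3

# SQLite stands in for the original database (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE left_table (Date TEXT, Customer TEXT, Shop TEXT, Product TEXT,
                         Customer_Score INTEGER);
CREATE TABLE right_table (Shop TEXT, Product TEXT, Min_Customer_Score INTEGER,
                          Valid_From TEXT, Valid_To TEXT,
                          Percent_Discount INTEGER);
INSERT INTO left_table VALUES
    ('2020-01-01','C1','S1','P1',2),  ('2020-01-02','C2','S1','P2',8),
    ('2020-01-05','C3','S2','P1',6),  ('2020-01-06','C4','S2','P2',10),
    ('2020-01-07','C1','S2','P3',2),  ('2020-01-08','C2','S2','P4',4);
INSERT INTO right_table VALUES
    ('S1','P1',4,'2020-01-01','2020-01-05',10),
    ('S1','P1',5,'2020-01-01','2020-01-05',11),
    ('S1','P1',7,'2020-01-01','2020-01-05',12),
    ('S1',NULL,5,'2020-01-01','2020-01-05',13),
    ('S2','P1',4,'2020-01-01','2020-01-05',14),
    ('S2','P2',4,'2020-01-01','2020-01-05',15),
    ('S2',NULL,6,'2020-01-01','2020-01-05',16),
    ('S2',NULL,9,'2020-01-01','2020-01-05',17),
    ('S2','P1',4,'2020-01-06','2020-01-08',18),
    ('S2','P2',4,'2020-01-06','2020-01-08',19),
    ('S2',NULL,6,'2020-01-06','2020-01-08',20),
    ('S2',NULL,9,'2020-01-06','2020-01-08',21);
""")

rows = conn.execute("""
SELECT Date, Customer, Min_Customer_Score, Percent_Discount
FROM (
    SELECT l.*, r.Min_Customer_Score, r.Percent_Discount,
           ROW_NUMBER() OVER (
               PARTITION BY l.Date, l.Customer, l.Shop, l.Product
               -- product-specific rows first, NULL (default) rows last
               ORDER BY CASE WHEN r.Product IS NULL THEN 1 ELSE 0 END,
                        r.Min_Customer_Score) AS row_num
    FROM left_table l
    LEFT JOIN right_table r
      ON l.Date >= r.Valid_From
     AND l.Date <= r.Valid_To
     AND l.Shop = r.Shop
     AND (l.Product = r.Product OR r.Product IS NULL)
     AND l.Customer_Score >= r.Min_Customer_Score
) AS sub_query
WHERE row_num = 1
ORDER BY Date
""").fetchall()

for row in rows:
    print(row)
```

The row number has to be computed in a subquery because a WHERE clause cannot refer to a window-function alias defined in the same SELECT, which is what the "column row_num does not exist" error was reporting.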

How to fetch rows in SQL Server based on the values of a column

id dept Person Rating
-------------------------------------------
1 ece p1 R1
2 ece p2 t1
3 eee P3 R2
4 eee p4 M
5 Civil P5 R2
6 Civil P6 t2
7 Civil P7 t2
8 Mech p8 R2
9 Mech P9 NULL
10 IT P10 R2J
11 IT P11 T2
12 IT P12 T2
I would like to fetch all the rows whose department's rating has at least one value like 'P%' and one like 'T%'.

A rather direct method uses exists:
select t.*
from t
where exists (select 1 from t t2 where t2.dept = t.dept and t2.rating like 'P%') and
exists (select 1 from t t2 where t2.dept = t.dept and t2.rating like 'T%') ;
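The double EXISTS pattern can be exercised with a small sketch. The assumptions: SQLite replaces SQL Server, the table is named `t` as in the answer, and the helper `rows_in_depts_with_both` is a hypothetical name. Note that no rating in the sample data starts with 'P', so the 'P%' and 'T%' pair finds nothing there; 'R%' and 'T%' are used purely as an illustrative pair. SQLite's LIKE is case-insensitive for ASCII letters by default, so 't1' matches 'T%'.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, dept TEXT, person TEXT, rating TEXT);
INSERT INTO t VALUES
    (1,'ece','p1','R1'),    (2,'ece','p2','t1'),
    (3,'eee','P3','R2'),    (4,'eee','p4','M'),
    (5,'Civil','P5','R2'),  (6,'Civil','P6','t2'),  (7,'Civil','P7','t2'),
    (8,'Mech','p8','R2'),   (9,'Mech','P9',NULL),
    (10,'IT','P10','R2J'),  (11,'IT','P11','T2'),   (12,'IT','P12','T2');
""")

def rows_in_depts_with_both(pattern_a, pattern_b):
    """Return ids of rows whose department has at least one rating
    matching pattern_a and at least one matching pattern_b."""
    cursor = conn.execute("""
        SELECT t.id
        FROM t
        WHERE EXISTS (SELECT 1 FROM t t2
                      WHERE t2.dept = t.dept AND t2.rating LIKE ?)
          AND EXISTS (SELECT 1 FROM t t2
                      WHERE t2.dept = t.dept AND t2.rating LIKE ?)
        ORDER BY t.id
    """, (pattern_a, pattern_b))
    return [row[0] for row in cursor]

ids_r_and_t = rows_in_depts_with_both('R%', 'T%')  # illustrative patterns
ids_p_and_t = rows_in_depts_with_both('P%', 'T%')  # patterns from the answer
print(ids_r_and_t)
print(ids_p_and_t)
```

With 'R%' and 'T%', the ece, Civil and IT departments qualify, so all of their rows are returned; eee and Mech are dropped because neither has a rating matching 'T%'.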

Take the max value of a column in a sql table

I have this query:
SELECT DISTINCT S.PRODOTTO, D.CODPROD, D.IDPROD
FROM D_PROD D, APP_SALES S
WHERE D.CODPROD = S.PRODOTTO
The result is:
PRODOTTO CODPROD IDPROD
P2 P2 2
P1 P1 1
P3 P3 4
P3 P3 3
Now I would like the result to be:
PRODOTTO CODPROD IDPROD
P2 P2 2
P1 P1 1
P3 P3 4
with product P3 taking the max IDPROD it has encountered.
How can I tell the query to take the max value when there are multiple rows for one product?
I want the max IDPROD.

SELECT S.PRODOTTO, D.CODPROD, MAX(D.IDPROD) AS IDPROD
FROM D_PROD D, APP_SALES S
WHERE D.CODPROD = S.PRODOTTO
GROUP BY S.PRODOTTO, D.CODPROD
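A minimal sketch of the GROUP BY with MAX approach follows. The question does not show the underlying rows, so the table contents below are hypothetical, chosen only to reproduce the result shown in the question; SQLite is used in place of the original database.

```python
import sqlite3

# SQLite stands in for the original database (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE D_PROD (CODPROD TEXT, IDPROD INTEGER);
CREATE TABLE APP_SALES (PRODOTTO TEXT);
-- Hypothetical rows: P3 has two ids, and some products were sold twice.
INSERT INTO D_PROD VALUES ('P1', 1), ('P2', 2), ('P3', 3), ('P3', 4);
INSERT INTO APP_SALES VALUES ('P1'), ('P2'), ('P2'), ('P3'), ('P3');
""")

rows = conn.execute("""
SELECT S.PRODOTTO, D.CODPROD, MAX(D.IDPROD) AS IDPROD
FROM D_PROD D
JOIN APP_SALES S ON D.CODPROD = S.PRODOTTO
GROUP BY S.PRODOTTO, D.CODPROD
ORDER BY S.PRODOTTO
""").fetchall()

for row in rows:
    print(row)
```

Grouping already collapses each product to a single row, so DISTINCT is not needed alongside GROUP BY.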

How to insert new data while keeping most columns value but changing some values dynamically?

I am using SQL Server 2005. I have a table, TABLEA, with 70 columns and about 5000 rows. I would like to create new data (around 200 new records) for simulation purposes. Out of the 70 columns, I only want to change the values of 3 columns (TERMID, OUTLET, SNUM); the rest remain the same. E.g.:
TABLEA
SNO COMPANY.......TERMID........OUTLET........SNUM.....
1 ABC PP2 P1-P5 P5
1 ABC PP2 P1-P5 P4
2 ABC PP2 P1-P5 P4
1 ABC PP2 P1-P5 P3
3 ABC PP2 P1-P5 P3
So I would like to keep all values for the new records, except for changing every TERMID from PP2 to PP3 and every OUTLET from P1-P5 to P6-P8. As for SNUM, every P5 becomes P8, P4 becomes P7, and P3 becomes P6. This means TABLEA will look like this after I do the insert:
TABLEA
SNO COMPANY.......TERMID........OUTLET........SNUM.....
1 ABC PP2 P1-P5 P5
1 ABC PP2 P1-P5 P4
2 ABC PP2 P1-P5 P4
1 ABC PP2 P1-P5 P3
3 ABC PP2 P1-P5 P3
1 ABC PP3 P6-P8 P8
1 ABC PP3 P6-P8 P7
2 ABC PP3 P6-P8 P7
1 ABC PP3 P6-P8 P6
3 ABC PP3 P6-P8 P6
I do not want to do this manually, as it would be very tedious for 200 rows. Is this possible using SQL statements?
I have thought of writing normal insert statements with a subquery, but I guess it would be as tedious, or maybe even more so, to write INSERT INTO TABLEA (COL1, COL2, ......., COL70) VALUES (.....)
Any smart ideas?

I just reread your response, and realize you're trying to Insert records in your current table, not create a new table.
How about:
SELECT *
INTO #NewTable
FROM TABLEA;
UPDATE #NewTable
SET TERMID = 'PP3',
OUTLET= 'P6-P8',
SNUM = CASE
WHEN SNUM = 'P5' THEN 'P8'
WHEN SNUM = 'P4' THEN 'P7'
WHEN SNUM = 'P3' THEN 'P6'
ELSE SNUM -- leave any other SNUM value unchanged instead of setting it to NULL
END;
INSERT INTO TABLEA
SELECT *
FROM #NewTable
Sorry for the confusion.
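The copy, update and insert sequence can be sketched end to end as follows. The assumptions: SQLite replaces SQL Server 2005, so `CREATE TEMP TABLE ... AS SELECT` takes the place of `SELECT ... INTO #NewTable`, and only five of the 70 columns are modelled.

```python
import sqlite3

# SQLite stands in for SQL Server 2005 (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLEA (SNO INTEGER, COMPANY TEXT, TERMID TEXT,
                     OUTLET TEXT, SNUM TEXT);
INSERT INTO TABLEA VALUES
    (1,'ABC','PP2','P1-P5','P5'), (1,'ABC','PP2','P1-P5','P4'),
    (2,'ABC','PP2','P1-P5','P4'), (1,'ABC','PP2','P1-P5','P3'),
    (3,'ABC','PP2','P1-P5','P3');

-- 1. Copy every column of the existing rows into a scratch table.
CREATE TEMP TABLE NewTable AS SELECT * FROM TABLEA;

-- 2. Change only the three columns that should differ.
UPDATE NewTable
SET TERMID = 'PP3',
    OUTLET = 'P6-P8',
    SNUM = CASE SNUM
               WHEN 'P5' THEN 'P8'
               WHEN 'P4' THEN 'P7'
               WHEN 'P3' THEN 'P6'
               ELSE SNUM
           END;

-- 3. Append the modified copies to the original table and clean up.
INSERT INTO TABLEA SELECT * FROM NewTable;
DROP TABLE NewTable;
""")

all_rows = conn.execute(
    "SELECT SNO, COMPANY, TERMID, OUTLET, SNUM FROM TABLEA").fetchall()
new_rows = sorted(r for r in all_rows if r[2] == 'PP3')
print(len(all_rows))
print(new_rows)
```

Because the copy uses `SELECT *`, none of the 70 column names has to be typed out; only the changed columns appear in the UPDATE.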
