Create table of unique values from join - sql

I have two tables with various addresses. One is a table of locations we already have on file, the other is new business. The idea is that I'm joining these two tables on their co-ordinates in order to show if we have a clash between the new business and the current business.
What I have found is that in the new business we have one location that matches three locations that we already have.
When I do my simple inner join I get back 3 records when really I want to display the 4 (1 from new, 3 from current). I have tried other joins and union as well as sub queries but with no luck. I know there is a way but just can't figure it out.
SELECT *
FROM NewBusiness N
INNER JOIN Live L ON N.Latitude = L.Latitude AND N.Longitude = L.Longitude
Thanks in advance

Maybe not the best answer (it also catches duplicate rows within the same table), but it works correctly:
select distinct 'oldbiz', l.id, l.latitude, l.longitude
from Live l, NewBusiness n
where l.latitude = n.latitude
and l.longitude = n.longitude
union all
select distinct 'newbiz', n.id, n.latitude, n.longitude
from Live l, NewBusiness n
where l.latitude = n.latitude
and l.longitude = n.longitude
SQLFIDDLE

I was able to get what I think you are looking for using a union. It may not be the best way to do it, but it appears to work. I made a SQL Fiddle to show it.
You can see the fiddle here: SQLFIDDLE
To test, I created two tables, Live and NewBusiness, like this:
CREATE TABLE Live
([ID] varchar(1), [latitude] int, [longitude] int)
;
INSERT INTO Live
([ID], [latitude], [longitude])
VALUES
('a', 1, 2),
('b', 1, 2),
('c', 1, 2),
('d', 4, 3),
('e', 5, 4),
('k', 5, 7),
('l', 5, 9),
('M', 5, 7)
;
and
CREATE TABLE NewBusiness
([ID] varchar(1), [latitude] int, [longitude] int)
;
INSERT INTO NewBusiness
([ID], [latitude], [longitude])
VALUES
('f', 1, 2),
('g', 5, 2),
('h', 1, 8),
('i', 6, 3),
('z', 5, 7),
('y', 12, 4),
('x', 5, 7)
;
The query I used was
(
SELECT L.ID
,L.Latitude
,L.Longitude
FROM Live L
INNER JOIN NEWBUSINESS N
ON L.Latitude = N.Latitude AND L.Longitude = N.Longitude
)
UNION
(
SELECT N.ID
,N.LATITUDE
,N.LONGITUDE
FROM NEWBUSINESS N
INNER JOIN Live L
ON N.Latitude = L.Latitude AND N.Longitude = L.Longitude
GROUP BY N.ID
,N.LATITUDE
,N.LONGITUDE
)
The first part of the union gets all the things in Live that have matches in Newbusiness. The second part of the union gets all the things in Newbusiness that have matches in Live. The results are then union'd together.
Tables:
**Live**
ID latitude longitude
a 1 2
b 1 2
c 1 2
d 4 3
e 5 4
k 5 7
l 5 9
M 5 7
**NewBusiness**
ID latitude longitude
f 1 2
g 5 2
h 1 8
i 6 3
z 5 7
y 12 4
x 5 7
**Query Results**
ID Latitude Longitude
a 1 2
b 1 2
c 1 2
f 1 2
k 5 7
M 5 7
x 5 7
z 5 7
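As a quick way to check the union approach outside SQL Server, here is a minimal sketch that loads the same sample rows into an in-memory SQLite database from Python. The `source` label column is an addition for illustration (it is not in the query above), and SQLite is only a stand-in engine, so treat it as a sanity check rather than the exact T-SQL behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Live (ID TEXT, latitude INT, longitude INT);
CREATE TABLE NewBusiness (ID TEXT, latitude INT, longitude INT);
INSERT INTO Live VALUES ('a',1,2),('b',1,2),('c',1,2),('d',4,3),
                        ('e',5,4),('k',5,7),('l',5,9),('M',5,7);
INSERT INTO NewBusiness VALUES ('f',1,2),('g',5,2),('h',1,8),('i',6,3),
                               ('z',5,7),('y',12,4),('x',5,7);
""")

# UNION (not UNION ALL) removes the repeated rows that the join produces
# when one location matches several rows in the other table.
query = """
SELECT 'Live' AS source, L.ID, L.latitude, L.longitude
FROM Live L
JOIN NewBusiness N ON L.latitude = N.latitude AND L.longitude = N.longitude
UNION
SELECT 'NewBusiness', N.ID, N.latitude, N.longitude
FROM NewBusiness N
JOIN Live L ON N.latitude = L.latitude AND N.longitude = L.longitude
"""
clashes = conn.execute(query).fetchall()
for row in sorted(clashes):
    print(row)
```

Labelling each row with its source table makes it easier to see which side of the clash a location came from.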

Does something like this give you what you need (it will count once for each record in either table that has a match in the other table):
WITH cte1
AS
(
SELECT
N.latitude
, N.longitude
, ROW_NUMBER() OVER (ORDER BY N.latitude, N.longitude) r1
FROM NewBusiness N
)
,
cte2
AS
(
SELECT
N.latitude
, N.longitude
, ROW_NUMBER() OVER (PARTITION BY N.r1 ORDER BY N.latitude, N.longitude) r2
FROM
cte1 N
JOIN Live L ON
N.Latitude = L.Latitude
AND N.Longitude = L.Longitude
)
SELECT
latitude
, longitude
, 'NewBusiness' sourceTable
FROM cte2
WHERE r2 = 1
UNION ALL
SELECT
latitude
, longitude
, sourceTable
FROM
(
SELECT DISTINCT
latitude
, longitude
, r2
, 'Live' sourceTable
FROM cte2
) Q

Common point in multiple polygons in SQL

I have two tables that contain a list of geometry data, e.g.
(0xE6100000010CFB24190B88E44A40AADDAB69817F3740)
I computed the intersection of the shapes between the two tables, and now I'm trying to find a point common to all of the intersected shapes.
I tried taking the STCentroid() of each shape, but I can't work out how to find the point common to all of them:
select p1.shape_data.STIntersection(p2.shape_data).STCentroid() as inter_geometry
from map_shapes p1
inner join areas_map_shapes p2 on p2.shape_data.STIntersects(p1.shape_data) = 1
where p2.shape_data.STIntersects(p1.shape_data) = 1
and p2.shape_id = 206
I also tried to aggregate all the intersected shapes:
SELECT
geometry::UnionAggregate(ss.shape_data),
geometry::STGeomFromText( geometry::UnionAggregate(ss.shape_data).STCentroid().ToString(), 0).STY as lat,
geometry::STGeomFromText( geometry::UnionAggregate(ss.shape_data).STCentroid().ToString(), 0).STX as lon
FROM areas_map_shapes T
inner join map_shapes SS on SS.shape_data.STIntersects(T.shape_data) = 1
WHERE SS.shape_data.STIntersects(T.shape_data) = 1
AND T.shape_id = 206
and T.status = 1
and SS.status = 1
and T.country_id = 4
My problem is that I need to find a single point that is common to all the shapes that intersect.
The image shows what I have so far: all the shapes that intersect the main shape. I need to find a common point in all of them.
It's hard to tell from your example because (as @nbk pointed out) it's difficult to reproduce what you're asking for. That said, it looks like you're looking for the STIntersection function.
DECLARE @GeometryTable TABLE(
ID INT,
geom GEOMETRY
)
INSERT INTO @GeometryTable (ID, Geom) VALUES (1, GEOMETRY::STGeomFromText('POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))', 0))
INSERT INTO @GeometryTable (ID, Geom) VALUES (2, GEOMETRY::STGeomFromText('POLYGON((1 1, 1 3, 3 3, 3 1, 1 1))', 0))
INSERT INTO @GeometryTable (ID, Geom) VALUES (3, GEOMETRY::STGeomFromText('POLYGON((0 1, 0 3, 2 3, 2 1, 0 1))', 0))
SELECT
G1.geom.STIntersection(G2.geom).STIntersection(G3.geom)
FROM
@GeometryTable G1
INNER JOIN
@GeometryTable G2
ON
G1.geom.STIntersects(G2.geom) = 1
INNER JOIN
@GeometryTable G3
ON
G1.geom.STIntersects(G3.geom) = 1
AND G2.geom.STIntersects(G3.geom) = 1
WHERE
G1.ID = 1
AND G2.ID = 2
AND G3.ID = 3
Not sure there's an easy/fast way to do it. One idea is to use STIntersection to create an intersection polygon of all your areas in a recursive CTE:
drop table #t_geoms
create table #t_geoms (geom geometry, row_id int identity)
-- create some random data
insert into #t_geoms
select top 30 GEOMETRY::Point(ROW_NUMBER() OVER(ORDER BY object_id) * 0.01 + 10,ROW_NUMBER() OVER(ORDER BY object_id) * 0.01 + 10, 4326).STBuffer(3) x
from sys.objects
;with cte as (
select geom, row_id
from #t_geoms
where row_id = 1
union all
select g.geom.STIntersection(c.geom), g.row_id
from cte c
inner join #t_geoms g
ON g.row_id = c.row_id + 1
)
select top 1 geom, geom.STCentroid() AS centerPointOfIntersection
from cte
order by row_id desc
option(MAXRECURSION 0)
Note that if not all polygons actually intersect, you get an empty geometry.
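The idea of folding an intersection over every shape and then taking the centroid of what is left can be sketched without a spatial library. The example below is a deliberately simplified illustration: it assumes axis-aligned rectangles written as `(min_x, min_y, max_x, max_y)`, which is not how SQL Server stores geometry, and the `intersect` helper is hypothetical. The three rectangles correspond to the three sample squares used earlier in this thread.

```python
from functools import reduce

# Axis-aligned rectangles as (min_x, min_y, max_x, max_y).
rects = [(0, 0, 2, 2), (1, 1, 3, 3), (0, 1, 2, 3)]

def intersect(a, b):
    """Return the overlap of two rectangles, or None if they do not overlap."""
    if a is None or b is None:
        return None  # an empty result stays empty, like an empty geometry
    min_x, min_y = max(a[0], b[0]), max(a[1], b[1])
    max_x, max_y = min(a[2], b[2]), min(a[3], b[3])
    if min_x > max_x or min_y > max_y:
        return None
    return (min_x, min_y, max_x, max_y)

# Fold the intersection over all shapes, one shape at a time.
common = reduce(intersect, rects)

# Any point inside the common region is shared by every shape;
# the centroid is a convenient choice.
centroid = None
if common is not None:
    centroid = ((common[0] + common[2]) / 2, (common[1] + common[3]) / 2)

print(common, centroid)
```

If any shape fails to overlap the running intersection, `common` becomes `None`, which mirrors the empty-geometry case noted above.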

Find matching first 7 chars to identify duplicates

I'm trying to identify duplicate state_num values that are failing validation. The trailing R is causing the validation issues, so I want to compare only the first 7 characters and find the duplicates, returning both the row that has an R in the string and the row that doesn't. The column type is char(15). When I run my query, it does not find the matching 7 characters. The table below only shows how the result should look, not what is actually being returned. The query basically just filters on the state and returns only the non-R state_num values. It should return around 480 rows but returns about 20k rows, not just the duplicates.
I've tried querying a number of different ways, but after an hour the only way I can get the R rows returned is by adding AND state_num[8] = 'R' to the end of the query, which defeats the purpose of finding duplicates on the first 7 characters. This is an Informix database.
My Query:
SELECT id_ref, cont_ref, formatted, state_num, type, state
FROM state_form sf1
WHERE EXISTS (select cont_ref, san
FROM state_form sf2
WHERE sf1.cont_ref = sf2.cont_ref and left(sf1.state_num,7) = LEFT(sf2.state_num,7)
GROUP BY cont_ref, state_num
HAVING COUNT(state_num) > 1)
AND state = 'MT';
This is what I'd like my results to return:
id_ref  cont_ref  formatted  state_num  type  state
658311  5237      71-75011R  7175011R   Y     MT
1459    5237      71-75011   7175011    I     MT
7501    555678    99-67894   9967894    I     MT
345443  555678    99-67894R  9967894R   Y     MT
Here are a couple options producing the same results. This may need to be changed if you need to identify the 8th character as something such as a Letter. That is, this will also catch 12345678 and 1234567.
create table my_data (
id_ref integer,
cont_ref integer,
state_num varchar(20),
type varchar(5),
state varchar(5)
);
insert into my_data values
(1, 5237, '7175011R', 'Y', 'MT'),
(2, 5237, '7175011', 'I', 'MT'),
(3, 6789, '7878787', 'Y', 'CA'),
(4, 6789, '7878787R', 'I', 'CA'),
(5, 555678, '9967894', 'I', 'MT'),
(6, 555678, '9967894R', 'Y', 'MT'),
(7, 98765, '123456', 'I', 'MT');
Query #1
with dupes as (
select cont_ref
from my_data
where state = 'MT'
group by cont_ref, left(state_num, 7)
having count(*) > 1
)
select m.id_ref, m.cont_ref, m.state_num, m.type, m.state
from my_data m
join dupes d
on m.cont_ref = d.cont_ref;
Query #2
select m.id_ref, m.cont_ref, m.state_num, m.type, m.state
from my_data m
where m.cont_ref in (
select cont_ref
from my_data
where state = 'MT'
group by cont_ref, left(state_num, 7)
having count(*) > 1
);
id_ref  cont_ref  state_num  type  state
1       5237      7175011R   Y     MT
2       5237      7175011    I     MT
5       555678    9967894    I     MT
6       555678    9967894R   Y     MT
View on DB Fiddle
UPDATE
If Informix does not allow grouping by left(column, 7), you could compute the prefix in a derived table first and get the target cont_ref values from that. Here's the CTE method, but you could also do it with a subquery.
with dupes as (
select cont_ref
from (
select cont_ref, left(state_num, 7) as left_seven
from my_data
where state = 'MT'
)z
group by cont_ref, left_seven
having count(*) > 1
)
select m.*
from my_data m
join dupes d
on m.cont_ref = d.cont_ref;
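For a quick local check of the prefix-grouping idea, here is a minimal sketch using Python's built-in sqlite3 module. Several things in it are assumptions rather than part of the answer above: SQLite's `substr(state_num, 1, 7)` stands in for `left(state_num, 7)`, row 8 is a hypothetical extra row, and the join matches on the prefix as well as on cont_ref so that unrelated rows for the same cont_ref are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_data (id_ref INT, cont_ref INT, state_num TEXT, type TEXT, state TEXT);
INSERT INTO my_data VALUES
 (1, 5237,   '7175011R', 'Y', 'MT'),
 (2, 5237,   '7175011',  'I', 'MT'),
 (3, 6789,   '7878787',  'Y', 'CA'),
 (4, 6789,   '7878787R', 'I', 'CA'),
 (5, 555678, '9967894',  'I', 'MT'),
 (6, 555678, '9967894R', 'Y', 'MT'),
 (7, 98765,  '123456',   'I', 'MT'),
 (8, 5237,   '8888888',  'I', 'MT');  -- hypothetical: same cont_ref, different prefix
""")

query = """
WITH dupes AS (
    SELECT cont_ref, substr(state_num, 1, 7) AS left_seven
    FROM my_data
    WHERE state = 'MT'
    GROUP BY cont_ref, substr(state_num, 1, 7)
    HAVING COUNT(*) > 1
)
SELECT m.id_ref, m.cont_ref, m.state_num
FROM my_data m
JOIN dupes d
  ON m.cont_ref = d.cont_ref
 AND substr(m.state_num, 1, 7) = d.left_seven
WHERE m.state = 'MT'
ORDER BY m.id_ref
"""
duplicate_rows = conn.execute(query).fetchall()
for row in duplicate_rows:
    print(row)
```

Joining on cont_ref alone would also return row 8, because it shares a cont_ref with a duplicated prefix; matching on the prefix too keeps only the rows that actually collide.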

How to calculate the degree of agreement by row comparisons in SQL Server?

For a minimal, reproducible example (reprex) let's assume I have a database object (dbo) in a Microsoft SQL Server and I want to query things in T-SQL.
My dbo looks like this:
Animal-ID Marker-ID Allele1 Allele2
--------------------------------------------
1 OAR1 A G
1 OAR2 C C
1 OAR3 T G
2 OAR1 A C
2 OAR2 C C
2 OAR3 A C
What I would like to do is calculate an allele match percentage per Marker-ID across all Animal-IDs.
Given the dbo example from above the desired result looks like this:
Animal-ID-pair Marker-ID Match-percentage
--------------------------------------------
1-2 OAR1 50
1-2 OAR2 100
1-2 OAR3 0
So far, I tried the following approaches:
First I thought selecting individual rows is sufficient.
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY Animal-ID ASC) AS rownumber,
Animal-ID, Marker-ID,
Allele1, Allele2
FROM
dbo) AS foo
WHERE
rownumber BETWEEN 1 AND 3;
and then compare that to the range between 4 and 6.
The problem here is that, in my real and much larger data set, not all animal-ID pairs have the same number of rows, i.e. not the same number of markers.
That is why I thought grouping might be helpful:
SELECT
Animal-ID, Marker-ID,
Allele1, Allele2
FROM
dbo
WHERE
Animal-ID IN (SELECT Animal-ID FROM dbo
GROUP BY Animal-ID
HAVING COUNT(*) > 1);
but that does not allow me to do comparisons and/or calculations across groups.
Thus I would like to ask how to calculate the degree of agreement in the comparison of row pairs.
Sample data
create table genomes
(
AnimalId int,
MarkerId nvarchar(10),
Allele1 nvarchar(1),
Allele2 nvarchar(2)
)
insert into genomes (AnimalId, MarkerId, Allele1, Allele2) values
(1, 'OAR1', 'A', 'G'),
(1, 'OAR2', 'C', 'C'),
(1, 'OAR3', 'T', 'G'),
(2, 'OAR1', 'A', 'C'),
(2, 'OAR2', 'C', 'C'),
(2, 'OAR3', 'A', 'C'),
(3, 'OAR1', 'A', 'G'), --new sample Animal with less data (no OAR3)
(3, 'OAR2', 'C', 'G');
Solution
Select all unique animals (cte_AllAnimals).
Select all unique markers (cte_AllMarkers).
Combine every animal with every animal behind it (a2.AnimalId > a1.AnimalId). This gives you all unique animal combinations.
Combine every pair with every marker (cross join cte_AllMarkers).
This gives me:
with cte_AllMarkers as
(
select g.MarkerId
from genomes g
group by g.MarkerId
),
cte_AllAnimals as
(
select g.AnimalId
from genomes g
group by g.AnimalId
)
select convert(nvarchar(10), a1.AnimalId) + '-' +
convert(nvarchar(10), a2.AnimalId) as AnimalIdPair,
m.MarkerId,
case g1.Allele1 when g2.Allele1 then 50 else 0 end +
case g1.Allele2 when g2.Allele2 then 50 else 0 end as MatchPercentage
from cte_AllAnimals a1
join cte_AllAnimals a2
on a2.AnimalId > a1.AnimalId
cross join cte_AllMarkers m
left join genomes g1
on g1.AnimalId = a1.AnimalId
and g1.MarkerId = m.MarkerId
left join genomes g2
on g2.AnimalId = a2.AnimalId
and g2.MarkerId = m.MarkerId
order by a1.AnimalId,
a2.AnimalId,
m.MarkerId;
Result
AnimalIdPair MarkerId MatchPercentage
------------ -------- ---------------
1-2 OAR1 50
1-2 OAR2 100
1-2 OAR3 0
1-3 OAR1 100
1-3 OAR2 50
1-3 OAR3 0
2-3 OAR1 50
2-3 OAR2 50
2-3 OAR3 0
Fiddle to see it in action.
By Using SUBQUERY & STUFF
DECLARE @T TABLE(Animal_ID INT, Marker_ID CHAR(10) , Allele1 CHAR, Allele2 CHAR)
INSERT INTO @T VALUES
(1,'OAR1','A','G'),
(1,'OAR2','C','C'),
(1,'OAR3','T','G'),
(2,'OAR1','A','C'),
(2,'OAR2','C','C'),
(2,'OAR3','A','C')
SELECT * FROM @T
SELECT S.*,(ISNULL(S1.C,0)+ISNULL(S2.C,0))*100/LEN(Allele_Pair) AS Percentage
FROM (
SELECT STUFF((SELECT CONCAT('-' , Animal_ID ) FROM @T t1
WHERE t1.Marker_ID = t2.Marker_ID FOR XML PATH ('')), 1, 1, '') AS Animal_ID_Pair
,Marker_ID,
STUFF((SELECT CONCAT(Allele1,Allele2) FROM @T t1
WHERE t1.Marker_ID = t2.Marker_ID FOR XML PATH ('')), 1, 0, '') AS Allele_Pair
FROM @T t2
GROUP BY Marker_ID) S
LEFT JOIN (SELECT Marker_ID,Allele2,COUNT(Allele2) AS C FROM @T GROUP BY Allele2,Marker_ID HAVING COUNT(Allele2)>1) S1 ON S1.Marker_ID=S.Marker_ID
LEFT JOIN (SELECT Marker_ID,Allele1,COUNT(Allele1) AS C FROM @T GROUP BY Allele1,Marker_ID HAVING COUNT(Allele1)>1) S2 ON S2.Marker_ID=S.Marker_ID
Output:
Animal_ID_Pair Marker_ID Allele_Pair Percentage
1-2 OAR1 AGAC 50
1-2 OAR2 CCCC 100
1-2 OAR3 TGAC 0
A self-join does what you want -- with some arithmetic:
select t1.animal_id, t2.animal_id, t1.marker_id,
( case when t1.allele1 = t2.allele1 then 1.0 else 0 end +
case when t1.allele2 = t2.allele2 then 1.0 else 0 end
) / 2.0 as match_percentage -- a fraction between 0 and 1; multiply by 100 for a percentage
from t t1 join
t t2
on t1.marker_id = t2.marker_id and
t1.animal_id < t2.animal_id;
It is easy enough to add new alleles to this. You can also express it by unpivoting the alleles and aggregating:
with ta as (
select t.*, v.*
from t cross apply
(values (1, allele1), (2, allele2)) v(allele, val)
)
select ta1.animal_id, ta2.animal_id, ta1.marker_id,
avg(case when ta1.val = ta2.val then 1.0 else 0 end) as match_percentage
from ta ta1 join
ta ta2
on ta1.marker_id = ta2.marker_id and
ta1.allele = ta2.allele and
ta1.animal_id < ta2.animal_id
group by ta1.animal_id, ta2.animal_id, ta1.marker_id;
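To see the self-join arithmetic produce the numbers asked for in the question, here is a minimal sketch that runs an equivalent query in an in-memory SQLite database from Python. The table name `t` and the snake_case column names follow the self-join answer. Scaling each matching allele to 50 and building the pair label with `||` are choices made for this illustration (T-SQL would use `+` with a conversion instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (animal_id INT, marker_id TEXT, allele1 TEXT, allele2 TEXT);
INSERT INTO t VALUES
 (1, 'OAR1', 'A', 'G'), (1, 'OAR2', 'C', 'C'), (1, 'OAR3', 'T', 'G'),
 (2, 'OAR1', 'A', 'C'), (2, 'OAR2', 'C', 'C'), (2, 'OAR3', 'A', 'C');
""")

# Pair each animal with every higher-numbered animal on the same marker,
# then score 50 points for each allele position that agrees.
query = """
SELECT t1.animal_id || '-' || t2.animal_id AS animal_pair,
       t1.marker_id,
       CASE WHEN t1.allele1 = t2.allele1 THEN 50 ELSE 0 END +
       CASE WHEN t1.allele2 = t2.allele2 THEN 50 ELSE 0 END AS match_percentage
FROM t t1
JOIN t t2
  ON t1.marker_id = t2.marker_id
 AND t1.animal_id < t2.animal_id
ORDER BY t1.animal_id, t2.animal_id, t1.marker_id
"""
matches = conn.execute(query).fetchall()
for row in matches:
    print(row)
```

Because this is an inner join on marker_id, a marker that one animal of a pair lacks simply produces no row, unlike the cross-join solution earlier, which reports 0 for it.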

SQL Server loop through a table for every 5 rows

I need to write a stored procedure or table function to return a new data table as a new data source.
I want to walk through the original table in blocks of 5 rows, ordered by the invoice ID column (which may not start from 1): the first 5 rows go on the left of the new table, the second 5 rows on the right, the third 5 rows on the left again, and so on.
For example, here is the original table:
Here is the expected table:
Thanks in advance!
declare @rowCount int = 5;
with cte as (
select *,( (IN_InvoiceID-1) / @rowCount ) % 2 group1
,( (IN_InvoiceID-1) / @rowCount ) group2
,IN_InvoiceID % @rowCount group3
from T
)
-- select * from cte   -- debugging only; a CTE is visible to a single statement
select T1.INID,T1.IN_InvoiceID,T1.IN_InvoiceAmount,T2.INID,T2.IN_InvoiceID,T2.IN_InvoiceAmount
from CTE T1
left join CTE T2 on T2.group1 = 1 and T1.group2 = T2.group2-1 and T1.group3 = T2.group3
where T1.group1 = 0
Test DDL
CREATE TABLE T
([INID] varchar(38), [IN_InvoiceID] int, [IN_InvoiceAmount] int)
;
INSERT INTO T
([INID], [IN_InvoiceID], [IN_InvoiceAmount])
VALUES
('DB3E17E6-35C5-41:121-93B1-F809BF6B2972', 1, 2999),
('3212F048-8213-4FCC-AB64-121485B77D4E43', 2, 3737),
('E3526373-A204-40F5-801C-7F8302A4E5E2', 3, 3175),
('76CC9C19-BF79-4E8A-8034-A33805AD3390', 4, 391),
('EC7A2FBC-B62D-4865-88DE-A8097975F125', 5, 1206),
('52AD3046-21331-4F0A-BD1D-67F232C54244', 6, 402),
('CA48F132-A9F5-4516-9E58-CDEE6644AAD1', 7, 1996),
('02E10C31-CAB2-4220-B66A-CEE5E67A9378', 8, 3906),
('98F1EEFF-B07A-4B65-87F4-E165264284DD', 9, 2575),
('91EBDD8B-B73C-470C-8900-DD66078483DB', 10, 2965),
('6E2490E5-C4DE-4833-877F-1590F7BDC1B8', 11, 1603),
('00985921-AC3C-4E3E-BAE1-7F58302F831A', 12, 1302)
;
Result:
Please check the article "Display Data in Multiple Columns using SQL", which shows with an example how a database developer can display a list of data rows in columns using the Row_Number() function and modulo arithmetic.
You will need to add the additional columns from the same row, which is where your case differs from the sample there.
It seems you want to split the table into two tables with alternating blocks of 5 rows. An easy way to do this would be:
Load the data into a temp table that has an extra column (say, grouping_id).
Update grouping_id so that each block of 5 rows shares the same id. You can use integer division, (in_invoiceId - 1) / 5, for this. After this step the first 5 rows will have grouping_id 0, the next 5 will have 1, the next 5 will have 2, and so on (assuming your invoice id starts at 1 and is incremented by 1 for every row).
Then do a normal select with a where clause for odd and even grouping_id, using the mod operator (grouping_id % 2).
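The grouping steps can be sketched in a few lines of Python to check the arithmetic before writing the T-SQL. This is only an illustration under stated assumptions: the invoice ids 1 to 12 are sample values, and the block number is taken from each row's position in sorted order using integer division, so it does not depend on the ids starting at 1.

```python
from itertools import zip_longest

invoice_ids = list(range(1, 13))  # sample ids; real ids need not start at 1
block_size = 5

left, right = [], []
for position, invoice_id in enumerate(sorted(invoice_ids)):
    grouping_id = position // block_size  # 0 for rows 1-5, 1 for rows 6-10, ...
    # Even blocks go to the left side, odd blocks to the right side.
    (left if grouping_id % 2 == 0 else right).append(invoice_id)

# Pair the n-th left row with the n-th right row, padding with None
# where the right side runs out (like a LEFT JOIN yielding NULLs).
paired = list(zip_longest(left, right))
for row in paired:
    print(row)
```

Using the row position rather than the raw id is the same choice the ROW_NUMBER()-based answer in this thread makes, and it copes with gaps in the id sequence.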
Ideally, you would manage this with two tables, a master table and a detail table.
But out of curiosity I worked out a solution:
Declare @table table(id int identity, invoice_id int)
; WITH Numbers AS
(
SELECT n = 1
UNION ALL
SELECT n + 1
FROM Numbers
WHERE n+1 <= 50
)
insert into @table SELECT n
FROM Numbers
Select (a.id )%5 ,* from @table a join @table b on a.id+5 = b.id and a.id != b.id
;WITH Numbers AS
(
SELECT n = 1, o = 5
UNION ALL
SELECT n + 10, o = o+10
FROM Numbers
WHERE n+1 <= 50
)
select a.id ParentId,a.invoice_id ParentInvoiceId, --b.n, b.o,
c.invoice_id childInvoiceID from @table a
join Numbers b on a.id between b.n and b.o
left join @table c on a.id + 5 = c.id
Here is my solution.
First I create groups (grp) by integer-dividing (in_invoiceid - 1) by 5, ignoring the remainder.
After that I create a category that alternates between consecutive groups (by checking whether the group number modulo 2 is 0 or not).
Then it's a matter of dense-ranking the records within each category, ordered by in_invoiceid.
Lastly, I join the category=1 rows to the category=0 rows that have the same dense_rank.
create table Invoicetable(IN_ID varchar(100), IN_InvoiceID int)
INSERT INTO Invoicetable (IN_ID, IN_InvoiceID)
VALUES
('2345-BCDE-6645-1DDF', 1),
('2345-BCDE-6645-3DDF', 2),
('2345-BCDE-6645-4DDF', 3),
('2345-BCDE-6645-5DDF', 4),
('2345-BCDE-6645-6DDF', 5),
('2345-BCDE-6645-7DDF', 6),
('2345-BCDE-6645-aDDF', 7),
('2345-BCDE-6645-sDDF', 8),
('2345-BCDE-6645-dDDF', 9),
('2345-BCDE-6645-dDDF', 10),
('2345-BCDE-6645-dDDF', 11),
('2345-BCDE-6645-dDDF', 12);
with data
as (
select *
,(in_invoiceid-1)/5 as grp
,case when ((in_invoiceid-1)/5)%2=0 then '1' else '0' end as category
,dense_rank() over(partition by case when ((in_invoiceid-1)/5)%2=0 then '1' else '0' end
order by in_invoiceid) as rnk
from invoicetable a
)
select *
from data a
left join data b
on a.rnk=b.rnk
and b.category=0
where a.category=1
Here is db fiddle link.
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=287f101737c580ca271940764b2536ae
You may try with the following approach. Dividing the table is done with (((ROW_NUMBER() OVER (ORDER BY IN_InvoiceID) - 1) / 5) % 2 = 0) which groups records in left and right groups.
CREATE TABLE #InvoiceTable(
IN_ID varchar(24),
IN_InvoiceID int
)
INSERT INTO #InvoiceTable (IN_ID, IN_InvoiceID)
VALUES
('2345-BCDE-6645-1DDF', 1),
('2345-BCDE-6645-3DDF', 2),
('2345-BCDE-6645-4DDF', 3),
('2345-BCDE-6645-5DDF', 4),
('2345-BCDE-6645-6DDF', 5),
('2345-BCDE-6645-7DDF', 6),
('2345-BCDE-6645-aDDF', 7),
('2345-BCDE-6645-sDDF', 8),
('2345-BCDE-6645-dDDF', 9),
('2345-BCDE-6645-dDDF', 10),
('2345-BCDE-6645-dDDF', 11),
('2345-BCDE-6645-dDDF', 12);
WITH cte AS (
SELECT
IN_ID,
IN_InvoiceID,
CASE
WHEN (((ROW_NUMBER() OVER (ORDER BY IN_InvoiceID) - 1) / 5) % 2 = 0) THEN 'L'
ELSE 'R'
END AS IN_Position
FROM #InvoiceTable
),
cteL AS (
SELECT IN_ID, IN_InvoiceID, ROW_NUMBER() OVER (ORDER BY IN_InvoiceID) AS IN_RowNumber
FROM cte
WHERE IN_Position = 'L'
),
cteR AS (
SELECT IN_ID, IN_InvoiceID, ROW_NUMBER() OVER (ORDER BY IN_InvoiceID) AS IN_RowNumber
FROM cte
WHERE IN_Position = 'R'
)
SELECT cteL.IN_ID, cteL.IN_InvoiceID, cteR.IN_ID, cteR.IN_InvoiceID
FROM cteL
LEFT JOIN cteR ON (cteL.IN_RowNumber = cteR.IN_RowNumber)
Output:
IN_ID IN_InvoiceID IN_ID IN_InvoiceID
2345-BCDE-6645-1DDF 1 2345-BCDE-6645-7DDF 6
2345-BCDE-6645-3DDF 2 2345-BCDE-6645-aDDF 7
2345-BCDE-6645-4DDF 3 2345-BCDE-6645-sDDF 8
2345-BCDE-6645-5DDF 4 2345-BCDE-6645-dDDF 9
2345-BCDE-6645-6DDF 5 2345-BCDE-6645-dDDF 10
2345-BCDE-6645-dDDF 11 NULL NULL
2345-BCDE-6645-dDDF 12 NULL NULL

Select Distinct values once from multiple columns in this table preserving original order?

I have a (subquery) table that lists meal preferences for my friends. Each meal can only be taken once, and each person can only eat one meal.
row_number person_id meal_id
1 1 3
2 2 1
3 2 2
4 2 3
5 3 1
6 3 2
7 3 3
The picking order is determined by the original order of the table, so I would like the result to be:
person_id meal_id
1 3
2 1
3 2
Because meal 1 is taken by user 2, user 3 gets meal 2. I think this could be solved by selecting distinct values in both columns based on their original order, but I cannot figure out how to write that query. Any help appreciated.
Update Added row_number to original table.
If I understand correctly, this is a rather complicated graph walking problem. I should first note that there is no guarantee of an optimal solution -- without lots and lots of work. But you can implement a greedy algorithm using recursive CTEs:
with recursive t as (
select v.*
from (values (1, 1, 3), (2, 2, 1), (3, 2, 2), (4, 2, 3), (5, 3, 1), (6, 3, 2), (7, 3, 3)
) v(row_number, person_id, meal_id)
),
cte (row_number, person_id, meal_id, rows, persons, meals, lev) as (
select row_number, person_id, meal_id, array[row_number], array[person_id], array[meal_id], 1 as lev
from t
where row_number = 1
union all
select t.row_number, t.person_id, t.meal_id,
(case when t.person_id = any(cte.persons) or t.meal_id = any(cte.meals)
then cte.rows
else array_append(cte.rows, t.row_number)
end),
(case when t.person_id = any(cte.persons) or t.meal_id = any(cte.meals)
then cte.persons
else array_append(cte.persons, t.person_id)
end),
(case when t.person_id = any(cte.persons) or t.meal_id = any(cte.meals)
then cte.meals
else array_append(cte.meals, t.meal_id)
end),
cte.lev + 1
from cte join
t
on t.row_number = cte.row_number + 1
)
select t.*
from t cross join
(select rows from cte order by lev desc fetch first 1 row only) as last1
where t.row_number = any (last1.rows);
Here is a db<>fiddle.
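The greedy rule itself (walk the rows in their original order and take a row only if neither its person nor its meal has been used yet) is easy to sanity-check outside the database. The sketch below is a plain Python version of that rule, not a translation of the recursive CTE. The variable names are chosen for illustration.

```python
# (row_number, person_id, meal_id), as in the question's table.
rows = [(1, 1, 3), (2, 2, 1), (3, 2, 2), (4, 2, 3),
        (5, 3, 1), (6, 3, 2), (7, 3, 3)]

taken_people, taken_meals = set(), set()
assignments = []

for _, person_id, meal_id in sorted(rows):  # sorted by row_number
    if person_id in taken_people or meal_id in taken_meals:
        continue  # this person already ate, or the meal is gone
    taken_people.add(person_id)
    taken_meals.add(meal_id)
    assignments.append((person_id, meal_id))

print(assignments)
```

As with the SQL version, this is greedy: an early pick can leave a later person with no meal even when a different ordering would have fed everyone.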