How to remove mirror results in PostgreSQL?

I can't figure out how to remove mirror results from a query like this:
select
b.column1 as result1,
c.column2 as result2
from table a
left join table b on a.column1 = b.column1
left join table c on a.column2 = c.column1;
The results I get are the following:
result1|result2
b1 |b22
b5 |b66
b74 |b31
......
b22 |b1
b66 |b5
b31 |b74
How can I get only the first combination? If there is a combination b1-b22, I don't need b22-b1.
I've tried distinct on the sum of b.column1 and c.column1 (I cast them to int) and it works, but I don't believe it's the best way, since different combinations can produce the same sum and I would lose some data.

This is a bit tricky if you don't have all combinations. Your sample results do not have null, so I will change the joins to inner joins and then use distinct on:
select distinct on (least(b.column1, c.column2), greatest(b.column1, c.column2))
b.column1 as result1, c.column2 as result2
from table a join
table b
on a.column1 = b.column1 join
table c
on a.column2 = c.column1
order by least(b.column1, c.column2), greatest(b.column1, c.column2);
Actually as your query is phrased, the joins don't seem needed at all. So you might consider:
select distinct on (least(a.column1, a.column2), greatest(a.column1, a.column2))
a.column1 as result1, a.column2 as result2
from table a
order by least(a.column1, a.column2), greatest(a.column1, a.column2);
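The idea behind least()/greatest() is to put each pair into a canonical order so that a pair and its mirror collapse to the same key. Here is a minimal sketch of that idea using Python's sqlite3 module. It is not the PostgreSQL query above: SQLite has no DISTINCT ON, so this uses plain DISTINCT with SQLite's two-argument min()/max() standing in for least()/greatest(), and it returns each pair in normalized order rather than in its original orientation. The table name pairs and the function name unique_pairs are made up for the illustration.

```python
import sqlite3


def unique_pairs(pairs):
    """Return each unordered pair once, in normalized (low, high) order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pairs (result1 TEXT, result2 TEXT)")
    conn.executemany("INSERT INTO pairs VALUES (?, ?)", pairs)
    # min(a, b) / max(a, b) with two arguments are scalar functions in SQLite
    # and act like least() / greatest() in PostgreSQL.
    rows = conn.execute(
        """
        SELECT DISTINCT min(result1, result2) AS lo,
                        max(result1, result2) AS hi
        FROM pairs
        ORDER BY lo, hi
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample = [("b1", "b22"), ("b5", "b66"), ("b74", "b31"),
              ("b22", "b1"), ("b66", "b5"), ("b31", "b74")]
    print(unique_pairs(sample))
```

Note that the values are compared as text here, so 'b31' sorts before 'b74' and the third pair comes back as (b31, b74).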

One method is to add an inequality condition in the last join:
select
b.column1 as result1,
c.column2 as result2
from table a
left join table b on a.column1 = b.column1
left join table c on a.column2 = c.column1 and b.column1 < c.column2

"Remove mirror results" sounds like you are asking for a delete. This seems best addressed with an EXISTS clause:
delete from rtab r1
where r1.result1 > r1.result2
and exists (select null
from rtab r2
where r1.result1 = r2.result2
and r1.result2 = r2.result1
);
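To see what this delete keeps, here is a small sketch run against SQLite through Python's sqlite3 module. It assumes the results have been stored in a table called rtab as in the statement above; the helper name delete_mirrors and the sample rows are invented for the illustration. The outer table is referenced by its own name instead of an alias, which is only a choice made for this SQLite sketch.

```python
import sqlite3


def delete_mirrors(pairs):
    """Delete the 'larger first' row of every mirrored pair, keep the rest."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rtab (result1 TEXT, result2 TEXT)")
    conn.executemany("INSERT INTO rtab VALUES (?, ?)", pairs)
    conn.execute(
        """
        DELETE FROM rtab
        WHERE result1 > result2
          AND EXISTS (SELECT 1
                      FROM rtab AS r2
                      WHERE r2.result1 = rtab.result2
                        AND r2.result2 = rtab.result1)
        """
    )
    rows = conn.execute(
        "SELECT result1, result2 FROM rtab ORDER BY result1, result2"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample = [("b1", "b22"), ("b22", "b1"),
              ("b74", "b31"), ("b31", "b74"),
              ("b9", "b2")]  # no mirror row, so it is kept as-is
    print(delete_mirrors(sample))
```

A row without a mirror is never deleted, even when its first value is the larger one, because the EXISTS test fails for it.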

Related

SQL inner join and where performance comparison

If we have 2 tables, tableA (with column1, column2) and tableB (with column1, column2), what's the difference between the following two queries? Which one has better performance? What if we have indexing for both tables?
Query #1:
select
b.column2
from
tableA a,
tableB b
where
a.column1 = b.column1
and a.column2 = ?;
Query #2:
select
b.column2
from
tableA a
inner join
tableB b on a.column1 = b.column1
where
a.column2 = ?;

The 2nd query has better performance.
Your first query uses a cross join and then filters the results. Imagine having 10000 records in both tables: it would produce 10000*10000 combinations.

Both will perform equally. One uses the ANSI join syntax and the other uses the old-fashioned comma-style join.
Compare the explain plans; most likely you will find them to be the same.
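One way to check this on your own data is to run both forms and look at their plans side by side. The sketch below does that with SQLite through Python's sqlite3 module, only because it is easy to run anywhere; the sample rows and the index idx_b_col1 are assumptions, and the planner of another database may behave differently, so treat it as a template for the comparison rather than as proof.

```python
import sqlite3

COMMA_STYLE = """
    SELECT b.column2
    FROM tableA a, tableB b
    WHERE a.column1 = b.column1
      AND a.column2 = ?
    ORDER BY b.column2
"""

ANSI_STYLE = """
    SELECT b.column2
    FROM tableA a
    INNER JOIN tableB b ON a.column1 = b.column1
    WHERE a.column2 = ?
    ORDER BY b.column2
"""


def compare_join_styles(value):
    """Run both query styles; return {name: (rows, plan_details)}."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tableA (column1 INTEGER, column2 INTEGER);
        CREATE TABLE tableB (column1 INTEGER, column2 TEXT);
        CREATE INDEX idx_b_col1 ON tableB (column1);
        INSERT INTO tableA VALUES (1, 10), (2, 20), (3, 10);
        INSERT INTO tableB VALUES (1, 'x'), (2, 'y'), (3, 'z');
        """
    )
    results = {}
    for name, sql in (("comma", COMMA_STYLE), ("ansi", ANSI_STYLE)):
        rows = conn.execute(sql, (value,)).fetchall()
        # The last column of EXPLAIN QUERY PLAN output is the readable detail.
        plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql, (value,))]
        results[name] = (rows, plan)
    conn.close()
    return results


if __name__ == "__main__":
    for name, (rows, plan) in compare_join_styles(10).items():
        print(name, rows, plan)
```

Both forms return the same rows; printing the two plan lists lets you confirm whether your engine also plans them the same way.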

Access (VBA) - Append Records Not Already In Table

I have append queries A and B that append to Table C. I am creating a report based on C, but I need only the distinct rows. I could select the distinct rows from C, but I want to get rid of duplicates as I go (i.e., so Table C does not accumulate 1,000,000,000+ records over time with each append), so that the report has ALL the unique records from C, past, present, and future, until the end user deletes them.
My question is simply this: is there any way to append only distinct rows to Table C (not append distinct, but rather keep the table distinct as rows are appended)?
If not directly possible, VBA?

Use constraints to enforce this behavior, in this case a (composite) primary key: https://support.office.com/en-us/article/Add-or-change-a-table-s-primary-key-in-Access-07b4a84b-0063-4d56-8b00-65f2975e4379
This makes sure that you can't insert duplicate values into your table, which means that you won't have to delete the duplicates later on.
Define the primary key over all the columns that together make a row unique.
Before adding a primary key or constraint, make sure to clean up your data by removing all duplicate rows. The easiest way is probably to create a new table and fill it using a SELECT DISTINCT ... from your current table.
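As a rough illustration of how a composite primary key rejects duplicates at insert time, here is a sketch using SQLite through Python's sqlite3 module instead of Access. The three text columns and the helper names make_table and append_rows are assumptions for the example; in Access the rejected rows would show up as key violations in the append query rather than as Python exceptions.

```python
import sqlite3


def make_table():
    """Create TableC with a composite primary key over all three columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE TableC (
            Column1 TEXT NOT NULL,
            Column2 TEXT NOT NULL,
            Column3 TEXT NOT NULL,
            PRIMARY KEY (Column1, Column2, Column3)
        )
        """
    )
    return conn


def append_rows(conn, rows):
    """Try to insert every row; return how many were actually added."""
    inserted = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO TableC VALUES (?, ?, ?)", row)
            inserted += 1
        except sqlite3.IntegrityError:
            # The key already exists, so the database refuses the duplicate.
            pass
    return inserted


if __name__ == "__main__":
    conn = make_table()
    print(append_rows(conn, [("a", "b", "c"), ("a", "b", "c"), ("a", "b", "d")]))
    print(append_rows(conn, [("a", "b", "c")]))
```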

Consider using any of the three NOT IN, NOT EXISTS, or LEFT JOIN...NULL queries to append data not currently in that table. Below assumes a primary key, ID, is used for the distinctness.
INSERT INTO TableC (Column1, Column2, Column3)
SELECT a.Column1, a.Column2, a.Column3
FROM QueryA a
LEFT JOIN TableC c
ON a.ID = c.ID
WHERE c.ID IS NULL;
INSERT INTO TableC (Column1, Column2, Column3)
SELECT a.Column1, a.Column2, a.Column3
FROM QueryA a
WHERE NOT EXISTS
(SELECT 1 FROM TableC c
WHERE a.ID = c.ID);
INSERT INTO TableC (Column1, Column2, Column3)
SELECT a.Column1, a.Column2, a.Column3
FROM QueryA a
WHERE a.ID NOT IN
(SELECT c.ID FROM TableC c);
Now, if no single column but multiple fields denote uniqueness, add to the JOIN or WHERE clauses:
...
FROM QueryA a
LEFT JOIN TableC c
ON a.Column1 = c.Column1 AND a.Column2 = c.Column2 AND a.Column3 = c.Column3
WHERE c.Column1 IS NULL;
...
WHERE NOT EXISTS
(SELECT 1 FROM TableC c
WHERE a.Column1 = c.Column1 AND a.Column2 = c.Column2 AND a.Column3 = c.Column3);
...
WHERE a.Column1 NOT IN
(SELECT c.Column1 FROM TableC c)
AND a.Column2 NOT IN
(SELECT c.Column2 FROM TableC c)
AND a.Column3 NOT IN
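(SELECT c.Column3 FROM TableC c);
Note that this NOT IN form tests each column independently, so it also skips rows whose values each appear somewhere in TableC but never together in one row. For multi-column uniqueness, prefer the LEFT JOIN or NOT EXISTS form.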
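Here is a small sketch of the multi-column NOT EXISTS append, run on SQLite through Python's sqlite3 module rather than Access. QueryA is modelled as an ordinary table, and the sample rows and the helper name append_new_rows are invented for the illustration. A DISTINCT is added so that duplicates inside the source itself are not appended twice.

```python
import sqlite3


def append_new_rows(existing, incoming):
    """Append rows from QueryA whose three-column combination is not in TableC."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE TableC (Column1 TEXT, Column2 TEXT, Column3 TEXT);
        CREATE TABLE QueryA (Column1 TEXT, Column2 TEXT, Column3 TEXT);
        """
    )
    conn.executemany("INSERT INTO TableC VALUES (?, ?, ?)", existing)
    conn.executemany("INSERT INTO QueryA VALUES (?, ?, ?)", incoming)
    conn.execute(
        """
        INSERT INTO TableC (Column1, Column2, Column3)
        SELECT DISTINCT a.Column1, a.Column2, a.Column3
        FROM QueryA a
        WHERE NOT EXISTS
            (SELECT 1 FROM TableC c
             WHERE c.Column1 = a.Column1
               AND c.Column2 = a.Column2
               AND c.Column3 = a.Column3)
        """
    )
    rows = conn.execute(
        "SELECT Column1, Column2, Column3 FROM TableC "
        "ORDER BY Column1, Column2, Column3"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    existing = [("a", "b", "c"), ("d", "e", "f")]
    incoming = [("a", "b", "c"),   # already present, skipped
                ("a", "e", "c"),   # known values, new combination, appended
                ("g", "h", "i"),   # entirely new, appended
                ("g", "h", "i")]   # duplicate within the source, appended once
    print(append_new_rows(existing, incoming))
```

The row ("a", "e", "c") is the interesting case: every one of its values already exists in TableC, but not together in one row, so a whole-row comparison appends it.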

Combining SQL Queries to pass result from 1 as a parameter of the 2nd (SQL Server)

I am a little out of practice with SQL and I am trying to verify some data that has been converted in a system. Some of the queries I originally developed prior to the conversion are not proving out the work. I have been able to trace the source data back and verify that conversion was correct, but this is on an account by account basis. I would like to have a query to show the full dataset.
I have been able to work a solution down to 2 queries, but I cannot figure out how to combine them into one piece to show the full data set, where one value from the first query needs to be an element in the second query.
Query 1
select distinct
CreatedDate, AccountNum
From
Table1 A
Join
Table2 B on A.Column1 = B.Column1 and a.Column2 = b.Column2
Join
Table3 C on A.Column3 = C.Column3 and A.Column4 = C.Column4
where
Condition A and Condition B
Query 2
Select distinct
AccountNum, Responsible
From
Table3 D
Join
Table4 E on D.Column1 = E.Column2
where
StartDate <= 'DateValue' and EndDate > 'DateValue'
I would like to use the CreatedDate value from query 1 as the DateValue in query 2, but I have not found a solution to give the results I am looking for.
If I add a qualifier to each query, like account number, I end up with 1 result from query 1. I then put that CreatedDate into query 2 and I get the results I want. If I only have the account number on the 2nd query, I get two results, one from time period A to B with a responsible value of X and the 2nd from time period C to D with Responsible Value Y, which is where the CreateDate value falls between. Everything I have tried to combine these queries either ends up with a Responsible value of X (or no results), when I want that Y value.
I have not been able to successfully integrate the two queries, so that I can have that CreatedDate value passed as a parameter to figure out the Responsible value.
A solution that would work would be to create an intermediate table for the results of the 1st query and then join that table to 2nd query. However, I do not have access to create/insert/update tables/records on the database, so I cannot use this method.

I think you are looking for this:
SELECT DISTINCT accountnum,
responsible
FROM table1 A
JOIN table2 B
ON A.column1 = B.column1
AND a.column2 = b.column2
JOIN table3 C
ON A.column3 = C.column3
AND A.column4 = C.column4
JOIN table4 D
ON C.column1 = D.column2
AND startdate <= createddate
AND enddate > createddate
where Condition A and Condition B
Note: you may have to qualify the columns with the proper alias names.

Select distinct AccountNum, Responsible
From Table3 D
Join Table4 E on D.Column1 = E.Column2
Join (
select distinct CreatedDate, AccountNum
From Table1 A
Join Table2 B on A.Column1 = B.Column1 and a.Column2 = b.Column2
Join Table3 C on A.Column3 = C.Column3 and A.Column4 = C.Column4
where Condition A and Condition B
) X
on D.AccountNum=X.AccountNum
and D.StartDate <= X.CreatedDate and EndDate > X.CreatedDate
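The key step is joining the first query's result back on both the account number and the date range, so that each CreatedDate picks out the row whose period contains it. Here is a minimal sketch of that step on SQLite through Python's sqlite3 module, not SQL Server. The two tables created and responsibility are simplified stand-ins for the results of the original joins, and the dates are made-up ISO strings so that text comparison orders them correctly.

```python
import sqlite3


def responsible_at_creation():
    """Return (AccountNum, Responsible) for the period containing CreatedDate."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Stand-in for the result of query 1.
        CREATE TABLE created (AccountNum INTEGER, CreatedDate TEXT);
        -- Stand-in for the joined tables of query 2.
        CREATE TABLE responsibility (
            AccountNum INTEGER, Responsible TEXT,
            StartDate TEXT, EndDate TEXT
        );
        INSERT INTO created VALUES (1001, '2015-06-15');
        INSERT INTO responsibility VALUES
            (1001, 'X', '2014-01-01', '2015-01-01'),
            (1001, 'Y', '2015-01-01', '2016-01-01');
        """
    )
    rows = conn.execute(
        """
        WITH x AS (SELECT DISTINCT CreatedDate, AccountNum FROM created)
        SELECT DISTINCT r.AccountNum, r.Responsible
        FROM responsibility r
        JOIN x ON r.AccountNum = x.AccountNum
              AND r.StartDate <= x.CreatedDate
              AND r.EndDate > x.CreatedDate
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(responsible_at_creation())
```

Only the 'Y' period contains the creation date, so the 'X' row is filtered out by the range condition in the join. No intermediate table has to be created in the database, since the first query lives in a derived table (here a CTE).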

Another solution is to make the first query into a table-valued UDF:
Create function GetCreateDateAndAcctId([Parameters for 2 conditions here])
Returns table As
Return
select distinct CreatedDate, AccountNum
From Table1 a
Join Table2 b
on b.Column1 = a.Column1
and b.Column2 = a.Column2
Join Table3 c
on c.Column3 = a.Column3
and c.Column4 = a.Column4
where condition1 -- here put predicate
and condition2 -- using input parameters
Then, to use it, just include it as a table in your second query like this:
Select distinct AccountNum, Responsible
From Table3 d
Join Table4 e
on e.Column2 = d.Column1
outer apply dbo.GetCreateDateAndAcctId(Parameters) cd
where StartDate <= cd.CreatedDate and EndDate > cd.CreatedDate
If you do this, the logic for the first query remains in a separate database object, which gives you reusability (you can use it in any other process without copying it) and better maintainability (there is only one place to fix bugs and make enhancements). Also, since it's a table-valued UDF, the SQL Server query processor will actually combine it with the second query's SQL into a single reusable compiled execution plan.

Complete a table with data from another one

I have two tables A(column1,column2) and B(column1,column2).
How can I check that a value from A(column1) is not already contained in B(column1), and insert it into that column if it is missing?
My query would look something like this:
insert into B.column1 values()
where
...
I want to fill B.column1 with data from A.column1.
What should I put in the where clause?

Insert Into B(column1)
Select A.Column1
From A
Where A.Column1 not in (Select Column1 From B)

I would use the MINUS operator to select all values from A(column1) that are not in B(column1), and then insert the result into table B.

insert into B
select a.column1, a.column2 from a
left join b
on a.column1 = b.column1
where b.column1 is null
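One caveat worth checking before choosing between these forms: if B.column1 can contain NULL, a NOT IN subquery matches nothing, while NOT EXISTS (or the left join with an IS NULL test) still finds the missing values. The sketch below shows the difference on SQLite through Python's sqlite3 module; the sample rows and the helper name fill_missing are assumptions for the illustration.

```python
import sqlite3

NOT_IN_SQL = """
    INSERT INTO B (column1)
    SELECT column1 FROM A
    WHERE column1 NOT IN (SELECT column1 FROM B)
"""

NOT_EXISTS_SQL = """
    INSERT INTO B (column1)
    SELECT a.column1 FROM A a
    WHERE NOT EXISTS (SELECT 1 FROM B b WHERE b.column1 = a.column1)
"""


def fill_missing(insert_sql):
    """Run one insert variant and return the non-NULL column1 values in B."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE A (column1 INTEGER, column2 TEXT);
        CREATE TABLE B (column1 INTEGER, column2 TEXT);
        INSERT INTO A VALUES (1, 'a'), (2, 'b'), (3, 'c');
        -- B already holds 1, plus one row whose column1 is NULL.
        INSERT INTO B VALUES (1, 'a'), (NULL, 'z');
        """
    )
    conn.execute(insert_sql)
    rows = conn.execute(
        "SELECT column1 FROM B WHERE column1 IS NOT NULL ORDER BY column1"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print("NOT IN:    ", fill_missing(NOT_IN_SQL))
    print("NOT EXISTS:", fill_missing(NOT_EXISTS_SQL))
```

With the NULL present, the NOT IN variant inserts nothing, because comparing a value against NULL is neither true nor false. If B.column1 is declared NOT NULL, both variants behave the same.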

SQL Selecting from 3 tables returns syntax error

I have 3 tables from which I need to select data in an MS Access DB.
I tried this SQL:
SELECT a.column1, a.column2, a.column3, a.columnID, b.column1
From TableA a INNER JOIN TableB b
ON a.columnID = b.columnID INNER JOIN TableC c
ON c.columnID = a.columnRelativeID
WHERE a.columnID=16
However, when I try to execute the query I receive a syntax error.
When I remove the second join (the one with the third table), the query works fine, so that is where the error is.
This example of joining 3 tables didn't help me understand where my problem is.
Is it OK if I just select from two tables and fill in the third table's data using LINQ in C#? I have the third table's data in a data source in my code.
Thanks in advance,
Oz.

You can absolutely select from three (or more) tables in MS Access. However, you have to use Access' craptastic parenthesis system which pairs tables together in the From clause.
Select A.Column1, A.Column2, A.Column3, A.ColumnID, B.Column1
From (Table1 AS A
Inner Join Table2 AS B
On A.ColumnID = B.ColumnId)
Inner Join Table3 AS C
ON A.ColumnRelativeId = C.ColumnId
Where A.ColumnId = 16

SELECT a.column1, a.column2, a.column3, a.columnID, b.column1
From TableA a, TableB b, TableC c
WHERE a.columnID = b.columnID
AND c.columnID = a.columnRelativeID
AND a.columnID=16
