Oracle SQL - Select duplicates based on two columns

I need to select duplicate rows based on two columns in a join, and I can't figure out how that is done.
Currently I have this:
SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id
And the output is something along the lines of below:
| Name | adm_id | external_code |identifier_value |
|:-----------|------------:|:------------: |:----------------:|
| Warlob | 66323 | ext531 | id444 |
| Ozzy | 53123 | ext632 | id333 |
| Motorhead | 521 | ext733 | id222 |
| Perez | 123 | ext833 | id111 |
| Starlight | 521 | ext934 | id222 |
| Aligned | 123 | ext235 | id111 |
What I am looking for is how to select just these 4 rows, as they are duplicates based on the columns adm_id and identifier_value:
| Name | adm_id | external_code |identifier_value |
|:-----------|------------:|:------------: |:----------------:|
| Motorhead | 521 | ext733 | id222 |
| Perez | 123 | ext833 | id111 |
| Starlight | 521 | ext934 | id222 |
| Aligned | 123 | ext235 | id111 |

First group by ADMINISTRATIVE_SITE_ID and IDENTIFIER_VALUE and find the groups that have more than one row.
Then select all rows that match these pairs:
SELECT S.NAME
,ADMINISTRATIVE_SITE_ID AS ADM_ID
,S.EXTERNAL_CODE
,SI.IDENTIFIER_VALUE
FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
WHERE (ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE) IN (SELECT ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE
FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
-- group by the real column names: older Oracle versions reject a SELECT alias here
GROUP BY ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE
HAVING COUNT(*) > 1)

Or an alternate way that may perform better on big datasets:
with t as (
SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value,
COUNT(*) OVER (PARTITION BY administrative_site_id ,identifier_value ) AS cnt
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id)
select name, adm_id, external_code, identifier_value
from t
where cnt > 1
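
Both approaches can be checked against the sample rows from the question. The following is a minimal sketch, not the poster's setup: it uses Python's built-in sqlite3 as a stand-in for Oracle, and the table definitions, the id values, and the assumption that administrative_site_id lives in suppliers are all made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Oracle; the schema and ids below are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (
        id INTEGER PRIMARY KEY,
        name TEXT,
        administrative_site_id INTEGER,
        external_code TEXT
    );
    CREATE TABLE suppliers_identifier (
        supplier_id INTEGER,
        identifier_value TEXT
    );
    INSERT INTO suppliers VALUES
        (1, 'Warlob',    66323, 'ext531'),
        (2, 'Ozzy',      53123, 'ext632'),
        (3, 'Motorhead',   521, 'ext733'),
        (4, 'Perez',       123, 'ext833'),
        (5, 'Starlight',   521, 'ext934'),
        (6, 'Aligned',     123, 'ext235');
    INSERT INTO suppliers_identifier VALUES
        (1, 'id444'), (2, 'id333'), (3, 'id222'),
        (4, 'id111'), (5, 'id222'), (6, 'id111');
""")

# Approach 1: find the duplicated pairs with GROUP BY / HAVING, then filter on them.
grouped_sql = """
    SELECT s.name, s.administrative_site_id, s.external_code, si.identifier_value
    FROM suppliers s
    JOIN suppliers_identifier si ON s.id = si.supplier_id
    WHERE (s.administrative_site_id, si.identifier_value) IN (
        SELECT s2.administrative_site_id, si2.identifier_value
        FROM suppliers s2
        JOIN suppliers_identifier si2 ON s2.id = si2.supplier_id
        GROUP BY s2.administrative_site_id, si2.identifier_value
        HAVING COUNT(*) > 1)
    ORDER BY s.name
"""

# Approach 2: count the rows per pair with a window function and keep cnt > 1.
windowed_sql = """
    WITH t AS (
        SELECT s.name, s.administrative_site_id AS adm_id,
               s.external_code, si.identifier_value,
               COUNT(*) OVER (PARTITION BY s.administrative_site_id,
                                           si.identifier_value) AS cnt
        FROM suppliers s
        JOIN suppliers_identifier si ON s.id = si.supplier_id)
    SELECT name, adm_id, external_code, identifier_value
    FROM t
    WHERE cnt > 1
    ORDER BY name
"""

grouped_rows = conn.execute(grouped_sql).fetchall()
windowed_rows = conn.execute(windowed_sql).fetchall()
for row in windowed_rows:
    print(row)
```

On this data both queries return the same four rows (Aligned, Motorhead, Perez, Starlight). Which one is faster on a real Oracle dataset depends on the data and indexes, so it is worth comparing execution plans.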

Related

Can someone help me figure out if I'm making a mistake in my query?

I'm trying to create a query that returns the names of all people in my database that have less than half of the money of the person with the most money.
This is my query:
select P1.name
from Persons P1 left join
AccountOf A1 on A1.person_id = P1.id left join
BankAccounts B1 on B1.id = A1.account_id
group by name
having SUM(B1.balance) < MAX((select SUM(B1.balance) as b
from AccountOf A1 left join
BankAccounts B1 on B1.id = A1.account_id
group by A1.person_id
order by b desc
LIMIT 1)) * 0.5
This is the result:
+-------+
| name |
+-------+
| Evert |
+-------+
I have the following tables in the database:
+---------+--------+--+
| Persons | | |
+---------+--------+--+
| id | name | |
| 11 | Evert | |
| 12 | Xavi | |
| 13 | Ludwig | |
| 14 | Ziggy | |
+---------+--------+--+
+--------------+---------+
| BankAccounts | |
+--------------+---------+
| id | balance |
| 11 | 525000 |
| 12 | 750000 |
| 13 | 1900000 |
| 14 | 1600000 |
+--------------+---------+
+-----------+-----------+------------+
| AccountOf | | |
+-----------+-----------+------------+
| id | person_id | account_id |
| 301 | 11 | 12 |
| 302 | 13 | 12 |
| 303 | 13 | 14 |
| 304 | 14 | 11 |
| 305 | 14 | 13 |
+-----------+-----------+------------+
What am I missing here? I should get two entries in the result (Evert, Xavi)

I wouldn't approach the logic this way (I would use window functions). But your final having has two levels of aggregation. That shouldn't work. You want:
having SUM(B1.balance) < (select 0.5 * SUM(B1.balance) as b
from AccountOf A1 join
BankAccounts B1 on B1.id = A1.account_id
group by A1.person_id
order by b desc
limit 1
)
I also moved the 0.5 into the subquery and changed the left join to a join -- the tables need to match to get balances.

I would recommend window functions, if your - undisclosed! - database supports them.
You can join and aggregate just once, and then use a window max() to get the top balance. All that is left is to filter in an outer query:
select *
from (
select p.id, p.name, coalesce(sum(balance), 0) balance,
max(sum(balance)) over() max_balance
from persons p
left join accountof ao on ao.person_id = p.id
left join bankaccounts ba on ba.id = ao.account_id
group by p.id, p.name
) t
where balance < max_balance * 0.5
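
The window-function idea can be tried on the tables from the question. This is only a sketch under assumptions: it runs on Python's built-in sqlite3 because the question does not say which database is used, the column types are guessed, and it splits the aggregation and the window max() into two CTEs (totals and ranked, both names made up here) instead of nesting them in one SELECT.

```python
import sqlite3

# In-memory SQLite as a stand-in; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE BankAccounts (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE AccountOf (id INTEGER PRIMARY KEY, person_id INTEGER, account_id INTEGER);
    INSERT INTO Persons VALUES (11, 'Evert'), (12, 'Xavi'), (13, 'Ludwig'), (14, 'Ziggy');
    INSERT INTO BankAccounts VALUES (11, 525000), (12, 750000), (13, 1900000), (14, 1600000);
    INSERT INTO AccountOf VALUES
        (301, 11, 12), (302, 13, 12), (303, 13, 14), (304, 14, 11), (305, 14, 13);
""")

half_of_max_sql = """
    WITH totals AS (
        -- One row per person; people without accounts get a balance of 0.
        SELECT p.id, p.name, COALESCE(SUM(ba.balance), 0) AS balance
        FROM Persons p
        LEFT JOIN AccountOf ao ON ao.person_id = p.id
        LEFT JOIN BankAccounts ba ON ba.id = ao.account_id
        GROUP BY p.id, p.name
    ),
    ranked AS (
        -- Attach the largest total to every row.
        SELECT name, balance, MAX(balance) OVER () AS max_balance
        FROM totals
    )
    SELECT name
    FROM ranked
    WHERE balance < max_balance * 0.5
    ORDER BY name
"""

poor_names = [row[0] for row in conn.execute(half_of_max_sql)]
print(poor_names)
```

With the sample data the totals are 750000 (Evert), 0 (Xavi), 2350000 (Ludwig) and 2425000 (Ziggy), so only Evert and Xavi fall below half of the maximum. The left joins plus COALESCE are what keep Xavi, who has no account, in the result.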

Joining table on two columns only joins it on a single

How do I correctly join a table on two columns? My issue is that the result is not correct, as it only joins on a single column.
This question started off in this other question: SQL query returns product of results instead of sum . I am creating a new question as there is another issue I am trying to solve.
I join a table of materials on a table which contains multiple supply and disposal movements. Each movement references a material id. I would like to join the material on each movement.
My query:
SELECT supply_material_refer, disposal_material_refer, material_id, material_name
FROM "construction_sites"
JOIN projects ON construction_sites.project_refer = projects.project_id
JOIN addresses ON construction_sites.address_refer = addresses.address_id
cross join lateral ( select *
from (select row_number() over () as rn, *
from supplies
where supplies.supply_project_refer = projects.project_id) as supplies
full join (select row_number() over () as rn, *
from disposals
where disposals.disposal_project_refer = projects.project_id
) as disposals
on (supplies.rn = disposals.rn)
) as combined
LEFT JOIN materials material ON combined.disposal_material_refer = material.material_id
OR combined.supply_material_refer = material.material_id
WHERE (projects.project_name = 'Project 15')
ORDER BY construction_site_id asc;
The result of the query:
+-----------------------+-------------------------+-------------+---------------+
| supply_material_refer | disposal_material_refer | material_id | material_name |
+-----------------------+-------------------------+-------------+---------------+
| 1 | 1 | 1 | Materialtest |
| 2 | 1 | 1 | Materialtest |
| 2 | 1 | 2 | Dirt |
| 1 | 1 | 1 | Materialtest |
| 2 | 1 | 1 | Materialtest |
| 2 | 1 | 2 | Dirt |
| 1 | (null) | 1 | Materialtest |
| 4 | (null) | 4 | Stones |
+-----------------------+-------------------------+-------------+---------------+
An example line I have issues with:
+------------------------+-------------------------+-------------+---------------+
| supply_material_refer | disposal_material_refer | material_id | material_name |
+------------------------+-------------------------+-------------+---------------+
| 2 | 1 | 1 | Materialtest |
+------------------------+-------------------------+-------------+---------------+
A preferred output would be like:
+------------------------+----------------------+-------------------------+------------------------+
| supply_material_refer | supply_material_name | disposal_material_refer | disposal_material_name |
+------------------------+----------------------+-------------------------+------------------------+
| 2 | Dirt | 1 | Materialtest |
+------------------------+----------------------+-------------------------+------------------------+
I have created a sqlfiddle with dummy data: http://www.sqlfiddle.com/#!17/863d78/2
To my understanding the solution would be to have a disposal_material column and a supply_material column for the material names. I do not know how I can achieve this goal though...
Thanks for any help!
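
One possible way to get a separate name column for each reference is to join the materials table twice under two different aliases, once per reference column, instead of once with OR. The sketch below only illustrates that idea on Python's built-in sqlite3: the movements table is a hypothetical stand-in for the "combined" lateral subquery from the question, and its rows are made up to resemble the output shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE materials (material_id INTEGER PRIMARY KEY, material_name TEXT);
    -- Hypothetical stand-in for the "combined" subquery in the question.
    CREATE TABLE movements (
        movement_id INTEGER PRIMARY KEY,
        supply_material_refer INTEGER,
        disposal_material_refer INTEGER
    );
    INSERT INTO materials VALUES (1, 'Materialtest'), (2, 'Dirt'), (4, 'Stones');
    INSERT INTO movements VALUES (1, 1, 1), (2, 2, 1), (3, 4, NULL);
""")

# Each alias of materials resolves exactly one of the two reference columns.
paired_rows = conn.execute("""
    SELECT mv.supply_material_refer,
           supply_material.material_name   AS supply_material_name,
           mv.disposal_material_refer,
           disposal_material.material_name AS disposal_material_name
    FROM movements mv
    LEFT JOIN materials supply_material
           ON supply_material.material_id = mv.supply_material_refer
    LEFT JOIN materials disposal_material
           ON disposal_material.material_id = mv.disposal_material_refer
    ORDER BY mv.movement_id
""").fetchall()

for row in paired_rows:
    print(row)
```

Because each join matches at most one material per row, a movement no longer fans out into one output row per matching material, which is what the OR condition causes. The second movement comes back as (2, 'Dirt', 1, 'Materialtest').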

PostgreSQL can't make Self Join

I have a table:
| acctg_cath_id | parent | description |
|---------------|--------|-------------|
| 1 | 20 | Bills |
| 9 | 20 | Invoices |
| 20 | | Expenses |
| 88 | 30 | |
| 89 | 30 | |
| 30 | | |
And I want to create a self join in order to group my items under a parent.
I have tried this, but it doesn't work:
SELECT
accounting.categories.acctg_cath_id,
accounting.categories.parent
FROM accounting.categories a1, accounting.categories a2
WHERE a1.acctg_cath_id=a2.parent
I get error: invalid reference to FROM-clause entry for table "categories"
When I try:
a.accounting.categories.acctg_cath_id
b.accounting.categories.acctg_cath_id
I get error: cross-database references are not implemented: a.accounting.categories.acctg_cath_id
Desired output:
Expenses (Parent 20)
Bills (Child 1)
Invoices (Child 9)
What am I doing wrong here?

It seems you merely want to sort the rows:
select *
from accounting.categories
order by coalesce(parent, acctg_cath_id), parent nulls first, acctg_cath_id;
Result:
+---------------+--------+-------------+
| acctg_cath_id | parent | description |
+---------------+--------+-------------+
| 20 | | Expenses |
| 1 | 20 | Bills |
| 9 | 20 | Invoices |
| 30 | | |
| 88 | 30 | |
| 89 | 30 | |
+---------------+--------+-------------+

Your syntax is performing a cross join:
FROM accounting.categories a1, accounting.categories a2
Try the following:
SELECT
a2.acctg_cath_id,
a2.parent
FROM accounting.categories a1
JOIN accounting.categories a2 ON (a1.acctg_cath_id = a2.parent)
;
Examine the DBFiddle.

You don't need grouping, only self join:
select
c.acctg_cath_id parentid, c.description parent,
cc.acctg_cath_id childid, cc.description child
from (
select distinct parent
from categories
) p inner join categories c
on p.parent = c.acctg_cath_id
inner join categories cc on cc.parent = p.parent
where p.parent = 20
You can remove the WHERE clause if you want all the parents with all their children.
See the demo.
Results:
| parentid | parent | childid | child |
|---------:|:---------|--------:|:---------|
| 20 | Expenses | 1 | Bills |
| 20 | Expenses | 9 | Invoices |

You don't need a self-join. You don't need aggregation. You just need an order by clause:
SELECT ac.*
FROM accounting.categories ac
ORDER BY COALESCE(ac.parent, ac.acctg_cath_id),
(CASE WHEN ac.parent IS NULL THEN 1 ELSE 2 END),
ac.acctg_cath_id;
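
The ordering trick can be verified on the rows from the question. This is a small sketch on Python's built-in sqlite3 rather than PostgreSQL: the table is called categories without the accounting schema, the column types are assumed, and the rows are inserted out of order on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Schema-less stand-in for accounting.categories; types are assumed.
    CREATE TABLE categories (acctg_cath_id INTEGER PRIMARY KEY, parent INTEGER, description TEXT);
    INSERT INTO categories VALUES
        (89, 30, NULL), (9, 20, 'Invoices'), (30, NULL, NULL),
        (1, 20, 'Bills'), (88, 30, NULL), (20, NULL, 'Expenses');
""")

ordered_rows = conn.execute("""
    SELECT ac.acctg_cath_id, ac.parent, ac.description
    FROM categories ac
    ORDER BY COALESCE(ac.parent, ac.acctg_cath_id),            -- keep each family together
             (CASE WHEN ac.parent IS NULL THEN 1 ELSE 2 END),  -- parent row before its children
             ac.acctg_cath_id                                  -- children in id order
""").fetchall()

ordered_ids = [row[0] for row in ordered_rows]
print(ordered_ids)
```

The first sort key maps every row to its family (a parent maps to its own id), so the result lists 20, 1, 9 and then 30, 88, 89.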

Can't show all records with the same id while join in oracle xe 11g

I'm getting this message while using this query, is there anything wrong?
SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) total
FROM transaksi t, transaksi_detail td, operator o
WHERE td.transaksi_id = t.transaksi_id AND o.operator_id = t.operator_id
GROUP BY t.transaksi_id
Update:
After using the answer from @Barbaros Özhan with this query:
SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) total
FROM transaksi t
INNER JOIN transaksi_detail td ON ( td.transaksi_id = t.transaksi_id )
INNER JOIN operator o ON ( o.operator_id = t.operator_id )
GROUP BY t.tanggal_transaksi, o.nama_lengkap;
the data is displayed successfully, but there is a problem: transactions that share the same operator_id are merged, so each operator appears only once. Here is the sample data:
+--------------+-------------+-------------------+
| TRANSAKSI_ID | OPERATOR_ID | TANGGAL_TRANSAKSI |
+--------------+-------------+-------------------+
| 1 | 5 | 09/29/2018 |
| 2 | 3 | 09/29/2018 |
| 3 | 3 | 09/29/2018 |
| 4 | 1 | 09/29/2018 |
| 5 | 1 | 09/29/2018 |
+--------------+-------------+-------------------+
After running the query, the output is:
+-------------------+------------------+--------+
| TANGGAL_TRANSAKSI | NAMA_LENGKAP | TOTAL |
+-------------------+------------------+--------+
| 09/29/2018 | Lina Harun | 419800 |
| 09/29/2018 | Titro Kusumo | 484000 |
| 09/29/2018 | Muhammad Kusnadi | 402000 |
+-------------------+------------------+--------+
Looking at the operator table, two of the operator_id values (1 and 3) have more than one transaction, but those transactions are not shown separately:
+-------------+------------------+
| OPERATOR_ID | NAMA_LENGKAP |
+-------------+------------------+
| 1 | Muhammad Kusnadi |
| 3 | Lina Harun |
| 5 | Tirto Kusumo |
+-------------+------------------+

The GROUP BY list needs to contain the non-aggregated columns of the SELECT list (t.tanggal_transaksi, o.nama_lengkap), and not other columns such as t.transaksi_id. So you might use the following without any issue:
SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) total
FROM transaksi t
INNER JOIN transaksi_detail td ON ( td.transaksi_id = t.transaksi_id )
INNER JOIN operator o ON ( o.operator_id = t.operator_id )
GROUP BY t.tanggal_transaksi, o.nama_lengkap;
Or this one :
SELECT t.transaksi_id, SUM(td.harga * td.qty) total
FROM transaksi t
INNER JOIN transaksi_detail td ON ( td.transaksi_id = t.transaksi_id )
GROUP BY t.transaksi_id;
P.S. Prefer the ANSI-92 JOIN syntax over the old-style comma join.
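
If one row per transaction is wanted, as the update to the question suggests, one option is to add t.transaksi_id to both the SELECT list and the GROUP BY list. The sketch below compares the two groupings on Python's built-in sqlite3 as a stand-in for Oracle XE. The transaksi and operator rows come from the question, while the column types and every row in transaksi_detail are invented for illustration, so the totals do not match the question's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operator (operator_id INTEGER PRIMARY KEY, nama_lengkap TEXT);
    CREATE TABLE transaksi (transaksi_id INTEGER PRIMARY KEY, operator_id INTEGER,
                            tanggal_transaksi TEXT);
    CREATE TABLE transaksi_detail (transaksi_id INTEGER, harga INTEGER, qty INTEGER);
    INSERT INTO operator VALUES (1, 'Muhammad Kusnadi'), (3, 'Lina Harun'), (5, 'Tirto Kusumo');
    INSERT INTO transaksi VALUES
        (1, 5, '09/29/2018'), (2, 3, '09/29/2018'), (3, 3, '09/29/2018'),
        (4, 1, '09/29/2018'), (5, 1, '09/29/2018');
    -- Hypothetical detail rows: the question does not show this table.
    INSERT INTO transaksi_detail VALUES
        (1, 1000, 2), (2, 500, 1), (3, 250, 4), (4, 100, 3), (5, 700, 1);
""")

joins = """
    FROM transaksi t
    INNER JOIN transaksi_detail td ON td.transaksi_id = t.transaksi_id
    INNER JOIN operator o ON o.operator_id = t.operator_id
"""

# Grouping by date and name merges all transactions of an operator on that day.
per_operator = conn.execute(
    "SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) AS total"
    + joins +
    "GROUP BY t.tanggal_transaksi, o.nama_lengkap ORDER BY o.nama_lengkap"
).fetchall()

# Adding transaksi_id to SELECT and GROUP BY keeps one row per transaction.
per_transaction = conn.execute(
    "SELECT t.transaksi_id, t.tanggal_transaksi, o.nama_lengkap,"
    " SUM(td.harga * td.qty) AS total"
    + joins +
    "GROUP BY t.transaksi_id, t.tanggal_transaksi, o.nama_lengkap ORDER BY t.transaksi_id"
).fetchall()

print(len(per_operator), len(per_transaction))
```

The first query yields three rows (one per operator and date), the second yields five (one per transaction), with the operator name repeated where an operator handled several transactions.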

Access Queries comparing two tables

I have two tables in Access, Table A (MasterLockInsNew) and Table B (InitialPolData):
Table MasterLockInsNew:
+----+-------+----------+
| ID | Value | Date |
+----+-------+----------+
| 1 | 123 | 12/02/13 |
| 2 | 1231 | 11/02/13 |
| 4 | 1265 | 16/02/13 |
+----+-------+----------+
Table InitialPolData:
+----+-------+----------+------+
| ID | Value | Date     | Type |
+----+-------+----------+------+
| 1  | 123   | 12/02/13 | x    |
| 2  | 1231  | 11/02/13 | x    |
| 3  | 1238  | 10/02/13 | y    |
| 4  | 1265  | 16/02/13 | a    |
| 7  | 7649  | 18/02/13 | z    |
+----+-------+----------+------+
All I want are the rows from table B for IDs not contained in A. My current code looks like this:
SELECT Distinct InitialPolData.*
FROM InitialPolData
WHERE InitialPolData.ID NOT IN (SELECT Distinct InitialPolData.ID
from InitialPolData INNER JOIN
MasterLockInsNew
ON InitialPolData.ID=MasterLockInsNew.ID);
But whenever I run this in Access it crashes!! The tables are fairly large but I don't think this is the reason.
Can anyone help?
Thanks

Or try a left outer join:
SELECT b.*
FROM InitialPolData b left outer join
MasterLockInsNew a on
b.id = a.id
where
a.id is null

A simple subquery will do:
select * from InitialPolData
where id not in (
select id from MasterLockInsNew
);

Try using NOT EXISTS:
SELECT Distinct i.*
FROM InitialPolData AS i
WHERE NOT EXISTS (SELECT 1
FROM MasterLockInsNew AS m
WHERE m.ID = i.ID)
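
The three variants suggested in the answers (LEFT JOIN ... IS NULL, NOT IN, NOT EXISTS) can be compared side by side on the sample tables. This is a sketch on Python's built-in sqlite3, not on Access, so it says nothing about the crash or about Access performance; the column types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MasterLockInsNew (ID INTEGER PRIMARY KEY, "Value" INTEGER, "Date" TEXT);
    CREATE TABLE InitialPolData (ID INTEGER PRIMARY KEY, "Value" INTEGER, "Date" TEXT,
                                 "Type" TEXT);
    INSERT INTO MasterLockInsNew VALUES
        (1, 123, '12/02/13'), (2, 1231, '11/02/13'), (4, 1265, '16/02/13');
    INSERT INTO InitialPolData VALUES
        (1, 123, '12/02/13', 'x'), (2, 1231, '11/02/13', 'x'), (3, 1238, '10/02/13', 'y'),
        (4, 1265, '16/02/13', 'a'), (7, 7649, '18/02/13', 'z');
""")

anti_join_queries = {
    "left_join": """
        SELECT b.ID
        FROM InitialPolData b
        LEFT OUTER JOIN MasterLockInsNew a ON b.ID = a.ID
        WHERE a.ID IS NULL
        ORDER BY b.ID""",
    "not_in": """
        SELECT ID
        FROM InitialPolData
        WHERE ID NOT IN (SELECT ID FROM MasterLockInsNew)
        ORDER BY ID""",
    "not_exists": """
        SELECT i.ID
        FROM InitialPolData AS i
        WHERE NOT EXISTS (SELECT 1 FROM MasterLockInsNew AS m WHERE m.ID = i.ID)
        ORDER BY i.ID""",
}

# Run each variant and collect the IDs it returns.
missing_ids = {
    label: [row[0] for row in conn.execute(sql)]
    for label, sql in anti_join_queries.items()
}
for label, ids in missing_ids.items():
    print(label, ids)
```

All three return IDs 3 and 7 here. One difference to keep in mind: if the subquery of NOT IN can return a NULL, NOT IN yields no rows at all, whereas the LEFT JOIN and NOT EXISTS forms are not affected.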
