PostgreSQL: Remove rows from a table using id's - sql

I have a bill table with id as the pk and a billno column which I should remove duplicates from
total rows (62924)
select count(billno) from bill
unique billno (59704), so need to remove 3220 rows
select count(distinct billno) from bill
query to get the duplicates (3220)
select count(*) from bill
WHERE bill.billno IN (SELECT billno
FROM bill
GROUP BY billno HAVING COUNT(*) > 1)
AND bill.company_code like '1'
However when I remove duplicates by id, the total does not tally :-
count after remove duplicated rows (61385) => SHOULD GET 59704 here..
select count (*) from bill
where bill.id not in
(
select id from bill
WHERE bill.billno IN (SELECT billno
FROM bill
GROUP BY billno HAVING COUNT(*) > 1)
AND bill.company_code like '1'
)
Can I know why this is happening?

You seem to be removing all duplicate rows. If you want a result set with no duplicates, use distinct on:
select distinct on (billno) b.*
from bill b
order by billno, id desc;
This returns the row for each bill that has the highest id.
I'm not sure why your query filters on the company. The question mentions nothing about that filtering.
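If the goal is to actually delete the extra rows rather than just select around them, one possible sketch is below. It keeps the row with the highest id for each billno and deletes the rest. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, a made-up five-row bill table, and a hypothetical helper name dedupe_bills; none of these come from the question. The DELETE statement itself is plain SQL and should also be accepted by PostgreSQL.

```python
import sqlite3


def dedupe_bills(conn):
    """Delete every bill row except the one with the highest id per billno."""
    conn.execute(
        """
        DELETE FROM bill
        WHERE id NOT IN (SELECT MAX(id) FROM bill GROUP BY billno)
        """
    )
    conn.commit()


# Assumed sample data: billno 'A' and 'C' are duplicated, 'B' is not.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bill (id INTEGER PRIMARY KEY, billno TEXT, company_code TEXT)"
)
conn.executemany(
    "INSERT INTO bill VALUES (?, ?, ?)",
    [(1, "A", "1"), (2, "A", "1"), (3, "B", "1"), (4, "C", "1"), (5, "C", "1")],
)

dedupe_bills(conn)

remaining = conn.execute("SELECT id, billno FROM bill ORDER BY id").fetchall()
total, distinct = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT billno) FROM bill"
).fetchone()
print(remaining, total, distinct)
```

After the delete, the total row count and the distinct billno count should match, which is the tally the question expected.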

Related

sql deleting duplicate row

I have a table in SQL
Id owner_id amount
1 100 1000
2 101 2000
3 100 3000
4 104 800
5 100 1200
I want only one row per owner_id. I don't want 100 to appear multiple times, but I want the amounts for owner_id 100 added together (i.e. 1000 + 3000 + 1200) when the duplicate owner_id rows are removed. How can I do this?
One more issue: owner_id comes from another table. How do I add a join to get the owner's name from that table?
SELECT
owner_id,
SUM(amount) total_amount
FROM
table
GROUP BY
owner_id
Try this:
-- Accumulate the amounts per owner first, so the cleanup does not lose data
UPDATE t SET amount = x.sumAmount
FROM table t
JOIN (SELECT owner_id, SUM(amount) sumAmount
FROM table
GROUP BY owner_id) x ON x.owner_id = t.owner_id;
-- Delete duplicated data, keeping the first row per owner
WITH CTE AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY owner_id, amount ORDER BY Id) row
FROM table)
DELETE FROM CTE WHERE row <> 1;
Please use below query,
select owner_id, sum(amount) from table_name group by owner_id;
It works with the SUM function.
SUM adds up the values of the column you pass to it, here amount.
Be aware that when you use an aggregate function such as SUM, every other selected column must either be inside an aggregate function or be listed in GROUP BY.
SELECT
owner_id,
SUM(amount) total_amount
FROM
table
GROUP BY
owner_id
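The second part of the question, getting the owner's name from another table, can be handled by joining before grouping. The sketch below is only an illustration: the table names payments and owners, the name column, and the owner names are assumptions, since the question does not give them. It runs against an in-memory SQLite database through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: the question only shows Id, owner_id and amount.
    CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, owner_id INTEGER, amount INTEGER);
    INSERT INTO owners VALUES (100, 'Owner A'), (101, 'Owner B'), (104, 'Owner C');
    INSERT INTO payments VALUES
        (1, 100, 1000), (2, 101, 2000), (3, 100, 3000), (4, 104, 800), (5, 100, 1200);
    """
)

# One row per owner: join to pick up the name, then sum the amounts.
totals = conn.execute(
    """
    SELECT p.owner_id, o.name, SUM(p.amount) AS total_amount
    FROM payments AS p
    JOIN owners AS o ON o.owner_id = p.owner_id
    GROUP BY p.owner_id, o.name
    ORDER BY p.owner_id
    """
).fetchall()
print(totals)
```

The name column is listed in GROUP BY alongside owner_id so that every selected column is either grouped or aggregated.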

how can I select rows that column does NOT have more than 1 value?

I am very new to SQL and I am wondering how to solve this issue. For example, my table looks as follows:
As you can see in the table, item_id 1 appears in both city_id 1 and 2, and so does item_id 4, but I want to get all the items that appear in only one city_id.
In this example, these would be item_id 2 (appearing only in city_id 2) and item_id 3 (appearing in city_id 1).
Use aggregation on item_id and count distinct values of city_id. The having clause can be used to filter on aggregates.
select item_id from mytable group by item_id having count(distinct city_id) = 1
You can use the following query:
SELECT item_id
FROM table_name
GROUP BY item_id
HAVING COUNT(DISTINCT city_id) = 1
In case you want to see the city_id too, you can use this query:
SELECT item_id, MIN(city_id) AS city_id
FROM example
GROUP BY item_id
HAVING COUNT(DISTINCT city_id) = 1
Since there is only one city_id you can use MIN or MAX to get the id.
You want all the id where they have only one distinct city:
SELECT item_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
It works by counting all the different values that city_id has for the same item_id. For those item ids where they repeat a lot, but the city_id is always the same the count of unique values in the city id is 1, and we can look for these using a HAVING clause. "Having" is like a where clause that runs after a GROUP BY operation is completed. It is the conceptual equivalent of this:
SELECT item_id
FROM
(
SELECT item_id, count(distinct city_id) as cdci
FROM table
GROUP BY item_id
) x
WHERE cdci = 1
If you want the city id too you can either get the MAX city (because in this case there is only one city so it's safe to do):
SELECT item_id, MAX(city_id) as city_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
or you could join this query back to the item table as a subquery:
SELECT t.*
FROM
(
SELECT item_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
) x
INNER JOIN
table t
ON x.item_id = t.item_id
This technique is the more general process for performing a group by that finds some particular set of rows, then bringing in the rest of the data from those rows. You can't always stick every other column you want in a MAX because it will mix row data up, and you can't put the extra columns in your group by because that will subdivide what you're grouping on, giving the wrong results. Doing the group as a subquery and joining it back is a typical way to get all the row data when you have to group it to find which rows are interesting.
In your case this form of query will bring back all the duplicated rows (whereas the group by/max won't). If you don't want the duplicate rows you can make the top line SELECT DISTINCT t.* but don't make a habit of slapping distinct in to get rid of duplicated rows; if your tables don't have duplicates to start with but suddenly after you wrote a JOIN you got duplicated rows, google for what a Cartesian product is in database queries and how to prevent it.
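To make the difference between the two forms concrete, here is a small sketch run against an in-memory SQLite database via Python's sqlite3 module. The table name item_city is an assumption, and the rows follow the question's description (items 1 and 4 in two cities, item 2 only in city 2, item 3 only in city 1), with one extra duplicate row for item 2 added purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_city (item_id INTEGER, city_id INTEGER)")
conn.executemany(
    "INSERT INTO item_city VALUES (?, ?)",
    # (2, 2) appears twice on purpose: an assumed duplicate row.
    [(1, 1), (1, 2), (2, 2), (2, 2), (3, 1), (4, 1), (4, 2)],
)

# Form 1: group and take MIN(city_id); one row per qualifying item.
grouped = conn.execute(
    """
    SELECT item_id, MIN(city_id) AS city_id
    FROM item_city
    GROUP BY item_id
    HAVING COUNT(DISTINCT city_id) = 1
    ORDER BY item_id
    """
).fetchall()

# Form 2: group in a subquery and join back; every original row survives.
joined = conn.execute(
    """
    SELECT t.item_id, t.city_id
    FROM (
        SELECT item_id
        FROM item_city
        GROUP BY item_id
        HAVING COUNT(DISTINCT city_id) = 1
    ) AS x
    INNER JOIN item_city AS t ON x.item_id = t.item_id
    ORDER BY t.item_id, t.city_id
    """
).fetchall()

print(grouped)
print(joined)
```

The grouped form collapses the repeated item 2 row into one, while the join-back form returns it twice, which is the behaviour described above.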
You just need a group by on item_id with a having clause:
Select item_id from table
group by item_id
having count(distinct city_id) = 1
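Also, if you want one output row per matching input row, compute window aggregates in a subquery and filter outside it, since a window alias cannot be referenced in the WHERE clause of the same query:
Select item_id, city_id
From (Select item_id, city_id,
min(city_id) over (partition by item_id) mn,
max(city_id) over (partition by item_id) mx
From table) t
Where mn = mx;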

How to select 1 record from same values in sql?

I have a customer table:
I have to display customer names to select for a sale. Names can be duplicated, and I have to select a record by customer name. How can I select one particular record?
I am using the following query:
select tot_amt
from customer
where cust_name = 'someone'
You should really use cust_id to pick the row, because each cust_id is unique. But if you want to show only one row even when the result has more than one, you can use this query:
select top 1 tot_amt from customer where cust_name='someone';

Show duplicate rows(all columns of that row) where all columns are duplicate except one column

In below table, I need to select duplicate records where all columns are duplicate except Customer Type and Price for a particular week.
For e.g
Week  Customer  Product  Customer Type  Price
1     Alex      Cycle    Consumer       100
1     Alex      Cycle    Reseller       101
2     John      Motor    Consumer       200
3     John      Motor    Consumer       200
3     John      Motor    Reseller       201
I am using the query below, but it doesn't show me both customer types; it just shows me the count for each combination.
select Week, Customer, product, count(distinct Customer Type)
from table
group by Week, Customer, product
having count(distinct Customer Type) > 1
I would like to see below result, that shows me duplicate values and not just the count(*) of duplicate row. I am trying to see customers assigned to multiple customer types in a particular week for a product and at the same time show me all columns. It doesn't matter if the price is different.
Week  Customer  Product  Customer Type  Price
1     Alex      Cycle    Consumer       100
1     Alex      Cycle    Reseller       101
3     John      Motor    Consumer       200
3     John      Motor    Reseller       201
Thanks
Shaki
WITH CustomerDistribution_CTE (WeekC ,CustomerC, ProductC)
AS
(
select Week, Customer, product
from Your_Table_Name group by Week, Customer,
product having count(distinct CustomerType) > 1
)
SELECT Y.*
FROM CustomerDistribution_CTE C
inner join Your_Table_Name Y on C.WeekC =Y.Week
and C.CustomerC =Y.Customer and C.productC =Y.product
Note: Please replace "Your_Table_Name" with the exact table name and try it.
One way to achieve this, using generic SQL, is to use a "derived table" like this:
select x.*
from tablex x
inner join (
select Week, Customer, Product
from tablex
group by Week, Customer, Product
having count(*) > 1
) d on x.Week = d.Week and x.Customer = d.Customer and x.Product = d.Product
You can do that by using DISTINCT, like this:
select DISTINCT Customer,Product,Customer_Type,Price from Your_Table_Name
This will look for DISTINCT combinations.
Note: This query is for SQL Server.
From the expected result that you have pasted, it looks like you are not concerned about the week.
If you have an ID (incremental PK), it would be much simpler, like below:
select * from table where ID not in
(select max(ID) from table group by Customer, Product, CustomerType having count(*) > 1 )
This is tested on MySQL. Do you have an ID column?
In case you don't have an ID column, try the below:
select max(week) week, Customer, Product, CustomerType, max(price) from device group by Customer, Product, CustomerType;
I have not verified this one.
This will return your expected result set:
select *
from table
-- Teradata syntax to filter the result of an OLAP-function
-- (similar to HAVING after GROUP BY)
qualify
count(*)
over (partition by Week, Customer, product) > 1
For other DBMSes you will need to nest your query:
select *
from
(
select ...,
count(*)
over (partition by Week, Customer, product) as cnt
from table
) as dt
where cnt > 1
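As a runnable illustration of the nested form, the sketch below loads the five sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. The table name sales and the snake_case column names are assumptions, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales (
        week INTEGER, customer TEXT, product TEXT,
        customer_type TEXT, price INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alex", "Cycle", "Consumer", 100),
        (1, "Alex", "Cycle", "Reseller", 101),
        (2, "John", "Motor", "Consumer", 200),
        (3, "John", "Motor", "Consumer", 200),
        (3, "John", "Motor", "Reseller", 201),
    ],
)

# Count rows per (week, customer, product) in the inner query,
# then filter on that count in the outer query.
rows = conn.execute(
    """
    SELECT week, customer, product, customer_type, price
    FROM (
        SELECT s.*,
               COUNT(*) OVER (PARTITION BY week, customer, product) AS cnt
        FROM sales AS s
    ) AS dt
    WHERE cnt > 1
    ORDER BY week, customer_type
    """
).fetchall()
for row in rows:
    print(row)
```

Only the week 2 row is dropped, because it is the sole row in its partition.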
Edit:
After re-reading your description, the SELECT above might not be exactly what you want, because it will also return groups whose rows all have the same type. In that case switch to:
select *
from table
-- Teradata syntax to filter the result of an OLAP-function
-- (similar to HAVING after GROUP BY)
qualify -- at least two different types:
min(Customer_Type) over (partition by Week, Customer, product)
<> max(Customer_Type) over (partition by Week, Customer, product)

SQL - fetch records based on a condition of two columns

I need to fetch records where
Order No = ABC
has more than one tracking number in a large table
Can someone help with that?
I'll assume that you're fetching records from Orders and the "large table" is TrackingNumbers. You can group by the OrderNo in a sub-query and refine the sub-query by using a having clause. Then the sub-query will return only OrderNos that are in the table more than once. For example:
select OrderNo
from Orders
where OrderNo in (select OrderNo
from TrackingNumbers
group by OrderNo
having count(*) > 1)
To identify duplicates in a single table (as mentioned in the comments):
select TrackingNumber, count(*)
from Orders
group by TrackingNumber
having count(*) > 1
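The question leaves the schema open, so the following is only a sketch under assumed names: a single table tracking with columns order_no and tracking_number, run in an in-memory SQLite database via Python's sqlite3 module. It contrasts count(*) with count(distinct ...), which differ when the same tracking number is stored twice for one order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracking (order_no TEXT, tracking_number TEXT)")
conn.executemany(
    "INSERT INTO tracking VALUES (?, ?)",
    [
        ("ABC", "T1"),
        ("ABC", "T2"),  # ABC has two different tracking numbers
        ("DEF", "T3"),
        ("GHI", "T4"),
        ("GHI", "T4"),  # GHI repeats the same tracking number
    ],
)

# Orders with more than one row, regardless of whether the numbers differ.
any_multi = conn.execute(
    """
    SELECT order_no, COUNT(*)
    FROM tracking
    GROUP BY order_no
    HAVING COUNT(*) > 1
    ORDER BY order_no
    """
).fetchall()

# Orders with more than one distinct tracking number.
distinct_multi = conn.execute(
    """
    SELECT order_no, COUNT(DISTINCT tracking_number)
    FROM tracking
    GROUP BY order_no
    HAVING COUNT(DISTINCT tracking_number) > 1
    ORDER BY order_no
    """
).fetchall()

print(any_multi)
print(distinct_multi)
```

To restrict the check to a single order such as ABC, a WHERE order_no = 'ABC' clause can be added before the GROUP BY.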