Duplicate entries with different timestamp

I have a customer table in SQL named Customer_SCD.
It has three columns: Customer_Name, Customer_ID, Customer_TimeStamp.
The table contains duplicate entries that differ only in their timestamp.
For example:
ABC, 1, 2012-12-05 11:58:20.370
ABC, 1, 2012-12-03 12:11:09.840
I want to remove these duplicates from the database and keep only the row with the earliest date/time.
Thanks.

This works, try it:
DELETE Customer_SCD
OUTPUT deleted.*
FROM Customer_SCD b
JOIN (
    SELECT MIN(a.Customer_TimeStamp) AS Customer_TimeStamp,
           a.Customer_ID,
           a.Customer_Name
    FROM Customer_SCD a
    GROUP BY a.Customer_ID, a.Customer_Name
) c ON c.Customer_ID = b.Customer_ID
   AND c.Customer_Name = b.Customer_Name
   AND c.Customer_TimeStamp <> b.Customer_TimeStamp
The subquery finds the earliest timestamp for every (Customer_Name, Customer_ID) pair, and the join then deletes every other row for that pair. I also added the OUTPUT clause, which returns the rows affected by the statement.
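To try the keep-the-earliest logic without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite has no DELETE ... JOIN and no OUTPUT clause, so the sketch swaps in a correlated subquery that expresses the same rule. The XYZ row and the helper name dedupe_keep_earliest are made up for illustration.

```python
import sqlite3


def dedupe_keep_earliest(conn):
    """Delete every row that is not the earliest for its (ID, name) pair.

    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM Customer_SCD
        WHERE Customer_TimeStamp <> (
            SELECT MIN(a.Customer_TimeStamp)
            FROM Customer_SCD AS a
            WHERE a.Customer_ID = Customer_SCD.Customer_ID
              AND a.Customer_Name = Customer_SCD.Customer_Name
        )
        """
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
# Timestamps are stored as ISO-style text, so string order equals time order.
conn.execute(
    "CREATE TABLE Customer_SCD ("
    "Customer_Name TEXT, Customer_ID INTEGER, Customer_TimeStamp TEXT)"
)
conn.executemany(
    "INSERT INTO Customer_SCD VALUES (?, ?, ?)",
    [
        ("ABC", 1, "2012-12-05 11:58:20.370"),
        ("ABC", 1, "2012-12-03 12:11:09.840"),
        ("XYZ", 2, "2012-12-04 09:00:00.000"),  # hypothetical non-duplicate row
    ],
)

deleted = dedupe_keep_earliest(conn)
remaining = conn.execute(
    "SELECT Customer_Name, Customer_ID, Customer_TimeStamp "
    "FROM Customer_SCD ORDER BY Customer_ID"
).fetchall()
print(deleted, remaining)
```

Only the later ABC row is removed; the row without duplicates is untouched.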
You could also do it with the ranking function ROW_NUMBER:
DELETE Customer_SCD
OUTPUT deleted.*
FROM Customer_SCD b
JOIN (
    SELECT Customer_ID,
           Customer_Name,
           Customer_TimeStamp,
           ROW_NUMBER() OVER (PARTITION BY Customer_ID, Customer_Name
                              ORDER BY Customer_TimeStamp) AS num
    FROM Customer_SCD
) c ON c.Customer_ID = b.Customer_ID
   AND c.Customer_Name = b.Customer_Name
   AND c.Customer_TimeStamp = b.Customer_TimeStamp
   AND c.num <> 1
Compare the query cost of the two and use the cheaper one. When I checked, the first approach was more efficient (it had a better execution plan).
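The ROW_NUMBER idea can be sketched the same way in Python with sqlite3. This assumes an SQLite build with window functions (3.25 or newer), and it deletes by SQLite's implicit rowid instead of joining back on the timestamp, which is a substitution made for this sketch and not part of the T-SQL above. The DEF rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_SCD ("
    "Customer_Name TEXT, Customer_ID INTEGER, Customer_TimeStamp TEXT)"
)
conn.executemany(
    "INSERT INTO Customer_SCD VALUES (?, ?, ?)",
    [
        ("ABC", 1, "2012-12-05 11:58:20.370"),
        ("ABC", 1, "2012-12-03 12:11:09.840"),
        # Hypothetical pair of duplicates with identical timestamps.
        ("DEF", 3, "2012-12-01 08:00:00.000"),
        ("DEF", 3, "2012-12-01 08:00:00.000"),
    ],
)

# Number the rows of each (ID, name) group from oldest to newest,
# then delete everything except row number 1 of each group.
conn.execute(
    """
    DELETE FROM Customer_SCD
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY Customer_ID, Customer_Name
                       ORDER BY Customer_TimeStamp
                   ) AS num
            FROM Customer_SCD
        )
        WHERE num > 1
    )
    """
)

remaining = conn.execute(
    "SELECT Customer_Name, Customer_ID, Customer_TimeStamp "
    "FROM Customer_SCD ORDER BY Customer_ID"
).fetchall()
print(remaining)
```

Deleting by row identity also leaves exactly one row when two duplicates share the same timestamp, a case that a comparison against MIN(Customer_TimeStamp) cannot separate.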

The following query returns the rows you want to keep:
SELECT Customer_Name, Customer_ID, MIN(Customer_TimeStamp) AS Customer_TimeStamp
INTO #correctTbl
FROM Customer_SCD
GROUP BY Customer_Name, Customer_ID
The INTO clause stores the result in a temporary table, here #correctTbl. Then join against that table and delete the duplicates:
DELETE a
FROM Customer_SCD a
INNER JOIN #correctTbl b
    ON a.Customer_Name = b.Customer_Name
   AND a.Customer_ID = b.Customer_ID
   AND a.Customer_TimeStamp != b.Customer_TimeStamp

Related

Returning complex query on update sql

After updating a row, I want to return the result of a query that has multiple joins and a WITH clause.
For example, my query is:
WITH orders AS (
    SELECT product_id, SUM(amount) AS orders
    FROM orders_summary
    GROUP BY product_id
)
SELECT p.id, p.name,
       p.date_of_creation,
       o.orders, s.id AS store_id,
       s.name AS store_name
FROM products AS p
LEFT JOIN orders AS o
    ON p.id = o.product_id
LEFT JOIN stores AS s
    ON s.id = p.store_id
WHERE p.id = '1'

id | name   | date       | orders | store_id | store_name
1  | pen    | 11/16/2022 | 10     | 1        | jj
2  | pencil | 11/10/2022 | 30     | 2        | ff
I want my update to return that same query, but with the updated result:
UPDATE products
SET name = 'ABC'
WHERE id = '1'
RETURNING up_query
Desired result on update:

id | name | date       | orders | store_id | store_name
1  | ABC  | 11/16/2022 | 10     | 1        | jj

You can try UPDATE products ... RETURNING *. That returns the content of the row you just updated.
As for UPDATE ... RETURNING someQuery, You Can't Do That™. RETURNING takes a list of output expressions over the updated row, not an arbitrary query with joins, so the UPDATE and the SELECT remain two separate steps. (If this is PostgreSQL, you can at least chain them in one statement by putting the UPDATE ... RETURNING inside a WITH clause and selecting from its output.)
If you must be sure your SELECT sees precisely the data you just UPDATEd, wrap the two queries in a BEGIN; / COMMIT; transaction. The updated row stays locked until COMMIT, so concurrent users can't change it between your UPDATE and your SELECT.
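As an illustration of the update-then-select pattern inside one transaction, here is a small sketch in Python with sqlite3. The sample rows mirror the tables in the question; the individual orders_summary amounts, the constant REPORT_SQL, and the function rename_and_report are assumptions made up for the example. SQLite stands in for whatever database is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stores (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY, name TEXT,
        date_of_creation TEXT, store_id INTEGER
    );
    CREATE TABLE orders_summary (product_id INTEGER, amount INTEGER);
    INSERT INTO stores VALUES (1, 'jj'), (2, 'ff');
    INSERT INTO products VALUES
        (1, 'pen', '11/16/2022', 1), (2, 'pencil', '11/10/2022', 2);
    INSERT INTO orders_summary VALUES (1, 4), (1, 6), (2, 30);
    """
)

# The reporting query from the question, parameterised on the product id.
REPORT_SQL = """
WITH orders AS (
    SELECT product_id, SUM(amount) AS orders
    FROM orders_summary
    GROUP BY product_id
)
SELECT p.id, p.name, p.date_of_creation, o.orders,
       s.id AS store_id, s.name AS store_name
FROM products AS p
LEFT JOIN orders AS o ON p.id = o.product_id
LEFT JOIN stores AS s ON s.id = p.store_id
WHERE p.id = ?
"""


def rename_and_report(conn, product_id, new_name):
    """Run the UPDATE and the reporting SELECT in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE products SET name = ? WHERE id = ?",
            (new_name, product_id),
        )
        return conn.execute(REPORT_SQL, (product_id,)).fetchone()


row = rename_and_report(conn, 1, "ABC")
print(row)
```

The returned row has the shape of the desired result in the question, with the new name already applied.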

SQL joined tables are causing duplicates

Table A is an overall table of policy_id information, while table B holds the policy_ids that have claims attached. Not all of the IDs in A exist in B, but I want to join the two tables and sum the total claims per policy.
The issue is that the sum is way higher than the actual sum within the table itself.
Here is what I've tried so far:
select a.policy_id, coalesce(sum(b.claim_amt), 0)
from database.table1 as a
left join database2.table2 as b on a.policy_id = b.policy_id
where product_code = 'CI'
group by a.policy_id
The IDs that don't exist in B show up just fine with a 0 next to them; it's the ones that do exist whose claim_amt values seem to be heavily duplicated in the sum.

I suspect the policy_id values in table1 are not unique, which leads to the doubled, tripled, etc. amounts: each claim row is counted once per matching table1 row.
You could aggregate the sums from table2 in a CTE to get around this:
WITH CTE AS (
    SELECT
        policy_id,
        SUM(claim_amt) AS sum_amt
    FROM database2.table2
    GROUP BY policy_id
)
SELECT a.policy_id, COALESCE(b.sum_amt, 0) AS sum_amt
FROM database.table1 AS a
LEFT JOIN CTE AS b ON a.policy_id = b.policy_id
WHERE product_code = 'CI'
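The effect is easy to reproduce. The sketch below, in Python with sqlite3, uses invented data in which policy 1 appears twice in table1, then compares the original query with the pre-aggregated version. The unqualified table names and the DISTINCT (used here to collapse the repeated policy rows) are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (policy_id INTEGER, product_code TEXT);
    CREATE TABLE table2 (policy_id INTEGER, claim_amt INTEGER);
    -- Policy 1 is listed twice in table1; policy 2 has no claims.
    INSERT INTO table1 VALUES (1, 'CI'), (1, 'CI'), (2, 'CI');
    INSERT INTO table2 VALUES (1, 100), (1, 50);
    """
)

# Join first, aggregate second: every claim is counted once per
# matching table1 row, so policy 1 sums to 300 instead of 150.
naive = conn.execute(
    """
    SELECT a.policy_id, COALESCE(SUM(b.claim_amt), 0)
    FROM table1 AS a
    LEFT JOIN table2 AS b ON a.policy_id = b.policy_id
    WHERE a.product_code = 'CI'
    GROUP BY a.policy_id
    ORDER BY a.policy_id
    """
).fetchall()

# Aggregate first, join second: one pre-summed row per policy.
fixed = conn.execute(
    """
    WITH claims AS (
        SELECT policy_id, SUM(claim_amt) AS sum_amt
        FROM table2
        GROUP BY policy_id
    )
    SELECT DISTINCT a.policy_id, COALESCE(c.sum_amt, 0)
    FROM table1 AS a
    LEFT JOIN claims AS c ON a.policy_id = c.policy_id
    WHERE a.product_code = 'CI'
    ORDER BY a.policy_id
    """
).fetchall()

print(naive)
print(fixed)
```

If the two results differ on real data, the join key is not unique on the table1 side.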

SQL LEFT JOIN - Inner select not returning columns

I have two tables called 'Customers' and 'Orders'. Their column names are as follows:
Customers: id, name, address
Orders: id, person_id, product, price
The desired outcome is to query all customers along with one of their latest purchases. Because of a bug, the 'Orders' table contains many duplicates: pairs of records with the same timestamp.
I have written the following code, but the query does not return the column values from the second table (Orders). Can anyone advise what the issue is?
SELECT C.Id,C.Name, O.item, O.price, O.product
FROM Customers C
LEFT JOIN
(
SELECT TOP 1 person_id
FROM Orders
WHERE status = 'Pending'
) O ON C.ID = O.person_id
Results: the O.item, O.price and O.product values are all NULL.
Edit: Sample Data
Customers:
ID/ NAME/ ADDRESS
1/ A/ Ad1
2/ B/ Ad2
3/ C/ Ad3
Orders:
ID/ Person ID/ PRODUCT/ PRICE/ Created Date
ID-1234/ 1/ Book/ $5/ 26-2-2017
ID-1235/ 1/ Book/ $5/ 26-2-2017
ID-1236/ 2/ Calendar/ $10/ 4-2-2017
ID-1238/ 1/ Pen/ $2/ 1-1-2016

Assuming that the id column in Orders is the primary key and that its numeric part auto-increments, the following should work:
SELECT c.id,
       c.name,
       COALESCE(t1.price, 0.0) AS price,
       COALESCE(t1.product, 'NA') AS product
FROM Customers c
LEFT JOIN
(
    SELECT person_id, MAX(CAST(SUBSTRING(id, 4, LEN(id)) AS INT)) AS max_id
    FROM Orders
    GROUP BY person_id
) t2
    ON c.id = t2.person_id
LEFT JOIN Orders t1
    ON t1.person_id = t2.person_id AND
       CAST(SUBSTRING(t1.id, 4, LEN(t1.id)) AS INT) = t2.max_id
This answer assumes that the greatest order ID per customer identifies the most recent purchase. Ideally you should have a timestamp column that captures when each transaction took place. Note that even with the query above, we still have no way of knowing when the most recent transaction happened.
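Here is a rough Python/sqlite3 sketch of the greatest-ID-per-customer idea, loaded with the sample rows from the question. SQLite's SUBSTR and CAST ... AS INTEGER stand in for the T-SQL SUBSTRING/LEN/CAST calls, the prices are stored without the dollar sign, and the derived-table alias latest is chosen for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE Orders (
        id TEXT PRIMARY KEY, person_id INTEGER, product TEXT, price REAL
    );
    INSERT INTO Customers VALUES (1, 'A', 'Ad1'), (2, 'B', 'Ad2'), (3, 'C', 'Ad3');
    INSERT INTO Orders VALUES
        ('ID-1234', 1, 'Book', 5),
        ('ID-1235', 1, 'Book', 5),
        ('ID-1236', 2, 'Calendar', 10),
        ('ID-1238', 1, 'Pen', 2);
    """
)

# Step 1 (derived table "latest"): greatest numeric ID suffix per customer.
# Step 2: join back to Orders to fetch that single order's columns.
# LEFT JOINs keep customers that have no orders at all.
rows = conn.execute(
    """
    SELECT c.id,
           c.name,
           COALESCE(o.product, 'NA') AS product,
           COALESCE(o.price, 0.0) AS price
    FROM Customers AS c
    LEFT JOIN (
        SELECT person_id,
               MAX(CAST(SUBSTR(id, 4) AS INTEGER)) AS max_id
        FROM Orders
        GROUP BY person_id
    ) AS latest ON latest.person_id = c.id
    LEFT JOIN Orders AS o
        ON o.person_id = latest.person_id
       AND CAST(SUBSTR(o.id, 4) AS INTEGER) = latest.max_id
    ORDER BY c.id
    """
).fetchall()

print(rows)
```

With the question's sample data this picks ID-1238 (Pen) for customer 1 even though its created date is the oldest, which is exactly why relying on IDs instead of a timestamp is only an assumption.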

So where is the timestamp column? It's not mentioned in your table schema. Then again, your description does not mention the status column either, and that one is clearly used in the query.
Is Orders.id unique? Is it the key of the Orders table? If it is, then your schema has no way to identify "duplicate" records. You surely don't mean that only one order per customer is allowed, so if a single customer has multiple orders, how do we identify the duplicates? By the unmentioned timestamp column?
If there IS a timestamp column, and that's how you would identify dupes, then use it:
SELECT C.Id, C.Name, O.item, O.price, O.product
FROM Customers C
LEFT JOIN Orders O
    ON O.id = (SELECT MIN(id)
               FROM Orders
               WHERE person_id = C.Id
                 AND timestamp = O.timestamp
                 AND status = 'Pending')

MS ACCESS Query with sub queries

I need your help creating two queries for an MS Access database that I can run from VB6. Here is the schema of the tables (Order, AMC, Customer):
Table 1: Order
Order_ID
Order_Date
Customer_ID
Table 2: AMC
AMC_ID
Order_ID
Next_Renew_ID
Table 3: Customer
Customer_ID
Customer_Name
Now I want to run two selections against the Order table that do the following:
Query 1
Part 1: Select all Order_ID values from AMC where Next_Renew_ID='N/A'.
Part 2: Select all records from Order whose Order_ID is not in the result of Part 1.
Query 2
Part 1: Select all Order_ID values from AMC where Next_Renew_ID='N/A'.
Part 2: Select all Customer_ID values from Customer where Customer_Name Like 'Krish%'.
Part 3: Select all records from Order whose Order_ID is not in the result of Part 1 and whose Customer_ID is in the result of Part 2.
I know this can easily be done with joins or something like that, but I'm really no good at SQL. Please help me.

Query 1 -- this uses a LEFT JOIN with a NULL check. The same can be achieved with NOT IN (or possibly NOT EXISTS, assuming Access supports it). Order is a reserved word, so the table name goes in square brackets:
SELECT O.*
FROM [Order] AS O
LEFT JOIN AMC AS A
    ON (O.Order_ID = A.Order_ID AND A.Next_Renew_ID = 'N/A')
WHERE A.Order_ID IS NULL
Query 2 -- the same query as above, plus an INNER JOIN to the Customer table to keep only orders whose customer matches the name filter. Access wants multiple joins nested in parentheses:
SELECT O.*
FROM ([Order] AS O
INNER JOIN Customer AS C ON O.Customer_ID = C.Customer_ID)
LEFT JOIN AMC AS A
    ON (O.Order_ID = A.Order_ID AND A.Next_Renew_ID = 'N/A')
WHERE A.Order_ID IS NULL
  AND C.Customer_Name LIKE 'Krish%'
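The LEFT JOIN / IS NULL pattern is not specific to Access. As a quick check of the logic, the sketch below runs both queries in Python with sqlite3 on invented rows and compares Query 1 with the NOT IN formulation. SQLite quotes the reserved table name with double quotes where Access uses brackets; all sample data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "Order" (
        Order_ID INTEGER PRIMARY KEY, Order_Date TEXT, Customer_ID INTEGER
    );
    CREATE TABLE AMC (
        AMC_ID INTEGER PRIMARY KEY, Order_ID INTEGER, Next_Renew_ID TEXT
    );
    CREATE TABLE Customer (Customer_ID INTEGER PRIMARY KEY, Customer_Name TEXT);
    INSERT INTO Customer VALUES (1, 'Krishna'), (2, 'Maya');
    INSERT INTO "Order" VALUES
        (10, '2013-01-01', 1), (11, '2013-01-02', 1),
        (12, '2013-01-03', 2), (13, '2013-01-04', 2);
    INSERT INTO AMC VALUES (100, 10, 'N/A'), (101, 11, 'R-7'), (102, 12, 'N/A');
    """
)


def order_ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# Query 1 as an anti-join: keep orders with no matching 'N/A' AMC row.
query1_join = order_ids(
    """
    SELECT O.Order_ID
    FROM "Order" AS O
    LEFT JOIN AMC AS A
        ON O.Order_ID = A.Order_ID AND A.Next_Renew_ID = 'N/A'
    WHERE A.Order_ID IS NULL
    ORDER BY O.Order_ID
    """
)

# Query 1 written with NOT IN instead.
query1_not_in = order_ids(
    """
    SELECT Order_ID
    FROM "Order"
    WHERE Order_ID NOT IN (
        SELECT Order_ID FROM AMC WHERE Next_Renew_ID = 'N/A'
    )
    ORDER BY Order_ID
    """
)

# Query 2: the same anti-join, restricted to customers named 'Krish...'.
query2 = order_ids(
    """
    SELECT O.Order_ID
    FROM "Order" AS O
    INNER JOIN Customer AS C ON O.Customer_ID = C.Customer_ID
    LEFT JOIN AMC AS A
        ON O.Order_ID = A.Order_ID AND A.Next_Renew_ID = 'N/A'
    WHERE A.Order_ID IS NULL
      AND C.Customer_Name LIKE 'Krish%'
    ORDER BY O.Order_ID
    """
)

print(query1_join, query1_not_in, query2)
```

Order 13 has no AMC row at all and order 11 has one that is not 'N/A', so both survive Query 1; only order 11 belongs to a matching customer.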

My English is not very good, so I'm not sure I understood the question correctly.
Query 1
Part 2
select * from [Order] where Order_ID not in (select a.Order_ID from [Order] a left join AMC b on a.Order_ID=b.Order_ID where b.Next_Renew_ID='N/A')
Query 2
Part 2
select Customer_ID from Customer where Customer_Name Like 'Krish%'
Part 3
select * from [Order] where Order_ID not in (select Distinct Order_ID from AMC where Next_Renew_ID='N/A') and Customer_ID in (select Customer_ID from Customer where Customer_Name like 'Krish%')

SQL query for join with condition

I have these two tables:
Customers: Id, Name
Orders: Id, CustomerId, Time, Status
I want to get a list of customers for which the LAST order does not have a status of 'Wrong'.
I know how to use a LEFT JOIN to get a count of orders for each customer, but I don't know how to apply it to what I want. Maybe a JOIN isn't even the right tool here; I'm not sure.
Customers may have no orders at all, and those customers should be returned too.
I'm abstracting the real tables here; the actual scenario is a Windows Phone app sending notifications. I want to get all clients whose last notification does not have a 'Dropped' status. I can sort their notifications (orders) by the 'Time' field. Thanks for the help, while I continue experimenting with subqueries in the WHERE clause.

Select ...
From Customers As C
Where Not Exists (
    Select 1
    From Orders As O1
    Join (
        Select O2.CustomerId, Max( O2.Time ) As Time
        From Orders As O2
        Group By O2.CustomerId
    ) As LastOrderTime
        On LastOrderTime.CustomerId = O1.CustomerId
        And LastOrderTime.Time = O1.Time
    Where O1.Status = 'Dropped'
    And O1.CustomerId = C.Id
)
There are obviously alternatives, depending on the actual database product and version. In SQL Server, for example, one could use the TOP clause or perhaps a CTE. However, without knowing which product is being used, the solution above should produce the results you want in almost any database product.
Addition
If you are using a product that supports ranking functions and common table expressions (the database product and version aren't mentioned), an alternative solution might look like this:
With RankedOrders As
(
    Select O.CustomerId, O.Status
        , Row_Number() Over( Partition By O.CustomerId Order By O.Time Desc ) As Rnk
    From Orders As O
)
Select ...
From Customers As C
Where Not Exists (
    Select 1
    From RankedOrders As O1
    Where O1.CustomerId = C.Id
    And O1.Rnk = 1
    And O1.Status = 'Dropped'
)
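To check that the NOT EXISTS approach also returns customers without any orders, here is a small Python/sqlite3 sketch of the first (portable) query. The customer names, order rows, and the 'Sent' status are invented; the elided select list is filled with C.Id and C.Name purely for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (
        Id INTEGER PRIMARY KEY, CustomerId INTEGER, Time TEXT, Status TEXT
    );
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Orders VALUES
        (1, 1, '2024-01-01', 'Dropped'),  -- Ann: older order was dropped
        (2, 1, '2024-01-02', 'Sent'),     -- Ann: last order is fine
        (3, 2, '2024-01-01', 'Sent'),
        (4, 2, '2024-01-03', 'Dropped');  -- Bob: last order was dropped
    -- Cy has no orders at all.
    """
)

# Keep a customer unless their most recent order has status 'Dropped'.
kept = conn.execute(
    """
    SELECT C.Id, C.Name
    FROM Customers AS C
    WHERE NOT EXISTS (
        SELECT 1
        FROM Orders AS O1
        JOIN (
            SELECT O2.CustomerId, MAX(O2.Time) AS Time
            FROM Orders AS O2
            GROUP BY O2.CustomerId
        ) AS LastOrderTime
            ON LastOrderTime.CustomerId = O1.CustomerId
           AND LastOrderTime.Time = O1.Time
        WHERE O1.Status = 'Dropped'
          AND O1.CustomerId = C.Id
    )
    ORDER BY C.Id
    """
).fetchall()

print(kept)
```

Ann is kept because only an earlier order was dropped, Bob is excluded, and Cy is kept because the subquery finds no rows for a customer without orders.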

Assuming "last order" is determined by the Time column, here is my query:
SELECT C.Id,
       C.Name,
       MAX(O.Time)
FROM Customers C
INNER JOIN Orders O
    ON C.Id = O.CustomerId
WHERE O.Status != 'Wrong'
GROUP BY C.Id,
         C.Name
EDIT:
Regarding your table configuration: you should really consider revising the structure to include a third table. The tables would look like this:
Customer
CustomerId | Name
Order
OrderId | Status | Time
CompletedOrders
CoId | CustomerId | OrderId
You store the information about a customer or an order in its own table. When an order is made, you create a CompletedOrders entry holding the IDs of the two individual records. This allows a one-to-many relationship between customers and orders.

I haven't tested it, but something like this?
SELECT c.Id, c.Name, MAX(o.Time)
FROM Customers c
LEFT JOIN Orders o ON o.CustomerId = c.Id
WHERE o.Status <> 'Wrong'
GROUP BY c.Id, c.Name

You can get the list of customers whose LAST order has a status of 'Wrong' with something like:
select o.customerId
from orders o
where o.status = 'Wrong'
  and o.time = (select max(o2.time) from orders o2 where o2.customerId = o.customerId)
