delete data based on sum of values - sql

Hi, I would like to ask whether it is possible to delete data based on a sum in BigQuery.
Here is the problem: I would like to delete only the rows whose sum is over 100. I tried:
DELETE FROM (SELECT user, sum(paid) as money FROM test) where money > 100
but it didn't work. Then I tried:
with table2 as (SELECT a.*, sum(paid) as money from `test` a)
DELETE from table2 where table2.money > 100
That didn't work either. Here is the sample data:
id login paid
1 john 99
1 john 2
2 josh 50
3 mark 800
After the delete, only one row (josh) should remain.

Try the SQL below to delete the rows from test whose id has sum(paid) > 100.
Once it has executed, run a SELECT on the test table to check the result.
delete from `test`
where id in (select distinct id
             from `test`
             group by id
             having sum(paid) > 100)
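If you want to check the logic locally before touching the real table, here is a minimal sketch in Python using the standard-library sqlite3 module rather than BigQuery. The table name test and its columns follow the question; the in-memory database and the helper name delete_over_limit are assumptions made purely for illustration, and BigQuery's DML rules may differ in detail.

```python
import sqlite3


def delete_over_limit(limit=100):
    """Delete every row whose id has a total `paid` above `limit`; return what is left."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database, not BigQuery
    conn.execute("CREATE TABLE test (id INTEGER, login TEXT, paid INTEGER)")
    conn.executemany(
        "INSERT INTO test VALUES (?, ?, ?)",
        [(1, "john", 99), (1, "john", 2), (2, "josh", 50), (3, "mark", 800)],
    )
    # Same shape as the answer: find the ids whose summed paid exceeds the limit,
    # then delete all rows belonging to those ids.
    conn.execute(
        """
        DELETE FROM test
        WHERE id IN (SELECT id FROM test GROUP BY id HAVING SUM(paid) > ?)
        """,
        (limit,),
    )
    remaining = conn.execute(
        "SELECT id, login, paid FROM test ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining


print(delete_over_limit())
```

With the sample data, john (99 + 2 = 101) and mark (800) are over the limit, so only josh's row is left.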

Related

Removing records based on two columns

What I'm trying to do is taking these records that looks like this:
Name Enrollment_Month Premium
John 20201201 $76.00
John 20201201 $54.00
Tony 20201201 $20
and change it to look like this:
Name Enrollment_Month Premium
Tony 20201201 $20
Basically, I'm trying to remove both records when the name and enrollment month are the same.
Any thoughts? I would really appreciate it.
You can do:
delete from t
where exists (select 1
from t t2
where t2.name = t.name and
t2.enrollment_month = t.enrollment_month and
t2.rowid <> t.rowid
);
Note: This deletes every row in any group of 2 or more matching rows. If you specifically want only pairs deleted:
delete from t
where 2 = (select count(*)
from t t2
where t2.name = t.name and
t2.enrollment_month = t.enrollment_month
);
One option is to use a HAVING clause to find the duplicates after grouping by those columns (enrollment_month, name), such as:
DELETE t
WHERE (enrollment_month, name) IN
(SELECT enrollment_month, name
FROM t
GROUP BY enrollment_month, name
HAVING COUNT(*) > 1)
You can use an analytic function to identify duplicate records and delete them by rowid, as follows:
DELETE FROM YOUR_TABLE T
WHERE T.ROWID IN (
SELECT CASE
WHEN COUNT(1) OVER (PARTITION BY ENROLLMENT_MONTH, NAME) > 1
THEN ROWID
END AS ROWIDS
FROM YOUR_TABLE)
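To try the GROUP BY / HAVING variant without an Oracle instance, a rough sketch in Python with the standard-library sqlite3 module might look like the following. It assumes SQLite 3.15 or later (needed for the row-value IN comparison); the helper name remove_duplicated_groups and the SAMPLE list are made up for the illustration, and premiums are stored as plain numbers instead of "$76.00" strings.

```python
import sqlite3

# Sample rows from the question (premium as a number; an assumption for the demo).
SAMPLE = [
    ("John", "20201201", 76.0),
    ("John", "20201201", 54.0),
    ("Tony", "20201201", 20.0),
]


def remove_duplicated_groups(rows):
    """Delete every row whose (enrollment_month, name) pair occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (name TEXT, enrollment_month TEXT, premium REAL)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    # The subquery lists the duplicated pairs; the outer DELETE removes
    # all rows in those groups, not just the extra copies.
    conn.execute(
        """
        DELETE FROM t
        WHERE (enrollment_month, name) IN (
            SELECT enrollment_month, name
            FROM t
            GROUP BY enrollment_month, name
            HAVING COUNT(*) > 1
        )
        """
    )
    remaining = conn.execute(
        "SELECT name, enrollment_month, premium FROM t ORDER BY name, premium"
    ).fetchall()
    conn.close()
    return remaining


print(remove_duplicated_groups(SAMPLE))
```

Both John rows share the same name and month, so they are removed together and only Tony's row survives.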

Optimize SQL query: how to check if an Id is assigned to more than other Id (it should not)

I have a simple table like
client_id , company_id
100 1
101 1
102 1
200 2
200 2
201 2
For each client_id I should have just one company_id. I need to make a query to check this.
What I was doing is:
SELECT
client_id,
count(DISTINCT company_id) as count
FROM table GROUP BY
client_id
HAVING count > 1;
If it's not empty, it should trigger an alert.
However, I was wondering if I could optimize it a little, because I don't really need to get every row, I just need to know if this query results in AT LEAST one row.
Is it possible?
EXISTS might be faster:
select distinct client_id
from t
where exists (select 1
from t t2
where t2.client_id = t.client_id and t2.company_id <> t.company_id
);
I would expect a performance improvement under two conditions:
There is an index on (client_id, company_id).
Most clients have only one company.
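Since the alert only needs a yes/no answer, one way to sketch the idea is to wrap the self-join in SELECT EXISTS (...), which returns a single 0/1 value. The snippet below is an illustration in Python with the standard-library sqlite3 module, not a benchmark; the table name t follows the answer, while the helper name has_conflicting_client and the in-memory database are assumptions.

```python
import sqlite3


def has_conflicting_client(rows):
    """Return True if any client_id is linked to more than one company_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (client_id INTEGER, company_id INTEGER)")
    # Index suggested in the answer; it lets the self-join probe by client_id.
    conn.execute("CREATE INDEX idx_client_company ON t (client_id, company_id)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    (flag,) = conn.execute(
        """
        SELECT EXISTS (
            SELECT 1
            FROM t AS a
            JOIN t AS b
              ON b.client_id = a.client_id
             AND b.company_id <> a.company_id
        )
        """
    ).fetchone()
    conn.close()
    return bool(flag)


clean = [(100, 1), (101, 1), (102, 1), (200, 2), (200, 2), (201, 2)]
print(has_conflicting_client(clean))               # the question's data: no conflict
print(has_conflicting_client(clean + [(200, 3)]))  # client 200 now has two companies
```

The duplicated (200, 2) row in the question's data is not a conflict, because both copies point at the same company.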

SQL aggregate rows with same id , specific value in secondary column

I'm looking to filter out rows in the database (PostgreSQL) if a certain value occurs in the status column. The idea is to sum the amount column only for references whose rows all have status equal to 1. The query should not SELECT the reference at all if it also has a status of 2, or any other status for that matter. status refers to the state of the transaction.
Current data table:
reference | amount | status
1 100 1
2 120 1
2 -120 2
3 200 1
3 -200 2
4 450 1
Result:
amount | status
550 1
I've simplified the data example but I think it gives a good idea of what I'm looking for.
I'm unsuccessful in selecting only references that only have status 1.
I've tried sub-queries, using the HAVING clause and other methods without success.
Thanks
Here's a way using not exists: sum all rows where the status is 1 and no other row with the same reference has a status other than 1.
select sum(amount) from mytable t1
where status = 1
and not exists (
select 1 from mytable t2
where t2.reference = t1.reference
and t2.status <> 1
)
SELECT SUM(amount)
FROM table
WHERE reference NOT IN (
SELECT reference
FROM table
WHERE status<>1
)
The subquery SELECTs all references that must be excluded; the main query then sums everything except them.
select sum (amount) as amount
from (
select sum(amount) as amount
from t
group by reference
having not bool_or(status <> 1)
) s;
amount
--------
550
You could use window functions to count the occurrences of a status other than 1 within each group:
SELECT SUM(amount) AS amount
FROM (SELECT *,COUNT(*) FILTER(WHERE status<>1) OVER(PARTITION BY reference) cnt
FROM tc) AS sub
WHERE cnt = 0;
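To see the expected 550 come out of the sample data, here is a small sketch of the NOT EXISTS version, run through Python's standard-library sqlite3 module instead of PostgreSQL. The table name mytable comes from the first answer; the helper name sum_status_one_only and the ROWS list are assumptions for the demo.

```python
import sqlite3

# (reference, amount, status) rows from the question.
ROWS = [
    (1, 100, 1),
    (2, 120, 1),
    (2, -120, 2),
    (3, 200, 1),
    (3, -200, 2),
    (4, 450, 1),
]


def sum_status_one_only(rows):
    """Sum amount over references whose rows all have status = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (reference INTEGER, amount INTEGER, status INTEGER)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    (total,) = conn.execute(
        """
        SELECT SUM(amount)
        FROM mytable AS t1
        WHERE status = 1
          AND NOT EXISTS (
              SELECT 1
              FROM mytable AS t2
              WHERE t2.reference = t1.reference
                AND t2.status <> 1
          )
        """
    ).fetchone()
    conn.close()
    return total  # None when no reference qualifies (SUM over zero rows is NULL)


print(sum_status_one_only(ROWS))
```

References 2 and 3 each have a status 2 row, so only references 1 and 4 contribute: 100 + 450 = 550.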

How to sum amounts according their ID-s

I have a database with 2 tables in SQL Server 2008 Express.
My problem is the following: I would like to create a trigger to sum some values in first table and copy the sum into second one.
For example the first table (Head) has 5 columns :
ID Transaction Acount Date Total_sum
1 some text acount1 2014.04.15 300
2 some text acount2 2014.04.15 500
3 some text acount1 2014.04.15 200
And the second table Transaction:
HeadID Amount Remarks
1 100 test1
1 200 test2
2 500 test3
3 100 test3
3 100 test4
So, finally, I would like to sum the values in Transaction that share the same HeadID and copy the result into the Total_sum column of the Head table. Maybe I could first find the last ID in the Head table, then group the HeadIDs in the Transaction table and sum the values?
Please help me!
You can UPDATE the table using a JOIN:
UPDATE h
SET h.Total_sum = t.sumTotal
FROM Head h
INNER JOIN
(
SELECT HeadID,SUM(Amount) as sumTotal
FROM [Transaction]
GROUP BY HeadID
) t
ON h.ID=t.HeadID
Perhaps the following:
create trigger TransactionInsertUpdateDelete
on [Transaction] -- Transaction is a reserved word in T-SQL, so it needs brackets
for insert, update, delete
as
begin
update H
set Total_sum =
-- set the total sum to the sum of all records for this head id in transaction
(select sum(Amount)
from [Transaction] as T
where T.HeadID = H.ID -- the key column in Head is ID, not HeadID
)
from Head as H
join -- join on the inserted and deleted tables to find out which HeadIDs were affected.
(select HeadID
from inserted
union
select HeadID
from deleted
) as C
on C.HeadID = H.ID
end
go
Maybe try something like this:
UPDATE h
SET h.Total_sum = (SELECT SUM(t.Amount)
                   FROM [Transaction] t
                   WHERE t.HeadID = h.ID)
FROM Head h
You could achieve it without using a join:
UPDATE Head
SET Total_sum = (SELECT SUM(t2.Amount)
                 FROM [Transaction] t2
                 WHERE t2.HeadID = Head.ID)
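For a quick sanity check of the correlated-subquery update, here is a minimal sketch using Python's standard-library sqlite3 module rather than SQL Server 2008 Express. It keeps only the ID and Total_sum columns of Head (an assumption to keep the demo short), quotes "Transaction" because it is a keyword in SQLite too, and uses a made-up helper name, recompute_totals. It does not cover the trigger itself.

```python
import sqlite3


def recompute_totals():
    """Fill Head.Total_sum with the summed Amount of the matching Transaction rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Head (ID INTEGER PRIMARY KEY, Total_sum INTEGER)")
    conn.execute(
        'CREATE TABLE "Transaction" (HeadID INTEGER, Amount INTEGER, Remarks TEXT)'
    )
    conn.executemany("INSERT INTO Head (ID) VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        'INSERT INTO "Transaction" VALUES (?, ?, ?)',
        [
            (1, 100, "test1"),
            (1, 200, "test2"),
            (2, 500, "test3"),
            (3, 100, "test3"),
            (3, 100, "test4"),
        ],
    )
    # For each Head row, the subquery sums the amounts whose HeadID matches its ID.
    conn.execute(
        """
        UPDATE Head
        SET Total_sum = (
            SELECT SUM(t.Amount)
            FROM "Transaction" AS t
            WHERE t.HeadID = Head.ID
        )
        """
    )
    totals = conn.execute("SELECT ID, Total_sum FROM Head ORDER BY ID").fetchall()
    conn.close()
    return totals


print(recompute_totals())
```

The totals match the Total_sum values shown in the question: 300, 500 and 200 for IDs 1, 2 and 3.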

MS Access query table without primary key

Claim# Total ValuationDt
1 100 1/1/12
2 550 1/1/12
1 2000 3/1/12
2 100 4/1/12
1 2100 8/1/12
3 200 8/1/12
3 250 11/1/12
Using MS Access, I need a query that returns only claims which have been valuated greater than $500 at some point in that claim's life time. In this example, the query should return
Claim# Total ValuationDt
1 100 1/1/12
2 550 1/1/12
1 2000 3/1/12
2 100 4/1/12
1 2100 8/1/12
because claim# 1 was valuated greater than $500 on 3/1/12, claim# 2 was valuated greater than $500 on 1/1/12, and claim# 3 was never valuated greater than $500.
You can use IN:
SELECT *
FROM Table1
WHERE Claim IN (SELECT Claim
FROM Table1
WHERE Total > 500)
Try this:
Select * from table where claim in (Select claim from table where total > 500)
Here table is the name of your table.
This could be the solution:
SELECT distinct *
FROM YourTableName
WHERE [Claim#] IN (SELECT DISTINCT [Claim#]
FROM YourTableName
WHERE total > 500)
ORDER BY 3;
The ORDER BY is optional. Note that the # in the column name means it has to be written in square brackets.
This should work
Select DISTINCT Claim FROM yourtable Where Total > 500
EDIT:
In the case that my initial answer does not fulfill your requirements, then you can use a sub-query. A subquery is a query inside your query (nested queries). The reason we have to do it like that is because if you use something like
Select * FROM yourtable Where Total > 500
Then the result set would contain only those moments where the total of the claim was higher than 500, and it would not include the other moments where it was less than or equal to 500.
Therefore, as others have stated, you use a subquery like:
SELECT *
FROM Table1
WHERE Claim IN (SELECT Claim
FROM Table1
WHERE Total > 500)
Note: see that there is a query after the IN keyword, so we have nested queries (or subquery if you prefer).
Why does it work? Well, because:
SELECT Claim
FROM Table1
WHERE Total > 500
Will return every claim (only the number of the claim) in which the total was greater than 500 at some point. Therefore, this query will return 1 and 2. If you substitute that in the original query you get:
SELECT *
FROM Table1
WHERE Claim IN (1, 2)
Which will get you every column of every row with Claim numbers equal to either 1 or 2.
You can identify which [Claim#] values satisfy your condition ...
SELECT DISTINCT [Claim#]
FROM YourTable
WHERE [Total] > 500
If that was correct, use it as a subquery which you INNER JOIN to your table, to restrict the result set to only those claims.
SELECT y.[Claim#], y.[Total], y.[ValuationDt]
FROM YourTable AS y
INNER JOIN
(
SELECT DISTINCT [Claim#]
FROM YourTable
WHERE [Total] > 500
) AS sub
ON y.[Claim#] = sub.[Claim#];
Compare this approach vs. the IN() suggestions and see whether you notice any difference in execution speed.
You should be able to use
SELECT [Claim#],[Total],[ValuationDt]
FROM yourtable
WHERE [Claim#] IN (SELECT [Claim#]
FROM yourtable
WHERE Total > 500)
This should return every row, for any ValuationDt, of each claim that was valuated above 500 at some point.
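To confirm that the IN subquery returns the five expected rows, here is a small sketch that runs it with Python's standard-library sqlite3 module instead of MS Access. The table name claims, the simplified column names (claim, total, valuation_dt) and the helper name claims_ever_over are assumptions chosen to avoid quoting [Claim#]; the data is copied from the question.

```python
import sqlite3

# (claim, total, valuation_dt) rows from the question, in their original order.
CLAIM_ROWS = [
    (1, 100, "1/1/12"),
    (2, 550, "1/1/12"),
    (1, 2000, "3/1/12"),
    (2, 100, "4/1/12"),
    (1, 2100, "8/1/12"),
    (3, 200, "8/1/12"),
    (3, 250, "11/1/12"),
]


def claims_ever_over(threshold=500):
    """Return all rows of every claim whose total exceeded `threshold` at least once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE claims (claim INTEGER, total INTEGER, valuation_dt TEXT)"
    )
    conn.executemany("INSERT INTO claims VALUES (?, ?, ?)", CLAIM_ROWS)
    # The inner query finds the qualifying claim numbers; the outer query
    # returns every row for those claims, including the low valuations.
    result = conn.execute(
        """
        SELECT claim, total, valuation_dt
        FROM claims
        WHERE claim IN (SELECT claim FROM claims WHERE total > ?)
        ORDER BY rowid
        """,
        (threshold,),
    ).fetchall()
    conn.close()
    return result


for row in claims_ever_over():
    print(row)
```

Claims 1 and 2 qualify, so all five of their rows come back, while both rows of claim 3 are left out.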