I would like to see the most concise way to do what is outlined in this SO question: Sum values from multiple rows into one row
that is, combine multiple rows while summing a column.
But how do I then delete the duplicates? In other words, I have data like this:
Person Value
--------------
1 10
1 20
2 15
And I want to sum the values for any duplicates (on the Person col) into a single row and get rid of the other duplicates on the Person value. So my output would be:
Person Value
-------------
1 30
2 15
And I would like to do this without using a temp table. I think I'll need to use OVER (PARTITION BY ...), but I'm not sure. I'm just trying to challenge myself by not doing it the temp-table way. I'm working with SQL Server 2008 R2.
Simply put, give me a concise statement that gets from my input to my output in the same table. So if my table is named People, a select * from People before the operation returns the first set above, and a select * from People after the operation returns the second set.
Not sure why you want to avoid a temp table, but here's one way to do without it (though IMHO this is overkill):
UPDATE MyTable SET VALUE = (SELECT SUM(Value) FROM MyTable MT WHERE MT.Person = MyTable.Person);
WITH DUP_TABLE AS
(SELECT ROW_NUMBER()
OVER (PARTITION BY Person ORDER BY Person) As ROW_NO
FROM MyTable)
DELETE FROM DUP_TABLE WHERE ROW_NO > 1;
The first query sets Value on every row to the sum for that row's Person, so all rows of a duplicated person carry the same total. The second query then deletes all but one row per Person.
Demo: http://sqlfiddle.com/#!3/db7aa/11
All you're asking for is a simple SUM() aggregate function and a GROUP BY
SELECT Person, SUM(Value)
FROM myTable
GROUP BY Person
The SUM() by itself would sum up the values in a column, but when you add a secondary column and GROUP BY it, SQL will show distinct values from the secondary column and perform the aggregate function by those distinct categories.
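To see the grouping on the question's sample rows, here is a minimal sketch that runs the query in SQLite through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here. The final step, which swaps the table's contents inside one transaction so that the table itself ends up collapsed, is an assumption of this sketch and is not the UPDATE/DELETE approach from the other answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person INTEGER, Value INTEGER)")
conn.executemany("INSERT INTO People VALUES (?, ?)", [(1, 10), (1, 20), (2, 15)])

# One output row per Person, with that person's values added together.
totals = conn.execute(
    "SELECT Person, SUM(Value) FROM People GROUP BY Person ORDER BY Person"
).fetchall()
print(totals)  # [(1, 30), (2, 15)]

# Assumption of this sketch: replace the table's contents with the grouped
# rows inside a single transaction, so no temp table is needed.
with conn:
    conn.execute("DELETE FROM People")
    conn.executemany("INSERT INTO People VALUES (?, ?)", totals)

rows_after = conn.execute("SELECT * FROM People ORDER BY Person").fetchall()
print(rows_after)  # [(1, 30), (2, 15)]
```

The grouped rows are read into memory before the table is emptied, which is fine for small tables but may not suit large ones.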
Related
I have a table with 60 columns in it. I would like to identify how many duplicates there are in the table based on all the columns being identical.
I don't want to have to type out every field name in the SELECT or GROUP BY clauses. Is there a way to do that?
You can use an approach like this for each table:
SELECT
MD5(OBJECT_CONSTRUCT(SRC.*)::VARCHAR) DUP_MD5, SUM(1) AS TOTAL_COUNT
FROM <table> SRC
GROUP BY 1
HAVING SUM(1) > 1;
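The same idea, fingerprinting the whole row so that no column names have to be typed, can be sketched outside Snowflake as well. The example below is an illustration only: it uses SQLite through Python's sqlite3 module, a hypothetical three-column table named src, and a client-side MD5 of each row instead of OBJECT_CONSTRUCT.

```python
import hashlib
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
# Hypothetical table; a real one could have 60 columns and the code below
# would not change, because it never names them.
conn.execute("CREATE TABLE src (a INTEGER, b TEXT, c REAL)")
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?)",
    [(1, "x", 1.5), (1, "x", 1.5), (2, "y", 2.5), (1, "x", 1.5), (2, "z", 2.5)],
)

def row_fingerprint(row):
    """Hash the textual form of the full row tuple."""
    return hashlib.md5(repr(row).encode("utf-8")).hexdigest()

# Count how often each fingerprint occurs, then keep only the repeated ones.
counts = Counter(row_fingerprint(row) for row in conn.execute("SELECT * FROM src"))
duplicates = {fp: n for fp, n in counts.items() if n > 1}
print(list(duplicates.values()))  # [3]
```

Only the row (1, 'x', 1.5) occurs more than once, so a single fingerprint with a count of 3 is reported.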
I have a table in which I would like to update one column on every nth row that meets a condition.
My table has many columns, but the key is Object_Id (in case this is useful for creating a temp table).
The column I'm trying to update is online_status. It looks like the picture below, but on a bigger scale: I usually have 10 rows with the same time, all containing %Online%, and in total around 2000 rows with Online and about another 2000 with Offline. I just need to update every 2-4 rows of those 10 repeating ones.
Table picture here (for some reason the table formatting doesn't come out well): [image: Table]
What I tried is below. It pulls a list of every 3rd record that matches the Online criterion; I just need a way to update those rows, but I can't get there.
SELECT * FROM (SELECT *, row_number() over() rn FROM people
WHERE online_status LIKE '%Online%') foo WHERE online_status LIKE '%Online%' AND foo.rn % 3 =0
I also tried the following. However, it updated every single row, not just the ones I needed:
UPDATE people
SET online_status = 'Offline 00:00-24:00'
WHERE people.Object_id IN
(SELECT *
FROM
(SELECT people.Object_id, row_number() over() rn FROM people
WHERE online_status LIKE '%Online%') foo WHERE people LIKE '%Online%' AND foo.rn % 3 =0);
Is there a way to take the list from the SELECT above and simply update those rows? Or to run a few scripts: one that stores the object ids in something like a temp table, and a next one that updates the main table wherever the object id matches the temp table?
Thank you for any help :)
Select only Object_id, and no other columns, in the subquery at WHERE people.Object_id IN (..):
UPDATE people
SET online_status = 'Offline 00:00-24:00'
WHERE Object_id IN
( SELECT Object_id
FROM
( SELECT p.Object_id, row_number() over() rn
FROM people p
WHERE p.online_status LIKE '%Online%') foo
WHERE foo.rn % 3 = 0
);
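The statement above can be tried on a small table. This is a sketch under assumptions: it runs in SQLite (3.25 or later, for window functions) through Python's sqlite3 module, the 18 sample rows are invented, and an ORDER BY Object_id is added inside OVER () so that "every third row" is deterministic. With an empty OVER () the numbering order is up to the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Object_id INTEGER PRIMARY KEY, online_status TEXT)")
# Invented data: odd ids are Online (9 rows), even ids are Offline.
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(i, "Online 08:00-17:00" if i % 2 else "Offline 00:00-24:00") for i in range(1, 19)],
)

conn.execute("""
    UPDATE people
    SET online_status = 'Offline 00:00-24:00'
    WHERE Object_id IN (
        SELECT Object_id
        FROM (
            SELECT p.Object_id,
                   ROW_NUMBER() OVER (ORDER BY p.Object_id) AS rn
            FROM people AS p
            WHERE p.online_status LIKE '%Online%'
        ) AS foo
        WHERE foo.rn % 3 = 0
    )
""")

still_online = [
    row[0]
    for row in conn.execute(
        "SELECT Object_id FROM people WHERE online_status LIKE '%Online%' ORDER BY Object_id"
    )
]
print(still_online)  # [1, 3, 7, 9, 13, 15]
```

The Online rows numbered 3, 6 and 9 (ids 5, 11 and 17) are switched to Offline; the other six stay as they were.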
I have a table Trial_tb with columns p_id,t_number and rundate.
Sample values:
p_id|t_number|rundate
=====================
111 |333     |1/7/2016
111 |333     |1/1/2016
222 |888     |1/8/2016
222 |444     |1/2/2016
666 |888     |1/6/2016
555 |777     |1/5/2016
p_id and t_number are the key columns. I need to fetch values such that the result has no record whose p_id/t_number combination is duplicated. For example, 111|333 is duplicated and hence not valid. The query should fetch all records other than the first two.
I wrote the script below, but it fetches only the last record. :(
select rundate,p_id,t_number from
(
select rundate,p_id,t_number,
count(p_id) over (partition by p_id) PCnt,
count(t_number) over (partition by t_number) TCnt
from trialtb
)a
where a.PCnt=1 and a.TCnt=1
The having clause is ideal for this job. Having allows you to filter on aggregated records.
-- Finding unique combinations.
SELECT
p_id,
t_number
FROM
trialtb
GROUP BY
p_id,
t_number
HAVING
COUNT(*) = 1
;
This query returns combinations of p_id and t_number that occur only once.
If you want to include rundate you could add MAX(rundate) AS rundate to the select clause. Because you are only looking at unique occurrences the max or min would always be the same.
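As a quick check, the query can be run against the sample rows from the question. This is a minimal sketch using SQLite through Python's sqlite3 module as a stand-in for the asker's database; rundate is stored as plain text, which is an assumption of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trialtb (p_id INTEGER, t_number INTEGER, rundate TEXT)")
conn.executemany(
    "INSERT INTO trialtb VALUES (?, ?, ?)",
    [
        (111, 333, "1/7/2016"),
        (111, 333, "1/1/2016"),
        (222, 888, "1/8/2016"),
        (222, 444, "1/2/2016"),
        (666, 888, "1/6/2016"),
        (555, 777, "1/5/2016"),
    ],
)

# Keep only the (p_id, t_number) pairs that occur exactly once.
unique_rows = conn.execute("""
    SELECT p_id, t_number, MAX(rundate) AS rundate
    FROM trialtb
    GROUP BY p_id, t_number
    HAVING COUNT(*) = 1
    ORDER BY p_id, t_number
""").fetchall()

for row in unique_rows:
    print(row)
# (222, 444, '1/2/2016')
# (222, 888, '1/8/2016')
# (555, 777, '1/5/2016')
# (666, 888, '1/6/2016')
```

The two 111|333 rows drop out, and every group that remains has a single row, so MAX(rundate) simply returns that row's date.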
Do you mean:
select
p_id,t_number
from
trialtb
group by
p_id,t_number
having
count(*) = 1
or do you need the run date too?
select
p_id,t_number,max(rundate)
from
trialtb
group by
p_id,t_number
having
count(*) = 1
Since you are only looking at items with a single row, using max or min should work fine.
Sorry for the title, but I didn't know how to explain it.
I have a table that has 2 fields, A and B.
I want to find all rows in the table that have a duplicate A (more than one record), but A should be considered a duplicate only if B differs between the rows.
Example:
FIELD A Field B
10 10
10 10 // This is not duplicate
10 10
10 5 // this is a duplicate
How can I do this in a single query?
Let's break this down into how you would go about constructing such a query. You don't make it clear whether you're looking for all values of A or all rows but let's assume all values of A initially.
The first step therefore is to create a list of all values of A. This can be done two ways, DISTINCT or GROUP BY. I'm going to use GROUP BY because of what else you want to do:
select a
from your_table
group by a
This returns a single column that is unique on A. Now, how can you change this to give you the unique values? The most obvious thing to use is the HAVING clause, which allows you to restrict on aggregated values. For instance the following will give you all values of A which only appear once in the table
select a
from your_table
group by a
having count(*) = 1
That is the count of all values of A inside the group is 1. You don't want this of course, you want to do this with the column B. You need there to exist more than one value of B in order for the situation you want to identify to be possible (if there's only one value of B then it's impossible). This gets us to
select a
from your_table
group by a
having count(b) > 1
This still isn't enough, as you want two different values of B. The above just counts the rows in each group where column B is not NULL. Inside an aggregate function you can use the DISTINCT keyword to count only unique values, bringing us to:
select a
from your_table
group by a
having count(distinct b) > 1
To transcribe this into English: select all unique values of A from YOUR_TABLE that have more than one distinct value of B in the group.
You can use this method, or something similar, to build up your own queries as you create them. Determine what you want to achieve and slowly build up to it.
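The difference between the last two steps shows up clearly on a small data set. The sketch below uses SQLite through Python's sqlite3 module; the rows with a = 10 come from the question, while the rows with a = 20 and the helper values_of_a are added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    # a = 10 has two different b values; a = 20 repeats the same b.
    [(10, 10), (10, 10), (10, 10), (10, 5), (20, 7), (20, 7)],
)

def values_of_a(having_clause):
    """Return the values of a whose group satisfies the given HAVING clause."""
    sql = f"SELECT a FROM your_table GROUP BY a HAVING {having_clause} ORDER BY a"
    return [row[0] for row in conn.execute(sql)]

repeated_a = values_of_a("COUNT(b) > 1")
differing_b = values_of_a("COUNT(DISTINCT b) > 1")
print(repeated_a)   # [10, 20]
print(differing_b)  # [10]
```

COUNT(b) > 1 flags any value of a that appears on more than one row, whereas COUNT(DISTINCT b) > 1 flags only the value whose rows disagree on b.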
select FIELD from your_table group by FIELD having count(b) > 1
Take into consideration that this counts every occurrence of a duplicated value.
Example:
if you have the values
1
1
2
1
the count for value 1 is 3, not 2.
I have a complex query which may return more than one record per group. There is a field that holds a sequential number. If more than one record is returned in a group, I just want the record with the highest sequential number.
I’ve tried using the SQL MAX function, but if I try to add more than one field it returns all records, instead of the one with the highest sequential field in that group.
I am trying to accomplish this in MS Access.
Edit: 4/5/11
Trying to create a table as an example of what I am trying to do
I have the following table:
tblItemTrans
ItemID(PK)
Eventseq(PK)
ItemTypeID
UserID
Eventseq is a number field that increments for each ItemID. (Don’t ask me why, that’s how the table was created.) Each ItemID can have one or many Eventseq values. I only need the last record (max(Eventseq)) per ItemTypeID.
Hope this helps any.
SELECT A.*
FROM YourTable A
INNER JOIN (SELECT GroupColumn, MAX(SequentialColumn) AS MaxSeq
FROM YourTable
GROUP BY GroupColumn) B
ON A.GroupColumn = B.GroupColumn AND A.SequentialColumn = B.MaxSeq
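To illustrate the join, here is a minimal sketch that runs the same pattern in SQLite through Python's sqlite3 module rather than in MS Access. The table and column names are the placeholders from the query above; the Payload column and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (GroupColumn TEXT, SequentialColumn INTEGER, Payload TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        ("a", 1, "first a"),
        ("a", 2, "latest a"),
        ("b", 1, "first b"),
        ("b", 3, "latest b"),
        ("b", 2, "middle b"),
    ],
)

# The derived table B holds the highest sequence number per group; joining
# back on both columns keeps only the full row that carries that number.
latest = conn.execute("""
    SELECT A.GroupColumn, A.SequentialColumn, A.Payload
    FROM YourTable AS A
    INNER JOIN (SELECT GroupColumn, MAX(SequentialColumn) AS MaxSeq
                FROM YourTable
                GROUP BY GroupColumn) AS B
        ON A.GroupColumn = B.GroupColumn
       AND A.SequentialColumn = B.MaxSeq
    ORDER BY A.GroupColumn
""").fetchall()
print(latest)  # [('a', 2, 'latest a'), ('b', 3, 'latest b')]
```

If two rows in a group share the same highest sequence number, both are returned, so the pattern relies on the sequence being unique within each group.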
If your SequentialNumber is an ID (unique across the table), then you could use
select *
from tbl
where seqnum in (
select max(seqnum) from tbl
group by groupcolumn)
If it is not, an alternative to Lamak's query is the Access domain function DMAX
select *
from tbl
where seqnum = DMAX("seqnum", "tbl", "groupcolumn='" & groupcolumn & "'")
Note: if the groupcolumn is a date, use # instead of single quotes ' in the above, if it is a numeric, remove the single quotes.