SQL Query to group text based on numeric column - sql

I have a table 'TEST' as shown below
Number | Seq | Name
-------+-------+------
123 | 1 | Hello
123 | 2 | Hi
123 | 3 | Greetings
234 | 1 | Goodbye
234 | 2 | Bye
I want to write a query, to group the table by 'Number', and select the rows with the maximum sequence number (MAX(Seq)). The output of the query would be
Number | Seq | Name
-------+-------+------
123 | 3 | Greetings
234 | 2 | Bye
How do I go about this?
EDIT: TEST is actually a table that is the result from a long query (joining multiple tables) that I have already written. I already have a (SELECT ...) statement to get the values I need. Is there a way to remove duplicate rows (with the same 'Number' as shown above) and select only the one with maximum 'Seq' value.
I am on Microsoft SQL Server 2008 (SP2)
I was hoping there would be a way to achieve this by
SELECT * FROM (SELECT ...) TEST <condition to group>

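You can use a subquery in an IN clause. Note that a row-value IN like this is accepted by some databases (for example MySQL and PostgreSQL), but not by SQL Server:
select * from test
where (number, seq) in (select number, max(seq) from test group by number)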

Another option is to use a windowed ROW_NUMBER() function with a partition on the number:
With Cte As
(
Select *,
Row_Number() Over (Partition By Number Order By Seq Desc) RN
From TEST
)
Select Number, Seq, Name
From Cte
Where RN = 1

SELECT *
FROM (SELECT test.*, MAX (seq) OVER (PARTITION BY num) max_seq
FROM test) t
WHERE seq = max_seq
I changed the column name from Number to num because NUMBER is a reserved word in some databases (Oracle, for example). This is pretty much the same as the other answers, except that it explicitly gets the maximum sequence number for each num. The alias (t) on the derived table is required by SQL Server.

You want to use an analytic (window) function together with a filter to get only the rows of TEST that you want. A window function cannot be referenced in the WHERE clause of the same SELECT, so the rank is computed in a second CTE and filtered outside it.
WITH TEST AS (
...your really complex query that generates TEST...
),
Ranked AS (
SELECT
Number, Seq, Name,
RANK() OVER (PARTITION BY Number ORDER BY Seq DESC) AS aRank
FROM TEST
)
SELECT Number, Seq, Name
FROM Ranked
WHERE aRank = 1
;
This returns the Number, Seq, Name for each Number grouping where the Seq is maximum. Because RANK() assigns the same rank to ties, two rows sharing the maximum Seq for a Number would both be returned.

The solution to this is to do a self join on only the MAX(Seq) values.
This answer can be found at SQL Select only rows with Max Value on a Column
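The self join mentioned above is not spelled out in the answer, so here is a minimal sketch of what it could look like. It assumes SQLite (through Python's sqlite3 module) rather than SQL Server purely so it can be run anywhere, and it renames the Number column to num; the JOIN itself is plain SQL that should carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (num INTEGER, seq INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (123, 1, "Hello"),
        (123, 2, "Hi"),
        (123, 3, "Greetings"),
        (234, 1, "Goodbye"),
        (234, 2, "Bye"),
    ],
)

# Join every row to the per-num maximum seq; only rows that
# carry that maximum survive the join condition.
rows = conn.execute(
    """
    SELECT t.num, t.seq, t.name
    FROM test AS t
    JOIN (SELECT num, MAX(seq) AS max_seq
          FROM test
          GROUP BY num) AS m
      ON m.num = t.num AND m.max_seq = t.seq
    ORDER BY t.num
    """
).fetchall()

print(rows)
```

In the original setting, the table test would be replaced by the long query, either repeated as a derived table or defined once in a CTE.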

Related

Selecting pair(including reverse order) with highest date value

I have a messages table like this
Messages Table
I want to select each unique pair (including reversed order) with the highest date. The result of the SELECT statement would look like this:
from_id | to_id | date | message
1 | 2 | 13:06 | I'm Alp
2 | 3 | 13:06 | I'm Oliver
3 | 1 | 11:38 | From third to one
I tried to use distinct with max function but it didn't help.
You can use window functions:
select *
from (
select m.*,
row_number() over(partition by min(from_id, to_id), max(from_id, to_id) order by date desc) rn
from messages m
) m
where rn = 1
Note: counter-intuitively enough, SQLite's min() and max() functions, when given several arguments, are the equivalent of least() and greatest() in other databases.
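To see the note in action, here is a small sketch using Python's sqlite3 module. The original messages table was only shown as an image, so the two "older" rows below are hypothetical data added for illustration. It assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# With two or more arguments, SQLite's min()/max() are scalar
# functions that compare their arguments (like least/greatest).
scalar = conn.execute("SELECT min(3, 1), max(3, 1)").fetchone()

conn.execute(
    "CREATE TABLE messages (from_id INTEGER, to_id INTEGER, date TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        (1, 2, "13:06", "I'm Alp"),
        (2, 1, "12:00", "older reply"),    # hypothetical row
        (2, 3, "13:06", "I'm Oliver"),
        (3, 1, "11:38", "From third to one"),
        (1, 3, "10:00", "older message"),  # hypothetical row
    ],
)

# Normalise each pair to (smaller id, larger id) so that 1->3 and
# 3->1 land in the same partition, then keep the newest row of each.
latest = conn.execute(
    """
    SELECT from_id, to_id, date, message
    FROM (
        SELECT m.*,
               row_number() OVER (
                   PARTITION BY min(from_id, to_id), max(from_id, to_id)
                   ORDER BY date DESC
               ) AS rn
        FROM messages AS m
    ) AS ranked
    WHERE rn = 1
    ORDER BY from_id
    """
).fetchall()

print(scalar)
print(latest)
```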

PostgreSQL using sum in where clause

I have a table with a numeric column named 'capacity'. I want to select the first rows whose total capacity is no greater than X, something like this query:
select * from table where sum(capacity )<X
But I know I cannot use aggregate functions in the WHERE clause. So what other ways exist to solve this problem?
Here is some sample data
id| capacity
1 | 12
2 | 13.5
3 | 15
I want to list the rows, in order of id, whose running sum is less than 26, so a query like this
select * from table where sum(capacity )<26 order by id
and it must give me
id| capacity
1 | 12
2 | 13.5
because 12+13.5<26
A bit late to the party, but for future reference: if you need to filter on the total per id (rather than on a running total across rows, as in the OP's case), the following should work:
SELECT id, sum(capacity)
FROM table
GROUP BY id
HAVING sum(capacity) < 26
ORDER by id ASC;
Use the PostgreSQL docs for reference to aggregate functions: https://www.postgresql.org/docs/9.1/tutorial-agg.html
Use a HAVING clause. It has to come after GROUP BY and before ORDER BY:
select id, sum(capacity) from table group by id having sum(capacity)<X order by id
You can use the window variant of sum to produce a cumulative sum, and then use it in the where clause. Note that window functions can't be placed directly in the where clause, so you'd need a subquery:
SELECT id, capacity
FROM (SELECT id, capacity, SUM(capacity) OVER (ORDER BY id ASC) AS cum_sum
FROM mytable) t
WHERE cum_sum < 26
ORDER BY id ASC;
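As a quick check of the cumulative-sum approach against the sample data from the question, here is a sketch that runs the same query through Python's sqlite3 module. SQLite stands in for PostgreSQL here (an assumption made only so the example is self-contained), and the extra cum_sum column is selected just to make the running total visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, capacity REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 12), (2, 13.5), (3, 15)],
)

# The inner query attaches a running total to each row; the outer
# query can then filter on it like on any ordinary column.
rows = conn.execute(
    """
    SELECT id, capacity, cum_sum
    FROM (SELECT id, capacity,
                 SUM(capacity) OVER (ORDER BY id ASC) AS cum_sum
          FROM mytable) AS t
    WHERE cum_sum < 26
    ORDER BY id ASC
    """
).fetchall()

print(rows)
```

The third row is dropped because its running total (12 + 13.5 + 15) exceeds 26, even though its own capacity is below 26.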

How to find first duplicate row in a table sql server

I am working on SQL Server. I have a table that contains around 75000 records, several of which are duplicates. So I wrote a query to find out how many times each record is repeated:
SELECT [RETAILERNAME],COUNT([RETAILERNAME]) as Repeated FROM [Stores] GROUP BY [RETAILERNAME]
It gives me result like,
---------------------------
RETAILERNAME | Repeated
---------------------------
X | 4
---------------------------
Y | 6
---------------------------
Z | 10
---------------------------
Among the 4 records for X, I need to take only the first one.
So I want to retrieve all fields from the first row of each set of duplicates. That is, selecting all records where RETAILERNAME='X' gives several duplicate records, and I need only the first row of them.
Please guide me.
You could try using ROW_NUMBER.
Something like
;WITH Vals AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY [RETAILERNAME] ORDER BY [RETAILERNAME]) RowID
FROM [Stores]
)
SELECT *
FROM Vals
WHERE RowID = 1
You can then also remove the duplicates if need be (BUT BE CAREFUL THIS IS PERMANENT)
;WITH Vals AS (
SELECT [RETAILERNAME],
ROW_NUMBER() OVER(PARTITION BY [RETAILERNAME] ORDER BY [RETAILERNAME]) RowID
FROM Stores
)
DELETE
FROM Vals
WHERE RowID > 1;
You can write the query as follows:
SELECT TOP 1 * FROM [Stores] GROUP BY [RETAILERNAME]
HAVING your condition
WITH cte
AS (SELECT *,
Row_number()
OVER(
partition BY [retailername]
ORDER BY [retailername]) AS RowRank
FROM [Stores])
SELECT *
FROM cte
WHERE RowRank = 1

SQL Query to get all rows with duplicate values but are not part of the same group

The database schema is organized as follows:
ID | GroupID | VALUE
--------------------
1 | 1 | A
2 | 1 | A
3 | 2 | B
4 | 3 | B
In this example, I want to GET all Rows with duplicate VALUE, but are not part of the same group. So the desired result set should be IDs (3, 4), because they are not in the same group (2, 3) but still have the same VALUE (B).
I'm having trouble writing a SQL Query and would appreciate any guidance. Thanks.
So far, I'm using SQL Count, but can't figure out what to do with the GroupId.
SELECT *
FROM TABLE T
HAVING COUNT(T.VALUE) > 1
GROUP BY ID, GroupId, VALUE
The simplest method for this is using EXISTS:
SELECT
ID
FROM
MyTable T1
WHERE
EXISTS (SELECT 1
FROM MyTable
WHERE Value = t1.Value
AND GroupID <> t1.GroupID)
Here is one method. First you have to identify the values that appear in more than one group and then use that information to find the right rows in the original table:
select *
from t
where value in (SELECT value
FROM t
GROUP BY VALUE
HAVING COUNT(distinct groupid) > 1
)
order by value
Actually, I prefer a slight variant in this case, by changing the HAVING clause:
HAVING min(groupid) <> max(groupid)
This works when you are looking for more than one group and should be faster than the COUNT DISTINCT version.
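Both HAVING variants can be compared on the sample data from the question. The sketch below uses Python's sqlite3 module, with the table named t as in the answer; the helper function ids_for is a hypothetical name introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, groupid INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, "A"), (2, 1, "A"), (3, 2, "B"), (4, 3, "B")],
)


def ids_for(having_clause):
    """Return ids whose value appears in more than one group."""
    sql = f"""
        SELECT id
        FROM t
        WHERE value IN (SELECT value
                        FROM t
                        GROUP BY value
                        HAVING {having_clause})
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]


# Variant 1: count the distinct groups per value.
count_ids = ids_for("COUNT(DISTINCT groupid) > 1")
# Variant 2: a value spans several groups exactly when the
# smallest and largest groupid differ.
minmax_ids = ids_for("MIN(groupid) <> MAX(groupid)")

print(count_ids, minmax_ids)
```

Value A is excluded by both variants because both of its rows belong to group 1.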
SELECT ALL_.*
FROM (SELECT *
FROM TABLE_
GROUP BY ID, GROUPID, VALUE
ORDER BY ID) GROUPED,
TABLE_ ALL_
WHERE GROUPED.VALUE = ALL_.VALUE
AND GROUPED.GROUPID <> ALL_.GROUPID

How to write excluding sum query?

I have a values table:
+------------+---------+
| name | value |
+------------+---------+
| parameter1 | 53.8462 |
| parameter2 | 7.6923 |
| parameter3 | 23.0769 |
| parameter4 | 15.3846 |
+------------+---------+
What is the query to sum the values of the last three parameters (parameter2, parameter3, parameter4), excluding the first one (parameter1)?
SELECT SUM(value) tot
FROM table
WHERE name='parameter2' OR name='parameter3' OR name='parameter4'
or
SELECT SUM(value) tot
FROM table
WHERE name<>'parameter1'
This may be a bit simplistic, but can't you do this:
select sum(value) from table where name != 'parameter1'
If what you are really after is the sum past n-th value, you could do this (in SQL Server):
WITH OrderedRows AS
(
SELECT name, value,
ROW_NUMBER() OVER (ORDER BY name) AS 'RowNumber'
FROM table
)
SELECT sum(value)
FROM OrderedRows
WHERE RowNumber > 1;
If you only need this specific case:
SELECT SUM(value) tot
FROM table
WHERE name<>'parameter1'
but if you need a generic solution, do not use this.
With some null-checking, so the sum can still work:
SELECT SUM(coalesce(value, 0)) your_total
FROM table
WHERE coalesce(name, '') <> 'parameter1'
select sum(value) from values where name!='parameter1';
In place of != you could also use <>.
If your goal is to sum the last three rows, even on bigger tables than your example, you are looking for moving window functions.
In Oracle you can write
WITH T AS (
SELECT 'parameter1' PAR, 2 VAL FROM DUAL
UNION ALL
SELECT 'parameter2' PAR, 3 VAL FROM DUAL
UNION ALL
SELECT 'parameter3' PAR, 5 VAL FROM DUAL
UNION ALL
SELECT 'parameter4' PAR, 7 VAL FROM DUAL
)
SELECT PAR, SUM(VAL) OVER (ORDER BY PAR ROWS 2 PRECEDING) LAST3SUM FROM T;
This would yield
PAR LAST3SUM
---------- ----------
parameter1 2
parameter2 5
parameter3 10
parameter4 15
You should look at the Oracle documentation about analytic functions and keep the following in mind:
Note that the query uses SUM, but no GROUP BY. This is because we are not aggregating data, but calculating the SUM for each row we select.
Note that order is important. In my example I ORDER BY PAR, but you can just as well order by any other column available in your query.
Oracle Data Warehousing Guide also discusses windowing functions, giving a lot of useful examples.
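For readers without an Oracle instance, the moving-window sum above can be tried with Python's sqlite3 module. This is a sketch under the assumption that SQLite 3.25 or later is available; it uses a regular table instead of DUAL, and writes the frame in its long form (ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), which means the same as ROWS 2 PRECEDING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (par TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("parameter1", 2), ("parameter2", 3), ("parameter3", 5), ("parameter4", 7)],
)

# For each row, sum val over that row and the two rows before it
# (in par order). No GROUP BY: every input row yields an output row.
rows = conn.execute(
    """
    SELECT par,
           SUM(val) OVER (
               ORDER BY par
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS last3sum
    FROM t
    ORDER BY par
    """
).fetchall()

for par, last3sum in rows:
    print(par, last3sum)
```

The value on the last row is the sum of the final three rows, which for this data is the "everything except parameter1" total the question asks for.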