Select a different column from the row containing the max - sql

My query looks like this:
SELECT [ScriptName]
,[BranchName]
,AVG([XMLColumn].value('count(//data)', 'int'))
,MIN([XMLColumn].value('count(//data)', 'int'))
,MAX([XMLColumn].value('count(//data)', 'int'))
FROM temp
GROUP BY [BranchName], [ScriptName]
The table I'm querying is structured like this:
ScriptName | BranchName | XMLLog | Developer | Revision
Currently, my results query produces output like this:
ScriptName | BranchName | Average | Min | MAX
-------------------------------------------------
Script 1 | trunk | 80 | 11 | 120
Script 2 | branch1 | 15 | 11 | 21
I want to add two columns to my results table: the developer from the row containing the minimum value and the developer from the row containing the maximum value. This would result in the output from the query looking like this:
ScriptName | BranchName | Average | Min | MAX | DeveloperWhoCausedMinimum | DeveloperWhoCausedMaximum
-----------------------------------------------------------------------------------------------------------------
Script 1 | trunk | 80 | 11 | 120 | me | The Boss
Script 2 | branch1 | 15 | 11 | 21 | me | The Boss
I am not sure where to start on this. Thanks!

The following uses window functions to calculate the three values. It then selects the entire row with the max value:
select t.*, avgval, minval, maxval
from (select t.*,
avg([XMLColumn].value('count(//data)', 'int')) over
(partition by BranchName, ScriptName) as avgval,
min([XMLColumn].value('count(//data)', 'int')) over
(partition by BranchName, ScriptName) as minval,
max([XMLColumn].value('count(//data)', 'int')) over
(partition by BranchName, ScriptName) as maxval
from temp t
) t
where [XMLColumn].value('count(//data)', 'int') = maxval;
Note: if multiple rows have the maximum value, you will get multiple rows out. If you want only one, then use row_number() instead.
EDIT:
Oh, you changed the question to be one column from two different rows rather than two rows from one column.
Use the same idea but with aggregation:
select BranchName, ScriptName,
avg([XMLColumn].value('count(//data)', 'int')) as avgval,
min([XMLColumn].value('count(//data)', 'int')) as minval,
max([XMLColumn].value('count(//data)', 'int')) as maxval,
max(case when seqnum = 1 then Developer end) as minDeveloper,
max(case when seqnum = cnt then Developer end) as maxDeveloper
from (select t.*,
row_number() over (partition by BranchName, ScriptName
order by [XMLColumn].value('count(//data)', 'int')
) as seqnum,
count(*) over (partition by BranchName, ScriptName) as cnt
from temp t
) t
group by BranchName, ScriptName;
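To try the row_number() plus conditional aggregation idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). The table name temp_log, the plain integer column data_count (standing in for the XML count expression), and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample data; data_count stands in for
# [XMLColumn].value('count(//data)', 'int') from the question.
rows = [
    ("Script 1", "trunk", 11, "me"),
    ("Script 1", "trunk", 109, "someone else"),
    ("Script 1", "trunk", 120, "The Boss"),
    ("Script 2", "branch1", 11, "me"),
    ("Script 2", "branch1", 13, "someone else"),
    ("Script 2", "branch1", 21, "The Boss"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_log "
    "(ScriptName TEXT, BranchName TEXT, data_count INTEGER, Developer TEXT)"
)
conn.executemany("INSERT INTO temp_log VALUES (?, ?, ?, ?)", rows)

query = """
SELECT ScriptName, BranchName,
       AVG(data_count) AS avgval,
       MIN(data_count) AS minval,
       MAX(data_count) AS maxval,
       -- seqnum = 1 marks the row with the smallest value in the group
       MAX(CASE WHEN seqnum = 1 THEN Developer END) AS minDeveloper,
       -- seqnum = cnt marks the row with the largest value in the group
       MAX(CASE WHEN seqnum = cnt THEN Developer END) AS maxDeveloper
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY BranchName, ScriptName
                                ORDER BY data_count) AS seqnum,
             COUNT(*) OVER (PARTITION BY BranchName, ScriptName) AS cnt
      FROM temp_log t) t
GROUP BY BranchName, ScriptName
ORDER BY ScriptName
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If several rows tie for the minimum or maximum, row_number() picks one of them arbitrarily, so only one developer is reported per extreme.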

Related

Select all values (all rows) in one row Oracle

I get multiple rows after executing the select-query.
But I need to get all the values of these rows in one row.
̶C̶o̶u̶n̶t̶ ̶o̶f̶ ̶r̶o̶w̶s̶ ̶i̶s̶ ̶u̶n̶k̶n̶o̶w̶n̶ ̶(̶b̶e̶f̶o̶r̶e̶ ̶t̶h̶e̶ ̶̶̶s̶e̶l̶e̶c̶t̶̶̶-̶q̶u̶e̶r̶y̶ ̶i̶s̶ ̶e̶x̶e̶c̶u̶t̶e̶d̶)̶
For example:
|----------|-----------|
| **Name** | **Value** |
|----------|-----------|
| Alex | 150 |
|----------|-----------|
| Peter | 220 |
|----------|-----------|
| Katty | 34 |
|----------|-----------|
I want to get:
|-----------|-----------|-----------|-----------|-----------|-----------|
| **Col_1** | **Col_2** | **Col_3** | **Col_4** | **Col_5** | **Col_6** |
|-----------|-----------|-----------|-----------|-----------|-----------|
| Alex | 150 | Peter | 220 | Katty | 34 |
|-----------|-----------|-----------|-----------|-----------|-----------|
Oracle 11g.
UPDATE: I realized that with an unknown number of rows, the task is difficult, so I can assume that the number of rows will be known.
To pivot over a fixed number of columns, one option uses row_number() and conditional aggregation:
select
max(case when rn = 1 then name end) name1,
max(case when rn = 1 then value end) value1,
max(case when rn = 2 then name end) name2,
max(case when rn = 2 then value end) value2,
...
from (
select t.*, row_number() over(order by id) rn
from mytable t
) t
You need a column that defines the ordering of the rows in the original dataset (and of the columns in the resultset): I assumed id.
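A small runnable sketch of this pivot, using Python's sqlite3 module rather than Oracle (assuming SQLite 3.25 or later). The table mytable and its id column are assumptions, as in the answer, and the three rows come from the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is an assumed ordering column, as in the answer above.
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "Alex", 150), (2, "Peter", 220), (3, "Katty", 34)],
)

# One pair of CASE expressions per expected source row.
pivot_sql = """
SELECT MAX(CASE WHEN rn = 1 THEN name END)  AS name1,
       MAX(CASE WHEN rn = 1 THEN value END) AS value1,
       MAX(CASE WHEN rn = 2 THEN name END)  AS name2,
       MAX(CASE WHEN rn = 2 THEN value END) AS value2,
       MAX(CASE WHEN rn = 3 THEN name END)  AS name3,
       MAX(CASE WHEN rn = 3 THEN value END) AS value3
FROM (SELECT t.*, ROW_NUMBER() OVER (ORDER BY id) AS rn
      FROM mytable t) t
"""
pivoted = conn.execute(pivot_sql).fetchall()
print(pivoted)
```

The number of output columns is fixed when the query is written, which is why the row count has to be known in advance.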
You might be better off putting the values into a string or JSON column. For instance, you can aggregate the names and values into separate strings:
select listagg(name, ',') within group (order by name) as names,
listagg(value, ',') within group (order by name) as vals
from t;
Or into a single string:
select listagg(name || ':' || value, ',') within group (order by name) as name_values
from t;
Note: LISTAGG returns a VARCHAR2, which in Oracle 11g is limited to 4000 bytes. So this only works on a small amount of data.

How to keep the first row of a certain group based on some condition on Teradata SQL?

I have a table in Teradata that looks like this:
ID | Date | Values
------------------------
abc | 1Jan2015 | 1
abc | 1Dec2015 | 0
def | 2Feb2015 | 0
def | 2Jul2015 | 0
I want to write a piece of SQL that keeps only the row with the earliest date for each ID. So the result I want is:
ID | Date | Values
------------------------
abc | 1Jan2015 | 1
def | 2Feb2015 | 0
I know there is top n syntax but it only seems to work on the whole table not within groups.
Basically how do I do a top n within groups?
TOP can be easily rewritten using ROW_NUMBER:
select *
from tab
qualify
row_number() over (partition by id order by date) = 1
You can do this using row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by date) as seqnum
from table t
) t
where seqnum = 1;
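The subquery form can be tried outside Teradata. The sketch below uses Python's sqlite3 module (assuming SQLite 3.25 or later); the table tab, the columns dt and val, and the ISO-formatted dates are assumptions chosen so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# dt holds ISO dates as text so that ORDER BY sorts chronologically.
conn.execute("CREATE TABLE tab (id TEXT, dt TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        ("abc", "2015-01-01", 1),
        ("abc", "2015-12-01", 0),
        ("def", "2015-02-02", 0),
        ("def", "2015-07-02", 0),
    ],
)

top_n = 1  # raise this to keep the first n rows of each group
first_rows = conn.execute(
    """
    SELECT id, dt, val
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt) AS seqnum
          FROM tab t) t
    WHERE seqnum <= ?
    ORDER BY id, dt
    """,
    (top_n,),
).fetchall()
print(first_rows)
```

Changing the filter from = 1 to <= n is what turns "first row per group" into "top n per group".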

sql query distinct with Row_Number

I am fighting with the distinct keyword in sql.
I just want to display all row numbers of unique (distinct) values in a column & so I tried:
SELECT DISTINCT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
However, the code below gives me the distinct values:
SELECT distinct id FROM table WHERE fid = 64
But when I tried it with ROW_NUMBER, it does not work.
This can be done very simply; you were pretty close already:
SELECT distinct id, DENSE_RANK() OVER (ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
Use this:
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS RowNum FROM
(SELECT DISTINCT id FROM table WHERE fid = 64) Base
and put the "output" of a query as the "input" of another.
Using CTE:
; WITH Base AS (
SELECT DISTINCT id FROM table WHERE fid = 64
)
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS RowNum FROM Base
The two queries should be equivalent.
Technically you could
SELECT DISTINCT id, ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
but if you increase the number of DISTINCT fields, you have to put all these fields in the PARTITION BY, so for example
SELECT DISTINCT id, description,
ROW_NUMBER() OVER (PARTITION BY id, description ORDER BY id) AS RowNum
FROM table
WHERE fid = 64
I hope you also realize that you are going against standard naming conventions here: id should probably be a primary key, and so unique by definition, which would make DISTINCT useless on it unless you coupled the query with some JOINs/UNION ALL...
This article covers an interesting relationship between ROW_NUMBER() and DENSE_RANK() (the RANK() function is not treated specifically). When you need a generated ROW_NUMBER() on a SELECT DISTINCT statement, the ROW_NUMBER() will produce distinct values before they are removed by the DISTINCT keyword. E.g. this query
SELECT DISTINCT
v,
ROW_NUMBER() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number
... might produce this result (DISTINCT has no effect):
+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a | 1 |
| a | 2 |
| a | 3 |
| b | 4 |
| c | 5 |
| c | 6 |
| d | 7 |
| e | 8 |
+---+------------+
Whereas this query:
SELECT DISTINCT
v,
DENSE_RANK() OVER (ORDER BY v) row_number
FROM t
ORDER BY v, row_number
... produces what you probably want in this case:
+---+------------+
| V | ROW_NUMBER |
+---+------------+
| a | 1 |
| b | 2 |
| c | 3 |
| d | 4 |
| e | 5 |
+---+------------+
Note that the ORDER BY clause of the DENSE_RANK() function will need all other columns from the SELECT DISTINCT clause to work properly.
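The difference is easy to reproduce. The following is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or later); the table t and column v follow the example above, and the alias rn is a hypothetical name used instead of row_number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [(v,) for v in ["a", "a", "a", "b", "c", "c", "d", "e"]],
)

# ROW_NUMBER() gives every row a different number, so DISTINCT removes nothing.
with_row_number = conn.execute(
    "SELECT DISTINCT v, ROW_NUMBER() OVER (ORDER BY v) AS rn "
    "FROM t ORDER BY v, rn"
).fetchall()

# DENSE_RANK() gives equal values the same number, so DISTINCT collapses them.
with_dense_rank = conn.execute(
    "SELECT DISTINCT v, DENSE_RANK() OVER (ORDER BY v) AS rn "
    "FROM t ORDER BY v, rn"
).fetchall()

print(len(with_row_number), with_dense_rank)
```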
All three functions in comparison
Using PostgreSQL / Sybase / SQL standard syntax (WINDOW clause):
SELECT
v,
ROW_NUMBER() OVER (window) row_number,
RANK() OVER (window) rank,
DENSE_RANK() OVER (window) dense_rank
FROM t
WINDOW window AS (ORDER BY v)
ORDER BY v
... you'll get:
+---+------------+------+------------+
| V | ROW_NUMBER | RANK | DENSE_RANK |
+---+------------+------+------------+
| a | 1 | 1 | 1 |
| a | 2 | 1 | 1 |
| a | 3 | 1 | 1 |
| b | 4 | 4 | 2 |
| c | 5 | 5 | 3 |
| c | 6 | 5 | 3 |
| d | 7 | 7 | 4 |
| e | 8 | 8 | 5 |
+---+------------+------+------------+
Using DISTINCT causes issues as you add fields and it can also mask problems in your select. Use GROUP BY as an alternative like this:
SELECT id
,ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM table
where fid = 64
group by id
Then you can add other interesting information from your select like this:
,count(*) as thecount
or
,max(description) as description
How about something like
;WITH DistinctVals AS (
SELECT distinct id
FROM table
where fid = 64
)
SELECT id,
ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM DistinctVals
You could also try
SELECT distinct id, DENSE_RANK() OVER (ORDER BY id) AS RowNum
FROM #mytable
where fid = 64
Try this:
;WITH CTE AS (
SELECT DISTINCT id FROM table WHERE fid = 64
)
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM CTE
Try this
SELECT distinct id
FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS RowNum
FROM table
WHERE fid = 64) t
Or number the rows within each id using PARTITION BY and keep only the first row of each partition:
SELECT id
FROM (SELECT id, ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS RowNum
FROM table
WHERE fid = 64) t
WHERE t.RowNum=1
This also returns the distinct ids
The question is old and my answer might not add much, but here are my two cents for making the query a little more useful:
;WITH DistinctRecords AS (
SELECT DISTINCT [col1,col2,col3,..]
FROM tableName
where [my condition]
),
serialize AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY [colNameAsNeeded] ORDER BY [colNameNeeded]) AS Sr,*
FROM DistinctRecords
)
SELECT * FROM serialize
The usefulness of two CTEs lies in the fact that you can now use the serialized records much more easily in your query and do count(*) etc. very easily.
DistinctRecords selects all distinct records, and serialize applies serial numbers to them. Afterwards you can use the final serialized result for your purposes without clutter.
PARTITION BY might not be needed in most cases.

Grouping SQL Results based on order

I have a table with data something like this:
ID | RowNumber | Data
------------------------------
1 | 1 | Data
2 | 2 | Data
3 | 3 | Data
4 | 1 | Data
5 | 2 | Data
6 | 1 | Data
7 | 2 | Data
8 | 3 | Data
9 | 4 | Data
I want to group each set of RowNumbers so that my result is something like this:
ID | RowNumber | Group | Data
--------------------------------------
1 | 1 | a | Data
2 | 2 | a | Data
3 | 3 | a | Data
4 | 1 | b | Data
5 | 2 | b | Data
6 | 1 | c | Data
7 | 2 | c | Data
8 | 3 | c | Data
9 | 4 | c | Data
The only way I know where each group starts and stops is when the RowNumber starts over. How can I accomplish this? It also needs to be fairly efficient since the table I need to do this on has 52 Million Rows.
Additional Info
ID is truly sequential, but RowNumber may not be. I think RowNumber will always begin with 1 but for example the RowNumbers for group1 could be "1,1,2,2,3,4" and for group2 they could be "1,2,4,6", etc.
For the clarified requirements in the comments
The rownumbers for group1 could be "1,1,2,2,3,4" and for group2 they
could be "1,2,4,6" ... a higher number followed by a lower would be a
new group.
A SQL Server 2012 solution could be as follows.
Use LAG to access the previous row and set a flag to 1 if that row is the start of a new group or 0 otherwise.
Calculate a running sum of these flags to use as the grouping value.
Code
WITH T1 AS
(
SELECT *,
LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
FROM YourTable
), T2 AS
(
SELECT *,
IIF(PrevRowNumber IS NULL OR PrevRowNumber > RowNumber, 1, 0) AS NewGroup
FROM T1
)
SELECT ID,
RowNumber,
Data,
SUM(NewGroup) OVER (ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
Assuming ID is the clustered index the plan for this has one scan against YourTable and avoids any sort operations.
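The LAG plus running-sum approach can be checked on the clarified example ("1,1,2,2,3,4" followed by "1,2,4,6"). This is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or later); it uses CASE in place of IIF for portability, and the sample IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER PRIMARY KEY, RowNumber INTEGER)")
# Two groups: RowNumbers 1,1,2,2,3,4 and then 1,2,4,6.
row_numbers = [1, 1, 2, 2, 3, 4, 1, 2, 4, 6]
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    list(enumerate(row_numbers, start=1)),
)

grouping_sql = """
WITH T1 AS (
    SELECT ID, RowNumber,
           LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
    FROM YourTable
), T2 AS (
    -- Flag a row as a group start when there is no previous row
    -- or the previous RowNumber is higher than the current one.
    SELECT ID, RowNumber,
           CASE WHEN PrevRowNumber IS NULL OR PrevRowNumber > RowNumber
                THEN 1 ELSE 0 END AS NewGroup
    FROM T1
)
SELECT ID, RowNumber,
       SUM(NewGroup) OVER (ORDER BY ID
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
ORDER BY ID
"""
groups = [grp for _id, _rn, grp in conn.execute(grouping_sql)]
print(groups)
```

Repeated RowNumbers inside a group do not start a new group, because the flag is only set when the value strictly decreases.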
If the ids are truly sequential and RowNumber increases by exactly 1 within each group, you can do:
select t.*,
(id - rowNumber) as grp
from t
Also you can use recursive CTE
;WITH cte AS
(
SELECT ID, RowNumber, Data, 1 AS [Group]
FROM dbo.test1
WHERE ID = 1
UNION ALL
SELECT t.ID, t.RowNumber, t.Data,
CASE WHEN t.RowNumber != 1 THEN c.[Group] ELSE c.[Group] + 1 END
FROM dbo.test1 t JOIN cte c ON t.ID = c.ID + 1
)
SELECT *
FROM cte
How about:
select ID, RowNumber, Data, dense_rank() over (order by grp) as Grp
from (
select *, (select max(ID) from [Your Table] where ID <= t.ID and RowNumber = 1) as grp
from [Your Table] t
) t
order by ID
This should work on SQL Server 2005. You could also use rank() instead if you don't care about consecutive numbers.

How can I calculate the remaining amount per row?

I have a table, and for each row id I want to find the amount remaining from the total. However, the amounts should be subtracted in ascending order.
id amount
1 3
2 2
3 1
4 5
The results should look like this:
id remainder
1 10
2 8
3 5
4 0
Any thoughts on how to accomplish this? I'm guessing that the over clause is the way to go, but I can't quite piece it together. Thanks.
Since you didn't specify your RDBMS, I will just assume it's Postgresql ;-)
select *, sum(amount) over() - sum(amount) over(order by amount) as remainder
from tbl;
Output:
| ID | AMOUNT | REMAINDER |
---------------------------
| 3 | 1 | 10 |
| 2 | 2 | 8 |
| 1 | 3 | 5 |
| 4 | 5 | 0 |
How it works: http://www.sqlfiddle.com/#!1/c446a/5
It works in SQL Server 2012 too: http://www.sqlfiddle.com/#!6/c446a/1
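For a quick local check of the same expression, here is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or later). The table name tbl and the four rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)", [(1, 3), (2, 2), (3, 1), (4, 5)]
)

# Grand total minus the running total in ascending order of amount.
remainders = conn.execute(
    """
    SELECT id, amount,
           SUM(amount) OVER () - SUM(amount) OVER (ORDER BY amount) AS remainder
    FROM tbl
    ORDER BY amount
    """
).fetchall()
print(remainders)
```

With the default window frame, rows with equal amounts share the same running total, so ties would get the same remainder; adding a tie-breaker such as id to the ORDER BY avoids that.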
Thinking of solution for SQL Server 2008...
Btw, is your ID just a mere row number? If it is, just do this:
select
row_number() over(order by amount) as rn
, sum(amount) over() - sum(amount) over(order by amount) as remainder
from tbl
order by rn;
Output:
| RN | REMAINDER |
------------------
| 1 | 10 |
| 2 | 8 |
| 3 | 5 |
| 4 | 0 |
But if you really need the ID intact and move the smallest amount on top, do this:
with a as
(
select *, sum(amount) over() - sum(amount) over(order by amount) as remainder,
row_number() over(order by id) as id_sort,
row_number() over(order by amount) as amount_sort
from tbl
)
select a.id, sort.remainder
from a
join a sort on sort.amount_sort = a.id_sort
order by a.id_sort;
Output:
| ID | REMAINDER |
------------------
| 1 | 10 |
| 2 | 8 |
| 3 | 5 |
| 4 | 0 |
See query progression here: http://www.sqlfiddle.com/#!6/c446a/11
I just want to offer a simpler way to do this in descending order:
select id, sum(amount) over (order by id desc) - amount as Remainder
from t
This will work in Oracle, SQL Server 2012, and Postgres.
The general solution requires a self join:
select t.id, coalesce(sum(tafter.amount), 0) as Remainder
from t left outer join
t tafter
on t.id < tafter.id
group by t.id
SQL Server 2008 answer. I can't provide an SQL Fiddle; it seems to strip the begin keyword, resulting in syntax errors. I tested this on my machine though:
create function RunningTotalGuarded()
returns @ReturnTable table(
Id int,
Amount int not null,
RunningTotal int not null,
RN int identity(1,1) not null primary key clustered
)
as
begin
insert into @ReturnTable(id, amount, RunningTotal)
select id, amount, 0 from tbl order by amount;
declare @RunningTotal numeric(16,4) = 0;
declare @rn_check int = 0;
update @ReturnTable
set
@rn_check = @rn_check + 1
,@RunningTotal =
case when rn = @rn_check then
@RunningTotal + Amount
else
1 / 0
end
,RunningTotal = @RunningTotal;
return;
end;
To achieve your desired output:
with a as
(
select *, sum(amount) over() - RunningTotal as remainder
, row_number() over(order by id) as id_order
from RunningTotalGuarded()
)
select a.id, amount_order.remainder
from a
inner join a amount_order on amount_order.rn = a.id_order;
Rationale for guarded running total: http://www.ienablemuch.com/2012/05/recursive-cte-is-evil-and-cursor-is.html
Choose the lesser evil ;-)