SQL ORDER BY with grouping - sql

I have the following query
SELECT Id, Request, BookingDate, BookingId FROM Table ORDER BY Request DESC, Date
If a row has a similar ForeignKeyId, I would like that to go in before the next ordered row like:
Request Date ForeignKeyId
Request3 01-Jun-11 56
Request2 03-Jun-11 89
NULL 03-Jun-11 89
Request1 05-Jun-11 11
NULL 20-Jul-11 57
I have been looking at RANK and OVER but haven't found a simple fix.
EDIT
I've edited above to show the actual fields and pasted data using the following query from Andomar's answer
select *
from (
select row_number() over (partition by BookingId order by Request DESC) rn
, Request, BookingDate, BookingID
from Table
WHERE Date = '28 aug 11'
) G
order by
rn
, Request DESC, BookingDate
1 ffffff 23/01/2011 15:57 350821
1 ddddddd 10/01/2011 16:28 348856
1 ccccccc 13/09/2010 14:44 338120
1 aaaaaaaaaa 21/05/2011 20:21 364422
1 123 17/09/2010 16:32 339202
1 NULL NULL
2 gggggg 08/12/2010 14:39 346634
2 NULL NULL
2 17/09/2010 16:32 339202
2 NULL 10/04/2011 15:08 361066
2 NULL 02/05/2011 14:12 362619
2 NULL 11/06/2011 13:55 366082
3 NULL NULL
3 16/10/2010 13:06 343023
3 22/10/2010 10:35 343479
3 30/04/2011 10:49 362435
The booking ID's 339202 should appear next to each other but don't

You could partition by ForeignKeyId, then sort each second or lower row below their "head". With the "head" defined as the first row for that ForeignKeyId. Example, sorting on Request:
; with numbered as
(
select row_number() over (partition by ForeignKeyID order by Request) rn
, *
from #t
)
select *
from numbered n1
order by
(
select Request
from numbered n2
where n2.ForeignKeyID = n1.ForeignKeyID
and n2.rn = 1
)
, n1.Request
The subquery is required because SQL Server doesn't allow row_number in an order by clause.
Full example at SE Data.

Related

Query to pull data from column based off max value of second column

I have a table that has [Order], [Yield], [Scrap], [OpAc] columns. I need to pull the yield based on the max value of [OpAc].
Order
Yield
Scrap
OpAc
1234
140
0
10
1234
140
0
20
1234
130
10
30
1234
130
0
40
1234
125
5
50
1234
110
15
60
1235
140
0
10
1235
138
2
20
1235
138
0
30
1235
138
0
40
1235
138
0
50
1235
137
1
60
1235
137
0
70
Expected Results
Order
Yield
1234
110
1235
137
The query that I have tried is
select [Order], [Yield], MAX([OpAc]) as Max_OpAc
from SCRAP
GROUP BY [Order], [Yield]
order by [order]
This produces
Order
Yield
Max_OpAc
1234
110
60
1234
125
50
1234
130
40
1234
140
20
1235
137
70
1235
138
50
1235
140
10
I've tried setting up some CTE queries to break it down into separate functions but I keep getting caught at this step.
WITH CTE1 AS(
SELECT ROW_NUMBER() OVER(PARTITION BY [Order] ORDER BY [Order],[OpAc]) AS RN , *
FROM SAP_SCRAP
),
This proved to be redundant due to the fact that the [OpAc] field is sequential for each step.
Thanks in advance for any help
You almost got it!
WITH Orders_By_OpAc_Desc AS (
SELECT
[Order],
[Yield].
ROW_NUMBER() OVER (PARTITION BY [Order] ORDER BY OpAc DESC) AS [rn],
FROM
SCRAP
)
SELECT [Order],
[Yield]
FROM
Orders_By_OpAc_Desc
WHERE
rn = 1
The trick here is ROW_NUMBER() OVER (PARTITION BY [Order] ORDER BY OpAc DESC) AS [rn]. It might be confusing to understand in SQL, but when expressed in words it's a bit clearer.
This statement takes each group of rows with the same Order value (PARTITION BY [Order]), orders each group by OpAc in descending order so that the higher OpAc values end up "on top" of the group (ORDER BY OpAc DESC), and numbers each row in the group "top" to "bottom", starting with 1 (ROW_NUMBER()).
Meaning, each row with this number set to 1 has the highest OpAc value for the OrderId.
Wrap that into a CTE and then select just the rows with this number (rn) set to 1. Voi-la.
You definitely want the OVER (PARTITION BY) but MAX() is also an option here. You want something like:
SELECT
*
FROM
(
SELECT
t3.*
, MAX(OpAc) OVER (PARTITION BY [Order]) max1
FROM
SCRAP t3
) a
WHERE
a.Max1 = a.OpAc
for MAX()
Depending on your SQL Server edition, version, and query needs, you may be able to use FIRST_VALUE() as well:
SELECT
DISTINCT
t3.[Order],
FIRST_VALUE(Yield) OVER(PARTITION BY [Order] ORDER BY OpAc DESC) Yield
FROM
SCRAP t3
You were so close. Just missing an ORDER BY OpAc DESC in your ROW_NUMBER function.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE orders (
[Order] int null
, Yield int null
, Scrap int null
, OpAc int null
);
INSERT INTO orders ([Order], Yield, Scrap, OpAc)
VALUES (1234,140,0,10)
, (1234,140,0,20)
, (1234,130,10,30)
, (1234,130,0,40)
, (1234,125,5,50)
, (1234,110,15,60)
, (1235,140,0,10)
, (1235,138,2,20)
, (1235,138,0,30)
, (1235,138,0,40)
, (1235,138,0,50)
, (1235,137,1,60)
, (1235,137,0,70)
;
Query 1:
WITH CTE1 AS (
SELECT *
, ROW_NUMBER() OVER(PARTITION BY [Order] ORDER BY OpAc DESC) as row_num
FROM orders
)
SELECT *
FROM CTE1 as c
WHERE c.row_num = 1
Results:
| Order | Yield | Scrap | OpAc | row_num |
|-------|-------|-------|------|---------|
| 1234 | 110 | 15 | 60 | 1 |
| 1235 | 137 | 0 | 70 | 1 |

HOW to find the avg difference in order time in SQL

the dataset of subquery is here
id itm_id paid_at ord_r total_r
17 3266 2013-05-25 08:27:17 1 3
17 3219 2013-05-25 08:27:17 2 3
17 3964 2013-05-25 08:27:17 3 3
25 2105 2013-05-17 03:11:48 1 2
25 1376 2013-05-17 03:11:48 2 2
63 2140 2013-07-07 11:26:45 1 3
the code is here
for find out the average difference in order time, BUt i looked up here, and i found this piece of code
But i didn't understand why i use (toatl-1)
if someone kindly explain the process
i am doing this on mode analytics i don't know which machine it use, whether mysql or sqlserver
SELECT
user_id,
item_id,
CASE WHEN total_order-1 > 0
THEN datediff(day, max(paid_at), min(paid_at))/ (total_order-1)
ELSE datediff(day, max(paid_at), min(paid_at)) END AS avg_time
FROM
(SELECT
user_id,
item_id,
paid_at,
ROW_NUMBER( ) OVER (PARTITION BY user_id ORDER by paid_at ASC) AS order_rank,
COUNT(item_id) OVER(PARTITION BY user_id ORDER BY paid_at ASC) AS total_order
from
dsv1069.orders) user_level
But the problems is
ERROR: column "day" does not exist

SQL query to group by data but with order by clause

I have table booking in which I have data
GUEST_NO HOTEL_NO DATE_FROM DATE_TO ROOM_NO
1 1 2015-05-07 2015-05-08 103
1 1 2015-05-11 2015-05-12 104
1 1 2015-05-14 2015-05-15 103
1 1 2015-05-17 2015-05-20 101
2 2 2015-05-01 2015-05-02 204
2 2 2015-05-04 2015-05-05 203
2 2 2015-05-17 2015-05-22 202
What I want is to get the result as.
1 ) It should show output as Guest_no, Hotel_no, Room_no, and column with count as number of time previous three column combination repeated.
So OutPut should like
GUEST_NO HOTEL_NO ROOM_NO Count
1 1 103 2
1 1 104 1
1 1 101 1
2 2 204 1
etc. But I want result to in ordered way e.g.: The output should be order by bk.date_to desc
My query is as below its showing me count but if I use order by its not working
select bk.guest_no, bk.hotel_no, bk.room_no,
count(bk.guest_no+bk.hotel_no+bk.room_no) as noOfTimesRoomBooked
from booking bk
group by bk.guest_no, bk.hotel_no, bk.room_no, bk.date_to
order by bk.date_to desc
So with adding order by result is showing different , because as I added order by date_to column so i have to add this column is group by clause too which will end up in different result as below
GUEST_NO HOTEL_NO ROOM_NO Count
1 1 103 1
1 1 104 1
1 1 103 1
1 1 101 1
2 2 204 1
Which is not the output I want.
I want these four column but with order by desc of date_to column and count as no of repetition of first 3 columns
I think a good way to do this would be grouping by guest_no, hotel_no and room_no, and sorting by the maximum (i.e. most recent) booking date in each group.
SELECT
guest_no,
hotel_no,
room_no,
COUNT(1) AS BookingCount
FROM
booking
GROUP BY
guest_no,
hotel_no,
room_no
ORDER BY
MAX(date_to) DESC;
Maybe this is what you're looking for?
select
guest_no,
hotel_no,
room_no,
count(*) as Count
from
booking
group by
guest_no,
hotel_no,
room_no
order by
min(date_to) desc
Or maybe max() instead of min(). SQL Fiddle: http://sqlfiddle.com/#!6/e684c/3
You could try this.
select t.* from
(
select bk.guest_no, bk.hotel_no, bk.room_no, bk.date_to,
count(*) as noOfTimesBooked from booking bk
group by bk.guest_no, bk.hotel_no, bk.room_no, bk.date_to
) t
order by t.date_to
You will also have to select date_to and then group the result by it.
If you use 'group by' clause, SQL Server doesn't allow you to use 'order by'. So you can make a sub query and use 'order by' in the outer query.
SELECT * FROM
(select bk.guest_no,bk.hotel_no,bk.room_no
,count(bk.guest_no+bk.hotel_no+bk.room_no) as noOfTimesRoomBooked,
(SELECT MAX(date_to) FROM booking CK
WHERE CK.guest_no=BK.guest_no AND bk.hotel_no=CK.bk.hotel_no
bk.room_no=CK.ROOM_NO ) AS DATEBOOK
from booking bk
group by bk.guest_no,bk.hotel_no,bk.room_no,bk.date_to) A
ORDER BY DATEBOOK
IT MIGHT HELP YOU

How to add a running count to rows in a 'streak' of consecutive days

Thanks to Mike for the suggestion to add the create/insert statements.
create table test (
pid integer not null,
date date not null,
primary key (pid, date)
);
insert into test values
(1,'2014-10-1')
, (1,'2014-10-2')
, (1,'2014-10-3')
, (1,'2014-10-5')
, (1,'2014-10-7')
, (2,'2014-10-1')
, (2,'2014-10-2')
, (2,'2014-10-3')
, (2,'2014-10-5')
, (2,'2014-10-7');
I want to add a new column that is 'days in current streak'
so the result would look like:
pid | date | in_streak
-------|-----------|----------
1 | 2014-10-1 | 1
1 | 2014-10-2 | 2
1 | 2014-10-3 | 3
1 | 2014-10-5 | 1
1 | 2014-10-7 | 1
2 | 2014-10-2 | 1
2 | 2014-10-3 | 2
2 | 2014-10-4 | 3
2 | 2014-10-6 | 1
I've been trying to use the answers from
PostgreSQL: find number of consecutive days up until now
Return rows of the latest 'streak' of data
but I can't work out how to use the dense_rank() trick with other window functions to get the right result.
Building on this table (not using the SQL keyword "date" as column name.):
CREATE TABLE tbl(
pid int
, the_date date
, PRIMARY KEY (pid, the_date)
);
Query:
SELECT pid, the_date
, row_number() OVER (PARTITION BY pid, grp ORDER BY the_date) AS in_streak
FROM (
SELECT *
, the_date - '2000-01-01'::date
- row_number() OVER (PARTITION BY pid ORDER BY the_date) AS grp
FROM tbl
) sub
ORDER BY pid, the_date;
Subtracting a date from another date yields an integer. Since you are looking for consecutive days, every next row would be greater by one. If we subtract row_number() from that, the whole streak ends up in the same group (grp) per pid. Then it's simple to deal out number per group.
grp is calculated with two subtractions, which should be fastest. An equally fast alternative could be:
the_date - row_number() OVER (PARTITION BY pid ORDER BY the_date) * interval '1d' AS grp
One multiplication, one subtraction. String concatenation and casting is more expensive. Test with EXPLAIN ANALYZE.
Don't forget to partition by pid additionally in both steps, or you'll inadvertently mix groups that should be separated.
Using a subquery, since that is typically faster than a CTE. There is nothing here that a plain subquery couldn't do.
And since you mentioned it: dense_rank() is obviously not necessary here. Basic row_number() does the job.
You'll get more attention if you include CREATE TABLE statements and INSERT statements in your question.
create table test (
pid integer not null,
date date not null,
primary key (pid, date)
);
insert into test values
(1,'2014-10-1'), (1,'2014-10-2'), (1,'2014-10-3'), (1,'2014-10-5'),
(1,'2014-10-7'), (2,'2014-10-1'), (2,'2014-10-2'), (2,'2014-10-3'),
(2,'2014-10-5'), (2,'2014-10-7');
The principle is simple. A streak of distinct, consecutive dates minus row_number() is a constant. You can group by the constant, and take the dense_rank() over that result.
with grouped_dates as (
select pid, date,
(date - (row_number() over (partition by pid order by date) || ' days')::interval)::date as grouping_date
from test
)
select * , dense_rank() over (partition by grouping_date order by date) as in_streak
from grouped_dates
order by pid, date
pid date grouping_date in_streak
--
1 2014-10-01 2014-09-30 1
1 2014-10-02 2014-09-30 2
1 2014-10-03 2014-09-30 3
1 2014-10-05 2014-10-01 1
1 2014-10-07 2014-10-02 1
2 2014-10-01 2014-09-30 1
2 2014-10-02 2014-09-30 2
2 2014-10-03 2014-09-30 3
2 2014-10-05 2014-10-01 1
2 2014-10-07 2014-10-02 1

How to select 6 top records of each individual records at the database when selecting from all rows

Assume that i have the following table
CREATE TABLE #tblUsersPokemons (
RecordId int NOT NULL,
PokemonId int NOT NULL,
PokemonExp int NOT NULL,
PokemonLevel int NOT NULL,
UserId int NOT NULL
)
Now the below query works awesome as expected
select
SUM(cast(PokemonExp as bigint)) as TotalExp,
MAX(PokemonLevel) as MaxPokeLevel,
Count(PokemonId) as TotalPoke,
UserId
from #tblUsersPokemons
group by UserId
Here example result of such query
ToplamExp MaxPokeLevel TotalPoke UserId
----------- --------------- ----------- --------
29372294 101 4 1
1134696 98 1 2
1400 98 1 101
24534365 98 4 102
1400 98 1 1102
1400 98 1 1103
1400 98 1 2102
1400 98 1 2103
789220 98 7 2105
1468 98 1 3104
Now here my question comes
I want to limit counted PokemonIds. What i mean is i want to select maximum 6 of each same PokemonId records. And from these records top 6 ordered desc by PokemonExp should be counted in.
For example a user has the below records
From this table the query should take record id : 1,2,3,4,5,6,9 and not take 7,8 since top 6 records for PokemonId 1 taken
If I understand correctly, you want the aggregations on the top 6 rows for each user. You can do this easily using row_number():
select SUM(cast(PokemonExp as bigint)) as ToplamExp,
MAX(PokemonLevel) as MaxPokeLevel,
Count(PokemonId) as TotalPoke,UserId
from (select p.*,
row_number() over (partition by userid order by pokemanexp desc) as seqnum
from tblUsersPokemons p
) p
where seqnum <= 6
group by UserId;
EDIT:
I think you want to include PokemonId in the partition by clause:
select SUM(cast(PokemonExp as bigint)) as ToplamExp,
MAX(PokemonLevel) as MaxPokeLevel,
Count(PokemonId) as TotalPoke,UserId
from (select p.*,
row_number() over (partition by userid, PokemonId
order by pokemanexp desc) as seqnum
from tblUsersPokemons p
) p
where seqnum <= 6
group by UserId;