SQL Server: get row order number in given month

I am using SQL Server and trying to get the order of a transaction within a month. For example, if I have 10 transactions in a month, I want to know where the selected transaction falls: whether it was the 1st, 2nd, 3rd, etc. transaction in that month. I tried using ROW_NUMBER() like this:
SELECT
transactionid,
ROW_NUMBER() OVER(ORDER BY transactionexecutiontime ASC) rowOrderNumber
FROM
dbo.exportebop
WHERE
MONTH(transactionexecutiontime) = 1
ORDER BY
transactionexecutiontime ASC
This works if I do not specify the TransactionID, because I get multiple rows. The moment I add TransactionID = idNumber, it returns a single row and ROW_NUMBER() is always 1.
I know ROW_NUMBER() numbers the rows of the result set, but I need a way to calculate the order number within a month based on TransactionExecutionTime instead.
Does anyone know how I can achieve this?

You need to apply ROW_NUMBER() first and filter afterwards. I would use a CTE to do that:
;WITH CTE AS (
SELECT
TransactionID,
ROW_NUMBER() OVER(ORDER BY TransactionExecutionTime ASC) rowOrderNumber
FROM dbo.EXPORTEBOP
WHERE MONTH(TransactionExecutionTime) = 1
)
SELECT *
FROM CTE
WHERE TransactionID = 1
ORDER BY rowOrderNumber ASC
As a subquery:
DECLARE @Month INT = 1;
SELECT *
FROM (
SELECT
TransactionID,
ROW_NUMBER() OVER(ORDER BY TransactionExecutionTime ASC) rowOrderNumber
FROM dbo.EXPORTEBOP
WHERE MONTH(TransactionExecutionTime) = @Month
) X
WHERE TransactionID = 1
ORDER BY rowOrderNumber ASC
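The difference between filtering before and after numbering can be checked with a small sketch. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) rather than SQL Server, so strftime('%m', ...) stands in for MONTH(), and the four sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exportebop (transactionid INTEGER, transactionexecutiontime TEXT)"
)
# Hypothetical sample data: three January transactions and one in February.
conn.executemany(
    "INSERT INTO exportebop VALUES (?, ?)",
    [
        (7, "2019-01-03 09:00:00"),
        (1, "2019-01-15 12:30:00"),
        (4, "2019-01-28 17:45:00"),
        (9, "2019-02-02 08:00:00"),
    ],
)

numbering = """
    SELECT transactionid,
           ROW_NUMBER() OVER (ORDER BY transactionexecutiontime) AS rowOrderNumber
    FROM exportebop
    WHERE strftime('%m', transactionexecutiontime) = '01'
"""

# Filter applied in the same query: only one row is left to number.
filtered_first = conn.execute(numbering + " AND transactionid = 1").fetchall()

# Numbering done in a subquery, filter applied afterwards.
numbered_first = conn.execute(
    "SELECT * FROM (" + numbering + ") AS x WHERE transactionid = 1"
).fetchall()

print(filtered_first)  # [(1, 1)]
print(numbered_first)  # [(1, 2)]
```

With this data, transaction 1 is the second transaction of January, which only the second form reports.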

If you want all months listed, you can partition the row number by month:
SELECT
transactionid,
Month = MONTH(transactionexecutiontime),
rowOrderNumber = ROW_NUMBER() OVER(
PARTITION BY
MONTH(transactionexecutiontime) -- Will start different "row numbers" for each month
ORDER BY
transactionexecutiontime ASC)
FROM
dbo.exportebop
ORDER BY
MONTH(transactionexecutiontime),
transactionexecutiontime ASC
If you want a particular transaction ID, you can't filter on it in the same query that computes ROW_NUMBER(), because the row number is generated only for the rows that survive the WHERE clause, which would be a single row (your filtered transaction). You have to filter in another query that references the first (with a CTE or a subquery):
;WITH RankedTransactionsByMonth AS
(
SELECT
transactionid,
Month = MONTH(transactionexecutiontime),
rowOrderNumber = ROW_NUMBER() OVER(
PARTITION BY
MONTH(transactionexecutiontime) -- Will start different "row numbers" for each month
ORDER BY
transactionexecutiontime ASC)
FROM
dbo.exportebop
-- No Order by in subqueries without TOP!!
)
SELECT
R.*
FROM
RankedTransactionsByMonth AS R
WHERE
R.transactionid = 1
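One caveat worth checking: MONTH() alone puts January of every year into the same partition. The sketch below illustrates the effect, assuming Python's sqlite3 module instead of SQL Server and made-up sample rows; position_of is a hypothetical helper. In T-SQL the equivalent would be to partition by both YEAR(transactionexecutiontime) and MONTH(transactionexecutiontime).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exportebop (transactionid INTEGER, transactionexecutiontime TEXT)"
)
# Hypothetical sample data spanning two different Januaries.
conn.executemany(
    "INSERT INTO exportebop VALUES (?, ?)",
    [
        (1, "2018-01-10 10:00:00"),
        (2, "2019-01-05 10:00:00"),
        (3, "2019-01-20 10:00:00"),
        (4, "2019-02-01 10:00:00"),
    ],
)


def position_of(transaction_id, period_expr):
    """Return the row number of one transaction within its period partition."""
    sql = f"""
        SELECT rowOrderNumber FROM (
            SELECT transactionid,
                   ROW_NUMBER() OVER (
                       PARTITION BY {period_expr}
                       ORDER BY transactionexecutiontime
                   ) AS rowOrderNumber
            FROM exportebop
        ) AS ranked
        WHERE transactionid = ?
    """
    return conn.execute(sql, (transaction_id,)).fetchone()[0]


# Month number only: January 2018 and January 2019 share one partition.
by_month_only = position_of(3, "strftime('%m', transactionexecutiontime)")
# Year and month: each calendar month gets its own numbering.
by_year_month = position_of(3, "strftime('%Y-%m', transactionexecutiontime)")

print(by_month_only, by_year_month)  # 3 2
```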

Related

The SQL equivalent of pandas sort, group by, count and first

I have a table that looks like:
I need to determine what the top 3 most common viewplanes are captured when first scanning a new patient (I believe the patients are indicated by the subject_label column).
In Pandas, this looks like:
df.sort_values('datetime').groupby('subject_label').first().viewplane
In SQL, I have tried:
WITH added_row_number
(SELECT
*,
ROW_NUMBER() OVER(PARTITION BY subject_label ORDER BY datetime ASC) AS row_number
FROM image_list_csv)
SELECT lower(viewplane),
COUNT(lower(viewplane)) OVER (ORDER BY datetime ASC) AS running_total
FROM added_row_number
WHERE ROW_NUMBER = 1
ORDER BY running_total DESC;
Which gives:
I have also tried:
WITH added_row_number AS ( SELECT
*,
ROW_NUMBER() OVER(PARTITION BY subject_label, datetime ORDER BY datetime DESC) AS row_number FROM image_list_csv ) SELECT
LOWER(viewplane), datetime FROM added_row_number WHERE row_number = 1;
Which gives:

Complex Ranking in SQL (Teradata)

I have a peculiar problem at hand. I need to rank in the following manner:
Each ID gets a new rank.
rank #1 is assigned to the ID with the lowest date. However, the subsequent dates for that particular ID can be higher but they will get the incremental rank w.r.t other IDs.
(E.g. ADF32 series will be considered to be ranked first as it had the lowest date, although it ends with dates 09-Nov, and RT659 starts with 13-Aug it will be ranked subsequently)
For a particular ID, if the days are consecutive then ranks are same, else they add by 1.
For a particular ID, ranks are given in date ASC.
How to formulate a query?
You need two steps:
select
id_col
,dt_col
,dense_rank()
over (order by min_dt, id_col, dt_col - rnk) as part_col
from
(
select
id_col
,dt_col
,min(dt_col)
over (partition by id_col) as min_dt
,rank()
over (partition by id_col
order by dt_col) as rnk
from tab
) as dt
dt_col - rnk calculates the same result for consecutive dates, so they get the same rank.
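To see why subtracting the rank from the date groups consecutive days, here is a small sketch of the same two-step query. It assumes Python's sqlite3 module rather than Teradata, so the date is first turned into a day number with julianday(); the sample dates are invented around the ADF32 and RT659 IDs mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id_col TEXT, dt_col TEXT)")
# Hypothetical sample data: ADF32 starts first but also has a late date.
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [
        ("ADF32", "2018-08-01"), ("ADF32", "2018-08-02"), ("ADF32", "2018-11-09"),
        ("RT659", "2018-08-13"), ("RT659", "2018-08-14"),
    ],
)

rows = conn.execute("""
    SELECT id_col, dt_col,
           DENSE_RANK() OVER (ORDER BY min_dt, id_col, day_no - rnk) AS part_col
    FROM (
        SELECT id_col, dt_col,
               -- whole-day number, so that "date minus rank" is plain arithmetic
               CAST(julianday(dt_col) AS INTEGER) AS day_no,
               MIN(dt_col) OVER (PARTITION BY id_col) AS min_dt,
               RANK() OVER (PARTITION BY id_col ORDER BY dt_col) AS rnk
        FROM tab
    ) AS dt
    ORDER BY part_col, dt_col
""").fetchall()

for row in rows:
    print(row)
# ('ADF32', '2018-08-01', 1)
# ('ADF32', '2018-08-02', 1)
# ('ADF32', '2018-11-09', 2)
# ('RT659', '2018-08-13', 3)
# ('RT659', '2018-08-14', 3)
```

Within one ID, each consecutive day raises both the day number and the rank by one, so their difference stays constant until a gap appears.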
Try datediff on lead/lag and then perform partitioned ranking
select t.ID_COL,t.dt_col,
rank() over(partition by t.ID_COL, t.date_diff order by t.dt_col desc) as rankk
from ( SELECT ID_COL,dt_col,
DATEDIFF(day, Lag(dt_col, 1) OVER(ORDER BY dt_col),dt_col) as date_diff FROM table1 ) t
One way to think about this problem is "when to add 1 to the rank". Well, that occurs when the previous value on a row with the same id_col differs by more than one day. Or when the row is the earliest day for an id.
This turns the problem into a cumulative sum:
select t.*,
sum(case when prev_dt_col = dt_col - 1 then 0 else 1
end) over
(order by min_dt_col, id_col, dt_col) as ranking
from (select t.*,
lag(dt_col) over (partition by id_col order by dt_col) as prev_dt_col,
min(dt_col) over (partition by id_col) as min_dt_col
from t
) t;
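The cumulative-sum formulation can be tried out the same way. This is a sketch under assumptions: it uses Python's sqlite3 module, where date(dt_col, '-1 day') plays the role of dt_col - 1, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id_col TEXT, dt_col TEXT)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("ADF32", "2018-08-01"), ("ADF32", "2018-08-02"), ("ADF32", "2018-11-09"),
        ("RT659", "2018-08-13"), ("RT659", "2018-08-14"),
    ],
)

rows = conn.execute("""
    SELECT id_col, dt_col,
           -- add 1 whenever the previous row of the same id is not the day before
           SUM(CASE WHEN prev_dt_col = date(dt_col, '-1 day') THEN 0 ELSE 1 END)
               OVER (ORDER BY min_dt_col, id_col, dt_col) AS ranking
    FROM (
        SELECT id_col, dt_col,
               LAG(dt_col) OVER (PARTITION BY id_col ORDER BY dt_col) AS prev_dt_col,
               MIN(dt_col) OVER (PARTITION BY id_col) AS min_dt_col
        FROM t
    ) AS s
    ORDER BY ranking, dt_col
""").fetchall()

rankings = [r[2] for r in rows]
print(rankings)  # [1, 1, 2, 3, 3]
```

The first row of each ID has no previous date, so the comparison is NULL and the CASE falls through to 1, which starts a new rank.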

How to retrieve MAX Turntime of Top Two earliest date?

How would I construct a query to receive the MAX TurnTime per ID of the first 2 rounds? A round is defined as running from the minimum Beginning_Date to the minimum End_Date of an ID, without reusing either of the dates for the second round's Turn Time calculation.
You can use row_number() . . . twice:
select d.*
from (select d.*,
row_number() over (partition by id order by turn_time desc) as seqnum_turntime
from (select d.*,
row_number() over (partition by id order by beginning_end desc) as seqnum_round
from data d
) d
where seqnum_round <= 2
) d
where seqnum_turntime = 1;
The innermost subquery gets the first two rounds. The outer subquery gets the maximum.
You could express this without window functions as well:
select top (1) with ties d.*
from data d
where d.beginning_date <= (select d2.beginning_date
from data d2
where d2.id = d.id
order by d2.beginning_date -- OFFSET/FETCH requires an ORDER BY
offset 1 row fetch first 1 row only
)
order by row_number() over (partition by id order by turn_time desc);
SELECT
ID
,turn_time
,beginning_date
,end_date
FROM
(
SELECT
ID
,MAX(turn_time) OVER (PARTITION BY Id ORDER BY BeginningDate ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS turn_time --Maximum turn time of the current row and preceding row
,MIN(BeginningDate) OVER (PARTITION BY Id ORDER BY BeginningDate ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS beginning_date --Minimum begin date over current row and preceding row (could also use LAG)
,end_date
,ROW_NUMBER() OVER (PARTITION BY Id ORDER BY BeginningDate) AS Turn_Number
FROM
<whatever your table is>
) turn_summary
WHERE
Turn_Number = 2

SQL Finding five largest numbers instead of one Max in a table

I have a table and I need to run a query that contains some aggregate functions such as maximum, average, standard deviation, ...
but instead of a single maximum it should return the 5 largest values.
The simplified query is something like this:
SELECT OSI_KEY , MAX(VALUE) , AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
and I need some Magical ;) Query like this:
SELECT OSI_KEY , MAX1(VALUE) ,MAX2(VALUE) ,MAX3(VALUE) ,MAX4(VALUE) , MAX5(VALUE) ,
AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
I appreciate your considerations.
Oracle has an NTH_VALUE() function. Unfortunately, it is only an analytic function and not an aggregate function. This leads to the strange construct of SELECT DISTINCT with a bunch of analytic functions:
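SELECT DISTINCT OSI_KEY,
MAX(VALUE) OVER (PARTITION BY OSI_KEY) as MAX_1,
-- The explicit frame lets NTH_VALUE see the whole partition on every row;
-- with the default frame, rows before the nth one would return NULL.
NTH_VALUE(VALUE, 2) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_2,
NTH_VALUE(VALUE, 3) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_3,
NTH_VALUE(VALUE, 4) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_4,
NTH_VALUE(VALUE, 5) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as MAX_5,
AVG(VALUE) OVER (PARTITION BY OSI_KEY),
STDDEV(VALUE) OVER (PARTITION BY OSI_KEY),
variance(VALUE) OVER (PARTITION BY OSI_KEY)
FROM DATA_VALUES_5MIN_6_2013
ORDER BY OSI_KEY;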
You can also do this using conditional aggregation, with a row_number() or dense_rank() in a subquery.
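A minimal sketch of that conditional-aggregation idea follows. It assumes Python's sqlite3 module rather than Oracle, a hypothetical table name data_values, and made-up values; STDDEV and VARIANCE are left out only because SQLite does not ship them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_values (osi_key INTEGER, value REAL)")
# Hypothetical sample data: key 1 has six values, key 2 only two.
conn.executemany(
    "INSERT INTO data_values VALUES (?, ?)",
    [(1, v) for v in (10, 50, 30, 20, 60, 40)] + [(2, v) for v in (5, 7)],
)

rows = conn.execute("""
    SELECT osi_key,
           -- each CASE keeps exactly one row per key; MAX just picks it out
           MAX(CASE WHEN rn = 1 THEN value END) AS max_1,
           MAX(CASE WHEN rn = 2 THEN value END) AS max_2,
           MAX(CASE WHEN rn = 3 THEN value END) AS max_3,
           MAX(CASE WHEN rn = 4 THEN value END) AS max_4,
           MAX(CASE WHEN rn = 5 THEN value END) AS max_5,
           AVG(value) AS avg_value
    FROM (
        SELECT osi_key, value,
               ROW_NUMBER() OVER (PARTITION BY osi_key ORDER BY value DESC) AS rn
        FROM data_values
    ) AS ranked
    GROUP BY osi_key
    ORDER BY osi_key
""").fetchall()

for row in rows:
    print(row)
# (1, 60.0, 50.0, 40.0, 30.0, 20.0, 35.0)
# (2, 7.0, 5.0, None, None, None, 6.0)
```

ROW_NUMBER() returns five separate rows even when values repeat; swapping in DENSE_RANK() would return the five largest distinct values instead.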
SELECT OSI_KEY, MaxValue FROM (
SELECT OSI_KEY, MAX(value) AS MaxValue FROM table GROUP BY OSI_KEY
)
ORDER BY MaxValue DESC
FETCH FIRST 5 ROWS ONLY;

SQL Server Partition Order - No tie DenseRank values even if rows are same

This question is best explained with an image and the script I currently have... How can I extract one FULL row per assignment, the one with the lowest rank, and if 2 rows have a DenseRank of 1, choose either of them?
select *
,Dense_RANK() over (partition by [Assignment] order by [Text] desc) as
[DenseRank]
from [dbo].[CLEANSED_T3B_Step1_Res_Withdups____CP]
select * from
(
select *
,Dense_RANK() over (partition by [Assignment] order by [Text] desc, NewID()
) as [DenseRank] from [dbo].[CLEANSED_T3B_Step1_Res_Withdups____CP]
) as A
where A.[DenseRank] = 1
Second script is working perfectly!
SELECT * INTO
[dbo].[CLEANSED_T3B_Step1_COMPLETED]
from
(
select *
,Dense_RANK() over (partition by [Assignment] order by
left([Text],1) desc , [Diff_Doc_Clearing_Date] desc , [Amount] asc)
as [DenseRank]
from [dbo].[CLEANSED_T3B_Step1_Res_Withdups____CP]
)
as A
where A.[DenseRank] = 1
I no longer need just a random first among the tied '1st place' rows; I now need the one with the highest day diff and then the highest amount. So I have adapted everything in this version 3.
It seems you don't want to use DENSE_RANK but ROW_NUMBER.
with cte as(
select t.*, rn = row_number() over(partition by assignment order by [text] desc)
from tablename t
)
select * from cte
where rn = 1
Use NEWID() in the ORDER BY as the tie-breaker:
ORDER BY [Text], NEWID()
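The practical difference between the two ranking functions on tied rows can be sketched as follows. This assumes Python's sqlite3 module rather than SQL Server, a hypothetical table cleansed with a column txt standing in for [Text], and invented rows; first_rows is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cleansed (assignment TEXT, txt TEXT, amount REAL)")
# Hypothetical sample data: two rows of assignment A1 tie on txt.
conn.executemany(
    "INSERT INTO cleansed VALUES (?, ?, ?)",
    [
        ("A1", "X", 100.0),
        ("A1", "X", 250.0),
        ("A1", "B", 75.0),
        ("A2", "M", 10.0),
    ],
)


def first_rows(ranking_function):
    """Return the rows ranked 1 per assignment using the given window function."""
    sql = f"""
        SELECT assignment, txt, amount FROM (
            SELECT c.*,
                   {ranking_function}() OVER (
                       PARTITION BY assignment ORDER BY txt DESC
                   ) AS rnk
            FROM cleansed AS c
        ) AS a
        WHERE rnk = 1
    """
    return conn.execute(sql).fetchall()


with_dense_rank = first_rows("DENSE_RANK")
with_row_number = first_rows("ROW_NUMBER")

print(len(with_dense_rank))  # 3: both tied A1 rows are ranked 1
print(len(with_row_number))  # 2: exactly one row per assignment
```

Which of the tied A1 rows ROW_NUMBER() keeps is arbitrary here; adding further ORDER BY columns, as in version 3 of the question's script, makes the choice deterministic.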