Teradata SQL - Min/Max transaction dates from rows

I tried QUALIFY ROW_NUMBER() and QUALIFY with MIN and MAX, but I am still not able to get the range of dates for each transaction. See the data structure below.
I need help producing the following output.
Thank you in advance.

You need to find the groups of consecutive dates first. There are several ways to do this; in your case the best approach is probably to compare a sequence without gaps to another sequence that has gaps in it:
with cte as
(
  select t.*
    -- consecutive numbers = sequence without gaps
    ,row_number()
     over (partition by location, cust#, cust_type -- ??
           order by transaction_date) as rn
    -- consecutive numbers as long as there's no missing date = sequence with gaps
    ,(transaction_date - date '0001-01-01') as rn2
    -- assign a common (but meaningless) value to consecutive dates,
    -- value changes when there's a gap
    ,rn2 - rn as grp
  from tab as t
)
select location, cust#, cust_type -- ??
  ,min(transaction_date), max(transaction_date)
  ,min(amount), max(amount)
from cte
-- add the calculated "grp" to GROUP BY
group by location, cust#, cust_type, grp
The columns used for PARTITION BY/GROUP BY depend on your rules.
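To see why rn2 - rn stays constant within a run of consecutive dates, here is a minimal Python sketch of the same arithmetic. The dates are made up for illustration, and a single partition is assumed, so the PARTITION BY columns are left out.

```python
from datetime import date
from itertools import groupby

# Hypothetical transaction dates for one (location, cust#, cust_type) partition.
dates = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3),
         date(2020, 1, 7), date(2020, 1, 8), date(2020, 1, 15)]

rows = []
for rn, d in enumerate(sorted(dates), start=1):  # rn: sequence without gaps
    rn2 = d.toordinal()                           # day number: sequence with gaps
    rows.append((d, rn2 - rn))                    # grp: constant within a run

# Equivalent of GROUP BY grp with MIN/MAX of the date.
ranges = []
for _, run in groupby(rows, key=lambda r: r[1]):
    run_dates = [d for d, _ in run]
    ranges.append((min(run_dates), max(run_dates)))

for start, end in ranges:
    print(start, end)
```

Each missing date makes the day number jump ahead of the row number, so grp changes exactly at a gap. The value of grp itself carries no meaning.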

Related

new column with row number sql

I have data like the input below and need an output with the new column shown further down.
Input -
Name,Date,Value
Test1,20200901,55
Test1,20200901,100
Test1,20200901,150
Test1,20200805,25
Test1,20200805,30
The row number is based on the Name and Date columns.
Output -
Name,Date,Value, row_number
Test1,20200901,55,1
Test1,20200901,100,1
Test1,20200901,150,1
Test1,20200805,25,2
Test1,20200805,30,2
The query using PARTITION BY didn't help:
select *, row_number() over (partition by Date) as Rank from Table
Can someone please help here?
Thank you very much
You want dense_rank():
select *,
dense_rank() over (order by Date) as Rank
from Table;
There is something suspicious when you are using partition by without order by (even if the underlying database supports that).
Use dense_rank() - and an order by clause:
select t.*, dense_rank() over (order by Date) as rn from mytable t
This gives you a sequential number that starts at 1 on the earliest date value and increments without gaps every time the date changes.
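As a quick check of what dense_rank() produces on the sample rows, here is a small Python sketch that mimics it. The helper name dense_rank_by_date is made up for illustration. Note that the output in the question numbers the later date first, which corresponds to a descending order.

```python
# Sample rows from the question: (Name, Date, Value).
rows = [
    ("Test1", "20200901", 55),
    ("Test1", "20200901", 100),
    ("Test1", "20200901", 150),
    ("Test1", "20200805", 25),
    ("Test1", "20200805", 30),
]

def dense_rank_by_date(rows, descending=False):
    """Mimic DENSE_RANK() OVER (ORDER BY Date [DESC]): ties share a rank, no gaps."""
    distinct_dates = sorted({r[1] for r in rows}, reverse=descending)
    rank_of = {d: i for i, d in enumerate(distinct_dates, start=1)}
    return [r + (rank_of[r[1]],) for r in rows]

ascending = dense_rank_by_date(rows)                    # earliest date gets 1
descending = dense_rank_by_date(rows, descending=True)  # latest date gets 1

for row in descending:
    print(row)
```

If the numbering should restart for every Name, the SQL would also need partition by Name. The sample has only one name, so that is an assumption about the intent.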

Oracle SQL Return First & Last Value From Different Columns By Partition

I need help with a query that will return a single record per partition in the below dataset. I used the DENSE_RANK to get the order and first/last position within each partition, but the problem is that I need to get a single record for each EMPLOYEE ITEM_ID combination which contains:
MIN(START) which is date type with time
SUM(DURATION) which is a number type signifying seconds of activity
MIN ranked value from INIT_STATUS
MAX ranked value from FIN_STATUS
Here is the initial data table, the same data table ordered with rank, and the desired result at the end (see image below):
Also, here is the code used to get the ordered table with rank values:
SELECT T.*,
DENSE_RANK() OVER (PARTITION BY T.EMPLOYEE, T.ITEM_ID ORDER BY T.START) AS D_RANK
FROM TEST_DATA T
ORDER BY T.EMPLOYEE, T.ITEM_ID, T.START;
Use first/last option to find statuses. The rest is classic aggregation:
select employee, item_id, min(start_), sum(duration),
max(init_status) keep (dense_rank first order by start_),
max(fin_status) keep (dense_rank last order by start_)
from test_data t
group by employee, item_id
order by employee, item_id;
start is a reserved word, so I used start_ for my test.
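For readers unfamiliar with keep (dense_rank first/last), here is a rough Python sketch of what the query computes per group. The rows and status values are invented for illustration and are not the data from the question.

```python
from collections import defaultdict

# Hypothetical rows: (employee, item_id, start, duration, init_status, fin_status).
rows = [
    ("E1", 10, "2021-01-01 08:00", 30, "NEW",  "OPEN"),
    ("E1", 10, "2021-01-01 09:00", 45, "OPEN", "HOLD"),
    ("E1", 10, "2021-01-01 10:00", 15, "HOLD", "DONE"),
    ("E2", 20, "2021-01-02 08:00", 60, "NEW",  "DONE"),
]

groups = defaultdict(list)
for r in rows:
    groups[(r[0], r[1])].append(r)  # GROUP BY employee, item_id

result = {}
for key, grp in groups.items():
    first_start = min(r[2] for r in grp)
    last_start = max(r[2] for r in grp)
    result[key] = (
        first_start,                                     # MIN(start_)
        sum(r[3] for r in grp),                          # SUM(duration)
        max(r[4] for r in grp if r[2] == first_start),   # KEEP (DENSE_RANK FIRST ...)
        max(r[5] for r in grp if r[2] == last_start),    # KEEP (DENSE_RANK LAST ...)
    )

for key, value in result.items():
    print(key, value)
```

The max() only matters when several rows share the first or last start value. Otherwise it simply returns the status of that single row.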

PostgreSQL backward intersection & join

I have a survey form with a set of questions for each facility.
A facility can be monitored (data entry) more than once in a month.
I need the latest value for each question,
but if the latest entry has no value for a question, I want to fall back to earlier entries (previous dates) in the same month.
I can get the latest record, but I don't know how to get an earlier record from the same month when the latest one has no data.
I am using PostgreSQL 10.
Table Structure is
Desired output is
You can use the ROW_NUMBER window function for this.
SELECT to_char(date, 'MON') month,
facility,
idquestion,
value
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY facility,idquestion ORDER BY DATE DESC) rn
FROM T
) t1
where rn = 1
demo:db<>fiddle
SELECT DISTINCT
to_char(qdate, 'MON'),
facility,
idquestion,
first_value(value) OVER (PARTITION BY facility, idquestion ORDER BY qdate DESC) as value
FROM questions
ORDER BY facility, idquestion
Using window functions:
first_value(value) OVER ... returns the first value of the window. Each window is one combination of facility and idquestion (the PARTITION BY), and within it the rows are ordered by date descending, so the most recent value comes first no matter which date it is from.
DISTINCT removes the duplicate rows this produces (e.g. there are two values for facility == 1 and idquestion == 7).
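Both queries above pick the newest row per facility and question. The Python sketch below mimics that selection on invented rows, to show that a question missing from the latest entry falls back to an earlier date on its own.

```python
# Hypothetical rows: (qdate, facility, idquestion, value).
rows = [
    ("2018-08-05", 1, 7, "A"),
    ("2018-08-20", 1, 7, "B"),
    ("2018-08-05", 1, 8, "C"),  # question 8 was not answered on 2018-08-20
    ("2018-08-20", 2, 7, "D"),
]

latest = {}
for qdate, facility, idquestion, value in sorted(rows, reverse=True):
    # Rows arrive newest first; keep only the first one seen per partition,
    # which is what rn = 1 and first_value(...) select.
    latest.setdefault((facility, idquestion), (qdate, value))

for key in sorted(latest):
    print(key, latest[key])
```

Question 8 of facility 1 keeps the value from 2018-08-05 because no newer row exists for it, while question 7 takes the value from 2018-08-20.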
Please note:
"date" is a reserved word in the SQL standard and the name of a data type in Postgres, so I strongly recommend renaming the column to avoid trouble. Postgres also folds unquoted identifiers to lower case, so lower-case names are recommended.

Sequence within a partition in SQL server

I have been looking around for 2 days and have not been able to figure this one out. Using the dataset below and SQL Server 2016, I would like to number the rows by 'id' and 'cat', ordered by 'date' ascending, but the sequence should reset whenever a different value appears in the 'cat' column for the same 'id' (see rows in green). Any help would be appreciated.
This is a gaps and islands problem. The simplest solution in this case is probably a difference of row numbers:
select t.*,
row_number() over (partition by id, cat, seqnum - seqnum_c order by date) as row_num
from (select t.*,
row_number() over (partition by id order by date) as seqnum,
row_number() over (partition by id, cat order by date) as seqnum_c
from t
) t;
Why this works is a bit tricky to explain. But, if you look at the sequence numbers in the subquery, you'll see that the difference defines the groups you want to define.
Note: This assumes that the date column provides a stable sort. You seem to have duplicates in the column. If there really are duplicates and you have no secondary column for sorting, then try rank() or dense_rank() instead of row_number().
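To make the sequence numbers visible, here is a small Python sketch that computes seqnum, seqnum_c, their difference and the final row_num. The rows are invented and cover a single id, so the partition by id is left out.

```python
from collections import defaultdict

# Hypothetical rows for a single id, already in date order: (date, cat).
rows = [("01", "A"), ("02", "A"), ("03", "B"), ("04", "B"), ("05", "A"), ("06", "A")]

per_cat = defaultdict(int)     # running counter per cat               -> seqnum_c
per_island = defaultdict(int)  # running counter per (cat, difference) -> row_num
table = []
for seqnum, (day, cat) in enumerate(rows, start=1):
    per_cat[cat] += 1
    seqnum_c = per_cat[cat]
    diff = seqnum - seqnum_c   # stays constant while cat does not change
    per_island[(cat, diff)] += 1
    table.append((day, cat, seqnum, seqnum_c, diff, per_island[(cat, diff)]))

for line in table:
    print(line)
```

In this sample the B rows and the second run of A rows happen to share the same difference, which is why cat has to be part of the final partition as well.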

Define groups of row by logic

I have a scenario I can't find a solution for, so I thought I'd ask the experts :)
I have a query that returns a course syllabus, in which each row represents a day of training. You can see in the picture below that there are rest days in the middle of the training.
I can't find a way to group each run of consecutive training days.
Please see the screenshot below, which details the rows and what I want to achieve.
I am using MS SQL Server 2014.
Here is a Fiddle with the data I have and the expected results:
SQL Fiddle
The simplest method is a difference of row_number(). The following identifies each consecutive group with a number:
select td.*,
dense_rank() over (order by dateadd(day, - seqnum, DayOfTraining)) as grpnum
from (select td.*,
row_number() over (order by DayOfTraining) as seqnum
from TrainingDays td
) td;
The key idea is that subtracting a sequence from consecutive days produces a constant for those days.
Here is the SQL Fiddle.
After a lot of trial and error, this is the closest I could come up with:
http://rextester.com/ECBQ88563
The problem here is that if the last row belongs to another group, it is still assigned to the previous group. So if you change the last date in your sample from 19 to 20, the output stays the same. Maybe another condition can eliminate this. Other than that, this should work.
SELECT DayOfTraining1,
dense_rank() over (ORDER BY grp_dt) AS grp
FROM
(SELECT DayOfTraining1,
min(DayOfTraining) AS grp_dt
FROM
(SELECT trng.DayOfTraining AS DayOfTraining1,
dd.DayOfTraining
FROM trng
CROSS JOIN
(SELECT d.*
FROM
(SELECT trng.*,
lag (DayOfTraining,1) OVER (
ORDER BY DayOfTraining) AS prev_DayOfTraining,
lead (DayOfTraining,1) OVER (
ORDER BY DayOfTraining) AS nxt_DayOfTraining,
datediff(DAY, lag (DayOfTraining,1) OVER (
ORDER BY DayOfTraining), DayOfTraining) AS ddf
FROM trng
) d
WHERE d.ddf <> 1
OR nxt_DayOfTraining IS NULL
) dd
WHERE trng.DayOfTraining <= dd.DayOfTraining
) t
GROUP BY DayOfTraining1
) t1;
Explanation: The inner query d uses the lag and lead functions to capture the values of the previous and next rows. It then takes the difference in days and keeps the dates where that difference is not 1. These are the dates where the group should switch. The derived table dd holds them.
Now cross join this with the main table and use an aggregate function to determine the continuous groups (this took a lot of trial and error).
Then use dense_rank function on it to get the group.
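The idea of switching groups wherever the day difference is not 1 can also be sketched with a running counter. Note that this is a different technique from the cross join above (in SQL it would roughly be a lag() flag plus a running sum), shown here in Python on invented dates.

```python
from datetime import date

# Hypothetical training days with rest days in between.
days = [date(2017, 6, 12), date(2017, 6, 13), date(2017, 6, 14),
        date(2017, 6, 16), date(2017, 6, 17), date(2017, 6, 20)]

groups = []
grp = 0
prev = None
for d in sorted(days):
    # Same test as ddf <> 1: a new group starts when the previous day is not adjacent.
    if prev is None or (d - prev).days != 1:
        grp += 1
    groups.append((d, grp))
    prev = d

for d, g in groups:
    print(d, g)
```

Because the counter only looks at the previous row, a single day at the end that follows a gap gets its own group number.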