SQL Query Early/Late dates - sql

I am trying to create an SQL view, based on results from the earliest and latest dates. I am aware of the min and max functions but I've not been able to implement it correctly. So far I have:
select distinct
name,
study,
group,
ROUND (TLength * POWER (TWidth, 2) * 0.000523, 3) as Volume,
firstDate as firstDate,
lastDate as lastDate
from
(select
name,
study,
group,
min(operation_time) firstDate,
max(operation_time) lastDate,
MAX(DECODE (ACTIVITY,'length', RESULT_VALUE, NULL)) TLength,
MAX(DECODE (ACTIVITY,'width', RESULT_VALUE,NULL)) TWidth
from mx_all_data_vw
where mx_all_data_vw.study_name like '%MT%'
group by name, group study);
This gives me a single row for either the earliest or latest date, and two columns with earliest and latest dates.
I want 2 rows, that has a row containing all data for earliest date and another containing all data for latest date, rather than two columns seperating the early and late dates.
Thanks.

Simplified for readability:
SELECT *
FROM (
SELECT mx_all_data_vw.*,
ROW_NUMBER() OVER (PARTITION BY name, study, "group" ORDER BY operation_time) rna,
ROW_NUMBER() OVER (PARTITION BY name, study, "group" ORDER BY operation_time DESC) rnd,
DECODE(activity, 'length', result_value, NULL) AS TLength,
DECODE(activity, 'width', result_value, NULL) AS TWidth
FROM mx_all_data_vw
WHERE mx_all_data_vw.study_name like '%MT%'
)
WHERE 1 IN (rna, rnd)
Add the computed expressions instead of *.

Related

sql - find users who made 3 consecutive actions

i use sql server and i have this table :
ID Date Amount
I need to write a query that returns only users that have made at least 3 consecutive purchases, each one larger than the other.
I know that i need to use partition_id and row_number but i dont know how to do it
Thank you in advance
If you want three purchases in a row with increases in amount, then use lead() to get the next amounts:
select t.*
from (select t.*,
lead(amount, 1) over (partition by id order by date) as next_date,
lead(amount, 2) over (partition by id order by date) as next2_amount
from t
) t
where next_amount > amount and next2_amount > next_amount;
I originally missed the "greater than" part of the question. If you wanted purchases on three days in a row, then:
If you want three days in a row and there is at most one purchase per day, then you can use:
select t.*
from (select t.*,
lead(date, 2) over (partition by id order by date) as next2_date
from t
) t
where next2_date = dateadd(day, 2, date);
If you can have duplicates on a date, I would suggest this variant:
select t.*
from (select t.*,
lead(date, 2) over (partition by id order by date) as next2_date
from (select distinct id, date from t) t
) t
where next2_date = dateadd(day, 2, date);

ORACLE SQL: Find last minimum and maximum consecutive period

I have the sample data set below which list the water meters not working for specific reason for a certain range period (jan 2016 to december 2018).
I would like to have a query that retrieves the last maximum and minimum consecutive period where the meter was not working within that range of period.
any help will be greatly appreciated.
You have two options:
select code, to_char(min_period, 'yyyymm') min_period, to_char(max_period, 'yyyymm') max_period
from (
select code, min(period) min_period, max(period) max_period,
max(min(period)) over (partition by code) max_min_period
from (
select code, period, sum(flag) over (partition by code order by period) grp
from (
select code, period,
case when add_months(period, -1)
= lag(period) over (partition by code order by period)
then 0 else 1 end flag
from (select mrdg_acc_code code, to_date(mrdg_per_period, 'yyyymm') period from t)))
group by code, grp)
where min_period = max_min_period
Explanation:
flag rows where period is not equal previous period plus one month,
create column grp which sums flags consecutively,
group data using code and grp additionaly finding maximal start of period,
show only rows where min_period = max_min_period
Second option is recursive CTE available in Oracle 11g and above:
with
data(period, code) as (
select to_date(mrdg_per_period, 'yyyymm'), mrdg_acc_code from t
where mrdg_per_period between 201601 and 201812),
cte (period, code) as (
select to_char(period, 'yyyymm'), code from data
where (period, code) in (select max(period), code from data group by code)
union all
select to_char(data.period, 'yyyymm'), cte.code
from cte
join data on data.code = cte.code
and data.period = add_months(to_date(cte.period, 'yyyymm'), -1))
select code, min(period) min_period, max(period) max_period
from cte group by code
Explanation:
subquery data filters only rows from 2016 - 2018 additionaly converting period to date format. We need this for function add_months to work.
cte is recursive. Anchor finds starting rows, these with maximum period for each code. After union all is recursive member, which looks for the row one month older than current. If it finds it then net row, if not then stop.
final select groups data. Notice that period which were not consecutive were rejected by cte.
Though recursive queries are slower than traditional ones, there can be scenarios where second solution is better.
Here is the dbfiddle demo for both queries. Good luck.
use aggregate function with group by
select max(mdrg_per_period) mdrg_per_period, mrdg_acc_code,max(mrdg_date_read),rea_Desc,min(mdrg_per_period) not_working_as_from
from tablename
group by mrdg_acc_code,rea_Desc
This is a bit tricky. This is a gap-and-islands problem. To get all continuous periods, it will help if you have an enumeration of months. So, convert the period to a number of months and then subtract a sequence generated using row_number(). The difference is constant for a group of adjacent months.
This looks like:
select acc_code, min(period), max(period)
from (select t.*,
row_number() over (partition by acc_code order by period_num) as seqnum
from (select t.*, floor(period / 100) * 12 + mod(period, 100) as period_num
from t
) t
where rea_desc = 'METER NOT WORKING'
) t
group by (period_num - seqnum);
Then, if you want the last one for each account, you can use a subquery:
select t.*
from (select acc_code, min(period), max(period),
row_number() over (partition by acc_code order by max(period desc) as seqnum
from (select t.*,
row_number() over (partition by acc_code order by period_num) as seqnum
from (select t.*, floor(period / 100) * 12 + mod(period, 100) as period_num
from t
) t
where rea_desc = 'METER NOT WORKING'
) t
group by (period_num - seqnum)
) t
where seqnum = 1;

Finding lowest two minimum values and finding difference between the two in SQL Server?

I have a transaction table where I have to find the first and second date of transaction of every customer. Finding first date is very simple where I can use MIN() func to find the first date but the second and in particular finding the difference between the two is getting very challenging and somehow I am not able to find out any feasible way:
select a.customer_id, a.transaction_date, a.Row_Count2
from ( select
transaction_date as transaction_date,
reference_no as customer_id,
row_number() over (partition by reference_no
ORDER BY reference_no, transaction_date) AS Row_Count2
from transaction_detail
) a
where a.Row_Count2 < 3
ORDER BY a.customer_id, a.transaction_date, a.Row_Count2
Gives me this :
What I want is , following columns:
||CustomerID|| ||FirstDateofPurchase|| ||SecondDateofPuchase|| ||Diff. b/w Second & First Date ||
You can use window functions LEAD/LAG to return results you are looking for
First try to find all the leading dates by reference number using LEAD, generate row number for each row using your original logic. You can then do difference on dates for row number value 1 row from the result set.
Ex (I'm not excluding same day transactions and treating them as separate and generating row number based on result set from your query above, you can easily change the sql below to consider these as one and remove them so that you get next date as second date):
declare #tbl table(reference_no int, transaction_date datetime)
insert into #tbl
select 1000, '2018-07-11'
UNION ALL
select 1001, '2018-07-12'
UNION ALL
select 1001, '2018-07-12'
UNIOn ALL
select 1001, '2018-07-13'
UNIOn ALL
select 1002, '2018-07-11'
UNIOn ALL
select 1002, '2018-07-15'
select customer_id, transaction_date as firstdate,
transaction_date_next seconddate,
datediff(day, transaction_date, transaction_date_next) diff_in_days
from
(
select reference_no as customer_id, transaction_date,
lead(transaction_date) over (partition by reference_no
order by transaction_date) transaction_date_next,
row_number() over (partition by reference_no ORDER BY transaction_date) AS Row_Count
from #tbl
) src
where Row_Count = 1
You can do this with CROSS APPLY.
SELECT td.customer_id, MIN(ca.transaction_date), MAX(ca.transaction_date),
DATEDIFF(day, MIN(ca.transaction_date), MAX(ca.transaction_date))
FROM transaction_detail td
CROSS APPLY (SELECT TOP 2 *
FROM transaction_detail
WHERE customer_id = td.customer_id
ORDER BY transaction_date) ca
GROUP BY td.customer_id

Tagging consecutive days

Supposedly I have data something like this:
ID,DATE
101,01jan2014
101,02jan2014
101,03jan2014
101,07jan2014
101,08jan2014
101,10jan2014
101,12jan2014
101,13jan2014
102,08jan2014
102,09jan2014
102,10jan2014
102,15jan2014
How could I efficiently code this in Greenplum SQL such that I can have a grouping of consecutive days similar to the one below:
ID,DATE,PERIOD
101,01jan2014,1
101,02jan2014,1
101,03jan2014,1
101,07jan2014,2
101,08jan2014,2
101,10jan2014,3
101,12jan2014,4
101,13jan2014,4
102,08jan2014,1
102,09jan2014,1
102,10jan2014,1
102,15jan2014,2
You can do this using row_number(). For a consecutive group, the difference between the date and the row_number() is a constant. Then, use dense_rank() to assign the period:
select id, date,
dense_rank() over (partition by id order by grp) as period
from (select t.*,
date - row_number() over (partition by id order by date) * 'interval 1 day'
from table t
) t

Last day of the month with a twist in SQLPLUS

I would appreciate a little expert help please.
in an SQL SELECT statement I am trying to get the last day with data per month for the last year.
Example, I am easily able to get the last day of each month and join that to my data table, but the problem is, if the last day of the month does not have data, then there is no returned data. What I need is for the SELECT to return the last day with data for the month.
This is probably easy to do, but to be honest, my brain fart is starting to hurt.
I've attached the select below that works for returning the data for only the last day of the month for the last 12 months.
Thanks in advance for your help!
SELECT fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,fd.column_name
FROM super_table fd,
(SELECT TRUNC(daterange,'MM')-1 first_of_month
FROM (
select TRUNC(sysdate-365,'MM') + level as DateRange
from dual
connect by level<=365)
GROUP BY TRUNC(daterange,'MM')) fom
WHERE fd.cust_id = :CUST_ID
AND fd.coll_date > SYSDATE-400
AND TRUNC(fd.coll_date) = fom.first_of_month
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)
You probably need to group your data so that each month's data is in the group, and then within the group select the maximum date present. The sub-query might be:
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table AS fd
GROUP BY YEAR(coll_date) * 100 + MONTH(coll_date);
This presumes that the functions YEAR() and MONTH() exist to extract the year and month from a date as an integer value. Clearly, this doesn't constrain the range of dates - you can do that, too. If you don't have the functions in Oracle, then you do some sort of manipulation to get the equivalent result.
Using information from Rhose (thanks):
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table AS fd
GROUP BY TO_CHAR(coll_date, 'YYYYMM');
This achieves the same net result, putting all dates from the same calendar month into a group and then determining the maximum value present within that group.
Here's another approach, if ANSI row_number() is supported:
with RevDayRanked(itemDate,rn) as (
select
cast(coll_date as date),
row_number() over (
partition by datediff(month,coll_date,'2000-01-01') -- rewrite datediff as needed for your platform
order by coll_date desc
)
from super_table
)
select itemDate
from RevDayRanked
where rn = 1;
Rows numbered 1 will be nondeterministically chosen among rows on the last active date of the month, so you don't need distinct. If you want information out of the table for all rows on these dates, use rank() over days instead of row_number() over coll_date values, so a value of 1 appears for any row on the last active date of the month, and select the additional columns you need:
with RevDayRanked(cust_id, server_name, coll_date, rk) as (
select
cust_id, server_name, coll_date,
rank() over (
partition by datediff(month,coll_date,'2000-01-01')
order by cast(coll_date as date) desc
)
from super_table
)
select cust_id, server_name, coll_date
from RevDayRanked
where rk = 1;
If row_number() and rank() aren't supported, another approach is this (for the second query above). Select all rows from your table for which there's no row in the table from a later day in the same month.
select
cust_id, server_name, coll_date
from super_table as ST1
where not exists (
select *
from super_table as ST2
where datediff(month,ST1.coll_date,ST2.coll_date) = 0
and cast(ST2.coll_date as date) > cast(ST1.coll_date as date)
)
If you have to do this kind of thing a lot, see if you can create an index over computed columns that hold cast(coll_date as date) and a month indicator like datediff(month,'2001-01-01',coll_date). That'll make more of the predicates SARGs.
Putting the above pieces together, would something like this work for you?
SELECT fd.cust_id,
fd.server_name,
fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,
fd.column_name
FROM super_table fd,
WHERE fd.cust_id = :CUST_ID
AND TRUNC(fd.coll_date) IN (
SELECT MAX(TRUNC(coll_date))
FROM super_table
WHERE coll_date > SYSDATE - 400
AND cust_id = :CUST_ID
GROUP BY TO_CHAR(coll_date,'YYYYMM')
)
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)