Between numerical values with no lower limit in Oracle SQL - sql

My data looks something like this:
days  weight  start date  end date
180   1       01/01/2020  null
365   0.75    01/01/2020  null
I want to select the weight that applies to a given number of days: 0-180 should match row 1 and 181-365 should match row 2. Anything above 365 should also match row 2. I have already found out that I can use the BETWEEN syntax for the dates.
My initial code tries to do this:
select weight from (select * from table where days >= #DAYS order by days ASC) where rownum =1
But if #DAYS is larger than the last value, nothing is returned, so I then tried to introduce a cap at the maximum value, using either
>= #DAYS
or
>= MAX(#DAYS)
Is there a simpler way to do this?
Thanks.

select weight
from (select t.*, max(days) over () as max_day from table t) v
where days >= least(#DAYS, max_day)
order by days asc
fetch first 1 row only
I'd suggest this option. When #DAYS is larger than the largest days entry, the comparison uses max_day instead, so the last row still matches.
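The clamping idea can be tried out with a minimal sketch in Python, using an in-memory SQLite database as a stand-in for Oracle. The table name weights, the helper weight_for, and the use of SQLite's two-argument min() in place of least() and LIMIT in place of FETCH FIRST are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

# In-memory SQLite stands in for Oracle; the table name "weights" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weights (days INTEGER, weight REAL)")
conn.executemany("INSERT INTO weights VALUES (?, ?)", [(180, 1.0), (365, 0.75)])


def weight_for(days):
    """Return the weight of the first row whose days >= the clamped input."""
    row = conn.execute(
        """
        SELECT weight
        FROM weights
        -- two-argument min() plays the role of Oracle's least()
        WHERE days >= min(?, (SELECT max(days) FROM weights))
        ORDER BY days ASC
        LIMIT 1
        """,
        (days,),
    ).fetchone()
    return row[0]


results = {d: weight_for(d) for d in (0, 180, 181, 365, 400)}
print(results)
```

Inputs above 365 are clamped to 365, so they fall through to the last row instead of returning nothing.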

Select max(weight) from table
Where days = coalesce((Select min(days) from table Where days >= #DAYS),
                      (Select max(days) from table))
The first max() function is defensive in case your table has 2 entries with the same days number.

Extract previous row calculated value for use in current row calculations - Postgres

I have a requirement where I need to use the calculated value of the previous row in the calculation for the current row.
The following is a sample of how the data currently looks:
ID  Date        Days
1   2022-01-15  30
2   2022-02-18  30
3   2022-03-15  90
4   2022-05-15  30
The following is the output I am expecting:
ID  Date        Days  CalVal
1   2022-01-15  30    2022-02-14
2   2022-02-18  30    2022-03-16
3   2022-03-15  90    2022-06-14
4   2022-05-15  30    2022-07-14
The value of CalVal for the first row is Date + Days
From the second row onwards, it should take the CalVal value of the previous row and add the current row's Days to it.
Essentially, what I am looking for is a means to access the previous row's calculated value for use in the current row.
Is there any way to achieve the above in PostgreSQL? I have been tinkering with window functions and even recursive CTEs but have had no luck :(
Would appreciate any direction!
Thanks in advance!
select
    id,
    date,
    coalesce(
        days - (lag(days, 1) over (order by date, days)),
        days) as days,
    first_date + cast(days as integer) as newdate
from
(
    select
        -- get a running sum of days
        id,
        first_date,
        date,
        sum(days) over (order by date, days) as days
    from
    (
        select
            -- get the first date
            id,
            (select min(date) from table1) as first_date,
            date,
            days
        from
            table1
    ) A
) B
This query gets the exact output you described. I'm not at all ready to say it is the best solution, but the strategy is essentially to create a running total of the "days". We can then add this running total to the first date, and that will always be the next date in the desired sequence. One finesse: to put the original "days" back into the result, we subtract the previous running total from the current running total.
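The running-total strategy can be checked outside the database with a small Python sketch. The function name running_calval and the tuple layout (id, date, days) are assumptions for illustration; the rows are the sample data from the question.

```python
from datetime import date, timedelta
from itertools import accumulate

# Sample rows from the question as (id, date, days).
rows = [
    (1, date(2022, 1, 15), 30),
    (2, date(2022, 2, 18), 30),
    (3, date(2022, 3, 15), 90),
    (4, date(2022, 5, 15), 30),
]


def running_calval(rows):
    """Add the running total of days to the earliest date, row by row."""
    first_date = min(r[1] for r in rows)
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))  # order by date, days
    totals = accumulate(r[2] for r in ordered)          # 30, 60, 150, 180
    return [(r[0], first_date + timedelta(days=t)) for r, t in zip(ordered, totals)]


calvals = running_calval(rows)
for row_id, calval in calvals:
    print(row_id, calval)
```

Because each CalVal is the previous CalVal plus the current Days, the whole column collapses to "first date plus cumulative days", which is why no access to the previous row's computed value is needed.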
Assuming that the table name is table1:
select
    id,
    date,
    days,
    first_value(date) over (order by id) +
        (sum(days) over (order by id rows between unbounded preceding and current row))
        * interval '1 day' calval
from table1;
We just add the cumulative sum of days to the first date in the table. It's not literally what you asked for (we don't need the value from the previous row, just the cumulative sum of days).
Solution with recursion:
with recursive prev_row as (
    select id, date, days, date + days * interval '1 day' calval
    from table1
    where id = 1
    union all
    select t.id, t.date, t.days, p.calval + t.days * interval '1 day' calval
    from prev_row p
    join table1 t on t.id = p.id + 1
)
select *
from prev_row

Need to count unique transactions by month but ignore records that occur 3 days after 1st entry for that ID

I have a table with just two columns: User_ID and fail_date. Each time somebody's card is rejected they are logged in the table, their card is automatically tried again 3 days later, and if they fail again, another entry is added to the table. I am trying to write a query that counts unique failures by month so I only want to count the first entry, not the 3 day retries, if they exist. My data set looks like this
user_id fail_date
222 01/01
222 01/04
555 02/15
777 03/31
777 04/02
222 10/11
so my desired output would be something like this:
month unique_fails
jan 1
feb 1
march 1
april 0
oct 1
I'll be running this in Vertica, but I'm not so much looking for perfect syntax in replies. Just help around how to approach this problem as I can't really think of a way to make it work. Thanks!
You could use lag() to get the previous timestamp per user. If the current and the previous timestamp are less than or exactly three days apart, it's a follow up. Mark the row as such. Then you can filter to exclude the follow ups.
It might look something like:
SELECT month,
       count(*) unique_fails
FROM (SELECT month(fail_date) month,
             CASE
               WHEN datediff(day,
                             lag(fail_date) OVER (PARTITION BY user_id
                                                  ORDER BY fail_date),
                             fail_date) <= 3 THEN
                 1
               ELSE
                 0
             END follow_up
      FROM elbat) x
WHERE follow_up = 0
GROUP BY month;
I'm not so sure about the exact syntax in Vertica, so it might need some adaptations. I also don't know if fail_date actually is some date/time type variant or just a string. If it's just a string, the date/time specific functions may not work on it and have to be replaced, or the string has to be converted before passing it to the functions.
If the data spans several years, you might also want to include the year in addition to the month to keep months from different years apart. In the inner SELECT add a column year(fail_date) year, and add year to the list of columns and the GROUP BY of the outer SELECT.
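To see the lag-based approach on the sample data, here is a minimal Python sketch. The function name unique_fails_by_month and the year 2018 are assumptions (the question gives no year); the dictionary of previous dates plays the role of lag() partitioned by user_id.

```python
from collections import Counter
from datetime import date

# Sample rows from the question; the year 2018 is an assumption.
fails = [
    (222, date(2018, 1, 1)),
    (222, date(2018, 1, 4)),
    (555, date(2018, 2, 15)),
    (777, date(2018, 3, 31)),
    (777, date(2018, 4, 2)),
    (222, date(2018, 10, 11)),
]


def unique_fails_by_month(fails, retry_window_days=3):
    """Count failures per (year, month), skipping rows within the retry window."""
    counts = Counter()
    previous = {}  # user_id -> previous fail_date, like lag() per user
    for user_id, fail_date in sorted(fails):  # sorted by user_id, then date
        prev = previous.get(user_id)
        is_follow_up = prev is not None and (fail_date - prev).days <= retry_window_days
        if not is_follow_up:
            counts[(fail_date.year, fail_date.month)] += 1
        previous[user_id] = fail_date
    return dict(counts)


monthly = unique_fails_by_month(fails)
print(monthly)
```

Months with only retries (April in the sample) do not appear at all in this sketch; producing an explicit zero for them would need a separate list of months to join against.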
You can add a flag about whether this is a "unique_fail" by doing:
select t.*,
       (case when lag(fail_date) over (partition by user_id order by fail_date) >= fail_date - 3
             then 0 else 1
        end) as first_failure_flag
from t;
Then, you want to count this flag by month:
select to_char(fail_date, 'Mon'), -- should always include the year
       sum(first_failure_flag)
from (select t.*,
             (case when lag(fail_date) over (partition by user_id order by fail_date) >= fail_date - 3
                   then 0 else 1
              end) as first_failure_flag
      from t
     ) t
group by to_char(fail_date, 'Mon')
order by min(fail_date)
In a derived table, determine the previous fail_date (prev_fail_date) for a specific user_id and fail_date, using a correlated subquery.
Using the derived table dt, count the failure if the difference in days between the current fail_date and prev_fail_date is greater than 3.
The DATEDIFF() function, together with the IF() function, is used to identify the cases that are not repeated tries.
To group this result by month, you can use the MONTH function.
But the data can be from multiple years, so you need to separate them by year as well; you can do a multi-level GROUP BY using the YEAR function too.
Try the following (in MySQL); you can get the idea for other RDBMS as well:
SELECT YEAR(dt.fail_date) AS year_fail_date,
       MONTH(dt.fail_date) AS month_fail_date,
       -- a row with no previous failure (NULL) is a first failure and must be counted too
       COUNT( IF(dt.prev_fail_date IS NULL
                 OR DATEDIFF(dt.fail_date, dt.prev_fail_date) > 3, dt.user_id, NULL) ) AS unique_fails
FROM (
  SELECT
    t1.user_id,
    t1.fail_date,
    (
      SELECT t2.fail_date
      FROM your_table AS t2
      WHERE t2.user_id = t1.user_id
        AND t2.fail_date < t1.fail_date
      ORDER BY t2.fail_date DESC
      LIMIT 1
    ) AS prev_fail_date
  FROM your_table AS t1
) AS dt
GROUP BY
  year_fail_date,
  month_fail_date
ORDER BY
  year_fail_date ASC,
  month_fail_date ASC

Smoothing out a result set by date

Using SQL I need to return a smooth set of results (i.e. one per day) from a dataset that contains 0-N records per day.
The result per day should be the most recent previous value even if that is not from the same day. For example:
Starting data:
Date: Time: Value
19/3/2014 10:01 5
19/3/2014 11:08 3
19/3/2014 17:19 6
20/3/2014 09:11 4
22/3/2014 14:01 5
Required output:
Date: Value
19/3/2014 6
20/3/2014 4
21/3/2014 4
22/3/2014 5
First you need to complete the date range and fill in the missing dates (21/3/2014 in your example). This can be done either by joining a calendar table if you have one, or by using a recursive common table expression to generate the complete sequence on the fly.
Once you have the complete sequence of dates, finding the max value for each date, or the value from the latest previous non-null row, becomes easy. In this query I use a correlated subquery to do it.
with cte as (
    select min(date) date, max(date) max_date from your_table
    union all
    select dateadd(day, 1, date) date, max_date
    from cte
    where date < max_date
)
select
    c.date,
    (
        select top 1 max(value) from your_table
        where date <= c.date group by date order by date desc
    ) value
from cte c
order by c.date;
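The two steps (complete the date range, then carry the latest earlier value forward) can be sketched in Python as follows. The function name daily_last_value is hypothetical. Note one deliberate difference, stated as an assumption: this sketch takes the last reading of each day by time, as the question asks, whereas the query above takes the maximum value of the day; on the sample data both give the same result.

```python
from datetime import date, time, timedelta

# Sample rows from the question as (date, time, value).
readings = [
    (date(2014, 3, 19), time(10, 1), 5),
    (date(2014, 3, 19), time(11, 8), 3),
    (date(2014, 3, 19), time(17, 19), 6),
    (date(2014, 3, 20), time(9, 11), 4),
    (date(2014, 3, 22), time(14, 1), 5),
]


def daily_last_value(readings):
    """Return one (date, value) per day, carrying the last value over gaps."""
    ordered = sorted(readings)  # by date, then time
    # Later entries overwrite earlier ones, leaving the last reading per day.
    last_per_day = {d: v for d, _, v in ordered}
    day, end = ordered[0][0], ordered[-1][0]
    out, current = [], None
    while day <= end:
        current = last_per_day.get(day, current)  # keep previous value on gaps
        out.append((day, current))
        day += timedelta(days=1)
    return out


smoothed = daily_last_value(readings)
for day, value in smoothed:
    print(day, value)
```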
Maybe this works; try it and let me know:
select date, value from test where (time,date) in (select max(time),date from test group by date);

SQL Server : select the minimum value from table

I know it's a simple question, but I still can't figure it out.
I want to find the date which is closest to now.
Here is my product table:
P_INDATE
----------
2013-11-03
2013-12-13
2013-11-13
Basically, it should show 2013-12-13.
I tried SELECT MAX(P_INDATE) FROM product and it works.
Then I tried to use MIN(GETDATE() - P_INDATE) in the WHERE condition, but it failed.
Use MAX with a WHERE clause and the GETDATE() function:
SELECT MAX(P_INDATE)
FROM product
WHERE P_INDATE < GETDATE()
The query above returns the latest date that is earlier than the current date, which GETDATE() supplies.
One way to go about this is to order the query by the difference between the stored date and the current date and take only the first row. Using abs will allow you to find the closest date regardless of whether it's before or after the current date.
SELECT TOP 1 p_indate
FROM mytable
ORDER BY ABS(DATEDIFF(second, p_indate, GETDATE())) ASC
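The same "order by absolute difference, take the first" idea looks like this in a short Python sketch. The function name closest_date and the reference dates passed as today are hypothetical stand-ins for GETDATE(); the stored dates are the ones from the question.

```python
from datetime import date


def closest_date(dates, today):
    """Return the date with the smallest absolute distance to today."""
    return min(dates, key=lambda d: abs(d - today))


p_indates = [date(2013, 11, 3), date(2013, 12, 13), date(2013, 11, 13)]

# A reference date after all stored dates picks the latest one.
nearest = closest_date(p_indates, today=date(2013, 12, 20))
# A reference date in between picks whichever neighbour is nearer.
nearest_mid = closest_date(p_indates, today=date(2013, 11, 10))
print(nearest, nearest_mid)
```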
Assuming you have a column that stores dates and you only ever want the most recent one, why can't you use
select max(date) from yourtable
which will always give you the most recent date?
If you have an index on the column, the most efficient method is probably a bit more complicated:
SELECT TOP 1 P_INDATE
FROM ((SELECT TOP 1 P_INDATE
       FROM product
       WHERE P_INDATE < GETDATE()
       ORDER BY P_INDATE DESC
      ) UNION ALL
      (SELECT TOP 1 P_INDATE
       FROM product
       WHERE P_INDATE >= GETDATE()
       ORDER BY P_INDATE
      )
     ) t
ORDER BY ABS(DATEDIFF(second, P_INDATE, GETDATE()))
The subqueries will use the index to get (at most) one row earlier and later than the current date. The outer ORDER BY then just needs to sort two rows.
Well you can try this:
SELECT TOP(1) P_INDATE
FROM [product table]
ORDER BY CASE
           WHEN DATEDIFF(day, P_INDATE, GETDATE()) < 0
             THEN DATEDIFF(day, GETDATE(), P_INDATE)
           ELSE DATEDIFF(day, P_INDATE, GETDATE())
         END ASC

Select X Most Recent Non-Consecutive Days Worth of Data

Does anyone have any insight into how to select x non-consecutive days' worth of data? Dates are standard SQL datetime. For example, I'd like to select the 5 most recent days' worth of data, but there could be gaps of many days between records, so just selecting records from 5 days ago and more recent will not do.
Following the approach Tony Andrews suggested, here is a way of doing it in T-SQL:
SELECT
    Value,
    ValueDate
FROM
    Data
WHERE
    ValueDate >=
    (
        SELECT
            CONVERT(DATETIME, MIN(TruncatedDate))
        FROM
        (
            SELECT DISTINCT TOP 5
                CONVERT(VARCHAR, ValueDate, 102) TruncatedDate
            FROM
                Data
            ORDER BY
                TruncatedDate DESC
        ) d
    )
ORDER BY
    ValueDate DESC
I don't know the SQL Server syntax, but you need to:
1) Select the dates (with time component truncated) in descending order
2) Pick off top 5
3) Obtain 5th value
4) Select data where the datetime >= 5th value
Something like this "pseudo-SQL":
select *
from data
where datetime >=
  ( select top 1 date
    from
      ( select top 5 date from
          ( select distinct truncated(datetime) as date
            from data
          )
        order by date desc
      )
    order by date
  )
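The four steps listed above can be sketched in Python to make the logic concrete. The function name most_recent_days and the sample rows are hypothetical; the sketch uses 3 days instead of 5 only to keep the data short.

```python
from datetime import datetime


def most_recent_days(rows, n):
    """Keep rows that fall on the n most recent distinct calendar days."""
    # Steps 1-2: distinct truncated dates, newest first, top n.
    days = sorted({ts.date() for ts, _ in rows}, reverse=True)[:n]
    if not days:
        return []
    # Step 3: the n-th most recent day is the cutoff.
    cutoff = days[-1]
    # Step 4: select everything on or after the cutoff day.
    return [(ts, value) for ts, value in rows if ts.date() >= cutoff]


# Hypothetical sample data with gaps between days.
rows = [
    (datetime(2024, 1, 1, 9, 0), "a"),
    (datetime(2024, 1, 5, 10, 0), "b"),
    (datetime(2024, 1, 5, 18, 0), "c"),
    (datetime(2024, 1, 20, 8, 0), "d"),
    (datetime(2024, 2, 2, 12, 0), "e"),
]

recent = most_recent_days(rows, n=3)
print([value for _, value in recent])
```

Both rows from 5 January are kept because the cutoff is a whole day, not a row count, which is the difference from a plain TOP N on the datetime column.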
This should do it and be reasonably good from a performance standpoint. You didn't mention how to handle ties, so you can add the WITH TIES clause if you need to do that.
SELECT TOP (@number_to_return)
    * -- Write out your columns here
FROM
    dbo.MyTable
ORDER BY
    MyDateColumn DESC