Interesting logic to be solved in Teradata SQL

In the image, the target is to calculate the number of years since the last DX.
At the year level, we want to calculate the number of years since a DX was last found.
E.g.
There was a DX for ID 1 in 2014, so in 2015, 2016 and 2017 we populate the value incrementally, year by year.
We find one more DX in 2017, so in the following year we populate 1.
The same cycle begins again for the next ID.
The idea is to implement this logic in Teradata SQL.
Can someone help with how this could be done?

You can get the date of the most recent "DX" row before the current row using analytic functions:
select t.*,
       max(case when code = 'DX' then date end) over
           (partition by id
            order by date
            range between unbounded preceding and 1 preceding
           ) as prev_dx
from t;
I have no idea how to convert this to "years". The date format makes no sense to me.
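If the data has one row per id and year, one way to get a year count is to subtract the year of the previous DX from the current year. The following is only a sketch and is not Teradata code: it runs the same windowed MAX through Python's sqlite3 module (assuming SQLite 3.25 or later for window functions). The table t(id, yr, code) and its rows are made up for illustration, and a ROWS frame is used because each year appears once per id.

```python
import sqlite3

# Hypothetical data: one row per id and year, code is 'DX' or NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, yr INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 2014, "DX"), (1, 2015, None), (1, 2016, None),
        (1, 2017, "DX"), (1, 2018, None),
        (2, 2015, None), (2, 2016, "DX"), (2, 2017, None),
    ],
)

# The frame stops one row before the current row, so a DX in the
# current year does not reset the counter until the following year.
query = """
SELECT id, yr, code,
       yr - MAX(CASE WHEN code = 'DX' THEN yr END) OVER (
           PARTITION BY id
           ORDER BY yr
           ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
       ) AS years_since_dx
FROM t
ORDER BY id, yr
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Rows with no earlier DX for the same id come back as NULL (None in Python), which matches the idea that the counter only starts after the first DX.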

Related

Finding the initial sampled time window after using SAMPLE BY again

I can't seem to find a perhaps easy solution to what I'm trying to accomplish here, using SQL and, more importantly, QuestDB. I also find it hard to put my exact question into words so bear with me.
Input
My real input is different of course but a similar dataset or case is the gas_prices table on the demo page of QuestDB. On https://demo.questdb.io, you can directly write and run queries against some sample database, so it should be easy enough to follow.
The main task I want to accomplish is to find out which month was responsible for each year's highest gallon price.
Output
Using the following query, I can get the average gallon price per month just fine.
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
timestamp                    avg_per_month
2000-06-05T00:00:00.000000Z  1.6724
2000-07-05T00:00:00.000000Z  1.69275
2000-08-05T00:00:00.000000Z  1.635
...                          ...
Then, I take all these monthly averages, group them by year and return the maximum gallon price per year by wrapping the above query in a subquery, like so:
SELECT timestamp, max(avg_per_month) as max_per_year FROM (
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
) SAMPLE BY 12M
timestamp                    max_per_year
2000-01-05T00:00:00.000000Z  1.69275
2001-01-05T00:00:00.000000Z  1.767399999999
2002-01-05T00:00:00.000000Z  1.52075
...                          ...
Wanted output
I want to know which month was responsible for the maximum price of a year.
Looking at the output of the above query, we see that the maximum gallon price for the year 2000 was 1.69275. Which month of the year 2000 had this amount as its average price? I'd like to display this month in an additional column.
For the first row, July 2000 is shown in the additional column for year 2000 because it is responsible for the highest average price in 2000. For the second row, it was May 2001 as that month had the highest average price of 2001.
timestamp                    max_per_year    which_month_is_responsible
2000-01-05T00:00:00.000000Z  1.69275         2000-07-05T00:00:00.000000Z
2001-01-05T00:00:00.000000Z  1.767399999999  2001-05-05T00:00:00.000000Z
...                          ...
What did I try?
I tried by adding a subquery to the SELECT to have a "duplicate" of some sort for the timestamp column but that's apparently never valid in QuestDB (?), so probably the solution is by adding even more subqueries in the FROM? Or a UNION?
Who can help me out with this? The data is there in the database and it can be calculated. It's just a matter of getting it out.

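I think the 'wanted output' can be achieved with window functions.
Please have a look at this example, which finds, for each day, the 15-minute window with the highest average consumption: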
CREATE TABLE electricity (ts TIMESTAMP, consumption DOUBLE) TIMESTAMP(ts);

INSERT INTO electricity
SELECT (x*1000000)::timestamp, rnd_double()
FROM long_sequence(10000000);

SELECT day, ts, max_per_day
FROM
(
    SELECT timestamp_floor('d', ts) as day,
           ts,
           avg_in_15_min as max_per_day,
           row_number() OVER (PARTITION BY timestamp_floor('d', ts) ORDER BY avg_in_15_min desc) as rn_per_day
    FROM
    (
        SELECT ts, avg(consumption) as avg_in_15_min
        FROM electricity
        SAMPLE BY 15m
    )
) WHERE rn_per_day = 1
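The same row_number() pattern can be applied to the monthly averages from the question. The sketch below is not QuestDB code: it uses Python's sqlite3 module (assuming SQLite 3.25 or later), a hypothetical table monthly(month, avg_per_month) standing in for the SAMPLE BY 1M result, and partly made-up values (the 2001 rows other than the May maximum are invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the monthly averages produced by SAMPLE BY 1M.
conn.execute("CREATE TABLE monthly (month TEXT, avg_per_month REAL)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?)",
    [
        ("2000-06", 1.6724), ("2000-07", 1.69275), ("2000-08", 1.635),
        ("2001-04", 1.70), ("2001-05", 1.7674), ("2001-06", 1.65),
    ],
)

# Rank the months inside each year by average price, then keep rank 1.
query = """
SELECT yr, avg_per_month AS max_per_year, month AS which_month_is_responsible
FROM (
    SELECT substr(month, 1, 4) AS yr,
           month,
           avg_per_month,
           row_number() OVER (
               PARTITION BY substr(month, 1, 4)
               ORDER BY avg_per_month DESC
           ) AS rn
    FROM monthly
)
WHERE rn = 1
ORDER BY yr
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the row with rank 1 is kept whole, the month and the maximum value come from the same row, which is what a plain max() per year cannot provide.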

SQL LAG function

I tried using the LAG function to calculate the value of previous weeks, but there are gaps in the data due to the fact that certain weeks are missing.
This is the table:
The problem is that the LAG function takes the previous week found in the table. But I would like the value to be zero if the previous row is not the immediately preceding week.
This is what I would like it to be:
I'm open to any solutions.
Thank you in advance

Your example data is baffling. You have multiple rows per time frame. The first column looks like a string, which doesn't really make sense for the comparison.
So, let me answer based on a simpler data model. The answer is to use range. If you had an integer column that specified the time frame:
ordering sales
1 10
2 20
3 30
5 50
Then you would phrase this as:
select max(sales) over (order by ordering range between 1 preceding and 1 preceding)
This would return the value from the "previous" row as defined by the first column. The value would be in a separate column, not a separate row.
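As a rough check of the range idea, the sketch below runs it on the four sample rows using Python's sqlite3 module (assuming SQLite 3.28 or later, which accepts numeric RANGE offsets). The table name t and the COALESCE to zero are additions for illustration, the latter because the question asks for zero when the previous week is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the sample rows; ordering 4 is missing on purpose.
conn.execute("CREATE TABLE t (ordering INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (5, 50)],
)

# RANGE looks at the value of "ordering", not at row positions, so the
# frame is empty when no row has ordering = current - 1.
query = """
SELECT ordering,
       sales,
       COALESCE(
           MAX(sales) OVER (
               ORDER BY ordering
               RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING
           ),
           0
       ) AS prev_sales
FROM t
ORDER BY ordering
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The row with ordering 5 gets 0 rather than 30, which is the behaviour LAG alone does not give when a week is missing.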

Oracle SQL - Sum next X number of Rows

I have a table in an Oracle database with projected sales per week and would like to sum the next 3 weeks for each week. Here is an example of the table for one product and what I would like to achieve in the last column.
I tried the Sum(Proj Sales) over (partition by Product order by Date), but I am not sure how to configure the Sum Over to get what I am looking for.
Any assistance will be much appreciated.

You can use analytic functions. Assuming that the next three weeks are the current row and the next two:
select t.*,
       sum(proj_sales) over (partition by product
                             order by date
                             rows between current row and 2 following
                            ) as next_three_weeks
from t;
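To see what this frame produces, here is a small sketch using Python's sqlite3 module rather than Oracle (assuming SQLite 3.25 or later). The table weekly(product, week, proj_sales) and its values are hypothetical, with an integer week number standing in for the date column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical weekly projections for a single product.
conn.execute("CREATE TABLE weekly (product TEXT, week INTEGER, proj_sales INTEGER)")
conn.executemany(
    "INSERT INTO weekly VALUES (?, ?, ?)",
    [("A", 1, 10), ("A", 2, 20), ("A", 3, 30), ("A", 4, 40), ("A", 5, 50)],
)

# The frame covers the current row plus the two rows after it.
query = """
SELECT week,
       proj_sales,
       SUM(proj_sales) OVER (
           PARTITION BY product
           ORDER BY week
           ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING
       ) AS next_three_weeks
FROM weekly
ORDER BY week
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Near the end of the partition the frame simply shrinks, so the last two weeks sum fewer than three rows. If "next 3 weeks" should exclude the current week, the frame would be rows between 1 following and 3 following instead.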

Use row number in aggregate sum over UNBOUNDED FOLLOWING SQL

I would like to apply a discount rate when summing cashflows over a number of periods. To do this I need to multiply each of the remaining cashflows by the discount rate, commensurate with its period. I could do this if I knew the row number of each period, but I can't use it with the window calculation I am using. The example below shows the column 'Remaining Interest', which is what I am trying to calculate based on raw data of period and interest.
select Period,RemainingInterest = SUM(PeriodInterestPaid)
OVER (PARTITION BY Name ORDER BY period ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM CF A
Period  Interest  Remaining Interest (Query)  Remaining Interest (Required)
1       1000      1000+2000                   1000/1.02^1+2000/1.02^2
2       2000      2000                        2000/1.02^1

Hi, I hope I understand correctly.
From the query, I understand that you need the sum of a value per period, but you said that you need a multiplication.
If it is only the sum, there is no need for a window function; just use GROUP BY:
select Period, SUM(PeriodInterestPaid) as RemainingInterest
FROM CF A
GROUP BY Period
If you want a multiplication, you would also use GROUP BY, but with a different expression:
Please explain what exactly you need.
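For the discounted sum in the 'Required' column of the question, one possible approach (not the one proposed in this answer) is a self-join, so that the exponent can depend on the distance between the current period and each later period. The sketch below uses Python's sqlite3 module; the table cf(name, period, interest), the 2% rate and the registered power function are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Register power() explicitly, since not every SQLite build ships math functions.
conn.create_function("power", 2, lambda base, exp: base ** exp)

# Hypothetical cashflow table matching the two-row example in the question.
conn.execute("CREATE TABLE cf (name TEXT, period INTEGER, interest REAL)")
conn.executemany(
    "INSERT INTO cf VALUES (?, ?, ?)",
    [("loan", 1, 1000.0), ("loan", 2, 2000.0)],
)

# For each period a, discount every remaining cashflow b by the number
# of periods between a and b, counting the current period as 1.
query = """
SELECT a.period,
       SUM(b.interest / power(1.02, b.period - a.period + 1)) AS remaining_interest
FROM cf a
JOIN cf b
  ON b.name = a.name
 AND b.period >= a.period
GROUP BY a.name, a.period
ORDER BY a.name, a.period
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The join replaces the window frame because a window SUM applies the same expression to every row in the frame, while here the exponent changes with the row the sum is being computed for.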

Year wise Average days SQL

Today I ran into the problem below while writing an SQL query. Please see the data below.
I run an SQL query on my table and get the output below. I GROUP BY ID, Name, Week, Year and Days. Now I want the Days column to show the average of all Days values for the same year. That is, a year value appears in multiple rows, so each row should show the average of Days across all rows for that row's year. The expected result is shown below.
Thanks in advance!
Leave a comment if you have any questions.

You can use OVER:
SELECT
    *,
    AVG(Days) OVER (PARTITION BY LEFT(Year, 4)) AvgDays
FROM
    Tbl
Note: this partitions only by the year (e.g. 2016), taken from the first four characters of the Year column.
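A quick way to try the OVER approach is the sketch below, using Python's sqlite3 module (assuming SQLite 3.25 or later). The table tbl(id, year, days) and its rows are made up, and substr(year, 1, 4) is used in place of LEFT, which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows; year is stored as text, as LEFT(Year, 4) suggests.
conn.execute("CREATE TABLE tbl (id INTEGER, year TEXT, days INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "2016", 4), (2, "2016", 6), (3, "2017", 10)],
)

# Every row keeps its own columns and gains the average for its year.
query = """
SELECT id,
       year,
       days,
       AVG(days) OVER (PARTITION BY substr(year, 1, 4)) AS avg_days
FROM tbl
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Unlike GROUP BY, the window version does not collapse rows, so the per-row Days values stay visible next to the yearly average.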
