Return values within the last 365 days counting from the newest date (Oracle SQL)

[ORACLE SQL] I'm trying to write a query that returns all the values in a single column from the last 365 days, counting back from the newest date (in other words, the last time data was entered).
For example: table: EMPLOYEE_TIMESTAMPS
EMPLOYEE_ID TIMESTAMP_DATE
1 AUG 2014
1 AUG 2015
2 JAN 2016
1 FEB 2016
1 OCT 2016
The result should be only the last two rows, since the 365 days are counted backwards from OCT 2016.
I tried the following code, but it resulted in [ORA-00934: group function is not allowed here] because of the MAX function. Using SYSDATE does not get the job done either, as the last data could have been added months ago.
SELECT * FROM EMPLOYEE_TIMESTAMPS
WHERE TIMESTAMP_DATE >= MAX(TIMESTAMP_DATE) -365;
I'm fairly new to programming so I still have a hard time transmitting my ideas. Thanks for the help.

One method of doing what you want uses window functions:
SELECT *
FROM (SELECT et.*,
MAX(TIMESTAMP_DATE) OVER (PARTITION BY EMPLOYEE_ID) as MAX_TIMESTAMP_DATE
FROM EMPLOYEE_TIMESTAMPS et
) et
WHERE TIMESTAMP_DATE >= MAX_TIMESTAMP_DATE - 365;
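The problem with your version is the use of the aggregate function MAX(). First, there is no GROUP BY in the query. Second, aggregate functions are not allowed in the WHERE clause.
The MAX() in the version above is an analytic function, because it has an OVER clause. With PARTITION BY EMPLOYEE_ID the newest date is computed separately for each employee; an empty OVER () would use the newest date in the whole table instead.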
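To try the window-function approach without an Oracle instance, here is a minimal sketch in Python that uses SQLite (through the standard sqlite3 module) as a stand-in. Several things are assumptions: the day of month in the sample rows (the question only gives month and year), the lower-case table and column names, and SQLite's date(x, '-365 days') in place of Oracle's date minus 365. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question; the day of month (01) is an assumption,
# since the question only lists month and year.
SAMPLE_ROWS = [
    (1, "2014-08-01"),
    (1, "2015-08-01"),
    (2, "2016-01-01"),
    (1, "2016-02-01"),
    (1, "2016-10-01"),
]

# Same shape as the Oracle query, with SQLite date arithmetic swapped in.
QUERY = """
SELECT employee_id, timestamp_date
FROM (SELECT et.*,
             MAX(timestamp_date) OVER (PARTITION BY employee_id) AS max_timestamp_date
      FROM employee_timestamps et
     ) et
WHERE timestamp_date >= date(max_timestamp_date, '-365 days')
ORDER BY timestamp_date
"""


def rows_within_last_year():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_timestamps (employee_id INTEGER, timestamp_date TEXT)"
    )
    conn.executemany("INSERT INTO employee_timestamps VALUES (?, ?)", SAMPLE_ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in rows_within_last_year():
        print(row)
```

With these assumed dates, the two 2014/2015 rows are filtered out. Employee 2's single row is kept because the window is evaluated per employee, so that row is its own newest date.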

Related

How to create a rolling period-over-period comparison in Redshift SQL

I have the following query that pulls all records from a Redshift table from January 1st of the current year through the final date of the most recent, full quarter.
SELECT *
FROM table
WHERE date_value BETWEEN DATE_TRUNC('year',getdate()) AND DATE_TRUNC('quarter',dateadd(day,-1,getdate()));
I now want to create a period-over-period comparison query that returns all records for the previous n months. Ex. if the first query returns all records for Jan - Jun 2022, this query will return all records for Jul - Dec 2021.
Here is what I have so far, however it currently returns Jan - Jun 2021 instead of the desired date range. I've tried playing around with DATEDIFF() instead of DATEADD() but haven't had any luck with that either. Any help is much appreciated.
SELECT *
FROM TABLE
WHERE date_value BETWEEN DATE_TRUNC('year',dateadd(year,-1,getdate())) AND DATE_TRUNC('quarter',dateadd(year,-1,getdate()));
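One way to reason about the boundaries, sketched in Python rather than Redshift SQL so the arithmetic can be checked directly. The helper names are hypothetical, and the sketch assumes the comparison period should span the same number of whole months as the current one and end where the current one starts. It mirrors the two DATE_TRUNC expressions of the first query and does not handle the case where yesterday falls in the previous year.

```python
from datetime import date, timedelta


def quarter_start(d):
    """First day of the quarter containing d (like DATE_TRUNC('quarter', d))."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)


def shift_months(first_of_month, months):
    """Move a first-of-month date by a whole number of months."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def period_bounds(today):
    """Return ((current_start, current_end), (previous_start, previous_end))."""
    current_start = date(today.year, 1, 1)
    current_end = quarter_start(today - timedelta(days=1))
    # Length of the current period in whole months.
    n_months = (
        (current_end.year - current_start.year) * 12
        + current_end.month
        - current_start.month
    )
    # The previous period has the same length and ends where the current one starts.
    previous_start = shift_months(current_start, -n_months)
    return (current_start, current_end), (previous_start, current_start)


if __name__ == "__main__":
    print(period_bounds(date(2022, 8, 15)))
```

For a run date of 15 Aug 2022 this gives 1 Jan 2022 to 1 Jul 2022 for the current period and 1 Jul 2021 to 1 Jan 2022 for the previous one. In Redshift the same idea could presumably be written by computing the month count with DATEDIFF(month, ...) and feeding it to DATEADD(month, ...), though that is untested here.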

Getting the oldest data from a table that is older than 100 days

I've been struggling with the following problem:
EXPLAINING
I have a table called part_subhourly_data that holds production data for a part (For the purpose of the problem, no need to know what a part is).
I need to archive any data older than 100 days. But since there's a lot of data (it arrives every 5 or 10 minutes) and we have more than 1000 parts, I need to do it for the 5 oldest days each time.
This is the schema of my table:
part_subhourly_data
id INTEGER,
part_id INTEGER,
produced_at TIMESTAMP,
data HSTORE
So basically I need to get all data that is in this table, where produced_at is prior to 100 days ago and limit that to the first 5 days, per part.
Example:
Part 1 has data from 15 Aug 2016 until 12 Dec 2016
Part 2 has data from 1st Sep 2016 until 12 Dec 2016
100 days ago would be 3 Sep 2016.
For Part 1 I would take data from 15 Aug 2016 until 19 Aug 2016 (5 days).
For Part 2 I would take data from 1st Sep 2016 until 3 Sep 2016 (3 days because of the 100 days old condition).
WHAT HAVE I TRIED
Well, I'm using Rails for this, but a SQL solution is welcome as well. For now, what I'm doing is grabbing the oldest data with:
SELECT "part_subhourly_data"."part_id", MIN(produced_at) produced_at
FROM "part_subhourly_data"
WHERE (produced_at < (NOW() - INTERVAL '100 days'))
GROUP BY "part_subhourly_data"."part_id"
And then I loop over each part_id and grab the data based on the MIN(produced_at). It works, but it doesn't seem ideal. I'm sure that there is some SQL magic to make it simpler and quicker, without having to loop over each part.
Any idea?
Take all records where produced_at is prior to 100 days ago.
Dense-rank the records per part_id, ordered by produced_at::date in ascending order.
The records with the oldest date will get 1, the records with the next oldest date will get 2 etc.
select part_id,produced_at
from (select part_id,produced_at
,dense_rank () over (partition by part_id order by produced_at::date) as dr
from part_subhourly_data
where produced_at < now() - interval '100 days'
) p
where dr <= 5
;
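To see the dense-rank approach on concrete rows, here is a small sketch in Python with SQLite standing in for PostgreSQL. The demo data (two readings per day for two parts) is made up for illustration, the cutoff is passed in as a parameter instead of now() - interval '100 days', and SQLite's date(produced_at) replaces produced_at::date. The data column is left out.

```python
import sqlite3
from datetime import datetime, timedelta


def build_demo_db():
    """In-memory table with two readings per day for two hypothetical parts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE part_subhourly_data ("
        "id INTEGER PRIMARY KEY, part_id INTEGER, produced_at TEXT)"
    )
    first_day = {1: datetime(2016, 8, 15), 2: datetime(2016, 9, 1)}
    rows = []
    for part_id, start in first_day.items():
        day = start
        while day <= datetime(2016, 9, 10):
            for hour in (0, 6):
                stamp = day + timedelta(hours=hour)
                rows.append((part_id, stamp.strftime("%Y-%m-%d %H:%M:%S")))
            day += timedelta(days=1)
    conn.executemany(
        "INSERT INTO part_subhourly_data (part_id, produced_at) VALUES (?, ?)", rows
    )
    return conn


def oldest_days(conn, cutoff, max_days=5):
    """Rows older than cutoff, limited to the max_days oldest calendar days per part."""
    sql = """
        SELECT part_id, produced_at
        FROM (SELECT part_id, produced_at,
                     DENSE_RANK() OVER (PARTITION BY part_id
                                        ORDER BY date(produced_at)) AS dr
              FROM part_subhourly_data
              WHERE produced_at < ?
             ) p
        WHERE dr <= ?
        ORDER BY part_id, produced_at
    """
    return conn.execute(sql, (cutoff, max_days)).fetchall()


if __name__ == "__main__":
    for row in oldest_days(build_demo_db(), "2016-09-03 12:00:00"):
        print(row)
```

Because every row on the same calendar day gets the same rank, filtering on dr <= 5 keeps whole days and not just five rows. Part 2 only has three days before the cutoff, so it returns three days.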

BigQuery : is it possible to execute another query inside an UDF?

I have a table that records a row for each unique user per day with some aggregated stats for that user on that day, and I need to produce a report that tells me for each day, the no. of unique users in the last 30 days including that day.
eg.
for Aug 31st, it'll count the unique users from Aug 2nd to Aug 31st
for Aug 30th, it'll count the unique users from Aug 1st to Aug 30th
and so on...
I've looked at some related questions but they aren't quite what I need - if a user logs in on multiple days in the last 30 days he should be counted only once, so I can't just sum the DAU count for the last 30 days.
Bigquery SQL for sliding window aggregate
BigQuery SQL for 28-day sliding window aggregate (without writing 28 lines of SQL)
So far, my ideas are to either:
write a simple script that'll execute a separate BigQuery for each of the relevant days
write a BigQuery UDF that'll execute basically the same query for each day selected from another query
but I've not found any examples on how to execute another BigQuery query inside an UDF, or if it's possible at all.
"I need to produce a report that tells me for each day, the no. of unique users in the last 30 days including that day."
The query below should do this:
SELECT
calendar_day,
EXACT_COUNT_DISTINCT(userID) AS unique_users
FROM (
SELECT calendar_day, userID
FROM YourTable
CROSS JOIN (
SELECT DATE(DATE_ADD('2016-08-08', pos - 1, "DAY")) AS calendar_day
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF('2016-09-08', '2016-08-08'), '.'),'') AS h
FROM (SELECT NULL)),h
)))
) AS calendar
WHERE DATEDIFF(calendar_day, dt) BETWEEN 0 AND 29
)
GROUP BY calendar_day
ORDER BY calendar_day DESC
It assumes YourTable has userID and dt fields (like below for example)
dt userID
2016-09-08 1
2016-09-08 2
...
And you can control:
- reporting dates range by changing respectively 2016-08-08 and 2016-09-08
- aggregation size by changing 29 in BETWEEN 0 AND 29
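The same idea (build a calendar, join each calendar day to the rows within the window before it, then count distinct users) can be tried locally. This is only a sketch in Python with SQLite, not BigQuery: the table name daily_users, the column user_id and the sample rows are assumptions, and a recursive CTE replaces the SPLIT/RPAD trick for generating the calendar.

```python
import sqlite3

# Hypothetical activity rows: (dt, user_id), one row per user per active day.
SAMPLE_ROWS = [
    ("2016-08-01", 1),
    ("2016-08-31", 1),
    ("2016-08-02", 2),
    ("2016-08-15", 2),
    ("2016-08-30", 3),
]


def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_users (dt TEXT, user_id INTEGER)")
    conn.executemany("INSERT INTO daily_users VALUES (?, ?)", SAMPLE_ROWS)
    return conn


def rolling_unique_users(conn, first_day, last_day, window_days=30):
    """Distinct users in the window_days ending on each calendar day (inclusive)."""
    sql = """
        WITH RECURSIVE calendar(calendar_day) AS (
            SELECT ?
            UNION ALL
            SELECT date(calendar_day, '+1 day')
            FROM calendar
            WHERE calendar_day < ?
        )
        SELECT c.calendar_day, COUNT(DISTINCT t.user_id) AS unique_users
        FROM calendar c
        LEFT JOIN daily_users t
          ON (julianday(c.calendar_day) - julianday(t.dt)) BETWEEN 0 AND ?
        GROUP BY c.calendar_day
        ORDER BY c.calendar_day DESC
    """
    return conn.execute(sql, (first_day, last_day, window_days - 1)).fetchall()


if __name__ == "__main__":
    for row in rolling_unique_users(build_demo_db(), "2016-08-29", "2016-08-31"):
        print(row)
```

User 2 is active on two days inside most of these windows but is counted once, which is the point of COUNT(DISTINCT ...) over the joined rows. The LEFT JOIN is a small departure from the CROSS JOIN plus WHERE above: it keeps calendar days on which nobody was active, with a count of 0.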

Summarising MONTH value

I have a simple statement that starts:
SELECT a.product, MONTH(a.saledate) AS Month, Count(*) AS Total
Which yields, for example,
Product Month Total
Bike 8 1000
Please can anyone advise if it's possible to add the month's name to this query and also, is it possible to get a monthly total to appear as well?
Thanks!
The query in your example counts all the rows in your table, then presents that count next to a randomly chosen row's product and sale date. That's -- almost certainly -- not what you want. MySQL is quirky that way. Other DBMSs reject your example query.
If you want to display a monthly summary of product sold, here's the basic query:
SELECT a.product,
LAST_DAY(a.saledate) AS month_ending,
COUNT(*) AS Total
FROM table a
GROUP BY a.product, LAST_DAY(a.saledate)
The LAST_DAY() function is a great way to extract month and year from a date.
Finally, if you want to display the text name of the month, you can use the DATE_FORMAT() function to do that. %b as a format specifier gives a three-letter month name, and %M gives the full month name. So this query will do it.
SELECT a.product,
LAST_DAY(a.saledate) AS month_ending,
DATE_FORMAT(LAST_DAY(a.saledate), '%M %Y') AS month,
COUNT(*) AS Total
FROM table a
GROUP BY a.product, LAST_DAY(a.saledate)
In SQL Server 2012+ you can use the EOMONTH() function in place of LAST_DAY().
In SQL Server 2008+ you can use DATENAME(mm, a.saledate) to retrieve the month name from a date.
There are two ways of getting month name
1)
SUBSTRING('JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC ', (MONTH(a.saledate) * 4) - 3, 3)
2)
DATENAME(month, a.saledate)
Some people say you might be using MySQL, in which case getting the month name will be:
SELECT MONTHNAME( a.saledate);
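The offset arithmetic in the SUBSTRING variant is easy to get wrong by one, so here is a quick Python check of it. The function name is hypothetical; the only thing taken from the answer is the lookup string and the (month * 4) - 3 start position, which is 1-based in SQL.

```python
MONTHS = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC "


def month_abbrev(month):
    """Mirror SUBSTRING(MONTHS, (month * 4) - 3, 3) from the SQL expression."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    start = (month * 4) - 3  # 1-based position, as in SQL
    return MONTHS[start - 1:start + 2]  # shift to Python's 0-based slicing


if __name__ == "__main__":
    print([month_abbrev(m) for m in range(1, 13)])
```

Each month occupies four characters (three letters plus a space), so month 8 starts at position 29 and yields AUG.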

How can I get a table of dates from the first day of the month, two months ago, to yesterday?

I have a situation that I would normally solve by creating a feeder table (for example, every date between five years ago and a hundred years into the future) for querying but, unfortunately, this particular job disallows creation of such a table.
So I'm opening this up to the SO community. Today is Jan 29, 2010. What query could I run that would give a table with a single date column with values ranging from Nov 1, 2009 through Jan 28, 2010 inclusive? On Feb 1, it should give me every date from Dec 1, 2009 through Jan 31, 2010.
I'm using DB2 but I'm happy to see any other solutions on the off-chance they may provide a clue.
I know I can select CURRENT DATE from sysibm.sysdummy1 (or dual for Oracle bods) but I'm not sure how to immediately select a date range without a physical backing table.
This just does sequential days between two dates, but I've posted to show you can eliminate the recursive error by supplying a limit.
with temp (level, seqdate) as
(select 1, date('2008-01-01')
from sysibm.sysdummy1
union all
select level, seqdate + level days
from temp
where level < 1000
and seqdate + 1 days < Date('2008-02-01')
)
select seqdate as CalendarDay
from temp
order by seqdate
Update from pax:
This answer actually put me on the right track. You can get rid of the warning by introducing a variable that's limited by a constant. The query above didn't have it quite right (and got the dates wrong, which I'll forgive) but, since it pointed me to the problem solution, it wins the prize.
The code below was the final working version (sans warning):
WITH DATERANGE(LEVEL,DT) AS (
SELECT 1, CURRENT DATE + (1 - DAY(CURRENT DATE)) DAYS - 2 MONTHS
FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT LEVEL + 1, DT + 1 DAY
FROM DATERANGE
WHERE LEVEL < 1000 AND DT < CURRENT DATE - 1 DAY
) SELECT DT FROM DATERANGE;
which outputs, when run on the 2nd of February:
----------
DT
----------
2009-12-01
2009-12-02
2009-12-03
: : : :
2010-01-30
2010-01-31
2010-02-01
DSNE610I NUMBER OF ROWS DISPLAYED IS 63
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL.
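For anyone without DB2 at hand, the recursive query can be approximated in SQLite. The sketch below (Python with the standard sqlite3 module) is an assumption-laden translation, not the DB2 statement itself: "today" is passed in as a parameter instead of CURRENT DATE, and SQLite's 'start of month' and '-2 months' modifiers stand in for the DAY()/MONTHS arithmetic.

```python
import sqlite3


def date_range(today):
    """Dates from the first of the month two months before today through yesterday."""
    sql = """
        WITH RECURSIVE daterange(level, dt) AS (
            SELECT 1, date(:today, 'start of month', '-2 months')
            UNION ALL
            SELECT level + 1, date(dt, '+1 day')
            FROM daterange
            WHERE level < 1000 AND dt < date(:today, '-1 day')
        )
        SELECT dt FROM daterange ORDER BY dt
    """
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute(sql, {"today": today})]
    finally:
        conn.close()


if __name__ == "__main__":
    days = date_range("2010-02-02")
    print(days[0], days[-1], len(days))
```

Run for 2 Feb 2010 this yields 63 dates from 2009-12-01 through 2010-02-01, matching the row count in the DB2 output above. The level < 1000 guard is kept only to mirror the original; SQLite does not need it to accept the query.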
Just an idea (I'm not even sure how you'd do this), but let's say you knew how many days you wanted, like 45 days. If you could get a select to list 1-45, you could do date arithmetic to subtract that number from your reference date (i.e. today).
This kind of works (in MySQL):
SET @i = 0;
SELECT @i:=@i+1 AS myrow, ADDDATE(CURDATE(), -@i)
FROM some_table
LIMIT 10;
The tricky part is how to make the 10 dynamic. If you can do that, then you can use a query like this to get the right number for the limit.
SELECT DATEDIFF(DATE_SUB(CURDATE(), INTERVAL 1 DAY),
DATE_SUB(LAST_DAY(CURDATE()), INTERVAL 2 MONTH))
FROM dual;
I haven't used DB2 before, but in SQL Server you could do something like the following. Note that you need a table with at least as many rows as the number of days you need.
SELECT TOP 45
DATEADD(d, ROW_NUMBER() OVER(ORDER BY [Field]) * -1, GETDATE())
FROM
[Table]