SQL query to return data if AVG(7 days record count) > (Today's record count)

I want to write a SQL query in an Oracle database for the following:
Each priceindex (a field name) has around 120 records per day. I need to display the priceindex name and today's date if the average of the last 7 days' record counts is greater than today's record count for that priceindex (grouped by priceindex).
Basically, there are 56 priceindex values, each of which should have around 120 records per day. The data is dumped into the database each day from an external site, so I want to make sure all records are downloaded to the database every day.

Except for the clarification I requested in a comment on your question (having to do with "how can we know today's final count when today is not over yet?"), the problem can be solved along the following lines. Not tested, since you didn't provide sample data.
From your table, select only the rows where the relevant DATE is between "today" - 7 and "today" (so there are really EIGHT days: the seven days preceding today, and today). Then group by PRICEINDEX. Count the total rows for each group, and count the rows just for "today". The count for "today" should be less than 1/8 times the total count (this is easy algebra: it is equivalent to being less than 1/7 times the count of the OTHER days, which is the 7-day average).
Such conditions, at the group level, must be in the HAVING clause.
select priceindex
from your_table
where datefield >= trunc(sysdate) - 7 and datefield < trunc(sysdate) + 1
group by priceindex
having count(case when datefield >= trunc(sysdate) then 1 end) < 1/8 * count(*)
;
EDIT The OP clarified that the query runs every day at midnight; this means that "today" should actually mean "yesterday" (the day that just ended). In Oracle, and probably in all of computing, midnight belongs to the day that BEGINS at midnight, not the one that ends at midnight. The time-of-day at midnight is 00:00:00 (beginning of the new day), not 24:00:00.
So, the query above will have to be changed slightly:
select priceindex
from your_table
where datefield >= trunc(sysdate) - 8 and datefield < trunc(sysdate)
group by priceindex
having count(case when datefield >= trunc(sysdate) - 1 then 1 end)
< 1/8 * count(*)
;
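As a quick sanity check of the HAVING logic, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The table name `price_rows`, the index names `IDX_FULL` and `IDX_SHORT`, and the row counts are made up for illustration, and SQLite stands in for Oracle, so dates are stored as ISO strings and the run date is passed in instead of using trunc(sysdate). The condition is written as `count_for_last_day * 8 < total` because SQLite would evaluate `1/8` as integer division.

```python
import sqlite3
from datetime import date, timedelta


def short_indexes(conn, run_date):
    """Return priceindex values whose row count for the day before run_date
    is below the average daily count of the 7 days before that day."""
    start = (run_date - timedelta(days=8)).isoformat()     # first of the 8 days
    last_day = (run_date - timedelta(days=1)).isoformat()  # the day that just ended
    end = run_date.isoformat()                             # exclusive upper bound
    sql = """
        SELECT priceindex
        FROM price_rows
        WHERE datefield >= ? AND datefield < ?
        GROUP BY priceindex
        HAVING SUM(CASE WHEN datefield >= ? THEN 1 ELSE 0 END) * 8 < COUNT(*)
        ORDER BY priceindex
    """
    return [row[0] for row in conn.execute(sql, (start, end, last_day))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_rows (priceindex TEXT, datefield TEXT)")

run_date = date(2024, 3, 10)  # hypothetical midnight run
rows = []
for back in range(1, 9):  # the 8 days from 2024-03-02 to 2024-03-09
    day = (run_date - timedelta(days=back)).isoformat()
    rows += [("IDX_FULL", day)] * 120
    # IDX_SHORT received only 100 rows on the most recent day
    rows += [("IDX_SHORT", day)] * (100 if back == 1 else 120)
conn.executemany("INSERT INTO price_rows VALUES (?, ?)", rows)

flagged = short_indexes(conn, run_date)
print(flagged)
```

Only the index with the short final day is returned; an index with exactly the average count is not flagged because the comparison is strict.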

Related

Date Functions Trunc (SysDate)

I am running the query below to get data recorded in the past 24 hours. I need the same data recorded since midnight (DATE > 12:00 AM), and also the data recorded since the beginning of the month. I'm not sure whether using BETWEEN will work or whether there is a better option. Any suggestions?
SELECT COUNT(NUM)
FROM TABLE
WHERE
STATUS = 'CNLD'
AND
TRUNC(TO_DATE('1970-01-01','YYYY-MM-DD') + OPEN_DATE/86400) = trunc(sysdate)
Output: I just need the count. The OPEN_DATE data type is NUMBER. The current output displays the count for the last 24 hours. I need one count starting at midnight and another count starting at the beginning of the month.
The query you've shown will get the count of rows where OPEN_DATE is an 'epoch date' number representing time after midnight this morning*. The condition:
TRUNC(TO_DATE('1970-01-01','YYYY-MM-DD') + OPEN_DATE/86400) = trunc(sysdate)
requires every OPEN_DATE value in your table (or at least all those for CNLD rows) to be converted from a number to an actual date, which is going to be doing a lot more work than necessary, and would stop a standard index against that column being used. It could be rewritten as:
OPEN_DATE >= (trunc(sysdate) - date '1970-01-01') * 86400
which converts midnight this morning to its epoch equivalent, once, and compares all the numbers against that value; using an index if there is one and the optimiser thinks it's appropriate.
To get everything since the start of the month you could just change the default behaviour of trunc(), which is to truncate to the 'DD' element, to truncate to the start of the month instead:
OPEN_DATE >= (trunc(sysdate, 'MM') - date '1970-01-01') * 86400
And for the last 24 hours, subtract a day from the current time instead of truncating it:
OPEN_DATE >= ((sysdate - 1) - date '1970-01-01') * 86400
db<>fiddle with some made-up data to get 72 back for today, more for the last 24 hours, and more still for the whole month.
Based on your current query I'm assuming there won't be any future-dated values, so you don't need to worry about an upper bound for any of these.
*Ignoring leap seconds...
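The three thresholds above can be sketched outside the database as well. The following is a rough Python illustration, assuming OPEN_DATE holds seconds since 1970-01-01 and that the clock value is a naive timestamp in the same time zone the numbers were recorded in; the function names `epoch_seconds` and `thresholds` and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def epoch_seconds(moment):
    """Seconds between 1970-01-01 00:00:00 and a naive datetime."""
    return int((moment - EPOCH).total_seconds())


def thresholds(now):
    """Lower bounds to compare OPEN_DATE against, each computed once."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    month_start = midnight.replace(day=1)
    return {
        "today": epoch_seconds(midnight),                   # like trunc(sysdate)
        "month": epoch_seconds(month_start),                # like trunc(sysdate, 'MM')
        "last_24h": epoch_seconds(now - timedelta(days=1)),  # like sysdate - 1
    }


now = datetime(2024, 3, 10, 15, 30, 0)  # made-up "current time"
limits = thresholds(now)
print(limits)
```

Each threshold is a single number, so the comparison against the stored column stays a plain numeric range test.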
It sounds like you have a column that is of data type TIMESTAMP and you only want to select rows where that TIMESTAMP indicates that it is today's date? And as a related problem, you want to find those that are the current month, based on some system values like CURRENT TIMESTAMP and CURRENT DATE? If so, let's call your column TRANSACTION_TIMESTAMP instead of (reserved word) DATE. Your first query could be:
SELECT COUNT(NUM)
FROM TABLE
WHERE
STATUS = 'CNLD'
AND
DATE(TRANSACTION_TIMESTAMP)=CURRENT DATE
The second example of finding all for the current month up to today's date could be:
SELECT COUNT(NUM)
FROM TABLE
WHERE
STATUS = 'CNLD'
AND
YEAR(DATE(TRANSACTION_TIMESTAMP)) = YEAR(CURRENT DATE) AND
MONTH(DATE(TRANSACTION_TIMESTAMP)) = MONTH(CURRENT DATE) AND
DAY(DATE(TRANSACTION_TIMESTAMP)) <= DAY(CURRENT DATE)

Get All records of previous date from current date in oracle

I want all the rows in my table that are more than 6 months old. For that I wrote the query below, but it isn't returning the exact records.
Select * from changerequests where lastmodifiedon < sysdate - 180;
The issue is that I was getting records for 2nd April 2020, which is not more than 6 months ago. Please suggest a query.
If you want records that were last modified within the last 6 months, then you want the inequality condition the other way around:
where lastmodifiedon > sysdate - 180
Note that 180 days is not exactly 6 months. You might want to use add_months() for something more accurate:
where lastmodifiedon > add_months(sysdate, -6)
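To see how far "180 days" can drift from "6 calendar months", here is a small Python sketch. The helper `add_months` below is a hypothetical stand-in that only clamps the day to the end of the target month; it is a simplification and not a full reimplementation of Oracle's ADD_MONTHS. The sample date is made up.

```python
import calendar
from datetime import date, timedelta


def add_months(d, months):
    """Shift d by whole calendar months, clamping the day to the last day
    of the target month (simplified month arithmetic)."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


today = date(2020, 10, 15)                # made-up run date
by_days = today - timedelta(days=180)     # like sysdate - 180
by_months = add_months(today, -6)         # like add_months(sysdate, -6)
print(by_days, by_months, (by_days - by_months).days)
```

For this sample date the two cutoffs differ by three days, so rows modified in that gap would be classified differently by the two conditions.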

unable to count records for an interval in days

I want to detect how many records are covered by a certain period in a RedShift table. So I queried records for various periods of time. However I've noticed a strange behavior.
When I'm trying to count a number of records for say 100 days it returns 0 no matter how many days I'm executing the query for.
SELECT count(*)
FROM main.transaction_data
WHERE tr_date > current_date - interval '100' day;
But when I query the count for several months it returns a valid count.
SELECT count(*)
FROM main.transaction_data
WHERE tr_date > current_date - interval '3 months';
Is the query for a period of 100 days incorrect?

Sum of shifting range in SQL Query

I am trying to write an efficient query to get the sum of the previous 7 days worth of values from a relational DB table, and record each total against the final date in the 7 day period (e.g. the 'WeeklyTotals Table' in the example below). For example, in my WeeklyTotals query, I would like the value for February 15th to be 333, since that is the total sum of users from Feb 9th - Feb 15th, and so on:
I have a base query which gets me my previous weeks users for today's date (simplified for the sake of the example):
SELECT Date, Sum("Total Users")
FROM "UserRecords"
WHERE dateadd(hour, -8, "UserRecords"."Date") BETWEEN
dateadd(hour, -8, sysdate) - INTERVAL '7 DAY' AND dateadd(hour, -8, sysdate);
The problem is that this only gets me the total for today's date. I need a query that will get me this information for each of the previous seven days.
I know I can make a view for each date (since I only need the previous seven entries) and join them all together, but that seems really inefficient (I'll have to create/update 7 views, and then do all the inner join operations). I am wondering if there's a more efficient way to achieve this.
Provided there are no gaps, you can use a running total with SUM OVER including the six previous rows. Use ROW_NUMBER to exclude the first six records, as their totals don't represent complete weeks.
select log_date, week_total
from
(
  select
    log_date,
    sum(total_users) over (order by log_date rows 6 preceding) as week_total,
    row_number() over (order by log_date) as rn
  from mytable
  where log_date > current_date - 13  -- 13 days: 7 totals, each needing 6 preceding rows
) t
where rn >= 7
order by log_date;
UPDATE: In case there are gaps, it should be
sum(total_users) over (order by log_date range interval '6' day preceding)
but I don't know whether PostgreSQL supports this already. (Moreover the ROW_NUMBER exclusion wouldn't work then and would have to be replaced by something else.)
Here's a query that self joins to the previous 6 days and sums the values to get the weekly totals:
select u1.date, sum(u2.total_users) as weekly_users
from UserRecords u1
join UserRecords u2
on u1.date - u2.date < 7
and u1.date >= u2.date
group by u1.date
order by u1.date
You can also use SUM as a window function, with an expression based on the week date part.
Self joins are typically much slower than window functions.
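The "sum over the current row and the six preceding rows" idea can also be sketched in plain Python, which may help when checking a query's output by hand. This is only an illustration under the same no-gaps assumption as the first answer; the function name `weekly_totals` and the sample figures are made up and are not the asker's data.

```python
from collections import deque
from datetime import date, timedelta


def weekly_totals(daily):
    """daily: list of (day, users) pairs sorted by day, one row per day.
    Returns (day, sum of that day and the 6 days before it) for every day
    that has a complete 7-day window behind it."""
    window = deque(maxlen=7)
    running = 0
    totals = []
    for day, users in daily:
        if len(window) == 7:
            running -= window[0]  # oldest value is about to leave the window
        window.append(users)
        running += users
        if len(window) == 7:
            totals.append((day, running))
    return totals


start = date(2024, 2, 1)
daily = [(start + timedelta(days=i), 10 + i) for i in range(10)]  # 10, 11, ..., 19
totals = weekly_totals(daily)
for day, total in totals:
    print(day, total)
```

The first six days produce no output, which mirrors the `rn >= 7` filter in the window-function query.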

Pull records ahead of time

I'm trying to develop an 'upcoming listings' puller for my website, however the following query does not seem to be performing as it should.
The SQL:
SELECT * FROM listings WHERE start_date > DATE_SUB( CURDATE(), INTERVAL 3 MONTH )
Rather than pulling listings from 3 months ahead it pulls all listings?
If you want to work with 'future' records, you should use DATE_ADD instead:
SELECT *
FROM listings
WHERE start_date
BETWEEN CURDATE()
AND DATE_ADD(CURDATE(), INTERVAL 3 MONTH);
Note that the BETWEEN ... AND clause is inclusive: in other words, you'll get records with start_date equal to the current date and to the date exactly 3 months later. If that's not the desired outcome, just use two separate conditions:
WHERE start_date > CURDATE()
AND start_date < DATE_ADD(CURDATE(), INTERVAL 3 MONTH);
As it stands now, you collect all the records having start_date set to later than 3 months before the current date. That probably includes the whole dataset.
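The difference between the two predicates can be shown with a few made-up rows. In this Python sketch the helper `add_months` is a hypothetical approximation of month arithmetic (it clamps to the end of the target month), and the listing names and dates are invented for illustration.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift d by whole calendar months, clamping to the month's last day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    return date(year, month, min(d.day, calendar.monthrange(year, month)[1]))


def original_filter(listings, today):
    """start_date > today - 3 months: keeps everything after a past cutoff."""
    cutoff = add_months(today, -3)
    return [name for name, start in listings if start > cutoff]


def upcoming(listings, today):
    """today <= start_date <= today + 3 months (inclusive, like BETWEEN)."""
    end = add_months(today, 3)
    return [name for name, start in listings if today <= start <= end]


today = date(2024, 3, 10)
listings = [
    ("old", date(2023, 6, 1)),
    ("recent", date(2024, 2, 1)),
    ("soon", date(2024, 4, 1)),
    ("far", date(2024, 12, 1)),
]
print(original_filter(listings, today))
print(upcoming(listings, today))
```

The original predicate keeps every listing newer than the past cutoff, including far-future ones, while the bounded version keeps only the listing that starts within the next three months.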