Here is my SQL query:
select id, max(file_uploaded_at) as latest_date, current_date
from employee_imports
where convert(file_uploaded_at) - integer '90'< current_date AND
error_message IS NULL
group by id
order by latest_date DESC
I keep getting back values that are older than 90 days though. Do you know why?
select id, max(file_uploaded_at) as latest_date, current_date
from employee_imports
where current_date - date(file_uploaded_at) <= 90 AND
error_message IS NULL
group by id
order by latest_date DESC
This is what I changed my query to and everything appears to work fine.
Thanks anyway guys.
Related
I hope you can suppor me with a piece of code I'm writing. I'm working with the following query:
SELECT case_id, case_date, people_id FROM table_1;
and I've to search in the DB how many times the same people_id is repeted in the DB, (different case_id) considering the case_date -90days timeframe. Any advise on how to address that?
Data sample
Additional info: as results I'm expecting to have the list of people_id with how many cases received in the 90 days from the last case_date.
expected result sample:
The way I understood the question, it would be something like this:
select people_id,
case_id,
count(*)
from table_1
where case_date >= trunc(sysdate) - 90
group by people_id,
case_id
You want to filter WHERE the case_date is greater than or equal to 90 days before the start of today and then GROUP BY the people_id and COUNT the number of DISTINCT (different) case_id:
SELECT people_id,
COUNT( DISTINCT case_id ) AS number_of_cases
FROM table_1
WHERE case_date >= TRUNC( SYSDATE ) - INTERVAL '90' DAY
GROUP BY
people_id;
If you only want to count repeated case_id per person_id then:
SELECT person_id,
COUNT(*) AS number_of_repeated_cases
FROM (
SELECT case_id,
person_id,
FROM table_1
WHERE case_date >= TRUNC( SYSDATE ) - INTERVAL '90' DAY
GROUP BY
people_id,
case_id
HAVING COUNT(*) >= 2
)
GROUP BY
people_id;
I think you want window functions:
select t.*,
count(*) over (partition by people_idorder by case_date
range between interval '90' day preceding and current row
) as person_count_90_day
from t;
The best way to explain what I need is showing, so, here it is:
Currently I have this query
select
date_
,count(*) as count_
from table
group by date_
which returns me the following database
Now I need to get a new column, that shows me the count off all the previous 7 days, considering the row date_.
So, if the row is from day 29/06, I have to count all ocurrencies of that day ( my query is already doing it) and get all ocurrencies from day 22/06 to 29/06
The result should be something like this:
If you have values for all dates, without gaps, then you can use window functions with a rows frame:
select
date,
count(*) cnt
sum(count(*)) over(order by date rows between 7 preceding and current row) cnt_d7
from mytable
group by date
order by date
you can try something like this:
select
date_,
count(*) as count_,
(select count(*)
from table as b
where b.date_ <= a.date_ and b.date_ > a.date - interval '7 days'
) as count7days_
from table as a
group by date_
If you have gaps, you can do a more complicated solution where you add and subtract the values:
with t as (
select date_, count(*) as count_
from table
group by date_
union all
select date_ + interval '8 day', -count(*) as count_
from table
group by date_
)
select date_,
sum(sum(count_)) over (order by date_ rows between unbounded preceding and current row) - sum(count_)
from t;
The - sum(count_) is because you do not seem to want the current day in the cumulated amount.
You can also use the nasty self-join approach . . . which should be okay for 7 days:
with t as (
select date_, count(*) as count_
from table
group by date_
)
select t.date_, t.count_, sum(tprev.count_)
from t left join
t tprev
on tprev.date_ >= t.date_ - interval '7 day' and
tprev.date_ < t.date_
group by t.date_, t.count_;
The performance will get worse and worse as "7" gets bigger.
Try with subquery for the new column:
select
table.date_ as groupdate,
count(table.date_) as date_count,
(select count(table.date_)
from table
where table.date_ <= groupdate and table.date_ >= groupdate - interval '7 day'
) as total7
from table
group by groupdate
order by groupdate
I have a table customerordercapture and a column updatedate. I want to select entities which are monitored by customer in customerordercapture table during previous day (till midnight), that is last updatedate whole days data
and I want one more query where I need data of and before second last updatedate.
The query is fired from a script, so hardcoded dates won't work.
i think my queries are wrong.
SELECT distinct UPDATEDATE
FROM customerordercapture
GROUP BY UPDATEDATE
HAVING MAX(UPDATEDATE) < = SYSDATE-2
and UPDATEDATE >= MIN(UPDATEDATE)
ORDER BY UPDATEDATE asc ;
SELECT distinct UPDATEDATE
FROM customerordercapture
GROUP BY UPDATEDATE
HAVING UPDATEDATE <= TRUNC(MAX(UPDATEDATE)) - INTERVAL '3' DAY
and UPDATEDATE =TRUNC( MAX(UPDATEDATE) )
ORDER BY UPDATEDATE asc ;
If I right understand, maybe this help you:
SELECT MAX(UPDATEDATE), some_column
FROM customerordercapture
WHERE UPDATEDATE <= sysdate
GROUP BY some_column
ORDER BY MAX(UPDATEDATE) asc;
This is the SQL query that I'm trying to execute:
select *,count(dummy) over(partition by dummy) as total_count
from aaca711a5e78441cdbf062f1d630ee261
WHERE (max_timestamp BETWEEN '2017-01-01' AND '2018-01-01')
ORDER BY max_timestamp DESC
As far as I know in a BETWEEN AND operation, both values are inclusive. Here, this query is unable to fetch records corresponding to 2018-01-01.
I changed the query to this:
select *,count(dummy) over(partition by dummy) as total_count
from aaca711a5e78441cdbf062f1d630ee261
WHERE (max_timestamp >= '2017-01-01' AND max_timestamp <= '2018-01-01')
ORDER BY max_timestamp DESC
Still, it's not working.
Then I tried this:
select *,count(dummy) over(partition by dummy) as total_count
from aaca711a5e78441cdbf062f1d630ee261
WHERE (max_timestamp >= '2017-01-01' AND max_timestamp <= '2018-01-02')
ORDER BY max_timestamp DESC
It's able to fetch records related to 2018-01-01.
What could be the reason for this? and how can I fix this?
Thanks in advance.
This is your query:
select *, count(dummy) over (partition by dummy) as total_count
from aaca711a5e78441cdbf062f1d630ee261
where max_timestamp BETWEEN '2017-01-01' AND '2018-01-01'
order by max_timestamp DESC;
Simply don't use between with date times. Use explicit logic:
select *, count(dummy) over (partition by dummy) as total_count
from aaca711a5e78441cdbf062f1d630ee261
where max_timestamp >= '2017-01-01' and
max_timestamp < '2018-01-02' --> notice this is one day later
order by max_timestamp DESC;
The problem is that you have a time component on the date.
Aaron Bertrand explains this very well in his blog What do BETWEEN and the devil have in common? (I am amused by the title, given that BETWEEN definitely does exist, but there is more controversy about the existence of the devil.)
This is a known issue with Spark.
Please refer to this link for more info: https://issues.apache.org/jira/browse/SPARK-10837
I've fixed this issue by using the date_add function provided by spark.
so the last date was changed to date_add(endDate, 1) so that we'll get all the values including those corresponding to the last date.
I am trying to find all record count for each day using query:
select cast(Timestamp_field as date), count(*) as cnt
from table1
group by 1
having cast(Timestamp_field as date) between date and date -10;
Timestamp_field is a timestamp and I am casting this to date. This; despite max value of Timestamp_field showing 2016-09-20 12:31:38.000000, doesn't return any record. Any idea why?
My guess is that the problem is the between. Perhaps this will work for you:
select cast(Timestamp_field as date), count(*)
from table1
group by 1
having cast(Timestamp_field as date) between date - 10 and date;
The smaller value should go first for the between comparands.
Note: You should do the filtering before the group by, not after:
select cast(Timestamp_field as date), count(*)
from table1
where cast(Timestamp_field as date) between date - 10 and date;
group by 1