How can you build a date range in BigQuery? Each range starts on the 29th of a month and ends on the 28th of the next month. It should look like this:
Date | Starting Date | Ending Date
03-13-2020 | 02-29-2020 | 03-28-2020
06-30-2020 | 06-29-2020 | 07-28-2020
01-01-2021 | 12-29-2020 | 01-28-2021
11-11-2021 | 10-29-2021 | 11-28-2021
Actually, I wrote an article on it.
Check it out:
https://www.theaccountingtactics.com/2021/12/BigQueryBQ-DateProblems-DateSituations-that-are-Hard-to-Analyze-and-Takes-Time-ToCrack%20.html?m=1
Consider the approach below:
create temp function set_day(date date, day int64) as (
  -- Same year and month with the requested day; fall back to the
  -- month's last day when that day does not exist (e.g. Feb 29).
  ifnull(
    safe.date(extract(year from date), extract(month from date), day),
    last_day(date)
  )
);
select Date,
  set_day(Starting_Date, 29) as Starting_Date,
  set_day(Ending_Date, 28) as Ending_Date
from (
  -- Days 1-28 belong to the range that started last month; days 29+ start a new range.
  select *, if(extract(day from Date) < 29,
      struct(date_sub(Date, interval 1 month) as Starting_Date, Date as Ending_Date),
      struct(Date as Starting_Date, date_add(Date, interval 1 month) as Ending_Date)
    ).*
  from your_table
)
Applied to the sample data from your question:
with your_table as (
select date '2020-03-13' Date union all
select '2021-03-13' union all
select '2020-06-30' union all
select '2021-01-01' union all
select '2021-11-11'
)
You can test the whole thing with the query below:
create temp function set_day(date date, day int64) as (
  ifnull(
    safe.date(extract(year from date), extract(month from date), day),
    last_day(date)
  )
);
with your_table as (
  select date '2020-03-13' Date union all
  select '2021-03-13' union all
  select '2020-06-30' union all
  select '2021-01-01' union all
  select '2021-11-11'
)
select Date,
  set_day(Starting_Date, 29) as Starting_Date,
  set_day(Ending_Date, 28) as Ending_Date
from (
  select *, if(extract(day from Date) < 29,
      struct(date_sub(Date, interval 1 month) as Starting_Date, Date as Ending_Date),
      struct(Date as Starting_Date, date_add(Date, interval 1 month) as Ending_Date)
    ).*
  from your_table
)
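A variant without a UDF is also possible. This is only a sketch, assuming the same your_table with a DATE column named Date; the period_month name is made up for illustration. Subtracting 28 days moves every date into the calendar month in which its range starts, so both ends can be derived from that month's first day:

```sql
with your_table as (
  select date '2020-03-13' as Date union all
  select date '2021-03-13' union all
  select date '2020-06-30' union all
  select date '2021-01-01' union all
  select date '2021-11-11'
), shifted as (
  -- period_month (hypothetical name): first day of the month in which the range starts
  select Date, date_trunc(date_sub(Date, interval 28 day), month) as period_month
  from your_table
)
select Date,
  -- the 29th of the starting month, or its last day when there is no 29th (non-leap February)
  least(date_add(period_month, interval 28 day), last_day(period_month)) as Starting_Date,
  -- the 28th of the following month
  date_add(date_add(period_month, interval 1 month), interval 27 day) as Ending_Date
from shifted
```

For the sample dates this should produce the same ranges as the set_day version; it is worth comparing the two on your own data before relying on either.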
I need a SELECT in Oracle SQL that finds rows with overlapping dates, limited to rows created between today and exactly one year ago. ID_FORMULAR is not a unique value, and I need to include only the rows with overlapping dates where ID_FORMULAR is unique.
My code:
SELECT T1.*
FROM VISITORS T1, VISITORS T2
WHERE ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.FROM_DATE >= t2.FROM_DATE
AND t1.FROM_DATE <= t2.TO_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
OR ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.TO_DATE >= t2.FROM_DATE
AND t1.TO_DATE <= t2.TO_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
OR ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.TO_DATE >= t2.TO_DATE
AND t1.FROM_DATE <= t2.FROM_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
It is not working correctly. Any help?
From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row processing:
SELECT *
FROM (
SELECT *
FROM visitors
WHERE created_date >= ADD_MONTHS(TRUNC(CURRENT_DATE), -12)
AND created_date < TRUNC(CURRENT_DATE) + 1
)
MATCH_RECOGNIZE(
ORDER BY from_date
ALL ROWS PER MATCH
PATTERN (any_row overlap+)
DEFINE
overlap AS PREV(id_formular) != id_formular
AND PREV(to_date) >= from_date
)
Which, for the sample data:
CREATE TABLE visitors (id_formular, created_date, from_date, to_date) AS
SELECT 1, DATE '2022-08-01', DATE '2022-08-01', DATE '2022-08-03' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-02', DATE '2022-08-04' FROM DUAL UNION ALL
SELECT 3, DATE '2022-08-01', DATE '2022-08-03', DATE '2022-08-05' FROM DUAL UNION ALL
SELECT 1, DATE '2022-08-01', DATE '2022-08-06', DATE '2022-08-06' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-07', DATE '2022-08-09' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-08', DATE '2022-08-10' FROM DUAL UNION ALL
SELECT 1, DATE '2022-08-01', DATE '2022-08-09', DATE '2022-08-11' FROM DUAL;
Outputs:
FROM_DATE | ID_FORMULAR | CREATED_DATE | TO_DATE
01-AUG-22 | 1 | 01-AUG-22 | 03-AUG-22
02-AUG-22 | 2 | 01-AUG-22 | 04-AUG-22
03-AUG-22 | 3 | 01-AUG-22 | 05-AUG-22
08-AUG-22 | 2 | 01-AUG-22 | 10-AUG-22
09-AUG-22 | 1 | 01-AUG-22 | 11-AUG-22
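On versions before 12c, where MATCH_RECOGNIZE is not available, a correlated EXISTS is one alternative. This is a minimal sketch, assuming the visitors table above and assuming that both rows of an overlapping pair must fall inside the one-year window (that second condition is my assumption; remove it if only the reported row has to be recent). Two ranges overlap when each one starts on or before the other ends:

```sql
SELECT v1.*
FROM   visitors v1
WHERE  v1.created_date >= ADD_MONTHS(TRUNC(CURRENT_DATE), -12)
AND    v1.created_date <  TRUNC(CURRENT_DATE) + 1
AND    EXISTS (
         SELECT 1
         FROM   visitors v2
         WHERE  v2.id_formular <> v1.id_formular   -- a different form
         AND    v2.from_date   <= v1.to_date       -- v2 starts before v1 ends
         AND    v2.to_date     >= v1.from_date     -- v2 ends after v1 starts
         -- assumption: the other row must also be from the last year
         AND    v2.created_date >= ADD_MONTHS(TRUNC(CURRENT_DATE), -12)
         AND    v2.created_date <  TRUNC(CURRENT_DATE) + 1
       )
ORDER BY v1.from_date;
```

Unlike the MATCH_RECOGNIZE pattern, which compares each row only with the row before it in from_date order, this compares every pair of rows, so it can return more rows on the same data.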
I don't quite understand the question. What confuses me is that you need just the rows where the ID is unique. If the ID is unique, then there is no other row to overlap with. Anyway, let's suppose that the sample data is like below:
WITH
tbl AS
(
SELECT 0 "ID", DATE '2021-07-01' "CREATED", DATE '2021-07-01' "DATE_FROM", DATE '2021-07-13' "DATE_TO" FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-01', DATE '2021-12-01', DATE '2021-12-03' FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-04', DATE '2021-12-04', DATE '2021-12-14' FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-12', DATE '2021-12-12', DATE '2021-12-29' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-04', DATE '2022-08-04', DATE '2022-08-10' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-11', DATE '2022-08-11', DATE '2022-08-21' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-21', DATE '2022-08-21', DATE '2022-08-29' FROM DUAL UNION ALL
SELECT 3, DATE '2022-08-11', DATE '2022-08-11', DATE '2022-08-29' FROM DUAL UNION ALL
SELECT 4, DATE '2022-08-14', DATE '2022-08-14', DATE '2022-08-14' FROM DUAL UNION ALL
SELECT 4, DATE '2022-08-29', DATE '2022-08-14', DATE '2022-08-29' FROM DUAL
)
We can add some columns that tell us whether the ID is unique, the order of appearance of each row within the same ID, the end date of the previous row for the same ID, and whether the rows of a particular ID overlap. Here is the code (it uses analytic functions with a windowing clause):
SELECT
ID "ID",
CASE WHEN Count(*) OVER (PARTITION BY ID ORDER BY ID) = 1 THEN 'Y' ELSE 'N' END "IS_UNIQUE",
Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) "ID_ORDER_NO",
CREATED "CREATED",
DATE_FROM "DATE_FROM",
DATE_TO "DATE_TO",
CASE
WHEN Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) = 1
THEN Null
ELSE
First_Value(DATE_TO) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
END "PREVIOUS_END_DATE",
CASE
WHEN Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) = 1
THEN 'N'
ELSE
CASE
WHEN DATE_FROM <= First_Value(DATE_TO) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
THEN 'Y'
ELSE 'N'
END
END "OVERLAPS"
FROM
TBL
WHERE
CREATED BETWEEN ADD_MONTHS(TRUNC(SYSDATE, 'dd'), -12) And TRUNC(SYSDATE, 'dd')
Here is the resulting dataset...
/* R e s u l t
        ID IS_UNIQUE ID_ORDER_NO CREATED   DATE_FROM DATE_TO   PREVIOUS_END_DATE OVERLAPS
---------- --------- ----------- --------- --------- --------- ----------------- --------
         1 N                   1 01-DEC-21 01-DEC-21 03-DEC-21                   N
         1 N                   2 04-DEC-21 04-DEC-21 14-DEC-21 03-DEC-21         N
         1 N                   3 12-DEC-21 12-DEC-21 29-DEC-21 14-DEC-21         Y
         2 N                   1 04-AUG-22 04-AUG-22 10-AUG-22                   N
         2 N                   2 11-AUG-22 11-AUG-22 21-AUG-22 10-AUG-22         N
         2 N                   3 21-AUG-22 21-AUG-22 29-AUG-22 21-AUG-22         Y
         3 Y                   1 11-AUG-22 11-AUG-22 29-AUG-22                   N
         4 N                   1 14-AUG-22 14-AUG-22 14-AUG-22                   N
         4 N                   2 29-AUG-22 14-AUG-22 29-AUG-22 14-AUG-22         Y
*/
This dataset can then be used to get the rows and columns you are after. You can filter it, do other calculations (like the number of overlapping days), get the number of rows per ID, and so on.
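The previous row's end date can also be fetched with LAG, which may be easier to read than FIRST_VALUE with a one-row window. A small sketch under the same assumptions (rows compared within one ID, ordered by DATE_FROM and DATE_TO), with three sample rows inlined so it runs on its own:

```sql
WITH tbl (id, date_from, date_to) AS (
  SELECT 1, DATE '2021-12-01', DATE '2021-12-03' FROM DUAL UNION ALL
  SELECT 1, DATE '2021-12-04', DATE '2021-12-14' FROM DUAL UNION ALL
  SELECT 1, DATE '2021-12-12', DATE '2021-12-29' FROM DUAL
)
SELECT id,
       date_from,
       date_to,
       -- NULL for the first row of each ID, otherwise the previous row's end date
       LAG(date_to) OVER (PARTITION BY id ORDER BY date_from, date_to) AS previous_end_date,
       -- a NULL previous end date makes the comparison unknown, so the CASE falls through to 'N'
       CASE
         WHEN date_from <= LAG(date_to) OVER (PARTITION BY id ORDER BY date_from, date_to)
         THEN 'Y'
         ELSE 'N'
       END AS overlaps
FROM   tbl
ORDER BY id, date_from, date_to;
```

Like the query above, this only compares each row with the one directly before it, so an overlap with an earlier, longer range would go unnoticed.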
Regards...
Can anyone help me do this in BigQuery? I have two tables like this
01/01/2000
01/02/2000
01/03/2000
01/04/2000
and this
start | end | status
01/01/2000 | 01/02/2000 | a
01/02/2000 | 01/06/2000 | b
I want to combine them into this
month | status
01/01/2000 | a
01/02/2000 | b
01/03/2000 | b
01/04/2000 | b
You can handle this with a BETWEEN in the join. The only caveat is how to handle dates that sit on the boundary between two ranges. In this case I've subtracted a day from the end so that the end date is not included in the range.
with temp1 as(
select '01/01/2000' dt UNION ALL
select '01/02/2000' UNION ALL
select '01/03/2000' UNION ALL
select '01/04/2000'
),
temp2 as (
select '01/01/2000' start, '01/02/2000' end_dt, 'a' status UNION ALL
select '01/02/2000' start, '01/06/2000' end_dt, 'b' status
)
select *
from temp1
join temp2
on parse_date('%d/%m/%Y',temp1.dt) between parse_date('%d/%m/%Y',temp2.start) and date_add(parse_date('%d/%m/%Y', temp2.end_dt), interval -1 day)
Below is for BigQuery Standard SQL
#standardSQL
select month, status
from `project.dataset.tableA`
join `project.dataset.tableB`
on parse_date('%d/%m/%Y', month) >= parse_date('%d/%m/%Y', start)
and parse_date('%d/%m/%Y', month) < parse_date('%d/%m/%Y', `end`)
Applied to the sample data from your question, as in the example below:
#standardSQL
with `project.dataset.tableA` as (
select '01/01/2000' month union all
select '01/02/2000' union all
select '01/03/2000' union all
select '01/04/2000'
), `project.dataset.tableB` as (
select '01/01/2000' start, '01/02/2000' `end`, 'a' status union all
select '01/02/2000', '01/06/2000', 'b'
)
select month, status
from `project.dataset.tableA`
join `project.dataset.tableB`
on parse_date('%d/%m/%Y', month) >= parse_date('%d/%m/%Y', start)
and parse_date('%d/%m/%Y', month) < parse_date('%d/%m/%Y', `end`)
By the way, since mid-October 2020 BigQuery Standard SQL supports DATE arithmetic operators, so the query below will also work:
#standardSQL
select month, status
from `project.dataset.tableA`
join `project.dataset.tableB`
on parse_date('%d/%m/%Y', month)
between parse_date('%d/%m/%Y', start)
and parse_date('%d/%m/%Y', `end`) - 1
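If the strings are parsed once up front, the join condition itself becomes easier to read. A sketch using the same sample data; the CTE names table_a, table_b, a and b and the column names month_date, start_date and end_date are placeholders of mine, not anything from the question:

```sql
#standardSQL
with table_a as (
  select '01/01/2000' as month union all
  select '01/02/2000' union all
  select '01/03/2000' union all
  select '01/04/2000'
), table_b as (
  select '01/01/2000' as start, '01/02/2000' as `end`, 'a' as status union all
  select '01/02/2000', '01/06/2000', 'b'
), a as (
  -- parse each string exactly once
  select parse_date('%d/%m/%Y', month) as month_date from table_a
), b as (
  select parse_date('%d/%m/%Y', start) as start_date,
         parse_date('%d/%m/%Y', `end`) as end_date,
         status
  from table_b
)
select format_date('%d/%m/%Y', a.month_date) as month, b.status
from a
join b
  -- half-open range: the start is included, the end is excluded
  on a.month_date >= b.start_date
 and a.month_date < b.end_date
order by a.month_date
```

The half-open comparison is the same rule as subtracting one day from the end, just written without the arithmetic.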
I am trying to get the most recent 31st August dynamically.
E.g. if the current date is today, I would like to get 31st August 2019;
next year, I want this to be dynamic and return 31st August 2020.
I have tried DATE_SUB and DATE_TRUNC and they are not working. Any ideas would be really helpful.
SELECT DATE_SUB(current_date(), INTERVAL 5 DAY) as five_days_ago
The query below will always return the latest August 31st:
#standardSQL
SELECT IF(CURRENT_DATE() < last_august_31, DATE_SUB(last_august_31, INTERVAL 1 YEAR), last_august_31) AS last_august_31
FROM UNNEST([DATE(EXTRACT(YEAR FROM CURRENT_DATE()), 8, 31)]) last_august_31
If you need to use this within a query against a date field, consider the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT DATE '2019-01-01'dt UNION ALL
SELECT '2019-12-31' UNION ALL
SELECT CURRENT_DATE()
)
SELECT dt, IF(dt < last_august_31, DATE_SUB(last_august_31, INTERVAL 1 YEAR), last_august_31) AS last_august_31
FROM `project.dataset.table`,
UNNEST([DATE(EXTRACT(YEAR FROM dt), 8, 31)]) last_august_31
-- ORDER BY dt
with result
Row dt last_august_31
1 2019-01-01 2018-08-31
2 2019-12-31 2019-08-31
3 2020-02-25 2019-08-31
with dates as (
select cast('2019-01-01' as date) as my_date union all select '2019-12-31' union all select current_date()
)
select
my_date,
date(extract(year from my_date) - case when extract(month from my_date) < 9 then 1 else 0 end, 8, 31) as prev_aug_31,
date(extract(year from my_date) + case when extract(month from my_date) >= 9 then 1 else 0 end, 8, 31) as next_aug_31
from dates
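For reuse across queries, the comparison can be wrapped in a temporary function. A sketch only; the name last_aug_31 is made up here. It treats August 31st itself as its own "last August 31st", like the IF-based query above. The CASE version that tests month < 9 instead returns the previous year's date on that day, so pick whichever rule you actually need:

```sql
#standardSQL
CREATE TEMP FUNCTION last_aug_31(d DATE) AS (
  -- Before this year's Aug 31: use last year's; otherwise use this year's
  IF(d < DATE(EXTRACT(YEAR FROM d), 8, 31),
     DATE(EXTRACT(YEAR FROM d) - 1, 8, 31),
     DATE(EXTRACT(YEAR FROM d), 8, 31))
);
SELECT d, last_aug_31(d) AS last_august_31
FROM UNNEST([DATE '2019-01-01', DATE '2019-08-31', DATE '2019-12-31']) AS d
ORDER BY d;
```

With these three inputs the function should return 2018-08-31, 2019-08-31 and 2019-08-31 respectively.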
I have multiple tables in BigQuery, and for each of them a SQL statement that gives me "the number of X per day". For example:
SELECT FORMAT_TIMESTAMP("%F",timestamp) AS day, COUNT(*) as installs
FROM database.table1
GROUP BY day
ORDER BY day ASC
Which would give the result:
| day | installs |
-------------------------
| 2017-01-01 | 11 |
| 2017-01-02 | 22 |
etc
Another statement:
SELECT FORMAT_TIMESTAMP("%F",timestamp) AS day, COUNT(*) as uninstalls
FROM database.table2
GROUP BY day
ORDER BY day ASC
Which would give the result:
| day | uninstalls |
---------------------------
| 2017-01-02 | 22 |
| 2017-01-03 | 33 |
etc
Another statement:
SELECT FORMAT_TIMESTAMP("%F",timestamp) AS day, COUNT(*) as cases
FROM database.table3
GROUP BY day
ORDER BY day ASC
Which would give the result:
| day | cases |
----------------------
| 2017-01-01 | 11 |
| 2017-01-03 | 33 |
etc
etc
Now I need to combine all these into a single SELECT statement that gives the following results:
| day | installs | uninstalls | cases |
----------------------------------------------
| 2017-01-01 | 11 | 0 | 11 |
| 2017-01-02 | 22 | 22 | 0 |
| 2017-01-03 | 0 | 33 | 33 |
etc
Is this even possible?
Or what's the closest SQL-statement I can write that would give me a similar result?
Any feedback is appreciated!
Here is a self-contained example that might help to get you started. It uses two dummy tables, InstallEvents and UninstallEvents, which contain timestamps for the respective actions. It creates a common table expression called StartAndEnd that computes the minimum and maximum dates for these events in order to decide which dates to aggregate over, then unions the contents of the InstallEvents and UninstallEvents, counting the events for each day.
WITH InstallEvents AS (
SELECT TIMESTAMP_ADD('2017-01-01 00:00:00', INTERVAL x HOUR) AS timestamp
FROM UNNEST(GENERATE_ARRAY(0, 100)) AS x
),
UninstallEvents AS (
SELECT TIMESTAMP_ADD('2017-01-02 00:00:00', INTERVAL 2 * x HOUR) AS timestamp
FROM UNNEST(GENERATE_ARRAY(0, 50)) AS x
),
StartAndEnd AS (
SELECT MIN(DATE(timestamp)) AS min_date, MAX(DATE(timestamp)) AS max_date
FROM (
SELECT * FROM InstallEvents UNION ALL
SELECT * FROM UninstallEvents
)
)
SELECT
day,
COUNTIF(is_install AND DATE(timestamp) = day) AS installs,
COUNTIF(NOT is_install AND DATE(timestamp) = day) AS uninstalls
FROM (
SELECT *, true AS is_install
FROM InstallEvents UNION ALL
SELECT *, false
FROM UninstallEvents
)
CROSS JOIN UNNEST(GENERATE_DATE_ARRAY(
(SELECT min_date FROM StartAndEnd),
(SELECT max_date FROM StartAndEnd)
)) AS day
GROUP BY day
ORDER BY day;
If you know what the start and end dates are in advance, you can hard-code them in the query instead and then omit the StartAndEnd CTE:
WITH InstallEvents AS (
SELECT TIMESTAMP_ADD('2017-01-01 00:00:00', INTERVAL x HOUR) AS timestamp
FROM UNNEST(GENERATE_ARRAY(0, 100)) AS x
),
UninstallEvents AS (
SELECT TIMESTAMP_ADD('2017-01-02 00:00:00', INTERVAL 2 * x HOUR) AS timestamp
FROM UNNEST(GENERATE_ARRAY(0, 50)) AS x
)
SELECT
day,
COUNTIF(is_install AND DATE(timestamp) = day) AS installs,
COUNTIF(NOT is_install AND DATE(timestamp) = day) AS uninstalls
FROM (
SELECT *, true AS is_install
FROM InstallEvents UNION ALL
SELECT *, false
FROM UninstallEvents
)
CROSS JOIN UNNEST(GENERATE_DATE_ARRAY('2017-01-01', '2017-01-04')) AS day
GROUP BY day
ORDER BY day;
To see the events in the sample data, use a query that unions the contents:
WITH InstallEvents AS (
SELECT TIMESTAMP_ADD('2017-01-01 00:00:00', INTERVAL x HOUR) AS timestamp
FROM UNNEST(GENERATE_ARRAY(0, 100)) AS x
),
UninstallEvents AS (
SELECT TIMESTAMP_ADD('2017-01-02 00:00:00', INTERVAL 2 * x HOUR) AS timestamp
FROM UNNEST(GENERATE_ARRAY(0, 50)) AS x
)
SELECT timestamp, true AS is_install
FROM InstallEvents UNION ALL
SELECT timestamp, false
FROM UninstallEvents;
Below is for BigQuery Standard SQL
#standardSQL
WITH calendar AS (
SELECT day
FROM (
SELECT MIN(min_day) AS min_day, MAX(max_day) AS max_day
FROM (
SELECT MIN(DATE(timestamp)) AS min_day, MAX(DATE(timestamp)) AS max_day FROM `database.table1` UNION ALL
SELECT MIN(DATE(timestamp)) AS min_day, MAX(DATE(timestamp)) AS max_day FROM `database.table2` UNION ALL
SELECT MIN(DATE(timestamp)) AS min_day, MAX(DATE(timestamp)) AS max_day FROM `database.table3`
)
), UNNEST(GENERATE_DATE_ARRAY(min_day, max_day, INTERVAL 1 DAY)) AS day
)
SELECT
c.day AS day,
IFNULL(SUM(installs), 0) AS installs,
IFNULL(SUM(uninstalls), 0) AS uninstalls,
IFNULL(SUM(cases),0) AS cases
FROM calendar AS c
LEFT JOIN (SELECT DATE(timestamp) day, COUNT(1) installs FROM `database.table1` GROUP BY day) t1 ON t1.day = c.day
LEFT JOIN (SELECT DATE(timestamp) day, COUNT(1) uninstalls FROM `database.table2` GROUP BY day) t2 ON t2.day = c.day
LEFT JOIN (SELECT DATE(timestamp) day, COUNT(1) cases FROM `database.table3` GROUP BY day) t3 ON t3.day = c.day
GROUP BY day
HAVING installs + uninstalls + cases > 0
-- ORDER BY day
Please note: you are using timestamp as a column name, which is not best practice because it is a keyword. In my example I kept your naming, but consider changing it!
You can test and play with this solution using the dummy data below:
#standardSQL
WITH `database.table1` AS (
SELECT TIMESTAMP '2017-01-01' AS timestamp, 1 AS installs
UNION ALL SELECT TIMESTAMP '2017-01-01', 22
),
`database.table2` AS (
SELECT TIMESTAMP '2016-12-01' AS timestamp, 1 AS installs UNION ALL SELECT TIMESTAMP '2017-01-01', 22 UNION ALL SELECT TIMESTAMP '2017-01-01', 22 UNION ALL
SELECT TIMESTAMP '2017-01-02', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22
),
`database.table3` AS (
SELECT TIMESTAMP '2017-01-01' AS timestamp, 1 AS installs UNION ALL SELECT TIMESTAMP '2017-01-01', 22 UNION ALL SELECT TIMESTAMP '2017-01-01', 22 UNION ALL
SELECT TIMESTAMP '2017-01-10', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22 UNION ALL SELECT TIMESTAMP '2017-01-02', 22
),
calendar AS (
SELECT day
FROM (
SELECT MIN(min_day) AS min_day, MAX(max_day) AS max_day
FROM (
SELECT MIN(DATE(timestamp)) AS min_day, MAX(DATE(timestamp)) AS max_day FROM `database.table1` UNION ALL
SELECT MIN(DATE(timestamp)) AS min_day, MAX(DATE(timestamp)) AS max_day FROM `database.table2` UNION ALL
SELECT MIN(DATE(timestamp)) AS min_day, MAX(DATE(timestamp)) AS max_day FROM `database.table3`
)
), UNNEST(GENERATE_DATE_ARRAY(min_day, max_day, INTERVAL 1 DAY)) AS day
)
SELECT
c.day AS day,
IFNULL(SUM(installs), 0) AS installs,
IFNULL(SUM(uninstalls), 0) AS uninstalls,
IFNULL(SUM(cases),0) AS cases
FROM calendar AS c
LEFT JOIN (SELECT DATE(timestamp) day, COUNT(1) installs FROM `database.table1` GROUP BY day) t1 ON t1.day = c.day
LEFT JOIN (SELECT DATE(timestamp) day, COUNT(1) uninstalls FROM `database.table2` GROUP BY day) t2 ON t2.day = c.day
LEFT JOIN (SELECT DATE(timestamp) day, COUNT(1) cases FROM `database.table3` GROUP BY day) t3 ON t3.day = c.day
GROUP BY day
HAVING installs + uninstalls + cases > 0
ORDER BY day
I am not very familiar with BigQuery, so this is probably not going to be a copy-paste answer.
You'll first have to build a calendar table to make sure you have all dates. Here's an example for SQL Server. There are probably examples for BigQuery available as well. The following assumes a Calendar table with a Date attribute stored as a timestamp.
Once you have your calendar table you can join all your tables to that:
SELECT FORMAT_TIMESTAMP("%F",C.Date) AS day
, COUNT(T1.TIMESTAMP) AS installs --COUNT skips the NULLs that unmatched LEFT JOIN rows produce
, COUNT(T2.TIMESTAMP) AS uninstalls
FROM Calendar C
LEFT JOIN database.table1 T1
ON DATE(T1.TIMESTAMP) = DATE(C.Date) --Convert to date to remove times, you could also use your FORMAT_TIMESTAMP
LEFT JOIN database.table2 T2
ON DATE(T2.TIMESTAMP) = DATE(C.Date)
--Caveat: on a day with rows in both tables the two joins multiply each other,
--so pre-aggregate each table per day first if the counts look inflated
GROUP BY day
ORDER BY day ASC
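If days with no events at all do not need to appear (the expected output in the question only lists days that have at least one event), a calendar is not strictly required. A minimal sketch, assuming each table has a timestamp column as in the question; the inline rows and the events and kind names are made up for illustration. It stacks the tables with UNION ALL, tags each row with its source, and counts per tag:

```sql
#standardSQL
WITH table1 AS (
  SELECT TIMESTAMP '2017-01-01 10:00:00' AS timestamp UNION ALL
  SELECT TIMESTAMP '2017-01-02 11:00:00'
), table2 AS (
  SELECT TIMESTAMP '2017-01-02 12:00:00' AS timestamp UNION ALL
  SELECT TIMESTAMP '2017-01-03 13:00:00'
), table3 AS (
  SELECT TIMESTAMP '2017-01-01 14:00:00' AS timestamp UNION ALL
  SELECT TIMESTAMP '2017-01-03 15:00:00'
), events AS (
  -- one row per event, tagged with the table it came from
  SELECT DATE(timestamp) AS day, 'install' AS kind FROM table1 UNION ALL
  SELECT DATE(timestamp), 'uninstall' FROM table2 UNION ALL
  SELECT DATE(timestamp), 'case' FROM table3
)
SELECT
  day,
  COUNTIF(kind = 'install') AS installs,     -- 0 when the day has no installs
  COUNTIF(kind = 'uninstall') AS uninstalls,
  COUNTIF(kind = 'case') AS cases
FROM events
GROUP BY day
ORDER BY day;
```

Because every event is a single row before grouping, no join can multiply the counts. To also list days without any events, join the result to a calendar built with GENERATE_DATE_ARRAY, as in the answers above.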
I have a table that looks like this:
+--------------------+---------+
| Month (date) | amount |
+--------------------+---------+
| 2016-10-01 | 20 |
| 2016-08-01 | 10 |
| 2016-07-01 | 17 |
+--------------------+---------+
I'm looking for a query (sql statement) which satisfies the following conditions:
Give me the value of the previous month.
If there is no value for the previous month, look back in time until one can be found.
If there is just a value for the current month give me this value.
In the example table the row I'm looking for would be this:
+--------------------+---------+
| 2016-08-01 | 10 |
+--------------------+---------+
Does anyone have an idea for a simple select query?
Thanks in advance,
Peter
You may need the following:
SELECT *
FROM ( SELECT *
FROM test
WHERE TRUNC(SYSDATE, 'month') >= month
ORDER BY CASE
WHEN TRUNC(SYSDATE, 'month') = month
THEN 0 /* if current month, ordered last */
ELSE 1 /* previous months are ordered first */
END DESC,
month DESC /* among previous months, the greatest first */
)
WHERE ROWNUM = 1
Another way using MAX
WITH tbl AS (
SELECT TO_DATE('2016-10-01', 'YYYY-MM-DD') AS "month", 20 AS amount FROM dual
UNION
SELECT TO_DATE('2016-08-01', 'YYYY-MM-DD') AS "month", 10 AS amount FROM dual
UNION
SELECT TO_DATE('2016-07-01', 'YYYY-MM-DD') AS "month", 5 AS amount FROM dual
)
SELECT *
FROM tbl
WHERE TRUNC("month", 'MONTH') = NVL((SELECT MAX(t."month")
FROM tbl t
WHERE t."month" < TRUNC(SYSDATE, 'MONTH')),
TRUNC(SYSDATE, 'MONTH'));
I would use row_number():
select t.*
from (select t.*,
row_number() over (order by (case when to_char(dte, 'YYYY-MM') = to_char(sysdate, 'YYYY-MM') then 1 else 2 end) desc,
dte desc
) as seqnum
from t
) t
where seqnum = 1;
Actually, you don't need row_number() for this:
select t.*
from (select t.*
from t
order by (case when to_char(dte, 'YYYY-MM') = to_char(sysdate, 'YYYY-MM') then 1 else 2 end) desc,
dte desc
) t
where rownum = 1;
It's not the nicest query, but it should work.
select amount, dt from (
select amount, dt, row_number() over(partition by HERE_PUT_ID order by
case trunc(dt, 'month') when trunc(sysdate, 'month') then to_date('00010101', 'yyyymmdd') else trunc(dt, 'month') end
desc) r
from your_table) -- your_table and dt are placeholders for your table and its month column
where r = 1;
I guess you have some id column in the table, so put it in place of HERE_PUT_ID. If you want the query to run over the whole table, just delete: partition by HERE_PUT_ID
I added more data for testing, and an "id" column (a more realistic scenario) to show how this would work. If there is no "id" in your data, simply delete any reference to it from the solution.
Notes: month is an Oracle keyword, so avoid using it as a column name. The solution assumes the date column contains dates that are already truncated to the beginning of the month. The trick in the "order by" of the dense_rank last is to assign a value (ANY value!) when the month is the current month; the value assigned to all other months is NULL, which by default comes after any non-null value in ascending order.
You may want to test the various solutions for efficiency if execution time is important.
with
inputs ( id, mth, amount ) as (
select 1, date '2016-10-01', 20 from dual union all
select 1, date '2016-08-01', 10 from dual union all
select 1, date '2016-07-01', 17 from dual union all
select 2, date '2016-10-01', 30 from dual union all
select 2, date '2016-09-01', 25 from dual union all
select 3, date '2016-10-01', 20 from dual union all
select 4, date '2016-08-01', 45 from dual union all
select 4, date '2016-06-01', 30 from dual
)
-- end of TEST DATA - the solution (SQL query) is below this line
select id,
max(mth) keep(dense_rank last order by
case when mth = trunc(sysdate, 'mm') then 0 end, mth) as mth,
max(amount) keep(dense_rank last order by
case when mth = trunc(sysdate, 'mm') then 0 end, mth) as amount
from inputs
group by id
order by id -- ORDER BY is optional
;
ID MTH AMOUNT
--- ---------- -------
1 2016-08-01 10
2 2016-09-01 25
3 2016-10-01 20
4 2016-08-01 45
You could sort the data in the direction you want to:
with MyData as
(
SELECT to_date('2016-10-01','YYYY-MM-DD') MY_DATE, 20 AMOUNT FROM DUAL UNION
SELECT to_date('2016-08-01','YYYY-MM-DD') MY_DATE, 10 AMOUNT FROM DUAL UNION
SELECT to_date('2016-07-01','YYYY-MM-DD') MY_DATE, 17 AMOUNT FROM DUAL
),
MyResult AS (
SELECT
D.*
FROM MyData D
ORDER BY
DECODE(
12*TO_CHAR(MY_DATE,'YYYY') + TO_CHAR(MY_DATE,'MM'),
12*TO_CHAR(SYSDATE,'YYYY') + TO_CHAR(SYSDATE,'MM'),
-1,
12*TO_CHAR(MY_DATE,'YYYY') + TO_CHAR(MY_DATE,'MM'))
DESC
)
SELECT * FROM MyResult WHERE RowNum = 1
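From Oracle 12c on, the row-limiting clause avoids the extra subquery around ROWNUM. A sketch under the same assumptions as above (dates already truncated to the first of the month); my_data, my_date and amount are just the sample names reused:

```sql
WITH my_data (my_date, amount) AS (
  SELECT DATE '2016-10-01', 20 FROM DUAL UNION ALL
  SELECT DATE '2016-08-01', 10 FROM DUAL UNION ALL
  SELECT DATE '2016-07-01', 17 FROM DUAL
)
SELECT my_date, amount
FROM   my_data
WHERE  my_date <= TRUNC(SYSDATE, 'MM')   -- ignore future months
ORDER BY
       -- previous months sort first, the current month last
       CASE WHEN my_date = TRUNC(SYSDATE, 'MM') THEN 1 ELSE 0 END,
       -- among previous months, the most recent first
       my_date DESC
FETCH FIRST 1 ROW ONLY;
```

The current month's row is returned only when no earlier month exists, which is the fallback the question asks for.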