Rolling months data by year_month in obiee - sql

I need to know how to create rolling months in OBIEE. If I select Jan 2017, it should show data from Feb 2016 onward, i.e. the previous 12 months.

You will need a properly configured time dimension. Once you have that, all the time series functions are at your disposal and will work immediately.
https://gerardnico.com/wiki/dat/obiee/obis/time_dimension
https://gerardnico.com/wiki/dat/obiee/obis/logical_sql/function_time

Related

Group data by weeks since the start of event in sql

I’m a data analyst in the insurance industry and we currently have a program in SAS EG that tracks catastrophe development week by week since the start of the event for all of the catastrophic events that are reported.
(I.e., week 1 runs from the catastrophe start date to start date + 7 days, week 2 runs from the end of week 1 to the end of week 1 + 7 days, and so on.) All transaction amounts (dollars) for a specific catastrophe are then grouped into the respective weeks based on the date each transaction was made.
The problem we face is that we are moving away from SAS EG to GCP BigQuery, and the current process calculates those weeks from a manually read-in list, which isn't very efficient and doesn't translate easily to BigQuery.
Curious if anybody has an idea that would allow me to calculate each week number in periods of 7 days since the start of an event in SQL or has an idea specific for BigQuery? There would be different start dates for each event.
It is complex, I know and I’m willing to give more explanation as needed. Open to any ideas for this as I haven’t been able to find anything.
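The week bucketing described in this question comes down to integer division of the day offset by 7. Below is a minimal Python sketch of that arithmetic. It assumes that days 0 to 6 after the event start count as week 1, days 7 to 13 as week 2, and so on. The function and parameter names are made up for illustration.

```python
from datetime import date


def development_week(event_start: date, transaction_date: date) -> int:
    """Return the 1-based week number of a transaction relative to its event.

    Assumption: days 0-6 after the start are week 1, days 7-13 are week 2, ...
    """
    days_since_start = (transaction_date - event_start).days
    if days_since_start < 0:
        raise ValueError("transaction predates the event start")
    # Integer division groups the day offsets into consecutive 7-day periods.
    return days_since_start // 7 + 1
```

In BigQuery the same idea could probably be written per row as `DIV(DATE_DIFF(transaction_date, event_start, DAY), 7) + 1`, where `transaction_date` and `event_start` are hypothetical column names. Because each event carries its own start date, no lookup list of week boundaries is needed.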

BigQuery UI - How to schedule a query to run on the last day of the month?

I have a query that pulls a summary of metrics for the past month that needs to run on the last day of each month at a set time.
The BigQuery 'Schedule query' UI allows you to choose a date and time to run a query each month, but there is no apparent option to choose the last day of the month.
If I simply choose the 31st of the month, what happens if there are only 30 days in that month? Will the query still run?
Or do I have to schedule a query to run on the 27th, 28th, 29th, 30th, 31st of the month to make sure that I don't miss the correct date?
I can't find any mention of this situation in the BigQuery documentation or online so any help/suggestions will be very gratefully received.
I understand that you need to run your query on the last day of each month. However, if you set the 31st of the month in the schedule options, the query will skip the months that do not have 31 days.
You can verify this with the following test in the BigQuery UI:
In the Schedule query options, under Schedule options, set:
Repeats: Monthly
On the: 31
Start date and run time: set the date to 10th of June (which is a month with 30 days)
Click Schedule
On the left side of the BigQuery UI, click on Scheduled queries
Check the query you saved. Among other details, it should display Next Scheduled.
It will show July 31.
As you can see, it skipped the 30th of June. Thus, when you configure your query to run on the 31st of each month, it ignores the months with fewer than 31 days. For this reason, I would advise you to schedule it on the 27th, 28th, 29th, 30th and 31st of each month so that a run always falls on the last day.
As a bonus, you can set up a Custom schedule option, with syntax such as "1st monday of month 00:00" or "every monday 00:00".
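When the query is scheduled on several candidate days, something still has to decide which run is the real month-end run. A minimal sketch of that check in Python follows. The helper name is hypothetical, and in practice the same test would more likely live inside the scheduled query itself.

```python
import calendar
from datetime import date


def is_last_day_of_month(day: date) -> bool:
    """Return True when `day` is the final calendar day of its month."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    return day.day == days_in_month
```

A run on June 29 would then be skipped, while runs on June 30 or February 28 of a non-leap year would go ahead.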

Organizing per-week and daily data in SQL

Problem overview
I'm working on a simple app for reminding the user of weekly goals. Let's say the goal is to do 30 minutes of exercise on specific days of the week.
Sample goal: do exercise on Mon, Wed, Fri.
The app also needs to track past record, i.e. dates when the user did exercise. It could be just dates, e.g.: 2019-09-02, 2019-09-05, 2019-09-11 means the user did exercise on these days and did not on the others (doesn't need to be on "exercise goal" days of the week).
The goal can change in time. Let's say today is 2019-09-11 and the goal for this week ([2019-09-09, 2019-09-15]) is Mon, Wed, Fri but from 2019-08-05 to 2019-09-08 it was Mon, Thu (repeatedly for all these weeks).
I need to store these week-oriented goals and historic exercise of data and be able to retrieve the following:
The goal days for the current week (or any week, let's say I can compute start and end day for any week given a date).
Exercise history for a larger range of days together with goal days for that range (e.g. to show when the user was supposed to exercise and when they actually did in the last month).
Question
How to best store this data in SQL.
This is a little bit academic because I'm working on a small Android app and the data is just for a single user. So there will be little data and I can successfully use any approach, even a very clumsy one will be efficient enough.
However, I'd like to explore the topic and maybe learn a thing or two.
Possible solutions
Here are two approaches that come to my mind.
In both cases I would store exercise history as a table of dates. If there is an entry for that date it means the user did exercise on that day.
It's the goal storage that is interesting.
Approach 1
Store the goals per-week (it's SQLite so dates are stored as strings - all dates are just 'YEAR-MONTH-DAY'):
CREATE TABLE goals (
start_date TEXT,
exercise_days TEXT);
"start_date" is the first day of the week,
"exercise_days" is a comma-separated list of weekdays (let's say numbers 1-7).
So for the example above we might have two rows:
'2019-08-05', '1,4'
'2019-09-09', '1,3,5'
meaning that since 2019-08-05 the goal is Mon, Thu for all weeks until 2019-09-09, when the goal becomes Mon, Wed, Fri. So there is a gap in the data. I wouldn't want to generate data for weeks starting on 2019-08-12, 2019-08-19, 2019-08-26.
With this approach it is easy to work with the data week-wise. The current goal is the row with MAX(start_date). The goal for the week containing a given date is the row with MAX(start_date) WHERE start_date <= :date.
However it gets cumbersome when I want to get data for the last 3 months and show the user their progress.
Or maybe I want to show the user the percentage of actual exercise days to what they set as their goal in a year.
In this case it seems the best approach is to fetch the data separately and merge it in the application (or maybe write some complex queries), processing week by week. This is ok performance-wise because the amount of data is small and I rarely need more than a handful of weeks.
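The week lookup described for Approach 1 can be sketched with Python's built-in sqlite3 module. The table and the two sample rows come from the text above. The helper name `goal_for` is made up for illustration. Instead of MAX it orders by start_date and takes the first row, which gives the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goals (start_date TEXT, exercise_days TEXT)")
conn.executemany(
    "INSERT INTO goals VALUES (?, ?)",
    [("2019-08-05", "1,4"), ("2019-09-09", "1,3,5")],
)


def goal_for(conn, day):
    """Return the exercise_days string in force on `day`, or None."""
    # ISO 'YYYY-MM-DD' strings sort chronologically, so plain text
    # comparison is enough to find the latest goal at or before `day`.
    row = conn.execute(
        "SELECT exercise_days FROM goals "
        "WHERE start_date <= :day "
        "ORDER BY start_date DESC LIMIT 1",
        {"day": day},
    ).fetchone()
    return row[0] if row else None
```

Any date in the "gap" weeks, such as 2019-08-20, resolves to the goal that started on 2019-08-05 without rows being generated for those weeks.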
Approach 2
Store goals in such a way that each goal day is a record:
CREATE TABLE goals (
day TEXT
);
"day" is a day when the user should exercise. So for the week starting 2019-09-09 (Mon, Wed, Fri) we would have:
'2019-09-09'
'2019-09-11'
'2019-09-13'
and for the week starting 2019-08-05 (Mon, Thu) we would have:
'2019-08-05'
'2019-08-08'
but what for the weeks in-between?
If my app could fill all the weeks in-between then it would be easy to merge this data with the exercise history and display days when the user was supposed to exercise and when they actually did. Extracting the goal for any given week would also be easy.
The problem is: this requires the app to generate data for the "gap" weeks even if the user doesn't tweak the goal. This can be implemented as a transaction that is run each time the app process starts. In some cases it could take noticeable time for occasional users of the app (think progress bar for a second).
Maybe there is a smart way to generate the data in-between when making a SELECT query?
I don't like the fact that it requires generating data. I do like the fact that I can just join the tables and then process that (e.g. compute how many exercise days there were supposed to be in August and how many days the user did actually exercise and then show them percentage like "you did 85% of your goal" - in fact I can do this without joining the tables).
Also, it seems this approach gives me more flexibility for analysis in the future.
But is there a third way? Or maybe I am overthinking this? :)
(I am asking mostly for the way of organizing the data, there's no need for exact SQL queries)
Perhaps I'm over-thinking this, but if a goal can have multiple components to it, and can change over time I'd have a goal header record, with the ID, name and other data about the goal as a whole, and then a separate table linked with the components of that goal which are time-boxed, for example:
CREATE TABLE goal_days (goal_day_ID INT,
goal_ID INT,
day_ID INT,
target_minutes INT,
start_date TEXT,
end_date TEXT)
I'd have thought that allows you to easily check against the history to map against each day of the goal - e.g. they got 100% of the Mondays, but kept missing Thursday - however when the goal was changed to Friday instead they got better.
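As a rough illustration of how the time-boxed goal_days rows could be matched against history, here is a sketch using Python's sqlite3 module. Several things in it are assumptions rather than part of the answer: the `exercise_log` table, day_ID running from 1 (Monday) to 7 (Sunday), a NULL end_date meaning "still in force", and the name `planned_vs_done`. The days in the requested range are generated on the fly with a recursive CTE, so no "gap" rows need to be stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE goal_days (
    goal_day_ID INTEGER PRIMARY KEY,
    goal_ID INT, day_ID INT, target_minutes INT,
    start_date TEXT, end_date TEXT
);
-- Hypothetical history table: one row per day the user exercised.
CREATE TABLE exercise_log (day TEXT PRIMARY KEY);

INSERT INTO goal_days (goal_ID, day_ID, target_minutes, start_date, end_date)
VALUES (1, 1, 30, '2019-08-05', '2019-09-08'),
       (1, 4, 30, '2019-08-05', '2019-09-08'),
       (1, 1, 30, '2019-09-09', NULL),
       (1, 3, 30, '2019-09-09', NULL),
       (1, 5, 30, '2019-09-09', NULL);
INSERT INTO exercise_log VALUES ('2019-09-02'), ('2019-09-05'), ('2019-09-11');
""")

PLANNED_VS_DONE = """
WITH RECURSIVE calendar_days(day) AS (
    SELECT :first_day
    UNION ALL
    SELECT date(day, '+1 day') FROM calendar_days WHERE day < :last_day
)
SELECT c.day, e.day IS NOT NULL AS done
FROM calendar_days AS c
JOIN goal_days AS g
  ON c.day BETWEEN g.start_date AND COALESCE(g.end_date, '9999-12-31')
  -- strftime('%w') is 0 for Sunday; shift it to 1 = Monday .. 7 = Sunday.
 AND g.day_ID = (CAST(strftime('%w', c.day) AS INTEGER) + 6) % 7 + 1
LEFT JOIN exercise_log AS e ON e.day = c.day
ORDER BY c.day
"""


def planned_vs_done(conn, first_day, last_day):
    """Return (goal day, 1 if exercised else 0) for every goal day in range."""
    params = {"first_day": first_day, "last_day": last_day}
    return conn.execute(PLANNED_VS_DONE, params).fetchall()
```

Calling `planned_vs_done(conn, "2019-09-02", "2019-09-15")` spans the goal change on 2019-09-09 and returns one row per planned day. Percentages such as "you did 85% of your goal" can then be computed from the `done` column.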

Bigquery - Table decorators changed weirdly

I used to have a number of queries running on the past 40 days of data using a decorator with [dataset.table#-4123456789-].
However, since September 15 all the decorators return a maximum of 10 days of data.
By the way, [dataset.table#0] returns the whole table and not the past 7 days as stated in the documentation.
Does anyone know what is going on? Do I have to move my table to a partitioned table in order to query a limited period of time that is longer than a week?
Thanks

Logs from firebase to Bigquery in the wrong table

I'm using Firebase to register some events from an iOS/Android app and log them into BigQuery. As I understood from the documentation, BigQuery creates a different table each day in order to store the events of the single day.
Each day, Firebase Analytics creates a new table in the BigQuery dataset corresponding to the app. The tables are named using the pattern app_events_YYYYMMDD and contain the events recorded for the specified day.
However I'm getting some events in a certain day registered in the table of the following day. For example the table app_events_20160727 contains some events from July 26th, the table app_events_20160728 contains some events from July 27th.
Am I missing something?
Thanks for your support
Sep, 14 Update
I'll try to better explain the issue through an example: the events recorded in the first part of the day (let's say until 3PM/4PM but I don't see any pattern) are collected in the table of that day, the events of the last part of the day are collected in the table of the following day.
So, let's take the events of Sep 12. Below are screenshots of the first and last entries of the tables for Sep 12 and Sep 13:
[Screenshots: first entries of Sep 13; last entries of Sep 13; first entries of Sep 12; last entries of Sep 12]
As you can see, the events from Sep, 12 are split into two tables.
Thanks for your support.
Firebase registers the timestamp of when the event was tracked client side.
This is likely to happen in the following scenario:
You trigger an event while offline, on day N.
Your user reconnects to the internet only the following day, day N+1 (or the day after).
Thus Firebase receives the event of day N on day N+1.
During day N, Firebase exports all the events it received (server side) on day N. On day N+1 it exports all the events it received on day N+1, even the ones actually tracked client side on day N but not sent to the server on day N.
I'm not sure this explanation is clear; can you tell me whether it is?
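The scenario above can be modelled in a few lines of Python. This sketch only restates the explanation given here and is not confirmed Firebase behaviour. It assumes the daily table is chosen by the server-side receipt date, ignores time zones, and uses a made-up function name.

```python
from datetime import datetime


def export_table_name(received_at: datetime) -> str:
    """Daily table name, assuming it is picked by the server receipt date."""
    return "app_events_" + received_at.strftime("%Y%m%d")


# An event tracked offline late on July 26 and uploaded the next morning.
tracked_at = datetime(2016, 7, 26, 22, 0)
received_at = datetime(2016, 7, 27, 8, 0)
table = export_table_name(received_at)
```

Under that assumption the event keeps its July 26 client timestamp but lands in app_events_20160727, which matches what the question reports.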