BigQuery - Table decorators changed weirdly - google-bigquery

I used to have a number of queries running on the past 40 days of data using a decorator such as [dataset.table@-4123456789-].
However, since September 15 all the decorators return at most 10 days of data.
By the way, [dataset.table@0] returns the whole table and not the past 7 days as stated in the documentation.
Does anyone know what is going on? Do I have to move my data to a partitioned table in order to query a limited period of time that is longer than a week?
Thanks
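For reference, the offset in a relative range decorator is a number of milliseconds before the current time. The sketch below only shows that arithmetic. The function name `relative_range_decorator` is hypothetical, the `@` separator is the one used in the legacy SQL documentation, and building the string does not change how much history BigQuery actually retains.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def relative_range_decorator(table: str, days_back: int) -> str:
    """Build a legacy SQL table reference for data added in the last `days_back` days.

    Hypothetical helper: it only formats the string; whether BigQuery honors
    the full range depends on the service's own retention limits.
    """
    if days_back <= 0:
        raise ValueError("days_back must be positive")
    offset_ms = days_back * MS_PER_DAY
    # A negative start offset is relative to "now"; the trailing '-' leaves the end open.
    return f"[{table}@-{offset_ms}-]"


print(relative_range_decorator("dataset.table", 7))  # [dataset.table@-604800000-]
```

With this arithmetic, 40 days corresponds to an offset of 3456000000 ms.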

Related

Group data by weeks since the start of event in sql

I’m a data analyst in the insurance industry and we currently have a program in SAS EG that tracks catastrophe development week by week since the start of the event for all of the catastrophic events that are reported.
(I.e., week 1 is the catastrophe start date + 7 days, week 2 is the end of week 1 + 7 days, and so on.) All transaction amounts (dollars) for a specific catastrophe are then grouped into the respective weeks based on the date each transaction was made.
The problem we’re faced with is that we are moving away from SAS EG to GCP BigQuery, and the current process of calculating those weeks relies on a manually read-in list, which isn’t very efficient and is not easily translated to BigQuery.
I'm curious whether anybody has an idea that would allow me to calculate each week number in periods of 7 days since the start of an event in SQL, or an idea specific to BigQuery. There would be a different start date for each event.
It is complex, I know, and I’m willing to give more explanation as needed. I'm open to any ideas, as I haven’t been able to find anything.
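The week calculation described above reduces to integer division of the day difference by 7. Here is a minimal Python sketch of that arithmetic, assuming days 0 to 6 after the start date count as week 1 (the question leaves the exact boundary open). The names `week_number` and `totals_by_week` and the tuple layout are hypothetical. In BigQuery standard SQL the same idea could probably be written with `DATE_DIFF(txn_date, start_date, DAY)` and `DIV(..., 7) + 1`, joined to a table of event start dates.

```python
from collections import defaultdict
from datetime import date


def week_number(event_start: date, txn_date: date) -> int:
    """Return the 1-based week since the event start (assumes days 0-6 are week 1)."""
    days = (txn_date - event_start).days
    if days < 0:
        raise ValueError("transaction predates the event start")
    return days // 7 + 1


def totals_by_week(event_starts, transactions):
    """Sum amounts per (event_id, week).

    event_starts: dict mapping event_id -> start date (hypothetical layout)
    transactions: iterable of (event_id, txn_date, amount) tuples
    """
    totals = defaultdict(float)
    for event_id, txn_date, amount in transactions:
        week = week_number(event_starts[event_id], txn_date)
        totals[(event_id, week)] += amount
    return dict(totals)
```

Because each event carries its own start date, no hand-maintained list of week boundaries is needed.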

Select Data between current time - 15 mins and current time in SQL

I am looking to pull data between two points in time that are only 15 to 30 minutes apart. I want to be able to rerun the code multiple times to keep updating the data I have already pulled. I know there is a function for the current system time, but I am unable to use it effectively in SQL Developer.
I have tried using the function CURRENT_TIMESTAMP but could not get it to work effectively.
Currently I am using the following code and just pulling over a broad time frame, but I would like to shrink that down to 15 to 30 minute intervals that could be used to keep pulling updated data.
I expect to be able to pull current data within 15 to 30 minute segments of time.
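One way to think about the rolling window is to compute both bounds once per run and pass them to the query as parameters. This is a sketch under assumptions: the helper name `recent_window` is hypothetical, and how the bounds are bound into the query depends on the database client in use.

```python
from datetime import datetime, timedelta, timezone


def recent_window(minutes=15, now=None):
    """Return (start, end) covering the last `minutes` minutes, in UTC.

    `now` can be supplied for repeatable runs; otherwise the current time is used.
    """
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    return start, end


start, end = recent_window(15)
# The bounds would then be bound to a filter such as
# "ts >= :start AND ts < :end" (ts is a hypothetical column name).
print(start.isoformat(), end.isoformat())
```

Using a half-open interval (start inclusive, end exclusive) avoids pulling the same row twice when consecutive runs share a boundary.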

Has Google just changed their historical stock price interface (again)?

For years I've been using webpage requests like the following to retrieve 20 days at a time of minutewise stock data from Google:
http://www.google.com/finance/getprices?q=.INX&i=60&p=20d&f=d,c,h,l,o,v
= Retrieve for .INX (S&P 500 index) 60-second interval data for the last 20 days, with format Datetime(in Unix format), Close, High, Low, Open, Volume.
The Datetime is in Unix format (seconds since 1/1/1970, prefixed with an "A") for the first entry of each day, and subsequent entries show the intervals that have passed (so 1 = 60 seconds after the opening of the market that day).
That worked up until 9/10/2017, but today (9/17) it only returns day-end data (it even reports the "interval" between samples as 86400). Pooey! I can get that anywhere, in bulk.
But if I ask for fewer days, or broader intervals, it seems to return data - but weird data. Asking for data every 120 seconds returns exactly that - but only for every other market day. Weird!
Has anyone got a clue what might have happened?
Whoa! I think I figured it out.
Google still returns minutewise data with approximately the same limitations (up to 20 calendar days), but instead of d=10 returning all the market data for the last 10 calendar days, it returns the data for the last 10 market days. Previously, to get the last 10 market days you would ask for d=14 (M-F x 2, plus two weekends). Now, Google interprets the d variable as market days, and asking for d=20 exceeds the limits on what they will deliver.
It now appears that d=15 is the limit (three weeks of market days). No clue on why I got the very weird every-other-day data for a while... but maybe if you exceed their d-limits the intervals get screwy. Dunno. Don't care. Easy fix.
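The timestamp scheme described above (a full Unix time prefixed with "A" on the first row of each day, then interval counts on later rows) can be decoded as follows. This is a sketch based only on that description: the function name `parse_getprices` is hypothetical, rows are assumed to be comma-separated in the order Datetime, Close, High, Low, Open, Volume, and any line that does not fit is skipped.

```python
from datetime import datetime, timezone


def parse_getprices(lines, interval_seconds):
    """Decode rows into (utc_datetime, close, high, low, open, volume) tuples."""
    rows = []
    day_start = None  # Unix time of the most recent "A"-prefixed row
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 6:
            continue  # not a data row
        stamp = fields[0]
        if stamp[:1] in ("a", "A") and stamp[1:].isdigit():
            # First row of a day: absolute Unix time after the prefix.
            day_start = int(stamp[1:])
            epoch = day_start
        elif stamp.isdigit() and day_start is not None:
            # Later rows: number of intervals elapsed since the day's first row.
            epoch = day_start + int(stamp) * interval_seconds
        else:
            continue  # header or metadata line
        close, high, low, open_, volume = (float(x) for x in fields[1:])
        when = datetime.fromtimestamp(epoch, tz=timezone.utc)
        rows.append((when, close, high, low, open_, volume))
    return rows
```

Passing the interval actually reported by the response (60, 120, or 86400 in the cases mentioned above) keeps the decoded times correct when the service changes what it returns.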

How to store availability information in SQL, including recurring items

So I'm developing a database for an agency that manages many relief staff.
Relief workers set their availability for each day in one of three categories (day, evening, night).
We also need to be able to set some part-time relief workers as busy on weekly, biweekly, and in one instance, on a 9-week rotation. Since we're already developing recurring patterns of availability here, we might as well also give the relief workers the option of setting recurring availability days.
We also need to be able to query the database, and determine if an employee is available for a given day.
But here's the gotcha - we need to be able to use change data capture. So I'm not sure if calculating availability is the best option.
My SQL prototype table looks like this:
TABLE Availability Day
employee_id_fk | workday (DATETIME) | day | eve | night (all booleans)| worksite_code_fk (can be null)
I'm really struggling to wrap my head around recurring events. I could create, say, a year's worth of availability days following a pattern in an 'x'-day cycle. But how far ahead of time do we store information? I can see running into problems when we reach the end of the data set.
I was thinking of storing say, 6 months of information, then adding a server side task that runs monthly to keep the tables updated with 6 months of data, but my intuition is telling me this is a bad fix.
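An alternative to materializing months of rows is to store one rule per pattern and evaluate it on demand. The sketch below assumes each rule has an anchor date (one known busy day) and a cycle length in days (7 for weekly, 14 for biweekly, 63 for the 9-week rotation). The name `recurs_on` is hypothetical, and this does not address the change data capture requirement.

```python
from datetime import date


def recurs_on(day: date, anchor: date, cycle_days: int) -> bool:
    """Return True if `day` lands on the repeating pattern that starts at `anchor`."""
    if cycle_days <= 0:
        raise ValueError("cycle_days must be positive")
    # Python's % always yields a non-negative result for a positive divisor,
    # so dates before the anchor are handled as well.
    return (day - anchor).days % cycle_days == 0


anchor = date(2024, 1, 1)
print(recurs_on(date(2024, 1, 15), anchor, 14))  # True: two weeks after the anchor
```

Since the rule has no end date, there is no "end of the data set" to run into; only exceptions to the pattern would need stored rows.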
For absolute flexibility in the future, and to keep the data from bloating, my first thought would be something like this:
Calendar Dimension Table - Build it for, say, 100 years (or however long you want) and include day-of-week information and similar attributes.
Time Dimension Table - Hours and minutes at whatever granularity you need (every 15 minutes, for example), but only for a single 24-hour period.
Shifts Table - 1 record per shift, e.g. Day, Evening, and Night.
Specific Availability Table - Related to Calendar and Time, with start and stop values. I recommend 1 record per day and per shift, so even if they choose a range of 7 days, split that into 1 record per day and 1 record per shift.
Recurring Availability Table - For day of week (1-7), month, week of year, or whatever else you can think of. Again, I am thinking 1 record per value, so if they are available Mondays and Tuesdays, that would be 2 rows, and multiple shifts would mean multiple rows.
Now, and here is perhaps the weird part: I would put an Available column on the Specific and Recurring Availability tables, maybe as a tinyint, and store something like 0 = not available, 1 = available, 2 = maybe available, 3 = available with notice.
If you want to take into account Availability with Notice you could add columns for that too such as x # of days. If you want full flexibility maybe that becomes a related table too.
The queries would be complex but you could use a stored procedure or a table valued function to handle it fairly routinely.
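As a rough illustration of the lookup such a procedure might perform, the sketch below resolves one (day, shift) pair from in-memory stand-ins for the Specific and Recurring tables. It assumes a specific entry overrides a recurring one, which the design above does not state. The names `Status` and `resolve` and the dictionary layouts are hypothetical.

```python
from datetime import date
from enum import IntEnum


class Status(IntEnum):
    """Codes for the Available column, following the values suggested above."""
    NOT_AVAILABLE = 0
    AVAILABLE = 1
    MAYBE = 2
    WITH_NOTICE = 3


def resolve(day, shift, specific, recurring, default=Status.NOT_AVAILABLE):
    """Return the status for one day and shift.

    specific:  dict keyed by (date, shift)
    recurring: dict keyed by (day of week 1-7 with Monday = 1, shift)
    Assumption: a specific row wins over a recurring row.
    """
    if (day, shift) in specific:
        return specific[(day, shift)]
    return recurring.get((day.isoweekday(), shift), default)
```

Keeping one row per value, as recommended above, is what makes the recurring lookup a simple key match.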

BigQuery: Why does Table Range Decorators return wrong result sometimes?

I've been using the Table Range Decorators feature daily since May in order to only query the data from the last 7 days in some of my tables.
For the past 2 weeks, I've noticed that some data is sometimes missing when I use that feature. For example, if I run a query to get the results for the last 7 days (by adding "@-604800000--1" to the table name), some data will be missing compared with a query on the whole table (without a table decorator).
I wonder what could explain this and if there is a fix coming soon to address this?
If this can help the BigQuery team, I've noticed that when using Table Decorators some data was missing for us for October 16th between around 16:00 and 20:00 UTC time.
For the BigQuery team, here are 2 job IDs where some data is missing: job_-xtL4PlIYhNjQ5weMnssvqDmd6U , job_9ASNxqq_swjCd1eMmiQ6SmPpxlQ
and 1 job id where data is correct(without decorators): job_QbcRwYGbQv0BZdHreQEvRlYh-mM
This is a known issue with table decorators containing a time range. Due to a bug in BigQuery, it is possible for certain time ranges to omit data that should be included within the time range.
We're working on a fix and plan to have it released next week. After this fix is deployed, time range decorators should again work as expected.
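To check which absolute window a relative range decorator covers (for example, to compare it with the period where rows were reported missing), the offsets can be decoded as below. This is a sketch with assumptions: negative values are taken as milliseconds before the current time, non-negative values as milliseconds since the Unix epoch, and a missing end means "now". The name `decorator_window` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches "@<start>-<end>" or "@<start>-"; each bound may carry a leading minus sign.
_RANGE = re.compile(r"^@(-?\d+)-(-?\d+)?$")
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decorator_window(decorator, now):
    """Return the (start, end) datetimes that a range decorator refers to."""
    match = _RANGE.match(decorator)
    if match is None:
        raise ValueError(f"not a range decorator: {decorator!r}")

    def to_datetime(value):
        ms = int(value)
        if ms < 0:
            return now + timedelta(milliseconds=ms)  # relative to "now"
        return _EPOCH + timedelta(milliseconds=ms)   # absolute, since the epoch

    start = to_datetime(match.group(1))
    end = to_datetime(match.group(2)) if match.group(2) is not None else now
    return start, end
```

For "@-604800000--1" the window starts exactly 7 days before `now` and ends 1 millisecond before it.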