fb-prophet: optimal tuning for seasonality - facebook-prophet

I am using prophet for alerting purposes.
In most cases it's a great algorithm.
Here is an example of some of my data (red) and where prophet triggers (blue).
I would like prophet to ignore these recurring drops.
I have tried to tune the algorithm with these hyperparameters:
interval_width
changepoint_prior_scale
seasonality_prior_scale
But I cannot get the algorithm to ignore these recurring drops.
Any pointers?

Related

Google Cloud ML: How can I enforce a pure grid-search for a hyperparameter tuning job

Google Cloud ML uses Bayesian optimisation to mitigate the curse of dimensionality. For some hyperparameter-tuning jobs, however, I want to enforce an exhaustive search over a grid of hyperparameters. How can I do this?
My motivation for enforcing a pure grid search is this: I have observed that a hyperparameter-tuning job whose hyperparameters are all of DISCRETE type evaluates the same combination of hyperparameters more than once, which I do not want. I suspect this is caused by the Bayesian optimisation, which is why I would like to enforce a pure grid search in those cases.
There is not currently an argument available to enforce a grid search.
The best workaround currently is probably to submit multiple jobs, each with a specific set of hyperparameter values. This can be done without changing the code, as you can pass the values as user command-line arguments. You should be able to submit all the jobs in a loop, and Google Cloud ML will queue them if there are too many to run at once. The downside is that you'll have to work out for yourself which job performed best.
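As a rough sketch of the loop described above, the snippet below enumerates every combination of a discrete grid exactly once and builds the user command-line arguments for each job. The grid contents, the job-name prefix and the `--name=value` argument format are assumptions for illustration; the actual submission command is deliberately left as a placeholder.

```python
import itertools


def grid_jobs(grid, job_prefix="grid_job"):
    """Yield (job_name, user_args) for every combination in a discrete grid.

    `grid` maps a hyperparameter name to its list of candidate values.
    """
    names = sorted(grid)  # fixed order so job names are reproducible
    combos = itertools.product(*(grid[name] for name in names))
    for index, values in enumerate(combos):
        user_args = [f"--{name}={value}" for name, value in zip(names, values)]
        yield f"{job_prefix}_{index:03d}", user_args


# Hypothetical grid: the names and values are placeholders.
grid = {"learning_rate": [0.01, 0.1], "hidden_units": [32, 64, 128]}
jobs = list(grid_jobs(grid))

for job_name, user_args in jobs:
    # Replace this print with your own job-submission call, passing
    # user_args through as the trainer's command-line arguments.
    print(job_name, " ".join(user_args))
```

Because each combination is generated exactly once, no configuration is evaluated twice, and the job name can be used afterwards to match results back to hyperparameter values.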

Correcting SLAM drift error using GPS measurements

I'm trying to figure out how to correct drift errors introduced by a SLAM method using GPS measurements. I have two point sets in Euclidean 3D space, taken at fixed moments in time:
The red dataset comes from GPS and contains no drift errors, while the blue dataset comes from a SLAM algorithm and drifts over time.
The idea is that SLAM is accurate over short distances but eventually drifts, while GPS is accurate over long distances and inaccurate over short ones. So I would like to figure out how to fuse the SLAM data with the GPS data in a way that takes the best accuracy from both measurements. At the very least, how should I approach this problem?
Since your GPS looks like it is very locally biased, I'm assuming it is low-cost and doesn't use any correction techniques, e.g. that it is not differential. As you are probably aware, GPS errors are not Gaussian. The authors of this paper show that a good way to model GPS noise is as v+eps, where v is a locally constant "bias" vector (it is usually constant for a few meters, and then changes more or less smoothly or abruptly) and eps is Gaussian noise.
Given this information, one option would be to use Kalman-based fusion: you add the GPS noise and bias to the state vector, define your transition equations appropriately, and proceed as you would with an ordinary EKF. Note that if we ignore the prediction step of the Kalman filter, this is roughly equivalent to minimizing an error function of the form
measurement_constraints + some_weight * GPS_constraints
and that gives you a second, more straightforward option. For example, if your SLAM is visual, you can just use the sum of squared reprojection errors (i.e. the bundle adjustment error) as the measurement constraints, and define your GPS constraints as ||x - x_{gps}||, where the x are 2D or 3D GPS positions (you might want to ignore the altitude with low-cost GPS).
If your SLAM is visual and feature-point based (you didn't really say what type of SLAM you were using, so I assume the most widespread type), then fusion with any of the methods above can lead to "inlier loss". You make a sudden, violent correction, which increases the reprojection errors. This means that you lose inliers in SLAM's tracking, so you have to re-triangulate points, and so on. Plus, note that even though the paper I linked to above presents a model of the GPS errors, it is not a very accurate model, and assuming that the distribution of GPS errors is unimodal (necessary for the EKF) seems a bit adventurous to me.
So, I think a good option is to use barrier-term optimization. Basically, the idea is this: since you don't really know how to model GPS errors, assume that you have more confidence in SLAM locally, and minimize a function S(x) that captures the quality of your SLAM reconstruction. Denote by x_opt the minimizer of S. Then, fuse with GPS data as long as it does not deteriorate S(x_opt) by more than a given threshold. Mathematically, you'd want to minimize
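To make the weighted-sum idea concrete, here is a minimal 1D sketch. It is not the bundle-adjustment setup described above: the measurement constraints are replaced by squared errors on SLAM's relative displacements between consecutive poses, and the GPS term is squared so the whole cost stays quadratic. The data, the weight and all names are made up for illustration.

```python
def fuse_slam_gps(slam_steps, gps_positions, gps_weight):
    """Minimise sum((x[i+1] - x[i] - slam_steps[i])**2)
              + gps_weight * sum((x[i] - gps_positions[i])**2).

    The cost is quadratic, so its minimiser solves a tridiagonal linear
    system, handled here with the Thomas algorithm.
    """
    n = len(gps_positions)
    assert len(slam_steps) == n - 1 and gps_weight > 0
    diag, rhs = [], []
    for k in range(n):
        has_prev, has_next = k >= 1, k <= n - 2
        diag.append(gps_weight + int(has_prev) + int(has_next))
        r = gps_weight * gps_positions[k]
        if has_prev:
            r += slam_steps[k - 1]
        if has_next:
            r -= slam_steps[k]
        rhs.append(r)
    # Forward sweep; every off-diagonal entry of the system equals -1.
    c_prime, r_prime = [0.0] * n, [0.0] * n
    c_prime[0] = -1.0 / diag[0]
    r_prime[0] = rhs[0] / diag[0]
    for k in range(1, n):
        denom = diag[k] + c_prime[k - 1]
        c_prime[k] = -1.0 / denom
        r_prime[k] = (rhs[k] + r_prime[k - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = r_prime[-1]
    for k in range(n - 2, -1, -1):
        x[k] = r_prime[k] - c_prime[k] * x[k + 1]
    return x


def max_error(estimate, truth):
    return max(abs(e - t) for e, t in zip(estimate, truth))


# Synthetic data: SLAM steps carry a small bias (drift), GPS is noisy but unbiased.
n = 101
truth = [float(i) for i in range(n)]
slam_steps = [1.02] * (n - 1)
gps = [truth[i] + (0.5 if i % 2 == 0 else -0.5) for i in range(n)]

slam_only = [0.0]
for step in slam_steps:
    slam_only.append(slam_only[-1] + step)

fused = fuse_slam_gps(slam_steps, gps, gps_weight=0.05)
print("max error, SLAM only:", max_error(slam_only, truth))
print("max error, GPS only: ", max_error(gps, truth))
print("max error, fused:    ", max_error(fused, truth))
```

The weight plays the role of some_weight in the expression above: a small value trusts SLAM's local shape and only uses GPS to pin down the slow drift.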
some_coef/(thresh - S(x)) + ||x - x_{gps}||
and you'd initialize the minimization with x_opt. A good choice for S is the bundle adjustment error, since by not degrading it, you prevent inlier loss. There are other choices of S in the literature, but they are usually meant to reduce computational time and add little in terms of accuracy.
This, unlike the EKF, does not have a nice probabilistic interpretation, but produces very nice results in practice (I have used it for fusion with other things than GPS too, and it works well). You can for example see this excellent paper that explains how to implement this thoroughly, how to set the threshold, etc.
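For a feel of how the barrier term behaves, here is a toy 2D sketch. It makes several assumptions that are not in the answer above: S is a simple quadratic around x_opt standing in for the bundle adjustment error, the GPS term is squared so that it is differentiable everywhere, and the minimizer is a plain gradient descent with step halving. All names and values are illustrative.

```python
import math


def barrier_fuse(x_opt, x_gps, slam_cost, slam_grad, thresh, coef=1e-2, iters=200):
    """Minimise coef / (thresh - S(x)) + ||x - x_gps||^2, starting from x_opt."""

    def total(x):
        slack = thresh - slam_cost(x)
        if slack <= 0:
            return math.inf  # outside the region where SLAM quality is acceptable
        return coef / slack + sum((a - b) ** 2 for a, b in zip(x, x_gps))

    x = list(x_opt)
    for _ in range(iters):
        slack = thresh - slam_cost(x)
        grad = [coef * gs / slack ** 2 + 2 * (xi - gi)
                for gs, xi, gi in zip(slam_grad(x), x, x_gps)]
        current, step = total(x), 1.0
        # Halve the step until the cost decreases; infeasible points cost +inf,
        # so they are never accepted.
        while step > 1e-12:
            candidate = [xi - step * g for xi, g in zip(x, grad)]
            if total(candidate) < current:
                x = candidate
                break
            step *= 0.5
        else:
            break  # no improving step found
    return x


def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Stand-in SLAM cost: squared distance to the SLAM optimum (an assumption).
x_opt = [0.0, 0.0]
x_gps = [3.0, 0.0]
thresh = 1.0


def slam_cost(x):
    return sum((a - b) ** 2 for a, b in zip(x, x_opt))


def slam_grad(x):
    return [2 * (a - b) for a, b in zip(x, x_opt)]


fused = barrier_fuse(x_opt, x_gps, slam_cost, slam_grad, thresh)
print("fused position:", fused, "S(fused) =", slam_cost(fused))
```

In this toy case the result moves from x_opt towards the GPS position but stops short of the point where S would reach the threshold, which is the behaviour the barrier is meant to enforce.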
Hope this helps. Please don't hesitate to tell me if you find inaccuracies/errors in my answer.

When Is It Apt to Convert Features to Log-values Before Fitting

I am planning to implement anomaly detection using one-class SVM. The data which I have has 25 features and some of the columns have unique variations like trend and seasonality.
I have tried to convert the trend and seasonal features into log-values. I have found that the distribution has changed from skewed to normal.
I am not certain, though, whether this conversion is the way to go. I am also not sure what consequences it may have during fitting.
It would be greatly appreciated if anybody could highlight the best case scenario of converting the features into log-values and/or any other techniques which could mitigate the effect of time series variations.
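For reference, the skewed-to-symmetric effect mentioned above can be checked numerically before deciding. The sketch below is only an illustration on synthetic, exponentially growing values (an assumption, not the real 25-feature data); it compares the sample skewness before and after taking logs. Note that math.log needs strictly positive inputs; math.log1p is a common substitute when zeros are present.

```python
import math


def skewness(values):
    """Sample skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic feature with an exponential trend (strongly right-skewed).
raw = [math.exp(0.1 * i) for i in range(50)]
logged = [math.log(v) for v in raw]

print("skewness before log:", skewness(raw))
print("skewness after log: ", skewness(logged))
```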

Deep neural network diverges after convergence

I implemented the A3C network in https://arxiv.org/abs/1602.01783 in TensorFlow.
At this point I'm 90% sure the algorithm is implemented correctly. However, the network diverges after convergence. See the attached image that I got from a toy example where the maximum episode reward is 7.
When it diverges, the policy network starts giving a single action a very high probability (>0.9) for most states.
What should I check for this kind of problem? Is there any reference for it?
Note that in Figure 1 of the original paper the authors say:
For asynchronous methods we average over the best 5 models from 50 experiments.
That can mean that in a lot of cases the algorithm does not work that well. In my experience, A3C often diverges, even after convergence. Careful learning-rate scheduling can help. Or do what the authors did: train several agents with different seeds and pick the one that performs best on your validation data. You could also employ early stopping when the validation error starts to increase.
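A minimal sketch of the last two suggestions, assuming you log a validation reward (higher is better) at regular intervals for each seed. The reward curves below are invented, and the patience value is an arbitrary choice.

```python
def early_stopping_index(val_rewards, patience):
    """Return the index of the best checkpoint, stopping the scan once
    `patience` consecutive evaluations fail to beat the best reward so far."""
    best_index, best_reward, waited = 0, val_rewards[0], 0
    for i in range(1, len(val_rewards)):
        if val_rewards[i] > best_reward:
            best_index, best_reward, waited = i, val_rewards[i], 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_index


# Hypothetical validation-reward curves for three seeds (max episode reward is 7).
runs = {
    0: [1.0, 3.0, 5.0, 6.5, 7.0, 7.0, 6.8, 4.0, 2.0, 1.0],
    1: [0.5, 1.0, 1.5, 1.2, 1.0, 0.8, 0.5, 0.4, 0.3, 0.2],
    2: [1.0, 2.0, 4.0, 6.0, 5.5, 5.0, 3.0, 2.0, 1.0, 1.0],
}

# Checkpoint to keep for each seed, then the seed whose kept checkpoint is best.
checkpoints = {seed: early_stopping_index(r, patience=3) for seed, r in runs.items()}
best_seed = max(runs, key=lambda seed: runs[seed][checkpoints[seed]])
print("checkpoint per seed:", checkpoints, "best seed:", best_seed)
```

Keeping the checkpoint at the returned index, rather than the final weights, is what protects against the late collapse described in the question.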

Analysis of Fitbit walking and sleeping data

I'm participating in a small data analysis competition at our school.
We use Fitbit wearable devices, which the contest host loans to each participant.
For the 2 months of the contest, participants walk and sleep with this small device 24/7, allowing it to gather data such as step count and heart rate (bpm).
We need to solve some problems based on the participants' data. For example:
Show the relation between rainy days and participants' workout rate using a chart.
I think the idea behind this problem is that, because of rain, a lot of participants are expected to stay at home.
Can you show some cause and effect numerically?
I'm now studying the Python libraries NumPy and pandas with IPython Notebook, but I still have no idea how to solve these problems.
Could you recommend some projects or sites to use as references? I'm really eager to win this competition. :(
And lastly, sorry for my poor English.
Thank you.
That's a fun project. I'm working on something kind of similar.
Here's what you need to do:
Learn the Fitbit API and stream the data from the Fitbit accelerometer and gyroscope. If you can combine this with heart rate data, great. The more types of data you have, the more effective your algorithm will be. You can store this data in a simple CSV file (streaming the accel/gyro data at 50 Hz is recommended). Or set up a web server and store it in a database for easy access
Learn how to use pandas and scikit-learn
[optional but recommended]: Learn matplotlib so you can graph your data and get a feel for how it looks
Load the data into pandas and create features on the data, notably using 1-2 second sliding-window analysis with 50% overlap. Good features include (for all three of Accel X, Y, Z): max, min, standard deviation, root mean square, root sum square and tilt. Polynomials will help.
Since this is a supervised classification problem, you will need to create some labelled data, so do this manually (state 1 = rainy day, state 2 = non-rainy day) and then train a classification algorithm. I would recommend a random forest
Test on held-out data that was not used for training, and don't forget to use cross-validation
Voila, you now have a highly accurate model and will win the competition. Plus you've learned about a bunch of really cool Python and machine learning stuff.
For more tutorials on how all this stuff works, I'd highly recommend the Kaggle tutorial projects
BONUS: If you want to take it to a new level, you can start adding smoothers on top of your classifier, for example by using a Hidden Markov Model as explained in this talk
BONUS 2: Go get a PhD in Human Activity Recognition.
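The sliding-window feature step described above can be sketched in plain Python as follows. This is only an illustration for a single axis: the synthetic 1 Hz sine wave stands in for real accelerometer samples, and the function name, the 2-second window and the feature names are assumptions.

```python
import math
import statistics


def window_features(samples, rate_hz=50, window_s=2.0, overlap=0.5):
    """Compute max, min, standard deviation and RMS over sliding windows."""
    size = int(rate_hz * window_s)           # samples per window
    step = max(1, int(size * (1 - overlap)))  # 50% overlap -> half a window
    features = []
    for start in range(0, len(samples) - size + 1, step):
        window = samples[start:start + size]
        features.append({
            "max": max(window),
            "min": min(window),
            "std": statistics.pstdev(window),
            "rms": math.sqrt(sum(v * v for v in window) / size),
        })
    return features


# Six seconds of a synthetic 1 Hz signal sampled at 50 Hz (one axis only).
samples = [math.sin(2 * math.pi * i / 50) for i in range(300)]
features = window_features(samples)
print(len(features), "windows; first window:", features[0])
```

Each dictionary becomes one row of the feature table; with real data you would compute the same features for each of the X, Y and Z axes and concatenate them before training the classifier.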