A Simple Bayesian Network with a Coin-Flipping Problem

I am trying to implement a Bayesian network and solve a regression problem using PyMC3. In my model, I have a fair coin as the parent node. If the parent node is H, the child node selects the normal distribution N(5,0.2); if T, the child selects N(0,0.5). Here is an illustration of my network.
To simulate this network, I generated a sample dataset and tried doing Bayesian regression using the code below. Currently, the model does regression only for the child node as if the parent node does not exist. I would greatly appreciate it if anyone can let me know how to implement the conditional probability P(D|C). Ultimately, I am interested in finding the probability distribution for mu1 and mu2. Thank you!
import numpy as np
import pymc3 as pm
import theano
from numpy.random import normal
from scipy.stats import bernoulli
# Generate data for coin flip P(C) and store in c1
theta_real = 0.5  # unknown value in a real experiment
n_sample = 10
c1 = bernoulli.rvs(p=theta_real, size=n_sample)
# Generate data for normal distribution P(D|C) and store in d1
np.random.seed(123)
mu1 = 0
sigma1 = 0.5
mu2 = 5
sigma2 = 0.2
d1 = []
for index, item in enumerate(c1):
    if item == 0:
        d1.extend(normal(mu1, sigma1, 1))
    else:
        d1.extend(normal(mu2, sigma2, 1))
# I start building the PyMC3 model here
c1_tensor = theano.shared(np.array(c1))
d1_tensor = theano.shared(np.array(d1))
with pm.Model() as model:
    # define prior for c1. I am not sure how to do this.
    #c1_present = pm.Categorical('c1',observed=c1_tensor)
    # how do I incorporate P(D | C)
    mu_prior = pm.Normal('mu', mu=2, sd=2, shape=1)
    sigma_prior = pm.HalfNormal('sigma', sd=2, shape=1)
    y_likelihood = pm.Normal('y', mu=mu_prior, sd=sigma_prior, observed=d1_tensor)

You could use the Dirichlet distribution as a prior for the coin toss and NormalMixture as the prior of the two Gaussians. In the following snippet I changed the fairness of the coin and increased the number of coin tosses, but you could adjust these in any way you want:
import numpy as np
import pymc3 as pm
from scipy.stats import bernoulli
# Generate data for coin flip P(C) and store in c1
theta_real = 0.2  # unknown value in a real experiment
n_sample = 2000
c1 = bernoulli.rvs(p=theta_real, size=n_sample)
# Generate data for normal distribution P(D|C) and store in d1
np.random.seed(123)
mu1 = 0
sigma1 = 0.5
mu2 = 5
sigma2 = 0.2
d1 = []
for index, item in enumerate(c1):
    if item == 0:
        d1.extend(np.random.normal(mu1, sigma1, 1))
    else:
        d1.extend(np.random.normal(mu2, sigma2, 1))
with pm.Model() as model:
    w = pm.Dirichlet('p', a=np.ones(2))
    mu = pm.Normal('mu', 0, 20, shape=2)
    sigma = np.array([0.5, 0.2])
    pm.NormalMixture('like', w=w, mu=mu, sigma=sigma, observed=np.array(d1))
    trace = pm.sample()
pm.summary(trace)
This will give you the following:
           mean        sd  mc_error   hpd_2.5  hpd_97.5        n_eff      Rhat
mu__0  4.981222  0.023900  0.000491  4.935044  5.027420  2643.052184  0.999637
mu__1 -0.007660  0.004946  0.000095 -0.017388  0.001576  2481.146286  1.000312
p__0   0.213976  0.009393  0.000167  0.195602  0.231803  2245.905021  0.999302
p__1   0.786024  0.009393  0.000167  0.768197  0.804398  2245.905021  0.999302
The parameters are recovered nicely as you can also see from the traceplots:
The above implementation will give you the posterior of theta_real, mu1 and mu2 but I could not get convergence when I added sigma1 and sigma2 as parameters to be estimated by the data (even though the prior was quite narrow):
with pm.Model() as model:
    w = pm.Dirichlet('p', a=np.ones(2))
    mu = pm.Normal('mu', 0, 20, shape=2)
    sigma = pm.HalfNormal('sigma', sd=2, shape=2)
    pm.NormalMixture('like', w=w, mu=mu, sigma=sigma, observed=np.array(d1))
    trace = pm.sample()
print(pm.summary(trace))
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [sigma, mu, p]
Sampling 4 chains: 100%|██████████| 4000/4000 [00:10<00:00, 395.57draws/s]
The acceptance probability does not match the target. It is 0.883057127209148, but should be close to 0.8. Try to increase the number of tuning steps.
The gelman-rubin statistic is larger than 1.4 for some parameters. The sampler did not converge.
The estimated number of effective samples is smaller than 200 for some parameters.
              mean        sd  mc_error  ...  hpd_97.5     n_eff        Rhat
mu__0     1.244021  2.165433  0.216540  ...  5.005507  2.002049  212.596596
mu__1     3.743879  2.165122  0.216510  ...  5.012067  2.002040  235.750129
p__0      0.643069  0.248630  0.024846  ...  0.803369  2.004185   30.966189
p__1      0.356931  0.248630  0.024846  ...  0.798632  2.004185   30.966189
sigma__0  0.416207  0.125435  0.012517  ...  0.504110  2.009031   17.333177
sigma__1  0.271763  0.125539  0.012533  ...  0.497208  2.007779   19.217223
[6 rows x 7 columns]
Based on that, you will most likely need to reparametrize if you also want to estimate the two standard deviations from this data.
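One plausible reading of the failed run above (an interpretation, not something the sampler output states) is label switching: once both standard deviations are free, the mixture likelihood is unchanged when the two components swap labels, so different chains can settle on different labelings and the pooled estimates of mu end up between 0 and 5 with a very large Rhat. The sketch below checks that invariance with the standard library only; the function name mixture_loglik and the toy data are illustrative assumptions, not part of PyMC3.

```python
import math
import random


def mixture_loglik(data, weights, mus, sigmas):
    """Log-likelihood of `data` under a Gaussian mixture with the given parameters."""
    total = 0.0
    for x in data:
        density = 0.0
        for w, mu, sigma in zip(weights, mus, sigmas):
            z = (x - mu) / sigma
            density += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        total += math.log(density)
    return total


# Toy data drawn roughly like the example: 20% from N(5, 0.2), 80% from N(0, 0.5).
rng = random.Random(123)
data = [rng.gauss(5.0, 0.2) if rng.random() < 0.2 else rng.gauss(0.0, 0.5)
        for _ in range(200)]

# Same mixture, with component labels 0 and 1 exchanged in every parameter.
original = mixture_loglik(data, [0.8, 0.2], [0.0, 5.0], [0.5, 0.2])
swapped = mixture_loglik(data, [0.2, 0.8], [5.0, 0.0], [0.2, 0.5])
print(original, swapped)  # the two values agree
```

When sigma is fixed to two distinct values, as in the first model, swapping only w and mu no longer gives the same likelihood, which may be why that model sampled without this warning.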

This answer supplements @balleveryday's answer, which suggests a Gaussian mixture model but ran into trouble getting the symmetry breaking to work. Admittedly, the symmetry breaking in the official example is done in the context of Metropolis-Hastings sampling, and I suspect (though I am not sure) that NUTS is a little more sensitive to encountering impossible values. Here's what worked for me:
import numpy as np
import pymc3 as pm
from scipy.stats import bernoulli
import theano.tensor as tt
# everything should reproduce
np.random.seed(123)
n_sample = 2000
# Generate data for coin flip P(C) and store in c1
theta_real = 0.2  # unknown value in a real experiment
c1 = bernoulli.rvs(p=theta_real, size=n_sample)
# Generate data for normal distribution P(D|C) and store in d1
mu1, mu2 = 0, 5
sigma1, sigma2 = 0.5, 0.2
d1 = np.empty_like(c1, dtype=np.float64)
d1[c1 == 0] = np.random.normal(mu1, sigma1, np.sum(c1 == 0))
d1[c1 == 1] = np.random.normal(mu2, sigma2, np.sum(c1 == 1))
with pm.Model() as gmm_asym:
    # mixture vector
    w = pm.Dirichlet('p', a=np.ones(2))
    # Gaussian parameters (testval helps start off ordered)
    mu = pm.Normal('mu', 0, 20, shape=2, testval=[-10, 10])
    sigma = pm.HalfNormal('sigma', sd=2, shape=2)
    # break symmetry, forcing mu[0] < mu[1]
    order_means_potential = pm.Potential('order_means_potential',
                                         tt.switch(mu[1] - mu[0] < 0, -np.inf, 0))
    # observed
    pm.NormalMixture('like', w=w, mu=mu, sigma=sigma, observed=d1)
    # reproducible sampling
    tr_gmm_asym = pm.sample(tune=2000, target_accept=0.9, random_seed=20191121)
This produces samples with the statistics
              mean        sd  mc_error   hpd_2.5  hpd_97.5        n_eff      Rhat
mu__0     0.004549  0.011975  0.000226 -0.017398  0.029375  2425.487301  0.999916
mu__1     5.007663  0.008993  0.000166  4.989247  5.024692  2181.134002  0.999563
p__0      0.789983  0.009091  0.000188  0.773059  0.808062  2417.356539  0.999788
p__1      0.210017  0.009091  0.000188  0.191938  0.226941  2417.356539  0.999788
sigma__0  0.497322  0.009103  0.000186  0.480394  0.515867  2227.397854  0.999358
sigma__1  0.191310  0.006633  0.000141  0.178924  0.204859  2286.817037  0.999614
and the traces

Related

tfp.mcmc.HamiltonianMonteCarlo Not working in Tensorflow Probability

I have the following code, which basically tries to fit a simple regression model using tensorflow probability. The code runs without error, but the MCMC sampler doesn't seem to be doing anything in that it returns a trace of the initial states.
import warnings
import matplotlib.pyplot as plt
import numpy as np
import tensorflow.compat.v2 as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
tf.enable_v2_behavior()
plt.style.use("ggplot")
warnings.filterwarnings('ignore')
ru = 4
N = 102  # number of data points
t = np.linspace(0, 4*np.pi, N)
data = 3 + np.sin(t+0.001) + 0.5 + np.random.randn(N)
media_1 = ((data-min(data))/(max(data)-min(data)))  #+ np.random.normal(0,.05, N)
y = np.repeat(ru, N) + np.random.normal(.3, .01, N) * media_1 + np.random.normal(0, .005, N)
# model
model = tfd.JointDistributionNamed(dict(
    beta1=tfd.Normal(0, 1),
    intercept=tfd.Normal(0, 5),
    var=tfd.Normal(0.05, 0.0005),
    y=lambda intercept, beta1, var:
        tfd.Independent(tfd.Normal(loc=intercept + beta1 * media_1, scale=var),
                        reinterpreted_batch_ndims=1)
))
def target_log_prob_fn(intercept, beta1, var):
    return model.log_prob({'intercept': intercept, 'beta1': beta1, 'var': var, 'y': y})
s = model.sample()
init_states = [tf.fill([1], s['intercept'].numpy(), name='init_intercept'),
               tf.fill([1], s['beta1'].numpy(), name='init_beta1'),
               tf.fill([1], s['var'].numpy(), name='init_var')]
num_results = 5000
num_burnin_steps = 3000
# Improve performance by tracing the sampler using `tf.function`
# and compiling it using XLA.
@tf.function(autograph=False, experimental_compile=True)
def do_sampling():
    return tfp.mcmc.sample_chain(
        num_results=num_results,
        num_burnin_steps=num_burnin_steps,
        current_state=init_states,
        kernel=tfp.mcmc.HamiltonianMonteCarlo(
            target_log_prob_fn=target_log_prob_fn,
            step_size=0.1,
            num_leapfrog_steps=3)
    )
states, kernel_results = do_sampling()
The trace that is returned in states is exactly the same as the initial values in init_states... Any ideas?
I can confirm that this MCMC sampler is not mixing by printing the acceptance rate with a snippet I found here
print("Acceptance rate:", kernel_results.is_accepted.numpy().mean())
That same page provides some hints about how to make your HMC kernel adaptive, which means it will automatically reduce the step size if too many proposed steps are rejected (and increase it if too many are accepted, too):
# Apply a simple step size adaptation during burnin
@tf.function
def do_sampling():
    adaptive_kernel = tfp.mcmc.SimpleStepSizeAdaptation(
        tfp.mcmc.HamiltonianMonteCarlo(
            target_log_prob_fn=target_log_prob_fn,
            step_size=0.1,
            num_leapfrog_steps=3),
        num_adaptation_steps=int(.8 * num_burnin_steps),
        target_accept_prob=np.float64(.65))
    return tfp.mcmc.sample_chain(
        num_results=num_results,
        num_burnin_steps=num_burnin_steps,
        current_state=init_states,
        kernel=adaptive_kernel,
        trace_fn=lambda cs, kr: kr)
samples, kernel_results = do_sampling()
print("Acceptance rate:", kernel_results.inner_results.is_accepted.numpy().mean())
This produces a non-zero acceptance rate for me.
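A possible reason the fixed step size fails, offered as an assumption rather than a confirmed diagnosis: the prior on var in the question has a scale of 0.0005, while the HMC step size is 0.1, so a leapfrog trajectory can overshoot the region of high density by orders of magnitude and every proposal is rejected. The toy sampler below is a hand-written 1-D HMC in plain Python (it is not TFP's implementation, and hmc_acceptance_rate is a hypothetical helper) targeting a normal distribution with that scale.

```python
import math
import random


def hmc_acceptance_rate(step_size, scale=0.0005, num_leapfrog_steps=3,
                        num_samples=2000, seed=0):
    """Run a toy 1-D HMC chain on N(0, scale^2) and return the acceptance rate."""
    rng = random.Random(seed)

    def potential(x):
        # Negative log density of N(0, scale^2), up to a constant.
        return 0.5 * (x / scale) ** 2

    def grad_potential(x):
        return x / scale ** 2

    x = scale  # start one standard deviation from the mode
    accepted = 0
    for _ in range(num_samples):
        p = rng.gauss(0.0, 1.0)  # resample momentum
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, full steps, half momentum step.
        p_new -= 0.5 * step_size * grad_potential(x_new)
        for i in range(num_leapfrog_steps):
            x_new += step_size * p_new
            if i < num_leapfrog_steps - 1:
                p_new -= step_size * grad_potential(x_new)
        p_new -= 0.5 * step_size * grad_potential(x_new)
        # Metropolis correction on the total energy, done in log space.
        h_old = potential(x) + 0.5 * p * p
        h_new = potential(x_new) + 0.5 * p_new * p_new
        if math.log(1.0 - rng.random()) < h_old - h_new:
            x = x_new
            accepted += 1
    return accepted / num_samples


for step_size in (0.1, 0.0002):
    print(step_size, hmc_acceptance_rate(step_size))
```

In this toy setting the step size of 0.1 is rejected almost always, while a step size comparable to the target's scale is accepted most of the time, which is the kind of adjustment the adaptive kernel makes automatically.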

Avoiding optimization pitfalls when modeling an ordinal predicted variable in PyMC3

I am trying to model an ordinal predicted variable using PyMC3 based on the approach in chapter 23 of Doing Bayesian Data Analysis. I would like to determine a good starting value using find_MAP, but am receiving an optimization error.
The model:
import pymc3 as pm
import numpy as np
import theano
import theano.tensor as tt
# Some helper functions
def cdf(x, location=0, scale=1):
    epsilon = np.array(1e-32, dtype=theano.config.floatX)
    location = tt.cast(location, theano.config.floatX)
    scale = tt.cast(scale, theano.config.floatX)
    div = tt.sqrt(2 * scale ** 2 + epsilon)
    div = tt.cast(div, theano.config.floatX)
    erf_arg = (x - location) / div
    return .5 * (1 + tt.erf(erf_arg + epsilon))
def percent_to_thresh(idx, vect):
    return 5 * tt.sum(vect[:idx + 1]) + 1.5
def full_thresh(thresh):
    idxs = tt.arange(thresh.shape[0] - 1)
    thresh_mod, updates = theano.scan(fn=percent_to_thresh,
                                      sequences=[idxs],
                                      non_sequences=[thresh])
    return tt.concatenate([[-1 * np.inf, 1.5], thresh_mod, [6.5, np.inf]])
def compute_ps(thresh, location, scale):
    f_thresh = full_thresh(thresh)
    return cdf(f_thresh[1:], location, scale) - cdf(f_thresh[:-1], location, scale)
# Generate data
real_ps = [0.05, 0.05, 0.1, 0.1, 0.2, 0.3, 0.2]
data = np.random.choice(7, size=1000, p=real_ps)
# Run model
with pm.Model() as model:
    mu = pm.Normal('mu', mu=4, sd=3)
    sigma = pm.Uniform('sigma', lower=0.1, upper=70)
    thresh = pm.Dirichlet('thresh', a=np.ones(5))
    cat_p = compute_ps(thresh, mu, sigma)
    results = pm.Categorical('results', p=cat_p, observed=data)
with model:
    start = pm.find_MAP()
    trace = pm.sample(2000, start=start)
When running this, I receive the following error:
Applied interval-transform to sigma and added transformed sigma_interval_ to model.
Applied stickbreaking-transform to thresh and added transformed thresh_stickbreaking_ to model.
Traceback (most recent call last):
  File "cm_net_log.v1-for_so.py", line 53, in <module>
    start = pm.find_MAP()
  File "/usr/local/lib/python3.5/site-packages/pymc3/tuning/starting.py", line 133, in find_MAP
    specific_errors)
ValueError: Optimization error: max, logp or dlogp at max have non-finite values. Some values may be outside of distribution support. max: {'thresh_stickbreaking_': array([-1.04298465, -0.48661088, -0.84326554, -0.44833646]), 'sigma_interval_': array(-2.220446049250313e-16), 'mu': array(7.68422528308479)} logp: array(-3506.530143064723) dlogp: array([ 1.61013190e-06, nan, -6.73994118e-06,
-6.93873894e-06, 6.03358122e-06, 3.18954680e-06])
Check that 1) you don't have hierarchical parameters, these will lead to points with infinite density. 2) your distribution logp's are properly specified. Specific issues:
My questions:
1. How can I determine why dlogp is nan at certain points?
2. Is there a different way that I can express this model to avoid dlogp being nan?
Also worth noting:
This model runs fine if I don't find_MAP and use a Metropolis sampler. However, I'd like to have the flexibility of using other samplers as this model becomes more complex.
I have a suspicion that the issue is due to the relationship between the thresholds and the normal distribution, but I don't know how to disentangle them for the optimization.
Regarding question 2: I expressed the model for the ordinal predicted variable (single group) differently; I used the Theano @as_op decorator for a function that calculates probabilities for the outcomes. That also explains why I cannot use find_MAP() or gradient-based samplers: Theano cannot calculate a gradient for the custom function. (http://pymc-devs.github.io/pymc3/notebooks/getting_started.html#Arbitrary-deterministics)
from scipy.stats import norm
from theano.compile.ops import as_op
# df (defined elsewhere) holds the data, with an ordered categorical column Y
# Number of outcomes
nYlevels = df.Y.cat.categories.size
thresh = [k + .5 for k in range(1, nYlevels)]
thresh_obs = np.ma.asarray(thresh)
thresh_obs[1:-1] = np.ma.masked
@as_op(itypes=[tt.dvector, tt.dscalar, tt.dscalar], otypes=[tt.dvector])
def outcome_probabilities(theta, mu, sigma):
    out = np.empty(nYlevels)
    n = norm(loc=mu, scale=sigma)
    out[0] = n.cdf(theta[0])
    out[1] = np.max([0, n.cdf(theta[1]) - n.cdf(theta[0])])
    out[2] = np.max([0, n.cdf(theta[2]) - n.cdf(theta[1])])
    out[3] = np.max([0, n.cdf(theta[3]) - n.cdf(theta[2])])
    out[4] = np.max([0, n.cdf(theta[4]) - n.cdf(theta[3])])
    out[5] = np.max([0, n.cdf(theta[5]) - n.cdf(theta[4])])
    out[6] = 1 - n.cdf(theta[5])
    return out
with pm.Model() as ordinal_model_single:
    theta = pm.Normal('theta', mu=thresh, tau=np.repeat(.5**2, len(thresh)),
                      shape=len(thresh), observed=thresh_obs, testval=thresh[1:-1])
    mu = pm.Normal('mu', mu=nYlevels/2.0, tau=1.0/(nYlevels**2))
    sigma = pm.Uniform('sigma', nYlevels/1000.0, nYlevels*10.0)
    pr = outcome_probabilities(theta, mu, sigma)
    y = pm.Categorical('y', pr, observed=df.Y.cat.codes.as_matrix())
http://nbviewer.jupyter.org/github/JWarmenhoven/DBDA-python/blob/master/Notebooks/Chapter%2023.ipynb
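Regarding question 1, one candidate explanation (not verified against the Theano graph in the question) is the pair of infinite thresholds passed through erf: the derivative of erf(z) with respect to the scale contains exp(-z^2) * dz/dscale, which at an infinite threshold evaluates to 0 * inf and therefore nan. Treating the first and last categories explicitly, as the function above does, avoids infinite thresholds altogether. The plain-Python sketch below illustrates both points; normal_cdf and the example thresholds are illustrative choices, not taken from either model.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2) written with erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def outcome_probabilities(thresholds, mu, sigma):
    """Probability of each ordinal outcome; the two end bins need no infinite threshold."""
    cdfs = [normal_cdf(t, mu, sigma) for t in thresholds]
    probs = [cdfs[0]]
    probs += [max(0.0, hi - lo) for lo, hi in zip(cdfs[:-1], cdfs[1:])]
    probs.append(1.0 - cdfs[-1])
    return probs


thresholds = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5]  # six thresholds give seven outcomes
probs = outcome_probabilities(thresholds, mu=4.0, sigma=1.5)
print([round(p, 4) for p in probs], sum(probs))

# Chain-rule factor of d erf(z) / d sigma at an infinite threshold:
z = math.inf
dz_dsigma = -math.inf                       # z scales like (threshold - mu) / sigma
grad_term = math.exp(-z * z) * dz_dsigma    # 0.0 * -inf
print(grad_term)                            # nan
```

If this is indeed the cause, a Theano version that builds the first and last probabilities from finite thresholds only should keep the gradient finite, but that remains to be checked on the actual model.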

Interpreting Tensorflow/Tensorboard "subtraction" operation

The following is code adapted from a simple learning example, that I have bent out of shape to understand the Tensorboard graph visualizations:
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()
# Create 10 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(10).astype("float32")
y_data = x_data * 0.1 + 0.3
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0, name = "internal_W"), name = "external_W")
b = tf.Variable(2*tf.zeros([1], name = "internal_b"), name = "doubled_b")
y = (W * x_data + b)
l1 = (y - y_data)
l2 = (y_data - y )
writer = tf.train.SummaryWriter("/tmp/test1", sess.graph_def)
init = tf.initialize_all_variables()
# Launch the graph.
sess = tf.Session()
sess.run(init)
print(sess.run(y))
print('---')
print((y_data))
print('---')
print(sess.run(l1))
print('---')
print(sess.run(l2))
A sample output of the print statements is:
[ 0.84253538 0.31011301 0.11627766 0.35491142 0.65550905 0.1798114
0.13632762 0.02010157 0.42960873 0.04218956]
---
[ 0.39195824 0.33384719 0.31269109 0.33873668 0.37154531 0.31962547
0.31487945 0.302194 0.3468895 0.30460477]
---
[ 0.45057714 -0.02373418 -0.19641343 0.01617473 0.28396374 -0.13981406
-0.17855182 -0.28209242 0.08271924 -0.2624152 ]
---
[-0.45057714 0.02373418 0.19641343 -0.01617473 -0.28396374 0.13981406
0.17855182 0.28209242 -0.08271924 0.2624152 ]
Clearly, the subtractions are working properly-- the inputs to the subtraction are in different order, and yield different outputs. However, the graph visualization is:
Notice the "Sub" operators, which appear not to reverse the order of the operands as the code does. (Highlighting either operator yields no additional insight.) Am I missing something obvious, or do the node visualizations completely obscure order of operands?
After futzing around with this, my considered answer to my own question is, "Yes, this is working as intended." The inputs to the nodes show only what the inputs are, not any particular relationships to the operation or the node or themselves; indeed, if one added a variable to itself in an operation node, the input variable would show up only once.
This is not a design choice I would have made, but that does seem to be the intent.
I still encourage others who may have more insight to comment or fully answer.

Bayesian Probabilistic Matrix Factorization (BPMF) with PyMC3: PositiveDefiniteError using `NUTS`

I've implemented the Bayesian Probabilistic Matrix Factorization algorithm using pymc3 in Python. I also implemented its precursor, Probabilistic Matrix Factorization (PMF). See my previous question for a reference to the data used here.
I'm having trouble drawing MCMC samples using the NUTS sampler. I initialize the model parameters using the MAP from PMF, and the hyperparameters using Gaussian random draws sprinkled around 0. However, I get a PositiveDefiniteError when setting up the step object for the sampler. I've verified that the MAP estimate from PMF is reasonable, so I expect it has something to do with the way the hyperparameters are being initialized. Here is the PMF model:
import pymc3 as pm
import numpy as np
import pandas as pd
import theano
import scipy as sp
data = pd.read_csv('jester-dense-subset-100x20.csv')
n, m = data.shape
test_size = m / 10
train_size = m - test_size
train = data.copy()
train.ix[:,train_size:] = np.nan # remove test set data
train[train.isnull()] = train.mean().mean() # mean value imputation
train = train.values
test = data.copy()
test.ix[:,:train_size] = np.nan # remove train set data
test = test.values
# Low precision reflects uncertainty; prevents overfitting
alpha_u = alpha_v = 1/np.var(train)
alpha = np.ones((n,m)) * 2 # fixed precision for likelihood function
dim = 10 # dimensionality
# Specify the model.
with pm.Model() as pmf:
    pmf_U = pm.MvNormal('U', mu=0, tau=alpha_u * np.eye(dim),
                        shape=(n, dim), testval=np.random.randn(n, dim)*.01)
    pmf_V = pm.MvNormal('V', mu=0, tau=alpha_v * np.eye(dim),
                        shape=(m, dim), testval=np.random.randn(m, dim)*.01)
    pmf_R = pm.Normal('R', mu=theano.tensor.dot(pmf_U, pmf_V.T),
                      tau=alpha, observed=train)
    # Find mode of posterior using optimization
    start = pm.find_MAP(fmin=sp.optimize.fmin_powell)
And here is BPMF:
import logging
n, m = data.shape
dim = 10  # dimensionality
beta_0 = 1  # scaling factor for lambdas; unclear on its use
alpha = np.ones((n,m)) * 2  # fixed precision for likelihood function
logging.info('building the BPMF model')
std = .05  # how much noise to use for model initialization
with pm.Model() as bpmf:
    # Specify user feature matrix
    lambda_u = pm.Wishart(
        'lambda_u', n=dim, V=np.eye(dim), shape=(dim, dim),
        testval=np.random.randn(dim, dim) * std)
    mu_u = pm.Normal(
        'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
        testval=np.random.randn(dim) * std)
    U = pm.MvNormal(
        'U', mu=mu_u, tau=lambda_u, shape=(n, dim),
        testval=np.random.randn(n, dim) * std)
    # Specify item feature matrix
    lambda_v = pm.Wishart(
        'lambda_v', n=dim, V=np.eye(dim), shape=(dim, dim),
        testval=np.random.randn(dim, dim) * std)
    mu_v = pm.Normal(
        'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
        testval=np.random.randn(dim) * std)
    V = pm.MvNormal(
        'V', mu=mu_v, tau=lambda_v, shape=(m, dim),
        testval=np.random.randn(m, dim) * std)
    # Specify rating likelihood function
    R = pm.Normal(
        'R', mu=theano.tensor.dot(U, V.T), tau=alpha,
        observed=train)
# `start` is the start dictionary obtained from running find_MAP for PMF.
for key in bpmf.test_point:
    if key not in start:
        start[key] = bpmf.test_point[key]
with bpmf:
    step = pm.NUTS(scaling=start)
At the last line, I get the following error:
PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [ 0 2 ... 2206 2207 ]
As I understand it, I can't use find_MAP with models that have hyperpriors like BPMF. This is why I'm attempting to initialize with the MAP values from PMF, which uses point estimates for the parameters on U and V rather than parameterized hyperpriors.
Unfortunately the Wishart distribution is non-functional. I recently added a warning here: https://github.com/pymc-devs/pymc3/commit/642f63973ec9f807fb6e55a0fc4b31bdfa1f261e
See here for more discussions on this tricky distribution: https://github.com/pymc-devs/pymc3/issues/538
You could confirm that that's the source by fixing the covariance matrix. If that's the case, I'd try using the LKJ prior distribution: https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/LKJ_correlation.py
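Separately from the Wishart issue, it may be worth checking the test values in the question: np.random.randn(dim, dim) * std is generally neither symmetric nor positive definite, whereas a precision matrix has to be both. The sketch below is a pure-Python check based on an attempted Cholesky factorization; is_positive_definite is a hypothetical helper written for this illustration, not a PyMC3 function, and whether a valid test value removes the error in the question is untested.

```python
import math
import random


def is_positive_definite(matrix, tol=1e-12):
    """Return True if `matrix` (list of lists) is symmetric positive definite."""
    n = len(matrix)
    # A precision or covariance matrix must be symmetric.
    for i in range(n):
        for j in range(i):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
    # Attempt a Cholesky factorization; it succeeds only for positive definite input.
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


dim, std = 4, 0.05
rng = random.Random(0)
# Analogue of np.random.randn(dim, dim) * std from the question.
random_testval = [[rng.gauss(0.0, 1.0) * std for _ in range(dim)] for _ in range(dim)]
identity = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
# A @ A.T + I is symmetric positive definite for any real A.
gram_plus_identity = [
    [sum(random_testval[i][k] * random_testval[j][k] for k in range(dim))
     + (1.0 if i == j else 0.0) for j in range(dim)]
    for i in range(dim)
]

print(is_positive_definite(random_testval))      # False
print(is_positive_definite(identity))            # True
print(is_positive_definite(gram_plus_identity))  # True
```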

Verlet integrator + friction

I have been following "A Verlet based approach for 2D game physics" on Gamedev.net and I have written something similar.
The problem I am having is that the boxes slide along the ground too much.
How can I add a simple rested state thing where the boxes will have more friction and only slide a tiny bit?
Just introduce a small, constant acceleration on moving objects that points in the direction opposite to the motion. And make sure it can't actually reverse the motion; if you detect that in an integration step, just set the velocity to zero.
If you want to be more realistic, the acceleration should derive from a force which is proportional to the normal force between the object and the surface it's sliding on.
You can find this in any basic physics text, as "kinetic friction" or "sliding friction".
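A minimal sketch of that idea in one dimension, assuming a friction coefficient mu_k, gravity g and a 60 Hz time step (all three are illustrative values, and apply_kinetic_friction is a hypothetical helper): the speed is reduced by a constant amount per step and clamped to zero instead of being allowed to change sign.

```python
def apply_kinetic_friction(velocity, friction_accel, dt):
    """Decelerate a 1-D velocity toward zero without ever reversing its direction."""
    decel = friction_accel * dt
    if abs(velocity) <= decel:
        return 0.0  # friction would reverse the motion, so the object stops
    return velocity - decel if velocity > 0 else velocity + decel


mu_k, g = 0.6, 9.81          # assumed friction coefficient and gravity
friction_accel = mu_k * g    # deceleration on a flat surface: mu_k * N / m = mu_k * g
dt = 1.0 / 60.0

v = 2.0
steps = 0
while v != 0.0:
    v = apply_kinetic_friction(v, friction_accel, dt)
    steps += 1
print(steps, v)
```

In a position-based Verlet scheme the velocity is implicit, roughly (position - previous_position) / dt, so one way to apply this is to compute that velocity, pass it through the function, and then set previous_position = position - new_velocity * dt.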
In the Verlet integration step, r(t) = 2.00*r(t-dt) - 1.00*r(t-2dt) + a*dt²,
change the multipliers to 1.99 and 0.99 for friction.
Edit: this is more accurate:
r(t) = (2.00 - friction_mult)*r(t-dt) - (1.00 - friction_mult)*r(t-2dt) + a*dt²
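As a rough illustration of the damped update, here is a one-dimensional sketch that reads the last term of the formula as acceleration times the squared time step dt (an assumption about the notation). The names verlet_step and friction are illustrative; friction=0.01 corresponds to the multipliers 1.99 and 0.99.

```python
def verlet_step(pos, prev_pos, accel, dt, friction=0.01):
    """One damped position-Verlet step; returns (new_pos, new_prev_pos)."""
    new_pos = (2.0 - friction) * pos - (1.0 - friction) * prev_pos + accel * dt * dt
    return new_pos, pos


dt = 1.0 / 60.0
pos, prev_pos = 1.0, 0.0   # implicit velocity: one unit per step
displacements = []
for _ in range(5):
    new_pos, prev_pos = verlet_step(pos, prev_pos, accel=0.0, dt=dt)
    displacements.append(new_pos - prev_pos)
    pos = new_pos
print(displacements)
```

With zero acceleration each displacement is (1 - friction) times the previous one, so the motion decays geometrically rather than stopping at a definite point; this behaves like velocity-proportional damping, not Coulomb friction.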
Here is a simple time stepping scheme (symplectic Euler method with manually resolved LCP) for a box with Coulomb friction and a spring (frictional oscillator)
mq'' + kq + mu*sgn(q') = F(t)
import numpy as np
import matplotlib.pyplot as plt
q0 = 0  # initial position
p0 = 0  # initial momentum
t_start = 0  # initial time
t_end = 10  # end time
N = 500  # time points
m = 1  # mass
k = 1  # spring stiffness
muN = 0.5  # friction force (slip and maximal stick)
omega = 1.5  # forcing radian frequency [RAD]
Fstat = 0.1  # static component of external force
Fdyn = 0.6  # amplitude of harmonic external force
F = lambda tt,qq,pp: Fstat + Fdyn*np.sin(omega*tt) - k*qq - muN*np.sign(pp)  # total force, note sign(0)=0 used to disable friction
zero_to_disable_friction = 0
omega0 = np.sqrt(k/m)
print("eigenfrequency f = {} Hz; eigen period T = {} s".format(omega0/(2*np.pi), 2*np.pi/omega0))
print("forcing frequency f = {} Hz; forcing period T = {} s".format(omega/(2*np.pi), 2*np.pi/omega))
time = np.linspace(t_start, t_end, N)  # time grid
h = time[1] - time[0]  # time step
q = np.zeros(N+1)  # position
p = np.zeros(N+1)  # momentum
absFfriction = np.zeros(N+1)
q[0] = q0
p[0] = p0
for n, tn in enumerate(time):
    p1slide = p[n] + h*F(tn, q[n], p[n])  # end-time momentum, assuming sliding
    q1slide = q[n] + h*p1slide/m  # end-time position, assuming sliding
    if p[n]*p1slide > 0:  # sliding goes on
        q[n+1] = q1slide
        p[n+1] = p1slide
        absFfriction[n] = muN
    else:
        q1stick = q[n]  # assume p1 = 0 at t=tn+h
        Fstick = -p[n]/h - F(tn, q1stick, zero_to_disable_friction)  # friction force needed to stop at t=tn+h
        if np.abs(Fstick) <= muN:
            p[n+1] = 0  # sticking
            q[n+1] = q1stick
            absFfriction[n] = np.abs(Fstick)
        else:  # sliding starts or passes zero crossing of velocity
            q[n+1] = q1slide  # possible refinements (adapt to slip-start or zero crossing)
            p[n+1] = p1slide
            absFfriction[n] = muN