Prediction Fails with Data Uncertainties #1039
Comments
I am not an expert on this one, but I guess it's your […]. If you, e.g., remove that […]. Is this helpful?
Thanks for giving this a try! You're right that removing the […]. I'm looking for a way to optimise a model that's aware of uncertainties in the […]
I see. I don't work with […]. I think there is some fix for your issue. See the below […]. I hope the below at least provides a workaround. Please let me know if this is not what you are looking for.
import GPy
import numpy as np
import matplotlib.pyplot as plt
# Generate some data
n = 55
x = np.sort(np.random.uniform(0, 12, (n, 1)), axis=0)
y = 2 + np.sin(x) + np.random.normal(2.2, 0.2, (n, 1)) + 0.2 * x.reshape(-1, 1)
y_err = np.random.uniform(0.15, 0.5, (n, 1)) # the data uncertainties
fig = plt.figure()
plt.plot(x, y, "kx", mew=1.5)
# `yerr` expects error magnitudes (offsets from y), not absolute bounds
plt.errorbar(
    x.squeeze(),
    y.squeeze(),
    yerr=y_err.squeeze(),
    fmt="kx",
    mew=1.5,
)
# Build covariance matrix
cov = np.zeros((n, n))
cov[np.diag_indices_from(cov)] = (y_err**2).flat
# Define kernel
kernel = (
    GPy.kern.RBF(input_dim=1, variance=0.3, lengthscale=1.0)
    + GPy.kern.White(input_dim=1, variance=0.1)
    + GPy.kern.Bias(input_dim=1, variance=0.1)
    + GPy.kern.Fixed(
        input_dim=1, covariance_matrix=cov
    )  # representing the data uncertainties here
)
# Optimise model
m = GPy.models.GPRegression(x, y, kernel=kernel)
m.optimize(messages=True)
# Predict on new `x` data
xnew = np.linspace(0, 15, 100)
# suggestion: use all but the fixed kernel and manually add noise to the
# prediction and update noise
# m.kern = GPy.kern.Add(m.kern.parts[:-1])
# or, add a new fixed kernel based on some random noise
y_err = np.random.uniform(0.15, 0.5, (len(xnew), 1)) # the data uncertainties
cov = np.zeros((len(xnew), len(xnew)))
cov[np.diag_indices_from(cov)] = (y_err**2).flat
m.kern = GPy.kern.Add(
    m.kern.parts[:-1] + [GPy.kern.Fixed(input_dim=1, covariance_matrix=cov)]
)
pred, pred_var = m.predict_noiseless(xnew.reshape(-1, 1))
plt.plot(xnew, pred, "b-")
plt.fill_between(
    xnew.squeeze(),
    pred.squeeze() - np.sqrt(pred_var.squeeze()),
    pred.squeeze() + np.sqrt(pred_var.squeeze()),
    color="b",
    alpha=0.2,
    edgecolor="none",
)
plt.show()
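For reference, the suggestion in the code comments above (leave the fixed kernel out at prediction time and treat the measurement noise separately) can be sketched in plain NumPy. This is only a minimal sketch under stated assumptions, not GPy's implementation: `rbf` and `predict_with_known_noise` are hypothetical helper names, the kernel hyperparameters are fixed by hand rather than optimised, and the mean is handled by simple centring. The known per-point variances enter only the training covariance, so predicting on a different number of new points needs no uncertainty values for them.

```python
import numpy as np


def rbf(a, b, variance=1.0, lengthscale=1.0):
    """Squared-exponential kernel between column vectors a (n, 1) and b (m, 1)."""
    sqdist = (a - b.T) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale**2)


def predict_with_known_noise(x, y, y_err, xnew, variance=1.0, lengthscale=1.0):
    """Posterior mean and latent variance at xnew, given per-point noise y_err."""
    # Known measurement variances are added to the training covariance only.
    K = rbf(x, x, variance, lengthscale) + np.diag(y_err.ravel() ** 2)
    Ks = rbf(x, xnew, variance, lengthscale)   # cross-covariance, shape (n, n_new)
    Kss_diag = np.full(len(xnew), variance)    # prior variance of the RBF at xnew

    L = np.linalg.cholesky(K)
    y_mean = y.mean()
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - y_mean))
    v = np.linalg.solve(L, Ks)

    mean = Ks.T @ alpha + y_mean
    var = Kss_diag - np.sum(v**2, axis=0)      # variance of the latent function
    return mean.ravel(), var


# Synthetic data in the spirit of the example above (assumed, not the original MWE)
rng = np.random.default_rng(0)
n = 55
x = np.sort(rng.uniform(0, 12, (n, 1)), axis=0)
y_err = rng.uniform(0.15, 0.5, (n, 1))         # known standard deviations
y = np.sin(x) + 0.2 * x + rng.normal(0.0, 1.0, (n, 1)) * y_err

xnew = np.linspace(0, 15, 100).reshape(-1, 1)  # 100 new points, 55 training points
mean, var = predict_with_known_noise(x, y, y_err, xnew)
```

The returned variance is for the noise-free latent function; if error bars for a future noisy measurement are wanted, that measurement's own variance would have to be added by hand.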
I'm trying to run a GPRegressor on some [x, y] data where there are known uncertainties (standard deviations) in the y data. I'm using the method outlined in #196 ... this issue is from 2015, but I can't find any more recent mention of this in the docs?
The model optimizes fine, but whenever I try to make a prediction on new x values I get an array broadcast error.
I'm not familiar enough with GPy to know whether this is me being stupid, or a bug... Any thoughts?
Setup:
I'm running GPy from the devel branch (because waiting for updates to allow GPy to work with numpy >= 1.24)
MWE:
Error: