This code generates error:
IndexError: invalid index to scalar variable.
at the line: results.append(RMSPE(np.expm1(y_train[testcv]), [y[1] for y in y_test]))
How to fix it?
import pandas as pd
import numpy as np
from sklearn import ensemble
from sklearn import cross_validation
def ToWeight(y):
w = np.zeros(y.shape, dtype=float)
ind = y != 0
w[ind] = 1./(y[ind]**2)
return w
def RMSPE(y, yhat):
w = ToWeight(y)
rmspe = np.sqrt(np.mean( w * (y - yhat)**2 ))
return rmspe
forest = ensemble.RandomForestRegressor(n_estimators=10, min_samples_split=2, n_jobs=-1)
print ("Cross validations")
cv = cross_validation.KFold(len(train), n_folds=5)
results = []
for traincv, testcv in cv:
y_test = np.expm1(forest.fit(X_train[traincv], y_train[traincv]).predict(X_train[testcv]))
results.append(RMSPE(np.expm1(y_train[testcv]), [y[1] for y in y_test]))
testcv
is:
[False False False ..., True True True]
In the for, you have an iteration, then for each element of that loop which probably is a scalar, has no index. When each element is an empty array, single variable, or scalar and not a list or array you cannot use indices.
Basically, 1
is not a valid index of y
. If the visitor is comming from his own code he should check if his y
contains the index which he tries to access (in this case the index is 1
).
Source: Stackoverflow.com