[python] Error in Python script "Expected 2D array, got 1D array instead:"?

I'm following this tutorial to make this ML prediction:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style

style.use("ggplot")
from sklearn import svm

x = [1, 5, 1.5, 8, 1, 9]
y = [2, 8, 1.8, 8, 0.6, 11]

plt.scatter(x,y)
plt.show()

X = np.array([[1,2],
             [5,8],
             [1.5,1.8],
             [8,8],
             [1,0.6],
             [9,11]])

y = [0,1,0,1,0,1]
X.reshape(1, -1)

clf = svm.SVC(kernel='linear', C = 1.0)
clf.fit(X,y)

print(clf.predict([0.58,0.76]))

I'm using Python 3.6 and I get error "Expected 2D array, got 1D array instead:" I think the script is for older versions, but I don't know how to convert it to the 3.6 version.

Already try with the:

X.reshape(1, -1)

This question is related to python python-3.x machine-learning predict

The answer is


You are just supposed to provide the predict method with the same 2D array, but with one value that you want to process (or more). In short, you can just replace

[0.58,0.76]

With

[[0.58,0.76]]

And it should work.

EDIT: This answer became popular so I thought I'd add a little more explanation about ML. The short version: we can only use predict on data that is of the same dimensionality as the training data (X) was.

In the example in question, we give the computer a bunch of rows in X (with 2 values each) and we show it the correct responses in y. When we want to predict using new values, our program expects the same - a bunch of rows. Even if we want to do it to just one row (with two values), that row has to be part of another array.


I faced the same issue except that the data type of the instance I wanted to predict was a panda.Series object.

Well I just needed to predict one input instance. I took it from a slice of my data.

df = pd.DataFrame(list(BiogasPlant.objects.all()))
test = df.iloc[-1:]       # sliced it here

In this case, you'll need to convert it into a 1-D array and then reshape it.

 test2d = test.values.reshape(1,-1)

From the docs, values will convert Series into a numpy array.


Just insert the argument between a double square bracket:

regressor.predict([[values]])

that worked for me


The X and Y matrix of Independent Variable and Dependent Variable respectively to DataFrame from int64 Type so that it gets converted from 1D array to 2D array.. i.e X=pd.DataFrame(X) and Y=pd.dataFrame(Y) where pd is of pandas class in python. and thus feature scaling in-turn doesn't lead to any error!


I use the below approach.

reg = linear_model.LinearRegression()
reg.fit(df[['year']],df.income)

reg.predict([[2136]])

With one feature my Dataframe list converts to a Series. I had to convert it back to a Dataframe list and it worked.

if type(X) is Series:
    X = X.to_frame()

I was facing the same issue earlier but I have somehow found the solution, You can try reg.predict([[3300]]).

The API used to allow scalar value but now you need to give a 2D array.


The problem is occurring when you run prediction on the array [0.58,0.76]. Fix the problem by reshaping it before you call predict():

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style

style.use("ggplot")
from sklearn import svm

x = [1, 5, 1.5, 8, 1, 9]
y = [2, 8, 1.8, 8, 0.6, 11]

plt.scatter(x,y)
plt.show()

X = np.array([[1,2],
             [5,8],
             [1.5,1.8],
             [8,8],
             [1,0.6],
             [9,11]])

y = [0,1,0,1,0,1]

clf = svm.SVC(kernel='linear', C = 1.0)
clf.fit(X,y)

test = np.array([0.58, 0.76])
print test       # Produces: [ 0.58  0.76]
print test.shape # Produces: (2,) meaning 2 rows, 1 col

test = test.reshape(1, -1)
print test       # Produces: [[ 0.58  0.76]]
print test.shape # Produces (1, 2) meaning 1 row, 2 cols

print(clf.predict(test)) # Produces [0], as expected

I faced the same problem. You just have to make it an array and moreover you have to put double squared brackets to make it a single element of the 2D array as first bracket initializes the array and the second makes it an element of that array.

So simply replace the last statement by:

print(clf.predict(np.array[[0.58,0.76]]))

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to python-3.x

Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation Replace specific text with a redacted version using Python Upgrade to python 3.8 using conda "Permission Denied" trying to run Python on Windows 10 Python: 'ModuleNotFoundError' when trying to import module from imported package What is the meaning of "Failed building wheel for X" in pip install? How to downgrade python from 3.7 to 3.6 I can't install pyaudio on Windows? How to solve "error: Microsoft Visual C++ 14.0 is required."? Iterating over arrays in Python 3 How to upgrade Python version to 3.7?

Examples related to machine-learning

Error in Python script "Expected 2D array, got 1D array instead:"? How to predict input image using trained model in Keras? What is the role of "Flatten" in Keras? How to concatenate two layers in keras? How to save final model using keras? scikit-learn random state in splitting dataset Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? What is the meaning of the word logits in TensorFlow? Can anyone explain me StandardScaler? Can Keras with Tensorflow backend be forced to use CPU or GPU at will?

Examples related to predict

Error in Python script "Expected 2D array, got 1D array instead:"? Predict() - Maybe I'm not understanding it R: numeric 'envir' arg not of length one in predict()