I am getting
TypeError: unhashable type: 'slice'
when executing the below code for encoding categorical data in Python. Can anyone please help?
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('50_Startups.csv')
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])
This question is related to: python, pandas, numpy, matplotlib
X is a DataFrame and can't be indexed with NumPy-style slices like X[:, 3]. You must go through iloc or X.values. However, the way you constructed X made it a copy, so I'd use values.
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
# dataset = pd.read_csv('50_Startups.csv')
dataset = pd.DataFrame(np.random.rand(10, 10))
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
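# I changed this line: index the NumPy array from .values, not the DataFrame.
# .copy() makes sure the write goes to an array we own.
X = X.values.copy()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])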
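To make the explanation above concrete, here is a minimal sketch that reproduces the failure on a tiny made-up frame. The column names and numbers are assumptions standing in for 50_Startups.csv, and the exact exception class may differ between pandas and Python versions, so the sketch only records that some exception is raised. It also shows that, on a frame with mixed column types, writing into X.values does not reach X.

```python
import pandas as pd

# Hypothetical stand-in for 50_Startups.csv; names and numbers are made up.
dataset = pd.DataFrame({
    "R&D Spend": [165349.2, 162597.7, 153441.5],
    "Administration": [136897.8, 151377.6, 101145.6],
    "Marketing Spend": [471784.1, 443898.5, 407934.5],
    "State": ["New York", "California", "Florida"],
    "Profit": [192261.8, 191792.1, 191050.4],
})
X = dataset.iloc[:, 0:4].copy()

# X[:, 3] is treated as a column-label lookup with the key (slice(None), 3),
# which is not a valid label, so the lookup fails.
try:
    X[:, 3]
    raised = False
except Exception as exc:  # the exception class varies by version
    raised = True
    error_name = type(exc).__name__

# Positional reads work through iloc, or on the array returned by .values.
states_iloc = X.iloc[:, 3].tolist()
states_values = X.values[:, 3].tolist()

# With mixed column types, .values builds a new array,
# so a write into that array leaves X untouched.
arr = X.values
arr[0, 3] = "changed"
still_original = X.iloc[0, 3] == "New York"
```

If the encoded column has to end up back in X, it is probably safer to keep the array in its own variable (or assign through iloc) than to write into X.values directly.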
When creating the matrix X and the vector Y, use .values.
X=dataset.iloc[:,0:4].values
Y=dataset.iloc[:,4].values
This should solve your problem.
Try changing X[:,3] to X.iloc[:,3] in the label encoder call.
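A minimal sketch of the iloc suggestion, assuming scikit-learn is installed. The frame and its column names are made up for illustration. The column is read by position with iloc, and the encoded result is written back by column label, which replaces the whole column so its type can change from text to integers.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical miniature feature frame; names and numbers are made up.
X = pd.DataFrame({
    "R&D Spend": [165349.2, 162597.7, 153441.5, 144372.4],
    "Administration": [136897.8, 151377.6, 101145.6, 118671.9],
    "Marketing Spend": [471784.1, 443898.5, 407934.5, 383199.6],
    "State": ["New York", "California", "Florida", "New York"],
})

labelencoder_X = LabelEncoder()

# Read the fourth column by position instead of X[:, 3].
encoded = labelencoder_X.fit_transform(X.iloc[:, 3])

# Replace the column by label so its dtype can switch to integer.
X[X.columns[3]] = encoded

print(list(labelencoder_X.classes_))
print(X["State"].tolist())
```

LabelEncoder sorts the class names, so here California maps to 0, Florida to 1 and New York to 2.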
I was getting the same error (TypeError: unhashable type: 'slice') with the code below:
included_cols = [2,4,10]
dataset = dataset[:,included_cols] #Columns 2,4 and 10 are included.
I resolved it with the code below, by putting iloc after dataset:
included_cols = [2,4,10]
dataset = dataset.iloc[:,included_cols] #Columns 2,4 and 10 are included.
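The same fix can be tried on a small made-up frame. The column names c0 to c11 are assumptions, chosen so it is obvious that iloc selects by position rather than by label.

```python
import numpy as np
import pandas as pd

# Made-up frame: 3 rows, 12 columns named c0 .. c11.
dataset = pd.DataFrame(
    np.arange(36).reshape(3, 12),
    columns=[f"c{i}" for i in range(12)],
)

included_cols = [2, 4, 10]

# iloc takes integer positions, so a list of positions selects those columns.
subset = dataset.iloc[:, included_cols]

print(list(subset.columns))
```

The row selector (the colon) and the column selector (the list) both go inside the brackets after iloc, which is what the plain dataset[...] form does not support.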
Use .values either when creating the variable X or when encoding, as mentioned above.
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
# dataset = pd.read_csv('50_Startups.csv')
dataset = pd.DataFrame(np.random.rand(10, 10))
y=dataset.iloc[:, 4].values
X=dataset.iloc[:, 0:4].values
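Once X is a NumPy array, the original X[:, 3] indexing is valid. The sketch below uses a made-up frame, and np.unique with return_inverse=True stands in for LabelEncoder so that the example needs only NumPy and pandas. That substitution is an assumption made for illustration, not what the answers above use.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for 50_Startups.csv; names and numbers are made up.
dataset = pd.DataFrame({
    "R&D Spend": [165349.2, 162597.7, 153441.5],
    "Administration": [136897.8, 151377.6, 101145.6],
    "Marketing Spend": [471784.1, 443898.5, 407934.5],
    "State": ["New York", "California", "Florida"],
    "Profit": [192261.8, 191792.1, 191050.4],
})

y = dataset.iloc[:, 4].values
X = dataset.iloc[:, 0:4].values  # object-dtype array, since column types are mixed

# NumPy-style slicing now works on X.
states = X[:, 3].tolist()

# Stand-in for LabelEncoder: sorted unique labels plus an integer code per row.
classes, codes = np.unique(X[:, 3], return_inverse=True)
X[:, 3] = codes

print(type(X).__name__, X.shape, y.shape)
```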
Your x and y are not being extracted as values, so first of all start by writing it like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Raw string so the backslashes in the Windows path are not read as escapes
dataframe=pd.read_csv(r".\datasets\Position_Salaries.csv")
x=dataframe.iloc[:,1:2].values
y=dataframe.iloc[:,2].values
x1=dataframe.iloc[:,:-1].values
The key point is to call .values.
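One detail in the snippet above is easy to miss: a slice such as 1:2 keeps the result two-dimensional, while a single integer such as 2 gives a one-dimensional array. A small sketch with a made-up frame (the column names are assumptions standing in for Position_Salaries.csv):

```python
import pandas as pd

# Made-up stand-in for Position_Salaries.csv.
dataframe = pd.DataFrame({
    "Position": ["Analyst", "Consultant", "Manager"],
    "Level": [1, 2, 3],
    "Salary": [45000, 50000, 60000],
})

x = dataframe.iloc[:, 1:2].values   # slice 1:2 keeps two dimensions
y = dataframe.iloc[:, 2].values     # a single integer gives one dimension
x1 = dataframe.iloc[:, :-1].values  # every column except the last

shapes = (x.shape, y.shape, x1.shape)
print(shapes)
```

scikit-learn estimators generally expect a two-dimensional feature matrix, which is presumably why x is taken with 1:2 rather than 1.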
If you use .values while creating the matrix X and the vector y, it will fix the problem.
y=dataset.iloc[:, 4].values
X=dataset.iloc[:, 0:4].values
When you use .values, pandas returns a NumPy representation of the DataFrame with the axis labels removed. Check the link below for more information.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.values.html
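A short sketch of what that means in practice, on made-up data. The result is a plain NumPy array with no index or column labels, and a frame with mixed column types comes back with object dtype. The pandas documentation recommends DataFrame.to_numpy() over .values; for these frames both give the same array.

```python
import numpy as np
import pandas as pd

# Made-up frame with explicit row labels.
df = pd.DataFrame(
    {"Level": [1, 2, 3], "Salary": [45000.0, 50000.0, 60000.0]},
    index=["a", "b", "c"],
)

arr = df.values  # plain ndarray: only the data, no labels
same_as_to_numpy = np.array_equal(arr, df.to_numpy())

# Mixing numbers and text forces a common object dtype.
mixed = pd.DataFrame({"Level": [1, 2], "State": ["Florida", "New York"]})
mixed_dtype = mixed.values.dtype

print(type(arr).__name__, arr.shape, arr.dtype, mixed_dtype)
```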
Source: Stackoverflow.com