How can I use the simple imputer class to replace missing values with mean values in Python?

This is my code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
#Importing Dataset
dataset = pd.read_csv('C:/Users/Rupali Singh/Desktop/ML A-Z/Machine Learning A-Z Template Folder/Part 1 - Data Preprocessing/Data.csv')
print(dataset)
X = dataset.iloc[:, :-1].values
Y = dataset.iloc[:, 3].values
#Missing Data

from sklearn.impute import SimpleImputer

imputer = SimpleImputer(missing_values= np.nan, strategy='mean')
X.fit[:, 1:3] = imputer.fit_transform(X[:, 1:3])
print(X)
My data set:
Country   Age   Salary Purchased
0   France  44.0  72000.0        No
1    Spain  27.0  48000.0       Yes
2  Germany  30.0  54000.0        No
3    Spain  38.0  61000.0        No
4  Germany  40.0      NaN       Yes
5   France  35.0  58000.0       Yes
6    Spain   NaN  52000.0        No
7   France  48.0  79000.0       Yes
8  Germany  50.0  83000.0        No
9   France  37.0  67000.0       Yes
Error Message:
File "C:/Users/Rupali Singh/PycharmProjects/Machine_Learning/data_preprocessing_Template.py", line 15, in
    X.fit[:, 1:3] = imputer.fit_transform(X[:, 1:3])
AttributeError: 'numpy.ndarray' object has no attribute 'fit'
Answered by Dominic Poole

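Your error comes from writing X.fit[...] on the left-hand side of the assignment: X is a NumPy array, and NumPy arrays have no fit attribute. SimpleImputer itself accepts both NumPy arrays and DataFrames. Here's how I used it on a DataFrame: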


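import numpy as np
from sklearn.impute import SimpleImputer  # replaces the older sklearn.preprocessing.Imputer
# strategy='median' is what I used; pass strategy='mean' to fill with the column mean
imr = SimpleImputer(missing_values=np.nan, strategy='median')
imr = imr.fit(data[['age']])
data['age'] = imr.transform(data[['age']]).ravel()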
X.fit[:, 1:3] = imputer.fit_transform(...) is wrong. fit() is a method of the imputer, not of the array, so looking up X.fit on a NumPy array raises the AttributeError you see.
Use X[:, 1:3] = imputer.fit_transform(X[:, 1:3]) instead.
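
For reference, here is a minimal sketch of the corrected code run end to end. It assumes the ten rows shown in the question and builds them inline with pd.DataFrame instead of reading Data.csv, so that it runs without the file. That inline construction is my substitution, not part of the original script.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Rows copied from the question; built inline here instead of read from Data.csv
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain", "Germany",
                "France", "Spain", "France", "Germany", "France"],
    "Age": [44.0, 27.0, 30.0, 38.0, 40.0, 35.0, np.nan, 48.0, 50.0, 37.0],
    "Salary": [72000.0, 48000.0, 54000.0, 61000.0, np.nan,
               58000.0, 52000.0, 79000.0, 83000.0, 67000.0],
    "Purchased": ["No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"],
})

X = dataset.iloc[:, :-1].values  # Country, Age, Salary (object array)
Y = dataset.iloc[:, -1].values   # Purchased

# Replace each NaN in the Age and Salary columns with that column's mean
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X[:, 1:3] = imputer.fit_transform(X[:, 1:3])  # index X itself, not X.fit

print(X)
```

The missing Age in row 6 and the missing Salary in row 4 should come out as the means of the nine known values in their columns.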

