How to resolve the ValueError: Input contains NaN, infinity or a value too large for dtype('float64')?

Asked by ranjan_6399 in Data Science, asked on Feb 10, 2023

I got a ValueError when predicting on test data with a RandomForest model.

My code:

clf = RandomForestClassifier(n_estimators=10, max_depth=6, n_jobs=1, verbose=2)
clf.fit(X_fit, y_fit)
df_test.fillna(df_test.mean())
X_test = df_test.values  
y_pred = clf.predict(X_test)
The error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

How do I find the bad values in the test dataset? Also, I do not want to drop these records. Can I just replace them with the mean or median?

Answered by Ranjana Admin

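In most cases, this ValueError goes away once the infinite and null values in the input are replaced. Note that df_test.fillna(df_test.mean()) in the question returns a new DataFrame and leaves df_test unchanged, so the result has to be assigned back to df_test (or the call made with inplace=True) before taking df_test.values.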
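
To answer the "how do I find the bad values" part, here is a minimal sketch rather than a definitive implementation. The helper name find_bad_values and the sample df_test are hypothetical, and the code assumes the numeric columns use plain NumPy dtypes. It also counts finite values that exceed the float32 range, on the assumption that the model casts its input to float32, as the dtype('float32') in the error message suggests.

```python
import numpy as np
import pandas as pd


def find_bad_values(df):
    """Hypothetical helper: report NaN, infinite and float32-overflowing values."""
    # Only numeric columns can hold inf or overflow; skip strings and other types.
    numeric = df.select_dtypes(include=[np.number])

    is_nan = numeric.isna()
    is_inf = np.isinf(numeric)
    # Finite values that would turn into inf if cast to float32.
    float32_max = float(np.finfo(np.float32).max)
    too_large = (numeric.abs() > float32_max) & ~is_inf

    # One row per numeric column, one count per kind of problem.
    summary = pd.DataFrame(
        {"nan": is_nan.sum(), "inf": is_inf.sum(), "too_large": too_large.sum()}
    )
    # Index labels of the rows that contain at least one problem value.
    bad_rows = numeric.index[(is_nan | is_inf | too_large).any(axis=1)]
    return summary, bad_rows


# Made-up sample data standing in for the asker's df_test.
df_test = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, 4.0],
        "b": [np.inf, 2.0, 1e300, 5.0],
        "c": ["w", "x", "y", "z"],
    }
)
summary, bad_rows = find_bad_values(df_test)
print(summary)
print(df_test.loc[bad_rows])
```

The summary shows which columns need attention, and df_test.loc[bad_rows] lets you inspect the affected records before deciding how to fill them.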


First, get rid of infinite values by converting them to NaN (this needs import numpy as np):
df.replace([np.inf, -np.inf], np.nan, inplace=True)
Then fill the null values however you prefer: with a specific value such as 999, with the column mean, or with your own imputation function:
df.fillna(999, inplace=True)
or
df.fillna(df.mean(), inplace=True)
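
Since the question also asks about the median, the two steps can be combined into one function that returns a cleaned copy. This is only a sketch under stated assumptions: clean_features and its fill_values parameter are hypothetical names, the sample data is made up, and it does not handle finite values that are merely too large for float32.

```python
import numpy as np
import pandas as pd


def clean_features(df, fill_values=None):
    """Hypothetical helper: return a copy with inf and NaN replaced."""
    # Turn +inf / -inf into NaN so one fillna call handles everything.
    cleaned = df.replace([np.inf, -np.inf], np.nan)
    if fill_values is None:
        # Default to per-column medians of the numeric columns.
        fill_values = cleaned.median(numeric_only=True)
    # fillna returns a new DataFrame, so the caller must keep the result.
    return cleaned.fillna(fill_values)


# Made-up sample data standing in for the asker's df_test.
df_test = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.inf, 2.0, 4.0]})

# Assign the result back; this is the step missing from the question's code.
df_test = clean_features(df_test)
X_test = df_test.values
print(np.isfinite(X_test).all())
```

Passing fill_values explicitly, for example medians computed on the training data, keeps the test set filled with the same values the model saw during fitting. That is a design choice, not something the original answer prescribes.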

