All I have to do is read the data using pandas and then train a decision tree on it. The data that I am using is:

[image: the dataset]

And the following is my script:

    import numpy as np
    ...

From the error, I am guessing that it couldn't convert the "med" attribute value to float. And by looking at the data, my rough guess is that "low" has a space before it and "med" doesn't.

PS: the error occurs at the last line, and here is the traceback:

    ValueError                                Traceback (most recent call last)
    ...
    /home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/tree/tree.pyc in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
        114         random_state = check_random_state(self.random_state)
    --> 116             X = check_array(X, dtype=DTYPE, accept_sparse="csc")
        117             y = check_array(y, ensure_2d=False, dtype=None)
    /home/fatima/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.

Python raises "ValueError: could not convert string to float" when you try to convert a string that is not a valid floating-point literal. Usually this happens because the string holds text such as "med", or a number written with a comma or an embedded space.

The dataset looks like this:

    0 1 2 3 4 5 6
    [sample rows not shown]

Here the data types (dtypes) of all the columns are object. However, machine learning algorithms can only learn from numbers (int, float, double, ...), so you need to encode your data before you use it for training.

There are several ways to encode your data. One way is label encoding. To use it, add the following lines to your code just after loading the dataset:

    from sklearn import preprocessing  # needed for LabelEncoder

    le = preprocessing.LabelEncoder()
    balance_data = balance_data.apply(le.fit_transform)

Now the data in balance_data looks like this:

    0 1 2 3 4 5 6
    [sample rows not shown]

Here's the overall code with the fix:

    import numpy as np
    ...
    # In scikit-learn 0.20 and later, import train_test_split from sklearn.model_selection instead.
    from sklearn.cross_validation import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score
    ...
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=100)
    clf_gini = DecisionTreeClassifier(criterion="gini", random_state=100,
    ...

In general, you need to perform some data preprocessing before training/fitting your model. For that, I recommend that you go through a tutorial to understand the process, for example: practical-guide-data-preprocessing-python-scikit-learn.
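To make the label-encoding idea above more concrete, here is a minimal standard-library sketch. It is not scikit-learn's implementation; the helper `label_encode` and the sample column are hypothetical and chosen only for illustration. It first reproduces the float conversion failure on a value like "med", then assigns each distinct string an integer code in sorted order, which is the same ordering LabelEncoder uses for its classes.

```python
def label_encode(values):
    """Map each distinct value to an integer code, assigned in sorted order."""
    classes = sorted(set(values))
    code_of = {label: code for code, label in enumerate(classes)}
    return [code_of[v] for v in values], classes


# Hypothetical categorical column, similar in spirit to the question's data.
column = ["low", "med", "high", "med", "low"]

# Converting a categorical string directly raises the error from the traceback.
try:
    float(column[1])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# After encoding, every value is an integer that a tree model can consume.
codes, classes = label_encode(column)

print(error_message)
print(classes)  # distinct labels in sorted order
print(codes)    # one integer code per original value
```

Note that the integer codes carry an arbitrary order ("high" sorts before "low" here), which is usually acceptable for tree-based models but can mislead models that treat the numbers as magnitudes.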