“ValueError: could not convert string to float” error in scikit-learn












I'm running the following script:



import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
dataset = pd.read_csv('data/50_Startups.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values
onehotencoder = OneHotEncoder(categorical_features=3,
handle_unknown='ignore')
onehotencoder.fit(X)


The data head looks like this: [image: data]



And I get this error:




ValueError: could not convert string to float: 'New York'




I read the answers to similar questions and then checked the scikit-learn documentation, but as you can see there, the scikit-learn authors don't have any issues with spaces in strings.



I know that I can use LabelEncoder from sklearn.preprocessing and then apply OneHotEncoder (OHE), and that works well, but in that case the following warning appears:

In case you used a LabelEncoder before this OneHotEncoder to convert the categories to integers, then you can now use the OneHotEncoder directly.
warnings.warn(msg, FutureWarning)



You can use the full CSV file or these first 5 rows to test this code:

[[165349.2, 136897.8, 471784.1, 'New York', 192261.83],
[162597.7, 151377.59, 443898.53, 'California', 191792.06],
[153441.51, 101145.55, 407934.54, 'Florida', 191050.39],
[144372.41, 118671.85, 383199.62, 'New York', 182901.99],
[142107.34, 91391.77, 366168.42, 'Florida', 166187.94]]







python numpy scikit-learn














edited Nov 26 '18 at 1:33
Aziz Temirkhanov

asked Nov 26 '18 at 0:08
Aziz Temirkhanov
43













  • My input, as you can see from the code, is a CSV file

    – Aziz Temirkhanov
    Nov 26 '18 at 0:14






  • Try dataset.info() to check the types of data that you have in your dataframe.

    – Jorge
    Nov 26 '18 at 0:20






  • I've added the first 5 lines and a link to a pastebin with the full content of the file

    – Aziz Temirkhanov
    Nov 26 '18 at 0:29











  • The 'State' column is full of 50 non-null objects. Now I see the problem, but I still have no idea how to fix it without using LabelEncoder

    – Aziz Temirkhanov
    Nov 26 '18 at 0:31











  • What would you expect 'New York' to be as a floating point number? Why would you think it has anything to do with a space in the string?

    – Jared Smith
    Nov 26 '18 at 0:33
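
The dtype check suggested in the comments can be sketched roughly as follows. This is only an illustration: it builds the DataFrame from the five sample rows instead of reading the CSV, and every column name except 'State' is an assumption, since the question does not show the header.

```python
import pandas as pd

# Column names other than 'State' are assumptions made for this illustration.
columns = ['R&D Spend', 'Administration', 'Marketing Spend', 'State', 'Profit']

# The five sample rows from the question, used here in place of the CSV file.
rows = [
    [165349.2, 136897.8, 471784.1, 'New York', 192261.83],
    [162597.7, 151377.59, 443898.53, 'California', 191792.06],
    [153441.51, 101145.55, 407934.54, 'Florida', 191050.39],
    [144372.41, 118671.85, 383199.62, 'New York', 182901.99],
    [142107.34, 91391.77, 366168.42, 'Florida', 166187.94],
]
dataset = pd.DataFrame(rows, columns=columns)

# Per-column dtypes; dataset.info() prints the same information with counts.
print(dataset.dtypes)

# Columns that are not numeric are the ones an estimator cannot cast to float.
non_numeric = dataset.select_dtypes(exclude='number').columns.tolist()
print(non_numeric)
```

Any column listed in non_numeric has to be encoded before the array is handed to something that expects floats.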































2 Answers
It is categorical_features=3 that causes the error. You cannot use categorical_features with string data. Remove this option and pass only the string column to the encoder. Also, you probably want fit_transform rather than just fit.



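# Encode only the string column (index 3); X[:, [3]] keeps it two-dimensional.
onehotencoder = OneHotEncoder(handle_unknown='ignore')
transformed = onehotencoder.fit_transform(X[:, [3]]).toarray()
# Put the one-hot columns between the columns before and after index 3.
# Note the slice is :3 (not :2), so that column 2 is not dropped.
X1 = np.concatenate([X[:, :3], transformed, X[:, 4:]], axis=1)
# Result when X holds all five columns of the 5-row sample from the question:
#array([[165349.2, 136897.8, 471784.1, 0.0, 0.0, 1.0, 192261.83],
#       [162597.7, 151377.59, 443898.53, 1.0, 0.0, 0.0, 191792.06],
#       [153441.51, 101145.55, 407934.54, 0.0, 1.0, 0.0, 191050.39],
#       [144372.41, 118671.85, 383199.62, 0.0, 0.0, 1.0, 182901.99],
#       [142107.34, 91391.77, 366168.42, 0.0, 1.0, 0.0, 166187.94]], dtype=object)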
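
For anyone who wants to run this without the CSV file, here is a minimal self-contained sketch of the same approach. It assumes the five sample rows from the question stand in for the file, and the names rows, data and state_onehot are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# The five sample rows from the question (assumption: used instead of the CSV).
rows = [
    [165349.2, 136897.8, 471784.1, 'New York', 192261.83],
    [162597.7, 151377.59, 443898.53, 'California', 191792.06],
    [153441.51, 101145.55, 407934.54, 'Florida', 191050.39],
    [144372.41, 118671.85, 383199.62, 'New York', 182901.99],
    [142107.34, 91391.77, 366168.42, 'Florida', 166187.94],
]
data = np.array(rows, dtype=object)
X = data[:, :-1]  # three numeric columns plus the state, as in the question
y = data[:, 4]    # last column is the target

# Fit the encoder on the string column only, passed as a 2D slice.
encoder = OneHotEncoder(handle_unknown='ignore')
state_onehot = encoder.fit_transform(X[:, [3]]).toarray()

# Rebuild the feature matrix: numeric columns first, then the one-hot columns.
X1 = np.concatenate([X[:, :3], state_onehot], axis=1).astype(float)

print(encoder.categories_[0])  # categories in the order of the one-hot columns
print(X1.shape)
```

The encoder sorts the categories, so the one-hot columns follow the order reported by encoder.categories_.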














    edited Nov 26 '18 at 1:50

    answered Nov 26 '18 at 0:56
    DYZ
    27k62049













    • In that case the whole dataset is transformed to categorical data, not only column 3

      – Aziz Temirkhanov
      Nov 26 '18 at 0:57











    • You can choose which columns to transform.

      – DYZ
      Nov 26 '18 at 0:58











    • I ran this code: onehotencoder = OneHotEncoder(handle_unknown='ignore'); onehotencoder.fit(X[:, 3]) and got this error: ValueError: Expected 2D array, got 1D array instead:

      – Aziz Temirkhanov
      Nov 26 '18 at 1:08













    • Because you pass a 1D array instead of a 2D array. You ought to pass X[:, [3]] or X[:, 3].reshape(-1, 1).

      – DYZ
      Nov 26 '18 at 1:24











    • OK, I did it like you said. Now if I apply X = onehotencoder.transform(X[:, [3]]).toarray(), I lose my first 3 columns. If I apply X = onehotencoder.transform(X[:, 3]).toarray(), the same error occurs

      – Aziz Temirkhanov
      Nov 26 '18 at 1:30
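
The 1D versus 2D point from these comments can be checked with NumPy alone. The small array below is made up for illustration; scikit-learn transformers expect input of shape (n_samples, n_features), so a single column needs shape (n, 1).

```python
import numpy as np

# A tiny mixed-type array standing in for X (illustrative data only).
X = np.array([
    [165349.2, 'New York'],
    [162597.7, 'California'],
    [153441.51, 'Florida'],
], dtype=object)

col_1d = X[:, 1]                       # plain index: 1D, shape (3,)
col_2d = X[:, [1]]                     # list index: 2D column, shape (3, 1)
col_reshaped = X[:, 1].reshape(-1, 1)  # same 2D column via reshape
single_row = X[:, 1].reshape(1, -1)    # shape (1, 3): one sample, three features

print(col_1d.shape, col_2d.shape, col_reshaped.shape, single_row.shape)
```

Only the (3, 1) shapes describe three samples of one feature; the (1, 3) shape would be read as a single sample with three features.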





















        Try this:



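from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

# One-hot encode column 3 (State) and pass the other columns through unchanged.
columntransformer = make_column_transformer(
    (OneHotEncoder(categories='auto'), [3]),
    remainder='passthrough')

# The encoded columns come first in the output, followed by the passthrough columns.
X = columntransformer.fit_transform(X)
X = X.astype(float)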
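
A runnable sketch of this answer on the sample rows from the question, in case the CSV is not at hand. The rows variable and the sparse_threshold=0 argument are additions for this illustration (the latter just forces a dense result); they are not part of the original answer.

```python
import numpy as np
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

# The five sample rows from the question (assumption: used instead of the CSV).
rows = [
    [165349.2, 136897.8, 471784.1, 'New York', 192261.83],
    [162597.7, 151377.59, 443898.53, 'California', 191792.06],
    [153441.51, 101145.55, 407934.54, 'Florida', 191050.39],
    [144372.41, 118671.85, 383199.62, 'New York', 182901.99],
    [142107.34, 91391.77, 366168.42, 'Florida', 166187.94],
]
data = np.array(rows, dtype=object)
X = data[:, :-1]  # features, with the state name in column 3

# Encode column 3 only; remainder='passthrough' keeps the numeric columns.
# sparse_threshold=0 is added here so the result is always a dense array.
columntransformer = make_column_transformer(
    (OneHotEncoder(categories='auto'), [3]),
    remainder='passthrough',
    sparse_threshold=0)

X_encoded = columntransformer.fit_transform(X).astype(float)
print(X_encoded.shape)
print(X_encoded[0])
```

Note that the column order changes: the one-hot columns are placed first and the untouched numeric columns follow them.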
















        answered Dec 17 '18 at 2:59
        Muke888
        164





























