Delete a column from a Pandas DataFrame

1970

To delete a column in a DataFrame, I can successfully use:

del df['column_name']

But why can’t I use the following?

del df.column_name

Since it is possible to access the column/Series as df.column_name, I expected this to work.

1

  • 5

    Note this question is being discussed on Meta.

    – R.M.

    May 22, 2019 at 16:37

1255

As you’ve guessed, the right syntax is

del df['column_name']

Making del df.column_name work is difficult because of syntactic limitations in Python, whereas del df[name] is translated by Python into df.__delitem__(name) under the covers, a method that DataFrame implements.
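As a rough illustration of that translation, here is a minimal sketch using a hypothetical Table class (a stand-in invented for this example, not pandas code). It only shows that del obj[name] ends up calling __delitem__:

```python
class Table:
    """Hypothetical stand-in for a DataFrame; not how pandas is implemented."""

    def __init__(self, **columns):
        self._columns = dict(columns)
        self.deleted = []  # names of the columns removed so far

    def __getitem__(self, name):
        return self._columns[name]

    def __delitem__(self, name):
        # Python calls this method for `del table[name]`.
        self.deleted.append(name)
        del self._columns[name]


table = Table(a=[1, 2], b=[3, 4])
del table["a"]  # equivalent to table.__delitem__("a")
```

After the del statement, table.deleted holds the removed name, so the class had a chance to run its own bookkeeping.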

8

  • 33

    I realize this is a super old “answer”, but my curiosity is piqued – why is that a syntactic limitation of Python? class A(object): def __init__(self): self.var = 1 sets up a class, then a = A(); del a.var works just fine…

    Oct 4, 2016 at 14:24

  • 22

    @dwanderson the difference is that when a column is to be removed, the DataFrame needs its own handling for “how to do it”. In the case of del df[name], it gets translated to df.__delitem__(name), which is a method that DataFrame can implement and adapt to its needs. In the case of del df.name, the member variable gets removed without giving any custom code a chance to run. Consider your own example – can you get del a.var to print “deleting variable”? If you can, please tell me how. I can’t 🙂

    – Yonatan

    Dec 22, 2016 at 8:27

  • 12

    @Yonatan You can use either docs.python.org/3/reference/datamodel.html#object.__delattr__ or descriptors for that: docs.python.org/3/howto/descriptor.html

    Jan 19, 2017 at 16:06

  • 6

    @Yonatan Eugene’s comment applies to Python 2 also; descriptors have been in Python 2 since 2.2 and it is trivial to satisfy your requirement 😉

    – C S

    Jun 20, 2017 at 12:38

  • 6

    This answer isn’t really correct – the pandas developers didn’t implement del df.column_name, but that doesn’t mean it is hard to do.

    – wizzwizz4

    Sep 30, 2017 at 9:42
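The __delattr__ hook mentioned in the comments can be sketched as follows. This reuses the A class from the comment above; the log attribute is an addition made for this example, and none of this is pandas code:

```python
class A:
    def __init__(self):
        self.var = 1
        self.log = []  # illustrative record of deletions

    def __delattr__(self, name):
        # Python calls this method for `del a.<name>`.
        self.log.append(f"deleting {name}")
        print("deleting variable")
        # Fall back to the default behaviour to actually remove the attribute.
        super().__delattr__(name)


a = A()
del a.var  # prints "deleting variable"
```

So attribute deletion can run custom code; whether a DataFrame should route it to column removal is a separate design decision.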


298

Use:

columns = ['Col1', 'Col2', ...]
df.drop(columns, inplace=True, axis=1)

This will delete one or more columns in-place. Note that inplace=True was added in pandas v0.13 and won’t work on older versions. You’d have to assign the result back in that case:

df = df.drop(columns, axis=1)
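For a self-contained version of the non-in-place form, here is a small sketch on a made-up three-column frame (the data and the names trimmed and same are illustrative only). It also uses the columns= keyword of drop, which is not part of the answer above and, as far as I know, needs pandas 0.21 or later:

```python
import pandas as pd

# Made-up example data.
df = pd.DataFrame({"Col1": [1, 2], "Col2": [3, 4], "Col3": [5, 6]})

# drop returns a new DataFrame; df itself is left unchanged.
trimmed = df.drop(columns=["Col1", "Col2"])

# Passing the labels positionally with axis=1, as in the answer, gives the same result.
same = df.drop(["Col1", "Col2"], axis=1)
```

Here trimmed and same contain only Col3, while df still has all three columns.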

0