Categories: pandas, python

Change type of pandas series/dataframe column inplace

TL;DR: I’d like to change the data types of pandas dataframe columns in-place.


I have a pandas dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6.1]})

Which by default gets its columns assigned ‘int64’ and ‘float64’ on my system:

df.dtypes
Out[172]:
a int64
b float64
dtype: object

Because my dataframe will be very large, I’d like to set the column data types to int32 and float32 after the dataframe has been created. I know how I could do this:

df['a'] = df['a'].astype(np.int32)
df['b'] = df['b'].astype(np.float32)

or, in one step:

df = df.astype({'a':np.int32, 'b':np.float32})

and the dtypes of my dataframe are indeed:

df.dtypes
Out[180]:
a int32
b float32
dtype: object
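As a quick sanity check on the memory side (just a sketch, relying on DataFrame.memory_usage, which reports the per-column footprint in bytes), the downcast columns should come out at half the size of their 64-bit counterparts:

# compare per-column memory before and after the downcast
df64 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6.1]})
df32 = df64.astype({'a': np.int32, 'b': np.float32})

# int32/float32 use 4 bytes per element instead of 8
print(df64.memory_usage(index=False))
print(df32.memory_usage(index=False))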

However, this seems clunky: having to reassign the series feels unnecessary, especially since many pandas methods have an inplace kwarg. Using that kwarg doesn’t seem to work, though (starting out again with the same dataframe as at the top):

df['a'].astype(np.int32, inplace=True)
df.dtypes
Out[187]:
a int64
b float64
dtype: object

Is there something I’m overlooking here? Is this by design? The same behaviour is shown when working with Series instead of DataFrame objects.
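For completeness, here is a minimal Series-only sketch of what I mean (on the pandas version I’m running, the inplace kwarg appears to be silently ignored rather than raising an error):

s = pd.Series([1, 2, 3])

s.astype(np.int32, inplace=True)  # no error on my version, but also no effect
print(s.dtype)                    # still int64

print(s.astype(np.int32).dtype)   # the converted data only lives in the returned Series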

Many thanks,