Old pre-0.17 pandas.read_csv behavior of `header=True` for inferring header row?

How did pre-0.17 versions of pandas.read_csv() interpret a boolean header=True/False when inferring the header row?

I have CSV data with header:

col1;col2;col3
1.0;10.0;100.0
2.0;20.0;200.0
3.0;30.0;300.0

If I read it with header=True:

df = pandas.read_csv('test.csv', sep=';', header=True)

I get the following DataFrame:

   1.0  10.0  100.0
0    2    20    200
1    3    30    300

This means that pandas used the second row (“row 1”) for the column names (the inferred names are ‘1.0’, ‘10.0’ and ‘100.0’).

Whereas if I read it with header=False:

df = pandas.read_csv('test.csv', sep=';', header=False)

I get the following:

   col1  col2  col3
0     1    10   100
1     2    20   200
2     3    30   300

This means that pandas used the first row (“row 0”) as the header, in spite of the fact that I explicitly said there is no header.
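
For comparison, here is a minimal sketch that reproduces both results using integer header values. It rests on an assumption rather than on anything documented: that old pandas treated the boolean as its integer value (in Python, True == 1 and False == 0). The helper name read_with_header and the inline CSV_TEXT are made up for illustration; the data is the file shown above.

```python
import io

import pandas as pd

# Same content as test.csv above, kept inline so the sketch is self-contained.
CSV_TEXT = (
    "col1;col2;col3\n"
    "1.0;10.0;100.0\n"
    "2.0;20.0;200.0\n"
    "3.0;30.0;300.0\n"
)


def read_with_header(header):
    """Hypothetical helper: parse CSV_TEXT with an explicit header argument.

    `header` is an int row index or None. Booleans are deliberately not
    passed, because newer pandas versions reject them.
    """
    return pd.read_csv(io.StringIO(CSV_TEXT), sep=";", header=header)


# Assumption: header=True behaved like header=1 (row 1 supplies the names).
df_row1 = read_with_header(1)
# Assumption: header=False behaved like header=0 (row 0 supplies the names).
df_row0 = read_with_header(0)
# header=None is the explicit way to say "this file has no header row".
df_none = read_with_header(None)

print(list(df_row1.columns), len(df_row1))
print(list(df_row0.columns), len(df_row0))
print(list(df_none.columns), len(df_none))
```

With header=1 the column names come out as '1.0', '10.0' and '100.0' with two data rows, and with header=0 they are col1, col2 and col3 with three data rows, which matches the two outputs above. With header=None the columns are numbered 0, 1 and 2 and the col1;col2;col3 line is read as data.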

This behaviour is not intuitive to me. Can somebody explain what is happening?