I am trying to convert categorical values into binary values using pandas. The idea is to consider every unique categorical value as a feature (i.e. a column) and put 1 or 0 depending on whether a particular object (i.e. row) was assigned to this category. The following is the code:
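import pandas as pd
# Assumption: DV is scikit-learn's DictVectorizer
from sklearn.feature_extraction import DictVectorizer as DV
data = pd.read_csv('somedata.csv')
# One dict per row, mapping column name to value
converted_val = data.to_dict(orient='records')
vectorizer = DV(sparse=False)
vec_x = vectorizer.fit_transform(converted_val)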
My question is: how can I save this converted data together with the column names? With the code above, I can save the data using the numpy.savetxt function, but this only writes the array, so the column names are lost. Alternatively, is there a more efficient way to perform the above operation?
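
For reference, here is a minimal sketch of one possible approach, not a definitive answer. It assumes every column to be encoded holds string categories, and it uses a small made-up DataFrame in place of somedata.csv. pandas.get_dummies builds one 0/1 column per unique value and keeps the names, and DataFrame.to_csv writes those names as a header row. The toy data, the variable names, and the in-memory buffer are all illustrative assumptions.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for somedata.csv
data = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "size": ["S", "M", "L"],
})

# One indicator column per unique value, named "<column>_<value>"
encoded = pd.get_dummies(data, dtype=int)

# to_csv writes the column names as a header row
buffer = io.StringIO()  # stand-in for a file path such as 'encoded.csv'
encoded.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

If staying with DictVectorizer is preferred, its get_feature_names_out() method (scikit-learn 1.0 and later) returns the generated feature names, which could be used as the columns of a DataFrame built from vec_x before saving.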