Categories
group-by null pandas python unique

Python pandas unique value ignoring NaN

I want to use unique in groupby aggregation, but I don’t want nan in the unique result.

An example dataframe:

df = pd.DataFrame({'a': [1, 2, 1, 1, pd.np.nan, 3, 3], 'b': [0,0,1,1,1,1,1],
'c': ['foo', pd.np.nan, 'bar', 'foo', 'baz', 'foo', 'bar']})
a b c
0 1.0000 0 foo
1 2.0000 0 NaN
2 1.0000 1 bar
3 1.0000 1 foo
4 nan 1 baz
5 3.0000 1 foo
6 3.0000 1 bar

And the groupby:

df.groupby('b').agg({'a': ['min', 'max', 'unique'], 'c': ['first', 'last', 'unique']})

It’s result is:

       a                             c                      
min max unique first last unique
b
0 1.0000 2.0000 [1.0, 2.0] foo foo [foo, nan]
1 1.0000 3.0000 [1.0, nan, 3.0] bar bar [bar, foo, baz]

But I want it without nan:

       a                        c                      
min max unique first last unique
b
0 1.0000 2.0000 [1.0, 2.0] foo foo [foo]
1 1.0000 3.0000 [1.0, 3.0] bar bar [bar, foo, baz]

How can I do that? Of course I have several columns to aggregate and every column needs different aggregation functions, so I don’t want to do the unique aggregations one-by-one and separately from other aggregations.

Thank you!