I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried:
df = df[(df['col'] < -0.25) or (df['col'] > 0.25)]
But I get the error:
Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
The or and and Python statements require truth-values. For pandas, these are considered ambiguous, so you should use the "bitwise" | (or) or & (and) operations:
df = df[(df['col'] < -0.25) | (df['col'] > 0.25)]
These are overloaded for these kinds of data structures to yield the element-wise or or and.
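A minimal, self-contained sketch of that fix. The frame below is made up for illustration; only the column name col comes from the question:

```python
import pandas as pd

# Hypothetical data for illustration; 'col' mirrors the question's column name.
df = pd.DataFrame({"col": [-0.5, -0.1, 0.0, 0.2, 0.3]})

# Each comparison yields a boolean Series; | combines them row by row.
mask = (df["col"] < -0.25) | (df["col"] > 0.25)
outside = df[mask]

print(outside["col"].tolist())  # [-0.5, 0.3]
```

Only the rows whose value lies outside [-0.25, 0.25] survive the filter.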
Just to add some more explanation to this statement:
The exception is thrown when you want to get the bool of a pandas.Series:
>>> import pandas as pd
>>> x = pd.Series([1])
>>> bool(x)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What you hit was a place where the operator implicitly converted the operands to bool (you used or, but it also happens for and, if and while):
>>> x or x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> x and x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> if x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> while x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Besides these four statements, there are several Python functions that hide some bool calls (like any, all, filter, …). These are normally not problematic with pandas.Series, but for completeness I wanted to mention them.
In your case, the exception isn't really helpful, because it doesn't mention the right alternatives. For and and or, if you want element-wise comparisons, you can use:
>>> import numpy as np
>>> np.logical_or(x, y)
or simply the | operator:
>>> x | y
and, for the element-wise and:
>>> np.logical_and(x, y)
or simply the & operator:
>>> x & y
If you’re using the operators, then be sure to set your parentheses correctly because of operator precedence.
There are several logical NumPy functions which should work on pandas.Series.
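As a rough illustration of those NumPy functions next to the operators, here is a sketch with two made-up boolean Series (x and y below are illustrative values, not the ones used earlier):

```python
import numpy as np
import pandas as pd

# Illustrative boolean Series; values are made up.
x = pd.Series([True, False, False])
y = pd.Series([True, True, False])

# The NumPy functions and the overloaded operators agree element by element.
or_result = np.logical_or(x, y)    # same as x | y
and_result = np.logical_and(x, y)  # same as x & y
xor_result = np.logical_xor(x, y)  # same as x ^ y
not_result = np.logical_not(x)     # same as ~x

assert or_result.tolist() == (x | y).tolist()
assert and_result.tolist() == (x & y).tolist()
assert xor_result.tolist() == (x ^ y).tolist()
assert not_result.tolist() == (~x).tolist()

print(or_result.tolist())   # [True, True, False]
print(and_result.tolist())  # [True, False, False]
```

The function form can be handy when the operands are plain arrays or when you want to avoid precedence surprises altogether.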
The alternatives mentioned in the exception are more suited if you encountered it when doing if or while. I'll briefly explain each of these:
If you want to check if your Series is empty:
>>> x = pd.Series([])
>>> x.empty
True
>>> x = pd.Series([1])
>>> x.empty
False
Python normally interprets the length of containers (like list, tuple, …) as the truth-value if it has no explicit boolean interpretation. So if you want the Python-like check, you could do: if x.size or if not x.empty instead of if x.
If your Series contains one and only one boolean value:
>>> x = pd.Series([100])
>>> (x > 50).bool()
True
>>> (x < 50).bool()
False
If you want to check the first and only item of your Series (like .bool(), but it works even for non-boolean contents):
>>> x = pd.Series([100])
>>> x.item()
100
If you want to check if all or any item is not-zero, not-empty or not-False:
>>> x = pd.Series([0, 1, 2])
>>> x.all()   # because one element is zero
False
>>> x.any()   # because one (or more) elements are non-zero
True
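To show how these alternatives fit into an actual if statement, here is a small sketch. The helper name describe and its return strings are hypothetical, chosen only for illustration. It leaves out .bool(), which I believe is deprecated in recent pandas releases:

```python
import pandas as pd


def describe(s: pd.Series) -> str:
    """Hypothetical helper: use an explicit reduction instead of `if s:`."""
    if s.empty:               # no elements at all
        return "empty"
    if (s > 0).all():         # every element satisfies the condition
        return "all positive"
    if (s > 0).any():         # at least one element satisfies it
        return "some positive"
    return "none positive"


print(describe(pd.Series([], dtype=float)))  # empty
print(describe(pd.Series([1, 2, 3])))        # all positive
print(describe(pd.Series([-1, 2])))          # some positive
print(describe(pd.Series([-1, -2])))         # none positive

# .item() extracts the single element of a one-element Series.
single = pd.Series([100]).item()
```

Each branch reduces the Series to one boolean first, so the if never has to guess what "truth" means for a whole column.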
pandas uses the bitwise operators & and |, and each condition should be wrapped in parentheses ().
For example, the following works:
data_query = data[(data['year'] >= 2005) & (data['year'] <= 2010)]
But the same query without proper parentheses does not:
data_query = data[(data['year'] >= 2005 & data['year'] <= 2010)]
Wonderful, the only answer mentioning the importance of wrapping conditions in parenthesis. The only problem with my syntax. But why is this mandatory?
– u tyagi
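On why the parentheses are mandatory: in Python, & binds more tightly than the comparison operators, so the unparenthesized version is parsed differently from what it looks like. The sketch below uses a made-up data frame (only the column name year comes from the answer) to show both cases:

```python
import pandas as pd

# Illustrative data; only the column name 'year' comes from the answer above.
data = pd.DataFrame({"year": [2000, 2005, 2008, 2012]})

# With parentheses: two boolean Series are combined element-wise.
ok = data[(data["year"] >= 2005) & (data["year"] <= 2010)]
print(ok["year"].tolist())  # [2005, 2008]

# Without them, Python reads the expression as
#     data["year"] >= (2005 & data["year"]) <= 2010
# which is a chained comparison, i.e. `a >= b and b <= 2010`.
# That implicit `and` asks for the truth value of a Series.
try:
    data[data["year"] >= 2005 & data["year"] <= 2010]
    raised = False
except ValueError:
    raised = True

print(raised)  # True
```

So the error in the unparenthesized query is the same ambiguous-truth-value error as before, just reached through a chained comparison.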
For boolean logic, use & and |.
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))
>>> df
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
>>> df.loc[(df.C > 0.25) | (df.C < -0.25)]
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
To see what is happening, you get a column of booleans for each comparison, e.g.
df.C > 0.25
0 True
1 False
2 False
3 True
4 True
Name: C, dtype: bool
When you have multiple criteria, you will get multiple columns returned. This is why the join logic is ambiguous. Using and or or treats each column as a whole, so you first need to reduce each column to a single boolean value. For example, to see if any value or all values in each of the columns are True:
# Is any value in either column True?
(df.C > 0.25).any() or (df.C < -0.25).any()
True
# Are all values in either column True?
(df.C > 0.25).all() or (df.C < -0.25).all()
False
One convoluted way to achieve the same thing is to zip all of these columns together, and perform the appropriate logic.
>>> df[[any([a, b]) for a, b in zip(df.C > 0.25, df.C < -0.25)]]
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
For more details, refer to Boolean Indexing in the docs.
Use | instead of or.
Apr 28, 2016 at 17:54
Here’s a workaround:
abs(result['var'])>0.25
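A quick check that this workaround matches the | version, assuming the range is symmetric around zero as in the question. The frame below is made up; only the names result and var come from the line above:

```python
import pandas as pd

# Illustrative data; 'result' and 'var' mirror the names used in the workaround.
result = pd.DataFrame({"var": [-0.5, -0.1, 0.0, 0.2, 0.3]})

# Explicit element-wise OR of the two conditions.
mask_or = (result["var"] < -0.25) | (result["var"] > 0.25)

# Single comparison on the absolute value.
mask_abs = result["var"].abs() > 0.25

assert mask_or.equals(mask_abs)
print(result[mask_abs]["var"].tolist())  # [-0.5, 0.3]
```

For a range that is not centered on zero, the two-condition form with | is still needed.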
Dec 28, 2018 at 17:29
Mar 8, 2019 at 22:09
I ran into the same error message using the standard max() function. Replacing it with numpy.maximum() for element-wise maxima between two values solved my problem.
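A small sketch of that situation, using two made-up Series (a and b are illustrative). The built-in max() compares its arguments and then needs a single truth value, which is where the error comes from:

```python
import numpy as np
import pandas as pd

# Illustrative values only.
a = pd.Series([1, 5, 3])
b = pd.Series([4, 2, 6])

# Built-in max() evaluates a comparison between the two Series and then
# calls bool() on the resulting Series, which raises the ValueError.
try:
    max(a, b)
    builtin_failed = False
except ValueError:
    builtin_failed = True

# numpy.maximum() compares element by element instead.
elementwise = np.maximum(a, b)
print(elementwise.tolist())  # [4, 5, 6]
```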

