
Convert string to dict, then access key:values? How to access data in a <class 'dict'> in Python?

I am having issues accessing data inside a dictionary.

Sys: Macbook 2012

Python: Python 3.5.1 :: Continuum Analytics, Inc.

I am working with a dask.dataframe created from a csv.


How I got to this point

Assume I start out with a Pandas Series:

df.Coordinates
130 {u'type': u'Point', u'coordinates': [-43.30175...
278 {u'type': u'Point', u'coordinates': [-51.17913...
425 {u'type': u'Point', u'coordinates': [-43.17986...
440 {u'type': u'Point', u'coordinates': [-51.16376...
877 {u'type': u'Point', u'coordinates': [-43.17986...
1313 {u'type': u'Point', u'coordinates': [-49.72688...
1734 {u'type': u'Point', u'coordinates': [-43.57405...
1817 {u'type': u'Point', u'coordinates': [-43.77649...
1835 {u'type': u'Point', u'coordinates': [-43.17132...
2739 {u'type': u'Point', u'coordinates': [-43.19583...
2915 {u'type': u'Point', u'coordinates': [-43.17986...
3035 {u'type': u'Point', u'coordinates': [-51.01583...
3097 {u'type': u'Point', u'coordinates': [-43.17891...
3974 {u'type': u'Point', u'coordinates': [-8.633880...
3983 {u'type': u'Point', u'coordinates': [-46.64960...
4424 {u'type': u'Point', u'coordinates': [-43.17986...

The problem is, this is not a true dataframe of dictionaries. Instead, it’s a column full of strings that LOOK like dictionaries. Running this shows it:

df.Coordinates.apply(type)
130 <class 'str'>
278 <class 'str'>
425 <class 'str'>
440 <class 'str'>
877 <class 'str'>
1313 <class 'str'>
1734 <class 'str'>
1817 <class 'str'>
1835 <class 'str'>
2739 <class 'str'>
2915 <class 'str'>
3035 <class 'str'>
3097 <class 'str'>
3974 <class 'str'>
3983 <class 'str'>
4424 <class 'str'>

My Goal: Access the coordinates key and value in each dictionary. That’s it. But each value is a str.

I converted the strings to dictionaries using eval.

new = df.Coordinates.apply(eval)
130 {'coordinates': [-43.301755, -22.990065], 'typ...
278 {'coordinates': [-51.17913026, -30.01201896], ...
425 {'coordinates': [-43.17986794, -22.91000096], ...
440 {'coordinates': [-51.16376782, -29.95488677], ...
877 {'coordinates': [-43.17986794, -22.91000096], ...
1313 {'coordinates': [-49.72688407, -29.33757253], ...
1734 {'coordinates': [-43.574057, -22.928059], 'typ...
1817 {'coordinates': [-43.77649254, -22.86940539], ...
1835 {'coordinates': [-43.17132318, -22.90895217], ...
2739 {'coordinates': [-43.1958313, -22.98755333], '...
2915 {'coordinates': [-43.17986794, -22.91000096], ...
3035 {'coordinates': [-51.01583481, -29.63593292], ...
3097 {'coordinates': [-43.17891379, -22.96476163], ...
3974 {'coordinates': [-8.63388008, 41.14594453], 't...
3983 {'coordinates': [-46.64960938, -23.55902666], ...
4424 {'coordinates': [-43.17986794, -22.91000096], ...
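
For comparison, here is a minimal sketch of the same string-to-dict conversion using ast.literal_eval instead of eval. It only accepts Python literals, so it will not run arbitrary code that happens to be in the CSV. The sample string is an assumption: it is rebuilt from the values shown for row 130 above, since the real column is truncated in the printout.

```python
import ast

# Assumed sample: the string form of row 130, rebuilt from the output above.
raw = "{u'type': u'Point', u'coordinates': [-43.301755, -22.990065]}"

# literal_eval parses the literal (the u'' prefixes are valid in Python 3)
# and returns a plain built-in dict.
parsed = ast.literal_eval(raw)

print(type(parsed))
print(parsed['coordinates'])
```

On the Series itself, the equivalent call would presumably be df.Coordinates.apply(ast.literal_eval).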

Next I test the type of each object and get:

130 <class 'dict'>
278 <class 'dict'>
425 <class 'dict'>
440 <class 'dict'>
877 <class 'dict'>
1313 <class 'dict'>
1734 <class 'dict'>
1817 <class 'dict'>
1835 <class 'dict'>
2739 <class 'dict'>
2915 <class 'dict'>
3035 <class 'dict'>
3097 <class 'dict'>
3974 <class 'dict'>
3983 <class 'dict'>
4424 <class 'dict'>

If I try to access my dictionaries:
new.apply(lambda x: x['coordinates'])

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-71-c0ad459ed1cc> in <module>()
----> 1 dfCombined.Coordinates.apply(coord_getter)
/Users/linwood/anaconda/envs/dataAnalysisWithPython/lib/python3.5/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
2218 else:
2219 values = self.asobject
-> 2220 mapped = lib.map_infer(values, f, convert=convert_dtype)
2221
2222 if len(mapped) and isinstance(mapped[0], Series):
pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62658)()
<ipython-input-68-748ce2d8529e> in coord_getter(row)
1 import ast
2 def coord_getter(row):
----> 3 return (ast.literal_eval(row))['coordinates']
TypeError: 'bool' object is not subscriptable
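
One possible reading of this traceback, offered as a guess rather than a diagnosis: ast.literal_eval returned a bool for at least one row, which would happen if some cell in the column holds a string such as "False" instead of a dict literal. The sketch below is a defensive variant of coord_getter under that assumption. The three sample rows are made up for illustration and are not taken from my data.

```python
import ast

def coord_getter(row):
    """Return the 'coordinates' list, or None if the row is not a dict literal."""
    try:
        # Only parse strings; leave already-parsed or missing values as they are.
        value = ast.literal_eval(row) if isinstance(row, str) else row
    except (ValueError, SyntaxError):
        # The string was not a valid Python literal at all.
        return None
    if isinstance(value, dict):
        return value.get('coordinates')
    # Parsed fine, but to something else (bool, number, ...).
    return None

# Hypothetical rows: one well-formed dict string, one bool string, one junk string.
rows = [
    "{u'type': u'Point', u'coordinates': [-43.301755, -22.990065]}",
    "False",
    "not a literal",
]
results = [coord_getter(r) for r in rows]
print(results)
```

Rows that come back as None would then be the ones worth inspecting in the original CSV.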

It’s some type of class, because when I run dir I get this for one object:

new.apply(lambda x: dir(x))[130]
130 __class__
130 __contains__
130 __delattr__
130 __delitem__
130 __dir__
130 __doc__
130 __eq__
130 __format__
130 __ge__
130 __getattribute__
130 __getitem__
130 __gt__
130 __hash__
130 __init__
130 __iter__
130 __le__
130 __len__
130 __lt__
130 __ne__
130 __new__
130 __reduce__
130 __reduce_ex__
130 __repr__
130 __setattr__
130 __setitem__
130 __sizeof__
130 __str__
130 __subclasshook__
130 clear
130 copy
130 fromkeys
130 get
130 items
130 keys
130 pop
130 popitem
130 setdefault
130 update
130 values
Name: Coordinates, dtype: object

My Problem: I just want to access the dictionary. But, the object is <class 'dict'>. How do I convert this to a regular dict or just access the key:value pairs?
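
For reference, a small sketch of the access I am after. As far as I can tell, <class 'dict'> is simply how Python 3 prints the built-in dict type, so no further conversion should be needed. The two-row Series here is a hypothetical stand-in built from the values shown for rows 130 and 278, not my real data.

```python
import pandas as pd

# Hypothetical stand-in for the `new` Series, using rows 130 and 278 from above.
new = pd.Series(
    [
        {'type': 'Point', 'coordinates': [-43.301755, -22.990065]},
        {'type': 'Point', 'coordinates': [-51.17913026, -30.01201896]},
    ],
    index=[130, 278],
    name='Coordinates',
)

# Each element is a built-in dict, so plain key indexing works per element.
coords = new.apply(lambda d: d['coordinates'])

print(type(new[130]) is dict)
print(coords[130])
```

If every row really is a dict, this should give a Series holding one coordinates list per row.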

Any ideas??