runtime error with python dictionary while using defaultdict - python

I am using a dictionary to add key and values in it. I am checking if the key is already present, and if yes, I am appending the value; if not I add a key and the corresponding value.
I am getting the error message:
AttributeError: 'str' object has no attribute 'append'
Here is the code. I am reading a CSV file:
metastore_dir = collections.defaultdict(list)
with open(local_registry_file_path + data_ext_dt + "_metastore_metadata.csv",'rb') as metastore_metadata:
    for line in metastore_metadata:
        key = line[2]
        key = key.lower().strip()
        if (key in metastore_dir):
            metastore_dir[key].append(line[0])
        else:
            metastore_dir[key] = line[0]
I found an answer on Stack Overflow that says to use defaultdict to resolve the issue, but I am still getting the error message even after applying the suggested answer.
I have pasted my code for reference.

The str type has no append() method.
Replace your call to append with the + operator:
sentry_dir[key] += line[1]

It is a dictionary of strings. To declare it as a list use
if (key not in metastore_dir): ## add key first if not in dict
    metastore_dir[key] = [] ## empty list
metastore_dir[key].append(line[0])
""" with defaultdict you don't have to add the key
i.e. "if key in" not necessary
"""
metastore_dir[key].append(line[0])

When you insert a new item into the dictionary, you want to insert it as a list:
...
if (key in metastore_dir):
    metastore_dir[key].append(line[0])
else:
    metastore_dir[key] = [line[0]] # wrapping it in brackets creates a singleton list
On an unrelated note, it looks like you are not parsing the CSV correctly. Try splitting each line by commas (e.g. line.split(',')[2] refers to the third column of a CSV file). Otherwise line[0] refers to the first character of the line and line[2] refers to the third character, which I suspect is not what you want.
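Putting both points together (parse each row into columns, and always append to a list), here is a minimal sketch. It assumes a comma-separated file whose third column holds the key and whose first column holds the value; the sample rows and the helper name build_metastore_index are made up for illustration and are not from the question.

```python
import collections
import csv
import io


def build_metastore_index(lines):
    """Group first-column values by the lower-cased, stripped third column."""
    index = collections.defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        key = row[2].lower().strip()
        # Always append; never assign a bare string to index[key].
        index[key].append(row[0])
    return index


# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO("t1,x,DB_A\nt2,y,db_a \nt3,z,db_b\n")
metastore_dir = build_metastore_index(sample)
print(dict(metastore_dir))  # {'db_a': ['t1', 't2'], 'db_b': ['t3']}
```

With a real file you would pass the open file object instead of the StringIO (in Python 3, opened in text mode with newline='').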

Related

How to extract all occurrences of a JSON object that share a duplicate key:value pair?

I am writing a python script that reads a large JSON file containing data from an API, and iterates through all the objects. I want to extract all objects that have a specific matching/duplicate "key:value", and save it to a separate JSON file.
Currently, I have it almost doing this, however the one flaw in my code that I cannot fix is that it skips the first occurrence of the duplicate object, and does not add it to my dupObjects list. I have an OrderedDict keeping track of unique objects, and a regular list for duplicate objects. I know this means that when I add the second occurrence, I must add the first (unique) object, but how would I create a conditional statement that only does this once per unique object?
This is my code at the moment:
from collections import OrderedDict
import json
with open('input.json') as data:
    data = json.load(data)
uniqueObjects = OrderedDict()
dupObjects = list()
for d in data:
    value = d["key"]
    if value in uniqueObjects:
        # dupObjects.append(uniqueObjects[value])
        dupObjects.append(d)
    if value not in uniqueObjects:
        uniqueObjects[value] = d
with open('duplicates.json', 'w') as g:
    json.dump(dupObjects, g, indent=4)
The commented-out line is where I tried to add the object from the OrderedDict to my list, but that adds it as many times as there are duplicates. I only want to add it once.
Edit:
There are several unique objects that have duplicates. I'm looking for some conditional statement that can add the first occurrence of an object that has duplicates, once per unique object.
You could group by key.
Using itertools:
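import itertools
def by_key(element):
    return element["key"]
# groupby only groups adjacent items, so sort by the same key first
grouped_by_key = itertools.groupby(sorted(data, key=by_key), key=by_key)
Then it is just a matter of finding groups that have more than one element.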
For details check: https://docs.python.org/3/howto/functional.html#grouping-elements
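As a rough sketch of that last step, assuming data is a list of dicts that each have a "key" field (the sample objects and their "id" field are invented for the example):

```python
import itertools

# Hypothetical sample standing in for the loaded JSON data.
data = [
    {"key": "a", "id": 1},
    {"key": "b", "id": 2},
    {"key": "a", "id": 3},
    {"key": "c", "id": 4},
    {"key": "a", "id": 5},
]


def by_key(element):
    return element["key"]


dup_objects = []
# groupby only merges adjacent items, so sort by the same key first.
for _, group in itertools.groupby(sorted(data, key=by_key), key=by_key):
    members = list(group)
    if len(members) > 1:
        dup_objects.extend(members)  # includes the first occurrence

print([d["id"] for d in dup_objects])  # [1, 3, 5]
```

Because sorted() is stable, objects that share a key keep their original relative order, but the groups themselves come out in key order rather than file order.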
In this line you forgot .keys(), so you skip the values you need:
if value in uniqueObjects.keys():
And this line
if value not in uniqueObjects.keys():
Edit #1
My mistake :)
You need to add the first occurrence from uniqueObjects inside the first if:
if value in uniqueObjects:
    if uniqueObjects[value] != -1:
        dupObjects.append(uniqueObjects[value])
        uniqueObjects[value] = -1
    dupObjects.append(d)
Edit #2
Try this option, it will write only the first occurrence in duplicates
if value in uniqueObjects:
    if uniqueObjects[value] != -1:
        dupObjects.append(uniqueObjects[value])
        uniqueObjects[value] = -1
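For reference, here is a self-contained single-pass sketch of the same idea. Instead of overwriting the stored object with -1, it tracks which values have already had their first occurrence reported in a separate set. The function name collect_duplicates, the field parameter, and the sample data are assumptions for illustration only.

```python
def collect_duplicates(objects, field="key"):
    """Return every object whose field value occurs more than once."""
    first_seen = {}   # value -> first object carrying that value
    reported = set()  # values whose first object is already in dups
    dups = []
    for obj in objects:
        value = obj[field]
        if value in first_seen:
            if value not in reported:
                # Add the first occurrence exactly once per value.
                dups.append(first_seen[value])
                reported.add(value)
            dups.append(obj)
        else:
            first_seen[value] = obj
    return dups


# Hypothetical sample standing in for the loaded JSON data.
sample = [
    {"key": "a", "id": 1},
    {"key": "b", "id": 2},
    {"key": "a", "id": 3},
    {"key": "c", "id": 4},
    {"key": "a", "id": 5},
    {"key": "b", "id": 6},
]
print([d["id"] for d in collect_duplicates(sample)])  # [1, 3, 5, 2, 6]
```

This keeps the first object intact in first_seen, at the cost of requiring the field values to be hashable.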

Python : AttributeError: 'int' object has no attribute 'append'

I have a dict mapping int to list. What I'm trying to do is loop through something and, if the key is present in the dict, add the item to the list, or else create a new list and add the item.
This is my code.
levels = {}
if curr_node.dist in levels:
    l = levels[curr_node.dist]
    l.append(curr_node.tree_node.val)  # ***
else:
    levels[curr_node.dist] = []
    levels[curr_node.dist].append(curr_node.tree_node.val)
    levels[curr_node.dist] = curr_node.tree_node.val
My question is two-fold.
1. I get the following error:
Line 27: AttributeError: 'int' object has no attribute 'append'
Line 27 is the line marked with ***
What am I missing that's leading to the error?
2. How can I run this algorithm of checking the key and adding to a list in a dict more pythonically?
You set a list first, then replace that list with the value:
else:
    levels[curr_node.dist] = []
    levels[curr_node.dist].append(curr_node.tree_node.val)
    levels[curr_node.dist] = curr_node.tree_node.val
Drop that last line, it breaks your code.
Instead of using if...else, you could use the dict.setdefault() method to assign an empty list when the key is missing, and at the same time return the value for the key:
levels.setdefault(curr_node.dist, []).append(curr_node.tree_node.val)
This one line replaces your 6 if: ... else ... lines.
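A small runnable illustration of the setdefault() pattern, using made-up (distance, value) pairs in place of curr_node.dist and curr_node.tree_node.val:

```python
levels = {}
# Hypothetical (distance, value) pairs standing in for the tree nodes.
nodes = [(0, 10), (1, 5), (1, 15), (2, 3)]
for dist, val in nodes:
    # Returns the existing list, or stores and returns a new empty one.
    levels.setdefault(dist, []).append(val)
print(levels)  # {0: [10], 1: [5, 15], 2: [3]}
```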
You could also use a collections.defaultdict() object:
from collections import defaultdict
levels = defaultdict(list)
and
levels[curr_node.dist].append(curr_node.tree_node.val)
For missing keys a list object is automatically added. This has a downside: later code with a bug in it that accidentally uses a non-existing key will get you an empty list, making matters confusing when debugging that error.
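To make that downside concrete, here is a short sketch (the keys 99 and 7 are arbitrary). A plain subscript lookup of a missing key inserts it, while membership tests and .get() do not:

```python
from collections import defaultdict

levels = defaultdict(list)
levels[1].append(5)

# A plain lookup of a missing key silently inserts an empty list.
_ = levels[99]
print(99 in levels)    # True: the lookup created the key

# Membership tests and .get() do not trigger the factory.
print(7 in levels)     # False
print(levels.get(7))   # None
print(sorted(levels))  # [1, 99]
```

If that behaviour worries you, one option is to convert to a plain dict with dict(levels) once the grouping phase is finished.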

Cannot append string to dictionary key

I've been programming for less than four weeks and have run into a problem that I cannot figure out. I'm trying to append a string value to an existing key that already has a string stored in it, but if any value already exists in the key I get "'str' object has no attribute 'append'".
I've tried turning the value to list but this also does not work. I need to use the .append() attribute because update simply replaces the value in clientKey instead of appending to whatever value is already stored. After doing some more research, I understand now that I need to somehow split the value stored in clientKey.
Any help would be greatly appreciated.
from time import gmtime, strftime
data = {}
while True:
    clientKey = input().upper()
    refDate = strftime("%Y%m%d%H%M%S", gmtime())
    refDate = refDate[2 : ]
    ref = clientKey + refDate
    if clientKey not in data:
        data[clientKey] = ref
    elif ref in data[clientKey]:
        print("That invoice already exists")
    else:
        data[clientKey].append(ref)
        break
You can't .append() to a string because a string is not mutable. If you want your dictionary value to be able to contain multiple items, it should be a container type such as a list. The easiest way to do this is just to add the single item as a list in the first place.
if clientKey not in data:
    data[clientKey] = [ref] # single-item list
Now you can data[clientKey].append() all day long.
A simpler approach for this problem is to use collections.defaultdict. This automatically creates the item when it's not there, making your code much simpler.
from collections import defaultdict
data = defaultdict(list)
# ... same as before up to your if
if clientKey in data and ref in data[clientKey]:
    print("That invoice already exists")
else:
    data[clientKey].append(ref)
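Here is one way the defaultdict version could look as a self-contained sketch. The function name add_invoice and its now parameter are hypothetical, added only so the example can be run with a fixed timestamp:

```python
from collections import defaultdict
from time import gmtime, strftime

data = defaultdict(list)


def add_invoice(client_key, now=None):
    """Record one invoice reference; return False if it already exists."""
    client_key = client_key.upper()
    # %y is the two-digit year, matching the question's refDate[2:].
    ref = client_key + strftime("%y%m%d%H%M%S", now or gmtime())
    # .get() avoids creating an empty entry just by checking.
    if ref in data.get(client_key, []):
        return False
    data[client_key].append(ref)
    return True


fixed_time = gmtime(0)  # 1970-01-01 00:00:00 UTC, for a repeatable demo
print(add_invoice("acme", fixed_time))  # True
print(add_invoice("acme", fixed_time))  # False
print(dict(data))  # {'ACME': ['ACME700101000000']}
```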
You started with a string value, and you cannot call .append() on a string. Start with a list value instead:
if clientKey not in data:
    data[clientKey] = [ref]
Now data[clientKey] references a list object with one string in it. List objects do have an append() method.
If you want to keep appending to the string instead, you can use data[clientKey] += ref

Add multiple values to dictionary

Here is my code:
for response in responses["result"]:
    ids = {}
    key = response['_id'].encode('ascii')
    print key
    for value in response['docs']:
        ids[key].append(value)
Traceback:
  File "people.py", line 47, in <module>
    ids[key].append(value)
KeyError: 'deanna'
I am trying to add multiple values to a key, but it throws the error above.
Check out setdefault:
ids.setdefault(key, []).append(value)
It looks to see if key is in ids, and if not, sets that to be an empty list. Then it returns that list for you to inline call append on.
Docs:
http://docs.python.org/2/library/stdtypes.html#dict.setdefault
If I'm reading this correctly your intention is to map the _id of a response to its docs. In that case you can bring down everything you have above to a dict comprehension:
ids = {response['_id'].encode('ascii'): response['docs']
       for response in responses['result']}
This also assumes you meant to have ids = {} outside of the outermost loop, but I can't see any other reasonable interpretation.
If the above is not correct,
You can use collections.defaultdict
import collections # at top level
#then in your loop:
ids = collections.defaultdict(list) #instead of ids = {}
This creates a dictionary whose default values are produced by calling the factory argument; here, calling list() produces an empty list that can then be appended to.
To traverse the dictionary you can iterate over its items():
for key, val in ids.items():
    print(key, val)
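A minimal Python 3 sketch of the defaultdict approach, with the dictionary created once outside the loop. The responses payload is invented to mirror the shape used in the question:

```python
import collections

# Hypothetical response payload shaped like the one in the question.
responses = {"result": [
    {"_id": "deanna", "docs": ["d1", "d2"]},
    {"_id": "bob", "docs": ["d3"]},
    {"_id": "deanna", "docs": ["d4"]},
]}

ids = collections.defaultdict(list)  # created once, outside the loop
for response in responses["result"]:
    # extend() adds every doc; a missing key starts as an empty list.
    ids[response["_id"]].extend(response["docs"])

for key, val in ids.items():
    print(key, val)
```

Unlike the dict comprehension, this merges docs from repeated _id values instead of letting the last one win.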
The reason you're getting a KeyError is this: In the first iteration of your for loop, you look up the key in an empty dictionary. There is no such key, hence the KeyError.
The code you gave will work if you first insert an empty list into the dictionary under the appropriate key, then append the values to the list. Like so:
for response in responses["result"]:
    ids = {}
    key = response['_id'].encode('ascii')
    print key
    if key not in ids: ## <-- if we haven't seen key yet
        ids[key] = [] ## <-- insert an empty list into the dictionary
    for value in response['docs']:
        ids[key].append(value)
The previous answers are correct. Both defaultdict and dictionary.setdefault are automatic ways of inserting the empty list.

Error with Python dictionary: str object has no attribute append

I am writing code in python.
My input line is "all/DT remaining/VBG all/NNS of/IN "
I want to create a dictionary with one key and multiple values
For example - all:[DT,NNS]
groupPairsByKey={}
Code:
for line in fileIn:
    lineLength=len(line)
    words=line[0:lineLength-1].split(' ')
    for word in words:
        wordPair=word.split('/')
        if wordPair[0] in groupPairsByKey:
            groupPairsByKey[wordPair[0]].append(wordPair[1])  # <getting error here>
        else:
            groupPairsByKey[wordPair[0]] = [wordPair[1]]
Your problem is that groupPairsByKey[wordPair[0]] is not a list, but a string!
Before appending value to groupPairsByKey['all'], you need to make the value a list.
Your solution is already correct, it works perfectly in my case. Try to make sure that groupPairsByKey is a completely empty dictionary.
By the way, this is what I tried:
>>> words = "all/DT remaining/VBG all/NNS of/IN".split()
>>> for word in words:
        wordPair = word.split('/')
        if wordPair[0] in groupPairsByKey:
            groupPairsByKey[wordPair[0]].append(wordPair[1])
        else:
            groupPairsByKey[wordPair[0]] = [wordPair[1]]
>>> groupPairsByKey
{'of': ['IN'], 'remaining': ['VBG'], 'all': ['DT', 'NNS']}
>>>
Also, if your code is formatted like the one you posted here, you'll get an IndentationError.
Hope this helps!
Although it looks to me like you should be getting an IndentationError, if you are getting the message
str object has no attribute append
then it means
groupPairsByKey[wordPair[0]]
is a str, and strs do not have an append method.
The code you posted does not show how
groupPairsByKey[wordPair[0]]
could have a str value. Perhaps put
if wordPair[0] in groupPairsByKey:
    if isinstance(groupPairsByKey[wordPair[0]], basestring):
        print('{}: {}'.format(*wordPair))
        raise Hell
into your code to help track down the culprit.
You could also simplify your code by using a collections.defaultdict:
import collections
groupPairsByKey = collections.defaultdict(list)
for line in fileIn:
    lineLength=len(line)
    words=line[0:lineLength-1].split(' ')
    for word in words:
        wordPair=word.split('/')
        groupPairsByKey[wordPair[0]].append(wordPair[1])
When you access a defaultdict with a missing key, the factory function (in this case list) is called and the returned value is used as the associated value in the defaultdict. Thus, a new key-value pair is automatically inserted into the defaultdict whenever it encounters a missing key. Since the default value is always a list, you won't run into the error str object has no attribute append anymore, unless you have code which reassigns an old key-value pair to have a new value which is a str.
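A runnable Python 3 sketch of this approach on the sample line from the question. The snake_case variable name and the use of split() without arguments and partition() are my own choices, not part of the original code:

```python
import collections

line = "all/DT remaining/VBG all/NNS of/IN \n"
group_pairs_by_key = collections.defaultdict(list)
# split() with no argument drops the trailing space and newline,
# so no empty tokens reach the loop.
for token in line.split():
    word, _, tag = token.partition("/")
    group_pairs_by_key[word].append(tag)
print(dict(group_pairs_by_key))
# {'all': ['DT', 'NNS'], 'remaining': ['VBG'], 'of': ['IN']}
```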
In Python, you can do:
my_dict["all"] = my_string.split('/')
