I am trying to write a script that deletes a fragment of a JSON file.
I am currently stuck on deleting a key and its value.
I get a KeyError for key 0:
File "<stdin>", line 4, in <module>
KeyError: 0
I use the json module and Python 2.7.
My sample JSON file is this:
"1": {
    "aaa": "234235",
    "bbb": "sfd",
    "date": "01.01.2022",
    "ccc": "456",
    "ddd": "dghgdehs"
},
"2": {
    "aaa": "544634436",
    "bbb": "rgdfhfdsh",
    "date": "01.01.2022",
    "ccc": "etw",
    "ddd": "sgedsry"
}
And the faulty code is this:
import json
obj = json.load(open("aaa.json"))
for i in xrange(len(obj)):
    if obj[i]["date"] == "01.01.2022":
        obj.pop(i)
        break
What am I doing wrong here?
i will take on the integer values 0, 1, but your object is a dictionary with string keys "1", "2". So iterate over the keys instead, which is simply done like this:
for i in obj:
    if obj[i]["date"] == "01.01.2022":
        obj.pop(i)
        break
In your loop, xrange yields integers, the first being 0. There is no integer key in your JSON, so this immediately raises a KeyError.
Instead, loop over obj.items(), which yields key-value pairs. Since some of your entries are not dicts themselves, you will need to be careful when accessing v['date'].
for k, v in obj.items():
    if isinstance(v, dict) and v.get("date") == "01.01.2022":
        obj.pop(k)
        break
The way you're reading it in, obj is a dict. You're trying to access it as a list, with integer indices. This code:
for i in range(len(obj)):
    if obj[i]["date"] == "Your Date":
        ...
first calls obj[0]["date"], then obj[1]["date"], and so on. Since obj is not a list, 0 is interpreted as a key, and since obj doesn't have a key 0, you get a KeyError.
A better way to do this would be to iterate through the dict by keys and values:
for k, v in list(obj.items()):  # iterate over a copy so popping is safe
    if v["date"] == "your date":  # index using the value
        obj.pop(k)  # delete the key
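Putting the answers above together, here is a minimal sketch that should run on both Python 2.7 and Python 3. It builds a new dict instead of popping while iterating, and it removes every matching entry rather than stopping at the first one. The helper name drop_by_date and the inline sample (standing in for aaa.json, assumed to be wrapped in an outer object) are assumptions made for illustration.

```python
import json

# Inline stand-in for aaa.json (assumed to be wrapped in an outer object).
raw = '''{
    "1": {"aaa": "234235", "date": "01.01.2022"},
    "2": {"aaa": "544634436", "date": "02.01.2022"}
}'''


def drop_by_date(obj, date):
    """Return a copy of obj without the entries whose "date" equals date."""
    return {
        k: v for k, v in obj.items()
        # Non-dict values are kept, since they have no "date" field.
        if not (isinstance(v, dict) and v.get("date") == date)
    }


obj = json.loads(raw)
cleaned = drop_by_date(obj, "01.01.2022")
print(json.dumps(cleaned, sort_keys=True))
```

The original obj is left untouched, so cleaned can be written to a different file with json.dump if the source file should be preserved.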
Related
I have a dictionary that I got from a .txt file.
dictOne = {
    "AAA": 0,
    "BBB": 1,
    "AAA": 3,
    "BBB": 1,
}
I would like to generate a new dictionary called dictTwo that holds the sum of the values of equal keys. Expected result:
dictTwo = {
    "AAA": 3,
    "BBB": 2,
}
I prepared the following code, but it fails with a syntax error (SyntaxError: invalid syntax):
import json
dictOne = json.loads(text)
dictTwo = {}
for k, v in dictOne.items():
    dictTwo [k] = v += v
Can anyone tell me what the error is?
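Assuming you resolve the duplicate key issue in the dict (a Python dict literal keeps only the last value for each repeated key, so the literal below ends up as {"AAA": 3, "BBB": 1}), start from an empty dictTwo and accumulate into it:
dictOne = {
    "AAA": 0,
    "BBB": 1,
    "AAA": 3,
    "BBB": 1
}
dictTwo = {}
for k, v in dictOne.items():
    if k in dictTwo:
        dictTwo[k] += v
    else:
        dictTwo[k] = v
print(dictTwo)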
You can do this if you do it while reading the JSON input.
JSON permits duplicate keys in objects, although it discourages the practice, noting that different JSON processors produce different results for duplicate keys.
Python does not allow duplicate keys in dictionaries, and Python's json module handles duplicate keys in one of the ways noted by the JSON standard: it ignores all but the last value for any such key. However, it gives you a mechanism to do your own processing of objects, in case you want to do something else with duplicate keys (or produce something other than a dictionary).
You do this by providing the object_pairs_hook parameter to json.load or json.loads. That parameter should be a function whose argument is an iterable of (key, value) pairs, where the key is a string and the value is an already processed JSON object. Whatever the function returns will be the value used by json.load for an object literal; it does not need to return a dict.
That implies that the handling of duplicate keys will be the same for every object literal in the JSON input, which is a bit of a limitation, but it may be acceptable in your case.
Here's a simple example:
import json

def key_combiner(pairs):
    rv = {}
    for k, v in pairs:
        if k in rv:
            rv[k] += v
        else:
            rv[k] = v
    return rv

# Sample usage:
# (Note: JSON doesn't allow trailing commas in objects or lists.)
json_data = '''{
    "AAA": 0,
    "BBB": 1,
    "AAA": 3,
    "BBB": 1
}'''
consolidated = json.loads(json_data, object_pairs_hook=key_combiner)
print(consolidated)
This prints {'AAA': 3, 'BBB': 2}.
If I'd known that the values were numbers, I could have used a slightly simpler definition using defaultdict. Writing it the way I did permits combining certain other value types, such as strings or arrays, provided that all the values for the same key in an object are the same type. (Unfortunately, it doesn't allow combining objects, because Python uses | to combine two dicts, instead of +.)
This feature was mostly intended to be used for creating class instances from json objects, but it has many other possible uses.
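For reference, the defaultdict variant mentioned above could look roughly like this. It is only a sketch and assumes every duplicated value is a number; the name sum_duplicates is made up for this example.

```python
import json
from collections import defaultdict


def sum_duplicates(pairs):
    """object_pairs_hook that sums the values of repeated keys (numbers only)."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    # Return a plain dict so callers do not inherit the defaultdict behaviour.
    return dict(totals)


json_data = '{"AAA": 0, "BBB": 1, "AAA": 3, "BBB": 1}'
consolidated = json.loads(json_data, object_pairs_hook=sum_duplicates)
print(consolidated)
```

Because the hook runs for every object literal in the input, a nested object with non-numeric values would raise a TypeError here, which is the limitation described earlier.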
My data is:
my_dict = {
    u'samosa': {
        u'shape': u'triangle',
        u'taste': None,
        u'random': None,
        u'salt': u'7.5.1'
    },
    u'idli': {
        u'color': u'red',
        u'eattime': u'134'
    },
    u'ridgegaurd': {},
    u'sambhar': {
        u'createdate': u'2016-05-12',
        u'end time': u'10655437'
    }
}
There are four keys: samosa, idli, ridgegaurd and sambhar.
I don't want all of the values. I just want to get
the value (shape) from samosa,
the values (color and eattime) from idli,
the values (createdate and end time) from sambhar.
I want only the above values. I tried using dict but was not able to. Is it possible to write regular expressions for this?
If the value of a dictionary entry is another dictionary, you can simply index it again.
my_dict[u'samosa'][u'shape']
my_dict[u'idli'][u'color'], my_dict[u'idli'][u'eattime']
my_dict[u'sambhar'][u'createdate'], my_dict[u'sambhar'][u'end time']
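If the list of wanted fields grows, the repeated indexing can be driven by a small lookup table instead. The sketch below is one possible way to do that; the wanted mapping and the selected name are hypothetical, and missing inner keys come back as None because of dict.get.

```python
my_dict = {
    u'samosa': {u'shape': u'triangle', u'taste': None,
                u'random': None, u'salt': u'7.5.1'},
    u'idli': {u'color': u'red', u'eattime': u'134'},
    u'ridgegaurd': {},
    u'sambhar': {u'createdate': u'2016-05-12', u'end time': u'10655437'},
}

# Hypothetical mapping: outer key -> inner keys to keep.
wanted = {
    u'samosa': [u'shape'],
    u'idli': [u'color', u'eattime'],
    u'sambhar': [u'createdate', u'end time'],
}

selected = {
    outer: {field: my_dict[outer].get(field) for field in fields}
    for outer, fields in wanted.items()
    if outer in my_dict  # skip outer keys that are absent from the data
}
print(selected)
```

No regular expressions are needed, since the data is already a nested dict rather than text.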
This function recursively walks the nested dict and returns a single flat dictionary with all the information, making it much easier to access your data. Note that if the same inner key appears under more than one outer key, the later value overwrites the earlier one:
def flatten(data):
    flattened = {}
    for k, v in data.items():
        if isinstance(v, dict):
            flattened.update(flatten(v))
        else:
            flattened[k] = v
    return flattened
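A flat dictionary loses track of which outer key a value came from, and inner keys that repeat across entries would collide. One possible variant, sketched here under the assumption that all keys are strings, joins the keys along each path with a dot. The name flatten_with_paths and the dot separator are choices made for this example only.

```python
def flatten_with_paths(data, prefix=""):
    """Flatten nested dicts, joining the keys along each path with dots."""
    flattened = {}
    for key, value in data.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the path built so far.
            flattened.update(flatten_with_paths(value, path))
        else:
            flattened[path] = value
    return flattened


sample = {
    "idli": {"color": "red", "eattime": "134"},
    "sambhar": {"createdate": "2016-05-12", "end time": "10655437"},
    "ridgegaurd": {},
}
flat = flatten_with_paths(sample)
print(flat["idli.color"])
```

Empty nested dicts such as ridgegaurd contribute no entries to the result.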
I have this json data I need to iterate through. The general format of the json data is
{
    "Name" : "Bob"
    "value" : "1100"
    "morestuff" : "otherstuff"
    "otherTermResults" : {
        "stuff" : "morestuff"
        "things" : "thisandthat"
        "value" : "1200"
    }
    "value" : "1300"
    ....
    ....
    ....
} // end
As you can see, there are 3 fields named "value". In Python I can access the first 2 with
line_object = json.loads(line)
value1 = line_object["value"]  # gets me 1100
value2 = line_object["otherTermResults"][0]["value"]  # gets me 1200
This reliably gets me the first 2 "value" fields. I don't know how to get the 3rd "value", the one reading 1300. In addition, the JSON data I'm working with may have an unknown number "n" of duplicate "value" fields, not nested in a subfield, for a single name, in this case "Bob". I read a few things saying you have to access the correct index, but that was for jQuery. In Python
json.loads(line)
only loads in the first field that matches "value". How do I resolve this issue in Python? Do I need to switch to another language?
json.loads takes an object_pairs_hook that you can use to customize behavior. For example, to collect the duplicate values into a list you might do something like:
import json

def find_value(ordered_pairs):
    d = {}
    for k, v in ordered_pairs:
        if k == "value":
            d.setdefault(k, []).append(v)
        else:
            d[k] = v
    return d

json.loads(raw_post_data, object_pairs_hook=find_value)
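To see what such a hook returns, here is a small self-contained run against a cut-down, syntactically valid version of the sample from the question. The name collect_values and the inline line string are assumptions for illustration. Note that the hook is applied to every object in the input, so the nested "value" also becomes a list.

```python
import json


def collect_values(ordered_pairs):
    """Hypothetical hook: gather every "value" entry of an object into a list."""
    d = {}
    for k, v in ordered_pairs:
        if k == "value":
            d.setdefault(k, []).append(v)
        else:
            d[k] = v
    return d


line = '''{
    "Name": "Bob",
    "value": "1100",
    "otherTermResults": {"stuff": "morestuff", "value": "1200"},
    "value": "1300"
}'''

line_object = json.loads(line, object_pairs_hook=collect_values)
print(line_object["value"])
print(line_object["otherTermResults"]["value"])
```

Without the hook, json.loads would keep only one of the top-level "value" entries, so collecting into a list is what makes all of the duplicates reachable.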
I have a json file that contains about 100,000 lines in the following format:
{
    "00-0000045": {
        "birthdate": "5/18/1975",
        "college": "Michigan State",
        "first_name": "Flozell",
        "full_name": "Flozell Adams",
        "gsis_id": "00-0000045",
        "gsis_name": "F.Adams",
        "height": 79,
        "last_name": "Adams",
        "profile_id": 2499355,
        "profile_url": "http://www.nfl.com/player/flozelladams/2499355/profile",
        "weight": 338,
        "years_pro": 13
    },
    "00-0000108": {
        "birthdate": "12/9/1974",
        "college": "Louisville",
        "first_name": "David",
        "full_name": "David Akers",
        "gsis_id": "00-0000108",
        "gsis_name": "D.Akers",
        "height": 70,
        "last_name": "Akers",
        "number": 2,
        "profile_id": 2499370,
        "profile_url": "http://www.nfl.com/player/davidakers/2499370/profile",
        "weight": 200,
        "years_pro": 16
    }
}
I am trying to delete all the items that do not have a gsis_name property. So far I have this python code, but it does not delete any values (note: I do not want to overwrite the original file)
import json
with open("players.json") as json_file:
    json_data = json.load(json_file)
for x in json_data:
    if 'gsis_name' not in x:
        del x
print json_data
You're deleting x, but x is only a name bound to each key of json_data in turn; del x removes that name, not the entry in the dict it was drawn from.
In Python, if you want to filter some items out of a collection your best bet is to copy the items you do want into a new collection.
clean_data = {k: v for k, v in json_data.items() if 'gsis_name' in v}
and then write clean_data to a file with json.dump.
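A minimal end-to-end sketch of that approach follows. The two-entry json_data, the made-up player without gsis_name, and the temporary output path are all assumptions for illustration; in the real script json_data would come from players.json.

```python
import json
import os
import tempfile

json_data = {
    "00-0000045": {"full_name": "Flozell Adams", "gsis_name": "F.Adams"},
    "00-0000999": {"full_name": "No Gsis Name"},  # made-up entry without gsis_name
}

# Keep only the players whose record contains a gsis_name.
clean_data = {k: v for k, v in json_data.items() if 'gsis_name' in v}

# Write to a new file so the original stays untouched.
out_path = os.path.join(tempfile.mkdtemp(), "players_clean.json")
with open(out_path, "w") as out_file:
    json.dump(clean_data, out_file, indent=4, sort_keys=True)

# Read the file back to confirm what was written.
with open(out_path) as check_file:
    reloaded = json.load(check_file)
print(sorted(reloaded))
```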
When you say del x, you are unassigning the name x from your current scope (in this case, global scope, since the delete is not in a class or function).
You need to delete it from the object json_data. json.load returns a dict because your main object is an associative array / map / Javascript object. When you iterate a dict, you are iterating over the keys, so x is a key (e.g. "00-0000108"). This is a bug: You want to check whether the value has the key gsis_name.
The documentation for dict shows you how to delete from a dict using the key: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
del d[key]
Remove d[key] from d. Raises a KeyError if key is not in the map.
But as the other answers say, it's better to create a new dict with the objects you want, rather than removing the objects you don't want.
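If deleting in place is still preferred, a sketch along these lines should work. It iterates over a snapshot of the keys, because removing entries from a dict while iterating over it directly raises a RuntimeError in Python 3. The sample data, including the entry without gsis_name, is made up for illustration.

```python
json_data = {
    "00-0000045": {"gsis_name": "F.Adams"},
    "00-0000999": {"full_name": "No Gsis Name"},  # made-up entry
}

# list(...) takes a snapshot of the keys, so deleting inside the loop is safe.
for key in list(json_data):
    # Check the value (the player record), not the key string.
    if 'gsis_name' not in json_data[key]:
        del json_data[key]

print(json_data)
```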
Just create new dict without unwanted elements:
res = dict((k, v) for k, v in json_data.iteritems() if 'gsis_name' in json_data[k])
Since Python 2.7 you could use a dict comprehension.
I would like to parse a JSON file and print source in this code fragment :
{
    "trailers": {
        "quicktime": [],
        "youtube": [
            {
                "source": "mmNhzU6ySL8",
                "type": "Trailer",
                "name": "Trailer 1",
                "size": "HD"
            },
            {
                "source": "CPTIgILtna8",
                "type": "Trailer",
                "name": "Trailer 2",
                "size": "Standard"
            }
        ],
        "id": 27205
    },
I wrote this code :
for item in j:
    if item['trailers']:
        e = item['trailers']
        for k,value in e.iteritems():
            if k == "youtube":
                for innerk, innerv in k.iteritems():
                    if innerk == "source" :
                        print innerv
unfortunately I can't resolve this error :
for innerk, innerv in k.iteritems():
AttributeError: 'unicode' object has no attribute 'iteritems'
Assuming the JSON is formatted properly, the problem is that your code includes this check:
if k == "youtube":
    for innerk, innerv in k.iteritems():
Given that you just asked for k to be "youtube" (an instance of str or unicode), it won't make sense to expect k to have an iteritems method.
I believe instead you are expecting the associated dict that would have come along with k, something like this:
if k == "youtube":
    for innerk, innerv in value.iteritems():
I'm noticing from your JSON, though, that it looks like you should expect multiple dict variables to be loaded as the list-typed value for the case when k == "youtube". In that case, you'll need to iterate over those elements first, asking for each one's iteritems separately:
if k == "youtube":
    for each_dict in value:
        for innerk, innerv in each_dict.iteritems():
or something along those lines. The final full code would be:
for item in j:
    if item['trailers']:
        e = item['trailers']
        for k,value in e.iteritems():
            if k == "youtube":
                for each_dict in value:
                    for innerk, innerv in each_dict.iteritems():
                        if innerk == "source" :
                            print innerv
Aside from the first-order question, you should also take a look at the dict type's built-in method get, which allows you to safely get items from a dictionary and handle the case when they are missing gracefully. In your code, when you say if item['trailers']: this may not behave the way you expect.
First, if trailers is not a key to the dictionary, it will generate a KeyError instead of just skipping that conditional block. Secondly, if the value stored for the key value trailers evaluates to False in a bool context, the conditional block will also be skipped, even if you had wanted to handle it differently (for example, suppose that None is a sentinel value signaling that there is no data for trailers in that case, but it's due to a specific error that you want to log.
Meanwhile, if it's just an empty dict then that does mean you should simply skip the conditional block). This may not matter much in one-off data exploration, but in general it's good to become automatically conditioned to avoid these sorts of pitfalls, especially when the built-in types themselves make it very easy to handle things more gracefully.
Given all of this, a more Pythonic approach might be as follows:
for item in j:
    y_tube = item.get('trailers', {}).get("youtube", [])
    for each_dict in y_tube:
        print each_dict.get("source", "Warning: no entry found for 'source'")
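For readers on Python 3, where iteritems and the print statement are gone, the same idea can be sketched as below. It assumes, as the loop in the question implies, that j is a list of dicts. The helper name youtube_sources and the second sample entry are made up for illustration.

```python
j = [
    {"trailers": {"quicktime": [], "youtube": [
        {"source": "mmNhzU6ySL8", "type": "Trailer",
         "name": "Trailer 1", "size": "HD"},
        {"source": "CPTIgILtna8", "type": "Trailer",
         "name": "Trailer 2", "size": "Standard"},
    ], "id": 27205}},
    {"title": "entry without trailers"},  # made-up entry to exercise the defaults
]


def youtube_sources(items):
    """Collect every "source" found under trailers -> youtube."""
    sources = []
    for item in items:
        # "or {}" also covers a trailers value of None.
        trailers = item.get("trailers") or {}
        for each_dict in trailers.get("youtube") or []:
            if "source" in each_dict:
                sources.append(each_dict["source"])
    return sources


print(youtube_sources(j))
```

Returning a list instead of printing inside the loop keeps the lookup separate from the output, which makes the missing-key cases easier to check.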
Look at this line:
for k,value in e.iteritems():
So clearly, k is a key (a unicode string in your case). You clearly know this, with your comparison of if k == "youtube".
Unicode strings don't have the iteritems() method.
I have a feeling that what you're looking for is this:
for k,value in e.iteritems():
    for innerk,innerv in value.iteritems():
        # do stuff