Can't iterate over my own object - python

I am new to Python and can't figure this out. I am trying to make an object from a JSON feed. Basically, I want to make a dictionary for each item in the JSON feed that has every property. The error I get is either TypeError: 'mediaObj' object is not subscriptable or TypeError: 'mediaObj' object is not iterable.
For bonus points, the array has many sub dictionaries too. What I would like is to be able to access that nested data as well.
Here is my code:
import json
import urllib.request
url = 'jsonfeedwithalotofdata.com'
data = urllib.request.urlopen(url)
data = data.read()
data = data.decode('utf-8')
data = json.loads(data)
media = data['data']
class mediaObj:
    def __init__(self, item):
        for key in item:
            setattr(self, key, item[key])
            print(self[key])
    def run(self):
        return self['id']
for item in media:
    mediaPiece = mediaObj(item)
This would come from a JSON feed that looks as follows (so data is the array that comes after "data"):
"data": [
    {
        "attribution": null,
        "videos": {},
        "tags": [],
        "type": "video",
        "location": null,
        "comments": {},
        "filter": "Normal",
        "created_time": "1407423448461",
        "link": "http://instagram.com/p/rabdfdIw9L7D-/",
        "likes": {},
        "images": {},
        "users_in_photo": [],
        "caption": {},
        "user_has_liked": true,
        "id": "782056834879232959294_1051813051",
        "user": {}
    }
So my hope was that I could create an object for every item in the array, and then I could, for instance, say:
print(mediaPiece['id'])
or even better
print(mediaPiece['comments'])
And see a list of comments. Thanks a million

You're having a problem because you're using attributes to store your data items, but using list/dictionary lookup syntax to try to retrieve them.
Instead of print(self[key]), use print(getattr(self, key)), and instead of return self['id'] use return self.id.
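If you would rather keep the mediaPiece['id'] syntax from the question, one option is to give the class a __getitem__ method that delegates to the attributes. The following is only a minimal sketch under assumptions: the class name MediaObj, the nested wrapping of dictionaries, and the "count" field inside "comments" are illustrative and not taken from the real feed.

```python
class MediaObj:
    """Wraps one feed item; supports both obj.key and obj['key'] access."""

    def __init__(self, item):
        for key, value in item.items():
            # Wrap nested dictionaries so their fields are reachable the same way.
            if isinstance(value, dict):
                value = MediaObj(value)
            setattr(self, key, value)

    def __getitem__(self, key):
        # Called for obj['key']; delegate to the attribute set in __init__.
        try:
            return getattr(self, key)
        except AttributeError:
            raise KeyError(key) from None

    def __iter__(self):
        # Iterating yields the field names, as a dict would.
        return iter(vars(self))


# Hypothetical item, shaped like the feed sample in the question.
item = {
    "id": "782056834879232959294_1051813051",
    "type": "video",
    "comments": {"count": 2},  # "count" is an assumed field for illustration
}
media_piece = MediaObj(item)
print(media_piece["id"], media_piece.type, media_piece["comments"]["count"])
print(list(media_piece))
```

Defining __getitem__ is what makes an object subscriptable, and defining __iter__ is what makes it iterable, which addresses both TypeErrors mentioned in the question. If all you need is lookup by key, keeping the plain dictionaries that json.loads already returns may be simpler.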

Related

How do I extract a list item from nested json in Python?

I have a JSON object and I'm trying to extract a couple of values from a nested list, then print them in markup. I'm getting an error - AttributeError: 'list' object has no attribute 'get'
I understand that it's a list and I can't perform a get on it. I've been searching for the proper method for a few hours now and I'm running out of steam. I'm able to get the Event, but not Value1 and Value2.
This is the json object
{
    "resource": {
        "data": {
            "event": "qwertyuiop",
            "eventVersion": "1.05",
            "parameters": {
                "name": "sometext",
                "othername": [
                    ""
                ],
                "thing": {
                    "something": {
                        "blah": "whatever"
                    },
                    "abc": "123",
                    "def": {
                        "xzy": "value"
                    }
                },
                "something": [
                    "else"
                ]
            },
            "whatineed": [{
                "value1": "text.i.need",
                "value2": "text.i.need.also"
            }]
        }
    }
}
And this is my function
def parse_json(json_data: dict) -> Info:
    some_data = json_data.get('resource', {})
    specific_data = some_data.get('data', {})
    whatineed_data = specific_data.get('whatineed', {})
    formatted_json = json.dumps(json_data, indent=2)
    description = f'''
    h3. Details
    *Event:* {some_data.get('event')}
    *Value1:* {whatineed_data('value1')}
    *Value2:* {whatineed_data('value2')}
    '''
From the data structure, whatineed is a list with a single item, which in turn is a dictionary. So, one way to access it would be:
whatineed_list = specific_data.get('whatineed', [])
whatineed_dict = whatineed_list[0]
At this point you can do:
value1 = whatineed_dict.get('value1')
value2 = whatineed_dict.get('value2')
You can change your function to the following:
def parse_json(json_data: dict) -> Info:
    some_data = json_data.get('resource')
    specific_data = some_data.get('data', {})
    whatineed_data = specific_data.get('whatineed', {})
    formatted_json = json.dumps(json_data, indent=2)
    description = '''
    h3. Details
    *Event:* {}
    *Value1:* {}
    *Value2:* {}
    '''.format(some_data.get('data').get('event'), whatineed_data[0]['value1'], whatineed_data[0]['value2'])
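Since whatineed_data is a list, you need to index the element first.
JSON text is just a string until you parse it: json.loads parses a string, and json.load parses an open file. Both return ordinary dicts and lists. If you are still holding the raw string rather than the parsed object, that could be the source of some of your problems. Also this article might help.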
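To illustrate the difference between parsing JSON from a string and from a file, here is a small sketch. It assumes a shortened version of the question's JSON, and io.StringIO stands in for a real open file so the example is self-contained.

```python
import io
import json

# Shortened copy of the question's JSON, held as a plain string.
raw = '{"resource": {"data": {"event": "qwertyuiop"}}}'

# json.loads parses a string that is already in memory.
from_string = json.loads(raw)

# json.load parses a file-like object; io.StringIO stands in for an open file.
from_file = json.load(io.StringIO(raw))

print(type(from_string).__name__, from_string == from_file)
print(from_string["resource"]["data"]["event"])
```

Either way the result is an ordinary dictionary, so the same lookups work on both.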
Assuming that the "whatineed" attribute really is a list and its elements are dicts, you can't call whatineed.get to ask for value1 or value2 as if they were its keys, because it is a list and a list has no get method.
So, you have two options:
If the whatineed list only ever has a single element, you can access this element directly and then access its values:
element = whatineed[0]
v1 = element.get('value1', {})
v2 = element.get('value2', {})
Or, if the whatineed list can have more items, you will need to iterate over the list and access each element in turn:
for element in whatineed:
    v1 = element.get('value1', {})
    v2 = element.get('value2', {})
    ## Do something with values
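Putting the pieces above together, a helper along these lines would collect both values for every entry. This is a sketch under assumptions: the function name extract_values is made up for illustration, and the sample is a trimmed copy of the question's JSON.

```python
def extract_values(json_data):
    """Return (value1, value2) pairs for every entry in 'whatineed'."""
    data = json_data.get("resource", {}).get("data", {})
    # Default to an empty list so the comprehension is a no-op when the key is absent.
    entries = data.get("whatineed", [])
    return [(entry.get("value1"), entry.get("value2")) for entry in entries]


# Trimmed copy of the JSON from the question.
sample = {
    "resource": {
        "data": {
            "event": "qwertyuiop",
            "whatineed": [
                {"value1": "text.i.need", "value2": "text.i.need.also"}
            ],
        }
    }
}

print(extract_values(sample))
print(extract_values({}))
```

Using [] rather than {} as the default for "whatineed" keeps the type consistent with what the real data holds, so the same loop works whether the key is present or not.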

TypeError: string indices must be integers // working with JSON as dict in python

Okay, so I've been banging my head on this for the last 2 days, with no real progress. I am a beginner with python and coding in general, but this is the first issue I haven't been able to solve myself.
So I have this long JSON-formatted file with about 7000 entries from the YouTube API.
Right now I want a short script that prints certain info ('videoId') for a certain dictionary key (referred to as 'key'):
My script:
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key']['Items']['id']['videoId'])
# print(trailers['key']['videoId']) gives the same response
Error:
print(trailers['key']['Items']['id']['videoId'])
TypeError: string indices must be integers
It does work when I want to print all the information for the dictionary key:
This script works
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key'])
Also print(type(trailers)) results in class 'dict', as it's supposed to.
My JSON File is formatted like this and is from the youtube API, youtube#searchListResponse.
{
    "kind": "youtube#searchListResponse",
    "etag": "",
    "nextPageToken": "",
    "regionCode": "",
    "pageInfo": {
        "totalResults": 1000000,
        "resultsPerPage": 1
    },
    "items": [
        {
            "kind": "youtube#searchResult",
            "etag": "",
            "id": {
                "kind": "youtube#video",
                "videoId": ""
            },
            "snippet": {
                "publishedAt": "",
                "channelId": "",
                "title": "",
                "description": "",
                "thumbnails": {
                    "default": {
                        "url": "",
                        "width": 120,
                        "height": 90
                    },
                    "medium": {
                        "url": "",
                        "width": 320,
                        "height": 180
                    },
                    "high": {
                        "url": "",
                        "width": 480,
                        "height": 360
                    }
                },
                "channelTitle": "",
                "liveBroadcastContent": "none"
            }
        }
    ]
}
What other information is needed to be given for you to understand the problem?
The following code gives me all the videoIds from the provided sample data (which in fact contains no ids at all):
import json
with open('sampledata', 'r') as datafile:
    data = json.loads(datafile.read())
print([item['id']['videoId'] for item in data['items']])
Perhaps you can try this with more data.
Hope this helps.
I didn't really look into the YouTube API, but looking at the code and the sample you gave, it seems you missed out a [0]. Looking at the structure of the JSON, the value under the key 'items' is a list.
import json
f = open ('json1.json', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['items'][0]['id']['videoId'])
I've not used JSON before at all, but it's basically imported as dicts containing more dicts, lists, etc. where applicable. At least that's my understanding.
So when you do type(trailers) you get dict. Then you index it with trailers['key']. If you take the type of that, it should also be a dict if things work correctly. Working through the items at each level this way should eventually find your error.
Python's error says you are indexing a string, which only accepts integer indices, with a string key, as if it were a dict. So you need to find out why you are getting a string and not a dict at that step.
Edit to add an example: if your dict holds a string under the key 'item', then you get a string back, not a new dict that you can index further. 'items' in the JSON, for example, seems to be a list with dicts in it, not a dict itself.
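The level-by-level type check described above can be automated. The helper below is a rough sketch, not part of any library: describe_path is a made-up name, and the videoId value "abc123" is a placeholder since the sample data has none.

```python
def describe_path(obj, path):
    """Return the type name found at each step of path, stopping at the first failure."""
    steps = [type(obj).__name__]
    for key in path:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError) as exc:
            # Record which lookup failed and why, then stop walking.
            steps.append(f"{key!r} failed: {type(exc).__name__}")
            break
        steps.append(type(obj).__name__)
    return steps


# Reduced version of the sample response; "abc123" is a placeholder id.
data = {"items": [{"id": {"kind": "youtube#video", "videoId": "abc123"}}]}

# Skipping the list index fails at 'id', because 'items' holds a list.
print(describe_path(data, ["items", "id", "videoId"]))
# Indexing the list first reaches the string at the end of the path.
print(describe_path(data, ["items", 0, "id", "videoId"]))
```

The first call reports a TypeError at the point where a list was indexed with a string, which is the same kind of mismatch behind "string indices must be integers".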

List Indices in json in Python

I've got a json file that I've pulled from a web service and am trying to parse it. I see that this question has been asked a whole bunch, and I've read whatever I could find, but the json data in each example appears to be very simplistic in nature. Likewise, the json example data in the python docs is very simple and does not reflect what I'm trying to work with. Here is what the json looks like:
{"RecordResponse": {
    "Id": blah
    "Status": {
        "state": "complete",
        "datetime": "2016-01-01 01:00"
    },
    "Results": {
        "resultNumber": "500",
        "Summary": [
            {
                "Type": "blah",
                "Size": "10000000000",
                "OtherStuff": {
                    "valueOne": "first",
                    "valueTwo": "second"
                },
                "fieldIWant": "value i want is here"
The code block in question is:
jsonFile = r'C:\Temp\results.json'
with open(jsonFile, 'w') as dataFile:
    json_obj = json.load(dataFile)
    for i in json_obj["Summary"]:
        print(i["fieldIWant"])
Not only am I not getting into the field I want, but I'm also getting a key error on trying to suss out "Summary".
I don't know how the indices work within the array; once I even get into the "Summary" field, do I have to issue an index manually to return the value from the field I need?
The example you posted is not valid JSON (no commas after object fields), so it's hard to dig in much. If it's straight from the web service, something's messed up. If you did fix it with proper commas, note two things: the "Summary" key is inside the "Results" object, which in turn is inside "RecordResponse", and the file has to be opened for reading ('r'), because opening it with 'w' truncates it. So you'd need to change your code to
with open(jsonFile, 'r') as dataFile:
    json_obj = json.load(dataFile)
for i in json_obj["RecordResponse"]["Results"]["Summary"]:
    print(i["fieldIWant"])
If you don't know the structure at all, you could look through the resulting object recursively:
def findfieldsiwant(obj, keyname="Summary", fieldname="fieldIWant"):
    try:
        for key, val in obj.items():
            if key == keyname:
                return [d[fieldname] for d in val]
            else:
                # pass the names along so non-default arguments survive the recursion
                sub = findfieldsiwant(val, keyname, fieldname)
                if sub:
                    return sub
    except AttributeError:  # obj is not a dict
        pass
    # keyname not found
    return None
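A variation on the recursive search above also descends into lists and collects every match rather than stopping at the first. This is a sketch only: find_all is a hypothetical name, and the sample is a repaired, shortened version of the question's fragment (commas added, "blah" quoted, brackets closed), which is an assumption about what the real data looks like.

```python
def find_all(obj, fieldname):
    """Collect every value stored under fieldname, at any depth."""
    found = []
    if isinstance(obj, dict):
        for key, val in obj.items():
            if key == fieldname:
                found.append(val)
            else:
                found.extend(find_all(val, fieldname))
    elif isinstance(obj, list):
        # Lists have no keys of their own, so just search each element.
        for element in obj:
            found.extend(find_all(element, fieldname))
    return found


# Repaired and shortened version of the question's JSON (an assumption).
sample = {
    "RecordResponse": {
        "Id": "blah",
        "Status": {"state": "complete", "datetime": "2016-01-01 01:00"},
        "Results": {
            "resultNumber": "500",
            "Summary": [
                {
                    "Type": "blah",
                    "OtherStuff": {"valueOne": "first", "valueTwo": "second"},
                    "fieldIWant": "value i want is here",
                }
            ],
        },
    }
}

print(find_all(sample, "fieldIWant"))
print(find_all(sample, "missing"))
```

Checking types with isinstance instead of catching AttributeError makes the list case explicit, and returning an empty list for "not found" means callers can always iterate over the result.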

How do I parse this data with fabric?

I use Fabric to run scrapyd tasks. The server returns the ids of the jobs being run, and I want to collect all of those ids in a list. But when I use r.status, I get the error '_AttributeString' object has no attribute 'status'. How do I get all the ids? The code is below:
@task
def stop_slave_machine(slave_ip = None):
    jobs_id = []
    with cd("/spider/distributed/wzws"):
        if slave_ip is not None:
            r = local("curl http://%s:%s/listjobs.json?project=WzwsSpider" % (slave_ip, scrapyd_port))
            print(r.status)
The server returns this data:
{"status": "ok", "running": [{"start_time": "2016-03-28 18:21:21.951943", "id": "d10eae6cf4ce11e5a6646cae8b23c5da", "spider": "wzws"}, {"start_time": "2016-03-28 18:21:26.945244", "id": "d11a47f4f4ce11e5a6646cae8b23c5da", "spider": "wzws"}, {"start_time": "2016-03-28 18:21:31.941162", "id": "d12320ccf4ce11e5a6646cae8b23c5da", "spider": "wzws"}, {"start_time": "2016-03-28 18:21:36.941122", "id": "d12975b2f4ce11e5a6646cae8b23c5da", "spider": "wzws"}, {"start_time": "2016-03-28 18:21:41.941010", "id": "d131096cf4ce11e5a6646cae8b23c5da", "spider": "wzws"}], "finished": [], "pending": [], "node_name": "XXXXXXX"}
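That's a JSON body being returned, and r holds it as a string, which is why it has no status attribute. (With Fabric 1.x, local() only returns the command's output when called with capture=True.) You can use Python's json library to turn the response into a Python object. From there you can iterate over the list under "running" to extract the id of each job.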
Something like this:
from json import loads
# turn r into a python object as long as r is a string (hence loads not load)
returned = loads(r)
# Make a list ids from a list comprehension where we pull out the value
# id from each item in the list 'running' from the object returned
ids = [job["id"] for job in returned["running"]]
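The same idea can be tried without Fabric or a running scrapyd server. In this sketch the body string is a shortened copy of the response shown in the question (two of the five running jobs), and running_job_ids is a made-up helper name.

```python
import json

# Shortened copy of the listjobs.json response from the question.
body = (
    '{"status": "ok", "running": ['
    '{"start_time": "2016-03-28 18:21:21.951943", '
    '"id": "d10eae6cf4ce11e5a6646cae8b23c5da", "spider": "wzws"}, '
    '{"start_time": "2016-03-28 18:21:26.945244", '
    '"id": "d11a47f4f4ce11e5a6646cae8b23c5da", "spider": "wzws"}], '
    '"finished": [], "pending": [], "node_name": "XXXXXXX"}'
)


def running_job_ids(response_text):
    """Parse a listjobs response and return the ids of the running jobs."""
    returned = json.loads(response_text)
    # .get with a default avoids a KeyError if "running" is missing.
    return [job["id"] for job in returned.get("running", [])]


print(running_job_ids(body))
print(json.loads(body)["status"])
```

After parsing, "status" is available as a dictionary key (returned["status"]), not as an attribute of the raw string.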

Python ---- TypeError: string indices must be integers

I have the below Python code
from flask import Flask, jsonify, json
app = Flask(__name__)
with open('C:/test.json', encoding="latin-1") as f:
    dataset = json.loads(f.read())
@app.route('/api/PDL/<string:dataset_identifier>', methods=['GET'])
def get_task(dataset_identifier):
    global dataset
    dataset = [dataset for dataset in dataset if dataset['identifier'] == dataset_identifier]
    if len(task) == 0:
        abort(404)
    return jsonify({'dataset': dataset})
if __name__ == '__main__':
    app.run(debug=True)
Test.json looks like this:
{
    "dataset": [{
            "bureauCode": [
                "016:00"
            ],
            "description": "XYZ",
            "contactPoint": {
                "fn": "AG",
                "hasEmail": "mailto:AG@AG.com"
            },
            "distribution": [
                {
                    "format": "XLS",
                    "mediaType": "application/vnd.ms-excel",
                    "downloadURL": "https://www.example.com/xyz.xls"
                }
            ],
            "programCode": [
                "000:000"
            ],
            "keyword": [
                "return to work"
            ],
            "modified": "2015-10-14",
            "title": "September 2015",
            "publisher": {
                "name": "abc"
            },
            "identifier": "US-XYZ-ABC-36",
            "rights": null,
            "temporal": null,
            "describedBy": null,
            "accessLevel": "public",
            "spatial": null,
            "license": "http://creativecommons.org/publicdomain/zero/1.0/",
            "references": [
                "http://www.example.com/example.html"
            ]
        }
    ],
    "conformsTo": "https://example.com"
}
When I pass the variable in the URL like this: http://127.0.0.1:5000/api/PDL/1403
I get the following error: TypeError: string indices must be integers
Knowing that the "identifier" field is a string and I am passing the following in the URL:
http://127.0.0.1:5000/api/PDL/"US-XYZ-ABC-36"
http://127.0.0.1:5000/api/PDL/US-XYZ-ABC-36
I keep getting the following error:
TypeError: string indices must be integers
Any idea on what am I missing here? I am new to Python!
The problem is that you are iterating over the dictionary instead of the list of datasets inside it. As a consequence, you're iterating through the keys of the dictionary, which are strings. Additionally, as was mentioned above, you will have problems if you use the same name for the list and the iteration variable.
This worked for me:
[ds for ds in dataset['dataset'] if ds['identifier'] == dataset_identifier]
The problem you have right now is that during iteration in the list comprehension, the very first iteration changes the name dataset from meaning the dict you json.loads-ed to a key of that dict (dicts iterate their keys). So when you try to look up a value in dataset with dataset['identifier'], dataset isn't the dict anymore, it's the str key you're currently iterating over.
Stop reusing the same name to mean different things.
From the JSON you posted, what you probably want is something like:
with open('C:/test.json', encoding="latin-1") as f:
    alldata = json.loads(f.read())
@app.route('/api/PDL/<string:dataset_identifier>', methods=['GET'])
def get_task(dataset_identifier):
    # Gets the list of data objects from top level object
    # Could be inlined into list comprehension, replacing dataset with alldata['dataset']
    dataset = alldata['dataset']
    # data is a single object in that list, which should have an identifier key
    # data_for_id is the list of objects passing the filter
    data_for_id = [data for data in dataset if data['identifier'] == dataset_identifier]
    # abort also needs to be imported from flask
    if len(data_for_id) == 0:
        abort(404)
    return jsonify({'dataset': data_for_id})
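The filtering logic can be checked on its own, without Flask. The sketch below assumes a trimmed copy of test.json as an in-memory dictionary, and find_by_identifier is a hypothetical helper name.

```python
def find_by_identifier(alldata, identifier):
    """Return the entries of alldata['dataset'] whose 'identifier' matches."""
    return [
        entry
        for entry in alldata.get("dataset", [])
        if entry.get("identifier") == identifier
    ]


# Trimmed copy of test.json from the question.
alldata = {
    "dataset": [{"identifier": "US-XYZ-ABC-36", "title": "September 2015"}],
    "conformsTo": "https://example.com",
}

# Iterating the top-level dict yields its keys, which are strings;
# indexing one of those strings with 'identifier' is what raised the TypeError.
print(list(alldata))

print(find_by_identifier(alldata, "US-XYZ-ABC-36"))
print(find_by_identifier(alldata, "1403"))
```

An empty result for an unknown identifier is the case the route would turn into a 404.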
