Add values to dictionary key using loop - python

I have a large database of the following type:
data = {
"2": {"overall": 172, "buy": 172, "name": "ben", "id": 2, "sell": 172},
"3": {"overall": 173, "buy": 173, "name": "dan", "id": 3, "sell": 173},
"4": {"overall": 174, "buy": 174, "name": "josh", "id": 4, "sell": 174},
...
and so on for about 10k rows.
Then I wrote a loop to check whether specific names appear in this dict:
items = ["ben","josh"]
Database = dict()
Database = {"Buying_Price": "", "Selling_Price": ""}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"] = Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy'])
        Database["Selling_Price"] = Database["Selling_Price"].append(mastervalue['sell'])
However, I'm getting the following error:
Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy_average'])
AttributeError: 'str' object has no attribute 'append'
My goal is to obtain a dict named Database with 2 keys, Buying_Price and Selling_Price, each holding the following:
Buying_Price = {"ben":172,"josh":174}
Selling_Price = {"ben":172,"josh":174}
Thank you.

There are a couple of issues with the code you posted, so we'll go line by line and fix them:
items = ["ben", "josh"]
Database = dict()
Database = {"Buying_Price": "", "Selling_Price": ""}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"] = Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy_average'])
        Database["Selling_Price"] = Database["Selling_Price"].append(mastervalue['sell_average'])
In Python, you don't need to declare an object's type before assigning its value, so Database = dict() is redundant: the next line already defines Database as a dictionary.
You intend to aggregate the results of the if statement, so both Buying_Price and Selling_Price should be defined as lists and not as strings. You can do that either with the literal [] or by calling list().
According to your data structure, you don't have buy_average and sell_average keys, only buy and sell, so make sure you use the correct keys.
You don't need to re-assign the list when using the append() method. It updates the list in place and returns None, so the re-assignment replaces your list with None.
You never defined Name in your Database object, yet you're trying to append values to it.
Overall, the code should roughly look like this:
items = ["ben","josh"]
Database = {"Buying_Price": [], "Selling_Price": [], "Name": []}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"].append(mastervalue['buy'])
        Database["Selling_Price"].append(mastervalue['sell'])
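To see why re-assigning the result of append() breaks the loop, here is a minimal sketch. The helper name append_in_place is made up for illustration and is not part of the original code.

```python
def append_in_place(values, item):
    """Append item to values and return whatever list.append() returned."""
    return values.append(item)


prices = []
returned = append_in_place(prices, 172)

print(returned)  # None: append() mutates the list and returns nothing
print(prices)    # [172]
```

Assigning that None back to the dictionary key discards the list, so the next call to append() fails.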

It sounds like you want a nested dict.
items = ["ben", "josh"]
new_database = {"Buying_Price": {}, "Selling_Price": {}}
for key, row in data.items():
    name = row["name"]
    if name in items:
        new_database["Buying_Price"][name] = row["buy"]
        new_database["Selling_Price"][name] = row["sell"]
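As a self-contained sketch of the nested-dict approach, the loop can be wrapped in a function and run against the three sample rows from the question. The function name select_prices and the use of a set for the name lookup are my own choices, not part of the original answer.

```python
def select_prices(data, names):
    """Build {"Buying_Price": {name: buy}, "Selling_Price": {name: sell}}."""
    wanted = set(names)  # set membership checks stay fast with ~10k rows
    result = {"Buying_Price": {}, "Selling_Price": {}}
    for row in data.values():
        name = row["name"]
        if name in wanted:
            result["Buying_Price"][name] = row["buy"]
            result["Selling_Price"][name] = row["sell"]
    return result


data = {
    "2": {"overall": 172, "buy": 172, "name": "ben", "id": 2, "sell": 172},
    "3": {"overall": 173, "buy": 173, "name": "dan", "id": 3, "sell": 173},
    "4": {"overall": 174, "buy": 174, "name": "josh", "id": 4, "sell": 174},
}

print(select_prices(data, ["ben", "josh"]))
# {'Buying_Price': {'ben': 172, 'josh': 174}, 'Selling_Price': {'ben': 172, 'josh': 174}}
```

If two rows share a name, the later row overwrites the earlier one.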

In Database = {"Buying_Price": "", "Selling_Price": ""}, you are defining the value of the key Buying_Price as "", which is a string. You are then calling the list method .append() on a string, hence the error 'str' object has no attribute 'append'.
We do not know the exact output you want, but judging by how you process your data, I suggest you use:
Database = {"Name" : [], "Buying_Price": [], "Selling_Price": []}
instead of the original...
Database = {"Buying_Price": "", "Selling_Price": ""}
This way, you will be able to append Name, Buying_Price, and Selling_Price at the same time, and the same index will give you the matching entry in all three lists.
I overlooked this at first, but you are also appending to your dict incorrectly.
.append() works in place, meaning that you should write:
Database["Name"].append(mastervalue['name'])
instead of
Database["Name"] = Database["Name"].append(mastervalue['name'])

Create dictionary using JSON data

I have a JSON file that has movie data in it. I want to create a dictionary that has the movie title as the key and a count of how many actors are in that movie as the value. An example from the JSON file is below:
{
  "title": "Marie Antoinette",
  "year": "2006",
  "genre": "Drama",
  "summary": "Based on Antonia Fraser's book about the ill-fated Archduchess of Austria and later Queen of France, 'Marie Antoinette' tells the story of the most misunderstood and abused woman in history, from her birth in Imperial Austria to her later life in France.",
  "country": "USA",
  "director": {
    "last_name": "Coppola",
    "first_name": "Sofia",
    "birth_date": "1971"
  },
  "actors": [
    {
      "first_name": "Kirsten",
      "last_name": "Dunst",
      "birth_date": "1982",
      "role": "Marie Antoinette"
    },
    {
      "first_name": "Jason",
      "last_name": "Schwartzman",
      "birth_date": "1980",
      "role": "Louis XVI"
    }
  ]
}
I have the following but it's counting all of the actors from all of the movies instead of each movie and the number of actors per movie. I'm not sure how to do this correctly as I'm newer to Python so help would be great.
import json
def actor_count(json_data):
    with open("movies_db.json", 'r') as file:
        data = json.load(file)
        for t in data:
            title = [t['title'] for t in data]
        for element in data:
            for actor in element['actors']:
                rolee = [actor['role'] for movie in data for actor in movie['actors']]
                len_role = [len(role)]
        newD = dict(zip(title, len_role))
        print(newD)
json_data = open('movies_db.json')
actor_count(json_data)
You show json that only contains a dictionary, yet you seem to process it as if it were a list of dictionaries with the structure you have shown. Pending clarification, I am answering here as if the latter is true -- you have a list of dictionaries, since you would be asking a different question about a different error if this was not the case.
In your function, each element of data is a dictionary that contains the information for a single movie. To get a dict correlating the title to the count of actors in this movie, you just need to access the "title" key and the length of the "actors" key for each element.
def actor_count(json_data):
    movie_actors = {}
    for movie in json_data:
        title = movie["title"]
        num_actors = len(movie["actors"])
        movie_actors[title] = num_actors
    return movie_actors
Alternatively, use a dictionary comprehension to build this dictionary:
def actor_count(json_data):
    movie_actors = {movie["title"]: len(movie["actors"]) for movie in json_data}
    return movie_actors
Now, load your json file once, and pass the parsed data to actor_count. This will return a dictionary mapping each movie title to the number of actors.
with open("movies_db.json", 'r') as file:
    data = json.load(file)
actor_count(data)
Note that loading the json file again in the function is unnecessary, since you already did it before calling the function, and are passing the parsed object to the function.
If you want to keep your current logic of using list comprehensions, and then zipping the resultant lists to create a dict, that is also possible although slightly less efficient. There are significant changes you will need to make:
def actor_count(json_data):
    title = [t['title'] for t in json_data]
    n_actors = [len(t['actors']) for t in json_data]
    newD = dict(zip(title, n_actors))
    return newD
As before, no need to read the file again in the function
You're already looping over all elements in json_data as part of the list comprehension, so no need for another loop outside this.
You can get the number of actors simply by len(t['actors'])
You seem to have misconceptions about how list comprehensions and loops work. A list comprehension is a self-contained loop that builds a list. If you have a list comprehension, there's usually no need to surround it by the same for ... in ... statement that already exists in the comprehension.
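Here is a self-contained sketch of the dictionary-comprehension version that runs without movies_db.json. The second movie ("Example Film") is invented sample data, and wrapping a single dict in a list is my own addition to cover the case where the file holds one movie rather than a list.

```python
import json


def actor_count(movies):
    """Map each movie title to the number of actors listed for it."""
    if isinstance(movies, dict):
        movies = [movies]  # assumption: a lone dict is a single movie
    return {movie["title"]: len(movie["actors"]) for movie in movies}


# Inline JSON standing in for the contents of movies_db.json.
raw = """
[
  {"title": "Marie Antoinette",
   "actors": [{"role": "Marie Antoinette"}, {"role": "Louis XVI"}]},
  {"title": "Example Film",
   "actors": [{"role": "Lead"}]}
]
"""
movies = json.loads(raw)

print(actor_count(movies))
# {'Marie Antoinette': 2, 'Example Film': 1}
```

Movies that share a title would collapse into one key, so a real database may need a more specific key than the title alone.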
def actor_count(json_data):
    newD = dict()
    with open("movies_db.json", 'r') as file:
        data = json.load(file)
    # data is a single movie dict here, so iterate over its keys
    for t in data:
        if t == 'title':
            title_ = data[t]
            newD[title_] = 0
        if t == 'actors':
            newD[title_] = len(data[t])
    print(newD)
Output:
{'Marie Antoinette': 2}

cx_Oracle string representation of to_date() throws an error

I am constructing a dynamic update statement based on values from a dict.
dict1 = {"name": "Test name",
         "id": 100,
         "location": "",
         "custom": "01/01/2020"}
print(dict1)
print()
dict2 = {}
func1 = 'to_date({}, "dd/mm/yyyy")'
for k, v in dict1.items():
    if k == 'custom':
        v = func1.format(f'"{v}"')
    dict2[k] = v
print(dict2)
{'name': 'Test name', 'id': 100, 'location': '', 'custom': 'to_date("01/01/2020", "dd/mm/yyyy")'}
The dictionary is built as expected, because func1 is defined as a string. However, I am trying to find out whether I can store to_date("01/01/2020", "dd/mm/yyyy") in the dict without the surrounding single quotes (that is, not as a string) instead of 'to_date("01/01/2020", "dd/mm/yyyy")'.
I use this value in further processing with cx_Oracle. The resulting update statement, shown below, throws the error 'ORA-01858: a non-numeric character was found where a numeric was expected':
update table1 set dt='to_date("01/01/2020", "dd/mm/yyyy")' where x=y
Any suggestions would be helpful.
Put the to_date() in the SQL statement and bind only the data itself:
dict1 = {"name": "Test name",
"id": 100,
"location": "wherever",
"custom": "01/01/2020"}
sql = """update table1 set dt=to_date(:custom, 'dd/mm/yyyy') where id=:id and location=:location and name = :name"""
cursor.execute(sql, dict1)
Note the bind variable names match the dict key names.
See the cx_Oracle documentation on binding variables for details.
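Since the question is about building the statement dynamically, here is a rough sketch of how the SQL text could be generated while still binding only the data. The helper build_update and its parameters are hypothetical, and it assumes each column name equals its dict key (unlike the dt/:custom pair above). Table and column names are interpolated into the SQL, so they must come from trusted code, never from user input.

```python
def build_update(table, set_cols, where_cols, date_cols=()):
    """Return an UPDATE statement that uses named bind variables."""

    def placeholder(col):
        # Date columns get to_date() in the SQL text; the value stays a bind.
        if col in date_cols:
            return f"to_date(:{col}, 'dd/mm/yyyy')"
        return f":{col}"

    set_clause = ", ".join(f"{col} = {placeholder(col)}" for col in set_cols)
    where_clause = " and ".join(f"{col} = :{col}" for col in where_cols)
    return f"update {table} set {set_clause} where {where_clause}"


dict1 = {"name": "Test name", "id": 100,
         "location": "wherever", "custom": "01/01/2020"}

sql = build_update("table1", ["custom"], ["id", "location", "name"],
                   date_cols={"custom"})
print(sql)
# update table1 set custom = to_date(:custom, 'dd/mm/yyyy') where id = :id and location = :location and name = :name

# With an open cx_Oracle cursor, the dict supplies the bind values:
# cursor.execute(sql, dict1)
```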

How do I extract a list item from nested json in Python?

I have a json object and I'm trying to extract a couple of values from a nested list, then print them in markup. I'm getting an error - AttributeError: 'list' object has no attribute 'get'
I understand that it's a list and I can't perform a get. I've been searching for the proper method for a few hours now and I'm running out of steam. I'm able to get the Event, but not Value1 and Value2.
This is the json object
{
  "resource": {
    "data": {
      "event": "qwertyuiop",
      "eventVersion": "1.05",
      "parameters": {
        "name": "sometext",
        "othername": [
          ""
        ],
        "thing": {
          "something": {
            "blah": "whatever"
          },
          "abc": "123",
          "def": {
            "xzy": "value"
          }
        },
        "something": [
          "else"
        ]
      },
      "whatineed": [{
        "value1": "text.i.need",
        "value2": "text.i.need.also"
      }]
    }
  }
}
And this is my function
def parse_json(json_data: dict) -> Info:
    some_data = json_data.get('resource', {})
    specific_data = some_data.get('data', {})
    whatineed_data = specific_data.get('whatineed', {})
    formatted_json = json.dumps(json_data, indent=2)
    description = f'''
h3. Details
*Event:* {some_data.get('event')}
*Value1:* {whatineed_data('value1')}
*Value2:* {whatineed_data('value2')}
'''
From the data structure, whatineed is a list with a single item, which in turn is a dictionary. So, one way to access it would be:
whatineed_list = specific_data.get('whatineed', [])
whatineed_dict = whatineed_list[0]
At this point you can do:
value1 = whatineed_dict.get('value1')
value2 = whatineed_dict.get('value2')
You can change your function to the following:
def parse_json(json_data: dict) -> Info:
    some_data = json_data.get('resource', {})
    specific_data = some_data.get('data', {})
    whatineed_data = specific_data.get('whatineed', [])
    formatted_json = json.dumps(json_data, indent=2)
    description = '''
h3. Details
*Event:* {}
*Value1:* {}
*Value2:* {}
'''.format(specific_data.get('event'), whatineed_data[0]['value1'], whatineed_data[0]['value2'])
Since whatineed_data is a list, you need to index the element first
JSON is plain text until you parse it: json.loads() parses a string and json.load() parses a file object, and both return ordinary Python dicts and lists. If the data has not been parsed yet, that could be the source of some of your problems.
Assuming that the "whatineed" attribute really is a list and its elements are dicts, you can't call whatineed.get to ask for value1 or value2, because a list has no get method.
So, you have two options:
If the whatineed list only ever has a single element, you can access that element directly and then read its values:
element = whatineed[0]
v1 = element.get('value1', {})
v2 = element.get('value2', {})
Or, if the whatineed list can have more items, you will need to iterate over the list and access each element:
for element in whatineed:
    v1 = element.get('value1', {})
    v2 = element.get('value2', {})
    ## Do something with values
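Putting the pieces together, a defensive version might look like the sketch below. The function name extract_values is hypothetical, and returning None for missing values is an assumption about what the caller wants. It reads only the first element of whatineed and tolerates a missing or empty list.

```python
def extract_values(json_data):
    """Return (event, value1, value2) from the nested structure."""
    data = json_data.get("resource", {}).get("data", {})
    items = data.get("whatineed") or []   # missing key or None -> empty list
    first = items[0] if items else {}     # avoid IndexError on an empty list
    return data.get("event"), first.get("value1"), first.get("value2")


sample = {
    "resource": {
        "data": {
            "event": "qwertyuiop",
            "whatineed": [
                {"value1": "text.i.need", "value2": "text.i.need.also"}
            ],
        }
    }
}

print(extract_values(sample))
# ('qwertyuiop', 'text.i.need', 'text.i.need.also')
print(extract_values({}))
# (None, None, None)
```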

Can't iterate over my own object

I am new to Python and can't figure this out. I am trying to make an object from a json feed. Basically, I want a dictionary for each item in the json feed that has every property. The error I get is either TypeError: 'mediaObj' object is not subscriptable or not iterable.
For bonus points, the array has many sub dictionaries too. What I would like is to be able to access that nested data as well.
Here is my code:
url = "jsonfeedwithalotofdata.com"  # placeholder URL
instagramData = urllib.request.urlopen(url)
data = instagramData.read()
data = data.decode('utf-8')
data = json.loads(data)
media = data['data']
class mediaObj:
    def __init__(self, item):
        for key in item:
            setattr(self, key, item[key])
            print(self[key])
    def run(self):
        return self['id']
for item in media:
    mediaPiece = mediaObj(item)
This would come from a json feed that looks as follows (so data is the array that comes after "data"):
"data": [
{
"attribution": null,
"videos": {},
"tags": [],
"type": "video",
"location": null,
"comments": {},
"filter": "Normal",
"created_time": "1407423448461",
"link": "http://instagram.com/p/rabdfdIw9L7D-/",
"likes": {},
"images": {},
"users_in_photo": [],
"caption": {},
"user_has_liked": true,
"id": "782056834879232959294_1051813051",
"user": {}
}
So my hope was that I could create an object for every item in the array, and then I could, for instance, say:
print(mediaPiece['id'])
or even better
print(mediaPiece['comments'])
And see a list of comments. Thanks a million
You're having a problem because you're using attributes to store your data items, but using list/dictionary lookup syntax to try to retrieve them.
Instead of print(self[key]), use print(getattr(self, key)), and instead of return self['id'] use return self.id.
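A minimal sketch of the corrected class follows, using a trimmed, hard-coded item in place of the feed. The __getitem__ method is my own optional addition, not part of the answer above: it makes the mediaPiece['id'] syntax from the question work alongside attribute access.

```python
class MediaObj:
    def __init__(self, item):
        # Store every key of the feed item as an attribute.
        for key, value in item.items():
            setattr(self, key, value)

    def __getitem__(self, key):
        # Optional: lets obj['id'] behave like obj.id.
        return getattr(self, key)

    def run(self):
        return self.id


item = {
    "type": "video",
    "comments": {},
    "id": "782056834879232959294_1051813051",
}
media_piece = MediaObj(item)

print(media_piece.id)        # attribute access
print(media_piece["type"])   # subscript access via __getitem__
print(media_piece.run())
```

Nested values such as comments stay plain dicts, so media_piece.comments can be indexed as usual.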

Error handling in Python with JSON and a dictionary

I currently have a Python 2.7 script which scrapes Facebook and captures some JSON data from each page. The JSON data contains personal information. A sample of the JSON data is below:-
{
"id": "4",
"name": "Mark Zuckerberg",
"first_name": "Mark",
"last_name": "Zuckerberg",
"link": "http://www.facebook.com/zuck",
"username": "zuck",
"gender": "male",
"locale": "en_US"
}
The JSON values can vary from page to page. The above example lists all the possibles but sometimes, a value such as 'username' may not exist and I may encounter JSON data such as:-
{
"id": "6",
"name": "Billy Smith",
"first_name": "Billy",
"last_name": "Smith",
"gender": "male",
"locale": "en_US"
}
With this data, I want to populate a database table. As such, my code is as below:-
results_json = simplejson.loads(scraperwiki.scrape(profile_url))
for result in results_json:
    profile = dict()
    try:
        profile['id'] = int(results_json['id'])
    except:
        profile['id'] = ""
    try:
        profile['name'] = results_json['name']
    except:
        profile['name'] = ""
    try:
        profile['first_name'] = results_json['first_name']
    except:
        profile['first_name'] = ""
    try:
        profile['last_name'] = results_json['last_name']
    except:
        profile['last_name'] = ""
    try:
        profile['link'] = results_json['link']
    except:
        profile['link'] = ""
    try:
        profile['username'] = results_json['username']
    except:
        profile['username'] = ""
    try:
        profile['gender'] = results_json['gender']
    except:
        profile['gender'] = ""
    try:
        profile['locale'] = results_json['locale']
    except:
        profile['locale'] = ""
The reason I have so many try/excepts is to account for when the key doesn't exist on the webpage. Nonetheless, this seems to be a really clumsy and messy way to handle this issue.
If I remove these try/except clauses and my scraper encounters a missing key, it raises a KeyError such as "KeyError: 'username'" and my script stops running.
Any suggestions on a smarter way to handle these errors so that, should a missing key be encountered, the script continues?
I've tried creating a list of the JSON values and looked to iterate through them with an IF clause but I just can't figure it out.
Use the .get() method instead:
>>> a = {'bar': 'eggs'}
>>> a['foo']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'foo'
>>> a.get('foo', 'default value')
'default value'
>>> a.get('bar', 'default value')
'eggs'
The .get() method returns the value for the requested key, or the default value if the key is missing.
Or you can create a new dict with empty strings for each key and use .update() on it:
profile = dict.fromkeys('id name first_name last_name link username gender locale'.split(), '')
profile.update(result)
dict.fromkeys() creates a dictionary with all keys you request set to a given default value ('' in the above example), then we use .update() to copy all keys and values from the result dictionary, replacing anything already there.
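As a sketch of how the .get() approach could replace the whole chain of try/except blocks, the field names can be kept in one tuple and looked up in a loop. The names FIELDS and build_profile are made up for this example, and falling back to "" when the id cannot be converted mirrors the behaviour of the question's code.

```python
FIELDS = ("id", "name", "first_name", "last_name",
          "link", "username", "gender", "locale")


def build_profile(result):
    """Copy the known fields from result, using "" for any missing key."""
    profile = {field: result.get(field, "") for field in FIELDS}
    try:
        profile["id"] = int(profile["id"])
    except (TypeError, ValueError):
        profile["id"] = ""   # missing or non-numeric id
    return profile


billy = {
    "id": "6",
    "name": "Billy Smith",
    "first_name": "Billy",
    "last_name": "Smith",
    "gender": "male",
    "locale": "en_US",
}

profile = build_profile(billy)
print(profile["id"])        # 6
print(profile["username"])  # empty string, since the key is missing
```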
