How to Optimally Save Data in Python as Data Structure

My model returns information about PC games as (game index, game value) pairs. This is my sims_sorted:
[(778, 0.99999994), (1238, 0.9999997), (1409, 0.99999905), (1212, 0.99999815)]
I retrieve the information about the game by indexing the database (df_indieGames):
sims_sorted = sorted(enumerate(sims), key=lambda item: -item[1])
results = {}
for index, value in sims_sorted[:4]:
    results[df_indieGames.game_name.loc[index]] = {
        "Genre": df_indieGames.genre.loc[index],
        "Rating": df_indieGames.score.loc[index],
        "Link": df_indieGames.game_link.loc[index],
    }
However, such a data structure is hard to sort (by Rating). Is there a better way to store the information so that retrieval and sorting are easier? Thanks.
Here's the output of results:
{u'Diehard Dungeon': {'Genre': u'Roguelike',
'Link': u'http://www.indiedb.com/games/diehard-dungeon',
'Rating': 8.4000000000000004},
u'Fork Truck Challenge': {'Genre': u'Realistic Sim',
'Link': u'http://www.indiedb.com/games/fork-truck-challenge',
'Rating': 7.4000000000000004},
u'Miniconomy': {'Genre': u'Realistic Sim',
'Link': u'http://www.indiedb.com/games/miniconomy',
'Rating': 7.2999999999999998},
u'World of Padman': {'Genre': u'First Person Shooter',
'Link': u'http://www.indiedb.com/games/world-of-padman',
'Rating': 9.0}}
UPDATE
The solution to the problem as suggested by ziddarth is the following:
result = sorted(results.iteritems(), key=lambda x: x[1]['Rating'], reverse=True)

You can sort by rating using the code below. The lambda function is called with a tuple whose first element is the dictionary key and whose second element is the value stored under that key, so the lambda can reach any value in the nested dictionary. (On Python 3, use results.items() instead of results.iteritems().)
sorted(results.iteritems(), key=lambda x: x[1]['Rating'])
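
As a hedged sketch of an alternative layout (Python 3): keeping one record per game in a list makes sorting a single sorted() call, and the nested-dict layout from the question sorts the same way once iteritems() is replaced by items(). The sample values are copied from the question's output; the names records, by_rating and ranked are illustrative and not from the original post.

```python
from operator import itemgetter

# Records copied from the question's sample output; one dict per game.
records = [
    {"Name": "Diehard Dungeon", "Genre": "Roguelike", "Rating": 8.4},
    {"Name": "Fork Truck Challenge", "Genre": "Realistic Sim", "Rating": 7.4},
    {"Name": "Miniconomy", "Genre": "Realistic Sim", "Rating": 7.3},
    {"Name": "World of Padman", "Genre": "First Person Shooter", "Rating": 9.0},
]

# Sort highest rating first; itemgetter("Rating") is equivalent to
# lambda r: r["Rating"].
by_rating = sorted(records, key=itemgetter("Rating"), reverse=True)

# The nested-dict layout from the question, sorted the Python 3 way.
results = {r["Name"]: {"Genre": r["Genre"], "Rating": r["Rating"]} for r in records}
ranked = sorted(results.items(), key=lambda kv: kv[1]["Rating"], reverse=True)

print([r["Name"] for r in by_rating])
print([name for name, _ in ranked])
```

Both calls give the same ordering; the list-of-records form simply avoids unpacking (key, value) tuples afterwards.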

Related

Given the value of a dictionary field, how can I find a dictionary in a list of dictionaries?

Based on a list of fastfoods (list of dictionaries), for each fastfood (each dictionary), I'm trying to 1) extract the competitor's name and then 2) use that competitor's name to retrieve its shortname.
Currently my solution really doesn't make sense and I have a feeling it might be recursive? I'm really having an issue conceptualizing this.
fastfoods = [{'id': 30, 'name': 'McDonalds', 'shortname': 'MC', 'competitor': 'BurgerKing'}, {'id': 47, 'name': 'BurgerKing', 'shortname': 'BK', 'competitor': None}]
for fastfood in fastfoods:
    competitor_name = fastfood.get('competitor')
    short_name = fastfood.get('shortname')
    for fastfood in fastfoods:
        if competitor_name == short_name:
            print(fastfood.get('shortname'))
Here's what I'm trying to achieve. The real data has thousands of dictionaries; I'm using 2 just for the example.
So here, I loop over the dictionaries, I reach the first dictionary, I extract the competitor's name ('BurgerKing'). At this point, I want to search for 'BurgerKing' as a 'name' field (not as a competitor field). Once that's found, I access that dictionary where the 'name' field == 'BurgerKing' and extract the shortname ('BK').

I think you're looking for something like this:
byName = {dct['name']: dct for dct in fastfoods}
for fastfood in fastfoods:
    if 'competitor' in fastfood and fastfood['competitor'] is not None:
        competitor = byName[fastfood['competitor']]
        if 'shortname' in competitor:
            print(competitor['shortname'])
        else:
            print(f"competitor {fastfood['competitor']} has no shortname")
Explanation:
create byName, a dictionary that indexes dicts in fastfoods by their name entry
iterate over all dicts in fastfoods
if a given dict has a competitor entry and it's non-null, look up the dict for that competitor by name in byName and if it has a shortname entry print it
otherwise print a message indicating there is no shortname entry for the competitor (you can do something different in this case if you like).
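
The steps above can also be wrapped in a small helper. This is only a sketch: the function name competitor_shortname and the choice to return None for anything unresolved (no competitor, a competitor that is not in the list, or a competitor without a shortname) are assumptions, not part of the original answer.

```python
fastfoods = [
    {"id": 30, "name": "McDonalds", "shortname": "MC", "competitor": "BurgerKing"},
    {"id": 47, "name": "BurgerKing", "shortname": "BK", "competitor": None},
]

# Build the name -> record index once; each later lookup is a dict access
# instead of another pass over the whole list.
by_name = {record["name"]: record for record in fastfoods}


def competitor_shortname(record, index):
    """Return the competitor's shortname, or None if it cannot be resolved."""
    # index.get(...) returns None for a missing or None competitor name,
    # so an unknown competitor does not raise KeyError.
    competitor = index.get(record.get("competitor"))
    if competitor is None:
        return None
    return competitor.get("shortname")


shortnames = {r["name"]: competitor_shortname(r, by_name) for r in fastfoods}
print(shortnames)
```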

I would first construct a dictionary that maps a name to its shortened version, and then use it. This would be way faster than looking for the competitor in the list over and over again.
fastfoods = [{'id': 30, 'name': 'McDonalds', 'shortname': 'MC', 'competitor': 'BurgerKing'}, {'id': 47, 'name': 'BurgerKing', 'shortname': 'BK', 'competitor': None}]
name_to_short = {dct['name']: dct['shortname'] for dct in fastfoods}
for dct in fastfoods:
    print(f"The competitor of {dct['name']} is: {name_to_short.get(dct['competitor'], 'None')}")
# The competitor of McDonalds is: BK
# The competitor of BurgerKing is: None

Parsing nested dictionary to dataframe

I am trying to create a data frame from a JSON file. Each album_details entry holds a nested dict like this:
{'api_path': '/albums/491200',
'artist': {'api_path': '/artists/1421',
'header_image_url': 'https://images.genius.com/f3a1149475f2406582e3531041680a3c.1000x800x1.jpg',
'id': 1421,
'image_url': 'https://images.genius.com/25d8a9c93ab97e9e6d5d1d9d36e64a53.1000x1000x1.jpg',
'iq': 46112,
'is_meme_verified': True,
'is_verified': True,
'name': 'Kendrick Lamar',
'url': 'https://genius.com/artists/Kendrick-lamar'},
'cover_art_url': 'https://images.genius.com/1efc5de2af228d2e49d91bd0dac4dc49.1000x1000x1.jpg',
'full_title': 'good kid, m.A.A.d city (Deluxe Version) by Kendrick Lamar',
'id': 491200,
'name': 'good kid, m.A.A.d city (Deluxe Version)',
'url': 'https://genius.com/albums/Kendrick-lamar/Good-kid-m-a-a-d-city-deluxe-version'}
I want to create another column in the data frame with just the album name, which is one of the keys in the dict above:
'name': 'good kid, m.A.A.d city (Deluxe Version)',
I have been trying to work out how to do this for a long time. Can someone please help me? Thanks.

If that is the case, use the .str accessor to look up the dict key:
df['name'] = df['album_details'].str['name']

If you have the dataframe stored in the df variable you could do:
df['artist_name'] = [x['artist']['name'] for x in df['album_details'].values]

You can use apply with lambda function:
df['album_name'] = df['album_details'].apply(lambda d: d['name'])
Basically, the lambda function is executed for each value of the column 'album_details'. The argument 'd' is the album dictionary. apply returns a Series of the function's return values, which you can assign to a new column.
See: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html
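
A minimal sketch of the approaches above on a one-row frame, assuming pandas 1.0 or later is installed. The dict is a trimmed copy of the one in the question; pd.json_normalize is an additional option not mentioned in the answers, shown here because it flattens nested keys such as artist.name into columns.

```python
import pandas as pd

# Trimmed copy of the album dict from the question.
df = pd.DataFrame({
    "album_details": [
        {
            "id": 491200,
            "name": "good kid, m.A.A.d city (Deluxe Version)",
            "artist": {"id": 1421, "name": "Kendrick Lamar"},
        }
    ]
})

# Top-level key: call the lambda once per dict in the column.
df["album_name"] = df["album_details"].apply(lambda d: d["name"])

# Nested keys: flatten every dict; nested fields become dotted column names.
flat = pd.json_normalize(df["album_details"].tolist())
# to_numpy() sidesteps index alignment in case df has a non-default index.
df["artist_name"] = flat["artist.name"].to_numpy()

print(df[["album_name", "artist_name"]])
```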

how to get data in dict from list and store back in dict form? [closed]

Here is the dictionary form
abc = {
    'if1': {'name': 'data', 'date': '80980'},
    'if2': {'name': 'data_1', 'date': '9886878'},
    'if3': {'name': 'data', 'date': '0987667'},
    'if4': {'name': 'data__5', 'date': '0987667'},
    'if5': {'date': '0987667'}
}
I am trying to filter it by 'name', with the filter given as a list:
list_item = ['data','data_1']
It should give me the dates grouped by name, as follows:
{
    data:['80980', '0987667'],
    data_1:['9886878']
}
Please help me resolve this issue.

For the resulting dictionary we create a defaultdict with an empty list as the default value. Then we loop over all values of 'abc' and check whether the entry has the key 'name' and whether the corresponding value is in list_item. If so, we use the name as a key in the resulting dictionary and append the value stored under 'date'.
import collections

abc = {
    'if1': {'name': 'data', 'date': '80980'},
    'if2': {'name': 'data_1', 'date': '9886878'},
    'if3': {'name': 'data', 'date': '0987667'},
    'if4': {'name': 'data__5', 'date': '0987667'},
    'if5': {'date': '0987667'}
}
list_item = ['data','data_1']
result = collections.defaultdict(list)
for item in abc.values():
    if item.get('name', None) in list_item:
        result[item['name']].append(item['date'])
print(result)
Another approach is to loop over the values in 'list_item':
result = {}
for key in list_item:
    result[key] = [item['date'] for item in abc.values() if item.get('name', None) == key]
print(result)
Using a dictionary comprehension you can transform the last solution into a one-liner (but I prefer a more readable style):
result = {key:[item['date'] for item in abc.values() if item.get('name', None) == key] for key in list_item}

This dict is so complex you can't just filter it, you have to convert it to something more useful.
abc= { 'if1':{'name':'data','date':'80980'}, 'if2':{'name':'data_1','date':'9886878'}, 'if3':{'name':'data','date':'0987667'}, 'if4':{'name':'data__5','date':'0987667'}}
[The dictionary was made wrongly, I assumed 2nd if4 is a typo and I deleted it when I copied the text.]
First, let's flatten it by removing the inside dictionary, making date our dict's value:
formatted1 = {(key, subdict['name']): subdict['date'] for key, subdict in abc.items() if 'name' in subdict}
I kept the original key as part of the new key because otherwise we'd overwrite our data entry.
Our new dict looks like this:
{('if1', 'data'): '80980', ('if2', 'data_1'): '9886878', ('if3', 'data'): '0987667', ('if4', 'data__5'): '0987667'}
Now it's easier to work with. Let's do a simple loop to format it further:
formatted2 = {}
# We skip the first part of the key, hence the throwaway name "_".
for (_, key), value in formatted1.items():
    elem = formatted2.get(key, [])
    elem.append(value)
    formatted2[key] = elem
The resulting dict now looks like this:
{'data': ['80980', '0987667'], 'data_1': ['9886878'], 'data__5': ['0987667']}
And now it's finally in a form we can easily filter!
list_item = ['data','data_1']
result = {k: formatted2[k] for k in formatted2 if k in list_item}
Result:
{'data': ['80980', '0987667'], 'data_1': ['9886878']}

django-tables2 sorting non-queryset data

Hi I have a table thus:
class MyTable(tables.Table):
    brand = tables.Column(verbose_name='Brand')
    market = tables.Column(verbose_name='Market')
    date = tables.DateColumn(format='d/m/Y')
    value = tables.Column(verbose_name='Value')
I'm populating the table with a list of dicts and then I'm configuring it and rendering it in my view in the usual django-tables2 way:
data = [
    {'brand': 'Nike', 'market': 'UK', 'date': <datetime obj>, 'value': 100},
    {'brand': 'Django', 'market': 'FR', 'date': <datetime obj>, 'value': 100},
]
table = MyTable(data)
RequestConfig(request, paginate=False).configure(table)
context = {'table': table}
return render(request, 'my_tmp.html', context)
This all renders the table nicely with the correct data in it. However, on the webpage I can sort by the date column and the value column, but not by brand or market. So it seems non-string values can be sorted but strings can't. I've been trying to figure out how to do this sorting. Is it possible with non-queryset data?
I can't populate the table with a queryset, as I'm using custom methods to generate the value column data. Any help appreciated! I guess I need to specify the order_by parameter in my tables.Column but haven't found the correct setting yet.

data.sort(key=lambda item: item["brand"].lower()) sorts the list in place, alphabetically by brand and ignoring case. Because it modifies the original 'data' list, the call itself returns None. The same can be done for any key.
Alternatively, sorted(data, key=lambda item: item["brand"].lower()) returns a sorted copy of the list and leaves 'data' unchanged.
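
A short sketch of the difference between the two calls, using made-up rows (the brand values are illustrative and the date field is omitted for brevity). It only demonstrates sorting the list before it is passed to the table; it makes no claim about how django-tables2 handles column-header sorting.

```python
# Hypothetical rows; mixed-case brands show why .lower() matters.
data = [
    {"brand": "nike", "market": "UK", "value": 100},
    {"brand": "Django", "market": "FR", "value": 100},
    {"brand": "adidas", "market": "DE", "value": 50},
]


def brand_key(item):
    """Case-insensitive sort key for the 'brand' field."""
    return item["brand"].lower()


# sorted() returns a new list and leaves `data` untouched.
ordered = sorted(data, key=brand_key)
first_before = data[0]["brand"]  # still the original first row

# list.sort() reorders `data` itself and returns None.
returned = data.sort(key=brand_key)

print([row["brand"] for row in ordered])
print(returned)
```

Without the .lower() call, uppercase letters sort before lowercase ones, so "Django" would come before "adidas".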

Merge Dicts in List calculating min, max and average in Python

I am trying to write a World of Warcraft auction house analysis tool.
For each auction I have data that looks like this:
{
    'timeLeftHash': 4,
    'bid': 3345887,
    'timestamp': 1415339912,
    'auc': 1438188059,
    'quantity': 1,
    'id': 309774,
    'ownerHash': 751,
    'buy': 3717652,
    'ownerRealmHash': 1,
    'item': 35965
}
I'd like to combine all dicts that have the same value of "item" so I can get minBuy, avgBuy, maxBuy, minQuantity, avgQuantity, maxQuantity and the number of combined auctions for that item.
How can I achieve that?
I already tried writing the results into a second list of dicts, but then the min and max are missing...

You could make a dictionary where the key is the item ID and the value is a list of (price, quantity) tuples.
If you would like to keep all the information, you could instead make a dictionary where the key is the item ID and the value is a list of the auction dictionaries for that ID, and then extract the fields you want with a generator.

data = [
    {'item': 35964, 'buy': 3717650, ...},  # remaining fields elided
    {'item': 35965, 'buy': 3717652, ...},
    ...
]
# Group the buy prices by item ID.
by_item = {}
for d in data:
    by_item.setdefault(d['item'], []).append(d['buy'])
# items() works on both Python 2 and 3 (iteritems() is Python 2 only).
stats = {k: {'minBuy': min(v), 'maxBuy': max(v)}
         for k, v in by_item.items()}
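
The snippet above covers only the minimum and maximum buy price. A hedged Python 3 sketch of the full set the question asks for (min, average and max of both buy and quantity, plus the auction count) could look like this. Only the first auction uses values from the question; the other two rows, and the names auctions and summarize, are made up for illustration.

```python
from collections import defaultdict

auctions = [
    {"item": 35965, "buy": 3717652, "quantity": 1},  # values from the question
    {"item": 35965, "buy": 3500000, "quantity": 5},  # made-up row
    {"item": 35964, "buy": 1000000, "quantity": 2},  # made-up row
]

# Group whole auction dicts by item ID so every field stays available.
by_item = defaultdict(list)
for auction in auctions:
    by_item[auction["item"]].append(auction)


def summarize(values):
    """Return min, average and max of a non-empty list of numbers."""
    return {"min": min(values), "avg": sum(values) / len(values), "max": max(values)}


stats = {}
for item_id, group in by_item.items():
    stats[item_id] = {
        "buy": summarize([a["buy"] for a in group]),
        "quantity": summarize([a["quantity"] for a in group]),
        "count": len(group),
    }

print(stats[35965])
```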
