I have an API response for listing out information of all Volumes. I want to loop through the response and get the value of the name and assign each one of them dynamically to each url.
This is my main API endpoint which returns the following:
[{'source': None, 'serial': '23432', 'created': '2018-11-
12T04:27:14Z', 'name': 'v001', 'size':
456456}, {'source': None, 'serial': '4364576',
'created': '2018-11-12T04:27:16Z', 'name': 'v002',
'size': 345435}, {'source': None, 'serial':
'6445645', 'created': '2018-11-12T04:27:17Z', 'name': 'v003', 'size':
23432}, {'source': None,
'serial': 'we43235', 'created': '2018-11-12T04:27:20Z',
'name': 'v004', 'size': 35435}]
I'm doing this to get the value of 'name'
test_url = 'https://0.0.0.0/api/1.1/volume'
test_data = json.loads(r.get(test_url, headers=headers,
verify=False).content.decode('UTF-8'))
new_data = [{
'name': value['name']
} for value in test_data]
final_data = [val['name'] for val in new_data]
for k in final_data:
print(k)
k prints out all the values in name, but i'm stuck at where i want to be able to use it in assigning different API endpoints. Now, k returns
v001
v002
v003
v004
I want to assign each one of them to different endpoints like below:
url_v001 = test_url + v001
url_v002 = test_url + v002
url_v003 = test_url + v003
url_v004 = test_url + v004
I want this to be dynamically done, because there may be more than 4 volume names returned by my main API.
It wouldn't be good to do that, but the best way is to use a dictionary:
d={}
for k in final_test:
d['url_'+k] = test_url + k
Or much better in a dictionary comprehension:
d={'url_'+k:test_url + k for k in final_test}
And now:
print(d)
Both reproduce:
{'url_v001': 'https://0.0.0.0/api/1.1/volumev001', 'url_v002': 'https://0.0.0.0/api/1.1/volumev002', 'url_v003': 'https://0.0.0.0/api/1.1/volumev003', 'url_v004': 'https://0.0.0.0/api/1.1/volumev004'}
To use d:
for k,v in d.items():
print(k+',',v)
Outputs:
url_v001, https://0.0.0.0/api/1.1/volumev001
url_v002, https://0.0.0.0/api/1.1/volumev002
url_v003, https://0.0.0.0/api/1.1/volumev003
url_v004, https://0.0.0.0/api/1.1/volumev004
Related
Below is an example of sports betting app I'm working on.
games.json()['data'] - contains the game id for each sport event for that day. The API then returns the odds for that specific game.
What's the fastest option to take json and turn it into a panda dataframe? currently looking into msgspec.
Some games can have over 5K total bets
master_df = pd.DataFrame()
for game in games.json()['data']:
odds_params = {'key': api_key, 'game_id': game['id'], 'sportsbook': sportsbooks}
odds = requests.get(api_url, params=odds_params)
for o in odds.json()['data'][0]['odds']:
temp = pd.DataFrame()
temp['id'] = [game['id']]
for k,v in game.items():
if k != 'id' and k != 'is_live':
temp[k] = v
for k, v in o.items():
if k == 'id':
temp['odds_id'] = v
else:
temp[k] = v
if len(master_df) == 0:
master_df = temp
else:
master_df = pd.concat([master_df, temp])
odds.json response snippet -
{'data': [{'id': '35142-30886-2023-02-08',
'sport': 'basketball',
'league': 'NBA',
'start_date': '2023-02-08T19:10:00-05:00',
'home_team': 'Washington Wizards',
'away_team': 'Charlotte Hornets',
'is_live': False,
'tournament': None,
'status': 'unplayed',
'odds': [{'id': '4BB426518ECF',
'sports_book_name': 'Betfred',
'name': 'Charlotte Hornets',
'price': 135.0,
'checked_date': '2023-02-08T11:46:12-05:00',
'bet_points': None,
'is_main': True,
'is_live': False,
'market_name': '1st Half Moneyline',
'home_rotation_number': None,
'away_rotation_number': None,
'deep_link_url': None,
'player_id': None},
....
By the end of this process, I usually have about 30K records in the dataframe
Here is what I would do.
def _create_record_(game: dict, odds: dict) -> dict:
"""
Warning: THIS MUTATES THE INPUT
"""
odds['id'] = "odds_id"
# the pipe | operator is only available in dicts in recent versions of python
# use dict(**game, **odds) if you get a TypeError
result = game | odds
result.pop("is_live")
return result
def _get_odds(game: dict) -> list:
params = {'key': api_key, 'game_id': game['id'], 'sportsbook': sportsbooks}
return requests.get(api_url, params=params).json()['data'][0]['odds']
df = pd.DataFrame(
[
_create_record_(game, odds)
for game in games.json()['data']
for odds in _get_odds(game)
]
)
The fact that it is in this list comprehenesion isn't relevant. And equivalent for-loop would work just as well, the point is you create a list of dicts first, then create your dataframe. This avoids the quadratic time behavior of incrementally creating a dataframe using pd.concat.
I have a problem. I have a dict my_Dict. This is somewhat nested. However, I would like to 'clean up' the dict my_Dict, by this I mean that I would like to separate all nested ones and also generate a unique ID so that I can later find the corresponding object again.
For example, I have detail: {...}, this nested, should later map an independent dict my_Detail_Dict and in addition, detail should receive a unique ID within my_Dict. Unfortunately, my list that I give out is empty. How can I remove my slaughtered keys and give them an ID?
my_Dict = {
'_key': '1',
'group': 'test',
'data': {},
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': {
'selector': {
'number': '12312',
'isTrue': True,
'requirements': [{
'type': 'customer',
'requirement': '1'}]
}
}
}
def nested_dict(my_Dict):
my_new_dict_list = []
for key in my_Dict.keys():
#print(f"Looking for {key}")
if isinstance(my_Dict[key], dict):
print(f"{key} is nested")
# Add id to nested stuff
my_Dict[key]["__id"] = 1
my_nested_Dict = my_Dict[key]
# Delete all nested from the key
del my_Dict[key]
# Add id to key, but not the nested stuff
my_Dict[key] = 1
my_new_dict_list.append(my_Dict[key])
my_new_dict_list.append(my_Dict)
return my_new_dict_list
nested_dict(my_Dict)
[OUT] []
# What I want
[my_Dict, my_Details_Dict, my_Data_Dict]
What I have
{'_key': '1',
'group': 'test',
'data': {},
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': {'selector': {'number': '12312',
'isTrue': True,
'requirements': [{'type': 'customer', 'requirement': '1'}]}}}
What I want
my_Dict = {'_key': '1',
'group': 'test',
'data': 18,
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': 22}
my_Data_Dict = {'__id': 18}
my_Detail_Dict = {'selector': {'number': '12312',
'isTrue': True,
'requirements': [{'type': 'customer', 'requirement': '1'}]}, '__id': 22}
The following code snippet will solve what you are trying to do:
my_Dict = {
'_key': '1',
'group': 'test',
'data': {},
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': {
'selector': {
'number': '12312',
'isTrue': True,
'requirements': [{
'type': 'customer',
'requirement': '1'}]
}
}
}
def nested_dict(my_Dict):
# Initializing a dictionary that will store all the nested dictionaries
my_new_dict = {}
idx = 0
for key in my_Dict.keys():
# Checking which keys are nested i.e are dictionaries
if isinstance(my_Dict[key], dict):
# Generating ID
idx += 1
# Adding generated ID as another key
my_Dict[key]["__id"] = idx
# Adding nested key with the ID to the new dictionary
my_new_dict[key] = my_Dict[key]
# Replacing nested key value with the generated ID
my_Dict[key] = idx
# Returning new dictionary containing all nested dictionaries with ID
return my_new_dict
result = nested_dict(my_Dict)
print(my_Dict)
# Iterating through dictionary to get all nested dictionaries
for item in result.items():
print(item)
If I understand you correctly, you wish to automatically make each nested dictionary it's own variable, and remove it from the main dictionary.
Finding the nested dictionaries and removing them from the main dictionary is not so difficult. However, automatically assigning them to a variable is not recommended for various reasons. Instead, what I would do is store all these dictionaries in a list, and then assign them manually to a variable.
# Prepare a list to store data in
inidividual_dicts = []
id_index = 1
for key in my_Dict.keys():
# For each key, we get the current value
value = my_Dict[key]
# Determine if the current value is a dictionary. If so, then it's a nested dict
if isinstance(value, dict):
print(key + " is a nested dict")
# Get the nested dictionary, and replace it with the ID
dict_value = my_Dict[key]
my_Dict[key] = id_index
# Add the id to previously nested dictionary
dict_value['__id'] = id_index
id_index = id_index + 1 # increase for next nested dic
inidividual_dicts.append(dict_value) # store it as a new dictionary
# Manually write out variables names, and assign the nested dictionaries to it.
[my_Details_Dict, my_Data_Dict] = inidividual_dicts
In python3 I need to get a JSON response from an API call,
and parse it so I will get a dictionary That only contains the data I need.
The final dictionary I ecxpt to get is as follows:
{'Severity Rules': ('cc55c459-eb1a-11e8-9db4-0669bdfa776e', ['cc637182-eb1a-11e8-9db4-0669bdfa776e']), 'auto_collector': ('57e9a4ec-21f7-4e0e-88da-f0f1fda4c9d1', ['0ab2470a-451e-11eb-8856-06364196e782'])}
the JSON response returns the following output:
{
'RuleGroups': [{
'Id': 'cc55c459-eb1a-11e8-9db4-0669bdfa776e',
'Name': 'Severity Rules',
'Order': 1,
'Enabled': True,
'Rules': [{
'Id': 'cc637182-eb1a-11e8-9db4-0669bdfa776e',
'Name': 'Severity Rule',
'Description': 'Look for default severity text',
'Enabled': False,
'RuleMatchers': None,
'Rule': '\\b(?P<severity>DEBUG|TRACE|INFO|WARN|ERROR|FATAL|EXCEPTION|[I|i]nfo|[W|w]arn|[E|e]rror|[E|e]xception)\\b',
'SourceField': 'text',
'DestinationField': 'text',
'ReplaceNewVal': '',
'Type': 'extract',
'Order': 21520,
'KeepBlockedLogs': False
}],
'Type': 'user'
}, {
'Id': '4f6fa7c6-d60f-49cd-8c3d-02dcdff6e54c',
'Name': 'auto_collector',
'Order': 4,
'Enabled': True,
'Rules': [{
'Id': '2d6bdc1d-4064-11eb-8856-06364196e782',
'Name': 'auto_collector',
'Description': 'DO NOT CHANGE!! Created via API coralogix-blocker tool',
'Enabled': False,
'RuleMatchers': None,
'Rule': 'AUTODISABLED',
'SourceField': 'subsystemName',
'DestinationField': 'subsystemName',
'ReplaceNewVal': '',
'Type': 'block',
'Order': 1,
'KeepBlockedLogs': False
}],
'Type': 'user'
}]
}
I was able to create a dictionary that contains the name and the RuleGroupsID, like that:
response = requests.get(url,headers=headers)
output = response.json()
outputlist=(output["RuleGroups"])
groupRuleName = [li['Name'] for li in outputlist]
groupRuleID = [li['Id'] for li in outputlist]
# Create a dictionary of NAME + ID
ruleDic = {}
for key in groupRuleName:
for value in groupRuleID:
ruleDic[key] = value
groupRuleID.remove(value)
break
Which gave me a simple dictionary:
{'Severity Rules': 'cc55c459-eb1a-11e8-9db4-0669bdfa776e', 'Rewrites': 'ddbaa27e-1747-11e9-9db4-0669bdfa776e', 'Extract': '0cb937b6-2354-d23a-5806-4559b1f1e540', 'auto_collector': '4f6fa7c6-d60f-49cd-8c3d-02dcdff6e54c'}
but when I tried to parse it as nested JSON things just didn't work.
In the end, I managed to create a function that returns this dictionary,
I'm doing it by breaking the JSON into 3 lists by the needed elements (which are Name, Id, and Rules from the first nest), and then create another list from the nested JSON ( which listed everything under Rule) which only create a list from the keyword "Id".
Finally creating a dictionary using a zip command on the lists and dictionaries created earlier.
def get_filtered_rules() -> List[dict]:
groupRuleName = [li['Name'] for li in outputlist]
groupRuleID = [li['Id'] for li in outputlist]
ruleIDList = [li['Rules'] for li in outputlist]
ruleIDListClean = []
ruleClean = []
for sublist in ruleIDList:
try:
lstRule = [item['Rule'] for item in sublist]
ruleClean.append(lstRule)
ruleContent=list(zip(groupRuleName, ruleClean))
ruleContentDictionary = dict(ruleContent)
lstID = [item['Id'] for item in sublist]
ruleIDListClean.append(lstID)
# Create a dictionary of NAME + ID + RuleID
ruleDic = dict(zip(groupRuleName, zip(groupRuleID, ruleIDListClean)))
except Exception as e: print(e)
return ruleDic
I have a list as shown below:
[{'id': 'id_123',
'type': 'type_1',
'created_at': '2020-02-12T17:45:00Z'},
{'id': 'id_124',
'type': 'type_2',
'created_at': '2020-02-12T18:15:00Z'},
{'id': 'id_125',
'type': 'type_1',
'created_at': '2020-02-13T19:43:00Z'},
{'id': 'id_126',
'type': 'type_3',
'created_at': '2020-02-13T07:00:00Z'}]
I am trying to find how many times type : type_1 occurs and what is the earliest created_at timestamp in that list for type_1
We can achieve this in several steps.
To find the number of times type_1 occurs we can use the built-in filter in tandem with itemgetter.
from operator import itemgetter
def my_filter(item):
return item['type'] == 'type_1'
key = itemgetter('created_at')
items = sorted(filter(my_filter, data), key=key)
print(f"Num records is {len(items)}")
print(f"Earliest record is {key(items[0])}")
Num records is 2
Earliest record is 2020-02-12T17:45:00Z
Conversely you can use a generator-comprehension and then sort the generator.
gen = (item for item in data if item['type'] == 'type_1')
items = sorted(gen, key=key)
# rest of the steps are the same...
You could use list comprehension to get all the sublists you're interested in, then sort by 'created_at'.
l = [{'id': 'id_123',
'type': 'type_1',
'created_at': '2020-02-12T17:45:00Z'},
{'id': 'id_124',
'type': 'type_2',
'created_at': '2020-02-12T18:15:00Z'},
{'id': 'id_125',
'type': 'type_1',
'created_at': '2020-02-13T19:43:00Z'},
{'id': 'id_126',
'type': 'type_3',
'created_at': '2020-02-13T07:00:00Z'}]
ll = [x for x in l if x['type'] == 'type_1']
ll.sort(key=lambda k: k['created_at'])
print(len(ll))
print(ll[0]['created_at'])
Output:
2
02/12/2020 17:45:00
This is one approach using filter and min.
Ex:
data = [{'id': 'id_123',
'type': 'type_1',
'created_at': '2020-02-12T17:45:00Z'},
{'id': 'id_124',
'type': 'type_2',
'created_at': '2020-02-12T18:15:00Z'},
{'id': 'id_125',
'type': 'type_1',
'created_at': '2020-02-13T19:43:00Z'},
{'id': 'id_126',
'type': 'type_3',
'created_at': '2020-02-13T07:00:00Z'}]
onlytype_1 = list(filter(lambda x: x['type'] == 'type_1', data))
print(len(onlytype_1))
print(min(onlytype_1, key=lambda x: x['created_at']))
Or:
temp = {}
for i in data:
temp.setdefault(i['type'], []).append(i)
print(len(temp['type_1']))
print(min(temp['type_1'], key=lambda x: x['created_at']))
Output:
2
{'id': 'id_123', 'type': 'type_1', 'created_at': '2020-02-12T17:45:00Z'}
You can just generate a list of all the type_1s using a list_comprehension, and them use sort with datetime.strptime to sort the values accordingly
from datetime import datetime
# Generate a list with only the type_1s' created_at values
type1s = [val['created_at'] for val in vals if val['type']=="type_1"]
# Sort them based on the timestamps
type1s.sort(key=lambda date: datetime.strptime(date, "%Y-%m-%dT%H:%M:%SZ"))
# Print the lowest value
print(type1s[0])
#'2020-02-12T17:45:00Z'
You can use the following function to get the desired output:
from datetime import datetime
def sol(l):
sum_=0
dict_={}
for x in l:
if x['type']=='type_1':
sum_+=1
dict_[x['id']]=datetime.strptime(x['created_at'], "%Y-%m-%dT%H:%M:%SZ")
date =sorted(dict_.values())[0]
for key,value in dict_.items():
if value== date: id_=key
return sum_,date,id_
sol(l)
This function gives the number of times type ='type_1', corresponding minimum date and its id respectively.
Hope this helps!
Simple Python question, but I'm scratching my head over the answer!
I have an array of strings of arbitrary length called path, like this:
path = ['country', 'city', 'items']
I also have a dictionary, data, and a string, unwanted_property. I know that the dictionary is of arbitrary depth and is dictionaries all the way down, with the exception of the items property, which is always an array.
[CLARIFICATION: The point of this question is that I don't know what the contents of path will be. They could be anything. I also don't know what the dictionary will look like. I need to walk down the dictionary as far as the path indicates, and then delete the unwanted properties from there, without knowing in advance what the path looks like, or how long it will be.]
I want to retrieve the parts of the data object (if any) that matches the path, and then delete the unwanted_property from each.
So in the example above, I would like to retrieve:
data['country']['city']['items']
and then delete unwanted_property from each of the items in the array. I want to amend the original data, not a copy. (CLARIFICATION: By this I mean, I'd like to end up with the original dict, just minus the unwanted properties.)
How can I do this in code?
I've got this far:
path = ['country', 'city', 'items']
data = {
'country': {
'city': {
'items': [
{
'name': '114th Street',
'unwanted_property': 'foo',
},
{
'name': '8th Avenue',
'unwanted_property': 'foo',
},
]
}
}
}
for p in path:
if p == 'items':
data = [i for i in data[p]]
else:
data = data[p]
if isinstance(data, list):
for d in data:
del d['unwanted_property']
else:
del data['unwanted_property']
The problem is that this doesn't amend the original data. It also relies on items always being the last string in the path, which may not always be the case.
CLARIFICATION: I mean that I'd like to end up with:
{
'country': {
'city': {
'items': [
{
'name': '114th Street'
},
{
'name': '8th Avenue'
},
]
}
}
}
Whereas what I have available in data is only [{'name': '114th Street'}, {'name': '8th Avenue'}].
I feel like I need something like XPath for the dictionary.
The problem you are overwriting the original data reference. Change your processing code to
temp = data
for p in path:
temp = temp[p]
if isinstance(temp, list):
for d in temp:
del d['unwanted_property']
else:
del temp['unwanted_property']
In this version, you set temp to point to the same object that data was referring to. temp is not a copy, so any changes you make to it will be visible in the original object. Then you step temp along itself, while data remains a reference to the root dictionary. When you find the path you are looking for, any changes made via temp will be visible in data.
I also removed the line data = [i for i in data[p]]. It creates an unnecessary copy of the list that you never need, since you are not modifying the references stored in the list, just the contents of the references.
The fact that path is not pre-determined (besides the fact that items is going to be a list) means that you may end up getting a KeyError in the first loop if the path does not exist in your dictionary. You can handle that gracefully be doing something more like:
try:
temp = data
for p in path:
temp = temp[p]
except KeyError:
print('Path {} not in data'.format(path))
else:
if isinstance(temp, list):
for d in temp:
del d['unwanted_property']
else:
del temp['unwanted_property']
The problem you are facing is that you are re-assigning the data variable to an undesired value. In the body of your for loop you are setting data to the next level down on the tree, for instance given your example data will have the following values (in order), up to when it leaves the for loop:
data == {'country': {'city': {'items': [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]}}}
data == {'city': {'items': [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]}}
data == {'items': [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]}
data == [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]
Then when you delete the items from your dictionaries at the end you are left with data being a list of those dictionaries as you have lost the higher parts of the structure. Thus if you make a backup reference for your data you can get the correct output, for example:
path = ['country', 'city', 'items']
data = {
'country': {
'city': {
'items': [
{
'name': '114th Street',
'unwanted_property': 'foo',
},
{
'name': '8th Avenue',
'unwanted_property': 'foo',
},
]
}
}
}
data_ref = data
for p in path:
if p == 'items':
data = [i for i in data[p]]
else:
data = data[p]
if isinstance(data, list):
for d in data:
del d['unwanted_property']
else:
del data['unwanted_property']
data = data_ref
def delKey(your_dict,path):
if len(path) == 1:
for item in your_dict:
del item[path[0]]
return
delKey( your_dict[path[0]],path[1:])
data
{'country': {'city': {'items': [{'name': '114th Street', 'unwanted_property': 'foo'}, {'name': '8th Avenue', 'unwanted_property': 'foo'}]}}}
path
['country', 'city', 'items', 'unwanted_property']
delKey(data,path)
data
{'country': {'city': {'items': [{'name': '114th Street'}, {'name': '8th Avenue'}]}}}
You need to remove the key unwanted_property.
names_list = []
def remove_key_from_items(data):
for d in data:
if d != 'items':
remove_key_from_items(data[d])
else:
for item in data[d]:
unwanted_prop = item.pop('unwanted_property', None)
names_list.append(item)
This will remove the key. The second parameter None is returned if the key unwanted_property does not exist.
EDIT:
You can use pop even without the second parameter. It will raise KeyError if the key does not exist.
EDIT 2: Updated to recursively go into depth of data dict until it finds the items key, where it pops the unwanted_property as desired and append into the names_list list to get the desired output.
Using operator.itemgetter you can compose a function to return the final key's value.
import operator, functools
def compose(*functions):
'''returns a callable composed of the functions
compose(f, g, h, k) -> f(g(h(k())))
'''
def compose2(f, g):
return lambda x: f(g(x))
return functools.reduce(compose2, functions, lambda x: x)
get_items = compose(*[operator.itemgetter(key) for key in path[::-1]])
Then use it like this:
path = ['country', 'city', 'items']
unwanted_property = 'unwanted_property'
for thing in get_items(data):
del thing[unwanted_property]
Of course if the path contains non-existent keys it will throw a KeyError - you probably should account for that:
path = ['country', 'foo', 'items']
get_items = compose(*[operator.itemgetter(key) for key in path[::-1]])
try:
for thing in get_items(data):
del thing[unwanted_property]
except KeyError as e:
print('missing key:', e)
You can try this:
path = ['country', 'city', 'items']
previous_data = data[path[0]]
previous_key = path[0]
for i in path:
previous_data = previous_data[i]
previous_key = i
if isinstance(previous_data, list):
for c, b in enumerate(previous_data):
if "unwanted_property" in b:
del previous_data[c]["unwanted_property"]
current_dict = {}
previous_data_dict = {}
for i, a in enumerate(path):
if i == 0:
current_dict[a] = data[a]
previous_data_dict = data[a]
else:
if a == previous_key:
current_dict[a] = previous_data
else:
current_dict[a] = previous_data_dict[a]
previous_data_dict = previous_data_dict[a]
data = current_dict
print(data)
Output:
{'country': {'city': {'items': [{'name': '114th Street'}, {'name': '8th Avenue'}]}}, 'items': [{'name': '114th Street'}, {'name': '8th Avenue'}], 'city': {'items': [{'name': '114th Street'}, {'name': '8th Avenue'}]}}