list comprehension fails but why? - python

Who can explain to me why this list comprehension fails:
provider1 = {'id': 1, 'name': 'Een'}
provider2 = {'id': 2, 'name': 'Twee'}
provider3 = {'id': 3, 'name': 'Drie'}
provider4 = {'id': 4, 'name': 'Vier'}
provider5 = {'id': 5, 'name': 'Vijf'}
provider6 = {'id': 6, 'name': 'Zes'}
provider7 = {'id': 7, 'name': 'Zeven'}
providers = [provider1, provider2, provider3, provider4, provider5, provider6, provider7]
def testfunc(id):
    return next(provider for provider in providers if int(provider['id']) == int(id))
for x in range(0, 8):
    print testfunc(x)
When I run this and 0 is passed to the function, the output is:
Traceback (most recent call last):
  File "/Users/me/Documents/scratchpad/main.py", line 17, in <module>
    print testfunc(x)
  File "/Users/me/Documents/scratchpad/main.py", line 13, in testfunc
    return next(provider for provider in providers if int(provider['id']) == int(id))
StopIteration
Process finished with exit code 1
It does work for a non zero integer.

That's because the next function raises StopIteration when there is no next item. In particular, this occurs when the underlying iterator is empty, which is your case for id == 0.

None of the dictionaries in providers has an id of 0, so the generator has nothing to yield and next raises StopIteration.
Restrict the loop to ids that exist and your code will work:
for x in range(1, 8):
    print(testfunc(x))
Or you could add
provider0 = {'id': 0, 'name': 'Onkar'}
and
providers = [provider0, provider1, provider2, provider3, provider4, provider5, provider6, provider7]
to make
for x in range(0, 8):
    print(testfunc(x))
work.

Yes, because your generator is empty. None of your data matches
if int(provider['id']) == 0
Calling next on an empty generator raises StopIteration.
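If a missing id should not be an error, one option is to pass a default as the second argument to next. The sketch below is only an illustration: the function name find_provider and the shortened providers list are made up for the example, and returning None for a missing id is an assumption about what the caller wants.

```python
# Shortened copy of the question's data, for illustration only.
providers = [
    {'id': 1, 'name': 'Een'},
    {'id': 2, 'name': 'Twee'},
    {'id': 3, 'name': 'Drie'},
]


def find_provider(provider_id, default=None):
    """Return the first provider whose id matches, or `default` if none does."""
    matches = (p for p in providers if int(p['id']) == int(provider_id))
    # The second argument to next() is returned instead of raising StopIteration.
    return next(matches, default)


for x in range(0, 4):
    print(x, find_provider(x))
```

With this version the caller has to check for None (or whatever default was passed) instead of catching StopIteration.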

Related

Losing dict content as soon as I am out of the loop in Python

I need some assistance.
I am losing the dict content as soon as I am out of the loop. The dict is populated with loop variables that are added using subscript assignment.
Below, foo() always gets executed because team_oncall_dict is empty outside the loop. Any hint on how I can retain it as it was inside the loop?
def askduty_oncall(self, *args):
    session = APISession(PD_API_KEY, default_from=PD_USER_EMAIL)
    total = 1 #true or false
    limit = 40
    teamteamnm = "Team Test Team"
    team_esp_name = "Team Test Escalation Policy"
    teamteamid = ""
    teamesplcyid = ""
    team_oncall_dict = {}
    if args:
        offset = args[0]
        total_teams = args[1]
        if offset <= total_teams:
            print("\nfunc with args with new offset {} called\n".format(offset))
            teams = session.get('/teams?limit={0}&total={1}&offset={2}'.format(limit,total,offset))
        else:
            print("Reached max teams, no more team records to pull")
            return
    else:
        print("\nFunc with no args called, hence pull first set of {} teams as defined by limit var\n".format(limit))
        teams = session.get('/teams?limit={0}&total={1}'.format(limit,total))
    if not teams.ok:
        return
    else:
        tj = teams.json()
        tjd = tj['teams']
        for adict in tjd:
            if not adict['name'] == teamteamnm:
                continue
            elif adict['name'] == teamteamnm:
                teamteamid = adict['id']
                print("\nFound team..\nFetched",adict['name'], "id: {0}".format(teamteamid))
                print("Pull escalation policy for team '{}':'{}'".format(teamteamnm,teamteamid))
                esclp = session.get('/escalation_policies?total={0}&team_ids%5B%5D={1}'.format(total,teamteamid))
                if not esclp.ok:
                    print("Pulling Escalation polices for team '{}' failed".format(teamteamnm))
                    return
                else:
                    ep = esclp.json()
                    epj = esclp.json()['escalation_policies']
                    if not epj:
                        print("Escalation polices for team '{}' not defined".format(teamteamnm))
                        return
                    else:
                        for adict2 in epj:
                            if not adict2['summary'] == team_esp_name:
                                continue
                            else:
                                print("***************FOUND FOUND********************")
                                teamesplcyid = adict2['id']
                                print("\nFetched {} id: {}\n".format(team_esp_name, teamesplcyid))
                                oncalls = session.get('/oncalls?total={0}&escalation_policy_ids%5B%5D={1}'.format(total,teamesplcyid))
                                if not oncalls.ok:
                                    print("issue with oncalls")
                                    return
                                else:
                                    ocj = oncalls.json()['oncalls']
                                    for adict3 in ocj:
                                        print("\n")
                                        print(adict3['escalation_level'])
                                        if adict3['escalation_level'] == 1:
                                            print(adict3['schedule']['summary'], adict3['user']['summary'])
                                            team_oncall_dict[adict3['schedule']['summary']] = adict3['user']['summary']
                                            print(team_oncall_dict)
                                            return team_oncall_dict
    if not team_oncall_dict: #part of func def
        foo()
output
foo stuff
sample data is a list of dicts
[{'escalation_policy': {'id': 'P8RKTEE', 'type': 'escalation_policy_reference', 'summary': 'Team Escalation Policy'}, 'escalation_level': 3, 'schedule': None, 'user': {'id': 'PX8XYFT', 'type': 'user_reference', 'summary': 'M1'}, 'start': None, 'end': None},
{'escalation_policy': {'id': 'P8RKTEE', 'type': 'escalation_policy_reference', 'summary': 'Team Escalation Policy'}, 'escalation_level': 1, 'schedule': None, 'user': {'id': 'PKXXVJI', 'type': 'user_reference', 'summary': 'R1'}, 'start': None, 'end': None},
{'escalation_policy': {'id': 'P8RKTEE', 'type': 'escalation_policy_reference', 'summary': 'Team'}, 'escalation_level': 2, 'schedule': None, 'user': {'id': 'PN8F9PC', 'type': 'user_reference', 'summary': 'T1'}, 'start': None, 'end': None}]
By the way, the loop above is the 4th inner loop.
So, diagrammatically, the flow is like this:
def func1()
    team_oncall_dict = {}
    loop1
        loop2
            loop3
                loop4
                    ...
                    team_oncall_dict
    if not team_oncall_dict:
        print("dict is empty")
output
dict is empty
It was a local vs. global issue. I fixed it by declaring team_oncall_dict globally, outside the function.
Instead of
def func1():
    team_oncall_dict = {}
I now have
team_oncall_dict = {}
def func1():
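A possible alternative to a module-level dict is to build the mapping in a local variable and hand it back with a single return after the loop, so the caller decides what to do when it is empty. This is only a sketch under assumptions: collect_oncalls is a hypothetical helper, the sample entries are trimmed from the question's data, and the 'no schedule' fallback label is invented because schedule is None in that sample.

```python
def collect_oncalls(oncalls):
    """Build the schedule -> user mapping locally and return it to the caller."""
    team_oncall = {}
    for entry in oncalls:
        if entry['escalation_level'] != 1:
            continue
        # `schedule` is None in the sample data, so fall back to a label (assumed).
        schedule = entry['schedule']['summary'] if entry['schedule'] else 'no schedule'
        team_oncall[schedule] = entry['user']['summary']
    # A single return after the loop, so every level-1 entry is kept.
    return team_oncall


# Trimmed version of the sample data from the question.
sample = [
    {'escalation_level': 3, 'schedule': None, 'user': {'summary': 'M1'}},
    {'escalation_level': 1, 'schedule': None, 'user': {'summary': 'R1'}},
    {'escalation_level': 2, 'schedule': None, 'user': {'summary': 'T1'}},
]

result = collect_oncalls(sample)
if not result:
    print("dict is empty")
else:
    print(result)
```

The caller works with the returned value, so no state has to live outside the function.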

Parallelizing list filtering

I have a list of items that I need to filter based on some conditions. I'm wondering whether Dask could do this filtering in parallel, as the list is very long (a few dozen million records).
Basically, what I need to do is this:
items = [
    {'type': 'dog', 'weight': 10},
    {'type': 'dog', 'weight': 20},
    {'type': 'cat', 'weight': 15},
    {'type': 'dog', 'weight': 30},
]
def item_is_valid(item):
    item_is_valid = True
    if item['type']=='cat':
        item_is_valid = False
    elif item['weight']>20:
        item_is_valid = False
    # ...
    # elif for n conditions
    return item_is_valid
items_filtered = [item for item in items if item_is_valid(item)]
With Dask, this is what I have managed so far:
import dask
def item_is_valid_v2(item):
    """Return the whole item if valid."""
    item_is_valid = True
    if item['type']=='cat':
        item_is_valid = False
    elif item['weight']>20:
        item_is_valid = False
    # ...
    # elif for n conditions
    if item_is_valid:
        return item
results = []
item = []
for item in items:
    delayed = dask.delayed(item_is_valid_v2)(item)
    results.append(delayed)
results = dask.compute(*results)
However, the result I get contains a few None values, which then need to be filtered out somehow in a non-parallel way.
({'type': 'dog', 'weight': 10}, {'type': 'dog', 'weight': 20}, None, None)
Perhaps the bag API will work for you. This is rough pseudo-code:
import dask.bag as db
bag = db.from_sequence(items) # or better yet read it from disk
result = bag.filter(item_is_valid) # note this uses the first version (bool)
To check that this is working, inspect the outcome of result.take(5), and if that is satisfactory:
computed_result = result.compute()
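The None values in the question come from mapping a function that falls off the end without a return, whereas a filter step only needs a boolean predicate and drops rejected items itself. The plain-Python sketch below (no Dask involved, using the built-in filter as a stand-in for a filter step) illustrates the difference; item_or_none is a hypothetical name for the "return the item if valid" variant.

```python
items = [
    {'type': 'dog', 'weight': 10},
    {'type': 'dog', 'weight': 20},
    {'type': 'cat', 'weight': 15},
    {'type': 'dog', 'weight': 30},
]


def item_is_valid(item):
    """Boolean predicate: True when the item passes every condition."""
    if item['type'] == 'cat':
        return False
    if item['weight'] > 20:
        return False
    return True


def item_or_none(item):
    """Mapping-style variant: implicitly returns None for rejected items."""
    if item_is_valid(item):
        return item


# Mapping keeps one result per input, so rejected items show up as None.
mapped = [item_or_none(item) for item in items]
# Filtering keeps only the items for which the predicate is True.
filtered = list(filter(item_is_valid, items))

print(mapped)
print(filtered)
```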

How to extract value from json and increment

Sample JSON is below. I want to count the ids by completed status (False and True) in separate dictionaries.
todos = [{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False},
{'userId': 1, 'id': 2, 'title': 'quis ut nam facil ', 'completed': False},
{'userId': 1, 'id': 1, 'title': 'fugiat veniam minus', 'completed': False},
{'userId': 1, 'id': 2, 'title': 'et porro tempora', 'completed': True},
{'userId': 1, 'id': 1,'title': 'laprovident illum', 'completed': False}]
Expected output is below
todos_by_user_true = {1:0,2:1}
todos_by_user_false = {1:3,2:1}
My code is below. Why is it not working? I expected the dictionaries above.
todos_by_user_true = {}
todos_by_user_false = {}
# Increment complete TODOs count for each user.
for todo in todos:
    if todo["completed"]==True:
        try:
            # Increment the existing user's count.
            todos_by_user_true[todo["id"]] += 1
        except KeyError:
            # This user has not been seen. Set their count to 1.
            todos_by_user_true[todo["id"]] = 0
    elif todo["completed"]==False:
        try:
            # Increment the existing user's count.
            todos_by_user_false[todo["id"]] += 1
        except KeyError:
            # This user has not been seen. Set their count to 1.
            todos_by_user_false[todo["id"]] = 0
I am not getting the proper dictionaries.
My output is below
todos_by_user_false {1: 2, 2: 0}
todos_by_user_true {2: 0}
Note: I also need to handle the exception.
Looking at your input data:
userId 1, id 1 has 0 true, and 3 false
userId 1, id 2 has 1 true, and 1 false
Given the required output, it looks like you really want to use id rather than userId in your lookups. Besides that, there's an issue with accounting the first time you insert the id in the resulting dictionary. I would fix it like this:
todos_by_user_true = {}
todos_by_user_false = {}
# Increment complete TODOs count for each user.
for todo in todos:
    if todo["completed"]==True:
        try:
            # Increment the existing user's count.
            todos_by_user_true[todo["id"]] += 1
        except KeyError:
            # This user has not been seen. Set their count to 1.
            todos_by_user_true[todo["id"]] = 1
    elif todo["completed"]==False:
        try:
            # Increment the existing user's count.
            todos_by_user_false[todo["id"]] += 1
        except KeyError:
            # This user has not been seen. Set their count to 1.
            todos_by_user_false[todo["id"]] = 1
which (btw) is already what's in your comments.
Personally, I would check the dictionary for the key before insertion, instead of using try..except, like this:
todos_by_user_true = {}
todos_by_user_false = {}
# Increment complete TODOs count for each user.
for todo in todos:
    key = todo["id"]
    if todo["completed"]: # true case
        # If `id` not there yet, initialize it to 0
        if key not in todos_by_user_true:
            todos_by_user_true[key] = 0
        # increment
        todos_by_user_true[key] += 1
    else: # false case
        # If `id` not there yet, initialize it to 0
        if key not in todos_by_user_false:
            todos_by_user_false[key] = 0
        # increment
        todos_by_user_false[key] += 1
This gives:
todos_by_user_true = {2:1}
todos_by_user_false = {1:3,2:1}
The logic is this: you cannot get
todos_by_user_true = {1:0}
because you only account for a value when you find it, rather than iterating over the ids from a separate list.
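For comparison, here is a minimal sketch of the same counting with collections.Counter, which returns 0 for keys it has never seen. The last step, which fills in zero counts by iterating over every id that appears in the data, is an assumption about how the question's expected {1:0, 2:1} could be produced; the names all_ids, true_counts and false_counts are made up for the example.

```python
from collections import Counter

todos = [
    {'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False},
    {'userId': 1, 'id': 2, 'title': 'quis ut nam facil ', 'completed': False},
    {'userId': 1, 'id': 1, 'title': 'fugiat veniam minus', 'completed': False},
    {'userId': 1, 'id': 2, 'title': 'et porro tempora', 'completed': True},
    {'userId': 1, 'id': 1, 'title': 'laprovident illum', 'completed': False},
]

# Count ids separately for completed and not-completed todos.
true_counts = Counter(t['id'] for t in todos if t['completed'])
false_counts = Counter(t['id'] for t in todos if not t['completed'])

# Only ids that were actually seen appear as keys.
print(dict(true_counts))
print(dict(false_counts))

# To include ids with a zero count, iterate over every id in the data.
all_ids = sorted({t['id'] for t in todos})
todos_by_user_true = {i: true_counts[i] for i in all_ids}
todos_by_user_false = {i: false_counts[i] for i in all_ids}
print(todos_by_user_true)
print(todos_by_user_false)
```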

Appending new items to a Python list with logic operators (Time values)

I'm starting out on Python on a job (I'm used to R) where I have to get daily data from an API that returns the datetime and value (which is a certain number of listeners on a podcast) and then send that data to a bigquery database.
After I split up the date and time, I need to add a new column that indicates which program was playing in that moment. In other words:
if the time is >= 11:00 and <= 11:59, then add a 'program name' value to that row's 'program' column.
I've run into several problems, namely the fact that the time has been split out as strings (this could be because we use Google Data Studio, which has an extremely rigid datetime implementation).
How would you go about it?
if response.status_code == 200:
    data = response.text
    result = json.loads(data)
    test = result
    #Append Items
    for k in test:
        l = []
        l.append(datetime.datetime.strptime(k["time"], "%Y-%m-%dT%H:%M:%S.%fZ").strftime("%Y-%m-%d"))
        l.append(datetime.datetime.strptime(k["time"], "%Y-%m-%dT%H:%M:%S.%fZ").astimezone(pytz.timezone("America/Toronto")).strftime("%H:%M"))
        l.append(k["value"])
You need to have a 'DB' of the program timetable. See below.
Your loop will call the function below with the time value and get back the program name.
import datetime
from collections import namedtuple
Program = namedtuple('Program', 'name start end')
PROGRAMS_DB = [Program('prog1', datetime.time(3, 0, 0), datetime.time(3, 59, 0)),
               Program('prog2', datetime.time(18, 0, 0), datetime.time(18, 59, 0)),
               Program('prog3', datetime.time(4, 0, 0), datetime.time(4, 59, 0))]
def get_program_name(time_val):
    for program in PROGRAMS_DB:
        if program.start <= time_val <= program.end:
            return program.name
data_from_the_web = [{"time": "2019-02-19T18:10:00.000Z", "value": 413, "details": None},
                     {"time": "2019-02-19T15:12:00.000Z", "value": 213, "details": None}]
for entry in data_from_the_web:
    t = datetime.datetime.strptime(entry["time"], "%Y-%m-%dT%H:%M:%S.%fZ").time()
    entry['prog'] = get_program_name(t)
for entry in data_from_the_web:
    print(entry)
Output
{'prog': 'prog2', 'details': None, 'value': 413, 'time': '2019-02-19T18:10:00.000Z'}
{'prog': None, 'details': None, 'value': 213, 'time': '2019-02-19T15:12:00.000Z'}
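Since the question mentions that the time ends up as an "HH:MM" string, the same lookup can also be sketched on strings by parsing them back into datetime.time values first. Everything named here is hypothetical: the TIMETABLE entries, the program names, and the row layout [date, time, value] are assumptions modeled on the question's list l.

```python
import datetime

# Hypothetical timetable: (program name, start, end), with the end inclusive.
TIMETABLE = [
    ("Morning Show", datetime.time(11, 0), datetime.time(11, 59)),
    ("Evening Show", datetime.time(18, 0), datetime.time(18, 59)),
]


def program_for(time_str):
    """Map an 'HH:MM' string to a program name, or None if nothing is scheduled."""
    t = datetime.datetime.strptime(time_str, "%H:%M").time()
    for name, start, end in TIMETABLE:
        if start <= t <= end:
            return name
    return None


# Rows shaped like the question's list: [date, time, value].
rows = [["2019-02-19", "11:30", 413], ["2019-02-19", "15:12", 213]]
for row in rows:
    # Append the program name as a fourth column.
    row.append(program_for(row[1]))

print(rows)
```

Comparing datetime.time objects avoids relying on string ordering, which only works when every time is zero-padded.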

Python Dictionary and Tuples Addition

Need a bit of help on this problem:
I have the following code:
def insertIntoDataStruct(state,job,count,dict):
    if not state in dict:
        print "adding"
        dict[state] = [(job,count)]
    else:
        for x in range(0, len(dict[state])):
            if(dict[state][x][0] == job):
                print "hi"
                print dict[state][x][0]
                print job
                print state
                print dict[state][x][1]
                dict[state][x][1] = dict[state][x][1] + 1
            else:
                dict[state].append((job,count))
courses = {}
insertIntoDataStruct("CA", "2121", (1), courses)
insertIntoDataStruct("CA", "169521", 1, courses)
insertIntoDataStruct("CA", "2121", 1, courses)
insertIntoDataStruct("TX", "2121", 1, courses)
insertIntoDataStruct("TX", "169521", 1, courses)
insertIntoDataStruct("TX", "262420", 1, courses)
print courses
and I am getting this error:
adding
hi
2121
2121
CA
1
  File "test2.py", line 21, in <module>
    insertIntoDataStruct("CA", "2121", 1, courses)
  File "test2.py", line 13, in insertIntoDataStruct
    dict[state][x][1] = dict[state][x][1] + 1
TypeError: 'tuple' object does not support item assignment
Process finished with exit code 1
How can I go about fixing this TypeError: 'tuple' object does not support item assignment?
The ideal output of this code should be:
{
'CA': [('2121', 2), ('169521', 1), ('2122', 1)],
'TX': [('2121', 1), ('169521', 1), ('262420', 1)]
}
Thanks for all help!
Tuples are immutable. Therefore, when you try to run the line dict[state][x][1] = dict[state][x][1] + 1, where the left side indexes into the tuple ('2121', 1), you get the "'tuple' object does not support item assignment" error.
dict[state][x][1] = dict[state][x][1] + 1
You cannot do item assignment here.
My understanding is that you want to increment the job ID and add the course to the dictionary with the new job. What you can do is:
new_job = str(int(dict[state][x][1])+1)
dict[state].append((new_job, count))
Here I have incremented the job and then appended it to the dictionary.
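If the goal is instead to increment the count stored next to a job, one way around the immutability is to replace the whole tuple with a new one. The sketch below is an illustration under assumptions, not the asker's final code: the function is renamed insert_into_data_struct, the parameter dict is renamed data to avoid shadowing the built-in, and the total grows by count rather than by a fixed 1.

```python
def insert_into_data_struct(state, job, count, data):
    """Add `count` to the (job, total) pair for `state`, creating it if needed."""
    entries = data.setdefault(state, [])
    for index, (existing_job, existing_count) in enumerate(entries):
        if existing_job == job:
            # Tuples are immutable, so build a new tuple and replace the old one.
            entries[index] = (existing_job, existing_count + count)
            return
    # Append only after the whole list was checked and no match was found.
    entries.append((job, count))


courses = {}
insert_into_data_struct("CA", "2121", 1, courses)
insert_into_data_struct("CA", "169521", 1, courses)
insert_into_data_struct("CA", "2121", 1, courses)
insert_into_data_struct("TX", "2121", 1, courses)
insert_into_data_struct("TX", "169521", 1, courses)
insert_into_data_struct("TX", "262420", 1, courses)
print(courses)
```

Moving the append out of the loop also avoids adding a duplicate entry for every non-matching job that is checked before a match is found.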
