I often have the case where I use two variables, one of them being the "current" value of something, another one a "newly retrieved" one.
After checking them for equality (and taking a relevant action), the new value becomes the current one. This is then repeated in a loop.
import time
import random

def get_new():
    # the new value is retrieved here, from a service or whatever
    vals = [x for x in range(3)]
    return random.choice(vals)

current = None
while True:
    # get a new value
    new = get_new()
    if new != current:
        print('a change!')
    else:
        print('no change :(')
    current = new
    time.sleep(1)
This solution works, but I feel that it is a naïve approach, and I think I remember (from a "write pythonic code" series of talks) that there are better ways.
What is the pythonic way to handle such a mechanism?
Really, all you have is a simple iteration over a sequence, and you want to detect changes from one item to the next. First, define an iterator that provides values from get_new:
# Each element is a return value of get_new(), until it returns None.
# You can choose a different sentinel value as necessary.
sequence = iter(get_new, None)
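As a small illustration of the two-argument form of iter, here is a sketch in which get_new is a hypothetical stand-in that replays canned readings (the real one would call a service). Iteration stops the first time the sentinel None comes back, so the trailing 7 is never reached.

```python
# Hypothetical stand-in for a service: replays canned readings, then None.
_readings = iter([3, 3, 5, None, 7])

def get_new():
    return next(_readings)

# iter(callable, sentinel) keeps calling get_new() until it returns the sentinel.
values = list(iter(get_new, None))
print(values)  # [3, 3, 5]
```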
Then, get two copies of the iterator, one to use as a source for current values, the other for new values.
i1, i2 = itertools.tee(sequence)
Throw out the first value from one of the iterators:
next(i2)
Finally, iterate over the two zipped together. Putting it all together:
import time
from itertools import tee

current_source, new_source = tee(iter(get_new, None))
next(new_source)
for current, new in zip(current_source, new_source):
    if new != current:
        ...
    else:
        ...
    time.sleep(1)
Using toolz.itertoolz.cons, which prepends a value to a sequence:
from toolz.itertoolz import cons

current_source, new_source = tee(iter(get_new, None))
for current, new in zip(cons(None, current_source), new_source):
    ...
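If adding toolz is not an option, the same pairing can probably be sketched with the standard library alone. Here with_previous is a hypothetical helper name, chain([first], ...) plays the role of cons, and a fixed list stands in for the stream of retrieved values.

```python
from itertools import chain, tee

def with_previous(values, first=None):
    """Yield (previous, current) pairs; the first pair uses `first` as previous."""
    previous_source, current_source = tee(values)
    return zip(chain([first], previous_source), current_source)

# Sample data standing in for successive get_new() results.
readings = [1, 1, 2, 2, 2, 0]
changes = [new for old, new in with_previous(readings) if new != old]
print(changes)  # [1, 2, 0]
```

Because zip and chain are lazy, this should also work on an endless source such as iter(get_new, None).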
I am a Python beginner and I am learning with Dataquest.
I want to use a self-defined function in a loop to check, for every item in a list, whether it is a color movie or not, and add the results (True, False) to a list. Right now the function returns only False, and far too many times. Any hints as to what I did wrong?
wonder_woman = ['Wonder Woman','Patty Jenkins','Color',141,'Gal Gadot','English','USA',2017]

def is_usa(input_lst):
    if input_lst[6] == "USA":
        return True
    else:
        return False

def index_equals_str(input_lst, index, input_str):
    if input_lst[index] == input_str:
        return True
    else:
        return False

wonder_woman_in_color = index_equals_str(input_str="Color", index=2, input_lst=wonder_woman)
# End of dataquest challenge

# My own try to use the function in a loop and add the results to a list
f = open("movie_metadata.csv", "r")
data = f.read()
rows = data.split("\n")
aufbereitet = []
for row in rows:
    einmalig = row.split(",")
    aufbereitet.append(einmalig)
# print(aufbereitet)

finale_liste = []
for item in aufbereitet:
    test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
    finale_liste.append(test)
print(finale_liste)
Also at pastebin: https://pastebin.com/AESjdirL
I appreciate your help!
The problem is in this line
test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
The input_lst argument should be input_lst=item. Right now you are passing the whole list of lists to your function every time.
The .csv file is not provided but I assume the reading is correct and it returns a list like the one you provided in the first line of your code; in particular, that you are trying to pack the data in a list of lists (the einmalig variable is a list obtained by the row of the csv file, then you append each einmalig you find in another list, aufbereitet).
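One caveat about that assumption: splitting on "," by hand breaks on quoted fields that contain commas, and splitting on "\n" can leave an empty final row. A minimal sketch with the standard csv module follows; the in-memory sample data and its three columns are made up for illustration and are not the layout of movie_metadata.csv.

```python
import csv
import io

# Made-up sample standing in for the real file; note the quoted comma.
sample = io.StringIO(
    'title,director,color\n'
    '"Movie, The",Jane Doe,Color\n'
    'Other Movie,John Roe,Black and White\n'
)

rows = list(csv.reader(sample))
header, movies = rows[0], rows[1:]

# One boolean per movie row: is column 2 equal to "Color"?
in_color = [row[2] == "Color" for row in movies]
print(in_color)  # [True, False]
```

With a real file, csv.reader would be given the open file object instead of the StringIO.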
The problem is not in the function itself but in the parameters you give as inputs: when you do
test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
you should see that the third parameter is not a list corresponding to the single movie data but the whole list of movies. This means that on every iteration (n times, where n is the length of aufbereitet) the function effectively evaluates:
if aufbereitet[2] == "Color":
    return True
else:
    return False
Even if the movie is in color, aufbereitet[2] is a list (one row of aufbereitet), and comparing a list with a string is always False because they are different types.
To correct the issue, just replace the line
test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
with
test = index_equals_str(input_str="Color", index=2, input_lst=item)
since, when you use the for loop in that way, the variable item changes at each iteration with the elements in aufbereitet.
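The difference can be seen in a short sketch with made-up rows (only the first one is taken from the question): passing the whole list compares a row with a string, while passing item compares the field.

```python
def index_equals_str(input_lst, index, input_str):
    return input_lst[index] == input_str

# Hypothetical sample rows in the same layout as the question's data.
movies = [
    ["Wonder Woman", "Patty Jenkins", "Color", 141],
    ["Hypothetical Film", "Some Director", "Black and White", 90],
    ["Third Film", "Another Director", "Color", 100],
]

# Bug: movies[2] is a whole row (a list), never equal to the string "Color".
wrong = [index_equals_str(movies, 2, "Color") for item in movies]
# Fix: look at index 2 of each individual row.
right = [index_equals_str(item, 2, "Color") for item in movies]

print(wrong)  # [False, False, False]
print(right)  # [True, False, True]
```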
Note that while you are learning it is perfectly fine to use functions, but you can also write the same logic inline (something Python is well known for). Using
finale_liste = [item[2] == "Color" for item in aufbereitet]
you obtain the list without defining a function and without writing an explicit for loop. This is called a list comprehension.
Another thing you can do to make the code more Pythonic - if you want to use the functions anyway - is to do something like
def index_equals_str(input_lst, index, input_str):
    return input_lst[index] == input_str
which gives the same result in fewer lines.
Functional programming is sometimes more readable and adaptable for such tasks:
from functools import partial

def index_equals_str(input_lst, index=1, input_str='Null'):
    return input_lst[index] == input_str

input_data = [['Name1', 'Category1', 'Color', 'Language1'],
              ['Name2', 'Category2', 'BW', 'Language2']]

result = list(map(partial(index_equals_str, input_str='Color', index=2), input_data))
# output
# [True, False]
I have a function as follows:
def control(qstat):
    gatnum = int(input("What number of control gates is this control qubit a part of?"))
    global qstatnum
    qstatnum = {}
    qstatnum[gatnum] = []
    qstatnum[gatnum].append(qstat) #seems to be a problem
    return qstat
However, there is a problem. Let's say I run it once. There will be one item in the list. Then I run it a second time, with an item distinguishable from the first that is supposed to be added to the list. When I print qstatnum[gatnum], the list contains only the second item, leading me to believe that the .append() statement is somehow written incorrectly and is overwriting any previous additions to the list.
Is this a correct diagnosis? Why would this be? Any help would be appreciated. Thanks!
Each time you call the function, you are creating a new qstatnum dict, so the solution is to create the dictionary outside the function:
qstatnum = {}

def control(qstat):
    gatnum = int(input("What number of control gates is this control qubit a part of?"))
    try:
        qstatnum[qstat].append(gatnum)
    except KeyError:
        # first value for this key: start a new list
        qstatnum[qstat] = [gatnum]
    return qstat
The try/except block checks whether the key already exists in the dictionary: if it does not, the first value is stored in a new list; otherwise the value is appended.
@DanD.'s approach is shorter, please take a look:
qstatnum = {}

def control(qstat):
    gatnum = int(input("What number of control gates is this control qubit a part of?"))
    qstatnum.setdefault(qstat, []).append(gatnum)
    return qstat
Every time the method is called, qstatnum is set to empty. So basically you are appending to nothing every time.
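A minimal sketch of the same idea with collections.defaultdict, keeping the question's mapping of gate number to a list of qubit states. As an assumption made only so the example runs without prompting, gatnum is passed as an argument instead of being read with input(), and the string values are placeholders.

```python
from collections import defaultdict

# Created once, at module level, so entries survive between calls.
qstatnum = defaultdict(list)

def control(qstat, gatnum):
    # gatnum is a parameter here (assumption); the original reads it via input().
    qstatnum[gatnum].append(qstat)
    return qstat

control("q0", 1)
control("q1", 1)
control("q2", 2)
print(dict(qstatnum))  # {1: ['q0', 'q1'], 2: ['q2']}
```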
In web2py I have been trying to break down this list comprehension so I can do what I like with the categories it creates. Any ideas as to what this breaks down to?
def menu_rec(items):
    return [(x.title,None,URL('shop', 'category',args=pretty_url(x.id, x.slug)),menu_rec(x.children)) for x in items or []]
In addition the following is what uses it:
response.menu = [(SPAN('Catalog', _class='highlighted'), False, '',
menu_rec(db(db.category).select().as_trees()) )]
So far I've come up with:
def menu_rec(items):
    for x in items:
        return (x.title,None,URL('shop', 'category',args=pretty_url(x.id, x.slug)),menu_rec(x.children))
I've got other variations of this but, every variation only gives me back 1(one) category, when compared to the original that gives me all the categories.
Can anyone see where I'm messing this up at? Any and all help is appreciated, thank you.
A list comprehension builds a list by appending:
def menu_rec(items):
    result = []
    for x in items or []:
        url = URL('shop', 'category', args=pretty_url(x.id, x.slug))
        menu = menu_rec(x.children)  # recursive call
        result.append((x.title, None, url, menu))
    return result
I've added two local variables to break up the long line somewhat, and to show how it recursively calls itself.
Your version returned directly out of the for loop, during the first iteration, and never built up a list.
You don't want to do return. Instead append to a list and then return the list:
def menu_rec(items):
    result = []
    for x in items or []:
        # append takes one argument, so the four fields are wrapped in a tuple
        result.append((x.title, None, URL('shop', 'category', args=pretty_url(x.id, x.slug)), menu_rec(x.children)))
    return result
If you do return, it will return the value after only the first iteration. Instead, keep adding it to a list and then return that list at the end. This will ensure that your result list only gets returned when all the values have been added instead of just return one value.
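To check the equivalence outside web2py, here is a self-contained sketch. Category is a hypothetical stand-in for the database rows, and the tuples are reduced to (title, children) because URL and pretty_url are not available here.

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    # Hypothetical stand-in for a web2py category row.
    title: str
    children: list = field(default_factory=list)

def menu_comprehension(items):
    return [(x.title, menu_comprehension(x.children)) for x in items or []]

def menu_loop(items):
    result = []
    for x in items or []:
        # append inside the loop, return only after the loop has finished
        result.append((x.title, menu_loop(x.children)))
    return result

tree = [
    Category("Books", [Category("Fiction"), Category("Science")]),
    Category("Music"),
]
print(menu_loop(tree))
```

Both functions should produce the same nested list, and `items or []` lets either one accept None for a missing child list.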
I am stumped with this problem, and no matter how I get around it, it is still giving me the same result.
Basically, I have two groups, GrpA_null and GrpB_null, each containing two meshes with exactly the same names: brick_geo and bars_geo.
- Result: GrpA_null --> brick_geo, bars_geo
But when the code below runs (I presume it is the part causing the problem), the program reports that GrpA_null contains duplicates of GrpB_null's children, probably because both groups contain brick_geo and bars_geo. As soon as the code has run, my child meshes have a number appended to their names:
- Result: GrpA_null --> brick_geo0, bars_geo0, GrpB_null1 --> brick_geo, bars_geo1
So I tried to modify the code so that, as long as the parents (GrpA_null and GrpB_null) are different, it does not touch the children.
Could someone kindly advise me on it?
def extractDuplicateBoxList(self, inputs):
    result = {}

    for i in range(0, len(inputs)):
        print '<<< i is : %s' %i
        for n in range(0, len(inputs)):
            print '<<< n is %s' %n
            if i != n:
                name = inputs[i].getShortName()
                # Result: brick_geo
                Lname = inputs[i].getLongName()
                # Result: |GrpA_null|concrete_geo
                if name == inputs[n].getShortName():
                    # If list already created as result.
                    if result.has_key(name):
                        # Make sure its not already in the list and add it.
                        alreadyAdded = False
                        for box in result[name]:
                            if box == inputs[i]:
                                alreadyAdded = True
                        if alreadyAdded == False:
                            result[name].append(inputs[i])
                    # Otherwise create a new list and add it.
                    else:
                        result[name] = []
                        result[name].append(inputs[i])
    return result
There are a couple of things you may want to be aware of. First and foremost, indentation matters in Python. I don't know whether the indentation of your code as posted is what you intended, but the body of a function must be indented further than its def line.
Secondly, I find your question a little difficult to understand. But there are several things which would improve your code.
In the collections module, there is (or should be) a type called defaultdict. This type is similar to a dict, except for it having a default value of the type you specify. So a defaultdict(int) will have a default of 0 when you get a key, even if the key wasn't there before. This allows the implementation of counters, such as to find duplicates without sorting.
from collections import defaultdict

counter = defaultdict(int)
for item in items:
    counter[item] += 1
This brings me to another point. Python for loops implement a for-each structure. You almost never need to enumerate your items in order to then access them. So, instead of
for i in range(0,len(inputs)):
you want to use
for input in inputs:
and if you really need to enumerate your inputs
for i,input in enumerate(inputs):
Finally, you can iterate and filter through iterable objects using list comprehensions, dict comprehensions, or generator expressions. They are very powerful. See Create a dictionary with list comprehension in Python
Try this code out, play with it. See if it works for you.
from collections import defaultdict

def extractDuplicateBoxList(self, inputs):
    counts = defaultdict(int)
    for input in inputs:
        counts[input.getShortName()] += 1
    dup_shns = set([k for k, v in counts.items() if v > 1])
    # keep every input whose short name occurs more than once
    dups = [i for i in inputs if i.getShortName() in dup_shns]
    return dups
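If the result should stay a dictionary keyed by short name, as in the question, one possible sketch groups the inputs first and then drops the names that occur only once. Node is a hypothetical stand-in for the real scene objects; only getShortName and getLongName are taken from the question, and the sample paths are made up. The sketch is written for Python 3.

```python
from collections import defaultdict

class Node:
    """Hypothetical stand-in exposing the two methods used in the question."""
    def __init__(self, long_name):
        self.long_name = long_name

    def getLongName(self):
        return self.long_name

    def getShortName(self):
        return self.long_name.split("|")[-1]

def group_duplicates(inputs):
    groups = defaultdict(list)
    for node in inputs:
        groups[node.getShortName()].append(node)
    # keep only short names shared by more than one node
    return {name: nodes for name, nodes in groups.items() if len(nodes) > 1}

nodes = [
    Node("|GrpA_null|brick_geo"),
    Node("|GrpA_null|bars_geo"),
    Node("|GrpB_null|brick_geo"),
    Node("|GrpB_null|roof_geo"),
]
dups = group_duplicates(nodes)
print(sorted(dups))  # ['brick_geo']
```

This makes a single pass over the inputs instead of comparing every pair.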
I was about to write the same remarks as bitsplit, but he has already done it.
So for the moment I just give you code that I think does exactly the same as yours, based on these remarks and on the dictionary's get method:
from collections import defaultdict

def extract_Duplicate_BoxList(self, inputs):
    result = defaultdict(list)  # a list factory is needed for append below

    for i, A in enumerate(inputs):
        print '<<< i is : %s' % i
        name = A.getShortName()   # Result: brick_geo
        Lname = A.getLongName()   # Result: |GrpA_null|concrete_geo
        for n in (j for j, B in enumerate(inputs)
                  if j != i and B.getShortName() == name):
            print '<<< n is %s' % n
            if A not in result.get(name, []):
                result[name].append(A)
    return result
Secondly, as bitsplit said, I find your question hard to understand.
Could you give more information on the elements of inputs?
Your explanations about GrpA_null and GrpB_null and the names and the meshes are unclear.
EDIT:
If my reduction/simplification is correct, then examining it, I see that what you essentially do is compare elements A and B of inputs (with A != B) and record A in the dictionary result under its short name (only once) if A and B have the same short name.
I think this code can still be reduced to just:
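def extract_Duplicate_BoxList(inputs):
    result = defaultdict(list)
    for i, A in enumerate(inputs):
        print '<<< i is : %s' % i
        # group A under its own short name; names that occur only once
        # also get an entry here, so filter on len(...) > 1 if needed
        result[A.getShortName()].append(A)
    return result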
This may do what you're looking for, if I understand it correctly: it seems you want to compare the sub-hierarchies of different nodes to see whether they have the same names.
import maya.cmds as cmds
def child_nodes(node):
    ''' returns a set with the relative paths of all <node>'s children'''
    root = cmds.ls(node, l=True)[0]
    children = cmds.listRelatives(node, ad=True, f=True)
    return set( [k[len(root):] for k in children])
child_nodes('group1')
# Result: set([u'|pCube1|pCubeShape1', u'|pSphere1', u'|pSphere1|pSphereShape1', u'|pCube1']) #
# note the returns are NOT valid maya paths, since i've removed the root <node>,
# you'd need to add it back in to actually access a real shape here:
all_kids = child_nodes('group1')
real_children = ['group1' + n for n in all_kids ]
Since the returns are sets, you can test to see if they are equal, see if one is a subset or superset of the other, see what they have in common and so on:
# compare children
child_nodes('group1') == child_nodes('group2')
# does group1 contain every child of group2 (is group2 a subset)?
child_nodes('group1').issuperset(child_nodes('group2'))
Iterating over a bunch of nodes is easy:
# collect all the child sets of a bunch of nodes:
kids = dict((k, child_nodes(k)) for k in cmds.ls(*nodes))
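The set logic can be tried outside Maya with plain strings. In this sketch relative_children is a hypothetical helper that strips the root prefix, and the path lists are made-up examples of what a full-path listing might return.

```python
def relative_children(root, full_paths):
    """Strip the root prefix so that different groups can be compared."""
    return {path[len(root):] for path in full_paths}

# Made-up full paths standing in for the output of a scene query.
group1 = relative_children("|group1", [
    "|group1|pCube1",
    "|group1|pCube1|pCubeShape1",
    "|group1|pSphere1",
])
group2 = relative_children("|group2", [
    "|group2|pCube1",
    "|group2|pCube1|pCubeShape1",
])

same = group1 == group2                 # identical sub-hierarchies?
contains = group1.issuperset(group2)    # does group1 hold all of group2's children?
only_in_group1 = group1 - group2        # what is missing from group2
print(same, contains, only_in_group1)
```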
I just parsed a big file and created a list containing 42,000 strings/words. I want to query this list to check whether a given word/string belongs to it. So my question is:
What is the most efficient way for such a lookup?
A first approach is to sort the list (list.sort()) and then just use
>> if word in list: print 'word'
which is really trivial and I am sure there is a better way to do it. My goal is to apply a fast lookup that finds whether a given string is in this list or not. If you have any ideas of another data structure, they are welcome. Yet, I want to avoid for now more sophisticated data-structures like Tries etc. I am interested in hearing ideas (or tricks) about fast lookups or any other python library methods that might do the search faster than the simple in.
I also want to know the index of the matching item.
Don't create a list, create a set. It does lookups in constant time on average.
If you don't want the memory overhead of a set then keep a sorted list and search through it with the bisect module.
from bisect import bisect_left

def bi_contains(lst, item):
    """ efficient `item in lst` for sorted lists """
    # if item is larger than the last it's not in the list, but the bisect would
    # find `len(lst)` as the index to insert, so check that first. Else, if the
    # item is in the list then it has to be at index bisect_left(lst, item)
    return (item <= lst[-1]) and (lst[bisect_left(lst, item)] == item)
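Since the question also asks for the index of the match, the same bisect idea can be sketched so that it returns a position. index_of is a hypothetical name, the list must already be sorted, and -1 is an arbitrary choice for "not found".

```python
from bisect import bisect_left

def index_of(sorted_words, word):
    """Return the index of word in a sorted list, or -1 if it is absent."""
    i = bisect_left(sorted_words, word)
    if i < len(sorted_words) and sorted_words[i] == word:
        return i
    return -1

words = sorted(["pear", "apple", "zebra", "mango"])
print(words)                     # ['apple', 'mango', 'pear', 'zebra']
print(index_of(words, "pear"))   # 2
print(index_of(words, "kiwi"))   # -1
```

Checking `i < len(sorted_words)` first also covers the empty list, which a lookup of the last element would not.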
A point about sets versus lists that hasn't been considered: in "parsing a big file" one would expect to need to handle duplicate words/strings. You haven't mentioned this at all.
Obviously adding new words to a set removes duplicates on the fly, at no additional cost of CPU time or your thinking time. If you try that with a list it ends up O(N**2). If you append everything to a list and remove duplicates at the end, the smartest way of doing that is ... drum roll ... use a set, and the (small) memory advantage of a list is likely to be overwhelmed by the duplicates.
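One way to get duplicate removal, fast membership tests and an index at the same time is a dict that maps each word to the position where it first appeared. This is only a sketch: build_index is a hypothetical helper, and the short list stands in for the parsed file.

```python
def build_index(words):
    """Map each distinct word to the position of its first occurrence."""
    positions = {}
    for i, word in enumerate(words):
        positions.setdefault(word, i)  # later duplicates are ignored
    return positions

parsed = ["apple", "pear", "apple", "zebra", "pear"]
positions = build_index(parsed)

print("zebra" in positions)        # True
print(positions.get("zebra", -1))  # 3
print(positions.get("kiwi", -1))   # -1
```

The dict costs more memory than a sorted list, which is the trade-off discussed above.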
Using this program, it looks like dicts are the fastest, sets second, and a list with bi_contains third:
from datetime import datetime

def ReadWordList():
    """ Loop through each line in english.txt and add it to the list in uppercase.

    Returns:
        Returns array with all the words in english.txt.
    """
    l_words = []
    with open(r'c:\english.txt', 'r') as f_in:
        for line in f_in:
            line = line.strip().upper()
            l_words.append(line)
    return l_words

# Loop through each line in english.txt and add it to the l_words list in uppercase.
l_words = ReadWordList()
# Enable exactly one container conversion (none for the plain list):
l_words = {key: None for key in l_words}
#l_words = set(l_words)
#l_words = tuple(l_words)

t1 = datetime.now()
for i in range(10000):
    w = 'ZEBRA' in l_words
    # for the bi_contains run, keep l_words as a sorted list and use:
    #w = bi_contains(l_words, 'ZEBRA')
t2 = datetime.now()
print('After: ' + str(t2 - t1))

# list = 41.025293 seconds
# dict = 0.001488 seconds
# set = 0.001499 seconds
# tuple = 38.975805 seconds
# list with bi_contains = 0.014000 seconds
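A comparison like this can also be sketched with the standard timeit module, which avoids editing comments between runs. The generated word list below is synthetic (it is not english.txt), the repeat count is arbitrary, and the numbers it prints will differ from machine to machine.

```python
import timeit

# Synthetic data: 42,000 distinct strings standing in for the word list.
words = ["WORD%05d" % i for i in range(42000)]
containers = {"list": words, "set": set(words), "dict": dict.fromkeys(words)}
target = "WORD41999"  # worst case for the list: the last element

timings = {}
for name, container in containers.items():
    # c=container binds the current container to this lambda
    timings[name] = timeit.timeit(lambda c=container: target in c, number=200)

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("%-5s %.6f s" % (name, seconds))
```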