This statement is running quite slowly, and I have run out of ideas to optimize it. Could someone help me out?
[dict(zip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list]
The small_lists contain only about 6 elements.
A really_huge_list_of_list of size 209,510 took approximately 16.5 seconds to finish executing.
Thank you!
Edit:
really_huge_list_of_list is a generator. Apologies for any confusion.
The size is obtained from the result list.
Possible minor improvement:
[dict(itertools.izip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list]
Also, you may want to consider using a generator expression instead of a list comprehension.
To expand on what the comments are trying to say, you should use a generator instead of that list comprehension. Your code currently looks like this:
[dict(zip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list]
and you should change it to this instead:
def my_generator(input_list_of_lists):
    small_list1 = ["wherever", "small_list1", "comes", "from"]
    for small_list2 in input_list_of_lists:
        yield dict(zip(small_list1, small_list2))
What you're doing right now is taking ALL the results of iterating over your really huge list, and building up a huge list of the results, before doing whatever you do with that list of results. Instead, you should turn that list comprehension into a generator so that you never have to build up a list of 200,000 results. It's building that result list that's taking up so much memory and time.
... Or better yet, just turn that list comprehension into a generator comprehension by changing its outer brackets into parentheses:
(dict(zip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list)
That's really all you need to do. The syntax for list comprehensions and generator comprehensions is almost identical, on purpose: if you understand a list comprehension, you'll understand the corresponding generator comprehension. (In this case, I wrote out the generator in "long form" first so that you'd see what that comprehension expands to).
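Here is a minimal sketch of that laziness, assuming Python 3 (where zip is already lazy, so itertools.izip is not needed). The names `keys` and `rows` are made-up stand-ins for small_list1 and really_huge_list_of_list, not the asker's real data:

```python
from itertools import islice

# Hypothetical stand-ins for small_list1 and really_huge_list_of_list.
keys = ["a", "b", "c"]
rows = ([i, i + 1, i + 2] for i in range(1_000_000))

# Generator expression: no dict is built until a value is requested.
records = (dict(zip(keys, row)) for row in rows)

# Only the first two dicts are ever constructed here.
first_two = list(islice(records, 2))
print(first_two)
```

The remaining records are still available from `records` and are only built when something iterates further.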
For more on generator comprehensions, see here, here and/or here.
Hope this helps you add another useful tool to your Python toolbox!
I want to perform calculations on a list and assign this to a second list, but I want to do this in the most efficient way possible as I'll be using a lot of data. What is the best way to do this? My current version uses append:
f = time_series_data
output = []
for i, f in enumerate(time_series_data):
    if f > x:
        output.append(calculate(f))  # calculation with f
        # etc etc
Should I use append, or declare the output list as a list of zeros at the beginning?
Appending the values is not slower than the other possible ways of doing this.
The code looks fine, and creating a list of zeros would not help. It could even cause problems, since you do not know in advance how many values will pass the condition f > x.
Since you wrote "etc etc", I am not sure how many or what kind of operations you need to do there. If possible, try using a list comprehension. That would be a little faster.
You can have a look at the article below, which compares the speed of list creation using three methods: list comprehension, append, and pre-initialization.
https://levelup.gitconnected.com/faster-lists-in-python-4c4287502f0a
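As a rough sketch of what the list comprehension version could look like: the data, the threshold `x`, and the `calculate` function below are all made up for illustration, since the question does not show the real calculation.

```python
# Hypothetical data and threshold; `calculate` stands in for the
# "calculation with f" from the question.
time_series_data = [0.5, 2.0, 3.5, 1.0, 4.0]
x = 1.5


def calculate(f):
    return f * 2


# Loop-and-append version, as in the question.
output_loop = []
for f in time_series_data:
    if f > x:
        output_loop.append(calculate(f))

# Equivalent list comprehension with a filter clause.
output_comp = [calculate(f) for f in time_series_data if f > x]

print(output_comp)
```

Both versions produce the same list; the comprehension only works this neatly if the "etc etc" part can be folded into a single expression.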
data = []
for i in range(int(input())):
    name = input()
    point = float(input())
    data.append([name, point])
How can I convert this code to a comprehension, or is there any other way to reduce the runtime?
For the comprehension, I tried the code below:
data=[[input() float(input())] for i in range(int(input()))]
I don't know whether there is a special way to do list operations and read input inside a comprehension.
As far as I know, the for clause goes after the expression that builds each element, but my version gives a syntax error.
Do the items of your outer list need to be lists themselves? Why not make a list of tuples? Anyway, your code gave you a syntax error because you forgot to include a comma.
data = [(input(), float(input())) for i in range(int(input()))]
I'm also not sure why you are trying to find speed improvements in code that takes user input. No human is ever going to provide input fast enough to notice any performance difference between one solution and another. Do you have another program providing input automatically through stdin or something?
I've just learned of list comprehension but I can't quite get it to work in the right context.
My for loop is:
results and instances are lists
for i in results:
    instances.remove(i)
    results.remove(i)
I tried [i for i in one if one.count(i)<2 if two.count(i)<2] but it doesn't work. I can get it to work on just one of them with this, [i for i in one if one.count(i)<2], but I wanted to incorporate both of them into the same loop. Can someone show me the easiest way to go about this?
Assuming results is a list, you seem to be trying to do this:
for i in results:
    instances.remove(i)
del results[:]
A list comprehension is the wrong thing to use here. You're not trying to create a new list from a sequence.
This loop is similar, but will remove the instances in reverse order:
while results:
    instances.remove(results.pop())
You first need to take results.remove(i) out of the for loop. Put the cleanup after the loop, and empty results only once the for loop has finished.
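To illustrate why removing from results inside the loop misbehaves, here is a small sketch with made-up sample lists (the values are hypothetical). Removing items from a list while iterating over it makes the loop skip elements:

```python
# Hypothetical sample data.
instances = [1, 2, 3, 4, 5, 6]
results = [2, 4, 6]

# Pitfall: mutating the list being iterated over skips elements.
broken = [2, 4, 6]
for i in broken:
    broken.remove(i)
print(broken)  # [4] -- the 4 was skipped and never removed

# Fix: only touch `instances` inside the loop, then clear `results` afterwards.
for i in results:
    instances.remove(i)
del results[:]  # empties the list in place

print(instances)  # [1, 3, 5]
print(results)    # []
```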
I am new to python and was reading through some code for a Sublime Text plugin and came across some code I am not familiar with.
views = [v for v in sublime.active_window().views()]
it is the "[v for v" part that I don't understand. What in the heck is this piece of code doing?
Thanks in advance!
That's a list comprehension. It is equivalent to (but more efficient than):
views = []
for v in sublime.active_window().views():
    views.append(v)
Note that in this case, they should have just used list:
views = list(sublime.active_window().views())
There are other types of comprehensions that were introduced in python2.7:
set comprehension:
{x for x in iterable}
and dict comprehension:
{k:v for k,v in iterable_that_yields_2_tuples}
So, this is an inefficient way to create a dictionary where all the values are 1 (dict.fromkeys(("foo","bar","baz"), 1) does the same job more directly):
{k:1 for k in ("foo","bar","baz")}
Finally, Python also supports generator expressions (they were introduced in Python 2.4 by PEP 289):
(x for x in iterable)
This works like a list comprehension, but it returns a generator: an iterator that produces its values lazily. Generators aren't particularly useful until you actually iterate over them. The advantage is that a generator calculates the values on the fly, rather than storing them in a list which you can then iterate over later. Generators are more memory efficient, but they execute slower than list comprehensions in some circumstances. In others, they outshine list comprehensions because it's easy to say "just give me the first 3 elements, please", whereas with a list comprehension you'd have to calculate all the elements up front, which is sometimes an expensive procedure.
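A small sketch of the "first 3 elements" case, assuming Python 3. The `expensive` function is a made-up stand-in for a costly computation; it records each call so you can see how much work each approach does:

```python
from itertools import islice

calls = []


def expensive(n):
    # Hypothetical stand-in for a costly computation; records each call.
    calls.append(n)
    return n * n


# List comprehension: all ten values are computed immediately.
squares_list = [expensive(n) for n in range(10)]
calls_for_list = len(calls)

calls.clear()

# Generator expression: values are computed only as they are requested.
squares_gen = (expensive(n) for n in range(10))
first_three = list(islice(squares_gen, 3))
calls_for_gen = len(calls)

print(first_three, calls_for_list, calls_for_gen)  # [0, 1, 4] 10 3
```

itertools.islice is used because generators do not support slicing with square brackets.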
This is a list comprehension. It's a bit like an expression with an inline for loop, used to create a quick list on the fly. In this case, it's creating a shallow copy of the list returned by sublime.active_window().views().
List comprehensions really shine when you need to transform each value. For example, here's a quick list comprehension to get the first ten perfect squares:
[x*x for x in range(1,11)]
I have a list called L inside a loop that must iterate through millions of lines. The salient features are:
for line in lines:
    L = ['a', 'list', 'with', 'lots', 'of', 'items']
    L[3] = 'prefix_text_to_item3' + L[3]
    # Do more stuff with L...
Is there a better approach to adding text to a list item that would speed up my code? Can .join be used? Thanks.
In performance-oriented code, repeatedly adding strings together is not a good idea; it is generally preferable to use "".join(_items2join_) instead. (I found some benchmarks here: http://www.skymind.com/~ocrow/python_string/)
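If you want to check this for your own case rather than rely on old benchmarks, a timeit sketch along these lines might help. The two helper functions are hypothetical names made up for this example, and no timings are quoted because they depend on the machine and the Python version:

```python
import timeit

prefix, item = 'prefix_text_to_item3', 'lots'


def with_plus():
    # Plain concatenation of two strings.
    return prefix + item


def with_join():
    # The same result built with str.join.
    return ''.join((prefix, item))


# Both approaches build the same string.
assert with_plus() == with_join()

# Time 100,000 calls of each; results vary between runs and machines.
t_plus = timeit.timeit(with_plus, number=100_000)
t_join = timeit.timeit(with_join, number=100_000)
print(f"+    : {t_plus:.4f} s")
print(f"join : {t_join:.4f} s")
```

Measuring with your real strings and item counts is the only reliable way to tell which variant wins for just two pieces.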
Since accessing an element in a Python list is O(1), and concatenating two strings takes time proportional to their combined length (which is small here), the code you have provided is running about as fast as it can, as far as I can tell. :) You probably can't afford to do this, but when I need to process that much information quickly, I go to C++ or some other compiled language. Things run much quicker. For the time complexity of list operations in Python, you may consult this web site: http://wiki.python.org/moin/TimeComplexity and this question: What is the runtime complexity of python list functions?
Don't actually create list objects.
Use generator functions and generator expressions.
def appender(some_list, some_text):
    for item in some_list:
        yield item + some_text
This appender function does not actually create a new list. It avoids some of the memory management overheads associated with creating a new list.
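A short usage sketch, assuming Python 3; the '_suffix' text is made up for illustration. Note that nothing is computed until the generator is consumed:

```python
def appender(some_list, some_text):
    # Yield each item with some_text appended, one at a time.
    for item in some_list:
        yield item + some_text


L = ['a', 'list', 'with', 'lots', 'of', 'items']

suffixed = appender(L, '_suffix')  # no work has been done yet

first = next(suffixed)       # computes only the first item
rest = ' '.join(suffixed)    # consumes the remaining items

print(first)  # a_suffix
print(rest)   # list_suffix with_suffix lots_suffix of_suffix items_suffix
```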
There may be a better approach depending on what you are doing with list L.
For instance, if you are printing it, something like this may be faster.
print("{0} {1} {2} {3}{4} {5} {6}".format(L[0], L[1], L[2], 'prefix_text_to_item3', L[3], L[4], L[5]))
What happens to L later in the program?