What will it give me and how to access the elements? - python

Please look at this piece of code :
sig_array=[]
...
for i in range (0, 2):
....
temp=[]
for k in range (0, len (sig)):
#print (k)
temp.append(downsample(sig[k],sampl, new_freq))
sig_array.append(temp)
In other words, tempis a list of arrays (my downsamplefunction, as its name may suggest, return an array) and then the temp will be agregated so it would be a list of lists of arrays !
My questions are : How to deal with that (indexing, ...) and is there simplest way to proceed, by generating list of arrays in a loop but how to keep it in a data structure ?
Thanks

Regarding indexing, you'd just refer to elements like sig_array[0], sig_array[1][2] or sig_array[3][0][2] etc.
Regarding any better data structures, it really just depends on your use case. As #smagnan says in the comments, are you using it for easily accessing data? Matrix processing? If so, have a look at numpy ndarrays. You say that you need it for big data on time series analysis. In that case, using the pandas module will be quite helpful (more info).
Also, as #Bazingaa says, you can make your code less verbose by using list comprehensions (more info):
sig_array = [ [downsample(sig[i],sampl, new_freq) for i in range (len(sig))] for _ in range(2)]
With list comprehensions, it's best to start from the outside, and from the end. The for _ in range(2) will run twice (I've replaced your i with _ as I couldn't see you using it anywhere. If you need it, replace _ with a relevant variable name). In each iteration, it'll append the inner list comprehension to the sig_array. Inside the inner listcomp, the result of the downsample() function will be appended to the temporary list for each iteration of the for loop,
This will have exactly the same output as your code, but is clearly way shorter :)

Related

Fastest way to calculate new list from empty list in python

I want to perform calculations on a list and assign this to a second list, but I want to do this in the most efficient way possible as I'll be using a lot of data. What is the best way to do this? My current version uses append:
f=time_series_data
output=[]
for i, f in enumerate(time_series_data):
if f > x:
output.append(calculation with f)
etc etc
should I use append or declare the output list as a list of zeros at the beginning?
Appending the values is not slower compared to other ways possible to accomplish this.
The code looks fine and creating a list of zeroes would not help any further. Although it can create problems as you might not know how many values will pass the condition f > x.
Since you wrote etc etc I am not sure how long or what operations you need to do there. If possible try using list comprehension. That would be a little faster.
You can have a look at below article which compared the speed for list creation using 3 methods, viz, list comprehension, append, pre-initialization.
https://levelup.gitconnected.com/faster-lists-in-python-4c4287502f0a

Failing to understand simple error message

I've defined a function as below to try to interpolate between two sets of data. When I run it, I get the message:
for i, j in range(0, len(wavelength)):
TypeError: 'int' object is not iterable
I'm not sure what I'm doing wrong. Admittedly, I'm not very good at this.
def accountforfilter(wavelength, flux, filterwavelength, throughput):
filteredwavelength=[]
filteredflux=[]
for i in range(0, len(wavelength)):
if wavelength[i] in filterwavelength[j]:
j=filterwavelength.index(wavelength[i])
filteredwavelength.append(wavelength[i])
filteredflux.append(flux[i]*throughput[j])
elif wavelength[i]<filterwavelength[j]<wavelength[i+1]:
m=((throughput[j+1]-throughput[j])/(filterwavelength[j+1]-filterwavelength[j])
c=throughput[j]-(m*(wavelength[i]))
filteredwavelength.append(wavelength[i])
filteredflux.append(flux[i]*(m*wavelength[i]+c)
return filteredwavelength, filteredflux
range() returns a list of integers. By using for i,j in range() you are telling Python to unpack each item in range() to two values. But since those values are integers, which are a single piece of data, and thus not iterable, you get the error message.
Your code also looks a bit strange.
At first it seems like you want to loop over all combinations of wavelength/filterwavelength, which would be the same as
for i in range(len(wavelength)):
for j in range(len(filterwavelength)):
do_stuff()
but then you are modifying the j parameter inside the loop body, which I don't understand.
Regardless, there is probably a lot easier, and more clear, way to write the code you want. But from the current code it is hard to know what is expected (and should probably go in a separate question).
The problem is that the range only works with one variable like so:
for i in range(0, len(wavelength))
You try to use two variables at once so python tries to unpack an integer which is impossible. You should use the above. If you need two independent indices, use
for i in range(0, len(...))
for j in range(0, len(...))
Btw ranges always start with zero so you can save yourself some typing and use range(len(...)) instead.
You can use zip if you want.
check this: Better way to iterate over two or multiple lists at once
if you have two sets of data.
for i,j in zip(set1,set2):
print i,j

How to spawn nested loops in python

I've tried searching for an answer to this question and read a lot about decorators and global variables, but have not found anything that exactly makes sense with the problem at hand: I want to make every permutation of N-length using A-alphabet, fxn(A,N). I will pass the function 2 arguments: A and N. It will make dummy result of length N. Then, with N nested for loops it will update each index of the result with every element of A starting from the innermost loop. So with fxn(‘01’,4) it will produce
1111, 1110, 1101, 1100, 1011, 1010, 1001, 1000,
0111, 0110, 0101, 0100, 0011, 0010, 0001, 0000
It is straightforward to do this if you know how many nested loops you will need (N; although for more than 4 it starts to get really messy and cumbersome). However, if you want to make all arbitrary-length sequences using A, then you need some way to automate this looping behavior. In particular I will also want this function to act as a generator to prevent having to store all these values in memory, such as with a list. To start it needs to initialize the first loop and keep initializing nested loops with a single value change (the index to update) N-1 times. It will then yield the value of the innermost loop.
The straightforward way to do fxn('01',4) would be:
for i in alphabet:
tempresult[0] = i
for i in alphabet:
tempresult[1] = i
for i in alphabet:
tempresult[2] = i
for i in alphabet:
tempresult[3] = i
yield tempresult
Basically, how can I extend this to an arbitrary length list or string and still get each nest loop to update the appropriate index. I know there is probably a permutation function as part of numpy that will do this, but I haven't been able to come across one. Any advice would be appreciated.
You don't actually want permutations here, but the cartesian product of alphabet*alphabet*alphabet*alphabet. Which you can write as:
itertools.product(alphabet, repeat=4)
Or, if you want to get strings back instead of tuples:
map(''.join, itertools.product(alphabet, repeat=4))
(In 2.x, if you want this to return a lazy iterator instead of a list, as your original code does, use itertools.imap instead of map.)
If you want to do this with numpy, the best way I could think of is to use a recursive function that tiles and repeats for each factor, but this answer has a better implementation, which you can copy from there, or apparently pull out of scikit-learn as sklearn.utils.extmath.cartesian, and then just do this:
cartesian([alphabet]*4)
Of course that gives you a 2D array of single-digit strings; you still need one more step to flatten it to a 1D array of N-digit strings, and numpy will slow you down there more than it speeds you up in the product calculation, so… unless you actually needed a numpy array anyway, I'd stick with itertools here.
You can look to see how the itertools.permutations function works.

How to grow a list to fit a given capacity in Python

I'm a Python newbie. I have a series of objects that need to be inserted at specific indices of a list, but they come out of order, so I can't just append them. How can I grow the list whenever necessary to avoid IndexErrors?
def set(index, item):
if len(nodes) <= index:
# Grow list to index+1
nodes[index] = item
I know you can create a list with an initial capacity via nodes = (index+1) * [None] but what's the usual way to grow it in place? The following doesn't seem efficient:
for _ in xrange(len(nodes), index+1):
nodes.append(None)
In addition, I suppose there's probably a class in the Standard Library that I should be using instead of built-in lists?
This is the best way to of doing it.
>>> lst.extend([None]*additional_size)
oops seems like I misunderstood your question at first. If you are asking how to expand the length of a list so you can insert something at an index larger than the current length of the list, then lst.extend([None]*(new_size - len(lst)) would probably be the way to go, as others have suggested. Of course, if you know in advance what the maximum index you will be needing is, it would make sense to create the list in advance and fill it with Nones.
For reference, I leave the original text: to insert something in the middle of the existing list, the usual way is not to worry about growing the list yourself. List objects come with an insert method that will let you insert an object at any point in the list. So instead of your set function, just use
lst.insert(item, index)
or you could do
lst[index:index] = item
which does the same thing. Python will take care of resizing the list for you.
There is not necessarily any class in the standard library that you should be using instead of list, especially if you need this sort of random-access insertion. However, there are some classes in the collections module which you should be aware of, since they can be useful for other situations (e.g. if you're always appending to one end of the list, and you don't know in advance how many items you need, deque would be appropriate).
Perhaps something like:
lst += [None] * additional_size
(you shouldn't call your list variable list, since it is also the name of the list constructor).

What is the most pythonic way to pop a random element from a list?

Say I have a list x with unkown length from which I want to randomly pop one element so that the list does not contain the element afterwards. What is the most pythonic way to do this?
I can do it using a rather unhandy combincation of pop, random.randint, and len, and would like to see shorter or nicer solutions:
import random
x = [1,2,3,4,5,6]
x.pop(random.randint(0,len(x)-1))
What I am trying to achieve is consecutively pop random elements from a list. (i.e., randomly pop one element and move it to a dictionary, randomly pop another element and move it to another dictionary, ...)
Note that I am using Python 2.6 and did not find any solutions via the search function.
What you seem to be up to doesn't look very Pythonic in the first place. You shouldn't remove stuff from the middle of a list, because lists are implemented as arrays in all Python implementations I know of, so this is an O(n) operation.
If you really need this functionality as part of an algorithm, you should check out a data structure like the blist that supports efficient deletion from the middle.
In pure Python, what you can do if you don't need access to the remaining elements is just shuffle the list first and then iterate over it:
lst = [1,2,3]
random.shuffle(lst)
for x in lst:
# ...
If you really need the remainder (which is a bit of a code smell, IMHO), at least you can pop() from the end of the list now (which is fast!):
while lst:
x = lst.pop()
# do something with the element
In general, you can often express your programs more elegantly if you use a more functional style, instead of mutating state (like you do with the list).
You won't get much better than that, but here is a slight improvement:
x.pop(random.randrange(len(x)))
Documentation on random.randrange():
random.randrange([start], stop[, step])
Return a randomly selected element from range(start, stop, step). This is equivalent to choice(range(start, stop, step)), but doesn’t actually build a range object.
To remove a single element at random index from a list if the order of the rest of list elements doesn't matter:
import random
L = [1,2,3,4,5,6]
i = random.randrange(len(L)) # get random index
L[i], L[-1] = L[-1], L[i] # swap with the last element
x = L.pop() # pop last element O(1)
The swap is used to avoid O(n) behavior on deletion from a middle of a list.
despite many answers suggesting use random.shuffle(x) and x.pop() its very slow on large data. and time required on a list of 10000 elements took about 6 seconds when shuffle is enabled. when shuffle is disabled speed was 0.2s
the fastest method after testing all the given methods above was turned out to be written by #jfs
import random
L = [1,"2",[3],(4),{5:"6"},'etc'] #you can take mixed or pure list
i = random.randrange(len(L)) # get random index
L[i], L[-1] = L[-1], L[i] # swap with the last element
x = L.pop() # pop last element O(1)
in support of my claim here is the time complexity chart from this source
IF there are no duplicates in list,
you can achieve your purpose using sets too. once list made into set duplicates will be removed. remove by value and remove random cost O(1), ie very effecient. this is the cleanest method i could come up with.
L=set([1,2,3,4,5,6...]) #directly input the list to inbuilt function set()
while 1:
r=L.pop()
#do something with r , r is random element of initial list L.
Unlike lists which support A+B option, sets also support A-B (A minus B) along with A+B (A union B)and A.intersection(B,C,D). super useful when you want to perform logical operations on the data.
OPTIONAL
IF you want speed when operations performed on head and tail of list, use python dequeue (double ended queue) in support of my claim here is the image. an image is thousand words.
Here's another alternative: why don't you shuffle the list first, and then start popping elements of it until no more elements remain? like this:
import random
x = [1,2,3,4,5,6]
random.shuffle(x)
while x:
p = x.pop()
# do your stuff with p
I know this is an old question, but just for documentation's sake:
If you (the person googling the same question) are doing what I think you are doing, which is selecting k number of items randomly from a list (where k<=len(yourlist)), but making sure each item is never selected more than one time (=sampling without replacement), you could use random.sample like #j-f-sebastian suggests. But without knowing more about the use case, I don't know if this is what you need.
One way to do it is:
x.remove(random.choice(x))
While not popping from the list, I encountered this question on Google while trying to get X random items from a list without duplicates. Here's what I eventually used:
items = [1, 2, 3, 4, 5]
items_needed = 2
from random import shuffle
shuffle(items)
for item in items[:items_needed]:
print(item)
This may be slightly inefficient as you're shuffling an entire list but only using a small portion of it, but I'm not an optimisation expert so I could be wrong.
This answer comes courtesy of #niklas-b:
"You probably want to use something like pypi.python.org/pypi/blist "
To quote the PYPI page:
...a list-like type with better asymptotic performance and similar
performance on small lists
The blist is a drop-in replacement for the Python list that provides
better performance when modifying large lists. The blist package also
provides sortedlist, sortedset, weaksortedlist, weaksortedset,
sorteddict, and btuple types.
One would assume lowered performance on the random access/random run end, as it is a "copy on write" data structure. This violates many use case assumptions on Python lists, so use it with care.
HOWEVER, if your main use case is to do something weird and unnatural with a list (as in the forced example given by #OP, or my Python 2.6 FIFO queue-with-pass-over issue), then this will fit the bill nicely.

Categories