How can I make this Python code faster?

How can I make this code faster?
def add_may_go(x, y):
    counter = 0
    for i in range(-2, 3):
        cur_y = y + i
        if cur_y < 0 or cur_y >= board_order:
            continue
        for j in range(-2, 3):
            cur_x = x + j
            if (i == 0 and j == 0) or cur_x < 0 or cur_x >= board_order or [cur_y, cur_x] in huge_may_go:
                continue
            if not public_grid[cur_y][cur_x]:
                huge_may_go.append([cur_y, cur_x])
                counter += 1
    return counter
INPUT:
something like: add_may_go(8,8), add_may_go(8,9) ...
huge_may_go is a huge list like:
[[7,8],[7,9], [8,8],[8,9],[8,10]....]
public_grid is also a huge list; its size is board_order*board_order,
and every element in it is either 0 or 1,
like:
[
[0,1,0,1,0,1,1,...(board_order times), 0, 1],
... board_order times
[1,0,1,1,0,0,1,...(board_order times), 0, 1],
]
board_order is a global variable which is usually 19 (sometimes 15 or 20).
It runs far too slowly now. This function is going to run hundreds of times. Any possible suggestion is welcome!
I have tried numpy, but numpy made it even slower! Please help.

It is difficult to provide a definitive improvement without sample data and a bit more context. Using numpy would be beneficial if you can manage to perform all the calls (i.e. all (x,y) coordinate values) in a single operation. There are also strategies based on sets that could work but you would need to maintain additional data structures in parallel with the public_grid.
Based only on that piece of code, and without changing the rest of the program, there are a couple of things you could do that will provide small performance improvements:
loop only over eligible coordinates rather than skipping invalid ones (outside the board)
compute the cur_x, cur_y values only once (cache them in a dictionary keyed by each (x, y) pair). This assumes that the same x, y coordinates are used in multiple calls to the function.
use comprehensions where possible
use set operations to avoid duplicate coordinates in huge_may_go
hugeCoord = dict()  # keep track of the offset coordinates

def add_may_go(x, y):
    # compute coordinates only once (the first time)
    if (x, y) not in hugeCoord:
        hugeCoord[x, y] = [(cx, cy)
                           for cy in range(max(0, y - 2), min(board_order, y + 3))
                           for cx in range(max(0, x - 2), min(board_order, x + 3))
                           if cx != x or cy != y]
    # get the resulting set of coordinates using a comprehension
    fit = {(cy, cx) for cx, cy in hugeCoord[x, y] if not public_grid[cy][cx]}
    # huge_may_go holds lists, so compare against tuples to avoid duplicates
    fit.difference_update(map(tuple, huge_may_go))
    huge_may_go.extend([cy, cx] for cy, cx in fit)
    return len(fit)
Note that if huge_may_go were a set instead of a list, adding to it without repetitions would be more efficient, because you could update it directly (and return the difference in size).
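For illustration, a minimal sketch of that set-based variant; it assumes the rest of the program is changed so that huge_may_go stores (y, x) tuples, and it relies on the same board_order and public_grid globals as the question:

huge_may_go = set()   # assumption: (y, x) tuples instead of a list of lists
hugeCoord = dict()    # cache of offset coordinates per (x, y)

def add_may_go(x, y):
    if (x, y) not in hugeCoord:
        hugeCoord[x, y] = [(cx, cy)
                           for cy in range(max(0, y - 2), min(board_order, y + 3))
                           for cx in range(max(0, x - 2), min(board_order, x + 3))
                           if cx != x or cy != y]
    before = len(huge_may_go)
    huge_may_go.update((cy, cx)
                       for cx, cy in hugeCoord[x, y]
                       if not public_grid[cy][cx])
    return len(huge_may_go) - before   # how many coordinates were newly added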

if (i == 0 and j == 0)...: continue
A small improvement: reduce the number of iterations by not performing them at all.
for i in (1, 2):
    # do stuff with i and -i
    for j in (1, 2):
        # do stuff with j and -j
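One way to apply the same idea without unrolling into positive/negative pairs is to precompute the 24 non-zero offsets once, so the (0, 0) test disappears from the inner loop. This is only a sketch, reusing the question's board_order, public_grid and huge_may_go globals:

OFFSETS = [(i, j) for i in range(-2, 3) for j in range(-2, 3) if (i, j) != (0, 0)]

def add_may_go(x, y):
    counter = 0
    for i, j in OFFSETS:                       # (0, 0) is already excluded
        cur_y, cur_x = y + i, x + j
        if not (0 <= cur_y < board_order and 0 <= cur_x < board_order):
            continue
        if [cur_y, cur_x] in huge_may_go or public_grid[cur_y][cur_x]:
            continue
        huge_may_go.append([cur_y, cur_x])
        counter += 1
    return counter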

I want to highlight two places which need special attention.
if (...) [cur_y,cur_x] in huge_may_go:
Unlike the rest of the conditions, this is not an arithmetic comparison but a containment check; if huge_may_go is a list, it takes O(n) time, or put simply, time proportional to the number of elements in the list.
huge_may_go.append([cur_y,cur_x])
The Python Wiki describes the .append method of list as O(1), but with the disclaimer that individual actions may take surprisingly long, depending on the history of the container. You might use collections.deque as a replacement for list, since it was designed with the performance of inserts (at either end) in mind.
If huge_may_go must not contain duplicates and you do not care about order, then you might use a set rather than a list and store tuples of (y, x) in it (a set cannot hold lists). When using the .add method of a set you can skip the containment check, as adding an existing element has no effect. Consider that
s = set()
s.add((1, 2))
s.add((3, 4))
s.add((1, 2))
print(s)
gives output
{(1, 2), (3, 4)}
If you then need a containment check, a set containment check is O(1).
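A minimal sketch (not from the answer) illustrating the difference between the two containment checks; the exact timings will vary by machine:

import timeit

coords_list = [[y, x] for y in range(19) for x in range(19)]
coords_set = {(y, x) for y in range(19) for x in range(19)}

print(timeit.timeit(lambda: [18, 18] in coords_list, number=100000))  # O(n) scan of the list
print(timeit.timeit(lambda: (18, 18) in coords_set, number=100000))   # O(1) hash lookup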

Related

For-Loop over python float array

I am working with the IRIS dataset. I have two sets of data, (1 training set) (2 test set). Now I want to calculate the euclidean distance between every test set row and the train set rows. However, I only want to include the first 4 points of the row.
A working example would be:
dist = np.linalg.norm(inner1test[0][0:4]-inner1train[0][0:4])
print(dist)
output: 3.034243
The problem is that I have 120 training set points and 30 test set points, so I would have to do 3600 operations manually; thus I thought about iterating through with a for loop. Unfortunately, every one of my attempts is failing.
This would be my best attempt, which shows the error message
for i in inner1test:
    for number in inner1train:
        dist = np.linalg.norm(inner1test[i][0:4]-inner1train[number][0:4])
        print(dist)
(IndexError: arrays used as indices must be of integer (or boolean) type)
What would be the best solution to iterate through this array?
PS: I will also provide a screenshot for better visualisation.
From what I see, inner1test is a tuple of lists, so the i value will not be an index but the actual list.
You should use enumerate, which returns two variables, the index and the actual data.
for i, value in enumerate(inner1test):
    for j, number in enumerate(inner1train):
        dist = np.linalg.norm(inner1test[i][0:4] - inner1train[j][0:4])
        print(dist)
Also, if your lists start to get bigger, consider using a generator, which will execute your calculations iteration by iteration and return only one value at a time, avoiding returning a big chunk of results that would occupy a lot of memory.
eg:
def my_calculation(inner1test, inner1train):
    for i, value in enumerate(inner1test):
        for j, number in enumerate(inner1train):
            dist = np.linalg.norm(inner1test[i][0:4] - inner1train[j][0:4])
            yield dist

for dist in my_calculation(inner1test, inner1train):
    print(dist)
You might also want to investigate Python list comprehensions, which are sometimes a more elegant way to handle for loops over lists, for example:
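A rough list-comprehension version of the same computation (assuming inner1test and inner1train are the arrays from the question):

import numpy as np

distances = [np.linalg.norm(testvalue[0:4] - trainvalue[0:4])
             for testvalue in inner1test
             for trainvalue in inner1train]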
[EDIT]
Here's a probably easier solution anyway, without the need for indexes, which won't fail to enumerate a numpy object:
for testvalue in inner1test:
    for testtrain in inner1train:
        dist = np.linalg.norm(testvalue[0:4] - testtrain[0:4])
[/EDIT]
This was the final solution with the correct output for me:
distanceslist = list()
for testvalue in inner1test:
    for testtrain in inner1train:
        dist = np.linalg.norm(testvalue[0:4] - testtrain[0:4])
        distances = (dist, testtrain[0:4])
        distanceslist.append(distances)
distanceslist
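For completeness, a hedged sketch that computes all the distances at once with NumPy broadcasting instead of nested loops; the random arrays below are only stand-ins for the real test and training sets:

import numpy as np

inner1test = np.random.rand(30, 5)     # stand-in for the 30-row test set
inner1train = np.random.rand(120, 5)   # stand-in for the 120-row training set

# subtract every training row from every test row, over the first 4 columns only
diffs = inner1test[:, None, :4] - inner1train[None, :, :4]
dists = np.linalg.norm(diffs, axis=2)  # shape (30, 120): one distance per pair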

Alternating Directions in Python List - More Pythonic Solution

I feel this should be simple but I'm stuck on finding a neat solution. The code I have provided works, and gives the output I expect, but I don't feel it is Pythonic and it's getting on my nerves.
I have produced three sets of coordinates, X, Y & Z, using 'griddata' from a base data set. The coordinates are evenly spaced over an unknown total area / shape (not necessarily square / rectangular), producing NaN results at the boundaries of each list, which I want to ignore. The list should be traversed from the 'bottom left' (in a coordinate system), across the x axis, up one space in the y direction, then right to left before continuing. There could be an odd or even number of rows.
The operation to be performed on each point is the same no matter the direction, and it is guaranteed that for every point which exists in X a point exists in Y and Z, as can be seen in the code below.
Arrays (lists?) are of the format DataPoint[rows][columns].
k = 0
for i in range(len(x)):
    if k % 2 == 0:  # cut left to right, then right to left
        for j in range(len(x[i])):
            if not numpy.isnan(x[i][j]):
                file.write(f'X{x[i][j]} Y{y[i][j]} Z{z[i][j]}')
    else:
        for j in reversed(range(len(x[i]))):
            if not numpy.isnan(x[i][j]):
                file.write(f'X{x[i][j]} Y{y[i][j]} Z{z[i][j]}')
    k += 1
One solution I could think of would be to reverse every other row in each of the lists before running the loop. It would save me a few lines, but probably wouldn't make sense from a performance standpoint - anyone have any better suggestions?
Expected route through list:
End════<══════╗
╔══════>══════╝
╚══════<══════╗
Start══>══════╝
Here's a variant:
for i, (x_row, y_row, z_row) in enumerate(zip(x, y, z)):
    if i % 2:
        x_row = reversed(x_row)
        y_row = reversed(y_row)
        z_row = reversed(z_row)
    row_strs = list()
    for x_elem, y_elem, z_elem in zip(x_row, y_row, z_row):
        if not numpy.isnan(x_elem):
            row_strs.append(f"X{x_elem} Y{y_elem} Z{z_elem}")
    file.write("".join(row_strs))
Considerations:
There is no recipe for an optimization that will always perform better than any other. It also depends on the data that the code handles. Here's a list of things I could think of, without knowing what the data looks like:
for index in range(len(sequence)): is not a Pythonic way of iterating. Here, the foreach idiom is used. If the index is required, [Python 3.Docs]: Built-in Functions - enumerate(iterable, start=0) could be used
This no longer applies because of the previous bullet, but reversed(range(n)) is the same as range(n - 1, -1, -1). I don't know whether the latter is faster, but it looks like it would be
Iterate over multiple iterables at once, using [Python 3.Docs]: Built-in Functions - zip(*iterables)
There's no need for k; i is already available
In general when working with files, it's better to read / write bigger chunks of data fewer times than smaller chunks of data many times (files generally reside on disk, and disk operations are slow). However, buffering occurs by default (at the Python and OS levels), so this is less of an issue, but as always it's a trade-off between resources (time, memory, ...). I chose to write to the file once per line (rather than once per element, as it was originally). Of course, there's the third possibility of writing everything at once, but I imagined that for larger data sets it wouldn't be the best solution
Probably, some optimizations could also happen at the NumPy level (as it would handle bulk data much faster than Python code (iterating) does), but I'm not an expert in that area, nor do I know what the data looks like; a rough sketch of that idea follows below
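As a rough illustration of that last bullet, a sketch with made-up sample arrays that flips every other row with NumPy slicing and masks the NaNs before formatting (the real x, y, z come from griddata):

import numpy as np

# made-up sample grids; the real x, y, z come from griddata
x = np.arange(12, dtype=float).reshape(3, 4)
y = x + 100
z = x + 200
x[0, 0] = np.nan   # a boundary NaN as described in the question

lines = []
for i, (xr, yr, zr) in enumerate(zip(x, y, z)):
    if i % 2:                                  # reverse every other row
        xr, yr, zr = xr[::-1], yr[::-1], zr[::-1]
    keep = ~np.isnan(xr)                       # drop boundary NaNs
    lines.extend(f"X{a} Y{b} Z{c}" for a, b, c in zip(xr[keep], yr[keep], zr[keep]))

output = "".join(lines)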
I agree with #Prune, your code looks readable and does what it should do. You could compress it a bit by precomputing the indices, like so (note that this starts from the top left):
import numpy as np

# generate some sample data
x = np.arange(100).reshape(10, 10)

# precompute both directions (as lists, so they can be reused for every row;
# a bare reversed() iterator would be exhausted after the first reversed row)
fancyranges = (
    list(range(len(x[0, :]))),
    list(range(len(x[0, :])))[::-1],
)

for a in range(x.shape[0]):
    # call appropriate direction
    for b in fancyranges[a % 2]:
        # do things
        print(x[a, b])
You can move the repeatable code into a sub_func so that further changes happen in one place:
def func():
    def sub_func():
        # repeatable code
        if not numpy.isnan(x[i][j]):
            print(f'X{x[i][j]}...')

    k = 0
    for i in range(len(x)):
        if k % 2 == 0:  # cut left to right, then right to left
            for j in range(len(x[i])):
                sub_func()
        else:
            for j in reversed(range(len(x[i]))):
                sub_func()
        k += 1

func()

Faster way of centering lists

I'm looking for a better, faster way to center a couple of lists. Right now I have the following:
import random

m = range(2000)
sm = sorted(random.sample(range(100000), 16000))
si = random.sample(range(16005), 16000)

# Centered array.
smm = []

print sm
print si

for i in m:
    if i in sm:
        smm.append(si[sm.index(i)])
    else:
        smm.append(None)

print m
print smm
Which in effect creates a list (m) containing a range of random numbers to center against, another list (sm) from which m is centered against and a list of values (si) to append.
This sample runs fairly quickly, but when I run a larger task with much more variables performance slows to a standstill.
Your main loop contains this infamous line:
if i in sm:
It seems innocuous, but since sm is the result of sorted, it is a list, hence an O(n) lookup, which explains why it's slow with a big dataset.
Moreover, you're using the even more infamous si[sm.index(i)], which makes your algorithm O(n**2).
Since you need the indexes, using a set is not so easy, and there's something better to do:
Since sm is sorted, you could use bisect to find the index in O(log(n)), like this:
import bisect

for i in m:
    j = bisect.bisect_left(sm, i)
    smm.append(si[j] if (j < len(sm) and sm[j] == i) else None)
A small explanation: bisect gives you the insertion point of i in sm. It doesn't mean that the value is actually in the list, so we have to check that (by checking that the returned index is within the existing list range, and that the value at that index is the searched value); if so, append si[j], else append None.
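An alternative sketch, not from the answer: build a value-to-index dictionary once, so each lookup inside the loop is O(1). This assumes the values in sm are unique, as in the sample data:

index_of = {value: idx for idx, value in enumerate(sm)}
smm = [si[index_of[i]] if i in index_of else None for i in m]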

Vectorize iteration over two large numpy arrays in parallel

I have two large arrays of type numpy.core.memmap.memmap, called data and new_data, with > 7 million float32 items.
I need to iterate over them both within the same loop which I'm currently doing like this.
for i in range(0, len(data)):
    if new_data[i] == 0:
        continue
    combo = (data[i], new_data[i])
    if combo not in new_values_map:
        new_values_map[combo] = available_values.pop()
    data[i] = new_values_map[combo]
However this is unreasonably slow, so I gather that using numpy's vectorising functions are the way to go.
Is it possible to vectorize with the index, so that the vectorised array can compare its items to the corresponding items in the other array?
I thought of zipping the two arrays but I guess this would cause unreasonable overhead to prepare?
Is there some other way to optimise this operation?
For context: the goal is to effectively merge the two arrays such that each unique combination of corresponding values between the two arrays is represented by a different value in the resulting array, except zeros in the new_data array which are ignored. The arrays represent 3D bitmap images.
EDIT: available_values is a set of values that have not yet been used in data and persists across calls to this loop. new_values_map on the other hand is reset to an empty dictionary before each time this loop is used.
EDIT2: the data array only contains whole numbers; that is, it's initialised as zeros, then with each usage of this loop with a different new_data it is populated with more values drawn from available_values, which is initially a range of integers. new_data could theoretically be anything.
In answer to your question about vectorising, the answer is probably yes, though you need to clarify what available_values contains and how it's used, as that is the core of the vectorisation.
Your solution will probably look something like this...
indices = new_data != 0
data[indices] = available_values
In this case, if available_values can be considered as a set of values in which we allocate the first value to the first value in data in which new_data is not 0, that should work, as long as available_values is a numpy array.
Let's say new_data and data take values 0-255, then you can construct an available_values array with unique entries for every possible pair of values in new_data and data like the following:
available_data = numpy.arange(256 * 256).reshape((256, 256))  # one entry per possible (data, new_data) pair
indices = new_data != 0
# data and new_data are float32, so cast the values to integers before using them as indices
data[indices] = available_data[data[indices].astype(int), new_data[indices].astype(int)]
Obviously, available_data can be whatever mapping you want. The above should be very quick whatever is in available_data (especially if you only construct available_data once).
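If popping from available_values one element at a time is not essential, here is a hedged sketch of labelling the combinations in one vectorised pass with numpy.unique; the small arrays and value_pool are made up for illustration, standing in for the memmaps and for available_values:

import numpy as np

data = np.array([0, 1, 1, 2, 0], dtype=np.float32)       # stand-in for the data memmap
new_data = np.array([0, 5, 5, 7, 3], dtype=np.float32)   # stand-in for new_data

mask = new_data != 0
combos = np.stack([data[mask], new_data[mask]], axis=1)
# one integer label per unique (data, new_data) pair
_, inverse = np.unique(combos, axis=0, return_inverse=True)
inverse = inverse.ravel()            # guard against shape differences across NumPy versions
# hypothetical pool of unused replacement values, one per unique combination
value_pool = np.arange(1000, 1000 + inverse.max() + 1, dtype=np.float32)
data[mask] = value_pool[inverse]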
Python gives you powerful tools for handling large arrays of data: generators and iterators.
Basically, they allow you to access your data as if it were a regular list, without fetching it all into memory at once, accessing it piece by piece instead.
In the case of accessing two large arrays at once, you can do:
from itertools import izip  # Python 2; in Python 3 the built-in zip is already lazy

for item_a, item_b in izip(data, new_data):
    # ... do your stuff here
izip creates an iterator that iterates over the elements of both arrays together, but it picks pieces as you need them, not all at once.
It seems that replacing the first two lines of the loop to produce:
for i in numpy.where(new_data != 0)[0]:
    combo = (data[i], new_data[i])
    if combo not in new_values_map:
        new_values_map[combo] = available_values.pop()
    data[i] = new_values_map[combo]
has the desired effect.
So most of the time in the loop was spent skipping the loop body upon encountering a zero in new_data. I don't really understand why that many null iterations were so expensive; maybe one day I will...

Efficient Array replacement in Python

I'm wondering what the most efficient way is to replace elements in an array with other random elements from the array, given some criterion. More specifically, I need to replace each element which doesn't meet the criterion with another random value from the same row. For example, I want to replace each outlying element in a row of data with a random cell from data(row) which is between -.8 and .8. My inefficient solution looks something like this:
import numpy as np
import random as r   # the loop below uses r.randint

data = np.random.normal(0, 1, (10, 100))

for index, row in enumerate(data):
    row_copy = np.copy(row)
    outliers = np.logical_or(row > .8, row < -.8)
    for prob in np.where(outliers == 1)[0]:
        fixed = 0
        while fixed == 0:
            random_other_value = r.randint(0, 99)
            if random_other_value in np.where(outliers == 1)[0]:
                fixed = 0
            else:
                row_copy[prob] = row[random_other_value]
                fixed = 1
Obviously, this is not efficient.
I think it would be faster to pull out all the good values, then use random.choice() to pick one whenever you need it. Something like this:
import numpy as np
import random
from itertools import izip

data = np.random.normal(0, 1, (10, 100))

for row in data:
    good_ones = np.logical_and(row >= -0.8, row <= 0.8)
    good = row[good_ones]
    row_copy = np.array([x if f else random.choice(good) for f, x in izip(good_ones, row)])
High-level Python code that you write is slower than the C internals of Python. If you can push work down into the C internals it is usually faster. In other words, try to let Python do the heavy lifting for you rather than writing a lot of code. It's zen... write less code to get faster code.
I added a loop to run your code 1000 times, and to run my code 1000 times, and measured how long they took to execute. According to my test, my code is ten times faster.
Additional explanation of what this code is doing:
row_copy is being set by building a new list, and then calling np.array() on the new list to convert it to a NumPy array object. The new list is being built by a list comprehension.
The new list is made according to the rule: if the number is good, keep it; else, take a random choice from among the good values.
A list comprehension walks over a sequence of values, but to apply this rule we need two values: the number, and the flag saying whether that number is good or not. The easiest and fastest way to make a list comprehension walk along two sequences at once is to use izip() to "zip" the two sequences together. izip() will yield up tuples, one at a time, where the tuple is (f, x); f in this case is the flag saying good or not, and x is the number. (Python has a built-in feature called zip() which does pretty much the same thing, but actually builds a list of tuples; izip() just makes an iterator that yields up tuple values. But you can play with zip() at a Python prompt to learn more about how it works.)
In Python we can unpack a tuple into variable names like so:
a, b = (2, 3)
In this example, we set a to 2 and b to 3. In the list comprehension we unpack the tuples from izip() into variables f and x.
Then the heart of the list comprehension is a "ternary if" statement like so:
a if flag else b
The above will return the value a if the flag value is true, and otherwise return b. The one in this list comprehension is:
x if f else random.choice(good)
This implements our rule.
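In the same spirit of pushing the work into NumPy's C internals, a hedged sketch that replaces all of a row's outliers in one step; np.random.default_rng is an assumption about the NumPy version available, not part of the original answer:

import numpy as np

rng = np.random.default_rng()   # assumption: modern NumPy Generator API
data = np.random.normal(0, 1, (10, 100))
result = data.copy()

for i, row in enumerate(data):
    good_mask = np.abs(row) <= 0.8
    good = row[good_mask]
    n_bad = np.count_nonzero(~good_mask)
    if good.size and n_bad:
        # draw one random good value (with replacement) for every outlier in the row
        result[i, ~good_mask] = rng.choice(good, size=n_bad)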
