Is there a better way to implement arrays in python

So here is my approach:
def transpose(m):
    output = [["null" for i in range(len(m))] for j in range(len(m[0]))]
    for i in range(len(m[0])):
        for j in range(len(m)):
            if i == j:
                output[i][j] = m[i][j]
            else:
                output[i][j] = m[j][i]
    return output
The above method creates an array/list as a placeholder so that new values can be added. I tried this approach because I am new to Python and was previously learning Java, which has built-in arrays; Python doesn't, and I found there was no easy way of indexing 2D lists similar to what we do in Java unless we predefine the list (like in Java, but it took some for loops). I know there are packages which implement arrays, but I am fairly new to the language, so I tried simulating them the way I was familiar with.
So my main question is: is there a better approach to predefine lists of a restricted size (like arrays in Java) without these funky for loops? Or, even better, a way to have a predefined list which I can then easily index without needing to append lists inside lists and all that? It's really difficult for me because it doesn't behave the way I want.
Also I made a helper method for prebuilding lists like this:
def arraybuilder(r, c, jagged=[]):  # builds an empty placeholder 2D array/list of required size
    output = []
    if not jagged:
        output = [["null" for i in range(c)] for j in range(r)]
        return output
    else:
        noOfColumns = []
        for i in range(len(jagged)):
            noOfColumns.append(len(jagged[i]))
        for i in range(len(jagged)):
            row = []
            for j in range(noOfColumns[i]):
                row.append("null")
            output.append(row)
        return output, noOfColumns  # returns noOfColumns as well for iteration purposes

The typical transposition pattern for 2D iterables is zip(*...):
def transpose(m):
    return [*map(list, zip(*m))]
    # same as:
    # return [list(col) for col in zip(*m)]
zip(*m) unpacks the nested lists and zips (interleaves) them into column tuples. Since zip returns a lazy iterator over tuples, we consume it into a list while converting all the tuples into lists as well.
And if you want to be more explicit, there are more concise ways of creating a nested list. Here is a nested comprehension:
def transpose(m):
    return [[row[c] for row in m] for c in range(len(m[0]))]
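For example, both versions turn the rows of a small matrix into columns:

```python
def transpose(m):
    # unpack the rows and zip them into column tuples,
    # then convert each tuple back to a list
    return [*map(list, zip(*m))]

m = [[1, 2, 3],
     [4, 5, 6]]
print(transpose(m))  # [[1, 4], [2, 5], [3, 6]]
```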

Related

How to create generic 2d array in python

In Java you would do it like this: Node[][] nodes;, where Node.java is a custom class. How do I do it in Python, where Node.py is:
class Node(object):
    def __init__(self):
        self.memory = []
        self.temporal_groups = []
I have imported numpy and created an object type
typeObject = numpy.dtype('O') # O stands for python objects
nodes = ???
You can try it this way: inside your Node class, create a function that will return the generic array:
def genArray(a, b):
    return [[0 for y in range(a)] for x in range(b)]
Then you can assign them the way you want. You might change the 0 to your Node object. Let me know if this helps.
You have two easy options: use numpy or declare a nested list. The latter approach is more conceptually similar to Node[][] since it allows for ragged lists, as does Java, but the former approach will probably make processing faster.
numpy arrays
To make an array in numpy:
import numpy as np
x = np.full((m, n), None, dtype=object)
In numpy, you have to have some idea about the size of the array (here m, n) up-front. There are ways to grow an array, but they are not very efficient, especially for large arrays. np.full will initialize your array with a copy of whatever reference you want. You can modify the elements as you wish after that.
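Putting those pieces together, a minimal sketch (the 2×3 shape and the Node fields here are illustrative):

```python
import numpy as np

class Node:
    def __init__(self):
        self.memory = []
        self.temporal_groups = []

# allocate a fixed-shape grid of references, initialized to None
nodes = np.full((2, 3), None, dtype=object)

# fill cells as needed
nodes[0, 1] = Node()
print(nodes.shape)         # (2, 3)
print(nodes[0, 1].memory)  # []
```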
Python lists
To create a ragged list, you do not have to do much:
x = []
This creates an empty list. This is equivalent to Node[][] in Java because that declares a list too. The main difference is that Python lists can change size and are untyped. They are effectively always Object[].
To add more dimensions to the list, just do x[i] = [], which will insert a nested list into your outer list. This is similar to defining something like
Node[][] nodes = new Node[m][];
nodes[i] = new Node[n];
where m is the number of nested lists (rows) and n is the number of elements in each list (columns). Again, the main difference is that once you have a list in Python, you can expand or contract it as you wish, unlike in Java.
Manipulate with x[i][j] as you would in Java. You can add new sublists by doing x.append([]); note that plain assignment x[i] = [] only works for an index i that already exists, since Python lists raise IndexError on out-of-range assignment.
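A short sketch of that nested-list construction, with a placeholder Node class (the names and sizes are illustrative):

```python
class Node:
    def __init__(self):
        self.memory = []

m, n = 3, 2
nodes = []                  # like Node[][] nodes = new Node[m][];
for i in range(m):
    nodes.append([])        # like nodes[i] = new Node[n]; (but growable)
    for j in range(n):
        nodes[i].append(Node())

print(len(nodes), len(nodes[0]))  # 3 2
```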

Python: an efficient way to slice a list with a index list

I would like an efficient, concise way to slice a list of thousands of elements with a list of indices.
example:
b = ["a","b","c","d","e","f","g","h"]
index = [1,3,6,7]
I wish a result like as:
c = ["b","d","g","h"]
The most direct way to do this with lists is to use a list comprehension:
c = [b[i] for i in index]
But, depending on exactly what your data looks like and what else you need to do with it, you could use numpy arrays - in which case:
c = b[index]
would do what you want, and would avoid the potential memory overhead for large slices. numpy arrays are stored more compactly than lists, and basic slicing returns a view into the array rather than a partial copy (though indexing with an index list, as here, does build a new array).
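For example, with the question's data, the two approaches look like this (the numpy conversion is only needed because b starts life as a plain list):

```python
import numpy as np

b = ["a", "b", "c", "d", "e", "f", "g", "h"]
index = [1, 3, 6, 7]

# plain list comprehension
c1 = [b[i] for i in index]

# numpy fancy indexing on an array version of b
c2 = np.array(b)[index].tolist()

print(c1)  # ['b', 'd', 'g', 'h']
```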

Construct a dictionary merging multiple lists

I have a list of objects (clusters) and each object has an attribute vertices which is a list of numbers. I want to construct a dictionary (using a one liner) such that the key is a vertex number and the value is the index of the corresponding cluster in the actual list.
Ex:
clusters[0].vertices = [1,2]
clusters[1].vertices = [3,4]
Expected Output:
{1:0,2:0,3:1,4:1}
I came up with the following:
dict(reduce(lambda x, y: x.extend(y) or x, [
    dict(zip(vertices, [index] * len(vertices))).items()
    for index, vertices in enumerate([i.vertices for i in clusters])]))
It works... but is there a better way of doing this?
Also comment on the efficiency of the above piece of code.
PS: The vertex lists are disjoint.
This is a fairly simple solution, using a nested for:
dict((vert, i) for (i, cl) in enumerate(clusters) for vert in cl.vertices)
This is also more efficient than the version in the question, since it doesn't build lots of intermediate lists while collecting the data for the dict.
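On Python 2.7+ or 3 the same idea reads even more directly as a dict comprehension; here with a minimal stand-in Cluster class, since the question's class isn't shown:

```python
class Cluster:
    def __init__(self, vertices):
        self.vertices = vertices

clusters = [Cluster([1, 2]), Cluster([3, 4])]

# vertex number -> index of its cluster
mapping = {vert: i for i, cl in enumerate(clusters) for vert in cl.vertices}
print(mapping)  # {1: 0, 2: 0, 3: 1, 4: 1}
```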

Efficient Array replacement in Python

I'm wondering what is the most efficient way to replace elements in an array with other random elements from the array given some criteria. More specifically, I need to replace each element which doesn't meet the criteria with a random value from the same row. For example, I want to replace each out-of-range cell in a row with a random cell from that row whose value is between -.8 and .8. My inefficient solution looks something like this:
import numpy as np
import random as r

data = np.random.normal(0, 1, (10, 100))
for index, row in enumerate(data):
    row_copy = np.copy(row)
    outliers = np.logical_or(row > .8, row < -.8)
    for prob in np.where(outliers == 1)[0]:
        fixed = 0
        while fixed == 0:
            random_other_value = r.randint(0, 99)
            if random_other_value in np.where(outliers == 1)[0]:
                fixed = 0
            else:
                row_copy[prob] = row[random_other_value]
                fixed = 1
Obviously, this is not efficient.
I think it would be faster to pull out all the good values, then use random.choice() to pick one whenever you need it. Something like this:
import numpy as np
import random
from itertools import izip

data = np.random.normal(0, 1, (10, 100))

for row in data:
    good_ones = np.logical_and(row >= -0.8, row <= 0.8)
    good = row[good_ones]
    row_copy = np.array([x if f else random.choice(good) for f, x in izip(good_ones, row)])
High-level Python code that you write is slower than the C internals of Python. If you can push work down into the C internals it is usually faster. In other words, try to let Python do the heavy lifting for you rather than writing a lot of code. It's zen... write less code to get faster code.
I added a loop to run your code 1000 times, and to run my code 1000 times, and measured how long they took to execute. According to my test, my code is ten times faster.
Additional explanation of what this code is doing:
row_copy is being set by building a new list, and then calling np.array() on the new list to convert it to a NumPy array object. The new list is being built by a list comprehension.
The new list is made according to the rule: if the number is good, keep it; else, take a random choice from among the good values.
A list comprehension walks over a sequence of values, but to apply this rule we need two values: the number, and the flag saying whether that number is good or not. The easiest and fastest way to make a list comprehension walk along two sequences at once is to use izip() to "zip" the two sequences together. izip() will yield up tuples, one at a time, where the tuple is (f, x); f in this case is the flag saying good or not, and x is the number. (Python has a built-in feature called zip() which does pretty much the same thing, but actually builds a list of tuples; izip() just makes an iterator that yields up tuple values. But you can play with zip() at a Python prompt to learn more about how it works.)
In Python we can unpack a tuple into variable names like so:
a, b = (2, 3)
In this example, we set a to 2 and b to 3. In the list comprehension we unpack the tuples from izip() into variables f and x.
Then the heart of the list comprehension is a "ternary if" statement like so:
a if flag else b
The above will return the value a if the flag value is true, and otherwise return b. The one in this list comprehension is:
x if f else random.choice(good)
This implements our rule.
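Note that izip() comes from Python 2's itertools; in Python 3 the built-in zip() is already lazy, so the same idea can be sketched like this (the toy data values are illustrative):

```python
import numpy as np
import random

# small deterministic sample; values outside [-0.8, 0.8] are "outliers"
data = np.array([[0.5, 1.2, -0.3, -2.0],
                 [2.5, 0.1, 0.7, -1.1]])

cleaned = []
for row in data:
    good_ones = np.logical_and(row >= -0.8, row <= 0.8)
    good = row[good_ones]
    # Python 3: the built-in zip() is already an iterator, no izip needed
    cleaned.append(np.array([x if f else random.choice(good)
                             for f, x in zip(good_ones, row)]))

# every value now lies within [-0.8, 0.8]
print(all(abs(v) <= 0.8 for r in cleaned for v in r))  # True
```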

Accessing elements with offsets in Python's for .. in loops

I've been mucking around a bit with Python, and I've gathered that it's usually better (or 'pythonic') to use
for x in SomeArray:
rather than the more C-style
for i in range(0, len(SomeArray)):
I do see the benefits in this, mainly cleaner code, and the ability to use the nice map() and related functions. However, I am quite often faced with the situation where I would like to simultaneously access elements of varying offsets in the array. For example, I might want to add the current element to the element two steps behind it. Is there a way to do this without resorting to explicit indices?
The way to do this in Python is:
for i, x in enumerate(SomeArray):
    print(i, x)
The enumerate generator produces a sequence of 2-tuples, each containing the array index and the element.
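enumerate() also covers the offset case from the question, since the index is available when you need to reach backwards:

```python
arr = list(range(10))

# add each element to the element two steps behind it
sums = [x + arr[i - 2] for i, x in enumerate(arr) if i >= 2]
print(sums)  # [2, 4, 6, 8, 10, 12, 14, 16]
```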
List indexing and zip() are your friends.
Here's my answer for your more specific question:
I might want to add the current element to the element two steps behind it. Is there a way to do this without resorting to explicit indices?
arr = range(10)
[i + j for i, j in zip(arr[:-2], arr[2:])]
You can also use the module numpy if you intend to work on numerical arrays. For example, the above code can be more elegantly written as:
import numpy
narr = numpy.arange(10)
narr[:-2] + narr[2:]
Adding the nth element to the (n-2)th element is equivalent to adding the mth element to the (m+2)th element (for the mathematically inclined, we performed the substitution n -> m+2). The range of n is [2, len(arr)) and the range of m is [0, len(arr)-2). Note the brackets and parentheses. The elements from 0 to len(arr)-3 (excluding the last two) are indexed as [:-2], while the elements from 2 to len(arr)-1 (excluding the first two) are indexed as [2:].
I assume that you already know list comprehensions.