Create python 2d array indexed by string - python

(I'm not too great at python, so forgive me if this is a stupid question)
So I want to create a data structure that represents this -
        word1  word2
word3       1      2
word4       3      4
Right now I've tried doing something like this -
self.table = [][]
but this gives me an invalid syntax error (I guess because I haven't initialized the arrays?). However, even if this worked, I wouldn't be able to use it, because I don't know how large my x and y dimensions are (it seems like I would get an index out of range error).
Should I be using a double dictionary? What should I be using?

Maybe you can try initializing your table with
self.table = {r : { c : 0 for c in ['word1', 'word2']} for r in ['word3', 'word4']}
and then you can access each position by
self.table['word3']['word1'] = 2
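If the row and column words are not known in advance, a nested defaultdict is one way to get the same two-key access without pre-filling the table. This is only a sketch: the name table and the extra key 'word5' are made up for illustration.

```python
from collections import defaultdict

# The outer dict creates an inner defaultdict(int) the first time a row key is used;
# the inner dict returns 0 for any column key that has not been set yet.
table = defaultdict(lambda: defaultdict(int))

table["word3"]["word1"] = 1
table["word3"]["word2"] = 2
table["word4"]["word1"] = 3
table["word4"]["word2"] = 4
table["word5"]["word1"] += 10  # a new row is created on the fly

print(table["word4"]["word2"])
print(table["word5"]["word1"])
```

Note that merely reading a missing key also creates it, which may or may not be what you want.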

Python doesn't have a ready-made instruction to create a bi-dimensional matrix, but it's easy to do that using list comprehensions:
matrix = [[0] * cols for i in range(rows)]
then you can access the matrix with, for example
matrix[row][col] += 42
Assuming rows=10 and cols=20, the row index must go from 0 to 9 and the col index from 0 to 19. You can also use negative indexes to count from the end; for example
matrix[-1][-1] = 99
will set the last cell to 99.
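A short sketch of why the list comprehension matters here: multiplying the outer list instead repeats a reference to the same row, so a write to one row shows up in all of them. The sizes 3 and 4 and the name aliased are arbitrary choices for the example.

```python
rows, cols = 3, 4

# Each row is a separate list object.
matrix = [[0] * cols for _ in range(rows)]
matrix[0][0] = 42

# Pitfall: this repeats a reference to the same row list `rows` times.
aliased = [[0] * cols] * rows
aliased[0][0] = 42

print(matrix)   # only the first row changed
print(aliased)  # every row shows the change
```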

If you are not opposed to using an external library, you might check out pandas DataFrames.

The idea is to create a dictionary that maps the strings to the indices. Below is a small class that overloads the '[ ]' operator:
class ArrayStrIdx:
    """ 2D array with strings as indices
    Usage example::
        AA = ArrayStrIdx()
        AA["word1", "word3"] = 99
        print(AA["word2", "word5"])
    """
    cols = "word1 word2"
    rows = "word3 word4 word5"
    dd = [[10,11,12],[20,21,22]]  # data: one inner list per entry in cols

    def __init__(self):
        """ Constructor builds dicts for indexes"""
        self.ri = dict([(w,i) for i,w in enumerate(self.rows.split())])
        self.ci = dict([(w,i) for i,w in enumerate(self.cols.split())])

    def _parsekey(self, key):
        """ Convert index key to indices for ``self.dd`` """
        w0, w1 = key[0], key[1]
        return self.ci[w0], self.ri[w1]

    def __getitem__(self, key):
        """ overload [] operator - get() method"""
        i0, i1 = self._parsekey(key)
        return self.dd[i0][i1]

    def __setitem__(self, key, value):
        """ overload [] operator - set() method """
        i0, i1 = self._parsekey(key)
        self.dd[i0][i1] = value
Update: Expanded answer to allow something like AA["word1", "word3"] = 23.

Related

Can you store the position of a multidimensional array in a variable?

I am trying to write a Rubik's cube solver and I would like to check the same edge repeatedly. However, writing out the position in the three-dimensional array takes up a lot of space, so I would like to put the positions that I use repeatedly into a variable and use the variable instead of rewriting the position over and over.
This is what my Rubik's cube looks like:
rubixCube = [
[["G","G","Y"], ["R","B","O"], ["R","R","O"]],
[["G","O","O"], ["O","Y","G"], ["R","B","B"]],
[["Y","W","Y"], ["R","O","O"], ["O","W","W"]],
[["B","B","W"], ["B","W","G"], ["G","W","R"]],
[["B","Y","R"], ["Y","R","W"], ["G","Y","B"]],
[["O","G","W"], ["B","G","R"], ["Y","Y","W"]]
]
and this is an example of a position I check repeatedly:
if(rubixCube[0][0][1] == "W"):
Can I write something that looks approximately like this?
position = [0][0][1]
if(rubixCube[position] == "W"):
If you use a numpy array
import numpy as np
rubixCube = np.array([
[["G","G","Y"], ["R","B","O"], ["R","R","O"]],
[["G","O","O"], ["O","Y","G"], ["R","B","B"]],
[["Y","W","Y"], ["R","O","O"], ["O","W","W"]],
[["B","B","W"], ["B","W","G"], ["G","W","R"]],
[["B","Y","R"], ["Y","R","W"], ["G","Y","B"]],
[["O","G","W"], ["B","G","R"], ["Y","Y","W"]]
])
then you can index it with a tuple
pos = (0, 0, 1)
print(rubixCube[pos])
If you are happy to install numpy, then this would be the standard way to handle multidimensional arrays.
If you want to keep within the standard library in order to avoid the need to install additional packages, you could make an indexer class to sit on top of your existing nested list, which will allow you to get and set elements using a tuple (or list) for the position.
class NdIndexer:
    def __init__(self, arr):
        self.arr = arr

    def __getitem__(self, index):
        val = self.arr
        for i in index:
            val = val[i]
        return val

    def __setitem__(self, index, value):
        val = self.arr
        for i in index[:-1]:
            val = val[i]
        val[index[-1]] = value
You can then do:
rubixCube = [
[["G","G","Y"], ["R","B","O"], ["R","R","O"]],
[["G","O","O"], ["O","Y","G"], ["R","B","B"]],
[["Y","W","Y"], ["R","O","O"], ["O","W","W"]],
[["B","B","W"], ["B","W","G"], ["G","W","R"]],
[["B","Y","R"], ["Y","R","W"], ["G","Y","B"]],
[["O","G","W"], ["B","G","R"], ["Y","Y","W"]]
]
cube = NdIndexer(rubixCube)
position = (0,0,1)
# getting
print(rubixCube[0][0][1]) # prints: G
print(cube[position]) # prints: G
# setting
cube[position] = "X"
print(rubixCube[0][0][1]) # prints: X
print(cube[position]) # prints: X
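If you only need to read values, the same "follow one index at a time" loop can be written with functools.reduce and operator.getitem from the standard library. This is a minimal sketch; the helper name get_at and the smaller cube are made up for the example.

```python
from functools import reduce
from operator import getitem

cube = [
    [["G", "G", "Y"], ["R", "B", "O"]],
    [["G", "O", "O"], ["O", "Y", "G"]],
]

def get_at(nested, position):
    """Follow each index in `position` one level deeper into `nested`."""
    return reduce(getitem, position, nested)

position = (1, 1, 2)
print(get_at(cube, position))  # same element as cube[1][1][2]
```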

Apply map on *args in Python where *args are lists

I want to get a list of lists consisting of only 0 and 1 and map the first element of the first list with the first element of the second list and so on.
My mapping function is this:
def intersect(*values):
    result = values[0]
    for idx in range(1, len(values)):
        result = result << 1
        result = result | values[idx]
    return result
I'm trying to do this but it does not work:
def intersect_vectors(*vectors):
    return list(map(intersect, zip(vectors)))
It would work if I knew the number of vectors and had a function like this:
def intersect_vectors(v1, v2, v3):
    return list(map(intersect, v1,v2,v3))
Example:
intersect_vectors([1,1],[1,0],[0,1]) would return [6,5] which is [b110, b101]
You can unpack your vectors with * and it will work the same:
def intersect_vectors(*vectors):
    return list(map(intersect, *vectors))
Another option is to delegate the conversion between a tuple and separate arguments to a lambda. Note that zip also needs the unpacked vectors, otherwise each tuple holds one whole vector:
    return list(map(lambda v: intersect(*v), zip(*vectors)))
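Here is a self-contained sketch of the two approaches side by side, using the example from the question. The name intersect_vectors_zip is made up, and itertools.starmap is swapped in for the lambda, since it does the same per-tuple unpacking.

```python
from itertools import starmap

def intersect(*values):
    result = values[0]
    for value in values[1:]:
        result = (result << 1) | value
    return result

def intersect_vectors(*vectors):
    # map takes one element from each unpacked vector per call to intersect
    return list(map(intersect, *vectors))

def intersect_vectors_zip(*vectors):
    # zip(*vectors) builds the per-position tuples; starmap unpacks each one
    return list(starmap(intersect, zip(*vectors)))

print(intersect_vectors([1, 1], [1, 0], [0, 1]))      # [6, 5]
print(intersect_vectors_zip([1, 1], [1, 0], [0, 1]))  # [6, 5]
```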

Partition list of tuples based on a value within each tuple

I am trying to sort a set of data into 2 separate lists, fulltime and parttime. But it doesn't seem to be working. Can somebody point out where I'm getting this wrong?
data = [(['Andrew'], ['FullTime'], [38]),
        (['Fred'], ['PartTime'], [24]),
        (['Chris'], ['FullTime'], [38])]
def sort(var1, datadump):
    positionlist = []
    for b in range(0, len(datadump)):
        temp2 = datadump[b][1]
        if (temp2 == var1):
            positionlist.append(datadump[b])
    return (positionlist)
FullTimeList = sort("FullTime", data)
PartTimeList = sort("PartTime", data)
print(FullTimeList)
print(PartTimeList)
This is solved by altering
if (temp2 == var1):
to
if (temp2[0] == var1):
This is because the elements within each tuple are lists holding a string, not the strings themselves.
This problem could also be solved using two list comprehensions:
FullTimeList = [x for x in data if x[1][0] == 'FullTime']
PartTimeList = [x for x in data if x[1][0] == 'PartTime']
Not an answer: just a suggestion. Learn how to use the python debugger.
python -m pdb <pythonscript.py>
In this case, set a breakpoint on line 9
b 9
Run the program
c
When it breaks, look at temp2
p temp2
It tells you
['FullTime']
Look at var1
p var1
It tells you
'FullTime'
And there is your problem.
You'll get a better understanding if you name your variables and functions with descriptive names:
data = [(['Andrew'], ['FullTime'], [38]),
        (['Fred'], ['PartTime'], [24]),
        (['Chris'], ['FullTime'], [38])]
def filter_records(value, records):
    result = []
    for i in range(len(records)):  # i and j are usual variable names for indices (b is not)
        record = records[i]
        name, work, hours = record  # give names to the parts
        if work[0] == value:  # work[0] since the values are lists (no need for parentheses)
            result.append(record)
    return result  # no need for parentheses
FullTimeList = filter_records("FullTime", data)
PartTimeList = filter_records("PartTime", data)
The pattern:
for i in range(len(records)):
    record = records[i]
is an anti-pattern in Python, meaning that there is a better way to write it:
for record in records:
    ...
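If you need both lists anyway, a single pass that groups records by their work type avoids scanning the data once per category. This is only a sketch; group_by_work is a made-up name, and it assumes every record has the same three-part shape as in the question.

```python
from collections import defaultdict

data = [(['Andrew'], ['FullTime'], [38]),
        (['Fred'], ['PartTime'], [24]),
        (['Chris'], ['FullTime'], [38])]

def group_by_work(records):
    """Map each work type (e.g. 'FullTime') to the list of matching records."""
    groups = defaultdict(list)
    for record in records:
        name, work, hours = record
        groups[work[0]].append(record)  # work is a one-element list
    return groups

groups = group_by_work(data)
print(groups['FullTime'])
print(groups['PartTime'])
```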

How to count the number of letters in a string with a list of sample?

value = 'bcdjbcdscv'
value = 'bcdvfdvdfvvdfvv'
value = 'bcvfdvdfvcdjbcdscv'
def count_letters(word, char):
    count = 0
    for c in word:
        if char == c:
            count += 1
    return count
How do I count the number of occurrences of a letter in a string, given a list of sample strings? I get no output in my Python shell when I run the above code from my Python file.
There is a built-in method for this:
value.count('c')
Functions need to be called, and their return values need to be printed to stdout:
In [984]: value = 'bcvfdvdfvcdjbcdscv'
In [985]: count_letters(value, 'b')
Out[985]: 2
In [987]: ds=count_letters(value, 'd') #if you assign the return value to some variable, print it out:
In [988]: print(ds)
4
EDIT:
On calculating the length of the string, use python builtin function len:
In [1024]: s='abcdefghij'
In [1025]: len(s)
Out[1025]: 10
It's worth googling keywords like "python get length of a string" before you ask on SO; it saves a lot of time :)
EDIT2:
How to calculate the length of several strings with one function call?
Use a var-positional parameter *args, which accepts an arbitrary number of positional arguments:
In [1048]: def get_lengths(*args):
      ...:     return [len(i) for i in args]
In [1049]: get_lengths('abcd', 'efg', '1234567')
Out[1049]: [4, 3, 7]
First you should probably look at correct indenting and only send in value. Also value is being overwritten so the last one will be the actual reference.
Second you need to call the function that you have defined.
#value = 'bcdjbcdscv'
#value = 'bcdvfdvdfvvdfvv'
value = 'bcvfdvdfvcdjbcdscv'
def count_letters(word, char):
    count = 0
    for c in word:
        if char == c:
            count += 1
    return count
x = count_letters(value, 'b')
print(x)
# 2
This should produce the result you are looking for. You could also just call:
print(value.count('b'))
# 2
In python, there is a built-in method to do this. Simply type:
value = 'bcdjbcdscv'
value.count('c')
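To handle all three sample strings rather than only the last assignment, you could keep them in a list and count per string. The sketch below uses collections.Counter from the standard library; the names samples and letter_counts are made up for the example.

```python
from collections import Counter

samples = ['bcdjbcdscv', 'bcdvfdvdfvvdfvv', 'bcvfdvdfvcdjbcdscv']

def letter_counts(samples, char):
    """Return how often `char` occurs in each sample string."""
    # Counter maps every letter to its count; missing letters count as 0.
    return [Counter(sample)[char] for sample in samples]

print(letter_counts(samples, 'c'))
print(letter_counts(samples, 'z'))
```

For a single letter, [s.count(char) for s in samples] gives the same result; Counter is mainly useful when you want the counts of every letter at once.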

Trouble using max(list)

I'm trying to use the max() function to find the max value in a given list.
I'm creating a list for a given column from a txt file (representing a table; each line has a name and the same number of data columns).
For example: John,M,53,175,8000 (name, gender, age, height, salary)
The problem is, I don't know if the column will contain numbers or strings. If the column contains integers, then it looks like this (for example):
['1','40','5','520','1025']
In that case, max() compares the strings character by character and gives back the wrong value ('520').
Here is the relevant code (everything is within a class).
The first function returns a list for a given column.
The second returns the name/names that have the max value of the given column.
def get_column(self,colname):
    if colname not in self.columns:
        raise ValueError('Colname doesnt exists')
    col_indx = self.columns.index(colname)+1
    col_data = []
    for i in range(len(self.names)):
        col_data.append(self.data[i][col_indx])
    return col_data
def get_row_name_with_max_value(self,colname):
    if colname not in self.columns:
        raise ValueError('Colname doesnt exists')
    col_list = self.get_column(colname)
    max_val = max(col_list)
    counter = col_list.count(max_val)
    max_name = []
    k = -1
    for i in range(counter):
        index = col_list.index(max_val, k+1)
        max_name.append(self.data[index][0])
        k = index
    return ', '.join(max_name)
Thanks a lot!
You can use the key argument of the max() function to specify comparison as ints:
In [1]: l = [1,'2',3,'100']
In [2]: max(l, key = int)
Out[2]: '100'
Probably you want to apply int() to the output as well.
Check whether your list consists only of numeric strings before calling max; if that is the case, use key=int:
def kmax(col):
    key = int if all(x.isdigit() for x in col) else str
    return max(col, key=key)
print(kmax(['1','40','5','520','1025'])) # 1025
print(kmax(['foo','bar','40','baz'])) # foo
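Putting this together with the original goal of returning every name that has the maximum, a rough sketch could look like the following. The names column_max and names_with_max and the sample data are invented for illustration, and str.isdigit() does not recognize negative numbers or decimals, so such columns would fall back to string comparison.

```python
def column_max(col):
    """Return the largest entry, comparing numerically when every entry is a digit string."""
    if all(x.isdigit() for x in col):
        return max(col, key=int)
    return max(col)

def names_with_max(names, col):
    """Return every name whose column entry equals the maximum."""
    best = column_max(col)
    return [name for name, value in zip(names, col) if value == best]

# Hypothetical sample data: one name per row and the matching column values.
names = ['John', 'Anna', 'Mike', 'Dana', 'Omer']
col = ['1', '1025', '5', '520', '1025']
print(', '.join(names_with_max(names, col)))
```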
