Using variable names for 2d matrix elements for readability

Using variable names for 2d matrix elements for readability - python

While solving Leetcode problems I've been trying to make my answers as easily intelligible as possible, so I can quickly glance at them later and make sense of them. Toward that end I assigned variable names to indices of interest in a 2D list. When I see "matrix[i][j+1]" and variations thereof repeatedly, I sometimes lose track of what I'm dealing with.
So, for this problem: https://leetcode.com/problems/maximal-square/
I wrote this code:
class Solution:
def maximalSquare(self, matrix: List[List[str]]) -> int:
maximum = 0
for y in range(len(matrix)):
for x in range(len(matrix[0])):
#convert to integer from string
matrix[y][x] = int(matrix[y][x])
#use variable for readability
current = matrix[y][x]
#build largest square counts by checking neighbors above and to left
#so, skip anything in first row or first column
if y!=0 and x!=0 and current==1:
#assign variables for readability. We're checking adjacent squares
left = matrix[y][x-1]
up = matrix[y-1][x]
upleft = matrix[y-1][x-1]
#have to use matrix directly to set new value
matrix[y][x] = current = 1 + min(left, up, upleft)
#reevaluate maximum
if current > maximum:
maximum = current
#return maximum squared, since we're looking for largest area of square, not largest side
return maximum**2
I don't think I've seen people do this before and I'm wondering if it's a bad idea, since I'm sort of maintaining two versions of a value.
Apologies if this is a "coding style" question and therefore just a matter of opinion, but I thought there might be a clear answer that I just haven't found yet.

It is very hard to give a straightforward answer, because it might vary from person to person. Let me start from your queries:
When I see "matrix[i][j+1]" and variations thereof repeatedly, I sometimes lose track of what I'm dealing with.
It depends. People who have moderate programming knowledge should not be confused by seeing a 2-D matrix in matrix[x-pos][y-pos] shape. Again, if you don't feel comfortable, you can use the way you have shared here. But, you should try to adopt and be familiar with this type of common concepts parallelly.
I don't think I've seen people do this before and I'm wondering if it's a bad idea, since I'm sort of maintaining two versions of a value.
It is not a bad idea at all. It is "Okay" as long as you are considering to do this for your comfort. But, if you like to share your code with others, then it might not be a very good idea to use something that is too obvious. It might reduce the understandability of your code to others. But, you should not worry with the maintaining two versions of a value, as long as the extra memory is constant.
Apologies if this is a "coding style" question and therefore just a matter of opinion, but I thought there might be a clear answer that I just haven't found yet.
You are absolutely fine by asking this question. As you mentioned, it is really just a matter of opinion. You can follow some standard language guideline like Google Python Style Guide. It is always recommended to follow some standards for this type of coding style things. Always keep in mind, a piece of good code is always self-documented and putting unnecessary comments sometimes make it boring. Also,
Here I have shared my version of your code. Feel free to comment if you have any question.
# Time: O(m*n)
# Space: O(1)
class Solution:
def maximalSquare(self, matrix: List[List[str]]) -> int:
"""Given an m x n binary matrix filled with 0's and 1's,
find the largest square containing only 1's and return its area.
Args:
matrix: An (m x n) string matrix.
Returns:
Area of the largest square containing only 1's.
"""
maximum = 0
for x in range(len(matrix)):
for y in range(len(matrix[0])):
# convert current matrix cell value from string to integer
matrix[x][y] = int(matrix[x][y])
# build largest square side by checking neighbors from up-row and left-column
# so, skip the cells from the first-row and first-column
if x != 0 and y != 0 and matrix[x][y] != 0:
# update current matrix cell w.r.t. the left, up and up-left cell values respectively
matrix[x][y] = 1 + min(matrix[x][y-1], matrix[x-1][y], matrix[x-1][y-1])
# re-evaluate maximum square side
if matrix[x][y] > maximum:
maximum = matrix[x][y]
# returning the area of the largest square
return maximum**2

Related

Mathematical explanation of Leetcode question: Container With Most Water

I was working on a medium level leetcode question 11. Container With Most Water. Besides the brute force solution with O(n^2), there is an optimal solution with complexity of O(n) by using two pointers from left and right side of the container. I am a little bit confused why this "two pointers" method must include the optimal solution. Does anyone know how to prove the correctness of this algorithm mathematically? This is an algorithm that I don't know of. Thank you!
The original question is:
You are given an integer array height of length n. There are n vertical lines drawn such that the two endpoints of the ith line are (i, 0) and (i, height[i]).
Find two lines that together with the x-axis form a container, such that the container contains the most water. Return the maximum amount of water a container can store. Notice that you may not slant the container.
A brutal solution for this question is(O(n^2)):
def maxArea(self, height: List[int]) -> int:
length = len(height)
volumn = 0
#calculate all possible combinations, and compare one by one:
for position1 in range(0,length):
for position2 in range (position1 + 1, length):
if min(height[position1],height[position2])*(position2 - position1) >=volumn:
volumn = min(height[position1],height[position2])*(position2 - position1)
else:
volumn = volumn
return volumn
Optimal solution approach, The code I wrote is like this(O(n)):
def maxArea(self, height: List[int]) -> int:
pointerOne, pointerTwo = 0, len(height)-1
maxVolumn = 0
#Move left or right pointer one step for whichever is smaller
while pointerOne != pointerTwo:
if height[pointerOne] <= height[pointerTwo]:
maxVolumn = max(height[pointerOne]*(pointerTwo - pointerOne), maxVolumn)
pointerOne += 1
else:
maxVolumn = max(height[pointerTwo]*(pointerTwo - pointerOne), maxVolumn)
pointerTwo -= 1
return maxVolumn
Does anyone know why this "two pointers" method can find the optimal solution? Thanks!

Based on my understanding the idea is roughly:
Staring from widest bars (i.e. first and last bar) and then narrowing
width to find potentially better pair(s).
Steps:
We need to have ability to loop over all 'potential' candidates (the candidates better than what we have on hand rather than all candidates as you did in brutal solution) thus starting from outside bars and no inner pairs will be missed.
If an inner bar pair does exist, it means the height is higher than bars we have on hand, so you should not just #Move left or right pointer one step but #Move left or right pointer to next taller bar .
Why #Move left or right pointer whichever is smaller? Because the smaller bar doesn't fulfill the 'potential' of the taller bar.
The core idea behind the steps is: starting from somewhere that captures optimal solution inside (step 1), then by each step you are reaching to a better solution than what you have on hand (step 2 and 3), and finally you will reach to the optimal solution.
One question left for you think about: what makes sure the optimal solution is not missed when you executing steps above? :)

An informal proof could go something like this: imagine we are at some position in the iteration before reaching the optimal pair:
|
|
|~~~~~~~~~~~~~~~~~~~~~~~|
|~~~~~~~~~~~~~~~~~~~~~~~|
|~~~~~~~~~~~~~~~~~~~~~~~|
|~~~~~~~~~~~~~~~~~~~~~~~|
^ ^
A B
Now lets fix A (the smaller vertical line) and consider all of the choices left of B that we could pair with it. Clearly all of them yield a container with a smaller amount of water than we have currently between A and B.
Since we have stated that we have yet to reach the optimal solution, clearly A cannot be one of the lines contributing to it. Therefore, we move its pointer.
Q.E.D.

8 Queens on a chessboard | PYTHON | Memory Error

I came across this question where 8 queens should be placed on a chessboard such that none can kill each other.This is how I tried to solve it:
import itertools
def allAlive(position):
qPosition=[]
for i in range(8):
qPosition.append(position[2*i:(2*i)+2])
hDel=list(qPosition) #Horizontal
for i in range(8):
a=hDel[0]
del hDel[0]
l=len(hDel)
for j in range(l):
if a[:1]==hDel[j][:1]:
return False
vDel=list(qPosition) #Vertical
for i in range(8):
a=vDel[0]
l=len(vDel)
for j in range(l):
if a[1:2]==vDel[j][1:2]:
return False
cDel=list(qPosition) #Cross
for i in range(8):
a=cDel[0]
l=len(cDel)
for j in range(l):
if abs(ord(a[:1])-ord(cDel[j][:1]))==1 and abs(int(a[1:2])-int(cDel[j][1:2]))==1:
return False
return True
chessPositions=['A1','A2','A3','A4','A5','A6','A7','A8','B1','B2','B3','B4','B5','B6','B7','B8','C1','C2','C3','C4','C5','C6','C7','C8','D1','D2','D3','D4','D5','D6','D7','D8','E1','E2','E3','E4','E5','E6','E7','E8','F1','F2','F3','F4','F5','F6','F7','F8','G1','G2','G3','G4','G5','G6','G7','G8','H1','H2','H3','H4','H5','H6','H7','H8']
qPositions=[''.join(p) for p in itertools.combinations(chessPositions,8)]
for i in qPositions:
if allAlive(i)==True:
print(i)
Traceback (most recent call last):
qPositions=[''.join(p) for p in itertools.combinations(chessPositions,8)]
MemoryError
I'm still a newbie.How can I overcome this error?Or is there any better way to solve this problem?

What you are trying to do is impossible ;)!
qPositions=[''.join(p) for p in itertools.combinations(chessPositions,8)]
means that you will get a list with length 64 choose 8 = 4426165368, since len(chessPositions) = 64, which you cannot store in memory. Why not? Combining what I stated in the comments and #augray in his answer, the result of above operation would be a list which would take
(64 choose 8) * 2 * 8 bytes ~ 66GB
of RAM, since it will have 64 choose 8 elements, each element will have 8 substrings like 'A1' and each substring like this consists of 2 character. One character takes 1 byte.
You have to find another way. I am not answering to that because that is your job. The n-queens problem falls into dynamic programming. I suggest you to google 'n queens problem python' and search for an answer. Then try to understand the code and dynamic programming.
I did searching for you, take a look at this video. As suggested by #Jean François-Fabre, backtracking. Your job is now to watch the video once, twice,... as long as you don't understand the solution to problem. Then open up your favourite editor (mine is Vi :D) and code it down!

This is one case where it's important to understand the "science" (or more accurately, math) part of computer science as much as it is important to understand the nuts and bolts of programming.
From the documentation for itertools.combinations, we see that the number of items returned is n! / r! / (n-r)! where n is the length of the input collection (in your case the number of chess positions, 64) and r is the length of the subsequences you want returned (in your case 8). As #campovski has pointed out, this results in 4,426,165,368. Each returned subsequence will consist of 8*2 characters, each of which is a byte (not to mention the overhead of the other data structures to hold these and calculate the answer). Each character is 1 byte, so in total, just counting the memory consumption of the resulting subsequences gives 4,426,165,368*2*8=70818645888. dividing this by 1024^3 gives the number of Gigs of memory held by these subsequences, about 66GB.
I'm assuming you don't have that much memory :-) . Calculating the answer to this question will require a well thought out algorithm, not just "brute force". I recommend doing some research on the problem- Wikipedia looks like a good place to start.

As the other answers stated you cant get every combination to fit in memory, and you shouldn't use brute force because the speed will be slow. However, if you want to use brute force, you could constrain the problem, and eliminate common rows and columns and check the diagonal
from itertools import permutations
#All possible letters
letters = ['a','b','c','d','e','f','g','h']
#All possible numbers
numbers = [str(i) for i in range(1,len(letters)+1)]
#All possible permutations given rows != to eachother and columns != to eachother
r = [zip(letters, p) for p in permutations(numbers,8)]
#Formatted for your function
points = [''.join([''.join(z) for z in b]) for b in r]
Also as a note, this line of code attempts to first find all of the combinations, then feed your function, which is a waste of memory.
qPositions=[''.join(p) for p in itertools.combinations(chessPositions,8)]
If you decided you do want to use a brute force method, it is possible. Just modify the code for itertools combinations. Remove the yield and return and just feed your check function one at a time.

Python algorithm: What is the complexity of my algorithm and how can I optimize it?

I'm trying to accomplish a task in which you are given n (the size of a square chessboard, queens (the squares they occupy), and queries(a list of coordinates). I am to return a list of Boolean values saying whether or not each query is attackable by any of the queens. I know my algorithm works but I also know that it isn't efficient "enough". Through research, I'm guessing that the complexity is O(queensqueries)? So that's my first questions, is it O(queensqueries), if not why? Second, does anyone have an idea how I can optimize it possibly even down to O(queens + queries)? I apologize if this is a rudimentary question, but I am mainly self-taught and don't have anyone I can ask this. Thanks in advance!
def squaresUnderQueenAttack(n, queens, queries):
output = []
for i in queries:
attackable = False
for j in queens:
if attackable != True and (i[0]==j[0] or i[1]==j[1] or abs(i[0]-j[0])==abs(i[1]-j[1])):
attackable = True
break
output.append(attackable)
return output

Your current code is O(queens * queries). You can improve it to O(queens + queries).
Make three arrays, each with 8 elements. Call them row, col, diagleft, and diagright.
Look at each queen, if it's in row 0, set that spot in row to true. If it's in col 3, set that spot to col in true. Label the diagonals 0-7 and make appropriate marks in them as well.
Now the arrays indicate all of the rows, columns, and diagonals that have queens.
Now, look at each query and check to see if any of the array entries corresponding to its location are marked true. If so, then a queen threatens it.
Note that in big-O notation, we're interested in which factor "dominates" the run-time, so the above solution is probably more appropriately thought of as operating in O(max(queens,queries)) time. That is, if you have many queries, the queens calculation takes only a little time. Similarly, if you have many queens and few queries, that operation will take most of your time.

Bug while implementing Monte Carlo Markov Chain in Python

I'm trying to implement a simple Markov Chain Monte Carlo in Python 2.7, using numpy. The goal is to find the solution to the "Knapsack Problem," where given a set of m objects of value vi and weight wi, and a bag with holding capacity b, you find the greatest value of objects that can be fit into your bag, and what those objects are. I started coding in the Summer, and my knowledge is extremely lopsided, so I apologize if I'm missing something obvious, I'm self-taught and have been jumping all over the place.
The code for the system is as follows (I split it into parts in an attempt to figure out what's going wrong).
import numpy as np
import random
def flip_z(sackcontents):
##This picks a random object, and changes whether it's been selected or not.
pick=random.randint(0,len(sackcontents)-1)
clone_z=sackcontents
np.put(clone_z,pick,1-clone_z[pick])
return clone_z
def checkcompliance(checkedz,checkedweights,checkedsack):
##This checks to see whether a given configuration is overweight
weightVector=np.dot(checkedz,checkedweights)
weightSum=np.sum(weightVector)
if (weightSum > checkedsack):
return False
else:
return True
def getrandomz(length):
##I use this to get a random starting configuration.
##It's not really important, but it does remove the burden of choice.
z=np.array([])
for i in xrange(length):
if random.random() > 0.5:
z=np.append(z,1)
else:
z=np.append(z,0)
return z
def checkvalue(checkedz,checkedvalue):
##This checks how valuable a given configuration is.
wealthVector= np.dot(checkedz,checkedvalue)
wealthsum= np.sum(wealthVector)
return wealthsum
def McKnapsack(weights, values, iterations,sack):
z_start=getrandomz(len(weights))
z=z_start
moneyrecord=0.
zrecord=np.array(["error if you see me"])
failures=0.
for x in xrange(iterations):
current_z= np.array ([])
current_z=flip_z(z)
current_clone=current_z
if (checkcompliance(current_clone,weights,sack))==True:
z=current_clone
if checkvalue(current_z,values)>moneyrecord:
moneyrecord=checkvalue(current_clone,values)
zrecord= current_clone
else:failures+=1
print "The winning knapsack configuration is %s" %(zrecord)
print "The highest value of objects contained is %s" %(moneyrecord)
testvalues1=np.array([3,8,6])
testweights1= np.array([1,2,1])
McKnapsack(testweights1,testvalues1,60,2.)
What should happen is the following: With a maximum carrying capacity of 2, it should randomly switch between the different potential bag carrying configurations, of which there are 2^3=8 with the test weights and values I've fed it, with each one or zero in the z representing having or not having a given item. It should discard the options with too much weight, while keeping track of the ones with the highest value and acceptable weight. The correct answer would be to see 1,0,1 as the configuration, with 9 as the maximized value. I get nine for value every time when I use even moderately high amounts of iterations, but the configurations seem completely random, and somehow break the weight rule. I've double-checked my "checkcompliance" function with a lot of test arrays, and it seems to work. How are these faulty, overweight configurations getting past my if statements and into my zrecord ?

The trick is that z (and therefore also current_z and also zrecord) end up always referring to the exact same object in memory. flip_z modifies this object in-place via np.put.
Once you find a new combination that increases your moneyrecord, you set a reference to it -- but then in the subsequent iteration you go ahead and change the data at that same reference.
In other words, lines like
current_clone=current_z
zrecord= current_clone
do not copy, they only make yet another alias to the same data in memory.
One way to fix this is to explicitly copy that combination once you find it's a winner:
if checkvalue(current_z, values) > moneyrecord:
moneyrecord = checkvalue(current_clone, values)
zrecord = current_clone.copy()

Testing a vector without scanning it [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 9 years ago.
Improve this question
For a series of algorithms I'm implementing I need to simulate things like sets of coins being weighed or pooled blood samples. The overriding goal is to identify a sparse set of interesting items in a set of otherwise identical items. This identification is done by testing groups of items together. For example the classic problem is to find a light counterfeit coin in a group of 81 (identical) coins, using as few weightings of a pan balance as possible. The trick is to split the 81 coins into three groups and weigh two groups against each other. You then do this on the group which doesn't balance until you have 2 coins left.
The key point in the discussion above is that the set of interesting items is sparse in the wider set - the algorithms I'm implementing all outperform binary search etc for this type of input.
What I need is a way to test the entire vector that indicates the presence of a single, or more ones, without scanning the vector componentwise.
I.e. a way to return the Hamming Weight of the vector in an O(1) operation - this will accurately simulate pooling blood samples/weighing groups of coins in a pan balance.
It's key that the vector isn't scanned - but the output should indicate that there is at least one 1 in the vector. By scanning I mean looking at the vector with algorithms such as binary search or looking at each element in turn. That is need to simulate pooling groups of items (such as blood samples) and s single test on the group which indicates the presence of a 1.
I've implemented this 'vector' as a list currently, but this needn't be set in stone. The task is to determine, by testing groups of the sublist, where the 1s in the vector are. An example of the list is:
sparselist = [0]*100000
sparselist[1024] = 1
But this could equally well be a long/set/something else as suggested below.
Currently I'm using any() as the test but it's been pointed out to me that any() will scan the vector - defeating the purpose of what I'm trying to achieve.
Here is an example of a naive binary search using any to test the groups:
def binary_search(inList):
low = 0
high = len(inList)
while low < high:
mid = low + (high-low) // 2
upper = inList[mid:high]
lower = inList[low:mid]
if any(lower):
high = mid
elif any(upper):
low = mid+1
else:
# Neither side has a 1
return -1
return mid
I apologise if this code isn't production quality. Any suggestions to improve it (beyond the any() test) will be appreciated.
I'm trying to come up with a better test than any() as it's been pointed out that any() will scan the list - defeating the point of what I'm trying to do. The test needn't return the exact Hamming weight - it merely needs to indicate that there is (or isn't!) a 1 in the group being tested (i.e. upper/lower in the code above).
I've also thought of using a binary xor, but don't know how to use it in a way that isn't componentwise.

Here is a sketch:
class OrVector (list):
def __init__(self):
self._nonzero_counter = 0
list.__init__(self)
def append(self, x):
list.append(self, x)
if x:
self._nonzero_counter += 1
def remove(self, x):
if x:
self._nonzero_counter -= 1
list.remove(self, x)
def hasOne(self):
return self._nonzero_counter > 0
v = OrVector()
v.append(0)
print v
print v.hasOne()
v.append(1);
print v
print v.hasOne()
v.remove(1);
print v
print v.hasOne()
Output:
[0]
False
[0, 1]
True
[0]
False
The idea is to inherit from list, and add a single variable which stores the number of nonzero entries. While the crucial functionality is delegated to the base list class, at the same time you monitor the number of nonzero entries in the list, and can query it in O(1) time using hasOne() member function.
HTH.

any will only scan the whole vector if does not find you you're after before the end of the "vector".
From the docs it is equivalent to
def any(iterable):
for element in iterable:
if element:
return True
return False
This does make it O(n). If you have things sorted (in your "binary vector") you can use bisect.
e.g. position = index(myVector, value)

Ok, maybe I will try an alternative answer.
You cannot do this with out any prior knowledge of your data. The only thing you can do it to make a test and cache the results. You can design a data structure that will help you determine a result of any subsequent tests in case your data structure is mutable, or a data structure that will be able to determine answer in better time on a subset of your vector.
However, your question does not indicate this. At least it did not at the time of writing the answer. For now you want to make one test on a vector, for a presence of a particular element, giving no prior knowledge about the data, in time complexity less than O(log n) in average case or O(n) in worst. This is not possible.
Also keep in mind you need to load a vector at some point which takes O(n) operations, so if you are interested in performing one test over a set of elements you wont loose much. On the average case with more elements, the loading time will take much more than testing.
If you want to perform a set of tests you can design an algorithm that will "build up" some knowledge during the subsequent test, that will help it determine results in better times. However, that holds only if you want make more than one test!

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Using variable names for 2d matrix elements for readability - python

Related

Mathematical explanation of Leetcode question: Container With Most Water

8 Queens on a chessboard | PYTHON | Memory Error

Python algorithm: What is the complexity of my algorithm and how can I optimize it?

Bug while implementing Monte Carlo Markov Chain in Python

Testing a vector without scanning it [closed]

Categories

Resources