split big matrix into multiple smaller ones - difficulties - python

I have a 32x32 matrix and I want to break it into 8x8 matrices.
Here's how I try to build the smaller matrix for the top-left part of the big one (pix is the 32x32 matrix).
mat_size = 8
A = [[0]*mat_size]*mat_size
for i in range(mat_size):
    for j in range(mat_size):
        A[i][j] = pix[i, j]
So, pix has the following values for the top-left part:
198 197 194 194 197 192 189 196
199 199 198 198 199 195 195 145
200 200 201 200 200 204 131 18
201 201 199 201 203 192 57 56
201 200 198 200 207 171 41 141
200 200 198 199 208 160 38 146
198 198 198 198 206 157 39 129
198 197 197 199 209 157 38 77
But when I print(A) after the loop, every row of A is equal to the last row of pix, so I get 8 rows of 198 197 197 199 209 157 38 77. I know I can use A = pix[:8, :8], but I prefer to use a loop for my purposes. I wonder why the loop solution doesn't give me the correct result.

import numpy as np

A = np.zeros((4, 4, 8, 8))
for i in range(4):
    for j in range(4):
        A[i, j] = pix[i*8:(i+1)*8, j*8:(j+1)*8]
If I understand your question correctly, this solution should work. It iterates over the pix matrix and selects an 8x8 block each time. Is this what you need?
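For example, with a hypothetical 32x32 stand-in for pix, the top-left block from the question is then available as A[0, 0]:
import numpy as np

pix = np.arange(32 * 32).reshape(32, 32)   # stand-in for the real 32x32 image data
A = np.zeros((4, 4, 8, 8))
for i in range(4):
    for j in range(4):
        A[i, j] = pix[i*8:(i+1)*8, j*8:(j+1)*8]

print(np.array_equal(A[0, 0], pix[:8, :8]))  # True: A[0, 0] is the top-left 8x8 block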

Consider using numpy in order to avoid multiple references pointing to the same list (with [[0]*mat_size]*mat_size, every row of A is the same list object, which ends up holding the last row you assigned):
import numpy as np

mat_size = 8
A = np.empty((mat_size, mat_size))
pix = np.array(pix)
for i in range(mat_size):
    for j in range(mat_size):
        A[i][j] = pix[i][j]
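Alternatively, if you want to stay with plain lists rather than numpy, the usual fix (not part of the answer above) is to build every row as its own list, for example with a list comprehension:
mat_size = 8
# each row is a fresh list here, unlike [[0]*mat_size]*mat_size,
# which repeats a single row object mat_size times
A = [[0] * mat_size for _ in range(mat_size)]
for i in range(mat_size):
    for j in range(mat_size):
        A[i][j] = pix[i, j]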

Related

How do I sum all the numbers in a list of list by column without numpy?

So in Python NumPy, I have a list of lists holding the numbers 0 to 99, split into 5 rows of 20:
array_b = np.arange(0,100).reshape(5, 20)
list_a = array_b.tolist()
I want to add the numbers in the list by column so that the result will be:
[200 205 210 215 220 225 230 235 240 245 250 255 260 265 270 275 280 285 290 295]
I know how to do it in the array version, but I want to do the same thing in the list version (without using np.sum(array_b, axis=0)).
Any help?
Without numpy this can be done with zip and map quite elegantly:
list(map(sum, zip(*list_a)))
Explanation:
zip(*list_a) aggregates the lists element-wise
map(sum, ...) applies sum to each of these aggregations
finally, list(...) collects the iterator returned by map into a list.
Easy as (num)py...
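As a quick sanity check on a tiny hand-made list of lists (hypothetical data, not the array from the question):
rows = [[1, 2, 3], [4, 5, 6]]          # tiny hypothetical list of lists
print(list(zip(*rows)))                # [(1, 4), (2, 5), (3, 6)]
print(list(map(sum, zip(*rows))))      # [5, 7, 9]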
Use .sum(axis=0) on a numpy array:
import numpy as np
result = np.array(list_a).sum(axis=0)
# [200 205 210 215 220 225 230 235 240 245 250 255 260 265 270 275 280 285 290 295]
With the other axis possibilities:
result = np.array(list_a).sum(axis=1)  # [ 190  590  990 1390 1790]
result = np.array(list_a).sum()        # 4950
import numpy as np
a = [[...]]
sum_array = np.sum(a, axis=0)

Finding Common Elements (Amazon SDE-1)

Given two lists V1 and V2 of sizes n and m respectively, return the list of elements common to both lists, in sorted order. Duplicates may appear in the output list.
Link to the problem : LINK
Example:
Input:
5
3 4 2 2 4
4
3 2 2 7
Output:
2 2 3
Explanation:
The first list is {3 4 2 2 4}, and the second list is {3 2 2 7}.
The common elements in sorted order are {2 2 3}
Expected Time complexity : O(N)
My code:
class Solution:
    def common_element(self, v1, v2):
        dict1 = {}
        ans = []
        for num1 in v1:
            dict1[num1] = 0
        for num2 in v2:
            if num2 in dict1:
                ans.append(num2)
        return sorted(ans)
Problem with my code:
Dictionary access is constant time, so my time complexity is reduced, but one of the hidden test cases is failing even though the logic is simple and straightforward and everything seems to be on point. What's your take? Is the logic wrong, or is the question description missing some vital detail?
New Approach
Now I generate two hashmaps/dictionaries, one for each array. If a number is present in both, I take the minimum of its two frequencies and append that number to ans that many times.
class Solution:
    def common_element(self, arr1, arr2):
        dict1 = {}
        dict2 = {}
        ans = []
        for num1 in arr1:
            dict1[num1] = 0
        for num1 in arr1:
            dict1[num1] += 1
        for num2 in arr2:
            dict2[num2] = 0
        for num2 in arr2:
            dict2[num2] += 1
        for number in dict1:
            if number in dict2:
                minFreq = min(dict1[number], dict2[number])
                for _ in range(minFreq):
                    ans.append(number)
        return sorted(ans)
The code outputs nothing for this test case:
Input:
64920
83454 38720 96164 26694 34159 26694 51732 64378 41604 13682 82725 82237 41850 26501 29460 57055 10851 58745 22405 37332 68806 65956 24444 97310 72883 33190 88996 42918 56060 73526 33825 8241 37300 46719 45367 1116 79566 75831 14760 95648 49875 66341 39691 56110 83764 67379 83210 31115 10030 90456 33607 62065 41831 65110 34633 81943 45048 92837 54415 29171 63497 10714 37685 68717 58156 51743 64900 85997 24597 73904 10421 41880 41826 40845 31548 14259 11134 16392 58525 3128 85059 29188 13812.................
Its Correct output is:
4 6 9 14 17 19 21 26 28 32 33 42 45 54 61 64 67 72 77 86 93 108 113 115 115 124 129 133 135 137 138 141 142 144 148 151 154 160 167 173 174 192 193 195 198 202 205 209 215 219 220 221 231 231 233 235 236 238 239 241 245 246 246 247 254 255 257 262 277 283 286 290 294 298 305 305 307 309 311 312 316 319 321 323 325 325 326 329 329 335 338 340 341 350 353 355 358 364 367 369 378 385 387 391 401 404 405 406 406 410 413 416 417 421 434 435 443 449 452 455 456 459 460 460 466 467 469 473 482 496 503 .................
And Your Code's output is: (empty)
Please find the solution below:
def sorted_common_elemen(v1, v2):
    res = []
    for elem in v2:
        res.append(elem)
        v1.pop(0)
    return sorted(res)
Your code ignores the number of times a given element occurs in the list. I think this is a good way to fix that:
class Solution:
    def common_element(self, l0, l1):
        li = []
        for i in l0:
            if i in l1:
                l1.remove(i)
                li.append(i)
        return sorted(li)
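For reference, the same frequency-aware intersection can be written more compactly with collections.Counter (a sketch, not taken from the answers above); apart from the final sort it stays roughly linear in the input sizes:
from collections import Counter

def common_elements(v1, v2):
    # Counter & Counter keeps each value with the minimum of its two counts
    return sorted((Counter(v1) & Counter(v2)).elements())

print(common_elements([3, 4, 2, 2, 4], [3, 2, 2, 7]))  # [2, 2, 3]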

How to use certain rows of a dataframe in a formula

So I have multiple data frames, and all of them need the same kind of formula applied to certain sets of rows. I have the locations of the sets inside the df, but I don't know how to access those sets.
This is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt  # might need it later to check the output

df = pd.read_csv('Dalfsen.csv')
l = []
x = []
y = []

# the formula (trendline)
def rechtzetten(x, y):
    a = (len(x)*sum(x*y) - sum(x)*sum(y)) / (len(x)*sum(x**2) - sum(x)**2)
    b = (sum(y) - a*sum(x)) / len(x)
    y1 = x*a + b
    print(y1)

METING = df.ID.str.contains("<METING>")   # locating the sets
indicatie = np.where(METING == False)[0]  # and saving them somewhere

if n in df[n] != indicatie & n+1 != indicatie:    # attempt to add parts of the set to l
    append.l
elif n in df[n] != indicatie & n+1 == indicatie:  # attempt to define the end of the set and apply the formula to it
    append.l
    rechtzetten(l.x, l.y)
else:                                             # emptying the storage for the new set
    l = []
indicatie has the following numbers:
0 12 13 26 27 40 41 53 54 66 67 80 81 94 95 108 109 121
122 137 138 149 150 162 163 177 178 190 191 204 205 217 218 229 230 242
243 255 256 268 269 291 292 312 313 340 341 373 374 401 402 410 411 420
421 430 431 449 450 468 469 487 488 504 505 521 522 538 539 558 559 575
576 590 591 604 605 619 620 633 634 647
Because my df looks like this:
ID,NUM,x,y,nap,abs,end
<PROFIEL>not used data
<METING>data</METING>
<METING>data</METING>
...
<METING>data</METING>
<METING>data</METING>
</PROFIEL>,,,,,,
<PROFIEL>not used data
...
</PROFIEL>,,,,,,
tl;dr I'm trying to apply a formula to each profile as shown above. I want to edit the data between two consecutive numbers of the list indicatie.
For example:
call the function rechtzetten(x, y) with df.x[1:11] and df.y[1:11] (because 0 and 12 are in the list indicatie), and then do the same for [14:25], etc.
What I try to avoid is typing the following hundreds of times manually:
x_#=df.x[1:11]
y_#=df.y[1:11]
rechtzetten(x_#,y_#)
I can't understand your question clearly, but if you want to replace a specific column of your pandas dataframe with a numpy array, you can simply assign it:
df['Column'] = numpy_array
Can you be more clear?
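If the goal is to run rechtzetten on every block of <METING> rows that sits between two consecutive delimiter positions in indicatie, a minimal sketch (assuming indicatie is sorted and the x and y columns are numeric) could look like this:
for start, end in zip(indicatie[:-1], indicatie[1:]):
    if end - start > 1:                  # skip empty blocks (e.g. 12 -> 13)
        x_seg = df.x.iloc[start + 1:end]
        y_seg = df.y.iloc[start + 1:end]
        rechtzetten(x_seg, y_seg)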

How to make a column of numbers increase from a certain value in python

I have a txt file like this:
127 181
151 188
120 201
148 207
148 212
145 215
86 219
108 219
67 239
And I want the second column to be renumbered in order starting from 180, where a repeated number is counted only once (it keeps the same new value).
My expected results are as follows:
127 180
151 181
120 182
148 183
148 184
145 185
86 186
108 186
67 187
Can someone give me some advice? Thanks.
If you are open to using pandas:
df = pd.read_csv('textfile.txt', header=None, sep=' ')
startvalue = 180
df[1] = np.arange(startvalue, startvalue+len(df)) - df[1].duplicated().cumsum()
df.to_csv('textfile_out.txt', sep=' ', index=False, header=False)
Full example (with imports and textfile-creation):
import pandas as pd
import numpy as np
with open('textfile.txt', 'w') as f:
f.write('''\
127 181
151 188
120 201
148 207
148 212
145 215
86 219
108 219
67 239''')
df = pd.read_csv('textfile.txt', header=None, sep=' ')
startvalue = 180
df[1] = np.arange(startvalue, startvalue+len(df)) - df[1].duplicated().cumsum()
df.to_csv('textfile_out.txt', sep=' ', index=False, header=False)
Output:
127 180
151 181
120 182
148 183
148 184
145 185
86 186
108 186
67 187
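To see why the duplicated().cumsum() correction works, look at it on the original second column (a small check, not part of the original answer): it counts how many repeated values have been seen so far, which is exactly how far the plain arange has run ahead.
orig = pd.read_csv('textfile.txt', header=None, sep=' ')[1]
print(orig.duplicated().cumsum().tolist())
# [0, 0, 0, 0, 0, 0, 0, 1, 1]
# Subtracting this from np.arange(180, 189) gives 186 twice for the repeated 219.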
Without using any library, I suggest this approach: create a dictionary that maps each old value to its new value and iterate over the values of the second column (column below is that list of values).
n = 180
new_dict = {}
for index, value in enumerate(column):
    if value in new_dict.keys():
        column[index] = new_dict[value]
    else:
        new_dict[value] = n
        column[index] = n
        n += 1
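A complete file-to-file version of the same idea (a sketch that assumes the same space-separated textfile.txt used in the pandas answer):
n = 180
new_dict = {}
out_lines = []
with open('textfile.txt') as f:
    for line in f:
        first, second = line.split()
        if second in new_dict:
            value = new_dict[second]       # repeated value: reuse its number
        else:
            new_dict[second] = value = n   # new value: take the next number
            n += 1
        out_lines.append(f"{first} {value}")
with open('textfile_out.txt', 'w') as f:
    f.write("\n".join(out_lines) + "\n")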

Is there a way to save a custom matplotlib colorbar to use elsewhere?

Is there a way to save a custom matplotlib colourmap (matplotlib.cm) as a file (e.g. a Color Palette Table file (.cpt), like those used in MATLAB) to be shared and then used later in other programs? (e.g. Panoply, MATLAB...)
Example
Below a new LinearSegmentedColormap is made by modifying an existing colormap (by truncation, as shown in another question linked here).
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

# Get an existing colormap
cb = 'CMRmap'
cmap = plt.get_cmap(cb)

# Variables to modify (truncate) the colormap with
minval = 0.15
maxval = 0.95
npoints = 100

# Now modify (truncate) the colormap
cmap = matplotlib.colors.LinearSegmentedColormap.from_list(
    'trunc({n},{a:.2f},{b:.2f})'.format(n=cmap.name, a=minval, b=maxval),
    cmap(np.linspace(minval, maxval, npoints)))

# Now the data can be extracted as a dictionary
cdict = cmap._segmentdata
# e.g. keys ('blue', 'alpha', 'green', 'red')
print(cdict.keys())

# Now, is it possible to save this as a .cpt?
More detail
I am aware of ways of loading external colormaps in matplotlib (e.g. shown here and here).
From NASA GISS's Panoply documentation:
Color Palette Table (CPT) indicates a color palette format used by the
Generic Mapping Tools program. The format defines a number of solid
color and/or gradient bands between the colorbar extrema rather than a
finite number of distinct colors.
The following is a function that takes a colormap, some limits (vmin and vmax) and the number of colors as input and creates a cpt file from it.
import matplotlib.pyplot as plt
import numpy as np

def export_cmap_to_cpt(cmap, vmin=0, vmax=1, N=255, filename="test.cpt", **kwargs):
    # create string for upper, lower colors
    b = np.array(kwargs.get("B", cmap(0.)))
    f = np.array(kwargs.get("F", cmap(1.)))
    na = np.array(kwargs.get("N", (0, 0, 0))).astype(float)
    ext = (np.c_[b[:3], f[:3], na[:3]].T * 255).astype(int)
    extstr = "B {:3d} {:3d} {:3d}\nF {:3d} {:3d} {:3d}\nN {:3d} {:3d} {:3d}"
    ex = extstr.format(*list(ext.flatten()))
    # create colormap
    cols = (cmap(np.linspace(0., 1., N))[:, :3] * 255).astype(int)
    vals = np.linspace(vmin, vmax, N)
    arr = np.c_[vals[:-1], cols[:-1], vals[1:], cols[1:]]
    # save to file
    fmt = "%e %3d %3d %3d %e %3d %3d %3d"
    np.savetxt(filename, arr, fmt=fmt,
               header="# COLOR_MODEL = RGB",
               footer=ex, comments="")

# test case: create cpt file from RdYlBu colormap
cmap = plt.get_cmap("RdYlBu", 255)
# you may create your colormap differently, as in the question
export_cmap_to_cpt(cmap, vmin=0, vmax=1, N=20)
The resulting file looks like
# COLOR_MODEL = RGB
0.000000e+00 165 0 38 5.263158e-02 190 24 38
5.263158e-02 190 24 38 1.052632e-01 215 49 39
1.052632e-01 215 49 39 1.578947e-01 231 83 55
1.578947e-01 231 83 55 2.105263e-01 244 114 69
2.105263e-01 244 114 69 2.631579e-01 249 150 86
2.631579e-01 249 150 86 3.157895e-01 253 181 104
3.157895e-01 253 181 104 3.684211e-01 253 207 128
3.684211e-01 253 207 128 4.210526e-01 254 230 153
4.210526e-01 254 230 153 4.736842e-01 254 246 178
4.736842e-01 254 246 178 5.263158e-01 246 251 206
5.263158e-01 246 251 206 5.789474e-01 230 245 235
5.789474e-01 230 245 235 6.315789e-01 206 234 242
6.315789e-01 206 234 242 6.842105e-01 178 220 235
6.842105e-01 178 220 235 7.368421e-01 151 201 224
7.368421e-01 151 201 224 7.894737e-01 120 176 211
7.894737e-01 120 176 211 8.421053e-01 96 149 196
8.421053e-01 96 149 196 8.947368e-01 70 118 180
8.947368e-01 70 118 180 9.473684e-01 59 86 164
9.473684e-01 59 86 164 1.000000e+00 49 54 149
B 165 0 38
F 49 54 149
N 0 0 0
and would be in the required format.
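For completeness, a minimal sketch of reading such an RGB .cpt file back into matplotlib (read_cpt is a hypothetical helper that assumes the simple format produced above, not a general GMT parser):
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

def read_cpt(filename="test.cpt"):
    positions, colors, last = [], [], None
    with open(filename) as f:
        for line in f:
            parts = line.split()
            if not parts or parts[0] in ("#", "B", "F", "N"):
                continue  # skip the COLOR_MODEL header and the B/F/N footer
            positions.append(float(parts[0]))
            colors.append(tuple(int(v) / 255 for v in parts[1:4]))
            last = parts
    # add the upper edge of the last band
    positions.append(float(last[4]))
    colors.append(tuple(int(v) / 255 for v in last[5:8]))
    pos = np.array(positions)
    pos = (pos - pos.min()) / (pos.max() - pos.min())  # normalise to 0..1
    return LinearSegmentedColormap.from_list("from_cpt", list(zip(pos, colors)))

cmap_back = read_cpt("test.cpt")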
