Recursion - Creating a matrix to paint a PPM file - python

We have a .ppm file representing an image, which has been converted into matrix form, like so:
208 21 139 96 38 169 0 172 123 115 172 154 0 227 153 29 234 109 222 39
5 241 176 62 133 69 0 152 145 154 99 93 0 74 85 47 241 23 207 45
25 92 229 196 163 139 0 189 76 0 0 220 0 2 152 0 79 44 249 203
5 8 75 228 108 125 0 129 0 39 0 18 0 144 30 0 0 0 172 54
222 3 25 196 240 0 0 1 0 11 0 226 0 202 20 203 235 169 0 93
238 184 0 0 0 0 249 123 0 178 0 252 0 91 152 49 119 200 0 31
0 0 220 170 165 11 148 0 0 52 0 233 0 241 131 83 173 196 0 0
204 0 0 0 0 0 0 0 92 225 0 0 0 141 159 182 0 0 0 143
141 178 217 74 0 174 243 164 200 98 138 122 67 44 34 96 0 0 68 118
133 227 39 0 0 118 234 247 38 0 0 0 0 0 0 0 243 247 108 153
54 185 145 0 0 9 102 9 57 0 159 210 128 152 171 4 0 0 118 139
225 161 52 17 0 0 115 129 0 0 170 0 0 0 0 83 45 0 204 91
212 57 167 39 174 0 0 0 0 89 178 0 197 0 0 219 0 0 0 0
173 113 78 184 115 48 107 253 0 0 53 216 0 0 109 245 0 102 42 26
251 187 218 234 139 140 84 101 0 0 64 102 0 0 0 0 106 111 237 26
164 142 31 222 63 218 252 0 0 228 151 76 169 0 95 153 168 195 157 127
141 157 99 86 156 0 0 109 0 227 97 54 0 0 144 11 237 169 67 53
171 211 226 0 0 156 208 207 0 0 0 0 0 249 56 229 194 48 216 197
29 200 99 0 188 160 178 199 145 244 0 0 162 163 254 201 0 120 239 5
51 134 175 0 193 216 79 49 89 86 180 0 0 0 0 0 35 37 42 2
In this matrix, zeroes (0) represent walls and positive numbers represent colors. As you can see, the matrix is divided into areas (islands) by walls, i.e. zeros (diagonal zeros count as walls as well). I need a program that returns a list of islands, each containing all the numbers in that area (for example, a list of all numbers in the first island, then a list of all numbers in the second, and so on). I wrote the program below (it is incomplete), but it hits the recursion limit.
To give some perspective, what I am trying to build is a program that averages the colors in every island. That is, every number within an island should be replaced by the average of all numbers in that island, but I got stuck midway. I used a recursive algorithm because it made the most sense to me.
def rec_appender(img, r, c, lst):
    n_rows, n_cols = len(img), len(img[0])
    if r < 0 or c < 0 or r >= n_rows or c >= n_cols:  # outside the actual borders
        return
    if img[r][c] == 0:  # wall
        return
    lst.append(img[r][c])
    neigh_list = [[-1, 0], [+1, 0], [0, -1], [0, +1]]
    for neigh in neigh_list:
        rec_appender(img, r + neigh[0], c + neigh[1], lst)

def averager(img):
    lst = []
    n_rows, n_cols = len(img), len(img[0])
    for r in range(0, n_rows):
        for c in range(0, n_cols):
            if img[r][c] != 0:  # not a wall
                rec_appender(img, r, c, lst)
The second function visits every point in the matrix and, if the point is not a wall, calls the first function.
The first function appends that point to a list, then checks whether its neighbors are part of the same island and recursively adds them to the list if they are. (The code is incomplete: as you can see, the islands are not separated yet. My problem, however, is the recursion limit.)

This should work. In dfs() you iterate over every cell in the board; when you find a cell that has not been visited, you try to visit its entire island using _dfs(). Each time you finish visiting an island you have the sum of its colors, which you then divide by the number of cells visited by _dfs(). I use a modified version of the depth-first search (DFS) algorithm.
# Offsets of the four orthogonal neighbours, matching neigh_list in the question.
helper_x = [-1, 1, 0, 0]
helper_y = [0, 0, -1, 1]

def dfs(colors: list[list[int]]):
    mask = [[False] * len(colors[0]) for _ in range(len(colors))]
    islands_average: list[float] = []
    for i in range(len(colors)):
        for j in range(len(colors[0])):
            if not mask[i][j] and colors[i][j] != 0:
                total, sum_for_average = _dfs(i, j, colors, mask)
                average = sum_for_average / total
                islands_average.append(average)
    return islands_average

def verify_pos(colors, x, y):
    return 0 <= x < len(colors) and 0 <= y < len(colors[0])

def _dfs(x: int, y: int, colors, mask) -> tuple[int, int]:
    mask[x][y] = True  # mark the cell so it is never visited twice
    sum_for_average = colors[x][y]
    total = 1
    for k in range(4):
        xx = x + helper_x[k]
        yy = y + helper_y[k]
        if verify_pos(colors, xx, yy) and colors[xx][yy] != 0 and not mask[xx][yy]:
            current = _dfs(xx, yy, colors, mask)
            total += current[0]
            sum_for_average += current[1]
    return total, sum_for_average
EDIT: I assumed you already knew how to convert the colors into a matrix. You can do it as follows:
def get_matrix(string: str):
    sol = []
    for row in string.split("\n"):
        sol.append([int(c) for c in row.split(' ') if c != ''])
    return sol
EDIT2: The entry point of the algorithm is dfs(colors), where the argument colors is a list of lists of colors. You can use get_matrix(string) to build colors from the input string format.
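Note that _dfs() above is still recursive, so a very large island can still reach Python's recursion limit. Below is a minimal sketch of the same traversal with an explicit stack instead of recursion; it also returns the list of values per island, which is what the question asks for. The name island_values is hypothetical, and 4-neighbour connectivity is assumed, as in both snippets above.

```python
def island_values(img):
    """Return one list of cell values per island (4-neighbour connectivity)."""
    n_rows, n_cols = len(img), len(img[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    islands = []
    for r in range(n_rows):
        for c in range(n_cols):
            if img[r][c] == 0 or seen[r][c]:
                continue  # wall, or already assigned to an island
            seen[r][c] = True
            stack = [(r, c)]  # explicit stack replaces the call stack
            values = []
            while stack:
                y, x = stack.pop()
                values.append(img[y][x])
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and img[ny][nx] != 0 and not seen[ny][nx]):
                        seen[ny][nx] = True  # mark when pushed, so no cell is pushed twice
                        stack.append((ny, nx))
            islands.append(values)
    return islands


def island_averages(img):
    """Average colour of each island, in the order the islands are found."""
    return [sum(values) / len(values) for values in island_values(img)]
```

For example, island_averages(get_matrix(text)) would give one average per island for a matrix in the string format shown in the question.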

Related

Python, Prime number project

I'm trying to create a program in Python that asks how many prime numbers to print. The program should then print them ten per line, continuing on the next line. I managed to solve the prime number part, but I can't seem to find a solution for printing ten per line.
I would really appreciate the help.
Input:
num = int(input("How many primes: "))
count = 0
prime = 2
while count < num:
    if all(prime % j != 0 for j in range(2, prime)):
        print(prime, end=" ")
        count += 1
    prime += 1
Output: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79
83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173
179 181 191 193 197 199 211 223 227 229
But I need this output
How many primes? 50
2 3 5 7 11 13 17 19 23 29
31 37 41 43 47 53 59 61 67 71
73 79 83 89 97 101 103 107 109 113
127 131 137 139 149 151 157 163 167 173
179 181 191 193 197 199 211 223 227 229
num = int(input("How many primes: "))
count = 0
prime = 2
while count < num:
    if all(prime % j != 0 for j in range(2, prime)):
        print(prime, end=" ")
        count += 1
        if count % 10 == 0:
            print()  # start a new line after every tenth prime
    prime += 1
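One possible variation, sketched here under the assumption that it is acceptable to collect the primes first, is to separate finding the primes from formatting them. The helper names first_primes and format_rows are hypothetical.

```python
def first_primes(n):
    """Return the first n primes, testing each candidate against smaller primes only."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # Only primes p with p * p <= candidate can be the smallest factor.
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def format_rows(values, per_row=10):
    """Join the values into lines containing at most per_row items each."""
    return [" ".join(map(str, values[i:i + per_row]))
            for i in range(0, len(values), per_row)]


if __name__ == "__main__":
    for line in format_rows(first_primes(50)):
        print(line)
```

Keeping the formatting in its own function means the row width can be changed without touching the prime search.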

How to read binary file and convert it to text in Python

I have a time series of gauge discharges covering 1946-2020. The file is binary, and when I open it in a text editor, or even in a hex editor, I see values that do not make sense. I have searched a lot and found some code, but I still don't see any time series or values.
I expect the time series to look like this (these values are correct and are in the data):
t Q
17.11.1972 8,66
04.02.2020 28,2
I copied the beginning part of the file:
##4.00
à?š™™™™™é?ÍÌÌÌÌÌì?ffffffî?¸…ëQ¸î?\Âõ(\ï?®Gáz®ï?×£p=
×ï?V-²ïï?§èH.ÿï?Sš ä ÍÌL= ÿÿÍÌL= _ B €#
## NASIM26760601m³/sffûB°FAˆ¢A ¥¼x?  §=,ðñ=ÿ9jŒA´¯DA;Âò#¿‡Ø½ =|?0¥‡=?1=ÿ]”:A þA ¨ï¿eV4#)¡? i3|?`d‹=ek=ÿ‘_î#5Ý#¼˜DA
©]? cÂ{?Œ%¿=+>ÿÚÍ# %µ#À#•9AN? ýô{?h«=×Í­=ÿð½¢#»MAòöî# ¤¼x?¸~=Xä—=ÿ9jŒA
+BAïÕ#yBѾ ‚Äw?èrÈ=¯k“=ÿ]”:A¼/±#>. #„×9AG€
I copied the last part of the file, because I know there must be the time-discharge of 2020. Maybe it is in the end of the file.
×ï?V-²ïï?+‡ÙÎ÷ï? ÍÌL= ÿÿÍÌL= _ B €#
##
In the following screenshot you can see the data when I open it in Notepad++.
Here is my Python code and its output:
with open("time-serie_1946 bis 2020.hqr", "rb") as file:
    data = file.read()
with open("out.txt", "w") as f:
    f.write(" ".join(map(str, data)))
    f.write("\n")
The beginning of the output:
6 64 64 52 46 48 48 10 0 0 0 0 0 0 0 224 63 154 153 153 153 153 153 233 63 205 204 204 204 204 204 236 63 102 102 102 102 102 102 238 63 184 30 133 235 81 184 238 63 92 143 194 245 40 92 239 63 174 71 225 122 20 174 239 63 215 163 112 61 10 215 239 63 86 14 45 178 157 239 239 63 30 167 232 72 46 255 239 63 83 78 101 117 98 101 114 101 99 104 110 117 110 103 32 98 105 115 32 50 48 50 48 32 109 105 116 32 117 110 98 101 115 116 228 116 105 103 116 101 110 32 72 81 32 118 111 110 32 49 57 52 54 45 49 57 55 50 32 40 65 110 102 114 97 103 101 32 83 99 104 117 104 109 97 99 104 101 114 44 32 84 82 41 154 7 0 0 228 7 0 0 0 0 0 0
How can I decode it to get the time series?
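The layout of this file format is not documented here, so the following is only a sketch of how the dump above can be probed with the standard struct module. It assumes, based purely on the byte patterns in the output, that the 8-byte groups ending in 63 are little-endian IEEE 754 doubles, that the readable stretch is Latin-1 text, and that the 4-byte groups after it are little-endian 32-bit integers. All byte values are copied from the dump; where the actual time series starts is not known.

```python
import struct

# Three 8-byte groups that follow the 8-byte header in the dump.
doubles_raw = bytes([
    0, 0, 0, 0, 0, 0, 224, 63,
    154, 153, 153, 153, 153, 153, 233, 63,
    205, 204, 204, 204, 204, 204, 236, 63,
])
values = struct.unpack("<3d", doubles_raw)  # "<" = little-endian, "d" = 8-byte float

# Part of the readable stretch; byte 228 is "ä" in Latin-1.
text_raw = bytes([117, 110, 98, 101, 115, 116, 228, 116, 105, 103, 116, 101, 110])
text = text_raw.decode("latin-1")

# The two 4-byte groups right after the readable text.
years_raw = bytes([154, 7, 0, 0, 228, 7, 0, 0])
years = struct.unpack("<2i", years_raw)  # "i" = 4-byte signed integer

print(values, text, years)
```

If these guesses hold for the rest of the file, the same approach (struct.unpack_from on the full data at increasing offsets) could be used to look for the date and discharge records, but the record layout would still have to be worked out from the file itself.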

pad rows on a pandas dataframe with zeros till N count

I am loading data via pandas read_csv like so:
data = pd.read_csv(file_name_item, sep=" ", header=None, usecols=[0,1,2])
which looks like so:
0 1 2
0 257 503 48
1 167 258 39
2 172 242 39
3 172 403 81
4 180 228 39
5 183 394 255
6 192 179 15
7 192 347 234
8 192 380 243
9 192 437 135
10 211 358 234
I would like to pad this data with zeros till a row count of 256, meaning:
0 1 2
0 257 503 48
1 167 258 39
2 172 242 39
3 172 403 81
4 180 228 39
5 183 394 255
6 192 179 15
7 192 347 234
8 192 380 243
9 192 437 135
10 211 358 234
11 0 0 0
.. .. .. ..
256 0 0 0
How do I go about doing this? The file could have anything from 1 row to 200 odd rows and I am looking for something generic which pads this dataframe with 0's till 256 rows.
I am quite new to pandas and could not find any function to do this.
Use reindex with fill_value:
df_final = data.reindex(range(257), fill_value=0)
Out[1845]:
0 1 2
0 257 503 48
1 167 258 39
2 172 242 39
3 172 403 81
4 180 228 39
.. ... ... ..
252 0 0 0
253 0 0 0
254 0 0 0
255 0 0 0
256 0 0 0
[257 rows x 3 columns]
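Alternatively, reindex first (which fills the new rows with NaN and turns the columns into floats), then replace the NaN values with 0 and convert back to integers:
new_df = data.reindex(range(257)).fillna(0).astype(int)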

cumsum in numpy ndarray (edited)

I had an issue with cumsum in a dataframe, which was nicely resolved here: https://stackoverflow.com/a/61842690/7937578
But when I tried to apply it to my entire dataset, I couldn't fit all my data into pandas, so I tried converting it to NumPy arrays only, and I can't seem to reproduce the code in NumPy alone.
So far I have this:
test = np.arange(200).reshape(4, 50)
test[2] = np.random.choice([-1, 0, 1], size=50)
TARGET_SUM = 10
x = np.cumsum(test[2] != 0)
changing = np.roll(x, 1) != x
indices = np.where(changing & (x % TARGET_SUM == 0) & (x > 0))[0]
indices = np.concatenate(([-1,], indices))
indices += 1
for i1, i2 in zip(indices[0:-1], indices[1:]):
    print(i1, i2)
    print(test[i1:i2])
But the output is this :
0 13
[[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
36 37 38 39 40 41 42 43 44 45 46 47 48 49]
[ 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67
68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
86 87 88 89 90 91 92 93 94 95 96 97 98 99]
[ -1 1 0 -1 -1 0 0 -1 1 -1 1 1 -1 -1 0 -1 0 0
0 -1 0 0 -1 1 -1 1 1 -1 -1 0 1 0 0 -1 1 -1
1 0 0 0 1 0 -1 1 1 1 1 1 1 1]
[150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167
168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185
186 187 188 189 190 191 192 193 194 195 196 197 198 199]]
13 29
[]
29 46
[]
Where it should be more like this :
0 13
[[ 0 1 2 3 4 5 6 7 8 9 10 11 12]
[ 50 51 52 53 54 55 56 57 58 59 60 61 62]
[ -1 1 0 -1 -1 0 0 -1 1 -1 1 1 -1]
[ 150 151 152 153 154 155 156 157 158 159 160 161 162]]
13 29
etc...
The solution from @Ben.T just above worked perfectly!
Quoting: "I think suppr is a list of arrays, so I guess what you need is print(np.array(suppr)[:, i1:i2]) and if it is already an array, then suppr[:, i1:i2] should be enough."
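Applied to the array named test in the question, the quoted advice amounts to slicing columns (test[:, i1:i2]) rather than rows (test[i1:i2]). Here is a minimal sketch of that change; to keep the result reproducible, the random third row is replaced by a fixed pattern, which is an assumption made only for this example.

```python
import numpy as np

test = np.arange(200).reshape(4, 50)
# Fixed pattern instead of np.random.choice, so the split points are predictable.
test[2] = np.tile([1, 0, -1, 1, 0], 10)
TARGET_SUM = 10

x = np.cumsum(test[2] != 0)      # running count of non-zero entries in row 2
changing = np.roll(x, 1) != x    # True where the count has just increased
indices = np.where(changing & (x % TARGET_SUM == 0) & (x > 0))[0]
indices = np.concatenate(([-1], indices)) + 1

# Slice along axis 1 (columns) so every chunk keeps all four rows.
chunks = [test[:, i1:i2] for i1, i2 in zip(indices[:-1], indices[1:])]
for chunk in chunks:
    print(chunk.shape)
```

Each chunk then contains exactly TARGET_SUM non-zero entries in its third row.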

Can I separate data for each curve?

I want to extract the points that lie on a quadratic curve so that I can fit the quadratic equation:
ay^2 + by + c = d
I have the following data:
x = [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
95 0 92 0 92 96 0 92 96 0 92 96 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92 0 92
0 92 0 92 0 92 0 92 0 92 153 0 92 0 92 0 92 149
0 92 0 92 146 0 92 145 0 92 144 0 92 0 92 0 92 140
0 92 139 0 92 138 0 92 137 0 92 136 0 92 135 0 92 134
0 92 133 0 92 132 0 92 131 0 92 130 0 92 128 129 0 92
128 0 92 127 0 92 126 127 0 92 125 126 0 92 124 125 0 92
124 0 92 123 0 92 122 0 121 0 120 121 0 119 120 0 118 119
0 117 118 0 117 0 116 117 0 115 116 0 114 115 0 114 0 113
114 0 112 113 0 112 0 111 0 110 111 0 109 110 0 109 0 108
0 107 108 0 107 0 106 0 105 106 0 105 0 104 105 0 103 104
0 103 0 102 103 0 102 0 101 0 100 0 99 100 0 99 0 98
99 0 98 0 97 0 96 97 0 96 0 95 96 0 95 0 94 0
94 0 93 0 93 0 92 0 91 92 0 91 0 90 91 0 90 0
89 90 0 89 0 88 89 0 88 89 0 88 0 88 0 88 0 87
0 87 0 0 0 0 0 0 0 0 0 0 0 0 0]
y =
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 89 90 91
92 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110
111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128
129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146
147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164
165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182
183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200
201 201 202 202 203 203 204 204 205 205 206 206 207 207 208 208 209 209
210 210 211 211 212 212 213 213 214 214 215 215 216 216 217 217 218 218
219 219 220 220 221 221 222 222 223 223 224 224 225 225 226 226 227 227
228 228 229 229 230 230 231 231 232 232 233 233 234 234 235 235 236 236
237 237 238 238 239 239 240 240 241 241 242 242 243 243 244 244 245 245
246 246 247 247 248 248 249 249 250 250 251 251 252 252 253 253 254 254
254 255 255 256 256 256 257 257 257 258 258 258 259 259 260 260 261 261
262 262 263 263 264 264 265 265 266 266 267 267 268 268 269 269 270 270
271 271 272 272 273 273 274 274 275 275 276 276 277 277 278 278 279 279
280 280 281 281 282 282 283 283 284 284 285 285 286 286 287 287 288 288
289 289 290 290 291 291 292 292 293 293 294 294 295 295 296 296 297 297
298 298 299 299 300 300 301 301 302 302 303 303 304 304 305 305 306 306
307 307 308 308 309 309 310 310 311 311 312 312 313 313 314 314 315 315
316 316 317 317 318 318 319 319 320 320 320 321 321 322 322 323 323 323
324 324 325 325 325 326 326 326 327 327 327 328 328 329 329 330 330 330
331 331 331 332 332 332 333 333 333 334 334 334 335 335 335 336 336 336
337 337 337 338 338 338 339 339 339 340 340 340 341 341 341 341 342 342
342 343 343 343 344 344 344 344 345 345 345 345 346 346 346 346 347 347
347 348 348 348 349 349 349 350 350 351 351 351 352 352 352 353 353 353
354 354 354 355 355 356 356 356 357 357 357 358 358 358 359 359 360 360
360 361 361 361 362 362 363 363 364 364 364 365 365 365 366 366 367 367
368 368 368 369 369 370 370 371 371 371 372 372 373 373 373 374 374 374
375 375 376 376 376 377 377 378 378 379 379 380 380 380 381 381 382 382
382 383 383 384 384 385 385 385 386 386 387 387 387 388 388 389 389 390
390 391 391 392 392 393 393 394 394 394 395 395 396 396 396 397 397 398
398 398 399 399 400 400 400 401 401 401 402 402 403 403 404 404 405 405
406 406 407 408 409 410 411 412 413 414 415 416 417 418 419]
When I plot the data I can see three lines. Can I separate the data for each curve,
or can I extract only the values of the quadratic curve?
Try the DBSCAN algorithm; it is already implemented in scikit-learn (sklearn.cluster.DBSCAN).
It works well when the samples on each curve lie close to one another along that curve but far from the samples on the other curves.
