convert unicode into list

convert unicode into list - python

I convert a PNG image into block characters and add the section sign (§), then convert the result to a byte string using:
lframe = [e.encode('utf-8') for e in frame.split(',')]
but when I do, it gives me:
['\xc2\xa70\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\
x88\xe2\x96\x88\xc2\xa76\xe2\x96\x88\xe2\x96\x88\xc2\xa70\xe2\x96\x88\xe2\x96\x8
8\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xc2\xa7r']
What I want is a way to convert my output into something like
['\xc2','\xa70','\xe2','\x96','\x88'...]
Thanks!

The code below should do what you want.
lframe = [x for x in [e.encode('utf-8') for e in frame.split(',')][0]]
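The one-liner above relies on Python 2, where iterating over an encoded string yields one-character strings. In Python 3, iterating over a bytes object yields integers instead, so a minimal sketch for that case (assuming a short sample value for frame, which is not the asker's real data) could slice one byte at a time. Note that '\xa70' in the question is really two bytes, '\xa7' followed by '0'.

```python
# Hypothetical sample: section sign, "0", two full blocks, section sign, "r".
frame = "\u00a70\u2588\u2588\u00a7r"

encoded = frame.encode("utf-8")

# Slicing (rather than indexing) keeps each element a bytes object of length 1.
single_bytes = [encoded[i:i + 1] for i in range(len(encoded))]

print(single_bytes[:6])
```

Joining the pieces with b"".join(single_bytes) gives back the original encoded value.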

Related

In Python, how to remove items in a list based on the specific string format?

I have a Python list as below:
merged_cells_lst = [
'P19:Q19
'P20:Q20
'P21:Q21
'P22:Q22
'P23:Q23
'P14:Q14
'P15:Q15
'P16:Q16
'P17:Q17
'P18:Q18
'AU9:AV9
'P10:Q10
'P11:Q11
'P12:Q12
'P13:Q13
'A6:P6
'A7:P7
'D9:AJ9
'AK9:AQ9
'AR9:AT9'
'A1:P1'
]
I only want to unmerge the cells in the P and Q columns. Therefore, I seek to remove any strings/items in the merged_cells_lst that do not have the format "P##:Q##".
I think that regex is the best and simplest way to go about this. So far I have the following:
for item in merge_cell_lst:
    if re.match(r'P*:Q*'):
        pass
    else:
        merged_cell_lst.pop(item)
print(merge_cell_lst)
The code however is not working. I could use any additional tips/help. Thank you!

Modifying a list while looping over it causes trouble. You can use a list comprehension instead to create a new list.
Also, you need a different regular expression. The current pattern P*:Q* matches PP:QQQ, :Q, or even :, but not P19:Q19.
import re
merged_cells_lst = ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15', 'P16:Q16', 'P17:Q17', 'P18:Q18', 'AU9:AV9', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13', 'A6:P6', 'A7:P7', 'D9:AJ9', 'AK9:AQ9', 'AR9:AT9', 'A1:P1']
p = re.compile(r"P\d+:Q\d+")
output = [x for x in merged_cells_lst if p.match(x)]
print(output)
# ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15',
# 'P16:Q16', 'P17:Q17', 'P18:Q18', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13']

Your list has some typos, should look something like this:
merged_cells_lst = [
'P19:Q19',
'P20:Q20',
'P21:Q21', ...]
Then something as simple as:
x = [k for k in merged_cells_lst if k[0] == 'P']
would work. This assumes you know a priori that the items you want to keep follow the Pxx:Qxx format and that nothing else starts with P. If you want a dynamic solution then you can replace the condition in the list comprehension with a regex match.
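As a rough sketch of the regex variant mentioned above, the condition in the list comprehension can be a compiled pattern. This example uses re.fullmatch so that trailing characters are rejected too; the shortened list and the name p_q_only are illustrative assumptions, not part of the original question.

```python
import re

# Shortened, corrected version of the list from the question.
merged_cells_lst = ['P19:Q19', 'AU9:AV9', 'A6:P6', 'P10:Q10', 'A1:P1']

# "P", one or more digits, ":", "Q", one or more digits, and nothing else.
pattern = re.compile(r"P\d+:Q\d+")

# Build a new list instead of removing items from the one being iterated.
p_q_only = [cell for cell in merged_cells_lst if pattern.fullmatch(cell)]

print(p_q_only)
```

Unlike re.match, which only anchors at the start, fullmatch requires the whole string to fit the pattern.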

How do I turn each element in a list into a string with quotes

I am using PyCharm IDE.
I frequently work with large data sets, and sometimes I have to iterate through each item.
For instance, I have a list
ticker_symbols = [500.SI, 502.SI, 504.SI, 505.SI, 508.SI, 510.SI, 519.SI...]
How do I automatically format each element into a string with quotes, i.e.
ticker_symbols = ['500.SI', '502.SI', '504.SI', '505.SI', '508.SI', '510.SI', '519.SI'...] ?
Is there a shortcut in PyCharm?

You can just do something like:
ticker_symbols = '[500.SI,502.SI,504.SI,505.SI,508.SI,510.SI,519.SI]'
print(ticker_symbols[1:-1].split(','))
Or like your string:
ticker_symbols = '[500.SI, 502.SI, 504.SI, 505.SI, 508.SI, 510.SI, 519.SI]'
print(ticker_symbols[1:-1].split(', '))
Both produce:
['500.SI', '502.SI', '504.SI', '505.SI', '508.SI', '510.SI', '519.SI']
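The two variants above depend on whether the separators include a space. A slightly more forgiving sketch, assuming the unquoted symbols have been pasted into a single string (the sample value of raw is made up for illustration), strips whitespace from every item instead:

```python
# Hypothetical pasted text with inconsistent spacing around the commas.
raw = "[500.SI, 502.SI,504.SI , 505.SI]"

# Drop the surrounding brackets, split on commas, and trim each piece.
# Empty pieces (for example from a trailing comma) are skipped.
ticker_symbols = [item.strip() for item in raw.strip("[]").split(",") if item.strip()]

print(ticker_symbols)
```

Printing the list shows each element quoted, so the output can be copied back into source code.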

You can use a list comprehension:
temp_list = ["'{}'".format(x) for x in ticker_symbols]
This results in:
['500.SI', '502.SI', '504.SI',...]

This will convert your list elements to strings:
ticker_symbols = str(ticker_symbols[1:-1].split(', '))

convert strings in list to float

This issue is driving me nuts. I have scraped data from a website and put it into a dictionary. As a result I have a couple of lists, and one of them looks like this:
'In': ['7,051,156,075,145', '878,009,569,197', '427,386,441,994', '278,189,230,134', '230,599,954,634', '197,088,252,840', '101,610,549,933', '78,426,830,219', '80,925,933,532', '58,451,193,176', '55,701,282,247', '49,748,756,546', '48,642,591,960', '45,686,162,172', '44,227,235,911', '40,467,951,256', '16,392,881,465', '5,988,546,624', '41,810,356,569', '23,515,110,330', '35,815,116,718', '10,968,016,226', '518,858,345', '29,947,210,177', '29,030,975,280', '28,803,225,552', '373,570,428', '27,527,784,709', '373,964,822', '514,671,410', '25,875,702,735', '416,462,736', '24,423,209,779', '24,332,893,924', '22,491,450,198', '22,894,037,015', '23,026,866,310', '6,148,324,700', '22,226,875,309', '21,127,010,221', '375,662,568', '18,845,330,059', '238,084,409', '18,338,638,037', '6,469,472,952', '16,988,637,757', '234,705,103', '16,164,528,769', '236,542,082', '15,878,894,181', '15,892,415,892', '384,601,333', '173,719,914', '14,374,301,195', '13,789,745,661', '13,333,600,469', '12,935,822,692', '1,414,494,923', '13,000,908,688', '2,875,324,761', '280,912,611', '12,443,874,812', '12,470,333,848', '188,668,181', '12,092,658,438', '676,583,644', '11,997,025,285', '11,677,854,811', '220,087,430', '11,251,777,539', '11,442,705,899', '8,628,429,553', '190,648,851', '11,187,421,523', '3,684,540,569', '10,670,576,444', '10,740,578,885', '10,582,331,778', '10,557,152,315', '9,804,556,177', '10,325,762,681', '10,193,777,314', '10,241,020,644', '10,218,671,348', '5,565,872,689', '6,066,496,977', '128,971,640', '160,853,134', '3,061,365,095', '8,849,393,167', '182,484,904', '161,406,328', '9,335,264,956', '158,941,175', '8,893,005,099', '132,642,660', '147,492,645', '133,898,533', '8,565,414,335', '8,543,285,361', '1,081,514,186', '8,010,900,010', '344,032,888', '7,851,320,645', '119,252,217', '7,708,770,926', '3,831,828,937', '266,060,360', '7,469,255,927', '2,553,584,433', '7,404,456,294', '1,775,993,183', '7,338,693,939', '7,337,702,662', '7,246,023,792', '3,147,875,441', 
'142,555,296', '1,953,694,528', '3,918,267,288', '1,324,557,844', '5,683,622,890', '6,927,422,982', '106,687,337', '6,912,850,849', '2,845,801,508', '6,818,774,192', '6,853,915,064', '147,347,763', '344,146,667', '6,711,901,497', '6,570,349,311', '6,519,300,790', '135,371,330', '6,472,184,188', '84,726,075', '6,224,918,718', '5,795,088,428', '5,348,330,674', '76,438,957', '6,156,100,475', '6,046,328,039', '1,572,859,369', '5,966,535,367', '5,960,854,825', '5,844,987,758', '99,526,367', '3,320,692,742', '5,763,785,447', '332,891,989', '5,673,010,795', '2,120,698,374', '5,600,425,762', '3,406,789,774']
The values are stored as strings. I thought I could convert them directly to floats and work with them further, but I can't get it to work. I assigned the list to the variable "bob" and tried the following code to convert it to float:
empty = []
der = np.array(empty, dtype = np.float32)
der = np.append(der, bob)
When I try
print(der/2)
it gives me this:
"TypeError: ufunc 'true_divide' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''"
What's going on here? Where is my mistake?
I appreciate any help! Thanks

You can use map and lambda to remove the commas and convert the list of number strings to a list of floats:
result = list(map(lambda x: float(x.replace(",", "")), list_of_string_floats))
For integers it would be:
result = list(map(lambda x: int(x.replace(",", "")), list_of_string_ints))
There is no need to use numpy.

You first have to get rid of the ",".
s = '7,051,156,075,145'
f = float(s.replace(',',''))
After this, f is the float value of the given string.
For the whole thing, you can do this:
float_list = [float(s.replace(',','')) for s in string_list]
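The TypeError in the question is consistent with der holding strings rather than numbers, so the division has nothing numeric to work on. A minimal sketch of the conversion without numpy follows, using a three-item excerpt of the list; the helper name parse_grouped_number is an assumption made for illustration. It converts to int because these values are whole numbers, and int keeps them exact.

```python
def parse_grouped_number(text):
    """Convert a string such as '7,051,156,075,145' to an int."""
    return int(text.replace(",", ""))


# Short excerpt of the 'In' list from the question.
bob = ['7,051,156,075,145', '878,009,569,197', '518,858,345']

values = [parse_grouped_number(s) for s in bob]

# Ordinary arithmetic works once the values are numeric.
halved = [v / 2 for v in values]

print(values)
```

If an array is needed afterwards, the numeric list can be passed to numpy; a 32-bit float type would not represent the largest of these values exactly, so a 64-bit type is the safer choice.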

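If L is your original list, this is readable, but note that it splits each string at the commas and returns a list of floats per string rather than one number per string: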
[[float(x) for x in s.split(',')] for s in L ]

Check if this works (Python 2 syntax):
print map(lambda x: float("".join(x.split(","))), a)
https://repl.it/H6ZL

Is it possible to use variable name in imread? Basic issue in Python

My problem is really quite simple.
I have 100 images on my computer, called 1.ppm, 2.ppm and so on up to 100.ppm.
I want to read each image to a variable using imread, and then perform a few operations. I want to do the exact same thing to all of the images.
My question is this: instead of copy-pasting one hundred times, is it possible to use imread in a loop? Something like:
for i in range(1,100):
    X=io.imread('/home/oria/Desktop/more pics/'i'.ppm')
Instead of copy pasting the same code block and just changing the picture number a hundred times, I want to do this in a loop.
I have a similar issue with numpy.load. I want to load files called ICA1 ICA2 etc up to ICA100. Is it possible to write something like
numpy.load('/home/oria/Desktop/ICA DB/ICA'i'.npy)?

Like this:
for i in range(1, 101):  # the upper bound is exclusive, so 101 includes 100.ppm
    X=io.imread('/home/oria/Desktop/more pics/%s.ppm' %(i))
Or, like this:
for i in range(1, 101):
    X=io.imread('/home/oria/Desktop/more pics/'+str(i)+'.ppm')
Go ahead and read the article on basic string operations as well as this simple article on string formatting

If I correctly understand what you're asking, it could be done as:
for i in range(1, 101):
    x = io.imread('/home/oria/Desktop/more pics/' + str(i) + '.ppm')
Note that the high end of the range function is not inclusive, so using range(1, 100) would only produce 1, 2, 3...99. Also note that i must be converted to a string or you will receive TypeError: cannot concatenate 'str' and 'int' objects.

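import cv2
import os
def load_images_from_folder(folder):
    images = []
    # os.listdir returns the entries in arbitrary order
    for filename in os.listdir(folder):
        img = cv2.imread(os.path.join(folder, filename))
        # cv2.imread returns None for files it cannot read as images
        if img is not None:
            images.append(img)
    return images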

Just use str.format, passing the variable i:
for i in range(1, 101):  # the upper bound is exclusive, so 101 includes 100.ppm
    X = io.imread('/home/oria/Desktop/more pics/{}.ppm'.format(i))
When you want to load with numpy do the same thing again:
for i in range(1, 101):
    X = numpy.load('/home/oria/Desktop/ICA DB/ICA{}.npy'.format(i))
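Both loops in the question follow the same pattern, so the file names can be generated once and then handed to whichever loader is needed. The sketch below only builds the paths, using pathlib and an f-string; the helper name numbered_paths and its parameters are hypothetical, and the folder names are taken from the question.

```python
from pathlib import Path


def numbered_paths(folder, prefix, suffix, first=1, last=100):
    """Return paths such as folder/ICA1.npy ... folder/ICA100.npy (last is inclusive)."""
    return [Path(folder) / f"{prefix}{i}{suffix}" for i in range(first, last + 1)]


image_paths = numbered_paths('/home/oria/Desktop/more pics', '', '.ppm')
ica_paths = numbered_paths('/home/oria/Desktop/ICA DB', 'ICA', '.npy')

print(image_paths[0].name, image_paths[-1].name)
```

Each path can then be passed to the loader inside a loop, for example io.imread(str(path)) for the images or numpy.load(path) for the arrays.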

Cast an array shape into a string

I need to convert the output of a 2D array's myarray.shape into a string, because I want to isolate the rows and columns and reassign them as height and width for an image that I've read in, WITHOUT using PIL.
I tried (str)image1.shape but it just gave a syntax error.
What's the correct way to do this?

It's str(image1.shape). If you want to then parse it (say it's (50,2)), you could do this:
myshape = str(image1.shape) # returns '(50, 2)'
part1, part2 = myshape.split(', ')
part1 = part1[1:] # now is '50'
part2 = part2[:-1] # now is '2'
Or, since you're really after the numbers (I think), just skip the str() step and directly parse the output of image1.shape:
firstnum, secondnum = image1.shape
and you're done.
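Since the shape is an ordinary tuple, the unpacking approach can be sketched without any image library. The tuples below are stand-ins for image1.shape, and the variable names are assumptions; taking the first two entries also covers colour images, whose shape usually carries a third entry for the channels.

```python
# Stand-in for image1.shape of a 2D (greyscale) array: (rows, columns).
shape_2d = (50, 2)
height, width = shape_2d

# Stand-in for a colour image: (rows, columns, channels).
shape_3d = (480, 640, 3)
colour_height, colour_width = shape_3d[:2]

# A string is only needed for display, and can be built from the numbers.
label = "{}x{}".format(width, height)

print(label)
```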
