convert strings in list to float - python
This issue is driving me nuts. I have scraped data from a website and put it into a dictionary. As a result I have a couple of lists, and one of them looks like this:
'In': ['7,051,156,075,145', '878,009,569,197', '427,386,441,994', '278,189,230,134', '230,599,954,634', '197,088,252,840', '101,610,549,933', '78,426,830,219', '80,925,933,532', '58,451,193,176', '55,701,282,247', '49,748,756,546', '48,642,591,960', '45,686,162,172', '44,227,235,911', '40,467,951,256', '16,392,881,465', '5,988,546,624', '41,810,356,569', '23,515,110,330', '35,815,116,718', '10,968,016,226', '518,858,345', '29,947,210,177', '29,030,975,280', '28,803,225,552', '373,570,428', '27,527,784,709', '373,964,822', '514,671,410', '25,875,702,735', '416,462,736', '24,423,209,779', '24,332,893,924', '22,491,450,198', '22,894,037,015', '23,026,866,310', '6,148,324,700', '22,226,875,309', '21,127,010,221', '375,662,568', '18,845,330,059', '238,084,409', '18,338,638,037', '6,469,472,952', '16,988,637,757', '234,705,103', '16,164,528,769', '236,542,082', '15,878,894,181', '15,892,415,892', '384,601,333', '173,719,914', '14,374,301,195', '13,789,745,661', '13,333,600,469', '12,935,822,692', '1,414,494,923', '13,000,908,688', '2,875,324,761', '280,912,611', '12,443,874,812', '12,470,333,848', '188,668,181', '12,092,658,438', '676,583,644', '11,997,025,285', '11,677,854,811', '220,087,430', '11,251,777,539', '11,442,705,899', '8,628,429,553', '190,648,851', '11,187,421,523', '3,684,540,569', '10,670,576,444', '10,740,578,885', '10,582,331,778', '10,557,152,315', '9,804,556,177', '10,325,762,681', '10,193,777,314', '10,241,020,644', '10,218,671,348', '5,565,872,689', '6,066,496,977', '128,971,640', '160,853,134', '3,061,365,095', '8,849,393,167', '182,484,904', '161,406,328', '9,335,264,956', '158,941,175', '8,893,005,099', '132,642,660', '147,492,645', '133,898,533', '8,565,414,335', '8,543,285,361', '1,081,514,186', '8,010,900,010', '344,032,888', '7,851,320,645', '119,252,217', '7,708,770,926', '3,831,828,937', '266,060,360', '7,469,255,927', '2,553,584,433', '7,404,456,294', '1,775,993,183', '7,338,693,939', '7,337,702,662', '7,246,023,792', '3,147,875,441', 
'142,555,296', '1,953,694,528', '3,918,267,288', '1,324,557,844', '5,683,622,890', '6,927,422,982', '106,687,337', '6,912,850,849', '2,845,801,508', '6,818,774,192', '6,853,915,064', '147,347,763', '344,146,667', '6,711,901,497', '6,570,349,311', '6,519,300,790', '135,371,330', '6,472,184,188', '84,726,075', '6,224,918,718', '5,795,088,428', '5,348,330,674', '76,438,957', '6,156,100,475', '6,046,328,039', '1,572,859,369', '5,966,535,367', '5,960,854,825', '5,844,987,758', '99,526,367', '3,320,692,742', '5,763,785,447', '332,891,989', '5,673,010,795', '2,120,698,374', '5,600,425,762', '3,406,789,774']
The values have been stored as strings, which I think could be directly converted to floats, but I don't know how. I thought I could just convert those values into floats and work with them further. However, I can't get it to work. I assigned the list to the variable "bob" and tried the following code to convert it to float:
empty = []
der = np.array(empty, dtype = np.float32)
der = np.append(der, bob)
When I try
print(der/2)
it gives me this:
"TypeError: ufunc 'true_divide' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''"
What's going on here? Where is my mistake?
I appreciate any help! Thanks
You can use map and a lambda to remove the commas and convert the list of strings to a list of floats:
result = list(map(lambda x: float(x.replace(",", "")), list_of_string_floats))
For integers it would be:
result = list(map(lambda x: int(x.replace(",", "")), list_of_string_ints))
There is no need to use NumPy.
You first have to get rid of the ",".
s = '7,051,156,075,145'
f = float(s.replace(',',''))
After this, f is the float value of the given string.
For the whole thing, you can do this:
float_list = [float(s.replace(',','')) for s in string_list]
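As a minimal sketch of the approach above, assuming every value is a whole number with ',' used only as a thousands separator. The helper name parse_grouped_number and the shortened sample list are illustrative and not from the original post:

```python
# Hypothetical helper: strip the thousands separators, then convert to float.
def parse_grouped_number(text):
    return float(text.replace(",", ""))

# A shortened sample of the scraped list (the name `bob` comes from the question).
bob = ['7,051,156,075,145', '878,009,569,197', '518,858,345']

values = [parse_grouped_number(s) for s in bob]
halved = [v / 2 for v in values]  # arithmetic now works because these are real floats

print(values)
print(halved)
```

If a NumPy array is still wanted, passing the converted list to np.array(values, dtype=np.float64) should work. float32, as used in the question, cannot represent numbers of this size exactly.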
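The most readable, if L is your original list, is to split each string on the commas and join the groups back together before converting (converting each group separately would give a list of numbers per string instead of one float):
[float(''.join(s.split(','))) for s in L]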
Check if this works:
print(list(map(lambda x: float("".join(x.split(","))), a)))
https://repl.it/H6ZL
Related
In Python, how to remove items in a list based on the specific string format?
I have a Python list as below:
merged_cells_lst = [ 'P19:Q19 'P20:Q20 'P21:Q21 'P22:Q22 'P23:Q23 'P14:Q14 'P15:Q15 'P16:Q16 'P17:Q17 'P18:Q18 'AU9:AV9 'P10:Q10 'P11:Q11 'P12:Q12 'P13:Q13 'A6:P6 'A7:P7 'D9:AJ9 'AK9:AQ9 'AR9:AT9' 'A1:P1' ]
I only want to unmerge the cells in the P and Q columns. Therefore, I seek to remove any strings/items in merged_cells_lst that do not have the format "P##:Q##". I think that regex is the best and simplest way to go about this. So far I have the following:
for item in merge_cell_lst:
    if re.match(r'P*:Q*'):
        pass
    else:
        merged_cell_lst.pop(item)
print(merge_cell_lst)
The code, however, is not working. I could use any additional tips/help. Thank you!
Modifying a list while looping over it causes trouble. You can use a list comprehension instead to create a new list. Also, you need a different regular expression. The current pattern P*:Q* matches PP:QQQ, :Q, or even :, but not P19:Q19.
import re
merged_cells_lst = ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15', 'P16:Q16', 'P17:Q17', 'P18:Q18', 'AU9:AV9', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13', 'A6:P6', 'A7:P7', 'D9:AJ9', 'AK9:AQ9', 'AR9:AT9', 'A1:P1']
p = re.compile(r"P\d+:Q\d+")
output = [x for x in merged_cells_lst if p.match(x)]
print(output)
# ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15',
# 'P16:Q16', 'P17:Q17', 'P18:Q18', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13']
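One detail worth a sketch: re.match only anchors at the start of the string, so a pattern like P\d+:Q\d+ would also accept an item with trailing text. Assuming whole-string matching is what is wanted, fullmatch is stricter. The shortened list and the 'P7:Q7extra' entry are made-up sample data:

```python
import re

# Shortened sample; 'P7:Q7extra' is a hypothetical malformed entry.
cells = ['P19:Q19', 'AU9:AV9', 'A6:P6', 'P10:Q10', 'P7:Q7extra']

pattern = re.compile(r"P\d+:Q\d+")

# fullmatch requires the entire string to fit the pattern.
kept = [c for c in cells if pattern.fullmatch(c)]
print(kept)
```

With match instead of fullmatch, the malformed entry would be kept as well, because only its prefix has to fit the pattern.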
Your list has some typos; it should look something like this:
merged_cells_lst = [ 'P19:Q19', 'P20:Q20', 'P21:Q21', ...]
Then something as simple as:
x = [k for k in merged_cells_lst if k[0] == 'P']
would work. This assumes you know a priori that the items you want to keep follow the Pxx:Qxx format. If you want a dynamic solution, you can replace the condition in the list comprehension with a regex match.
comparing two lists with different formats
I have two lists:
influx = [u'mphhos-fnwp-010101-2', u'mphhos-fnwp-010101-1', u'mphhos-fnwp-010101-7', u'mphhos-fnwp-010101-10', u'mphhos-fnwp-010101-9', u'mphhos-fnwp-010101-4', u'mphhos-fnwp-010101-3', u'mphhos-fnwp-010101-8', u'mphhos-fnwp-010101-6', u'mphhos-fnwp-010101-5', u'mphhos-fnwp-010101-11']
etcd = [u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-4', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-9', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-1', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-10', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-3', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-6', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-7', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-8', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-11', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-2', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-5']
etcd is the parent list and I want to compare influx with etcd.
1.) I want to get all elements which are not present in the list influx and return them.
2.) How can I convert the etcd list into the influx list format by omitting /xymon/fnwp/mphhos/?
Either of the above will get me my solution. I tried lots of methods, but I am not getting my solution because the lists are in different formats. I would get my answer by doing set(etcd)-set(influx), but as they are in different formats I am getting all the items in the list.
Use str.rsplit:
[x for x in etcd if x.rsplit('/', 1)[1] not in influx]
Per rafaelc's suggestion, converting influx to a set first makes the membership test faster:
infx = set(influx)
[x for x in etcd if x.rsplit('/', 1)[1] not in infx]
One simple solution would be to remove the prefixes:
for i, char in enumerate(etcd):
    char = char.replace('/xymon/fnwp/mphhos/', '')
    etcd[i] = char
And then you could find the differences using set().
influx = [u'mphhos-fnwp-010101-2', u'mphhos-fnwp-010101-1', u'mphhos-fnwp-010101-7', u'mphhos-fnwp-010101-10', u'mphhos-fnwp-010101-9', u'mphhos-fnwp-010101-4', u'mphhos-fnwp-010101-3', u'mphhos-fnwp-010101-8', u'mphhos-fnwp-010101-6', u'mphhos-fnwp-010101-5', u'mphhos-fnwp-010101-11']
etcd = [u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-4', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-9', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-1', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-10', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-3', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-6', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-7', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-8', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-11', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-2', u'/xymon/fnwp/mphhos/mphhos-fnwp-010101-5']
etcd = [x.replace('/xymon/fnwp/mphhos/', '') for x in etcd]
# or using regex
# etcd = [re.sub('/xymon/fnwp/mphhos/', '', x) for x in etcd]
diff = set(etcd) - set(influx)
print(diff)
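A short sketch combining the ideas from the answers above: keep the full etcd paths, but compare only their last path component against a set built from influx. The lists are shortened, and the '-12' entry is a made-up example added so that the result is not empty:

```python
# Shortened sample data; the '-12' path is hypothetical.
influx = ['mphhos-fnwp-010101-2', 'mphhos-fnwp-010101-1']
etcd = [
    '/xymon/fnwp/mphhos/mphhos-fnwp-010101-1',
    '/xymon/fnwp/mphhos/mphhos-fnwp-010101-2',
    '/xymon/fnwp/mphhos/mphhos-fnwp-010101-12',
]

known = set(influx)  # set lookup avoids rescanning the list for every path

# rsplit('/', 1)[-1] takes everything after the last slash.
missing = [path for path in etcd if path.rsplit('/', 1)[-1] not in known]
print(missing)
```

Unlike a plain set difference, this keeps the original order of etcd and returns the full paths.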
How do I turn each element in a list into a string with quotes
I am using the PyCharm IDE. I frequently work with large data sets, and sometimes I have to iterate through each item. For instance, I have a list
ticker_symbols = [500.SI, 502.SI, 504.SI, 505.SI, 508.SI, 510.SI, 519.SI...]
How do I automatically format each element into a string with quotes, like this?
ticker_symbols = ['500.SI', '502.SI', '504.SI', '505.SI', '508.SI', '510.SI', '519.SI'...]
Is there a shortcut in PyCharm?
You can just do something like:
ticker_symbols = '[500.SI,502.SI,504.SI,505.SI,508.SI,510.SI,519.SI]'
print(ticker_symbols[1:-1].split(','))
Or, with spaces as in your list:
ticker_symbols = '[500.SI, 502.SI, 504.SI, 505.SI, 508.SI, 510.SI, 519.SI]'
print(ticker_symbols[1:-1].split(', '))
Both produce:
['500.SI', '502.SI', '504.SI', '505.SI', '508.SI', '510.SI', '519.SI']
You can use a list comprehension:
temp_list = ["'{}'".format(x) for x in ticker_symbols]
This results in:
['500.SI', '502.SI', '504.SI',...]
This will convert your list elements to strings:
ticker_symbols = str(ticker_symbols[1:-1].split(', '))
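If the unquoted symbols are available as pasted text, a small sketch along the lines of the answers above could build the quoted list while tolerating uneven spacing. The variable raw and its contents are assumed sample input:

```python
# Assumed input: the bare symbols pasted as one string.
raw = "[500.SI, 502.SI,504.SI ,  505.SI]"

# Drop the brackets, split on commas, and trim whitespace around each symbol.
ticker_symbols = [tok.strip() for tok in raw.strip("[]").split(",") if tok.strip()]

# Printing the list shows each element with quotes, ready to paste back into code.
print(ticker_symbols)
```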
limit a float list into 10 digits
I have a list imported from a data file.
lines=['1628.246', '100.0000', '0.4563232E-01', '0.4898217E-01', '0.3017656E-02', '0.2271272', '0.2437533', '0.1500232E-01', '0.4102987', '0.4117742', '0.5461504E-02', '2.080838', '0.5527303E-03', '-0.4542367E-03', '-0.2238781E-01', '-0.8196812E-03', '-0.3796306E-01', '-0.7906407E-03', '-0.6738000E-03', '0.000000']
I want to generate a new list with all elements formatted to the same 10 decimal digits and write it back to the file. Here is what I did:
newline=map(float,lines)
newline=map("{:.10f}".format,newline)
newline=map(str,newline)
jitterfile.write(join(newline)+'\n')
It works, but it does not look beautiful. Any idea how to make it look better?
You can do it in a single line like so:
newline=["{:.10f}".format(float(i)) for i in lines]
jitterfile.write(join(newline)+'\n')
Of note, your third instruction newline=map(str,newline) is redundant, as the entries in the list are already strings, so casting them is unnecessary.
The map function also accepts a lambda. Also, as the result of format is a string, you don't need to apply str to your list, and you need to use join with a delimiter like ',':
>>> newline=map(lambda x:"{:.10f}".format(float(x)),newline)
>>> newline
['1628.2460000000', '100.0000000000', '0.0456323200', '0.0489821700', '0.0030176560', '0.2271272000', '0.2437533000', '0.0150023200', '0.4102987000', '0.4117742000', '0.0054615040', '2.0808380000', '0.0005527303', '-0.0004542367', '-0.0223878100', '-0.0008196812', '-0.0379630600', '-0.0007906407', '-0.0006738000', '0.0000000000']
jitterfile.write(','.join(newline)+'\n')
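Putting both answers together for Python 3, where map returns an iterator rather than a list, a minimal sketch might look like this. The io.StringIO object stands in for the real jitterfile, and the comma delimiter is an assumption carried over from the answer above:

```python
import io

# Shortened sample of the list from the question.
lines = ['1628.246', '0.4563232E-01', '-0.4542367E-03', '0.000000']

# Parse each string as a float, then format it with exactly 10 decimal places.
formatted = ["{:.10f}".format(float(s)) for s in lines]

# Stand-in for the real output file; ',' as delimiter is an assumption.
jitterfile = io.StringIO()
jitterfile.write(",".join(formatted) + "\n")

print(formatted)
```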
convert unicode into list
When I convert a PNG image into blocks and then add the section sign (§), I convert it to a string using:
lframe = [e.encode('utf-8') for e in frame.split(',')]
but when I do, it gives me:
['\xc2\xa70\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xc2\xa76\xe2\x96\x88\xe2\x96\x88\xc2\xa70\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xc2\xa7r']
What I want to do is find a way to convert my output into something like
['\xc2','\xa70','\xe2','\x96','\x88'...]
Thanks!
The code below should do what you want.
lframe = [x for x in [e.encode('utf-8') for e in frame.split(',')][0]]
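The answer above relies on Python 2, where iterating over an encoded string yields one-character strings. In Python 3, iterating over bytes yields integers, so a sketch of the same idea has to wrap each value back into a bytes object. The short frame value below is an assumed sample made of a section sign, a digit and one block character:

```python
# Assumed sample: section sign, the digit 0, and one full-block character.
frame = "\u00a70\u2588"

encoded = frame.encode("utf-8")

# Iterating over bytes gives ints in Python 3; bytes([b]) turns each int
# back into a one-byte bytes object.
single_bytes = [bytes([b]) for b in encoded]
print(single_bytes)
```

Note that every element is exactly one byte, so the '\xa70' item from the question comes out as two elements, the byte \xa7 and the ASCII character 0.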