Single quotes around list elements that should be floats - python

I am asked to "return a list of tuples containing the subset name (as a string) and a list of floating point data values".
My code is:
def load_data(filename):
    fileopen = open(filename)
    result_open = []
    for line in fileopen:
        answer = line.strip().split(",")
        result_open.append((answer[0], answer[1:]))
    return result_open
However, when I run the code, the following appears:
[('Slow Loris', [' 21.72', ' 29.3', ' 20.08', ' 29.98', ' 29.85', ' 26.22', ' 19......)]
Is there any way to get the list without the quotes, i.e. as floats rather than strings? I want it to look like:
[('Slow Loris', [21.72, 29.3, 20.08, 29.98, 29.85, 26.22, 19......)]

line is a string, and line.strip().split(",") is a list of strings. You need to convert the string values into float or Decimal values. One way would be:
result_open.append((answer[0], [float(val) for val in answer[1:]]))
That will raise an exception on values that can't be converted to a float, so you should think about how you want to handle such input.
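A fuller sketch of the same idea, assuming each line looks like "name, value, value, ...". The with block, the blank-line check, and the choice to silently skip values that cannot be converted are illustrative assumptions, not requirements from the question:

```python
def load_data(filename):
    """Return a list of (name, [float, ...]) tuples read from a comma-separated file."""
    result = []
    with open(filename) as handle:
        for line in handle:
            fields = line.strip().split(",")
            if not fields[0]:
                continue  # skip blank lines
            values = []
            for raw in fields[1:]:
                try:
                    values.append(float(raw))  # float() tolerates surrounding spaces
                except ValueError:
                    pass  # assumption: ignore entries that are not numbers
            result.append((fields[0], values))
    return result
```

If bad values should stop the program instead, remove the try/except and let the ValueError propagate.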

Related

Read text file to create list and then convert to dictionary python

I am reading text file using
source= open(curr_file,"r")
lines= source.readlines()
This converts every line in my text file to a list, but some items in my list are created with double quotes while some are created with single quotes like below.
['[INFO] Name: Xxxx, section: yyyy, time: 21.2, status: 0\n', "proof:proof1,table: db.table_name,columns:['column_1'],count:10,status:SUCCESS\n",'run time: 30 seconds\n']
The first item in list is created with single quotes, while the second is created with double quotes.
When I try to convert the above to a dictionary with
new_line = dict(x.split(":", 1) for x in line.split(","))
I get a ValueError:
ValueError: dictionary update sequence element has length 1; 2 is required
I think the error occurs because the entire double-quoted string is treated as a single value, so it cannot be converted to a dictionary.
Is there a way to get single quotes instead of double quotes? I tried replace and strip, but nothing helps.
Expected output:
{
Name:Xxxx,
section:yyyy,
time:21.2,
proof:proof1
table:db.table_name
status: success
}
The quotes have nothing to do with the error. The outer quotes of each line are not part of the str object; they are only printed so you know it is a str. Python switches to double quotes when the content itself contains single quotes, because single quotes could then not be used to delimit the str. Again, that is only a change in what is printed, not in what is stored in memory.
Try to do it in steps and print the intermediate objects you get to debug the program.
for x in line: #prints nicer than print(line)
    print(x)
arg = [x.split(":",1) for x in line.split(",")]
for x in arg:
    print(x)
new_line = dict(arg)
Each printed item should be a list with exactly two elements; any item with only one element is what triggers the ValueError.
To convert a single line (a str) to a dict, you can pass a generator expression to dict:
new_line = dict(x.split(":", 1) for x in line.split(","))
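A minimal sketch of that conversion that also avoids the ValueError, assuming comma-separated "key: value" items. The helper name parse_pairs and the decision to skip items that have no colon are assumptions made for illustration:

```python
def parse_pairs(line):
    """Parse 'key: value, key: value' text into a dict of stripped strings."""
    pairs = {}
    for item in line.strip().split(","):
        if ":" not in item:
            continue  # an item without a colon would give a 1-element split
        key, value = item.split(":", 1)  # split on the first colon only
        pairs[key.strip()] = value.strip()
    return pairs


line = "[INFO] Name: Xxxx, section: yyyy, time: 21.2, status: 0\n"
print(parse_pairs(line))
# {'[INFO] Name': 'Xxxx', 'section': 'yyyy', 'time': '21.2', 'status': '0'}
```

The values stay strings here; converting "21.2" to a float would be a separate step.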

Format string output

With this Python code I can read all the tickers in the tickers.txt file:
fh = open("tickers.txt")
tickers_list = fh.read()
print(tickers_list)
The output that I obtain is this:
A2A.MI, AMP.MI, ATL.MI, AZM.MI, BGN.MI, BMED.MI, BAMI.MI,
Nevertheless, I'd like to obtain as output a ticker string formatted exactly like this:
["A2A.MI", "AMP.MI", "ATL.MI", "AZM.MI", ...]
Any idea?
Thanks in advance.
To get the output in exactly that format, you could do the following:
tickers_list= "A2A.MI, AMP.MI, ATL.MI, AZM.MI, BGN.MI, BMED.MI, BAMI.MI"
print("["+"".join(['"' + s + '",' for s in tickers_list.split(",")])[:-1]+"]")
With the output:
["A2A.MI"," AMP.MI"," ATL.MI"," AZM.MI"," BGN.MI"," BMED.MI"," BAMI.MI"]
Code explanation:
['"' + s + '",' for s in tickers_list.split(",")]
Creates a list of strings, each holding one value wrapped in double quotes and followed by a comma.
"".join(...)[:-1]
Joins the list of strings into one string and removes the last character, which is the extra comma.
"["+..+"]"
Adds the enclosing brackets.
Another alternative is to simply use:
print(tickers_list.split(","))
However, the output will be slightly different:
['A2A.MI', ' AMP.MI', ' ATL.MI', ' AZM.MI', ' BGN.MI', ' BMED.MI', ' BAMI.MI']
with ' instead of ".
One way around that is:
z = str(tickers_list.split(","))
z = z.replace("'",'"')
print(z)
This gives the desired quoting by replacing each ' with ".
You can use the split method:
tickers_list = fh.read().split(',')
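If the goal is the exact double-quoted form without the leading spaces, one option (a sketch that is not taken from the answers above) is to strip each ticker and let json.dumps do the quoting. The sample string stands in for the contents of tickers.txt, including its trailing comma:

```python
import json

# Stand-in for fh.read(); the real file content may differ.
text = "A2A.MI, AMP.MI, ATL.MI, AZM.MI, BGN.MI, BMED.MI, BAMI.MI,\n"

# Strip whitespace and drop the empty piece left by the trailing comma.
tickers = [t.strip() for t in text.split(",") if t.strip()]

formatted = json.dumps(tickers)  # JSON always uses double quotes
print(formatted)
# ["A2A.MI", "AMP.MI", "ATL.MI", "AZM.MI", "BGN.MI", "BMED.MI", "BAMI.MI"]
```

Note that tickers is a real list, while formatted is only its string representation.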

How to read text file as list of floats? [duplicate]

This seems like a simple question, but I couldn't find an answer on the Stack community. I have a dataset like the one below in a text file. I would like to read it in as a list with each value as a float. Currently the output is far from the simple list I need (also below).
data.txt:
[1130.1271455966723, 1363.3947962724474, 784.433380329118, 847.2140341725295, 803.0276763894814,..]
Code attempted:
my_file = open(r"data.txt", "r")
content = my_file.read()
content_list = content.split(",")
my_file.close()
The output is odd. The values are strings, the brackets end up inside the first and last elements, and there are added spaces:
Current result:
['[1130.1271455966723',
' 1363.3947962724474',
' 784.433380329118',
' 847.2140341725295',
' 803.0276763894814',
' 913.7751118925291',
' 1055.3775618432019',...]']
I also tried the approach here (How to convert string representation of list to a list?) with the following code but produced an error:
import ast
x = ast.literal_eval(result)
raise ValueError('malformed node or string: ' + repr(node))
ValueError: malformed node or string: ['[1130.1271455966723', '1363.3947962724474', ' 784.433380329118', ' 847.2140341725295', ' 803.0276763894814',...]']
Ideal result:
list = [1130.1271455966723, 1363.3947962724474, 784.433380329118, 847.2140341725295, 803.0276763894814]
Your data is valid JSON, so just use the corresponding module that will take care of all the parsing for you:
import json
with open("data.txt") as f:
    data = json.load(f)
print(data)
Output:
[1130.1271455966723, 1363.3947962724474, 784.433380329118, 847.2140341725295, 803.0276763894814]
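For reference, the ast.literal_eval attempt in the question most likely failed because it was given the already-split list rather than the raw file text (an inference from the error message, which shows a list). A minimal sketch, using a shortened inline string in place of data.txt, showing that both parsers accept the raw string:

```python
import ast
import json

# Stand-in for the text returned by my_file.read(); shortened for illustration.
content = "[1130.1271455966723, 1363.3947962724474, 784.433380329118]"

values_from_json = json.loads(content)       # parse the raw text as JSON
values_from_ast = ast.literal_eval(content)  # parse it as a Python literal

# Passing the result of content.split(",") instead reproduces the error.
failed = False
try:
    ast.literal_eval(content.split(","))
except ValueError:
    failed = True

print(values_from_json)
```

Either parser returns a list of floats; json.load has the small advantage of reading directly from the open file.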

Converting a string to floats error

So I am trying to run this piece of code:
reader = list(csv.reader(open('mynew.csv', 'rb'), delimiter='\t'))
print reader[1]
number = [float(s) for s in reader[1]]
Inside reader[1] I have the following values:
'5/1/2013 21:39:00.230', '46.09', '24.76', '0.70', '0.53', '27.92',
I am trying to store each of the values in a list like so:
number[0] = 46.09
number[1] = 24.76
and so on.....
My question is: how would I skip the date and the number following it and just store the legitimate floats? Or store the comma-separated contents in a list?
It throws an error when I try to run the code above:
ValueError: invalid literal for float(): 5/1/2013 21:39:00.230
Thanks!
Just skip values which cannot be converted to float:
number = []
for s in reader[1]:
    try:
        number.append(float(s))
    except ValueError:
        pass
If it's always the first value that's not a float, you can take it out doing:
reader = list(csv.reader(open('mynew.csv', 'rb'), delimiter='\t'))
print reader[1]
number = [float(s) for s in reader[1][1:]]
Or you can check for / in the string and skip the value if it is present, something like this:
my_list_results = []
my_list = ['5/1/2013 21:39:00.230', '46.09', '24.76', '0.70', '0.53', '27.92']
for m in my_list:
    if '/' not in m: #if we don't find /
        my_list_results.append(m)
print(my_list_results)
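number = []
for s in reader[1]:
    number.append(int(float(s)))
Note that int(float(s)) converts each string to a float and then truncates it to an integer; drop the int call to keep the float. It still raises ValueError on the date string.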
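The code in this thread is Python 2. A Python 3 sketch of the try/except approach follows, using an in-memory tab-separated sample instead of mynew.csv; the helper name floats_only is an assumption for illustration:

```python
import csv
import io


def floats_only(row):
    """Return the cells of a row that parse as floats, skipping the rest."""
    numbers = []
    for cell in row:
        try:
            numbers.append(float(cell))
        except ValueError:
            continue  # e.g. the date/time string
    return numbers


# Stand-in for open('mynew.csv', newline=''); one tab-separated row.
sample = io.StringIO("5/1/2013 21:39:00.230\t46.09\t24.76\t0.70\t0.53\t27.92\n")
row = next(csv.reader(sample, delimiter="\t"))
print(floats_only(row))
# [46.09, 24.76, 0.7, 0.53, 27.92]
```

In Python 3 the file should be opened in text mode with newline='' rather than 'rb'.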

python: find and replace numbers < 1 in text file

I'm pretty new to Python programming and would appreciate some help to a problem I have...
Basically I have multiple text files which contain velocity values as such:
0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00
etc for many lines...
What I need to do is convert all the values in the text file that are less than 1 (e.g. 0.137865E+00 above) to an arbitrary value of 0.100000E+01. While it seems pretty simple to replace specific values with the 'replace()' method and a while loop, how do you do this if you want to replace a range?
thanks
I think when you are beginning programming, it's useful to see some examples; and I assume you've tried this problem on your own first!
Here is a break-down of how you could approach this:
contents='0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00'
The split method works on strings. It returns a list of strings. By default, it splits on whitespace:
string_numbers=contents.split()
print(string_numbers)
# ['0.259515E+03', '0.235095E+03', '0.208262E+03', '0.230223E+03', '0.267333E+03', '0.217889E+03', '0.156233E+03', '0.144876E+03', '0.136187E+03', '0.137865E+00']
The map function applies its first argument (the function float) to each element of its second argument (the list string_numbers). The float function converts each string into a floating-point object. In Python 3, map returns an iterator, so wrap it in list to get something you can print:
float_numbers=list(map(float,string_numbers))
print(float_numbers)
# [259.515, 235.095, 208.262, 230.223, 267.333, 217.889, 156.233, 144.876, 136.187, 0.137865]
You can use a list comprehension to process the list, converting numbers less than 1 into the number 1. The conditional expression (1 if num<1 else num) equals 1 when num is less than 1; otherwise, it equals num.
processed_numbers=[(1 if num<1 else num) for num in float_numbers]
print(processed_numbers)
# [259.515, 235.095, 208.262, 230.223, 267.333, 217.889, 156.233, 144.876, 136.187, 1]
This is the same thing, all in one line:
processed_numbers=[(1 if num<1 else num) for num in map(float,contents.split())]
To generate a string out of the elements of processed_numbers, you could use the str.join method:
comma_separated_string=', '.join(map(str,processed_numbers))
# '259.515, 235.095, 208.262, 230.223, 267.333, 217.889, 156.233, 144.876, 136.187, 1'
A typical technique would be:
read file line by line
split each line into a list of strings
convert each string to the float
compare converted value with 1
replace when needed
write back to the new file
Since you haven't posted any code yet, I hope this is a good start:
def float_filter(text):
    for number in text.split():
        if float(number) < 1.0:
            yield "0.100000E+01"
        else:
            yield number

text = "0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00"
print(" ".join(float_filter(text)))
import numpy as np
a = np.genfromtxt('file.txt') # read file
a[a<1] = 1.0 # replace values below 1 with 1.0 (i.e. 0.100000E+01)
np.savetxt('converted.txt', a) # save to file
You could use regular expressions for parsing the string. I'm assuming here that the mantissa is never larger than 1 (i.e., it begins with 0). This means that for the number to be less than 1, the exponent must be either 0 or negative. The following regular expression matches '0', '.', one or more decimal digits, 'E' and either '+00' or '-' and two decimal digits.
0\.\d+E(-\d\d|\+00)
Assuming that you have read the file into the variable 'text', you can use the regexp with the following Python code:
import re
result = re.sub(r"0\.\d+E(-\d\d|\+00)", "0.100000E+01", text)
Edit: Just realized that the description doesn't limit the valid range of input numbers to positive numbers. Negative numbers can be matched with the following regexp:
-0\.\d+E[-+]\d\d
This can be combined with the first one using the (pattern1|pattern2) syntax, which results in the following Python code:
result = re.sub(r"(0\.\d+E(-\d\d|\+00)|-0\.\d+E[-+]\d\d)", "0.100000E+01", text)
Also, if there's a chance that the exponent goes past 99, the regexp can be further modified by adding a '+' sign after the '\d\d' patterns. This allows matching exponents of two OR MORE digits.
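A runnable sketch of the combined idea, under the same assumption that every mantissa starts with "0.". The lookbehind and lookahead guards and the helper name clamp_small_values are additions for illustration, not part of the answer above:

```python
import re

# Branch 1: any negative value (-0.dddE+xx or -0.dddE-xx).
# Branch 2: a positive value with a negative or all-zero exponent.
# The guards stop a match from starting or ending inside a longer number.
BELOW_ONE = re.compile(
    r"(?<![\d.])(?:-0\.\d+E[-+]\d+|(?<!-)0\.\d+E(?:-\d+|\+0+))(?!\d)"
)


def clamp_small_values(text, replacement="0.100000E+01"):
    """Replace every value below 1 with the replacement string."""
    return BELOW_ONE.sub(replacement, text)


sample = "0.259515E+03 0.137865E+00 0.500000E-02 -0.300000E+02"
print(clamp_small_values(sample))
# 0.259515E+03 0.100000E+01 0.100000E+01 0.100000E+01
```

Because this works on the text alone, the untouched values keep their original formatting; the trade-off is that it depends entirely on the fixed "0." mantissa layout.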
I've got the script working as I want now...thanks people.
When writing the list to a new file I used the replace method to get rid of the brackets and commas - is there a simpler way?
ftext = open("C:\\Users\\hhp06\\Desktop\\out.grd", "r")
otext = open("C:\\Users\\hhp06\\Desktop\\out2.grd", "w+")
for line in ftext:
    stringnum = line.split()
    floatnum = map(float, stringnum)
    procnum = [(1.0 if num<1 else num) for num in floatnum]
    stringproc = str(procnum)
    s = (stringproc).replace(",", " ").replace("[", " ").replace("]", "")
    otext.writelines(s + "\n")
otext.close()
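One simpler way to build each output line, sketched here on an in-memory line instead of the .grd files: join the converted values with spaces rather than calling str on the whole list and then stripping brackets and commas. The helper name clamp_line and the "%.6E" format are illustrative assumptions; note that Python's E formatting yields 2.595150E+02, not the 0.259515E+03 layout of the input.

```python
def clamp_line(line, minimum=1.0):
    """Return the line with every value below `minimum` raised to `minimum`."""
    values = [max(float(token), minimum) for token in line.split()]
    # Format each value in scientific notation and join with single spaces.
    return " ".join("%.6E" % value for value in values)


lines = ["0.259515E+03 0.235095E+03 0.137865E+00\n"]
for line in lines:
    print(clamp_line(line))
# 2.595150E+02 2.350950E+02 1.000000E+00
```

In the real script, each result would be written with otext.write(clamp_line(line) + "\n"), ideally with both files opened in a with block so they are closed automatically.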
