I have a dataset that looks like this:
0 _ _ 23.0186E-03
10 _ _51.283E-03
20 _ _125.573E-03
where the numbers are lined up line by line (the underscores represent spaces).
The numbers in the right hand column are currently part of the line's string. I am trying to convert the numbers on the right into numerical values (0.0230186 etc). I can convert them with int() once they are in a simple numerical form, but I need to change the "E"s to get there. If you know how to change it for any value of E such as E-01, E-22 it would be very helpful.
Currently my code looks like so:
fin = open( 'stringtest1.txt', "r" )
fout = open("stringtest2.txt", "w")
while 1:
    x=fin.readline()
    a=x[5:-1]
    ##conversion code should go here
    if not x:
        break
fin.close()
fout.close()
I would suggest the following for the conversion:
float(x.split()[-1])
str.split() will split on white space when no arguments are provided, and float() will convert the string into a number, for example:
>>> '20 125.573E-03'.split()
['20', '125.573E-03']
>>> float('20 125.573E-03'.split()[-1])
0.12557299999999999
You should use context managers, and file objects are iterable:
with open('test1.txt') as fhi, open('test2.txt', 'w') as fho:
    for line in fhi:
        f = float(line.split()[-1])
        fho.write(str(f) + '\n')  # one value per output line
If I understand what you want to do correctly, there's no need to do anything with the E's: in python float('23.0186E-03') returns 0.0230186, which I think is what you want.
All you need is:
fout = open("stringtest2.txt", "w")
for line in open('stringtest1.txt', "r"):
    x = float(line.strip().split()[1])
    fout.write("%f\n"%x)
fout.close()
Using %f in the output string will make sure the output will be in decimal notation (no E's). If you just use str(x), you may get E's in the output depending on the original value, so the correct conversion method depends on which output you want:
>>> str(float('23.0186E-06'))
'2.30186e-05'
>>> "%f"%float('23.0186E-06')
'0.000023'
>>> "%.10f"%float('23.0186E-06')
'0.0000230186'
You can add any number to %f to specify the precision. For more about string formatting with %, see http://rgruet.free.fr/PQR26/PQR2.6.html#stringMethods (scroll down to the "String formatting with the % operator" section).
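As a minimal sketch of the two steps together (parse with float(), then format with a fixed number of decimals), assuming the value is always the last whitespace-separated field on the line. The helper name to_decimal and its places parameter are illustrative and not part of the original code:

```python
def to_decimal(line, places=10):
    """Return the last field of `line` as a fixed-point decimal string.

    `to_decimal` and `places` are hypothetical names used for illustration.
    """
    value = float(line.split()[-1])   # float() handles any exponent, e.g. E-03 or E-22
    return "%.*f" % (places, value)   # '*' takes the precision from the argument tuple


print(to_decimal("0     23.0186E-03", places=7))   # 0.0230186
print(to_decimal("20    125.573E-03", places=6))   # 0.125573
```

Very small values such as 1E-22 print as all zeros unless places is large enough, so the precision has to be chosen with the data in mind.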
float("20 _ _125.573E-03".split()[-1].strip("_"))
I have a binary file mixed with ASCII in which there are some floating point numbers I want to find. The file contains some lines like this:
1,1,'11.2','11.3';1,1,'100.4';
In my favorite regex tester I found that the correct regex should be ([0-9]+\.{1}[0-9]+).
Here's the code:
import re
data = open('C:\\Users\\Me\\file.bin', 'rb')
pat = re.compile(b'([0-9]+\.{1}[0-9]+)')
print(pat.match(data.read()))
I do not get a single match, why is that? I'm on Python 3.5.1.
You can try like this,
import re
with open('C:\\Users\\Me\\file.bin', 'rb') as f:
    data = f.read()
re.findall("\d+\.\d+", data)
Output:
['11.2', '11.3', '100.4']
re.findall returns a list of strings. If you want to convert them to floats, you can do it like this:
>>> list(map(float, re.findall("\d+\.\d+", data)))
[11.2, 11.3, 100.4]
How to find floating point numbers in binary file with Python?
float_re = br"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?"
for m in generate_tokens(r'C:\Users\Me\file.bin', float_re):
    print(float(m.group()))
where float_re is from this answer and generate_tokens() is defined here.
pat.match() tries to match at the very start of the input string and your string does not start with a float and therefore you "do not get a single match".
re.findall("\d+\.\d+", data) produces TypeError because the pattern is Unicode (str) but data is a bytes object in your case. Pass the pattern as bytes:
re.findall(b"\d+\.\d+", data)
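Both points can be checked on a small example. This is only a sketch: the bytes literal below is made-up sample data standing in for the real file contents, and float_pat is an illustrative name.

```python
import re

# Sample bytes standing in for the contents of the binary file (an assumption).
data = b"\x00\x01 1,1,'11.2','11.3';1,1,'100.4';\xff"

# A bytes pattern is required when searching a bytes object.
float_pat = re.compile(rb"\d+\.\d+")

print(float_pat.match(data))      # None: match() only tries position 0
matches = float_pat.findall(data)
print(matches)                    # [b'11.2', b'11.3', b'100.4']

# Decode each match before converting it to a float.
values = [float(m.decode("ascii")) for m in matches]
print(values)                     # [11.2, 11.3, 100.4]
```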
I want to append following line to my text file:
Degree of polarization is 8.23 % and EVPA is 45.03 degree.
i.e. I want both string and numeric values to be appended.
I want to append above line with different numeric values after each run of my python code.
Any help will be appreciated.
For example
>>> a = 10.5
>>> with open("myfile.txt","a") as f:
...     f.write(a)
gives me an error.
Do you mean something like this:
while True:
    polarization = getPolarization()
    evpa = getEvpa()
    # The trailing "\n" keeps each appended result on its own line
    my_text = "Degree of polarization is {} % and EVPA is {} degree.\n".format(polarization, evpa)
    with open("test.txt", "a") as myfile:
        myfile.write(my_text)
It would also help if you described what you have tried so far and which problems or errors occurred.
You can only write strings to files.
Strings can be concatenated:
>>> 'a' + 'b'
'ab'
Numbers can be converted to strings:
>>> str(4)
'4'
>>> str(5.6)
'5.6'
You should be able to get started with that.
Also, Python's string formatting will automatically do this for you:
>>> '{} % and {} degree'.format(6.7, 8.9)
'6.7 % and 8.9 degree'
Or with a more readable format using keywords:
>>> '{polarization} % and {evpa} degree'.format(polarization=6.7, evpa=8.9)
'6.7 % and 8.9 degree'
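Putting formatting and appending together, a minimal sketch might look like the following. The function name append_result, the two-decimal format and the temporary file are assumptions made for the example, not something from the question:

```python
import os
import tempfile


def append_result(path, polarization, evpa):
    """Append one formatted result line to the text file at `path`."""
    line = "Degree of polarization is {:.2f} % and EVPA is {:.2f} degree.\n".format(
        polarization, evpa
    )
    with open(path, "a") as f:   # "a" appends instead of overwriting
        f.write(line)


# Demo with a temporary file so the sketch does not touch real data.
demo_path = os.path.join(tempfile.mkdtemp(), "results.txt")
append_result(demo_path, 8.23, 45.03)
append_result(demo_path, 7.9, 44.5)
with open(demo_path) as f:
    print(f.read(), end="")
```

Each run of the script would call append_result once with the freshly computed values, so the file grows by one line per run.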
I have a text file of complex numbers called output.txt in the form:
[-3.74483279909056 + 2.54872970226369*I]
[-3.64042002652517 + 0.733996349939531*I]
[-3.50037473491252 + 2.83784532111642*I]
[-3.80592861109028 + 3.50296053533826*I]
[-4.90750592116062 + 1.24920836601026*I]
[-3.82560512449716 + 1.34414866823615*I]
etc...
I want to create a list from these (read in as a string in Python) of complex numbers.
Here is my code:
data = [line.strip() for line in open("output.txt", 'r')]
for i in data:
    m = map(complex,i)
However, I'm getting the error:
ValueError: complex() arg is a malformed string
Any help is appreciated.
From the help information, for the complex builtin function:
>>> help(complex)
class complex(object)
| complex(real[, imag]) -> complex number
|
| Create a complex number from a real part and an optional imaginary part.
| This is equivalent to (real + imag*1j) where imag defaults to 0.
So you need to format the string properly, and pass the real and imaginary parts as separate arguments.
Example:
num = "[-3.74483279909056 + 2.54872970226369*I]".translate(None, '[]*I').split(None, 1)
real, im = num
print real, im
>>> -3.74483279909056 + 2.54872970226369
im = im.replace(" ", "") # remove whitespace
c = complex(float(real), float(im))
print c
>>> (-3.74483279909+2.54872970226j)
Try this:
numbers = []
with open("output.txt", 'r') as data:
    for line in data:  # iterate over the file directly; file objects have no splitlines()
        parts = line.split('+')
        # strip brackets, '*I', spaces and the trailing newline
        real, imag = parts[0].strip(' ['), parts[1].strip(' *I]\n')
        numbers.append(complex(float(real), float(imag)))
The problem with your original approach is that your input file contains lines of text that complex() does not know how to process. We first need to break each line down to a pair of numbers - real and imag. To do that, we need to do a little string manipulation (split and strip). Finally, we convert the real and imag strings to floats as we pass them into the complex() function.
Here is a concise way to create the list of complex values (based on dal102 answer):
data = [complex(*map(float,line.translate(None, ' []*I').split('+'))) for line in open("output.txt")]
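Note that the str.translate(None, ...) form above only works in Python 2. For Python 3, one possible sketch is to rewrite each line into the notation that complex() already understands (no spaces, j as the imaginary unit). The name parse_complex is illustrative, and the second sample line has a made-up negative imaginary part to show that the sign is handled:

```python
def parse_complex(line):
    """Parse a line such as '[-3.74 + 2.54*I]' into a Python complex number."""
    text = line.strip().strip("[]")   # drop the newline and the brackets
    text = text.replace(" ", "")      # complex() rejects spaces around the sign
    text = text.replace("*I", "j")    # Python spells the imaginary unit 'j'
    return complex(text)


sample_lines = [
    "[-3.74483279909056 + 2.54872970226369*I]\n",
    "[-3.64042002652517 - 0.733996349939531*I]\n",  # hypothetical negative imaginary part
]
numbers = [parse_complex(line) for line in sample_lines]
print(numbers)
```

Lines that are not numbers (such as a literal "etc...") would raise ValueError here and would need to be skipped separately.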
def digits_plus(test):
    test=0
    while (test<=3):
        print str(test)+"+",
        test = test+1
    return()
digits_plus(3)
The output is:
0+ 1+ 2+ 3+
However i would like to get: 0+1+2+3+
Another method to do that would be to create a list of the numbers and then join them.
mylist = []
for num in range(0, 4):
    mylist.append(str(num))
This gives the list ['0', '1', '2', '3'], which can then be joined:
print '+'.join(mylist) + '+'
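The same join idea, written as a small function for Python 3. This is only a sketch; the parameter name last is an illustrative choice:

```python
def digits_plus(last):
    """Return '0+1+...+last+' with no spaces between the pieces."""
    pieces = [str(num) for num in range(last + 1)]  # ['0', '1', ..., str(last)]
    return "+".join(pieces) + "+"                   # join puts '+' between items only


print(digits_plus(3))  # 0+1+2+3+
```

Building the string first and printing it once sidesteps the space that Python 2's trailing-comma print inserts between items.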
If you're stuck using Python 2.7, start your module with
from __future__ import print_function
Then instead of
print str(test)+"+",
use
print(str(test)+"+", end='')
You'll probably want to add a print() at the end (out of the loop!-) to get a new-line after you're done printing the rest.
You could also use the sys.stdout object to write output (to stdout), which gives you finer control. It outputs exactly and only the characters you tell it to (whereas print adds automatic line endings and converts its arguments to strings for you).
#!/usr/bin/env python
import sys
test = '0'
sys.stdout.write(str(test)+"+")
# Or my preferred string formatting method:
# (The '%s' implies a cast to string)
sys.stdout.write("%s+" % test)
# You probably don't need to explicitly do this,
# If you get unexpected (missing) output, you can
# explicitly send the output like
sys.stdout.flush()
I'm pretty new to Python programming and would appreciate some help to a problem I have...
Basically I have multiple text files which contain velocity values as such:
0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00
etc for many lines...
What I need to do is convert all the values in the text file that are less than 1 (e.g. 0.137865E+00 above) to an arbitrary value of 0.100000E+01. While it seems pretty simple to replace specific values with the 'replace()' method and a while loop, how do you do this if you want to replace a range?
thanks
I think when you are beginning programming, it's useful to see some examples; and I assume you've tried this problem on your own first!
Here is a break-down of how you could approach this:
contents='0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00'
The split method works on strings. It returns a list of strings. By default, it splits on whitespace:
string_numbers=contents.split()
print(string_numbers)
# ['0.259515E+03', '0.235095E+03', '0.208262E+03', '0.230223E+03', '0.267333E+03', '0.217889E+03', '0.156233E+03', '0.144876E+03', '0.136187E+03', '0.137865E+00']
The map command applies its first argument (the function float) to each of the elements of its second argument (the list string_numbers). The float function converts each string into a floating-point object.
float_numbers=list(map(float,string_numbers))  # list() so this also prints a list on Python 3
print(float_numbers)
# [259.51499999999999, 235.095, 208.262, 230.22300000000001, 267.33300000000003, 217.88900000000001, 156.233, 144.876, 136.18700000000001, 0.13786499999999999]
You can use a list comprehension to process the list, converting numbers less than 1 into the number 1. The conditional expression (1 if num<1 else num) equals 1 when num is less than 1, otherwise, it equals num.
processed_numbers=[(1 if num<1 else num) for num in float_numbers]
print(processed_numbers)
# [259.51499999999999, 235.095, 208.262, 230.22300000000001, 267.33300000000003, 217.88900000000001, 156.233, 144.876, 136.18700000000001, 1]
This is the same thing, all in one line:
processed_numbers=[(1 if num<1 else num) for num in map(float,contents.split())]
To generate a string out of the elements of processed_numbers, you could use the str.join method:
comma_separated_string=', '.join(map(str,processed_numbers))
# '259.515, 235.095, 208.262, 230.223, 267.333, 217.889, 156.233, 144.876, 136.187, 1'
A typical technique would be:
read file line by line
split each line into a list of strings
convert each string to the float
compare converted value with 1
replace when needed
write back to the new file
As I don't see you having any code yet, I hope that this would be a good start
def float_filter(input):
    for number in input.split():
        if float(number) < 1.0:
            yield "0.100000E+01"
        else:
            yield number
input = "0.259515E+03 0.235095E+03 0.208262E+03 0.230223E+03 0.267333E+03 0.217889E+03 0.156233E+03 0.144876E+03 0.136187E+03 0.137865E+00"
print " ".join(float_filter(input))
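To cover the remaining steps (reading the file line by line and writing a new file), a Python 3 sketch along the same lines could look like this. The names clamp_line and clamp_file and the temporary file paths are assumptions made for the example:

```python
import os
import tempfile

THRESHOLD = 1.0
REPLACEMENT = "0.100000E+01"


def clamp_line(line):
    """Replace every value below THRESHOLD in one line of text."""
    return " ".join(
        REPLACEMENT if float(token) < THRESHOLD else token
        for token in line.split()
    )


def clamp_file(src_path, dst_path):
    """Read src_path line by line and write the processed lines to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(clamp_line(line) + "\n")


# Demo on a temporary file so no real data is touched.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "in.txt")
dst_path = os.path.join(workdir, "out.txt")
with open(src_path, "w") as f:
    f.write("0.259515E+03 0.137865E+00\n0.144876E+03 0.999000E+00\n")
clamp_file(src_path, dst_path)
with open(dst_path) as f:
    print(f.read(), end="")
```

Values that are not replaced are written back as their original text, so their formatting is preserved exactly.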
import numpy as np
a = np.genfromtxt('file.txt') # read file
a[a<1] = 1.0 # replace values below 1 with 1.0 (i.e. 0.100000E+01)
np.savetxt('converted.txt', a) # save to file
You could use regular expressions for parsing the string. I'm assuming here that the mantissa is never larger than 1 (ie, begins with 0). This means that for the number to be less than 1, the exponent must be either 0 or negative. The following regular expression matches '0', '.', unlimited number of decimal digits (at least 1), 'E' and either '+00' or '-' and two decimal digits.
0\.\d+E(-\d\d|\+00)
Assuming that you have the file read into variable 'text', you can use the regexp with the following python code:
import re
result = re.sub(r"0\.\d+E(-\d\d|\+00)", "0.100000E+01", text)
Edit: Just realized that the description doesn't limit the valid range of input numbers to positive numbers. Negative numbers can be matched with the following regexp:
-0\.\d+E[-+]\d\d
This can be alternated with the first one using the (pattern1|pattern2) syntax which results in the following Python code:
result = re.sub(r"(0\.\d+E(-\d\d|\+00)|-0\.\d+E[-+]\d\d)", "0.100000E+01", text)
Also, if there's a chance that the exponent goes past 99, the regexp can be modified further by adding a '+' sign after each '\d\d' pattern. This allows matching exponents of two OR MORE digits.
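If the assumption about the mantissa is a concern, an alternative sketch is to let the regex only locate the numbers and do the comparison numerically in a replacement function. This swaps the pattern-based test for float(); NUMBER_RE and clamp are illustrative names, and the sample text is made up:

```python
import re

# Matches tokens like 0.137865E+00 or -0.5E-03:
# optional sign, mantissa, 'E', signed exponent of any length.
NUMBER_RE = re.compile(r"[-+]?\d*\.?\d+E[-+]\d+")


def clamp(match):
    """Return the replacement text for one matched number."""
    token = match.group()
    return "0.100000E+01" if float(token) < 1.0 else token


text = "0.259515E+03 0.137865E+00 -0.5E+02 0.999999E+00"
result = NUMBER_RE.sub(clamp, text)
print(result)  # 0.259515E+03 0.100000E+01 0.100000E+01 0.100000E+01
```

Because the comparison is done on the parsed value, negative numbers and exponents of any length are handled without extra pattern alternatives.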
I've got the script working as I want now...thanks people.
When writing the list to a new file I used the replace method to get rid of the brackets and commas - is there a simpler way?
ftext = open("C:\\Users\\hhp06\\Desktop\\out.grd", "r")
otext = open("C:\\Users\\hhp06\\Desktop\\out2.grd", "w+")
for line in ftext:
    stringnum = line.split()
    floatnum = map(float, stringnum)
    procnum = [(1.0 if num<1 else num) for num in floatnum]
    stringproc = str(procnum)
    s = (stringproc).replace(",", " ").replace("[", " ").replace("]", "")
    otext.writelines(s + "\n")
otext.close()
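One possibly simpler way is to skip str(procnum) altogether and join the individual values, so no brackets or commas are produced in the first place. A sketch, with format_line as an illustrative name and made-up sample values:

```python
def format_line(values):
    """Join floats with single spaces, without going through str(list)."""
    return " ".join(str(v) for v in values)


procnum = [259.515, 1.0, 136.187]
print(format_line(procnum))   # 259.515 1.0 136.187

# A format spec gives scientific notation instead. Note that this is the
# normalized form (2.595150E+02), not the 0.259515E+03 layout of the input.
print(" ".join("{:.6E}".format(v) for v in procnum))
```

In the loop above, this would amount to writing otext.write(format_line(procnum) + "\n") in place of the str() and replace() lines.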