Script not reading end of file new line - python

I am writing a script to validate if a given file has a blank line at the end or not. I am reading that file into python using code
with open(input_file, "r") as file_data:
    for line in file_data:
        if line.strip() == "":
            print "found empty line"
When I open the test file in Sublime, it shows 370 lines, and the 370th line is just an empty line. However, when I use that file as a test in my Python script, my if condition is never true. Is the Python library already skipping an empty newline at the end of the file?
UPDATE
little bit more context.
The data was generated on a Linux system. The files were then copied to a Mac and reprocessed. During this reprocessing, the data was written line by line with the command file_handle.write(data + "\n").
I hope this gives more context.

Comparison for identity with the empty string is a delicate matter. Do this instead:
with open(input_file, "r") as file_data:
    for line in file_data:
        if not line.strip():
            print("found empty line")
Works for me with Python 3.6.
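A likely explanation for the original observation: iterating over a file yields each line with its trailing "\n" attached, and the newline that terminates the last line does not produce an extra empty string. Editors such as Sublime show an empty line 370 simply because line 369 ends with a newline. To test whether a file ends with a newline, it is enough to look at its last byte. A minimal sketch (ends_with_newline is an illustrative name, not from the question):

```python
def ends_with_newline(path):
    """Return True if the last byte of the file is a newline character."""
    with open(path, "rb") as f:
        f.seek(0, 2)          # jump to the end of the file
        if f.tell() == 0:     # empty file: there is no last byte to check
            return False
        f.seek(-1, 2)         # step back one byte from the end
        return f.read(1) == b"\n"
```

If the file is supposed to end with a truly blank line, the last two bytes would both have to be newlines; the same seek-from-the-end idea applies.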

Related

C Comment Stripper in Python

I'm trying to write a Python program that goes through a file containing a git diff (of C code) and removes the comments. I tried to read from the file and write a comment-less version to a different file, but it doesn't seem to be working. I've also realized that it will not work for multiline comments.
Here's my code:
write_path = "diff_file" # new file to write in
read_path = "text_diff" # text_diff is the original file with the diff
with open(read_path,'r') as read_file:
    text_diff = read_file.read().lower()
    for line in read_file:
        if line.startswith("/*") and line.endswith("*/"):
            with open(write_path, 'a') as write_file:
                write_file.write(line + "/n")
For reference, I'm running it under WSL.
I tried this: I changed 'a' to 'w' (write) when opening the output file and moved the open call so the file is not reopened on every iteration. I also inverted the if condition, so that a comment line is not written to the new file.
Also, I included \n in endswith, since each line read from the file ends with a newline, and removed the extra \n when writing.
write_path = "diff_file" # new file to write in
read_path = "text_diff" # text_diff is the original file with the diff
with open(read_path,'r') as read_file:
    text_diff = read_file.readlines()
with open(write_path, 'w') as write_file:
    for line in text_diff:
        if not (line.startswith("/*") and line.endswith("*/\n")):
            write_file.write(line)
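The line-based filter above only drops lines that consist entirely of a single /* ... */ comment. For multiline comments, one possible approach (an assumption, not part of the answer above) is to read the whole text and remove comments with a regular expression. This is a minimal sketch: strip_c_comments and COMMENT_RE are illustrative names, and the pattern is naive. It also removes comment-like text inside string literals, and it does not treat the leading +/- markers of a git diff specially.

```python
import re

# Matches /* ... */ comments, including ones that span several lines
# (DOTALL lets "." match newlines), and // comments up to the end of the line.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

def strip_c_comments(source):
    """Return source with C comments removed (naive: ignores string literals)."""
    return COMMENT_RE.sub("", source)

sample = "int a = 1; /* first\n   second */\nint b = 2; // trailing\n"
print(strip_c_comments(sample))
```

Reading the file with read() and writing strip_c_comments(text) to the output file would replace the per-line loop.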

Python Function: using a batch file to pass parameters from .txt file to python function and execute function

I would like to pass 1 or more parameters from a text file to a python function using a batch file. Is this possible? Ideally I would like to read from a line in the text file which would pass the specific com port to the python function, my_function and these actions could be done using a batch file
I can currently call a python script using a batch file as shown below. Separately I can also call a python function and pass a parameter to it using Python Shell. I need to be able to pass different values from a text file to the same function which is where I'm stuck.
Any help would be greatly appreciated.
Current batch file code calling a python script
echo[
#echo. The Step below calls the script which opens COM 12
echo[
"C:\Python\python.exe" "C:\Scripts\open_COM12.py"
Current python code to pass parameter (com port number) and call python function
import ConfigComPort as cw
from ConfigComPort import my_function
my_function('12')
Connection successfully made
Text File contents
COM_PORTS
12
19
23
22
If you have a file called parameters.txt with data
foo
bar
foobar
And a function
def my_function(some_text):
    print("I was called with " + some_text)
Then you can do this to pass every line of the file to the function:
with open('parameters.txt', 'r') as my_file:
    for line in my_file:
        # remove the # and space from the next line to enable output to console:
        # print(line.rstrip())
        my_function(line.rstrip())
Note that the rstrip() method in all my examples strips off the trailing newline (as well as other trailing whitespace) that would otherwise be part of each line.
If your parameter file has a header, as in your example, you have multiple possibilities to skip that.
For example you can read all lines at once into a list and then iterate over a subset:
with open('parameters.txt', 'r') as my_file:
    all_lines = [line.rstrip() for line in my_file.readlines()]
# print(all_lines)
for line in all_lines[1:]:
    # print(line)
    my_function(line)
However, that would just ignore the header. If you accidentally passed a wrong file or one that has invalid content, this could spell trouble.
It's better to check whether the header of the file is correct. You can just expand the code from above:
with open('parameters.txt', 'r') as my_file:
    all_lines = [line.rstrip() for line in my_file.readlines()]
# print(all_lines)
if all_lines[0] != 'COM_PORTS':
    raise RuntimeError("file has wrong header")
for line in all_lines[1:]:
    # print(line)
    my_function(line)
Or you can do this inside the loop, for instance:
expect_header = True
with open('parameters.txt', 'r') as my_file:
    for line in my_file:
        stripped = line.rstrip()
        if expect_header:
            if stripped != 'COM_PORTS':
                raise RuntimeError("header of file is wrong")
            expect_header = False
            continue
        # print(stripped)
        my_function(stripped)
Or you can use a generator expression to check the header outside of the loop:
with open('parameters.txt', 'r') as my_file:
    # iterate over the file object itself so lines are read lazily, one at a time
    all_lines = (line.rstrip() for line in my_file)
    if next(all_lines) != 'COM_PORTS':
        raise RuntimeError("file has wrong header")
    for line in all_lines:
        # print(line)
        my_function(line)
I would likely prefer this last one, as it has a clear structure and no magic numbers (such as 0 and 1, referring to which line is the header and how many to skip, respectively) and it doesn't need to read all lines into memory at once.
However, the solution further above that reads all lines into a list at once is probably better if you want to do further processing on them as the data is already available in that case and you don't need to read the file again.
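To tie this back to the batch file, one option is to wrap the header check and the loop in a function that takes the path of the parameter file, so the batch file can pass the path on the command line. A minimal sketch: run_from_file is a hypothetical name, and my_function here is only a stand-in for ConfigComPort.my_function from the question.

```python
def my_function(port):
    # Stand-in for ConfigComPort.my_function from the question (assumption).
    print("Opening COM" + port)

def run_from_file(path, expected_header="COM_PORTS"):
    """Call my_function once per data line and return the ports that were used."""
    ports = []
    with open(path, "r") as param_file:
        # Lazy generator: lines are stripped one at a time as they are read.
        lines = (line.strip() for line in param_file)
        if next(lines, None) != expected_header:
            raise RuntimeError("file has wrong header")
        for port in lines:
            if port:  # skip blank lines
                my_function(port)
                ports.append(port)
    return ports
```

A script that ends with run_from_file(sys.argv[1]) (after import sys) could then be called from the batch file in the same way as the existing script, with the path to the text file added as an extra argument.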

Python 2.7.8 for line iteration error

I want to iterate over all lines in a file with the following script
import sys
infile = open("test.txt")
infile.read()
for line in infile
    if line.find("This") != -1
        print line
infile.close()
Unfortunately, I am getting this error message:
File "getRes.py", line 6
for line in infile
^
SyntaxError: invalid syntax
I've been trying for an hour to figure out what is the error and I am still not able to find it. Can you tell me what is wrong and how to fix it?
PS: I am using Python 2.7.8, I would like to use this old version instead of a more recent version.
You need a colon after any line that introduces a block in Python.
for line in infile:
    if line.find("This") != -1:
There's another mistake in your code, you don't need:
infile.read()
Because it reads the entire contents of infile without saving them to a variable. More importantly, it moves the file position to the end of the file, so there are no more lines left to read.
In addition, there's no need to close the file manually; it's better to use a with statement:
with open("test.txt") as infile:
    for line in infile:
        pass  # do what you want
# here the file is closed automatically, because we have left the "with" block
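Putting the fixes together (colons added, the stray read() removed, a with statement instead of a manual close), the script could look like the sketch below. print_matching_lines is an illustrative name; the `in` operator replaces find() != -1 and behaves the same for this purpose. A single-argument print(...) runs on Python 2.7 as well as Python 3.

```python
def print_matching_lines(path, needle):
    """Print every line of the file that contains needle; return how many matched."""
    count = 0
    with open(path) as infile:
        for line in infile:
            if needle in line:
                print(line.rstrip("\n"))  # drop the newline; print adds its own
                count += 1
    return count
```

For the question's case this would be called as print_matching_lines("test.txt", "This").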

python search for string in file return entire line + next line into new text file

I have a very large text file (50,000+ lines) that should always be in the same sequence. In Python I want to search the text file for each of the $INGGA lines and join each one with the subsequent $INHDT line to create a new text file. I need to do this without reading the whole file into memory, as that causes a crash every time. I can find and return the $INGGA line, but I'm not sure of the best memory-efficient way of then getting the next line and joining the two into a new string.
Thanks
Phil
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2016.05.06 09:11:34 =~=~=~=~=~=~=~=~=~=~=~= > $PRDID,2.15,-0.10,31.87*6E
$INGGA,091124.00,5249.8336,N,00120.9619,W,1,20,0.6,95.0,M,49.4,M,,*50
$INHDT,31.9,T*1E $INZDA,091124.0055,06,05,2016,,*7F
$INVTG,22.0,T,,M,4.4,N,8.1,K,A*24 $PRDID,2.13,-0.06,34.09*6C
$INGGA,091124.20,5249.8338,N,00120.9618,W,1,20,0.6,95.0,M,49.4,M,,*5D
$INHDT,34.1,T*13 $INZDA,091124.2055,06,05,2016,,*7D
$INVTG,24.9,T,,M,4.4,N,8.1,K,A*2B $PRDID,2.16,-0.03,36.24*61
$INGGA,091124.40,5249.8340,N,00120.9616,W,1,20,0.6,95.0,M,49.4,M,,*5A
$INHDT,36.3,T*13 $INZDA,091124.4055,06,05,2016,,*7B
$INVTG,27.3,T,,M,4.4,N,8.1,K,A*22 $PRDID,2.11,-0.05,38.33*68
$INGGA,091124.60,5249.8343,N,00120.9614,W,1,20,0.6,95.1,M,49.4,M,,*58
$INHDT,38.4,T*1A $INZDA,091124.6055,06,05,2016,,*79
$INVTG,29.5,T,,M,4.4,N,8.1,K,A*2A $PRDID,2.09,-0.02,40.37*6D
$INGGA,091124.80,5249.8345,N,00120.9612,W,1,20,0.6,95.1,M,49.4,M,,*56
$INHDT,40.4,T*15 $INZDA,091124.8055,06,05,2016,,*77
$INVTG,31.7,T,,M,4.4,N,8.1,K,A*21 $PRDID,2.09,0.02,42.42*40
$INGGA,091125.00,5249.8347,N,00120.9610,W,1,20,0.6,95.1,M,49.4,M,,*5F
$INHDT,42.4,T*17
You can just read a line of file and write to another new file.
Like this:
import re
# open the new file for appending and the source file for reading
with open('newfile', 'at') as nf, open('file', 'rt') as f:
    for line in f:
        r = re.match(r'\$INGGA', line)
        if r is not None:
            nf.write(line)
            # write the line that follows, which is expected to be the $INHDT sentence
            nf.write(next(f, ''))
Mode 'at' opens a file for appending text, and 'rt' opens a file for reading text.
I tested this on a file with 150,000 lines and it ran fine.
I suggest using a simple regex that will parse and capture the parts you care about. Here is an example that will capture the piece you care about:
(\$INGGA.*\n\$INHDT.*\n)
https://regex101.com/r/tK1hF0/3
As in my above link, you'll notice that I used the "global" g setting on the regex, telling it to capture all groups that match. Otherwise, it'll stop after the first match.
I also had trouble determining where the actual line breaks exist in your above example file, so you can tweak the above to match exactly where the breaks occur.
Here is some starter python example code:
import re
test_str = ""  # load your file contents here
p = re.compile(r'(\$INGGA.*\n\$INHDT.*\n)')
matches = re.findall(p, test_str)
In the example PuTTY log you give, it's all on one line, separated by spaces.
In that case you can replace each space with a newline and write the result to a new file:
cat large_file | sed 's/ /\n/g' > new_large_file
To iterate over the newline-separated file, run this:
cat new_large_file | python your_script.py
Your script reads the input line by line, so your computer should not crash.
your_script.py:
import sys
INGGA_line = ""
for line in sys.stdin:
    line_striped = line.strip()
    if line_striped.startswith("$INGGA"):
        INGGA_line = line_striped
    elif line_striped.startswith("$INZDA"):
        print line_striped, INGGA_line
    else:
        print line_striped
This answer is aimed at python 3.
According to this other answer (and the docs), you can iterate your file line-by-line memory-efficiently:
with open(filename, 'r') as f:
    for line in f:
        ...process...
An example of how you could fulfill your above criteria could be
# Target file write-only, source file read-only
with open(targetfile, 'w') as tf, open(sourcefile, 'r') as sf:
    # Flag for whether we are looking for 1st or 2nd part
    look_for_ingga = True
    for line in sf:
        if look_for_ingga:
            if line.startswith('$INGGA,'):
                tf.write(line)
                look_for_ingga = False
        elif line.startswith('$INHDT,'):
            tf.write(line)
            look_for_ingga = True
In the case where you have multiple '$INGGA,' prior to the '$INHDT,', this grabs the first one and disregards the rest. In case you want to take only the last '$INGGA,' before the '$INHDT,', store the last '$INGGA,' in a variable instead of writing it to disk. Then, when you find your '$INHDT,', store both.
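The "keep the last '$INGGA,'" variant described above could be sketched as a generator that also joins each pair onto a single line, as the question asks. join_gga_hdt is an illustrative name, and joining with a single space is an assumption about the desired output format.

```python
def join_gga_hdt(lines):
    """Yield '<$INGGA sentence> <$INHDT sentence>' for each pair found in lines.

    Only the most recent $INGGA before a $INHDT is kept; a $INHDT with no
    preceding $INGGA is ignored.
    """
    last_gga = None
    for line in lines:
        line = line.strip()
        if line.startswith('$INGGA,'):
            last_gga = line  # remember it instead of writing it straight away
        elif line.startswith('$INHDT,') and last_gga is not None:
            yield last_gga + ' ' + line
            last_gga = None
```

Because it accepts any iterable of lines, it can be fed an open file object directly, and each yielded string can be written to the target file followed by a newline, so only one pair is held in memory at a time.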
In case you meant that you want to write to a separate new file for each INGGA-INHDT pair, the target file with-statement should be nested inside for line in sf instead, or the results should be buffered in a list for later storage.
Refer to the docs for introductions to with-statements and file reading/writing.

How can I add a new line of text at top of a file?

I'm developing a simple program which makes a Python script executable, and I'm working on the part which adds the interpreter path (#! /usr/bin/python). I tried to do it, but instead of adding a new line, it replaces the current one and removes part of the next line. What am I doing wrong?
I uploaded the source code to Ubuntu Pastebin: http://pastebin.ubuntu.com/1032683/ The wrong code is between lines 28 and 31:
wfile = open(file, 'r+')
if wfile.readline() != "#! /usr/bin/python\n":
    wfile.seek(0)
    wfile.write("#! /usr/bin/python\n")
Using Python 2.7.2 with an iPad 2 (Python for iOS), also using 2.5.1 in the same iPad (Cydia port) for testing.
You can't do what you're trying to do. Seeking to the beginning of a file and doing a write will overwrite from that position, not append.
The only way to add a line in the middle (or beginning) of a file is to write out a new file with the data inserted where you want it to.
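Writing out a new file could look roughly like the sketch below: the new content goes to a temporary file in the same directory, which then replaces the original. ensure_shebang is an illustrative name, and os.replace requires Python 3.3 or later, so this is an assumption that does not fit the Python 2.7 setup in the question as-is. The temporary file also does not inherit the original file's permissions.

```python
import os
import tempfile

SHEBANG = "#! /usr/bin/python\n"

def ensure_shebang(path, shebang=SHEBANG):
    """Prepend shebang unless it is already the first line. Return True if changed."""
    with open(path, "r") as src:
        first_line = src.readline()
        if first_line == shebang:
            return False
        directory = os.path.dirname(os.path.abspath(path))
        # Write the new content to a temporary file next to the original.
        with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as dst:
            dst.write(shebang)
            dst.write(first_line)
            for line in src:  # copy the rest line by line
                dst.write(line)
    # Both files are closed here; swap the new file into place.
    os.replace(dst.name, path)
    return True
```

Copying line by line keeps memory use low even for large scripts.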
Joe is correct in that you can't just "insert" lines at the beginning of the file. Here is a solution for you, however:
with open(my_python_script, "r+") as f:
    first_line = f.readline()
    if first_line != "#! /usr/bin/python\n":
        lines = f.readlines()
        f.seek(0)
        f.write("#! /usr/bin/python\n")
        f.write(first_line)
        f.writelines(lines)
To add/replace the first line in each file given at a command line:
#!/usr/bin/env python
import fileinput
shebang = "#! /usr/bin/python\n"
for line in fileinput.input(inplace=1):
    if fileinput.isfirstline() and line != shebang:
        print shebang,
        if not line.startswith("#!"):
            print line,
    else:
        print line,
