How to print only the first value in Python

In my output file I have:
Energy=-0.111....
'other text'
Energy=-0.1223
I am trying to write a script that opens this output file, reads the energy value, and prints it to another output.
Below is my code:
with open('.out', 'rt') as f:
    data = f.readlines()
for line in data:
    if line.__contains__('Energy'):
        print(line)
My problem is that I want the script to print only the first energy value, Energy=-0.111...., but it prints all of them. How can I correct my script? I also want to understand how to print only the first value of Energy in one script and only the second one in another.

Since you need to extend the logic to print the first, second, etc. occurrence of an 'Energy' line, you can store the matching lines in a list and access whichever one you need to print.
data = f.readlines()
energy_lines = []
for line in data:
    if 'Energy' in line:
        energy_lines.append(line)
print(energy_lines[0])  # first line
print(energy_lines[1])  # second line

Why not just use "break" after printing?
for line in data:
    if line.__contains__('Energy'):
        print(line)
        break

First occurrence: add a break, as mentioned by John.
Nth occurrence: you could keep a counter. For example, if I want the third occurrence:
data = f.readlines()
counter = 0  # initialize once, before the loop, so it is not reset on every line
for line in data:
    if line.__contains__('Energy'):
        counter += 1
        if counter == 3:
            print(line)
            break

If the energy line is always the first line in your text file:
with open('.out', 'rt') as f:
    line = next(f)
    if 'Energy' in line:
        print(line)
If not, there is still no point in reading all lines into memory (this should use less memory and a little less time):
num_occurrence = 1
match_count = 0
with open('.out', 'rt') as f:
    for line in f:
        if 'Energy' in line:
            match_count += 1
            if match_count == num_occurrence:
                print(line)
                break
I also changed
if line.__contains__('Energy'):
into
if 'Energy' in line:
because one should normally avoid calling magic methods directly (the in operator is more readable).
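A minimal sketch of the streaming approach above, generalized to any occurrence number. The helper name `nth_matching_line` and its parameters are illustrative assumptions, not part of the original answers. It accepts any iterable of lines, so an open file object should work as well as the list used here.

```python
from itertools import islice


def nth_matching_line(lines, needle, n):
    """Return the n-th line (1-based) containing needle, or None if there are fewer than n matches."""
    matches = (line for line in lines if needle in line)
    # islice skips the first n-1 matches lazily; next() stops reading after the n-th.
    return next(islice(matches, n - 1, None), None)


# Sample data modelled on the question's output file (values are illustrative).
sample = ["Energy=-0.111\n", "other text\n", "Energy=-0.1223\n"]
first = nth_matching_line(sample, "Energy", 1)
second = nth_matching_line(sample, "Energy", 2)
print(first, end="")
print(second, end="")
```

With a real file, the call would look like `nth_matching_line(f, 'Energy', 1)` inside a `with open('.out') as f:` block, so one script can ask for the first occurrence and another for the second.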

Related

How can I replace lines in a file by line number using Python, given a list of line numbers?

If I have a file like:
Flower
Magnet
5001
100
0
and I have a list containing the line numbers that I have to change:
list =[2,3]
How can I do this using Python? The output I expect is:
Flower
Most
Most
100
0
Code that I've tried:
f = open("your_file.txt","r")
line = f.readlines()[2]
print(line)
if line=="5001":
    print "yes"
else:
    print "no"
But it is not able to match.
I want to overwrite the file that I am reading.
You may simply loop through the list of indices that you have to replace in your file (my original answer needlessly looped through all lines in the file):
with open('test.txt') as f:
    data = f.read().splitlines()
replace = {1,2}
for i in replace:
    data[i] = 'Most'
print('\n'.join(data))
Output:
Flower
Most
Most
100
0
To overwrite the file you have opened with the replacements, you may use the following:
with open('test.txt', 'r+') as f:
    data = f.read().splitlines()
    replace = {1,2}
    for i in replace:
        data[i] = 'Most'
    f.seek(0)
    f.write('\n'.join(data))
    f.truncate()
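A minimal sketch of the same replacement written as a function. The names `replace_lines`, `line_numbers`, and `new_text` are hypothetical. It assumes 1-based line numbers, as in the question's list, and works on a list of lines so it can be tried without touching a file.

```python
def replace_lines(lines, line_numbers, new_text):
    """Return a copy of lines with the given 1-based line numbers replaced by new_text."""
    targets = set(line_numbers)
    return [new_text if number in targets else line
            for number, line in enumerate(lines, start=1)]


# Same content as the question's example file, already split into lines.
original = ["Flower", "Magnet", "5001", "100", "0"]
updated = replace_lines(original, [2, 3], "Most")
print("\n".join(updated))
```

Counting from 1 here avoids the manual shift to the zero-based indices {1, 2} used in the answer above.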
The reason you're having this problem is that when you read a line from a file opened in Python, you also get the newline character (\n) at the end. To solve this, you can use the str.strip() method, which removes leading and trailing whitespace, including the newline.
E.g.:
f = open("your_file.txt","r")
line = f.readlines()
lineToCheck = line[2].strip()
if(lineToCheck == "5001"):
    print("yes")
else:
    print("no")
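The newline issue described above can be made visible with repr(). This is a small sketch that uses io.StringIO as a stand-in for the real file (an assumption made so the example runs on its own).

```python
import io

# io.StringIO stands in for a real file here (an assumption for the demo).
fake_file = io.StringIO("Flower\nMagnet\n5001\n100\n0\n")
lines = fake_file.readlines()

raw = lines[2]
print(repr(raw))  # repr() makes the trailing newline visible

raw_matches = raw == "5001"               # fails because of the newline
stripped_matches = raw.strip() == "5001"  # succeeds once the newline is removed
print(raw_matches, stripped_matches)
```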

Need help on looping my code so it repeats instead of searching once

I don't understand how to make this loop.
term = input("")
file = open('file.txt')
for line in file:
    line.strip().split('/n')
    if term in line:
        print(line)
    if term in line:
        print('Not on database, (try using caps)')
file.close()
(I know it is incorrect.)
If by "repeat" you mean you want the user to be able to input a term multiple times, then just use a while loop.
while True:
    term = input("")
    file = open('file.txt')
    for line in file:
        line.strip().split('/n')
        if term in line:
            print(line)
        if term in line:
            print('Not on database, (try using caps)')
    file.close()
I am not sure how many times you want to be able to loop, but this loop will go indefinitely.
I guess what you need is this.
term = input("")
# file is a built-in name in Python 2, so it's better not to use it as a variable name.
f = open('file.txt')
# Put each line in the file into a list
content = f.readlines()
found = False
for line in content:
    # str.strip() returns a new string instead of modifying the original,
    # so assign the result back to line.
    line = line.strip()
    if term in line:
        print(line)
        found = True
# The original second check repeated the same condition as the first;
# report a miss only when no line matched.
if not found:
    print('Not on database, (try using caps)')
f.close()
You can open the file using with and loop through the lines.
term = input("")
with open('file.txt') as f:
    for line in f.readlines():
        if term in line.strip():
            print(line)
            break  # breaks out of the for loop at the first match
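One way to combine the ideas from the answers above is to put the search in a function and report a miss only after every line has been checked. The name `find_matches` is hypothetical, and the case-insensitive comparison is an assumption prompted by the question's "try using caps" message.

```python
def find_matches(lines, term):
    """Return the stripped lines that contain term, ignoring case."""
    wanted = term.lower()
    return [line.strip() for line in lines if wanted in line.lower()]


# Illustrative data standing in for the lines of file.txt.
database = ["Alice Smith\n", "Bob Jones\n", "alice cooper\n"]
hits = find_matches(database, "alice")
misses = find_matches(database, "carol")

for hit in hits:
    print(hit)
if not misses:
    print("Not on database")
```

To let the user search repeatedly, the call could sit inside a `while True:` loop that reads `term = input("")` each time and breaks when the term is empty.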

Parsing a text file in python and outputting to a CSV

Preface - I'm pretty new to Python, having had more experience in another language.
I have a text file with single column list of strings in the generic (but slightly varying) format "./abc123a1/type/1ab2_x_data_type.file.type"
I need to extract the abc123a1 and the 1ab2 portions from all several hundred of the rows and put them under two columns (column a and b) in a csv. Sometimes there may be a "1ab2_a" and a "1ab2_b", but I only want one 1ab2. So I'd want to grab "1ab2_a" and ignore all others.
I have the regex which I THINK will work:
tmp = list()
if re.findall(re.compile(r'^([a-zA-Z0-9]{4})_'), x):
    tmp = re.findall(re.compile(r'^([a-zA-Z0-9]{4})_'), x)
elif re.findall(re.compile(r'_([a-zA-Z0-9]{4})_'), x):
    tmp = re.findall(re.compile(r'_([a-zA-Z0-9]{4})_'), x)
if len(tmp) == 0:
    return None
elif len(tmp) > 1:
    print "ERROR found multiple matches"
    return "ERROR"
else:
    return tmp[0].upper()
I am trying to build this script step by step, testing as I go to make sure it works, but it just doesn't.
import sys
import csv
listOfData = []
with open(sys.argv[1]) as f:
    print "yes"
    for line in f:
        print line
    for line in f:
        listOfData.append([line])
    print listOfData
with open('extracted.csv', 'w') as out_file:
    writer = csv.writer(out_file)
    writer.writerow(('column a', 'column b'))
    writer.writerows(listOfData)
print listOfData
Still failing to get anything in the csv other than column headers, much less a parsed version!
Does anyone have any better ideas or formats I could do this in? A friend mentioned looking into glob.glob, but I haven't had luck getting that to work either.
IMHO, you were not far from making it work. The problem is that you read the whole file once just to print the lines, and then (now at the end of the file) you try to put them into a list... and get an empty list!
You should read the file only once:
import sys
import csv
listOfData = []
with open(sys.argv[1]) as f:
    print "yes"
    for line in f:
        print line
        listOfData.append([line])
    print listOfData
with open('extracted.csv', 'w') as out_file:
    writer = csv.writer(out_file)
    writer.writerow(('column a', 'column b'))
    writer.writerows(listOfData)
print listOfData
Once it works, you still have to use the regex to extract the relevant data to put into the CSV file.
I am not sure about your regex (it will most probably not work), but the reason your current simple, non-regex code does not work is this:
with open(sys.argv[1]) as f:
    print "yes"
    for line in f:
        print line
    for line in f:
        listOfData.append([line])
As you can see, you first iterate over each line in the file and print it, which is fine. But after that loop ends, the file pointer is at the end of the file, so iterating over it again does not produce any lines. You should iterate over the file only once, and do both the printing and the appending to the list in that loop. Example:
with open(sys.argv[1]) as f:
    print "yes"
    for line in f:
        print line
        listOfData.append([line])
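The exhaustion described above can be reproduced without a real file. This sketch uses io.StringIO as a stand-in (an assumption for the demo); it behaves like an open text file for iteration and for seek().

```python
import io

# io.StringIO behaves like an open text file for iteration and seek().
f = io.StringIO("first\nsecond\n")

first_pass = [line for line in f]    # reads every line
second_pass = [line for line in f]   # already at end of file, so nothing is left
f.seek(0)                            # rewind to the start
third_pass = [line for line in f]    # reads every line again

print(len(first_pass), len(second_pass), len(third_pass))
```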
I think at least part of the problem is the two for loops in the following:
with open(sys.argv[1]) as f:
    print "yes"
    for line in f:
        print line
    for line in f:
        listOfData.append([line])
The first one prints all the lines of f, so there's nothing left for the second one to iterate over unless you first rewind the file with f.seek(0).
An alternative would be to simply do this:
with open(sys.argv[1]) as f:
    print "yes"
    for line in f:
        print line
        listOfData.append([line])
It's hard to tell if your regexes are OK without more than one line of sample input data.
Are you sure you need all of the regular expressions? You seem to be parsing a list of paths and filenames. The path could be split up using a split command, for example:
print "./abc123a1/type/1ab2_a_data_type.file.type".split("/")
Would give:
['.', 'abc123a1', 'type', '1ab2_a_data_type.file.type']
You could then create a tuple consisting of the second entry and the part of the fourth entry up to the first '_', e.g.
('abc123a1', '1ab2')
Keeping these tuples in a set then lets you write only the first entry for each pair:
import csv
import sys

pairs = set()
# 'wb' is the Python 2 convention for csv output; in Python 3 use open(..., 'w', newline='')
with open(sys.argv[1], 'r') as in_file, open('extracted.csv', 'wb') as out_file:
    writer = csv.writer(out_file)
    for row in in_file:
        folders = row.split("/")
        col_a = folders[1]
        col_b = folders[3].split("_")[0]
        if (col_a, col_b) not in pairs:
            pairs.add((col_a, col_b))
            writer.writerow([col_a, col_b])
So for an input looking like this:
./abc123a1/type/1ab2_a_data_type.file.type
./abc123a1/type/1ab2_b_data_type.file.type
./abc123a2/type/1ab2_a_data_type.file.type
./abc123a3/type/1ab2_a_data_type.file.type
You would get a CSV file looking like:
abc123a1,1ab2
abc123a2,1ab2
abc123a3,1ab2
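For readers on Python 3, here is a hedged sketch of the same split-and-deduplicate idea. The function name `extract_pairs` is hypothetical, the header row comes from the question's own script, and io.StringIO stands in for the output file so the example runs without touching disk.

```python
import csv
import io


def extract_pairs(lines):
    """Yield unique (folder, prefix) pairs in first-seen order."""
    seen = set()
    for line in lines:
        parts = line.strip().split("/")
        if len(parts) < 4:
            continue  # skip lines that do not match the assumed layout
        pair = (parts[1], parts[3].split("_")[0])
        if pair not in seen:
            seen.add(pair)
            yield pair


sample = [
    "./abc123a1/type/1ab2_a_data_type.file.type\n",
    "./abc123a1/type/1ab2_b_data_type.file.type\n",
    "./abc123a2/type/1ab2_a_data_type.file.type\n",
]
pairs = list(extract_pairs(sample))

out = io.StringIO()  # stand-in for open('extracted.csv', 'w', newline='')
writer = csv.writer(out)
writer.writerow(["column a", "column b"])
writer.writerows(pairs)
csv_text = out.getvalue()
print(csv_text)
```

Separating the parsing from the writing makes it possible to check the extracted pairs on a few sample lines before producing the CSV.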

How to get 2nd thing out of every line using python and file parsing

I'm trying to parse a file with this structure:
0 rs41362547 MT 10044
1 rs28358280 MT 10550
...
and so forth, where I want the second item in each line to be put into an array. I know it should be pretty easy, but after a lot of searching, I'm still lost. I'm really new to Python. What would the script to do this look like?
Thanks!
You can split the lines using str.split:
with open('file.txt') as infile:
    result = []
    for line in infile:  # loop through the lines
        data = line.split(None, 2)[1]  # split, get the second column
        result.append(data)  # append it to our results
        print data  # just confirming
This will work:
with open('/path/to/file') as myfile:  # Open the file
    data = []  # Make a list to hold the data
    for line in myfile:  # Loop through the lines in the file
        data.append(line.split(None, 2)[1])  # Get the data and add it to the list
print (data)  # Print the finished list
The important parts here are:
str.split, which breaks up the lines based on whitespace.
The with-statement, which auto-closes the file for you when done.
Note that you could also use a list comprehension:
with open('/path/to/file') as myfile:
    data = [line.split(None, 2)[1] for line in myfile]
print (data)
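The answers above index with [1] directly, which raises IndexError on a blank or one-field line. A minimal sketch that skips such lines instead is shown below; the function name `second_column` is hypothetical, and skipping (rather than reporting) short lines is an assumption about what is wanted.

```python
def second_column(lines):
    """Return the second whitespace-separated field of each line that has one."""
    result = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or one-field lines instead of raising IndexError
            result.append(fields[1])
    return result


# Two rows from the question plus a blank line to exercise the guard.
sample = ["0 rs41362547 MT 10044\n", "1 rs28358280 MT 10550\n", "\n"]
ids = second_column(sample)
print(ids)
```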

Python- how to use while loop to return longest line of code

I just started learning python 3 weeks ago, I apologize if this is really basic. I needed to open a .txt file and print the length of the longest line of code in the file. I just made a random file named it myfile and saved it to my desktop.
myfile= open('myfile', 'r')
line= myfile.readlines()
len(max(line))-1
# (the "-1" is to remove the \n)
Is this code correct? I put it in interpreter and it seemed to work OK.
But I got it wrong because apparently I was supposed to use a while loop. Now I am trying to figure out how to put it in a while loop. I've read what it says on python.org, watched videos on youtube and looked through this site. I just am not getting it. The example to follow that was given is this:
import os
du=os.popen('du/urs/local')
while 1:
    line= du.readline()
    if not line:
        break
    if list(line).count('/')==3:
        print line,
print max([len(line) for line in file(filename).readlines()])
Taking what you have and stripping out the parts you don't need
myfile = open('myfile', 'r')
max_len = 0
while 1:
    line = myfile.readline()
    if not line:
        break
    if len(line) # ... somethin
        # something
Note that this is a crappy way to loop over a file. It relies on readline() returning an empty string at the end of the file. But homework is homework...
max(['b','aaa']) is 'b'
This lexicographic order isn't what you want to maximise. You can use the key argument to choose a different function to maximise, like len.
max(['b','aaa'], key=len) is 'aaa'
So the solution could be: len(max(['b','aaa'], key=len)) is 3.
A more elegant solution would be to use a generator expression:
max(len(line)-1 for line in myfile.readlines())
As an aside, you should open the file using a with statement, which takes care of closing the file at the end of the indented block:
with open('myfile', 'r') as mf:
    print max ( len(line)-1 for line in mf.readlines() )
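The difference between the default ordering and key=len is easy to check on a small list. The sample lines below are made up for illustration.

```python
lines = ["b\n", "aaa\n", "zz\n"]

lexicographic_max = max(lines)   # compares strings alphabetically
longest = max(lines, key=len)    # compares by length instead
longest_length = len(longest.rstrip("\n"))  # do not count the newline

print(repr(lexicographic_max), repr(longest), longest_length)
```

Using rstrip("\n") rather than subtracting 1 also gives the right answer when the last line of a file has no trailing newline.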
As others have mentioned, you need to find the line with the maximum length, which means giving the max() function a key= argument to extract that from each of the lines in the list you pass it.
Likewise, in a while loop you'd need to read each line and see if its length is greater than the longest one you have seen so far, which you could store in a separate variable and initialize to 0 before the loop.
BTW, you would not want to open the file with os.popen() as shown in your second example.
I think it will be easier to understand if we keep it simple:
max_len = -1  # Nothing was read so far
with open("filename.txt", "r") as f:  # Opens the file and magically closes at the end
    for line in f:
        max_len = max(max_len, len(line))
print max_len
As this is homework... I would ask myself whether I should count the line feed character or not. If you need to chop the last char, replace len(line) with len(line[:-1]).
If you have to use while, try this:
max_len = -1  # Nothing was read
with open("t.txt", "r") as f:  # Opens the file
    while True:
        line = f.readline()
        if(len(line)==0):
            break
        max_len = max(max_len, len(line[:-1]))
print max_len
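A Python 3 sketch of the same while loop, wrapped in a function so it can be tried on in-memory text. The name `longest_line_length` is hypothetical. It strips only the newline with rstrip("\n"), on the assumption that the newline should not be counted; slicing with [:-1] would also drop a real character from a final line that has no newline.

```python
import io


def longest_line_length(f):
    """Return the length of the longest line in f, not counting the newline."""
    max_len = 0
    while True:
        line = f.readline()
        if not line:  # readline() returns '' only at end of file
            break
        max_len = max(max_len, len(line.rstrip("\n")))
    return max_len


# The last line has no trailing newline, so slicing with [:-1] would undercount it.
result = longest_line_length(io.StringIO("ab\n\nabcdef"))
print(result)
```

Note that the blank line in the middle of the sample does not end the loop, because readline() returns "\n" for it rather than an empty string.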
For those still in need, here is a small function that does this:
def get_longest_line(filename):
    # The with statement closes the file even though the function returns inside the loop below
    with open(filename, "r") as open_file:
        all_text = open_file.readlines()
    length_lines_list = []
    for line in all_text:
        length_lines_list.append(len(line))
    max_length_line = max(length_lines_list)
    for line in all_text:
        if len(line) == max_length_line:
            return line.strip()
