Trying to make columns in text file from python - python

So far I have this. I opened the data file, built a list from the data, and printed the fields I needed from the list in 2 columns correctly. It shows up in Python just fine. But when I try to write it to a txt file, it all shows up on 1 line. I'm not sure what to do so that it ends up in 2 columns in the new text file.
# open file
data = open("BigCoCompanyData.dat", "r")
data.readline()
# skip header and print number of employees
n = eval(data.readline())
print(n)
# read in employee information
longest = 0
# save phone list in text file
phoneFile = open("PhoneList.txt", "w")
for i in range(n):
    lineI = data.readline().split(",")
    nameLength = len(lineI[1])+len(lineI[2])
    if nameLength > longest:
        longest = nameLength
        longest = longest + 5
    print((lineI[2].title()+", "+lineI[1].title()).ljust(longest) + ("("+lineI[-2][0:3]+")"+lineI[-2][3:6]+"-"+lineI[-2][6:10]).rjust(14))
    phoneFile.write((lineI[2].title()+", "+lineI[1].title()).ljust(longest) + ("("+lineI[-2][0:3]+")"+lineI[-2][3:6]+"-"+lineI[-2][6:10]).rjust(14))
data.close()
# close the file
phoneFile.close()

phoneFile.write(...) simply writes the string you give it and does not add a line break. Each call continues right after the previous text, so everything ends up on one line unless you end each string with \n.
phoneFile.write((lineI[2].title()+", "+lineI[1].title()).ljust(longest) +
("("+lineI[-2][0:3]+")"+lineI[-2][3:6]+"-"+lineI[-2][6:10]).rjust(14)+'\n')
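The same idea can be sketched in a self-contained form. This is only an illustration: the helper names format_phone_row and write_phone_list, the sample rows, and the fixed name width of 20 are assumptions, not part of the original script. An in-memory buffer stands in for the output file so the snippet runs without any data file.

```python
import io

def format_phone_row(last, first, digits, name_width):
    """Build one two-column row and end it with a newline."""
    name = f"{last.title()}, {first.title()}".ljust(name_width)
    phone = f"({digits[0:3]}){digits[3:6]}-{digits[6:10]}".rjust(14)
    return name + phone + "\n"  # write() adds no newline, so we add it here

def write_phone_list(out, rows, name_width=20):
    """Write every (last, first, digits) row to a file-like object."""
    for last, first, digits in rows:
        out.write(format_phone_row(last, first, digits, name_width))

# Hypothetical sample data; a real run would pass an open text file instead
buffer = io.StringIO()
write_phone_list(buffer, [("smith", "anna", "5551234567"),
                          ("jones", "bob", "5559876543")])
print(buffer.getvalue(), end="")
```

Because each row carries its own "\n", the rows land on separate lines in the file just as they do with print.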

Related

How can i edit several numbers/words in a txt file using python?

I want to rewrite an existing file with things like:
Tom A
Mike B
Jim C
to
Tom 1
Mike 2
Jim 3
The letters A, B, C can also be something else. Basically I want to keep the spaces between the names and what comes after them, but change the letters to numbers. Does someone have an idea please? Thanks a lot for your help.
I assume your first and second columns are separated by a tab (i.e. \t)?
If so, you can do this by reading the file into a list, use the split function to split each line of the file into components, edit the second component of each line, concatenate the two components back together with a tab separator and finally rewrite to a file.
For example, if test.txt is your input file:
# Create list that holds the desired output
output = [1,2,3]
# Open the file to be overwritten
with open('test.txt', 'r') as f:
    # Read file into a list of strings (one string per line)
    text = f.readlines()
# Open the file for writing (FYI this CLEARS the file as we specify 'w')
with open('test.txt', 'w') as f:
    # Loop over lines (i.e. elements) in `text`
    for i,item in enumerate(text):
        # Split line into elements based on whitespace (default for `split`)
        line = item.split()
        # Concatenate the name and desired output with a tab separator and write to the file
        f.write("%s\t%s\n" % (line[0],output[i]))
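If the original spacing between the two columns has to survive unchanged, as the question asks, one option is to capture the gap with a regular expression and reuse it. This is a minimal sketch under the assumption that every line has exactly two whitespace-separated fields; the function name renumber_lines is made up for illustration.

```python
import re

def renumber_lines(lines, new_values):
    """Replace the second field of each line, keeping the original gap."""
    result = []
    for line, value in zip(lines, new_values):
        # name, the run of whitespace after it, then the old second field
        match = re.match(r"^(\S+)(\s+)\S+\s*$", line)
        if match is None:
            result.append(line)  # leave lines with another shape untouched
        else:
            name, gap = match.groups()
            result.append(f"{name}{gap}{value}\n")
    return result

print(renumber_lines(["Tom   A\n", "Mike\tB\n"], [1, 2]))
```

The returned list can then be written back with writelines() on a file opened in 'w' mode.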
I assumed your first and second columns are separated by spaces in the file.
You can read the file contents into a list and call replace_end(line, newline) on each line, which overwrites the last len(newline) characters of the line with the text you pass. Then you can just write the changed list back to the file.
""" rewrite an existing file """
def main():
    """ main """
    filename = "update_me.txt"
    count = 0
    lst = []
    with open(filename, "r",encoding = "utf-8") as filestream:
        _lines = filestream.readlines()
        for line in _lines:
            lst.insert(count,line.strip())
            count += 1
            #print(f"Line {count} {line.strip()}")
    count = 0
    # change the list
    for line in lst:
        lst[count] = replace_end(line,"ABC")
        count +=1
    count = 0
    with open(filename, "w", encoding = "utf-8") as filestream:
        for line in lst:
            filestream.write(line+"\n")
            count +=1
def replace_end(line,newline):
    """ replace the end of a line """
    return line[:-len(newline)] + newline
if __name__ == '__main__':
    main()

I am struggling with reading specific words and lines from a text file in python

I want my code to find what the user has asked for and print the 5 following lines. For example, if the user entered "james" into the system, I want it to find that name in the text file and read the 5 lines below it. Is this even possible? All I have found while looking through the internet is how to read specific lines.
So, you want to read a .txt file and find, let's say, the word James and the 5 lines after it.
Our example text file is as follows:
Hello, this is line one
The word James is on this line
Hopefully, this line will be found later,
and this line,
and so on...
are we at 5 lines yet?
ah, here we are, the 5th line away from the word James
Hopefully, this should not be found
Let's think through what we have to do.
What We Have to Do
Open the text file
Find the line where the word 'James' is
Find the next 5 lines
Save it to a variable
Print it
Solution
Let's just call our text file info.txt. You can call it whatever you want.
To start, we must open the file and save it to a variable:
file = open('info.txt', 'r') # The 'r' allows us to read it
Then, we must save the data from it to another variable, we shall do it as a list:
file_data = file.readlines()
Now, we iterate (loop through) the lines with a for loop, and we save the index of the line that 'James' is on to another variable:
index = 'Not set yet'
for x in range(len(file_data)):
    if 'James' in file_data[x]:
        index = x
        break
if index == 'Not set yet':
    print('The word "James" is not in the text file.')
As you can see, it iterates through the list, and checks for the word 'James'. If it finds it, it breaks the loop. If the index variable still is equal to what it was originally set as, it obviously has not found the word 'James'.
Next, we should find the next five lines and save them to another variable:
five_lines = [file_data[index]]
for x in range(5):
    try:
        five_lines.append(file_data[index + x + 1])
    except IndexError:
        print(f'There are not five full lines after the word James. {x + 1} have been recorded.')
        break
Finally, we shall print all of these:
for i in five_lines:
    print(i, end='')
Done!
Final Code
file = open('info.txt', 'r') # The 'r' allows us to read it
file_data = file.readlines()
file.close()
index = 'Not set yet'
for x in range(len(file_data)):
    if 'James' in file_data[x]:
        index = x
        break
if index == 'Not set yet':
    print('The word "James" is not in the text file.')
else:
    # Only collect and print lines when the word was actually found
    five_lines = [file_data[index]]
    for x in range(5):
        try:
            five_lines.append(file_data[index + x + 1])
        except IndexError:
            print(f'There are not five full lines after the word James. {x + 1} have been recorded.')
            break
    for i in five_lines:
        print(i, end='')
I hope that I have been helpful.
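Since the question has the user typing "james" in lower case, a case-insensitive variant may be handy. The following is only a sketch: the function name lines_after_match and the sample list are assumptions, and list slicing is used instead of try/except because a slice simply stops at the end of the list.

```python
def lines_after_match(lines, keyword, count=5):
    """Return the first line containing keyword plus up to `count` lines after it."""
    needle = keyword.lower()
    for index, line in enumerate(lines):
        if needle in line.lower():
            # Slicing past the end of a list is safe and just returns fewer lines
            return lines[index:index + count + 1]
    return []  # keyword not found

# Hypothetical sample; a real run would use file.readlines() instead
sample = ["Hello\n", "The word James is here\n", "a\n", "b\n"]
for text in lines_after_match(sample, "james"):
    print(text, end="")
```

An empty list as the return value signals that the word was not found, so the caller can print its own message.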
Yeah, sure. Say the keyword you're searching for ("james") is keywrd and Xlines is the number of lines after a match you want to return, plus one (the inner loop runs from 1 to Xlines, so passing 6 returns 5 lines).
def readTextandSearch1(txtFile, keywrd, Xlines):
    with open(txtFile, 'r') as f: #Note, txtFile is a full path & filename
        allLines = f.readlines() #Send all the lines into a list
    #with automatically closes txt file at exit
    temp = [] #Dim it here so you've "something" to return in event of no match
    for iLine in range(0, len(allLines)):
        if keywrd in allLines[iLine]:
            #Found keyword in this line, want the next X lines for returning
            maxScan = min(len(allLines) - iLine, Xlines+1) #Use this to avoid trying to address beyond end of text file.
            for iiLine in range(1, maxScan):
                temp.append(allLines[iLine+iiLine])
            break #On assumption of only one entry of keywrd in the file, can break out of "for iLine" loop
    return temp
Then by calling readTextandSearch1() with appropriate parameters, you'll get a list back that you can print at your leisure. I'd take the return as follows:
rtn1 = readTextandSearch1("C:\\Docs\\Test.txt", "Jimmy", 6)
if rtn1: #This checks was Jimmy within Test.txt
    #Jimmy was in Test.txt
    print(rtn1)

Search for values in all text files and multiply them by fixed value ? (in PYTHON ?)

Hi friends.
I have a lot of files that contain text information, but I want to search only specific lines, and then in these lines find the values at specific positions and multiply them by a fixed value (or one entered as input).
Example text:
1,0,0,0,1,0,0
15.000,15.000,135.000,15.000
7
3,0,0,0,2,0,0
'holep_str',50.000,-15.000,20.000,20.000,0.000
3
3,0,0,100,3,-8,0
58.400,-6.600,'14',4.000,0.000
4
3,0,0,0,3,-8,0
50.000,-15.000,50.000,-15.000
7
3,0,0,0,4,0,0
'holep_str',100.000,-15.000,14.000,14.000,0.000
3
3,0,0,100,5,-8,0
108.400,-6.600,'14',4.000,0.000
And I want to identify and modify only lines with "holep_str" text:
'holep_str',50.000,-15.000,20.000,20.000,0.000
'holep_str',100.000,-15.000,14.000,14.000,0.000
Each line that begins with the string "holep_str" contains two numbers of interest, the values after the 3rd and 4th commas:
20.000 20.000
14.000 14.000
And these can be identified like:
1./ number after 3rd comma on line beginning with "holep_str"
2./ number after 4th comma on line beginning with "holep_str"
RegEx cannot help; Python probably can, but I'm pressed for time and can get no further with the language...
Can somebody explain how to write this relatively simple code, which finds all lines with the "search string" (= "holep_str") and multiplies the values after the 3rd and 4th comma by FIXVALUE (or by an input value, for example "2")?
The code should walk through all files with a defined extension (chosen by input, for example txt) in the folder where the code is executed, find the values on the needed lines, multiply them, and write them back...
So it looks like - if FIXVALUE = 2:
'holep_str',50.000,-15.000,40.000,40.000,0.000
'holep_str',100.000,-15.000,28.000,28.000,0.000
And the whole text then looks like:
1,0,0,0,1,0,0
15.000,15.000,135.000,15.000
7
3,0,0,0,2,0,0
'holep_str',50.000,-15.000,40.000,40.000,0.000
3
3,0,0,100,3,-8,0
58.400,-6.600,'14',4.000,0.000
4
3,0,0,0,3,-8,0
50.000,-15.000,50.000,-15.000
7
3,0,0,0,4,0,0
'holep_str',100.000,-15.000,28.000,28.000,0.000
3
3,0,0,100,5,-8,0
108.400,-6.600,'14',4.000,0.000
Thank You.
# file_path is the path of the file you want to process
with open(file_path) as f:
    lines = f.readlines()
for line in lines:
    if line.startswith(r"'holep_str'"):
        split_line = line.split(',')
        num1 = float(split_line[3])
        num2 = float(split_line[4])
        print(num1, num2)
        # do stuff with num1 and num2
Once you .split() the line on "," you get a list. Then you can pick out the values you want by index, which are 3 and 4 in your case, and convert them to float.
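To go one step further and write the multiplied values back in the same three-decimal style as the input, something like the following could work. It is a sketch, not a tested drop-in: the function name scale_fields and its parameters are invented here, and the three-decimal format is assumed from the sample data in the question.

```python
def scale_fields(line, factor, marker="'holep_str'", fields=(3, 4)):
    """Multiply the chosen comma-separated fields of a marked line by factor."""
    if not line.startswith(marker):
        return line  # other lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    parts = line.rstrip("\n").split(",")
    for i in fields:
        # Format with three decimals to match values such as 20.000
        parts[i] = f"{float(parts[i]) * factor:.3f}"
    return ",".join(parts) + newline

print(scale_fields("'holep_str',50.000,-15.000,20.000,20.000,0.000\n", 2), end="")
```

Note that plain str() on a float would give 40.0 rather than 40.000, which is why the format specifier is used.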
Here is the final solution, the whole program (version: python-3.6.0-amd64):
# import external functions / extensions ...
import os
import glob
# functions definition section
def fnc_walk_through_files(path, file_extension):
    for (dirpath, dirnames, filenames) in os.walk(path):
        for filename in filenames:
            if filename.endswith(file_extension):
                # join with dirpath so files found in subfolders resolve correctly
                yield os.path.join(dirpath, filename)
# some variables for counting
line_count = 0
# Feed data to program by entering them on keyboard
print ("Enter work path (e.g. d:\\test) :")
workPath = input( "> " )
print ("File extension to perform Search-Replace on [spf] :")
fileExt = input( "> " )
print ("Enter multiplier value :")
multiply_value = input( "> " )
print ("Text to search for :")
textToSearch = input( "> " )
# create temporary variable with path and mask for deleting all ".old" files
delPath = os.path.join(workPath, "*.old")
# delete old ".old" files to allow creating backups
for files_to_delete in glob.glob(delPath, recursive=False):
    os.remove(files_to_delete)
# do some needed operations...
print("\r") #enter new line
multiply_value = float(multiply_value) # convert multiplier to float
textToSearch_mod = "\'" + textToSearch # prepend apostrophe to the searched text
textToSearch_mod = str(textToSearch_mod) # convert variable to string for later use
# print information line of what will be searched for
print ("This is what will be searched for, to identify right line: ", textToSearch_mod)
print("\r") #enter new line
# walk through all files with specified extension <-- CALLED FUNCTION !!!
for fname in fnc_walk_through_files(workPath, fileExt):
    print("\r") # enter new line
    # print filename of processed file
    print(" Filename processed:", fname )
    # and process every file and print out numbers
    # needed for multiplying, located at 3rd and 4th position
    with open(fname, 'r') as f: # opens fname file for reading
        temp_file = open('tempfile','w') # open (create) tempfile for writing
        lines = f.readlines() # read lines from f:
    line_count = 0 # reset counter
    # loop through all lines
    for line in lines:
        # line counter increment
        line_count = line_count + 1
        # if line starts with defined string - it will be processed
        if line.startswith(textToSearch_mod):
            # line will be divided into parts delimited by ","
            split_line = line.split(',')
            # transfer 3rd part to variable 1 and make it float number
            old_num1 = float(split_line[3])
            # transfer 4th part to variable 2 and make it float number
            old_num2 = float(split_line[4])
            # multiply both variables
            new_num1 = old_num1 * multiply_value
            new_num2 = old_num2 * multiply_value
            # change old values to new multiplied values as strings
            split_line[3] = str(new_num1)
            split_line[4] = str(new_num2)
            # join the line back with the same delimiter "," as used for dividing
            line = ','.join(split_line)
            # print information line on which the searched string occurred
            print ("Changed from old:", old_num1, old_num2, "to new:", new_num1, new_num2, "at line:", line_count)
            # write changed line with multiplied numbers to temporary file
            temp_file.write(line)
        else:
            # write all other unchanged lines to temporary file
            temp_file.write(line)
    # create new name for backup file with adding ".old" to the end of filename
    new_name = fname + '.old'
    # rename original file to new backup name
    os.rename(fname,new_name)
    # close temporary file to enable future operation (in this case rename)
    temp_file.close()
    # rename temporary file to original filename
    os.rename('tempfile',fname)
So, two days after asking, with the help of good people and hard study of the language :-D (indentation was my nightmare), and using some snippets of code from this site, I have created something that works... :-) I hope it helps other people with a similar question...
At beginning the idea was clear - but no knowledge of the language...
Now - all can be done - only what man can imagine is the border :-)
I miss GOTO in Python :'( ... I love spaghetti, not the spaghetti code, but sometimes it would be good to have some label<--goto jumps... (but this is not the case...)

Not able to frame text while adding a line to middle of file in python

My text.txt looks like this
abcd
xyzv
dead-hosts
-abcd.srini.com
-asdsfcd.srini.com
I want to insert a few lines after the "dead-hosts" line, so I wrote a script to add lines to the file. There is an extra space at the start of each of the last lines, which is mandatory in my file, but after the new lines were added that space got removed. I don't know how to keep the space as it is.
Here is my script
Failvrlist = ['srini.com','srini1.com']
tmplst = []
with open('test.txt','r+') as fd:
    for line in fd:
        tmplst.append(line.strip())
    pos = tmplst.index('dead-hosts:')
    tmplst.insert(pos+1,"#extra comment ")
    for i in range(len(Failvrlist)):
        tmplst.insert(pos+2+i," - "+Failvrlist[i])
    tmplst.insert(pos+len(Failvrlist)+2,"\n")
    for i in xrange(len(tmplst)):
        fd.write("%s\n" %(tmplst[i]))
output is as below
abcd
xyzv
dead-hosts
#extra comment
- srini.com
- srini1.com
- abcd.srini.com
- asdsfcd.srini.com
If you look at the last two lines, the space got removed. Please advise.
Points:
In your code, pos = tmplst.index('dead-hosts:'), you are trying to find dead-hosts: (with a colon). However, the input file you have given contains only "dead-hosts", with no colon after it. I am assuming the line is dead-hosts:
While reading the file into the list the first time, use rstrip() instead of strip(). rstrip() removes only trailing whitespace, so the spaces at the start of a line are kept as they are.
Once you have read the file into the list, the code after that should be outside the with block that is used to open and read the file.
Actually, the flow of the code should be:
Open file and read lines to list and close the file.
Modify list by inserting values at specific index.
Write the file again.
Code:
Failvrlist = ['srini.com','srini1.com']
tmplst = []
#Open file and read it
with open('result.txt','r+') as fd:
    for line in fd:
        tmplst.append(line.rstrip())
#Modify list
pos = tmplst.index('dead-hosts:')
tmplst.insert(pos+1,"#extra comment")
pos = tmplst.index('#extra comment')
a = 1
for i in Failvrlist:
    to_add = " -" + i
    tmplst.insert(pos+a,to_add)
    a+=1
#Write to file
with open('result.txt','w') as fd:
    for i in range(len(tmplst)):
        fd.write("%s\n" %(tmplst[i]))
Content of result.txt:
abcd
xyzv
dead-hosts:
#extra comment
-srini.com
-srini1.com
-abcd.srini.com
-asdsfcd.srini.com
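The three-step flow above (read, modify, write) can also be sketched with the modify step as a small function that works on a list of lines. The name insert_after and the sample lines are assumptions made for illustration; only the trailing newline is stripped, so leading spaces are preserved.

```python
def insert_after(lines, marker, new_lines):
    """Return the lines with new_lines inserted right after the marker line."""
    # Strip only the trailing newline so leading spaces stay intact
    stripped = [line.rstrip("\n") for line in lines]
    pos = stripped.index(marker)  # raises ValueError if the marker is missing
    return stripped[:pos + 1] + list(new_lines) + stripped[pos + 1:]

# Hypothetical file content; a real run would use the lines read from the file
original = ["abcd\n", "dead-hosts:\n", " -abcd.srini.com\n"]
for text in insert_after(original, "dead-hosts:", ["#extra comment", " -srini.com"]):
    print(text)
```

Writing the result back is then a loop over the returned list that appends "\n" to each entry.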

Separating letters in file based on position

I have one .fa file with a letter sequence like ACGGGGTTTTGGGCCCGGGGG and a .txt file with numbers that give start and stop positions, like start 2 stop 7. How can I extract only the letters at those positions from my .fa file and create a new file that contains only those letters? I wrote the code below but I get the error "string index out of range". My positions txt file is just a list of positions like [[1,52],[66,88].....
my_file = open('dna.fa')
transcript = my_file.read()
positions = open('exons.txt')
positions = positions.read()
coding_sequence = '' # declare the variable
for i in xrange(len(positions)):
    start = positions[i][0]
    stop = positions[i][1]
    exon = transcript[start:stop]
    coding_sequence = coding_sequence + exon
print coding_sequence
Assuming that your positions are stored in a list called positions, that the name of your infile is infile.fa, and the name of your outfile is outfile.fa:
with open("infile.fa") as infile:
    text = infile.read()
letters = "".join(text[i] for i in positions)
with open("outfile.fa", "w") as outfile:
    outfile.write(letters)
As has been mentioned in @KIDJourney's comment, this could theoretically fail for files so large that there is not enough memory to hold them. Here is how you could do it if that is the case:
with open("infile.fa") as infile:
    with open("outfile.fa", "a") as outfile:
        outfile.seek(0)
        i = 0
        for line in infile:
            for char in line:
                if i in positions:
                    outfile.write(char)
                i += 1
If you are trying to do this job with a VERY LARGE file, @zondo's solution may fail for lack of RAM.
You can use seek when you only need to read part of a file.
def readData(filename , start_pos , end_pos):
    with open(filename) as f :
        f.seek(start_pos)
        data = f.read(end_pos - start_pos)
    return data
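As for the error in the question itself: positions.read() returns one long string, so positions[i] is a single character and positions[i][1] is what raises "string index out of range". A minimal sketch of parsing the list first is shown below. It assumes the positions file holds a JSON-style list such as [[1, 5], [8, 10]] and that the pairs are 0-based with the stop excluded, as in a Python slice; the function name extract_regions and the sample values are made up.

```python
import json

def extract_regions(sequence, positions):
    """Concatenate the slices sequence[start:stop] for every (start, stop) pair."""
    return "".join(sequence[start:stop] for start, stop in positions)

# Hypothetical stand-ins for the contents of exons.txt and dna.fa
positions = json.loads("[[1, 5], [8, 10]]")  # parse the text into real lists of ints
print(extract_regions("ACGGGGTTTTGGG", positions))
```

If the file uses 1-based or inclusive positions instead, the slice bounds would need adjusting accordingly.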
