fp = open ('data.txt','r')
saveto = open('backup.txt','w')
someline = fp.readline()
savemodfile = ''
while someline :
    temp_array = someline.split()
    print('temp_array[1] {0:20} temp_array[0] {0:20}'.format(temp_array[1], temp_array[0]), '\trating:', temp_array[len(temp_array)-1])
    someline = fp.readline()
    savemodfile = temp_array[1] + ' ' + temp_array[0] +',\t\trating:'+ temp_array[10]
    saveto.write(savemodfile + '\n')
fp.close()
saveto.close()
The input file, data.txt, has records of this pattern: firstname Lastname age address
I would like backup.txt to have this format: Lastname firstname address age
How do I store the data in backup.txt in a nicely formatted way? I think I should use the format() method somehow...
I use print in the code to show you what I have understood about format() so far. Of course, I do not get the desired results.
To answer your question:
You can indeed use the .format() method on a string template; see the documentation: https://docs.python.org/3.5/library/stdtypes.html#str.format
For example:
'the first parameter is {}, the second parameter is {}, the third one is {}'.format("this one", "that one", "there")
Will output: 'the first parameter is this one, the second parameter is that one, the third one is there'
You are not using format() correctly in your code: 'temp_array[1] {0:20} temp_array[0] {0:20}'.format(temp_array[1], temp_array[0]) will output something like 'temp_array[1] Lastname             temp_array[0] Lastname             '. That is because {0:20} always refers to the first argument passed to format() (index 0), padded on the right with spaces to 20 characters. The text temp_array[1] inside the template is a literal, not an expression; to print the second argument, use {1:20}.
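To make the difference concrete, here is a small sketch of positional indexes and width specifiers. The sample names are made up for illustration and the width of 10 is arbitrary.

```python
# Hypothetical sample values, used only to illustrate the format spec.
first, last = "John", "Smith"

# {0:10} and {1:10} pick the 1st and 2nd argument; strings are
# left-aligned and padded with spaces to 10 characters by default.
row = "{0:10}|{1:10}|".format(last, first)
print(row)

# Repeating index 0 prints the same argument twice.
repeated = "{0:10}|{0:10}|".format(last, first)
print(repeated)

# '>' right-aligns and '^' centers within the given width.
aligned = "{:>10}|{:^10}|".format(last, first)
print(aligned)
```

The second call reproduces the mistake in the question: both fields show the last name because both refer to index 0.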
Additionally, there are many things that could be improved in your code. I guess you are learning Python, so that's normal. Here is code that produces the output you want and makes good use of Python features and syntax:
with open('data.txt', 'rt') as finput, \
     open('backup.txt', 'wt') as foutput:
    for line in finput:
        # assumes exactly four whitespace-separated fields per line
        firstname, lastname, age, address = line.strip().split()
        foutput.write("{} {} {} {}\n".format(lastname, firstname, address, age))
This code will give you formatted output on the screen and in the output file:
fp = open('data.txt', 'r')
saveto = open('backup.txt', 'w')
someline = fp.readline()
while someline:
    temp_array = someline.split()
    # Lastname, firstname, address, age, each padded to 20 characters
    formatted = '{:20}{:20}{:20}{:20}'.format(temp_array[1], temp_array[0], temp_array[3], temp_array[2])
    print(formatted)
    saveto.write(formatted + '\n')
    someline = fp.readline()
fp.close()
saveto.close()
However, this is not a very nice way of working with files. Try using the following pattern instead:
with open('a', 'w') as a, open('b', 'w') as b:
    do_something()
Refer to: How can I open multiple files using "with open" in Python?
fp = open ('data.txt','r')
saveto = open('backup.txt','w')
someline = fp.readline()
savemodfile = ''
while someline :
    temp_array = someline.split()
    someline = fp.readline()
    savemodfile = '{:^20} {:^20} {:^20} {:^20}'.format(temp_array[1],temp_array[0],temp_array[3],temp_array[2])
    saveto.write(savemodfile + '\n')
fp.close()
saveto.close()
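One thing none of the snippets above handle is an address that contains spaces. The following is a minimal sketch, assuming the first three fields never contain whitespace and everything after them is the address. The helper name reformat_record and the sample record are hypothetical.

```python
def reformat_record(line, width=20):
    """Turn 'firstname Lastname age address' into aligned columns."""
    # maxsplit=3 keeps an address that contains spaces in one piece.
    firstname, lastname, age, address = line.split(maxsplit=3)
    # The nested {w} field sets the column width from the keyword argument.
    return "{:{w}} {:{w}} {:{w}} {}".format(
        lastname, firstname, address.strip(), age, w=width
    )

# Hypothetical sample record, not taken from the original data.txt.
print(reformat_record("John Smith 42 12 Main Street\n"))
```

Inside the loop over the input file you would then write reformat_record(line) + '\n' to the output file.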
I am a complete beginner with programming. I am trying to add some information to a txt file, but it doesn't work. It does print the parameters, but it won't add them to the txt file. All help will be appreciated.
def addpersons(student_number, name, phone_number):
    new_person = student_number + name + phone_number
    data = data + new_person
    with open("data.txt", 'w') as f:
        f.write (data)
print(200300, "Jim", "031213245123")
Is that all the code you have? You are computing data + new_person where data is not defined, which should throw an error. You probably don't see it because, if that is all your code, you are not calling the function at all.
To make it work, make sure you actually call the function addpersons, and make sure that data is defined before you do data = data + new_person.
Also, there shouldn't be a space between f.write and (data), but that is only a style issue and does not affect the result.
Here is a version that should work:
def addpersons(student_number, name, phone_number):
    new_person = str(student_number) + name + phone_number
    with open("data.txt", 'w') as f:
        f.write(new_person)
addpersons(200300, "Jim", "031213245123")
print(200300, "Jim", "031213245123")
I took a look at your code, and let's just say it's completely wrong. Also, in future please use the Markdown code feature with backticks to paste your code; it makes life much easier for people who try to answer. Anyway, I digress. Your first mistake is in this line:
new_person = student_number + name + phone_number
student_number is an integer; you cannot concatenate ints and strs in Python. You can use the str() builtin to convert it to a string.
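As a quick illustration of that first mistake, the sketch below shows the error and two ways around it. The values are the ones from the question; the variable names joined and formatted are just for this example.

```python
student_number = 200300  # an int, as in the question
name = "Jim"

try:
    record = student_number + name   # int + str is not allowed
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# Converting explicitly works, and format() converts for you.
joined = str(student_number) + name
formatted = "{} {}".format(student_number, name)
print(joined, formatted)
```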
Your next error is:
data = data + new_person
data is not defined before this. I assume you are doing this so you can store multiple people; however, you can achieve that by appending to the file instead of overwriting it. This is achievable by opening the file in append mode:
with open("data.txt", "a") as f:
Then you can just do:
with open("data.txt", "a") as f:
    f.write(new_person)
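The difference between the two modes can be seen in a short sketch. It writes to a throwaway temporary file rather than a real data.txt, and the two records are sample values with a space added as a separator.

```python
import os
import tempfile

# A throwaway file so the sketch does not touch a real data.txt.
path = os.path.join(tempfile.mkdtemp(), "data.txt")

for record in ("200300 Jim", "12345 Jorj"):
    with open(path, "w") as f:      # 'w' truncates on every open
        f.write(record + "\n")
with open(path) as f:
    after_write = f.read()

os.remove(path)
for record in ("200300 Jim", "12345 Jorj"):
    with open(path, "a") as f:      # 'a' adds to the end
        f.write(record + "\n")
with open(path) as f:
    after_append = f.read()

print(repr(after_write))    # only the last record survives
print(repr(after_append))   # both records are kept
```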
Try this:
def addpersons(student_number, name, phone_number):
    data = ""
    new_person = str(student_number) + name + str(phone_number)
    data = data + new_person
    with open("data.txt", 'a') as f:
        f.write(data + '\n')
addpersons(200300, "Jim", "03121324")
addpersons(12345, "Jorj", "098765434")
Or try this, which separates the fields with tabs:
def addpersons(student_number, name, phone_number):
    data = ""
    new_person = str(student_number) + "\t" + name + "\t" + str(phone_number)
    data = data + new_person
    with open("data.txt", 'a') as f:
        f.write(data + '\n')
addpersons(200300, "Jim", "03121324")
addpersons(12345, "Jorj", "098765434")
with open("C:\\Users\\Nav\\Desktop\\script\\names.txt", 'r+') as f:
    for x in range (0, 100):
        f_contents = f.readline()
        name = f_contents
        name2 = name
        print(name.lower().replace(" ", "") + "#gmail.com" + "\n")
x = input()
With this code, I am trying to read a file with a full name on each line and format it. That works fine, but when I add "#gmail.com" and print the result, it is printed on two different lines in the console.
For example, my output is
austenrush
#gmail.com
yuvaanduncan
#gmail.com
jawadpatton
#gmail.com
hanifarusso
#gmail.com
kerysbeck
#gmail.com
safiyamcguire
#gmail.com
oluwatobilobamiddleton
#gmail.com
while I would like to get:
austenrush#gmail.com
yuvaanduncan#gmail.com
jawadpatton#gmail.com
hanifarusso#gmail.com
kerysbeck#gmail.com
safiyamcguire#gmail.com
oluwatobilobamiddleton#gmail.com
readline doesn't strip the newline read from the file; you have to do that yourself.
f_contents = f.readline().rstrip("\n")
Files are iterable, though, so you don't need to call readline explicitly.
from itertools import islice
with open("C:\\Users\\Nav\\Desktop\\script\\names.txt", 'r+') as f:
    for f_contents in islice(f, 100):
        name = f_contents.rstrip("\n").lower().replace(" ", "")
        print(name + "#gmail.com" + "\n")
x = input()
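Here is a self-contained sketch of the same idea that runs without the file. io.StringIO stands in for names.txt, and the two full names are guesses modeled on the question's output. It also leaves out the extra "\n", since print() already ends each line.

```python
import io
from itertools import islice

# io.StringIO stands in for names.txt; the names are assumed sample data.
fake_file = io.StringIO("Austen Rush\nYuvaan Duncan\n")

addresses = []
for raw in islice(fake_file, 100):
    # rstrip("\n") drops the newline that iteration keeps on each line.
    name = raw.rstrip("\n").lower().replace(" ", "")
    addresses.append(name + "#gmail.com")

# print() already ends each line with a newline, so none is added here.
for address in addresses:
    print(address)
```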
I want to convert a file in this form. A record always begins with InvNo, and ~EOR~ marks the end of a record.
InvNo: 123
Tag1: rat cake
Media: d234
Tag2: rat pudding
~EOR~
InvNo: 5433
Tag1: strawberry tart
Tag5: 's got some rat in it
~EOR~
InvNo: 345
Tag2: 5
Media: d234
Tag5: rather a lot really
~EOR~
It should become
IN 123
UR blabla
**
IN 345
UR blibli
**
Where UR is a URL. I want to keep the InvNo as first tag. ** is now the end of record marker. This works:
impfile = filename[:4]
media = open(filename + '_earmark.dat', 'w')
with open(impfile, 'r') as f:
    HASMEDIA = False
    recordbuf = ''
    for line in f:
        if 'InvNo: ' in line:
            InvNo = line[line.find('InvNo: ')+7:len(line)]
            recordbuf = 'IN {}'.format(InvNo)
        if 'Media: ' in line:
            HASMEDIA = True
            mediaref = line[7:len(line)-1]
            URL = getURL(mediaref) # there's more to it, but that's not important now
            recordbuf += 'UR {}\n'.format(URL)
        if '~EOR~' in line:
            if HASMEDIA:
                recordbuf += '**\n'
                media.write(recordbuf)
            HASMEDIA = False
            recordbuf = ''
media.close()
Is there a better, more Pythonic way? Working with the recordbuffer and the HASMEDIA flag seems, well, old hat. Any examples or tips for good or better practice?
(Also, I'm open to suggestions for a more to-the-point title to this post)
You could set InvNo and URL initially to None, and only write a record when InvNo and URL are both truthy:
impfile = filename[:4]
with open(filename + '_earmark.dat', 'w') as media, open(impfile, 'r') as f:
    InvNo = URL = None
    for line in f:
        if line.startswith('InvNo: '):
            # strip the trailing newline so the record has no blank line
            InvNo = line[line.find('InvNo: ')+7:].rstrip('\n')
        if line.startswith('Media: '):
            mediaref = line[7:len(line)-1]
            URL = getURL(mediaref)
        if line.startswith('~EOR~'):
            if InvNo and URL:
                recordbuf = 'IN {}\nUR {}\n**\n'.format(InvNo, URL)
                media.write(recordbuf)
            InvNo = URL = None
Note: I changed 'InvNo: ' in line to line.startswith('InvNo: ') based on the assumption that InvNo always occurs at the beginning of the line. It appears to be true in your example, but the fact that you use line.find('InvNo: ') suggests that 'InvNo:' might appear anywhere in the line.
If InvNo: appears only at the beginning of the line, then use line.startswith(...) and remove line.find('InvNo: ') (since it would equal 0).
Otherwise, you'll have to retain 'InvNo:' in line and line.find (and of course, the same goes for Media and ~EOR~).
The problem with using code like 'Media' in line is that if the Tags can contain anything, it might contain the string 'Media' without being a true field header.
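A tiny sketch of that pitfall, using one real header line from the question and one made-up tag value that happens to mention Media:

```python
lines = [
    "Media: d234\n",
    "Tag5: see Media: d999 for details\n",  # hypothetical tag text
]

# Substring test matches both lines, including the tag value.
loose = [line for line in lines if "Media: " in line]

# Prefix test matches only the real field header.
strict = [line for line in lines if line.startswith("Media: ")]

print(len(loose), len(strict))
```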
Here is a version that avoids slicing. It opens the output file in append mode ('a') so that you can write to the same output file again; change 'a' to 'w' if you would rather overwrite it.
with open('input_file', 'r') as f, open('output.dat', 'a') as media:
    write_to_file = False
    first_line = ''
    lines = f.readlines()
    for line in lines:
        if line.startswith('InvNo:'):
            first_line = 'IN ' + line.split()[1] + '\n'
        if line.startswith('Media:'):
            write_to_file = True
        if line.startswith('~EOR~') and write_to_file:
            url = 'blabla' #Put getUrl() here
            media.write(first_line + 'UR ' + url + '\n' + '**\n')
            write_to_file = False
            first_line = ''
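Another way to get rid of the flag and the buffer is to collect each record into a dict first and decide what to write once the record is complete. This is only a sketch: read_records and get_url are hypothetical names, get_url is a placeholder for the real getURL() that the question does not show, and io.StringIO stands in for the input file.

```python
import io

def read_records(lines, end_marker="~EOR~"):
    """Yield one dict of {tag: value} per record."""
    record = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(end_marker):
            yield record
            record = {}
        elif ": " in line:
            # partition splits on the first ': ' only
            tag, _, value = line.partition(": ")
            record[tag] = value

def get_url(mediaref):
    # Placeholder: the real getURL() is not shown in the question.
    return "http://example.invalid/" + mediaref

sample = io.StringIO(
    "InvNo: 123\nTag1: rat cake\nMedia: d234\n~EOR~\n"
    "InvNo: 5433\nTag1: strawberry tart\n~EOR~\n"
)

output = []
for rec in read_records(sample):
    # Only records that have both an InvNo and a Media field are kept.
    if "InvNo" in rec and "Media" in rec:
        url = get_url(rec["Media"])
        output.append("IN {}\nUR {}\n**\n".format(rec["InvNo"], url))

print("".join(output), end="")
```

Separating parsing from writing also makes it easy to add further tags to the output later without touching the loop that reads the file.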
I have several test files kept in one directory.
I want to go through each file, search for the text "Text 1" and "Text 2", and print everything in front of this text to an output file.
I have done this using a Python script.
But next, I want only the first instance of "Text 1" and "Text 2" in each file. If I add break to the current script, nothing is printed to the output file.
Please guide me. I am a Python beginner.
import os
path = "D:\test"
in_files = os.listdir(path)
desc = open("desc.txt", "w")
print >> desc, "Mol_ID, Text1, Text2"
moldesc = ['Text1', 'Text2']
for f in in_files:
    file = os.path.join(path, f)
    text = open(file, "r")
    hit_count = 0
    hit_count1 = 0
    for line in text:
        if moldesc[0] in line:
            Text1 = line.split()[-1]
        if moldesc[1] in line:
            Text2 = line.split()[-1]
            print >> desc, f + "," + Text1 + "," + Text2
text.close()
print "Text extraction done !!!"
There are a couple of issues with your code:
Your text.close() should be at the same level as the for line in text loop.
The print >> desc statement is out of place: you should print only if both Text1 and Text2 are defined. You could set them to None just outside the for line in text loop, and test whether they are both not None. (Alternatively, you could set hit_count = 1 in the if moldesc[0] test and hit_count1 = 1 in the if moldesc[1] test, and then test for hit_count and hit_count1.) In that case, print the output and use a break to escape the loop.
In plain code:
for f in in_files:
    file = os.path.join(path, f)
    with open(file, "r") as text:
        hit_count = 0
        hit_count1 = 0
        for line in text:
            if moldesc[0] in line:
                Text1 = line.split()[-1]
                hit_count = 1
            if moldesc[1] in line:
                Text2 = line.split()[-1]
                hit_count1 = 1
            if hit_count and hit_count1:
                print >> desc, f + "," + Text1 + "," + Text2
                break
There's a third issue:
You mention wanting the text before Text1? Then you may want to use Text1 = line[:line.index(moldesc[0])] instead of your Text1 = line.split()[-1]...
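Putting the two ideas together (keep only the first hit per marker, and take the text before it), a minimal Python 3 sketch could look like this. The function name first_hits and the sample lines are made up for illustration; a list of strings stands in for the open file.

```python
def first_hits(lines, markers):
    """Return {marker: text before its first occurrence} for each marker."""
    found = {}
    for line in lines:
        for marker in markers:
            if marker not in found and marker in line:
                # Keep what precedes the marker on that line.
                found[marker] = line[:line.index(marker)].strip()
        if len(found) == len(markers):
            break  # every marker seen once; stop reading
    return found

sample = [
    "alpha Text1 1.5\n",
    "beta Text2 2.5\n",
    "gamma Text1 9.9\n",   # later occurrence, ignored
]
hits = first_hits(sample, ["Text1", "Text2"])
print(hits)
```

Because the dict is only filled once per marker, a later occurrence cannot overwrite the first one, and the break happens only after the values have been stored.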
I would go for an mmap and possibly use csv for the results file. Something like the following, which is untested and rough around the edges: it needs better error handling, you may want to use mm.find() instead of a regex, and some of the code is copied verbatim from the OP (and my computer's battery is about to die...).
import os
import re
import csv
import mmap
from collections import defaultdict
PATH = r"D:\test" # note 'r' prefix to avoid '\t' being read as a tab
in_files = os.listdir(PATH)
fout = open('desc.txt', 'w')
csvout = csv.writer(fout)
csvout.writerow( ['Mol_ID', 'Text1', 'Text2'] )
dd = defaultdict(list)
for filename in in_files:
    fin = open(os.path.join(PATH, filename))
    mm = mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ)
    # Find stuff (on Python 3 the pattern must be bytes: rb'(.*?)(Text[12])')
    matches = re.findall(r'(.*?)(Text[12])', mm) # maybe use finditer depending on exact needs
    for text, matched in matches:
        dd[matched].append(text)
    # do something with dd - write output using csvout.writerow()...
    mm.close()
    fin.close()
fout.close() # csv.writer has no close(); close the underlying file instead
I needed to create a program that would read a text file and count the number of lines, words and characters. I got it all working, as shown below, when the three counts are done separately, but I want to convert it to use functions so that it reads the file only once. I keep getting different answers and am unsure what I'm doing wrong.
Words Code
print ' '
fname = "question2.txt"
infile = open ( fname, 'r' )
fcontents = infile.read()
words = fcontents.split()
cwords = len(words)
print "Words: ",cwords
Characters Code
fname = "question2.txt"
infile = open ( fname, 'r' )
fcontents = infile.read()
char = len(fcontents)
print "Characters: ", char
Lines Code
fname = "question2.txt"
infile = open ( fname, 'r' )
fcontents = infile.readlines()
lines = len(fcontents)
print "Lines: ", lines
Correct Results
Words: 87
Characters: 559
Lines: 12
This is what I came up with while trying to use functions, but I just can't figure out what's wrong.
def filereader():
    fname = 'question2.txt'
    infile = open ( fname, 'r' )
    fcontents = infile.read()
    fcontents2 = infile.readlines()
    return fname, infile, fcontents, fcontents2
def wordcount(fcontents):
    words = fcontents.split(fcontents)
    cwords = len(words)
    return cwords
def charcount(fcontents):
    char = len(fcontents)
    return char
def linecount(fcontents2):
    lines = len(fcontents2)
    return lines
def main():
    print "Words: ", wordcount ('cwords')
    print "Character: ", charcount ('char')
    print "Lines: ", linecount ('lines')
main()
Wrong Results
Words: 2
Character: 4
Lines: 5
You need to use filereader in main:
def main():
    fname, infile, fcontents, fcontents2 = filereader()
    print "Words: ", wordcount (fcontents)
    print "Character: ", charcount (fcontents)
    print "Lines: ", linecount (fcontents2)
Otherwise, how would you obtain the values for fcontents and fcontents2 to pass to your other functions? You also need to fix filereader to make sure it will read the file once:
def filereader():
    fname = 'question2.txt'
    infile = open ( fname, 'r' )
    fcontents = infile.read()
    fcontents2 = fcontents.splitlines(True)
    return fname, infile, fcontents, fcontents2
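Note that the line for fcontents2 has been modified to split fcontents on newlines (see str.splitlines). This will also give you a list of strings, as .readlines() would. Also note that wordcount should call fcontents.split() with no argument: fcontents.split(fcontents) splits the text on itself and always returns a list of two empty strings, which is why you get 2.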
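For reference, here is a compact Python 3 sketch that computes all three counts from a single string, so the file only needs to be read once. The function name count_text and the sample text are made up for illustration.

```python
def count_text(text):
    """Return (lines, words, characters) for a string."""
    lines = len(text.splitlines())
    words = len(text.split())      # split() with no argument splits on whitespace
    chars = len(text)              # includes newline characters
    return lines, words, chars

sample = "one two\nthree four five\n"
counts = count_text(sample)
print("Lines: {}  Words: {}  Characters: {}".format(*counts))
```

With a real file you would pass in the result of a single infile.read() call.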
infile = open ( fname, 'r' )
fcontents = infile.read()
fcontents2 = infile.readlines()
You cannot read a file twice without rewinding it.
When you read from a file, the file handle remembers its position in the file. Thus, after your call to infile.read(), the position of infile will be at the end of the file. When you then call infile.readlines(), it will try to read all the characters between its current position and the end of the file, and hence return an empty list.
You can rewind the file to its initial position using infile.seek(0). Thus:
>>> fcontents = infile.read()
>>> infile.seek(0)
>>> flines = infile.readlines()
will work.
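The behaviour is easy to check without a real file. In this sketch io.StringIO stands in for the open file, and the two sample lines are made up.

```python
import io

infile = io.StringIO("first line\nsecond line\n")  # stands in for an open file

contents = infile.read()          # position is now at end of file
second_read = infile.readlines()  # nothing left to read

infile.seek(0)                    # rewind to the start
lines = infile.readlines()

print(repr(second_read), len(lines))
```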
Alternatively, having read the file into the string fcontents, you can split the string into lines using splitlines:
>>> fcontents = infile.read()
>>> flines = fcontents.splitlines()