Reading strings and decreasing them - python

I just have a simple question when dealing with text files:
I have a text file and want to write a Python program that reads it and replaces every number it finds with the number preceding it; for example, 4 becomes 3. How can I do that?
The problem for me is that Python reads the numbers as strings, not integers, so it can't decrease or increase them.
out = open("out.txt", "w")
with open("Spider-Man.Homecoming.2017.1080p.BluRay.x264-[YTS.AG].txt", "r") as file:
    lines = file.readlines()
    for line in lines:
        if line.isdigit():
            out.write(str(int(line - 1)))
        else:
            out.write(line)
This code doesn't detect the numbers as numbers and I don't know why.

Putting @Samwise's comment together with your code:
with open("Spider-Man.Homecoming.2017.1080p.BluRay.x264-[YTS.AG].txt", "r") as file:
    lines = file.readlines()
new_lines = []
for line in lines:
    decreased = ''.join(str(int(c)-1) if c.isdigit() else c for c in line)
    new_lines.append(decreased)
with open('out.txt', 'w') as out:
    out.writelines(new_lines)
You should also close the file after writing to it, so I switched to with open at the end, which is a better way to write to a file.
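Note that the per-character approach above decrements each digit on its own, so 10 becomes 0-1 rather than 9. If whole numbers should be decremented instead, a regular expression can match each run of digits as one number. This is only a sketch: the function name decrement_numbers is made up for illustration, and it assumes numbers are plain runs of digits (no signs or decimals).

```python
import re

def decrement_numbers(text):
    """Return text with every run of digits replaced by that number minus one."""
    # \d+ matches a whole run of digits, so "10" is treated as ten, not as 1 and 0.
    return re.sub(r"\d+", lambda m: str(int(m.group()) - 1), text)

print(decrement_numbers("4 apples, 10 pears, 0 plums"))
# prints: 3 apples, 9 pears, -1 plums
```

The same function can be applied to each line of the file before writing it to out.txt. Leading zeros are not preserved (007 becomes 6), which may or may not matter for your file.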

Related

Open and Read a CSV File without libraries

I have the following problem. I am supposed to open a CSV file (it's an Excel table) and read it without using any library.
I have already tried a lot and now have the first row as a tuple inside a list. But only the first line, the header, and no other row.
This is what I have so far.
with open(path, 'r+') as file:
    results=[]
    text = file.readline()
    while text != '':
        for line in text.split('\n'):
            a=line.split(',')
            b=tuple(a)
            results.append(b)
        return results
The output should be every line as a tuple, with all the tuples in a list.
My question now is: how can I read the other lines in Python?
I am really sorry, I am new to programming altogether, so I have a really hard time finding my mistake.
Thank you very much in advance for helping me out!
This problem has come up many times on Stack Overflow, so you should be able to find working code.
But it is much better to use the csv module for this.
Your indentation is wrong, and you use return results after reading the first line, so it exits the function and never tries to read the other lines.
Even after changing this there are other problems, so it still will not read the next lines.
You use readline(), so you read only the first line, and your loop works on that same line the whole time; it may never end because text never becomes ''.
You should use read() to get all the text, which you then split into lines using split("\n"), or use readlines() to get all lines as a list, in which case you don't need split(). Or you can use for line in file: directly. In all of these cases you don't need while.
# Version 1: read() the whole text and split it into lines
def read_csv(path):
    with open(path, 'r') as file:
        results = []
        text = file.read()
        for line in text.split('\n'):
            items = line.split(',')
            results.append(tuple(items))
    # after for-loop
    return results
# Version 2: readlines() returns the lines as a list
def read_csv(path):
    with open(path, 'r') as file:
        results = []
        lines = file.readlines()
        for line in lines:
            line = line.rstrip('\n') # remove `\n` at the end of line
            items = line.split(',')
            results.append(tuple(items))
    # after for-loop
    return results
# Version 3: iterate over the file object directly
def read_csv(path):
    with open(path, 'r') as file:
        results = []
        for line in file:
            line = line.rstrip('\n') # remove `\n` at the end of line
            items = line.split(',')
            results.append(tuple(items))
    # after for-loop
    return results
None of these versions will work correctly if an item contains '\n' or , that shouldn't be treated as the end of a row or as a separator between items. Such items will be wrapped in " ", and removing those quotes can cause further problems. You can resolve all of these problems using the standard csv module.
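For comparison, here is roughly what the csv-module version could look like. It is a minimal sketch that keeps the read_csv name from the versions above and assumes a comma-separated file with the default quoting rules; it does use a library, so it only applies if that restriction can be lifted.

```python
import csv

def read_csv(path):
    """Read a CSV file into a list of tuples, one tuple per row."""
    # newline="" lets the csv module handle line breaks inside quoted items.
    with open(path, "r", newline="") as file:
        return [tuple(row) for row in csv.reader(file)]
```

With this version an item such as "Smith, J" stays one item, and the surrounding quotes are removed for you.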
Your code is pretty good and you are close to the goal:
with open(path, 'r+') as file:
    results=[]
    text = file.read()
    #while text != '':
    for line in text.split('\n'):
        a=line.split(',')
        b=tuple(a)
        results.append(b)
    return results
Your code:
with open(path, 'r+') as file:
    results=[]
    text = file.readline()
    while text != '':
        for line in text.split('\n'):
            a=line.split(',')
            b=tuple(a)
            results.append(b)
        return results
So enjoy learning :)
One caveat: if the CSV ends with a newline, split('\n') produces an empty last line, which results in an ugly tuple at the end of the list like ('',) (which looks like a smiley).
To prevent this you have to check for empty lines: if line != '': right after the for will do the trick.
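The empty-line check could look like the sketch below. The helper name parse_rows is made up for illustration, and it takes the already-read text rather than a path so that it is easy to try out.

```python
def parse_rows(text):
    """Split CSV text into a list of tuples, skipping empty lines."""
    results = []
    for line in text.split('\n'):
        if line != '':  # skip the empty string left after a trailing newline
            results.append(tuple(line.split(',')))
    return results
```

Calling parse_rows(file.read()) inside the with block would then replace the loop in the code above.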

Find unique entries in files

I hope you have a solution for the following issue:
I want to compare two lists for common entries (on the basis of column 10) and write common entries to one file and unique entries for the first list into another file. The code I wrote is:
INFILE1 = open ("c:\\python\\test\\58962.filtered.csv", "r")
INFILE2 = open ("c:\\python\\test\\83887.filtered.csv", "r")
OUTFILE1 = open ("c:\\python\\test\\58962_vs_83887.common.csv", "w")
OUTFILE2 = open ("c:\\python\\test\\58962_vs_83887.unique.csv", "w")
for line in INFILE1:
    line = line.rstrip().split(",")
    if line[11] in INFILE2:
        OUTFILE1.write(line)
    else:
        OUTFILE2.write(line)
INFILE1.close()
INFILE2.close()
OUTFILE1.close()
OUTFILE2.close()
The following error appears:
8 OUTFILE1.write(line)
9 else:
---> 10 OUTFILE2.write(line)
11 INFILE1.close()
TypeError: write() argument must be str, not list
Does anybody know how to fix this?
Best
This line
line = line.rstrip().split(",")
replaces the line you read from the file with its split list. You then try to write that list to your file; that's not how the write method works, and the error tells you exactly that.
Change it to :
for line in INFILE1:
    lineList = line.rstrip().split(",") # don't overwrite line, use lineList
    if lineList[11] in INFILE2: # used lineList
        OUTFILE1.write(line) # corrected indentation
    else:
        OUTFILE2.write(line)
You could easily have found this error yourself by printing the line before and after splitting, or just before writing.
Please read How to debug small programs (#1) and follow it; it's easier to find and fix bugs yourself than to post questions here.
You have another problem at hand, though:
Files are stream based; reading starts at position 0 in the file, and the position advances as you access parts of the file. Once you are at the end, you won't get anything more from INFILE2.read() or other methods.
So if you want to repeatedly check whether some column of a line in file1 occurs somewhere in file2, you need to read file2 into a list (or another data structure) so that your repeated checks work. In other words, this:
if lineList[11] in INFILE2:
might work once; then the file is consumed and it will return False every time.
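A small demonstration of that behaviour, as a sketch: io.StringIO is used here as a stand-in for a real file so the example needs no file on disk, but a file object returned by open() is consumed in the same way.

```python
import io

# io.StringIO stands in for a file opened with open(); both are iterated line by line.
handle = io.StringIO("a,1\nb,2\n")

first_pass = list(handle)   # reads every line and leaves the position at the end
second_pass = list(handle)  # nothing is left to read, so this is empty
handle.seek(0)              # rewind to the start of the stream
third_pass = list(handle)   # the lines are available again

print(first_pass, second_pass, third_pass)
```

Rewinding with seek(0) before every check would work, but reading the second file once into a list or set is usually simpler.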
You also might want to change from:
f = open(...., ...)
# do something with f
f.close()
to
with open(name,"r") as f:
    # do something with f, no close needed, closed when leaving block
as it is safer, will close the file even if exceptions happen.
To solve that try this (untested) code:
with open("c:\\python\\test\\83887.filtered.csv", "r") as file2:
    infile2 = file2.readlines() # read in all lines as list
with open("c:\\python\\test\\58962.filtered.csv", "r") as INFILE1:
    # next 2 lines are 1 line, \ at end signifies line continues
    with open("c:\\python\\test\\58962_vs_83887.common.csv", "w") as OUTFILE1, \
         open("c:\\python\\test\\58962_vs_83887.unique.csv", "w") as OUTFILE2:
        for line in INFILE1:
            lineList = line.rstrip().split(",")
            # check the list of lines: does any of them contain lineList[11]?
            if any(lineList[11] in x for x in infile2):
                OUTFILE1.write(line)
            else:
                OUTFILE2.write(line)
# all files are autoclosed here
Links to read:
the-with-statement
any() and other built-ins
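One more thing to consider: the any(...) check above is a substring test against every whole line of the second file, so a value like 12 would also match a line containing 5123. If the comparison should be on the column itself, collecting that column of the second file into a set gives exact and fast lookups. This is a sketch; the function name split_common_unique is made up, and it assumes every line has at least column + 1 comma-separated fields.

```python
def split_common_unique(lines1, lines2, column):
    """Split lines1 into (common, unique) by whether their key column occurs in lines2."""
    # Build the set of key values from the second file once.
    keys2 = {line.rstrip("\n").split(",")[column] for line in lines2}
    common, unique = [], []
    for line in lines1:
        key = line.rstrip("\n").split(",")[column]
        (common if key in keys2 else unique).append(line)
    return common, unique
```

The two returned lists still contain the original lines as strings, so they can be passed straight to writelines() on the two output files.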

Dividing a .txt file in multiple parts in Python

I'm a beginner in Python, and I have a question about file reading:
I need to process info in a file to write it in another one. I know how to do that, but it's really resource-consuming for my computer, as the file is really big. However, I know how it's formatted!
The file follows that format :
4 13
9 3 4 7
3 3 3 3
3 5 2 1
I won't explain what it is for, as it would take ages and would not be very useful, but the file is essentially made of four lines like these, again and again. For now, I use this to read the file and convert it into a very long chain:
inputfile = open("input.txt", "r")
output = open("output.txt", "w")
Chain = inputfile.read()
Chain = Chain.split("\n")
Chained = ' '.join(Chain)
Chain = Chained.split(" ")
Chain = list(map(int, Chain))
Afterwards, I just treat it with "task IDs", but I feel like it's really not efficient.
So do you know how I could divide the chain into multiple ones knowing how they are formatted?
Thanks for reading !
How about:
res = []
with open('file', 'r') as f:
    for line in f:
        for num in line.split():
            res.append(int(num))
Instead of reading the whole file into memory, you go line by line.
Does this help?
If you need to go 4 lines at a time, just add an internal loop.
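Going four lines at a time could be sketched with itertools.islice, which pulls a fixed number of lines from the file iterator on each pass. The generator name read_blocks is made up for illustration, and the sketch assumes every line holds whitespace-separated integers.

```python
from itertools import islice

def read_blocks(lines, block_size=4):
    """Yield lists of integer rows, block_size lines at a time."""
    iterator = iter(lines)
    while True:
        # Take the next block_size lines; fewer are returned at the end of input.
        block = list(islice(iterator, block_size))
        if not block:
            break
        yield [[int(num) for num in line.split()] for line in block]
```

Since a file object is itself an iterable of lines, read_blocks(f) inside a with open(...) as f: block processes one four-line group at a time without loading the whole file.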
Regarding output, I'm assuming you want to do some computation on the input, so I wouldn't necessarily do this in the same loop. Either process the input once reading is done, or instead of using a list, use a queue and have another thread read from the queue while this thread is writing to it.
Perhaps a comprehension will tidy this up a bit as well (I doubt it will make a performance impact):
res = []
with open('file', 'r') as f:
    for line in f:
        # extend() adds each converted number; append() would add the generator object itself
        res.extend(int(num) for num in line.split())
Hmm, I believe there is a way to write to a file without reading all of it first:
Add text to end of line without loading file
https://docs.python.org/2.7/library/functions.html#print
from __future__ import print_function
# only needed if you are using Python 2.7
i = open("input","r")
f = open("output.txt","w")
for line in i:
    # iterate over the lines in the input file
    line = line.strip()
    # strip() returns a new string without the \n at the end, so keep the result
    print(line, end=" ", file=f)
    # this writes to the output file with a space at the end
i.close()
f.close()
this might help, i'm a newbie too, but with better google fu XD
Maybe do it line by line. This way it consumes less memory.
inputfile = open("input.txt", "r")
output = open("output.txt", "a")
while True:
    line = inputfile.readline()
    if not line:
        # readline() returns '' at the end of the file
        break
    numbers = line.split()
    integers = list(map(int, numbers))
Each line probably ends with a newline character \n. Calling split() with no argument discards it along with any other surrounding whitespace.
If you don't want to consume memory (you can run out of it if the file is very large), you need to read line by line.
with open('input.txt', 'r') as inputfile, open('output.txt', 'w') as output:
    for line in inputfile:
        chain = line.split()
        # do some calculations or whatever you need
        # and write those numbers to the new file
        numbers = list(map(int, chain))
        for number in numbers:
            output.write("%d " % number)

python: Open file, edit one line, save it as the same file

I want to open a file, search for a specific word, change the word and save the file again. Sounds really easy - but I just can't get it working... I know that I have to overwrite the whole file but only change this one word!
My Code:
f = open('./myfile', 'r')
linelist = f.readlines()
f.close
for line in linelist:
    i =0;
    if 'word' in line:
        for number in arange(0,1,0.1)):
            myNumber = 2 - number
            myNumberasString = str(myNumber)
            myChangedLine = line.replace('word', myNumberasString)
            f2 = open('./myfile', 'w')
            f2.write(line)
            f2.close
            #here I have to do some stuff with these files so there is a reason
            #why everything is in this for loop. And I know that it will
            #overwrite the file every loop and that is good so. I want that :)
If I make it like this, the 'new' myfile file contains only the changed line. But I want the whole file with the changed line... Can anyone help me?
****EDIT*****
I fixed it! I just turned the loops around and now it works perfectly like this:
f=open('myfile','r')
text = f.readlines()
f.close()
i =0;
for number in arange(0,1,0.1):
    fw=open('mynewfile', 'w')
    myNumber = 2 - number
    myNumberasString = str(myNumber)
    for line in text:
        if 'word' in line:
            line = line.replace('word', myNumberasString)
        fw.write(line)
    fw.close()
    #do my stuff here where I need all these input files
You just need to write out all the other lines as you go. As I said in my comment, I don't know what you are really trying to do with your replace, but here's a slightly simplified version in which we're just replacing all occurrences of 'word' with 'new':
f = open('./myfile', 'r')
linelist = f.readlines()
f.close()  # note the parentheses: f.close alone does not call the method
# Re-open file here
f2 = open('./myfile', 'w')
for line in linelist:
    line = line.replace('word', 'new')
    f2.write(line)
f2.close()
Or using contexts:
with open('./myfile', 'r') as f:
    lines = f.readlines()
with open('./myfile', 'w') as f:
    for line in lines:
        line = line.replace('word', 'new')
        f.write(line)
Use fileinput passing in whatever you want to replace:
import fileinput
for line in fileinput.input("in.txt",inplace=True):
    print(line.replace("whatever","foo"),end="")
You don't seem to be doing anything special in your loop that cannot be calculated first outside the loop, so create the string you want to replace the word with and pass it to replace.
inplace=True will mean the original file is changed. If you want to verify everything looks ok then remove the inplace=True for the first run and you will actually see the replaced output instead of the lines being written to the file.
If you want to write to a temporary file, you can use a NamedTemporaryFile with shutil.move:
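from tempfile import NamedTemporaryFile
from shutil import move
# mode "w" so the temporary file accepts str rather than bytes
with open("in.txt") as f, NamedTemporaryFile("w", dir=".", delete=False) as out:
    for line in f:
        out.write(line.replace("whatever","foo"))
# replace the original file with the rewritten temporary file
move(out.name, "in.txt")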
One problem you may encounter is that replace also matches substrings. If you know the word always appears in the middle of a sentence surrounded by whitespace, you could include that whitespace in the search string; if not, you will need to split the line and check every word.
from tempfile import NamedTemporaryFile
from shutil import move
from string import punctuation
with open("in.txt") as f, NamedTemporaryFile("w", dir=".", delete=False) as out:
    for line in f:
        # split() drops the newline, so add it back after joining
        out.write(" ".join(word if word.strip(punctuation) != "whatever" else "foo"
                           for word in line.split()) + "\n")
move(out.name, "in.txt")
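Another way to avoid matching substrings is a regular expression with word boundaries, which also leaves the surrounding punctuation and whitespace untouched. This is a sketch with a made-up helper name, replace_word; it assumes "word" means a run of letters, digits, and underscores, which is what \b is based on.

```python
import re

def replace_word(line, old, new):
    """Replace old with new only where old appears as a whole word."""
    # re.escape guards against regex metacharacters in old; \b marks word boundaries.
    pattern = r"\b" + re.escape(old) + r"\b"
    # A function is used as the replacement so backslashes in new are taken literally.
    return re.sub(pattern, lambda match: new, line)

print(replace_word("whatever, whatevers and whatever.\n", "whatever", "foo"), end="")
# prints: foo, whatevers and foo.
```

The function can be dropped into either loop above in place of line.replace(...).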
There are three issues with your current code. First, create the f2 file handle before starting the loop, otherwise you'll overwrite the file in each iteration. Second, you are writing an unmodified line in f2.write(line). I guess you meant f2.write(myChangedLine)? Third, you should add an else statement that writes unmodified lines to the file. So:
from numpy import arange  # assumption: arange in the question is NumPy's
f = open('./myfile', 'r')
linelist = f.readlines()
f.close()
f2 = open('./myfile', 'w')
for line in linelist:
    i =0;
    if 'word' in line:
        for number in arange(0,1,0.1):
            myNumber = 2 - number
            myNumberasString = str(myNumber)
            myChangedLine = line.replace('word', myNumberasString)
            f2.write(myChangedLine)
    else:
        f2.write(line)
f2.close()

Copy the last three lines of a text file in python?

I'm new to python and the way it handles variables and arrays of variables in lists is quite alien to me. I would normally read a text file into a vector and then copy the last three into a new array/vector by determining the size of the vector and then looping with a for loop a copy function for the last size-three into a new array.
I don't understand how for loops work in python so I can't do that.
so far I have:
#read text file into line list
numberOfLinesInChat = 3
text_file = open("Output.txt", "r")
lines = text_file.readlines()
text_file.close()
writeLines = []
if len(lines) > numberOfLinesInChat:
    i = 0
    while ((numberOfLinesInChat-i) >= 0):
        writeLine[i] = lines[(len(lines)-(numberOfLinesInChat-i))]
        i+= 1
#write what people say to text file
text_file = open("Output.txt", "w")
text_file.write(writeLines)
text_file.close()
To get the last three lines of a file efficiently, use deque:
from collections import deque
with open('somefile') as fin:
    last3 = deque(fin, 3)
This saves reading the whole file into memory to slice off what you didn't actually want.
To reflect your comment - your complete code would be:
from collections import deque
with open('somefile') as fin, open('outputfile', 'w') as fout:
    fout.writelines(deque(fin, 3))
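To see why this works: a deque created with a maximum length silently drops items from the front as new ones arrive, so after consuming the whole iterable only the last few remain. A tiny sketch on a plain list, with a made-up helper name tail:

```python
from collections import deque

def tail(lines, count=3):
    """Return the last count items of any iterable as a list."""
    # Only count items are ever held at once; older ones are discarded as newer ones arrive.
    return list(deque(lines, maxlen=count))

print(tail(["a\n", "b\n", "c\n", "d\n"]))
# prints: ['b\n', 'c\n', 'd\n']
```

If the input has fewer than count items, all of them are returned, so no length check is needed.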
As long as you're ok to hold all of the file lines in memory, you can slice the list of lines to get the last x items. See http://docs.python.org/2/tutorial/introduction.html and search for 'slice notation'.
def get_chat_lines(file_path, num_chat_lines):
    with open(file_path) as src:
        lines = src.readlines()
    return lines[-num_chat_lines:]
>>> lines = get_chat_lines('Output.txt', 3)
>>> print(lines)
... ['line n-3\n', 'line n-2\n', 'line n-1']
First, to answer your question: my guess is that you got an IndexError. You should replace the assignment to writeLine[i] with writeLines.append( ). After that, you should also use a loop to write the output:
text_file = open("Output.txt", "w")
for row in writeLines:
    text_file.write(row)
text_file.close()
May I suggest a more Pythonic way to write this? It would be as follows:
with open("Input.txt") as f_in, open("Output.txt", "w") as f_out:
    for row in f_in.readlines()[-3:]:
        f_out.write(row)
A possible solution:
lines = [ l for l in open("Output.txt")]
file = open('Output.txt', 'w')
# lines[-3:] is the last three lines; writelines() accepts a list of strings
file.writelines(lines[-3:])
file.close()
This might be a little clearer if you do not know Python syntax.
lst_lines = text.splitlines()
Assuming text is a string holding the whole contents of the file, this will create a list containing all the lines in the text file.
Then for the last line you can do:
last = lst_lines[-1]
secondLast = lst_lines[-2]
and so on; list and string indexes can be counted from the end with the '-'.
Or you can loop through them and pick specific ones using range, where
start = start line, stop = where to end (not included), step = what to increment by.
for i in range(start, stop, step):
    string = lst_lines[i]
Then just write them to a file.
