Python: Deleting specific strings from file

I am reposting after changing a few things in my earlier post. Thanks to all who gave suggestions earlier; I still have problems with it.
I have a data file (an unstructured, messy file) from which I have to scrub a specific list of strings (delete the strings).
Here is what I am doing, but with no result:
infile = r"messy_data_file.txt"
outfile = r"cleaned_file.txt"
delete_list = ["firstname1 lastname1","firstname2 lastname2"....,"firstnamen lastnamen"]
fin = open(infile,"")
fout = open(outfile,"w+")
for line in fin:
    for word in delete_list:
        line = line.replace(word, "")
    fout.write(line)
fin.close()
fout.close()
When I execute the file, I get the following error:
NameError: name 'word' is not defined

I'm unable to replicate your error. The error I get with your code comes from the empty mode string: either pass "r" or omit the argument, since read is the default.
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    fin = open(infile, "")
ValueError: empty mode string
Otherwise, seems fine!
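For reference, here is a minimal sketch of the same loop with the mode string fixed and the files managed by a with block. The function name scrub_file and the sample file contents are made up for illustration; only the replace loop comes from the question.

```python
import os
import tempfile

def scrub_file(infile, outfile, delete_list):
    """Copy infile to outfile, removing every string in delete_list."""
    # "r" is the default mode; an empty mode string raises ValueError.
    with open(infile, "r") as fin, open(outfile, "w") as fout:
        for line in fin:
            for word in delete_list:
                line = line.replace(word, "")
            fout.write(line)

# Small demonstration with throwaway files and made-up names.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "messy_data_file.txt")
dst = os.path.join(workdir, "cleaned_file.txt")
with open(src, "w") as f:
    f.write("hello firstname1 lastname1 world\nkeep this line\n")

scrub_file(src, dst, ["firstname1 lastname1", "firstname2 lastname2"])
with open(dst) as f:
    cleaned = f.read()
print(cleaned)
```

Note that replace removes only the matched text, so the spaces on either side of a deleted name are left behind.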

Related

I get an error whenever I try to read a file in Python, how can I fix this?

My code:
String = open(r"C:\Users\chloe\OneDrive\Documents\Python\Python code\Python text files\Story\VerbJust.txt", "r").read()
print(String)
I have the file stored in the exact folder, but I got an error:
Traceback (most recent call last):
  File "C:\Users\chloe\OneDrive\Documents\Python\Python code\StoryClasses.py", line 47, in <module>
    VerbTo = ReadFile("VerbTo")
  File "C:\Users\chloe\OneDrive\Documents\Python\Python code\StoryClasses.py", line 41, in ReadFile
    string = open(w[variable][0], "r").read()
FileNotFoundError: [Errno 2] No such file or directory: 'C'
Why is this? Can Python not access OneDrive?
In this line:
string = open(w[variable][0], "r").read()
it appears that w[variable] contains the filename. Adding [0] takes just the first character of the filename, which is why the error reports the path 'C'. Remove the index:
string = open(w[variable], "r").read()
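The effect of the extra index can be seen without touching the file system. In this small sketch the dictionary w, the key, and the shortened path are hypothetical stand-ins for the asker's data, which the question does not show.

```python
# Hypothetical stand-ins for the asker's `w` and `variable`; the path is made up.
w = {"VerbTo": r"C:\Users\chloe\VerbTo.txt"}
variable = "VerbTo"

full_path = w[variable]        # the whole filename string
first_char = w[variable][0]    # indexing a string gives one character

print(full_path)
print(first_char)
```

Passing first_char to open is what produces the "No such file or directory: 'C'" message.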
This error occurs because the quotation marks are formatted incorrectly.
Also, I suspect the variable name you chose, "String", may cause some issues.
Try:
string = open(r"filepath", "r").read()
print(string)

Python reading from CSV file in python 2.7 but not python 3.6. What do i have to do to make it work in 3.6

I have some code which I wrote in Python 2.7; however, I need it to work in 3.6, and when I run it I get this error and I am not sure why.
import csv
def ReadFromFile():
    with open('File.csv', 'r') as File:
        cr = csv.reader(File)
        for row in cr:
            Name = row[0]
            Gender = row[1]
            print(Name + Gender)
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    ReadFromFile()
  File "F:/Test.py", line 6, in ReadFromFile
    Name = row[0]
IndexError: list index out of range
I am using the same code, saved on a memory stick together with the file. In 2.7 I get my desired outcome of the file being read, but in 3.6 I am stuck with the error. Thanks for any help.
Edit: added print
After adding print I got
ELIZABETHFemale
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    ReadFromFile()
  File "F:/Test.py", line 6, in ReadFromFile
    Name = row[0]
IndexError: list index out of range
So it gave me the first line but nothing more
Python's CSV module has changed how it wants the files you pass to it to be opened. You want to avoid the file object doing any newline transformation because some CSV formats allow embedded newlines within quoted fields. The csv module will do its own newline normalization, so the usual universal newline handling the file object does is redundant.
This is mentioned in the csv.reader documentation, where it is talking about the file argument:
If csvfile is a file object, it should be opened with newline=''.
So for your code, try changing open('File.csv', 'r') to open('File.csv', 'r', newline='').
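A minimal sketch of that change follows. The function name read_name_gender and the sample rows are made up, and the guard against short rows is an extra assumption on my part (a blank line in the file comes back from csv.reader as an empty list, which would also raise IndexError), not something the documentation quote requires.

```python
import csv
import os
import tempfile

def read_name_gender(path):
    """Return (name, gender) pairs from a two-column CSV file."""
    pairs = []
    # newline='' lets the csv module do its own newline handling.
    with open(path, "r", newline="") as csv_file:
        for row in csv.reader(csv_file):
            # Assumption: blank lines show up as empty rows; skip them
            # instead of indexing into an empty list.
            if len(row) < 2:
                continue
            pairs.append((row[0], row[1]))
    return pairs

# Demonstration file with made-up rows and a blank line in the middle.
demo_path = os.path.join(tempfile.mkdtemp(), "File.csv")
with open(demo_path, "w", newline="") as f:
    f.write("ELIZABETH,Female\r\n\r\nJOHN,Male\r\n")

pairs = read_name_gender(demo_path)
print(pairs)
```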
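Have you tried pandas?
I think you may want to use something like
import pandas as pd
def ReadFromFile():
    # header=None assumes File.csv has no header row, so the first line is data
    df = pd.read_csv('File.csv', header=None)
    # itertuples yields one tuple per row; iterating df itself yields column labels
    for row in df.itertuples(index=False):
        Name = row[0]
        Gender = row[1]
        print(Name + Gender)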

Creating text files, appending them to zip, then delete them

I am trying to get the code below to read the file raw.txt, split it by lines and save every individual line as a .txt file. I then want to append every text file to splits.zip, and delete them after appending so that the only thing remaining when the process is done is the splits.zip, which can then be moved elsewhere to be unzipped. With the current code, I get the following error:
Traceback (most recent call last):
  File "/Users/Simon/PycharmProjects/text-tools/file-splitter-txt.py", line 13, in <module>
    z.write(new_file)
  File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 1123, in write
    st = os.stat(filename)
TypeError: coercing to Unicode: need string or buffer, file found
My code:
import zipfile
import os
z = zipfile.ZipFile("splits.zip", "w")
count = 0
with open('raw.txt','r') as infile:
    for line in infile:
        print line
        count +=1
        with open(str(count) + '.txt','w') as new_file:
            new_file.write(str(line))
        z.write(new_file)
        os.remove(new_file)
You could simply use writestr to write a string directly into the ZipFile, which avoids creating the temporary files at all. For example:
z.writestr(str(count) + '.txt', str(line), compress_type=...)
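Put together, the writestr approach might look like the sketch below. It is written for Python 3, the function name split_lines_into_zip and the sample lines are made up, and it leaves compression at the ZipFile default rather than guessing the elided compress_type value.

```python
import os
import tempfile
import zipfile

def split_lines_into_zip(raw_path, zip_path):
    """Store each line of raw_path as N.txt inside zip_path; return the count."""
    count = 0
    with open(raw_path, "r") as infile, zipfile.ZipFile(zip_path, "w") as z:
        for line in infile:
            count += 1
            # writestr takes the archive member name and its contents,
            # so no temporary file is created or needs deleting.
            z.writestr(str(count) + ".txt", line)
    return count

# Demonstration with a throwaway raw.txt holding two made-up lines.
workdir = tempfile.mkdtemp()
raw_path = os.path.join(workdir, "raw.txt")
zip_path = os.path.join(workdir, "splits.zip")
with open(raw_path, "w") as f:
    f.write("first line\nsecond line\n")

written = split_lines_into_zip(raw_path, zip_path)
with zipfile.ZipFile(zip_path) as z:
    names = z.namelist()
print(written, names)
```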
Pass the file name rather than the file object, as below. ZipFile.write expects a filename and os.remove expects a path, but you passed the open file object (new_file) to both.
z.write(str(count) + '.txt')
os.remove(str(count) + '.txt')

How to skip reading first line from file when using fileinput method

I am doing this to read the file:
import fileinput
for line in fileinput.input('/home/manish/java.txt'):
    if not fileinput.isfirstline():
        data = proces_line(line);
        output(data)
It throws an error saying that proces_line is not defined:
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
NameError: name 'proces_line' is not defined
I have to read the data line by line and store it in a list, with each line as a separate element of the list.
You can skip the first line as follows:
import fileinput
def output(line):
    print(line)
fi = fileinput.input('/home/manish/java.txt')
next(fi) # skip first line
for line in fi:
    output(line)
This avoids you having to test for a first line each time in the for loop.
To store each of the lines into a list, you could do the following:
import fileinput
fi = fileinput.input('/home/manish/java.txt')
next(fi) # skip first line
output = list(fi)
fi.close()
print(output)
You can try with this:
fname = '/home/manish/java.txt'
with open(fname) as f:
    content = f.readlines()
content is of type list. You can ignore content[0] and loop through with the rest to fetch the required data.
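As a rough illustration of ignoring content[0], the sketch below slices the list from index 1. The function name read_lines_skip_first and the sample file are made up, and stripping the trailing newline is an optional extra that the answer above does not ask for.

```python
import os
import tempfile

def read_lines_skip_first(fname):
    """Return every line of fname except the first, without trailing newlines."""
    with open(fname) as f:
        content = f.readlines()
    # content[0] is the first line; slicing from index 1 drops it.
    return [line.rstrip("\n") for line in content[1:]]

# Demonstration with a throwaway file; the contents are made up.
demo_path = os.path.join(tempfile.mkdtemp(), "java.txt")
with open(demo_path, "w") as f:
    f.write("header\nalpha\nbeta\n")

lines = read_lines_skip_first(demo_path)
print(lines)
```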
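You are looking for the readline() function. It reads the next line from the file and keeps the trailing newline (see the Python documentation for file input).
Usage:
with open('/home/manish/java.txt') as openFile:
    openFile.readline()  # read and discard the first line
    lines = list(openFile)  # remaining lines, one list element each
In addition, you are calling a function that does not exist: proces_line is never defined, and the name looks misspelled.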

How can i fix this File "<string>" error in python

I am trying to run a simple script, and it throws the error below at the for loop:
WASX7017E: Exception received while running file "/abc/websphere/wasad/createusers.py";
exception information: com.ibm.bsf.BSFException: exception from Jython:
Traceback (innermost last):
  File "<string>", line 22, in ?
AttributeError: __getitem__
filename=sys.argv[0]
file_read = open(filename)  # this is line 22
for row in file_read:
Please let me know the reason for this.
Here you can find my code,
import sys
filename="/usr/websphere/onefolder/Userlist.txt"
fileread = open(filename, 'r')
for row in fileread:
    column=row.strip().split(';')
    user_name=column[0]
    pass_word=column[1]
    AdminTask.createUser(['-uid',user_name, '-password', pass_word, '-confirmPassword', pass_word])
    AdminTask.mapUsersToAdminRole(['-roleName','Administrator','-userids',user_name])
    AdminTask.addMemberToGroup('[-memberUniqueName user_name,o=defaultWIMFileBasedRealm -groupUniqueName cn=webarch,o=defaultWIMFileBasedRealm]')
fileread.close()
AdminConfig.save()
print 'Saving Configuration is completed'
It looks like you want to iterate over each line in the file. The open function returns a file object, and the AttributeError: __getitem__ suggests that the Jython version running this script cannot iterate over a file object directly. Instead, call readlines to retrieve the lines of the file as a list, and then loop over that.
This should work:
import sys
filename="/usr/websphere/onefolder/Userlist.txt"
fileread = open(filename, 'r')
filelines = fileread.readlines()
for row in filelines:
    column=row.strip().split(';')
    user_name=column[0]
    pass_word=column[1]
    AdminTask.createUser(['-uid',user_name, '-password', pass_word, '-confirmPassword', pass_word])
    AdminTask.mapUsersToAdminRole(['-roleName','Administrator','-userids',user_name])
    AdminTask.addMemberToGroup('[-memberUniqueName user_name,o=defaultWIMFileBasedRealm -groupUniqueName cn=webarch,o=defaultWIMFileBasedRealm]')
fileread.close()
AdminConfig.save()
print 'Saving Configuration is completed'
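The parsing step can be checked on its own, outside wsadmin. This sketch is plain Python with no AdminTask calls; the helper name parse_users, the sample lines, and the check for blank or malformed lines are my own additions, not part of the script above.

```python
def parse_users(lines):
    """Turn 'user;password' lines into (user, password) tuples."""
    users = []
    for row in lines:
        column = row.strip().split(";")
        # Skip blank or malformed lines rather than failing on column[1].
        if len(column) < 2:
            continue
        users.append((column[0], column[1]))
    return users

# Made-up sample standing in for the result of fileread.readlines().
sample_lines = ["alice;secret1\n", "\n", "bob;secret2\n"]
users = parse_users(sample_lines)
print(users)
```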
