Additional list added after each row in Python

There is a discrepancy between running the code in repl.it (where it works fine, presumably because the relevant bugs have been fixed in its version of Python) and in IDLE, where the code does not work correctly.
I have consulted the documentation and previous Stack Overflow answers and added the "newline" argument, but the problem persists.
Here is the repl.it version, which works perfectly:
https://repl.it/Jbv6/0
In IDLE, if I paste the file contents without line breaks, it works fine:
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1002,Ash,Smith,Test1:20,Test2:20,Test3:100003003,Jonathan,Peter,Test1:99,Test2:33,Test3:44
but if I paste the file contents into the txt file as they should be (with each record on a new line), like so:
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
002,Ash,Smith,Test1:20,Test2:20,Test3:100003
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44
the error on output is as follows (produces a new list after each line):
[['001', 'Joe', 'Bloggs', 'Test1:99', 'Test2:100', 'Test3:1'], [], ['002', 'Ash', 'Smith', 'Test1:20', 'Test2:20', 'Test3:100'], ['003'], ['', 'Jonathan', 'Peter', 'Test1:99', 'Test2:33', 'Test3:44']]
The code is here:
import csv
#==========1. Open the File, Read it into a list, and Print Contents
print("1==============Open File, Read into List, Print Contents")
#open the file, read it into a list (each line is a list within a list, and the end of line spaces are stripped as well as the individual elements split at the comma)
with open("studentinfo.txt","rb",newline="") as f:
    studentlist=list(csv.reader(f))
    print(studentlist)
I have tried, as the documentation and previous answers on Stack Overflow suggest, adding the newline argument:
with open("studentinfo.txt","r",newline="") as f:
Unfortunately the error persists.
Any suggestions/solutions with an explanation would be appreciated.
Update, I also tried this:
with open("studentinfo.txt",newline="") as f:
    reader=csv.reader(f)
    for row in reader:
        print(row)
again, it works perfectly in replit
https://repl.it/Jbv6/2
but this error in IDLE
1==============Open File, Read into List, Print Contents
['001', 'Joe', 'Bloggs', 'Test1:99', 'Test2:100', 'Test3:1']
[]
['002', 'Ash', 'Smith', 'Test1:20', 'Test2:20', 'Test3:100']
['003']
['', 'Jonathan', 'Peter', 'Test1:99', 'Test2:33', 'Test3:44']
>>>
This is a huge issue for students who need to be able to have consistency across both repl.it and IDLE which is what they are working on between their school and home environments.
Any answer that shows code that allows it to work on both is what I'm after.

The answer that is easiest is the following:
import csv
# ==========1. Open the File, Read it into a list, and Print Contents
print("1==============Open File, Read into List, Print Contents")
# open the file, read it into a list (each line is a list within a list,
# and the end of line spaces are stripped as well as the individual
# elements split at the comma)
studentlist = []
with open("studentinfo.txt", "r", newline="") as f:
    for row in csv.reader(f):
        if len(row) > 0:
            studentlist.append(row)
print(studentlist)
But your original code should work. I've run it, though on Linux rather than Windows. Could I ask you to run this diagnostic as well:
with open("studentinfo.txt", "r", newline="") as f:
    ascii_ch = list(map(ord,f.read()))
    eol_delims = list(map(str,(ch if ch < 32 else '' for ch in ascii_ch)))
    print(",".join(eol_delims))
This will print a run of commas interspersed with either 13,10 or 10, or possibly even something like 10,13,10. These are the \r\n and \n line endings that were talked about; I'm wondering whether you've somehow managed to get that third option.
If so, I think you'll need to rewrite that text file to get normal line endings.
-- (update in response to comment)
The only advice I have regarding the 10,13,10 is to edit the text file in only one application (say, Notepad), and never edit it in another.
The actual problem comes from editing the file in two applications that each have a different idea of what the line endings should be (Windows applications expect \r\n, while repl.it uses \n). I've come across it before, but never worked out the exact sequence of actions that causes it.
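If the file has already ended up with mixed line endings, one possible repair is to collapse every run of CR/LF bytes into a single \n. This is only a sketch: the helper name normalise_newlines is made up for illustration, and it assumes the file has no intentionally blank lines and no quoted fields that span several lines.

```python
import re


def normalise_newlines(raw: bytes) -> bytes:
    """Collapse any run of CR and/or LF bytes into one LF (hypothetical helper)."""
    # Assumption: blank lines carry no meaning, so "\n\r\n" and "\r\n"
    # are both reduced to a single "\n".
    return re.sub(rb"[\r\n]+", b"\n", raw)


# One record boundary uses the odd 10,13,10 sequence, one uses 13,10, one uses 10.
messy = b"001,Joe\n\r\n002,Ash\r\n003,Jonathan\n"
print(normalise_newlines(messy))
```

To repair a real file you would read it with open(path, "rb"), pass the bytes through the helper, and write the result back with open(path, "wb").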

Try using codecs and explicitly specifying the file's encoding as UTF-8.
import csv
import codecs
print("1==============Open File, Read into List, Print Contents")
with codecs.open("studentinfo.txt",encoding='utf-8') as f:
    studentlist=list(csv.reader(f))
    print(studentlist)

Using a filter may help:
with open('studentinfo.txt', 'rU') as f:
    filtered = (line.replace('\r', '') for line in f)
    for row in csv.reader(filtered):
        print(row)

Pasting strings into a text editor and saving the file will not produce byte-identical files on different platforms. (Even different editors on the same platform are inconsistent!)
However, the CSV format accepted by the csv module is sensitive to the exact characters in the file. The behavior can be customized by using a dialect (either a built-in dialect or a new one you implement); see the Python documentation for details. The default dialect is excel, which writes Windows-style line endings (CR/LF). When reading, the csv documentation says the reader recognises either '\r' or '\n' as end-of-line and ignores the dialect's lineterminator, so a file with stray or doubled line-ending characters can come back with empty or split rows.
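A quick way to see how the reader treats different line endings is to feed it in-memory strings. This is a small experiment rather than a diagnosis of the asker's file; the sample records and the parse helper are invented for illustration.

```python
import csv
import io


def parse(text):
    """Parse CSV text held in memory, without newline translation."""
    # newline="" passes the raw line endings through to csv.reader,
    # which is what the csv documentation recommends for files.
    return list(csv.reader(io.StringIO(text, newline="")))


windows_style = parse("001,Joe\r\n002,Ash\r\n")
unix_style = parse("001,Joe\n002,Ash\n")
mixed = parse("001,Joe\n\r\n002,Ash\n")  # the 10,13,10 case

print(windows_style == unix_style)
print(mixed)
```

The first two inputs parse to the same rows, while the mixed one picks up an extra empty row between the records.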

Related

Replace string in specific line of nonstandard text file

This is similar to the posting "Replace string in a specific line using python"; however, results were not forthcoming in my slightly different case.
I am working with Python 3 on Windows 7. I am attempting to batch edit some files in a directory. They are basically text files with a .LIC extension. I'm not sure if that is relevant to my issue here. I am able to read the file into Python without issue.
My aim is to replace a specific string on a specific line in this file.
import os
import re
groupname = 'Oldtext'
aliasname = 'Newtext'
with open('filename') as f:
    data = f.readlines()
    data[1] = re.sub(groupname,aliasname, data[1])
    f.writelines(data[1])
    print(data[1])
print('done')
When running the above code I get an UnsupportedOperation: not writable. I am having some issue writing the changes back to the file. Based on suggestions in other posts, I added the w option to the call, as in open('filename', "w"). This causes all text in the file to be deleted.
Based on suggestion, the r+ option was tried. This leads to successful editing of the file, however, instead of editing the correct line, the edited line is appended to the end of the file, leaving the original intact.
Writing a changed line into the middle of a text file is not going to work unless it's exactly the same length as the original - which is the case in your example, but you've got some obvious placeholder text there so I have no idea if the same is true of your actual application code. Here's an approach that doesn't make any such assumption:
with open('filename', 'r') as f:
    data = f.readlines()
data[1] = re.sub(groupname,aliasname, data[1])
with open('filename', 'w') as f:
    f.writelines(data)
EDIT: If you really wanted to write only the single line back into the file, you'd need to use f.tell() BEFORE reading the line, to remember its position within the file, and then f.seek() to go back to that position before writing.
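As a rough sketch of that tell()/seek() idea, the helper below overwrites one line in place. The function name, the demo file contents, and the choice of binary mode (so that byte offsets are exact) are assumptions made for illustration, and it only works when the old and new strings have the same length.

```python
import os
import tempfile


def replace_in_line(path, line_index, old, new):
    """Overwrite `old` with `new` on one line, in place (hypothetical helper).

    `old` and `new` are bytes and must have the same length.
    """
    if len(old) != len(new):
        raise ValueError("in-place replacement needs equal-length strings")
    with open(path, "r+b") as f:
        for _ in range(line_index):  # skip the lines before the target
            f.readline()
        start = f.tell()             # remember where the target line begins
        line = f.readline()
        f.seek(start)                # go back to that position
        f.write(line.replace(old, new))


# Demo on a throwaway file with invented contents.
fd, demo_path = tempfile.mkstemp(suffix=".LIC")
with os.fdopen(fd, "wb") as f:
    f.write(b"header\nname=Oldtext\nfooter\n")
replace_in_line(demo_path, 1, b"Oldtext", b"Newtext")
with open(demo_path, "rb") as f:
    result = f.read()
os.remove(demo_path)
print(result)
```

For replacements that change the line length, rewriting the whole file as shown in the answer is the safer route.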

Writing to CSV and saving the file

I have two programs in Python. One writes a customer's information to a CSV. The other accesses it. When the first has written it, I can open the CSV file (in Excel) and see that it has been written correctly. However for the other program to access the new data in the CSV file I have to manually open it and save it (in Excel) otherwise it doesn't work. Does anyone know why this may be?
Edit:
This writes to it (from first program):
f = open('details.csv', 'at', newline=''); csv_w = csv.writer(f)
csv_w.writerow(clientList)
f.close()
And this reads it (second program):
f = open('details.csv', 'rt', newline=''); csv_f = csv.reader(f)
for row in csv_f:
    name.append(row[0])
I get this error when trying to append row[0] to a list.
Traceback (most recent call last):
  File "C:\Users\Dan\Desktop\Garden Centre\work.py", line 8, in <module>
    name.append(row[0])
IndexError: list index out of range
I have seen a number of such problems stemming from different platform line endings. Under Python 2, you might try the "universal line endings" file reading mode:
with open('data.csv', 'rU') as f:
    for row in csv.reader(f):
        print row
This is because Excel often uses the old Mac (\r) and Windows (\r\n) conventions, which can get in the csv module's way, especially on a Unix or Mac platform where Python expects the Unix line ending (\n). Python 3 is generally smarter about this (and about other file and string encoding issues), so it generally doesn't need a special mode.
I found an answer after hours of trying. In Excel, each item in an 'empty' row contains '' for the largest number of items in any row. Python doesn't write it like that to the CSV and instead only one item on an empty row contains None. As the loop iterated through each row, there was no first item to add to the list on the empty rows.
I had to manually add an extra '' to the list that is written by the first program.
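Another option, on the reading side, is to skip rows that come back empty instead of changing what the first program writes. The snippet below is a minimal sketch under that assumption; the sample data stands in for details.csv and the names in it are invented.

```python
import csv
import io

# Stand-in for details.csv: the second line is an empty row.
sample = io.StringIO("Dan,Smith\r\n\r\nAmy,Jones\r\n", newline="")

name = []
for row in csv.reader(sample):
    if row:  # an empty row is [], so row[0] would raise IndexError
        name.append(row[0])

print(name)
```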

Webbrowser() reading through a text file for URLS

I am trying to write a script to automate browsing to my most commonly visited websites. I have put the websites into a file and am trying to open them using the webbrowser module in Python. My code looks like the following at the moment:
import webbrowser
f = open("URLs", "r")
list = f.readline()
for line in list:
    webbrowser.open_new_tab(list)
This only reads the first line from my file "URLs" and opens it in the browser. Could any one please help me understand how I can achieve reading through the entire file and also opening the URLs in different tabs?
Also other options that can help me achieve the same.
You have two main problems.
The first problem you have is that you are using readline and not readlines. readline will give you the first line in the file, while readlines gives you a list of your file contents.
Take this file as an example:
# urls.txt
http://www.google.com
http://www.imdb.com
Also, get in to the habit of using a context manager, as this will close the file for you once you have finished reading from it. Right now, even though for what you are doing, there is no real danger, you are leaving your file open.
Here is the information from the documentation on files. There is a mention about best practices with handling files and using with.
The second problem in your code is that, when you are iterating over list (which you should not use as a variable name, since it shadows the builtin list), you are passing list in to your webbrowser call. This is definitely not what you are trying to do. You want to pass the loop variable.
So, taking all this in to mind, your final solution will be:
import webbrowser
with open("urls.txt") as f:
    for url in f:
        webbrowser.open_new_tab(url.strip())
Note the strip that is called in order to ensure that newline characters are removed.
You're not reading the file properly. You're only reading the first line. Also, assuming you were reading the file properly, you're still trying to open list, which is incorrect. You should be trying to open line.
This should work for you:
import webbrowser
with open('file name goes here') as f:
    all_urls = f.read().split('\n')
for each_url in all_urls:
    webbrowser.open_new_tab(each_url)
My answer is assuming that you have the URLs 1 per line in the text file. If they are separated by spaces, simply change the line to all_urls = f.read().split(' '). If they're separated in another way just change the line to split accordingly.
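Both answers can be combined into a small sketch that also skips blank lines, such as the empty string left behind by a trailing newline when splitting on '\n'. The helper names clean_urls and open_all are hypothetical, and the file is assumed to hold one URL per line.

```python
import webbrowser


def clean_urls(lines):
    """Strip surrounding whitespace and drop blank lines (hypothetical helper)."""
    return [line.strip() for line in lines if line.strip()]


def open_all(path):
    """Open every URL listed in `path`, one per line, in a new browser tab."""
    with open(path) as f:
        for url in clean_urls(f):
            webbrowser.open_new_tab(url)
```

Calling open_all("urls.txt") would then open each listed site in its own tab.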

Python error in processing lines from a file

I wrote a Python script on Windows 8.1 using the Sublime Text editor, and I just tried to run it from the terminal in OS X Yosemite, but I get an error.
My error occurs when parsing the first line of a .CSV file. This is the slice of the code
lines is an array where each element is the line in the file it is read from as a string
we split the string by the desired delimiter
we skip the first line because that is the header information (else condition)
For the last index in the for loop i = numlines -1 = the number of lines in the file - 2
We only add one to the value of i because the last line is blank in the file
for i in range(numlines):
    if i == numlines-1:
        dataF = lines[i+1].split(',')
    else:
        dataF = lines[i+1].split(',')
    dataF1 = list(dataF[3])
    del(dataF1[len(dataF1)-1])
    del(dataF1[len(dataF1)-1])
    del(dataF1[0])
    f[i] = ''.join(dataF1)
return f
All the lines in the csv file looks like this (with the exception of the header line):
"08/06/2015","19:00:00","1","410"
So it saves the single line into an array where each element corresponds to one of the 4 values separated by commas in a line of the CSV file. Then we take the element at index 3 in the array, "410", and create a list that should look like
['"','4','1','0','"','\n']
(and it does when run from windows)
but it instead looks like
['"','4','1','0','"','\r','\n']
and so when I concatenate this string based on the above code I get 410" instead of 410.
My question is: where did the '\r' come from? It does not appear when the original files are processed on a Windows machine. At first I thought it was the text format, so I saved the CSV file as UTF-8; that didn't work. I tried changing the tab size from 4 to 8 spaces; that didn't work either. Running out of ideas now. Any help would be greatly appreciated.
Thanks
The "\r" comes from the line separator. "\r\n" (Windows) is a line separator, as are "\n" (Unix) and a bare "\r" (old Mac). Different platforms have different line separators.
A simple fix: if you read a line from a file yourself, then line.rstrip() will remove the whitespace from the line end.
A proper fix: use Python's standard CSV reader. It copes with either line ending, properly handles quoted strings, etc. Blank lines come back as empty lists, which are easy to skip.
Also, when working with long lists, it helps to stop thinking about them as index-addressed 'arrays' and use the 'stream' or 'sequential reading' metaphor.
So the typical way of handling a CSV file is something like:
import csv
with open('myfile.csv') as f:
    reader = csv.reader(f)
    # We assume that the file has 3 columns; adjust to taste
    for (first_field, second_field, third_field) in reader:
        # do something with field values of the current lines here
        pass
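Applied to the four-column file from the question, a sketch along these lines pulls out the last field without stripping quotes or line endings by hand. The helper name fourth_column and the header names are assumptions made for illustration; the sample rows follow the format shown in the question.

```python
import csv


def fourth_column(lines):
    """Return the 4th field of every data row, skipping the header line."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header (assumed to be the first line)
    return [row[3] for row in reader if row]


# Lines as they might arrive with Windows line endings; header names are invented.
lines = [
    '"date","time","id","value"\r\n',
    '"08/06/2015","19:00:00","1","410"\r\n',
    '"08/06/2015","20:00:00","1","415"\r\n',
]
print(fourth_column(lines))
```

The reader removes the surrounding quotes and the line ending, so the result is the same whether lines end in "\r\n" or "\n".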

How to write a list to a file with newlines in Python3

I'm trying to write an array (list?) to a text file using Python 3. Currently I have:
def save_to_file(*text):
    with open('/path/to/filename.txt', mode='wt', encoding='utf-8') as myfile:
        for lines in text:
            print(lines, file = myfile)
        myfile.close
This writes what looks like the array straight to the text file, i.e.,
['element1', 'element2', 'element3']
username@machine:/path$
What I'm looking to do is create the file with
element1
element2
element3
username@machine:/path$
I've tried different ways to loop through and append a "\n" but it seems that the write is dumping the array in one operation. The question is similar to How to write list of strings to file, adding newlines? but the syntax looked like it was for Python 2? When I tried a modified version of it:
def save_to_file(*text):
    myfile = open('/path/to/filename.txt', mode='wt', encoding='utf-8')
    for lines in text:
        myfile.write(lines)
    myfile.close
...the Python shell gives "TypeError: must be str, not list" which I think is because of changes between Python2 and Python 3. What am I missing to get each element on a newline?
EDIT: Thank you to @agf and @arafangion; combining what both of you wrote, I came up with:
def save_to_file(text):
    with open('/path/to/filename.txt', mode='wt', encoding='utf-8') as myfile:
        myfile.write('\n'.join(text))
        myfile.write('\n')
It looks like part of the issue was "*text". I had read that it expands arguments, but it didn't click until you wrote that [element] was becoming [[element]], which is why I was getting the must-be-str-not-list TypeError. I kept thinking I needed to tell the definition that it was getting a list/array passed to it, and that just stating "text" would mean a string. It worked once I changed it to just text and used myfile.write with join, and the additional \n puts in the final newline at the end of the file.
myfile.close -- get rid of that where you use with. with automatically closes myfile, and you have to call close like close() anyway for it to do anything when you're not using with. You should just always use with on Python 3.
with open('/path/to/filename.txt', mode='wt', encoding='utf-8') as myfile:
    myfile.write('\n'.join(lines))
Don't use print to write to files -- use file.write. In this case, you want to write some lines with line breaks in between, so you can just join the lines with '\n'.join(lines) and write the string that is created directly to the file.
If the elements of lines aren't strings, try:
myfile.write('\n'.join(str(line) for line in lines))
to convert them first.
Your second version doesn't work for a different reason. If you pass
['element1', 'element2', 'element3']
to
def save_to_file(*text):
it will become
(['element1', 'element2', 'element3'],)
because the * collects all the positional arguments into a tuple, even if what you pass is already a list.
If you want to support passing multiple lists, and still write them one after another, do
def save_to_file(*text):
    with open('/path/to/filename.txt', mode='wt', encoding='utf-8') as myfile:
        for lines in text:
            myfile.write('\n'.join(str(line) for line in lines))
            myfile.write('\n')
or, for just one list, get rid of the * and do what I did above.
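Edit: @Arafangion is right, you should probably just use b instead of t for writing to your files. That way, you don't have to worry about the different ways different platforms handle newlines. (In Python 3, binary mode means dropping the encoding argument and writing bytes, e.g. text.encode('utf-8').)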
There are numerous mistakes there.
Your indentation is messed up.
The 'text' attribute in save_to_file refers to ALL the arguments, not just the specific argument.
You're using "text mode", which is also confusing. Use binary mode instead, so that you have a consistent and defined meaning for '\n'.
When you iterate over the 'lines in text', because of these mistakes, what you're really doing is iterating over your arguments in the function, because 'text' represents all your arguments. That is what '*' does. (At least, in this situation. Strictly speaking, it iterates over all the remaining arguments - please read the documentation).
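If switching to binary mode is inconvenient, a possible alternative in Python 3 is to stay in text mode and pass newline="\n" to open, which turns off the platform-specific translation of '\n' on write. The sketch below assumes that approach; the path parameter and the temporary demo file are additions for illustration and are not part of the original function.

```python
import os
import tempfile


def save_to_file(path, lines):
    """Write one element per line, always ending lines with a bare LF."""
    # newline="\n" means "\n" is written as-is on every platform.
    with open(path, mode="w", encoding="utf-8", newline="\n") as myfile:
        for line in lines:
            myfile.write(str(line) + "\n")


# Demo on a throwaway file, then read the raw bytes back.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
save_to_file(demo_path, ["element1", "element2", "element3"])
with open(demo_path, "rb") as f:
    written = f.read()
os.remove(demo_path)
print(written)
```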
