Python: writing to a new file, but with the wrong EOL conversion

My script creates and writes to a new file, but it currently produces classic Mac line endings instead of Windows ones. Each line ends with only 'CR' instead of 'CR LF', which won't work for what I'm trying to do.
Why is this happening, and how can I change it?
text_file1 = open('...')
text_file1.write(str(i) + ',' + harvestServer + ',' + finalString + harvestCommand + '\r')
text_file1.close()

Replace the \r with \n, having made sure you open the file in text mode. This will use the native convention for your platform (that is, os.linesep).
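Alternatively, open the file in binary mode and write \r\n explicitly. This will use the Windows convention no matter where you run your code. In Python 3, a file opened in binary mode accepts only bytes, so encode the string before writing it.
Finally, you can control the newline translation by giving the optional newline argument to open(), for example newline='\r\n'.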
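A minimal sketch of the newline argument, assuming Python 3. The file name and the row contents are made up for illustration. With newline='\r\n', every '\n' the script writes is stored as CR LF regardless of the platform.

```python
import os
import tempfile

# Hypothetical output path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "harvest.csv")

# newline="\r\n" translates each "\n" written into CR LF on any platform.
with open(path, "w", newline="\r\n") as text_file:
    for i in range(3):
        text_file.write(str(i) + ",server,command\n")

# Read the raw bytes back to check the line endings that reached the disk.
with open(path, "rb") as raw:
    data = raw.read()

print(repr(data))
```

Reading the file back in binary mode is the important part of the check, because text mode would translate the line endings again on the way in.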

How can I write characters such as § into a file using Python?

This is my code for creating the string to be written ('result' is the variable that holds the final text):
fileobj = open('file_name.yml','a+')
begin = initial+":0 "
n_name = '"§'+tag+name+'§!"'
begin_d = initial+"_desc:0 "
n_desc = '"§3'+desc+'§!"'
title = ' '+begin + n_name
descript = ' '+begin_d + n_desc
result = title+'\n'+descript
print()
fileobj.close()
return result
This is my code for actually writing it into the file:
text = writing(initial, tag, name, desc)
override = inserter(fileobj, country, text)
fileobj.close()
fileobj = open('file_name.yml','w+')
fileobj.write(override)
fileobj.close()
(P.S: Override is a function which works perfectly. It returns a longer string to be written into the file.)
I have tried this with .txt and .yml files but in both cases, instead of §, this is what takes its place: xA7 (I cannot copy the actual text into the internet as it changes into the correct character. It is, however, appearing as xA7 in the file.) Everything else is unaffected, and the code runs fine.
Do let me know if I can improve the question in any way.
You're running into a problem called character encoding. There are two parts to the problem - first is to get the encoding you want in the file, the second is to get the OS to use the same encoding.
The most flexible and common encoding is UTF-8, because it can handle any Unicode character while remaining backwards compatible with the very old 7-bit ASCII character set. Most Unix-like systems like Linux will handle it automatically.
fileobj = open('file_name.yml','w+',encoding='utf-8')
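Note that the PYTHONIOENCODING environment variable only affects stdin, stdout and stderr; it does not change the default encoding used by open(). To make UTF-8 the default for open() as well, enable Python's UTF-8 mode (Python 3.7+) by setting the PYTHONUTF8=1 environment variable or running with -X utf8.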
Windows operating systems are a little trickier because they'll rarely assume UTF-8, especially if it's a Microsoft program opening the file. There's a magic byte sequence called a BOM that will trigger Microsoft to use UTF-8 if it's at the beginning of a file. Python can add that automatically for you:
fileobj = open('file_name.yml','w+',encoding='utf_8_sig')
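The difference between the two encodings can be checked by reading the bytes back. This is a small sketch, assuming Python 3; the file names and the sample text are invented for the demonstration.

```python
import os
import tempfile

# Hypothetical file locations for the demonstration.
folder = tempfile.mkdtemp()
plain_path = os.path.join(folder, "plain.yml")
bom_path = os.path.join(folder, "bom.yml")

# "\u00a7" is the section sign; sample text is made up.
text = '"\u00a73description\u00a7!"'

# Plain UTF-8: the section sign is stored as the two bytes C2 A7.
with open(plain_path, "w", encoding="utf-8") as f:
    f.write(text)

# UTF-8 with signature: the same bytes, preceded by the BOM EF BB BF.
with open(bom_path, "w", encoding="utf-8-sig") as f:
    f.write(text)

with open(plain_path, "rb") as f:
    plain_bytes = f.read()
with open(bom_path, "rb") as f:
    bom_bytes = f.read()

print(plain_bytes)
print(bom_bytes)
```

If the file shows a lone A7 byte instead of C2 A7, it was written in a single-byte encoding such as Latin-1 or cp1252 rather than UTF-8.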

Endline symbol in python

Is there something in Python like 'std::endl' in C++? Or, how can I get the end-of-line symbol of the current system?
This seems important because the end-of-line symbol may differ between operating systems.
The os module has linesep which is the platform-specific string to end a line. However, quoting the docs:
Do not use os.linesep as a line terminator when writing files opened in text mode (the default); use a single '\n' instead, on all platforms.
The default Python behaviour for text files and file-like objects is that if your program writes '\n', it will be translated into whatever is appropriate for the local system. So as Mateen Ulhaq wrote, just use '\n'
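A short sketch of that translation, assuming Python 3 and a throwaway temporary file. It writes '\n' in text mode and then compares the raw bytes with what os.linesep predicts for the current platform.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")

# Text mode with default newline handling: each "\n" becomes os.linesep.
with open(path, "w") as f:
    f.write("first\n")
    f.write("second\n")

# Binary mode shows the bytes exactly as stored.
with open(path, "rb") as f:
    raw = f.read()

expected = ("first" + os.linesep + "second" + os.linesep).encode("ascii")
print(repr(os.linesep), raw == expected)
```

Writing os.linesep itself in text mode would go wrong on Windows, because the '\n' inside '\r\n' would be translated a second time.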

Handling a literal space in a filename

I have a problem with os.access(filename, os.R_OK) when the file is an absolute path on a Linux system with a space in the filename. I have tried many ways of quoting the space, from "'" + filename + "'" to filename.replace(' ', '\\ '), but none of them works.
How can I escape the filename so my shell knows how to access it? In terminal I would address it as '/home/abc/LC\ 1.a'
You don't need to (and shouldn't) escape the space in the file name. When you are working with a command line shell, you need to escape the space because that's how the shell tokenizes the command and its arguments. Python, however, is expecting a file name, so if the file name has a space, you just include the space.
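As an illustration, here is a sketch that creates a file with a space in its name and checks it both ways. The directory is a temporary one chosen for the demonstration; the file name mirrors the one in the question.

```python
import os
import tempfile

# Hypothetical directory; only the file name comes from the question.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, "LC 1.a")

with open(filename, "w") as f:
    f.write("data\n")

# Pass the path exactly as it is: no quotes, no backslashes.
readable = os.access(filename, os.R_OK)

# Shell-style escaping produces a different name, which does not exist.
escaped_readable = os.access(filename.replace(" ", "\\ "), os.R_OK)

print(readable, escaped_readable)
```

The escaped variant fails because the backslash becomes part of the name that Python asks the operating system for.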

Writing on new lines to a text file

The section of code I've written is:
thing=9
text_file=open("something.txt", "a")
text_file.write("\n", str(thing))
text_file.close()
This always returns the error Type error: "write" only takes 1 argument. 2 given.
What I'm trying to do is that each time I run this code it writes on a new line rather than the same line. Right now, if this doesn't work, I'm a bit confused how to do this. Any help would be appreciated!
Add a newline to the end of the string with the + operator:
text_file.write(str(thing) + "\n")
Note: If you add it to the front, you will get a blank line at the top of your file, which may not be what you want.
The Python interpreter is correct in saying:
"write" only takes 1 argument. 2 given
Python's file methods are documented here.
All you need to be doing is concatenating your string with the newline character. You can do so by replacing:
text_file.write("\n", str(thing))
with:
text_file.write("\n" + str(thing))
This will write an empty line before writing out what you want. This might not be what you are looking for. Instead you can do:
text_file.write(str(thing) + '\n')
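Put together, a minimal sketch of the append-one-line-per-run idea might look like this. The helper name append_line and the temporary path are assumptions made for the example; three calls stand in for three separate runs of the script.

```python
import os
import tempfile

# Hypothetical location for something.txt.
path = os.path.join(tempfile.mkdtemp(), "something.txt")


def append_line(value):
    """Hypothetical helper: append one value as its own line."""
    # Mode "a" keeps existing content and adds to the end of the file.
    with open(path, "a") as text_file:
        text_file.write(str(value) + "\n")


# Each call stands in for one run of the original script.
for thing in (9, 10, 11):
    append_line(thing)

with open(path) as text_file:
    lines = text_file.read().splitlines()

print(lines)
```

Using a with block also closes the file automatically, so the explicit close() call is no longer needed.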

How can I detect DOS line breaks in a file?

I have a bunch of files. Some have Unix line endings, many have DOS ones. I'd like to test each file to see if it is DOS formatted before I switch the line endings.
How would I do this? Is there a flag I can test for? Something similar?
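Python 2 can automatically detect which newline convention a file uses, thanks to the "universal newline mode" (U), and you can access Python's guess through the newlines attribute of file objects. (The 'U' flag was removed in Python 3.11; in Python 3 the default text mode already provides universal newlines.)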
f = open('myfile.txt', 'U')
f.readline() # Reads a line
# The following now contains the newline ending of the first line:
# It can be "\r\n" (Windows), "\n" (Unix), "\r" (Mac OS pre-OS X).
# If no newline is found, it contains None.
print repr(f.newlines)
This gives the newline ending of the first line (Unix, DOS, etc.), if any.
As John M. pointed out, if by any chance you have a pathological file that uses more than one newline coding, f.newlines is a tuple with all the newline codings found so far, after reading many lines.
Reference: http://docs.python.org/2/library/functions.html#open
If you just want to convert a file, you can simply do:
with open('myfile.txt', 'U') as infile:
    text = infile.read()  # Automatic ("Universal read") conversion of newlines to "\n"
with open('myfile.txt', 'w') as outfile:
    outfile.write(text)  # Writes newlines for the platform running the program
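For Python 3, a rough equivalent could look like the sketch below. It assumes the default text mode (newline=None), which translates every line ending to '\n' on read and still records what it saw in the newlines attribute. The sample file is created by the example itself.

```python
import os
import tempfile

# Hypothetical DOS-formatted sample file.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(path, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

# Default text mode: line endings arrive as "\n", originals are recorded.
with open(path) as infile:
    text = infile.read()
    detected = infile.newlines

# newline="\n" writes Unix line endings on every platform.
with open(path, "w", newline="\n") as outfile:
    outfile.write(text)

with open(path, "rb") as f:
    converted = f.read()

print(repr(detected), converted)
```

Leaving out newline="\n" in the second open() would write the platform's own convention instead, which matches the behaviour of the Python 2 snippet.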
You could search the string for \r\n. That's DOS style line ending.
EDIT: Take a look at this
(Python 2 only:) If you just want to read text files, either DOS or Unix-formatted, this works:
print open('myfile.txt', 'U').read()
That is, Python's "universal" file reader will automatically use all the different end of line markers, translating them to "\n".
http://docs.python.org/library/functions.html#open
(Thanks handle!)
As a complete Python newbie & just for fun, I tried to find some minimalistic way of checking this for one file. This seems to work:
if b"\r\n" in open("/path/file.txt", "rb").read():
    print("DOS line endings found")
Edit: simplified as per John Machin's comment (no need to use regular expressions).
DOS line breaks are \r\n, Unix ones only \n. So just search for \r\n.
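Searching the raw bytes can also reveal files with mixed line endings. The following is a sketch only; the function name count_line_endings and the sample data are invented for the example.

```python
def count_line_endings(data):
    """Count CRLF, bare LF and bare CR sequences in a bytes object."""
    crlf = data.count(b"\r\n")
    # Every CRLF contains one LF and one CR, so subtract them out.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


# Made-up sample that mixes all three conventions.
sample = b"dos\r\nunix\nold mac\rdos again\r\n"
counts = count_line_endings(sample)
print(counts)
```

For a real file, the bytes would come from open(path, "rb").read(); a file is purely DOS formatted when only the crlf count is non-zero.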
Using grep & bash:
grep -c -m 1 $'\r$' file
echo $'\r\n\r\n' | grep -c $'\r$' # test
echo $'\r\n\r\n' | grep -c -m 1 $'\r$'
You can use the following function (which should work in Python 2 and Python 3) to get the newline representation used in an existing text file. All three possible kinds are recognized. The function reads the file only up to the first newline to decide. This is faster and less memory consuming when you have larger text files, but it does not detect mixed newline endings.
In Python 3, you can then pass the output of this function to the newline parameter of the open function when writing the file. This way you can alter the content of a text file without changing its newline representation.
def get_newline(filename):
    with open(filename, "rb") as f:
        while True:
            c = f.read(1)
            if not c or c == b'\n':
                break
            if c == b'\r':
                if f.read(1) == b'\n':
                    return '\r\n'
                return '\r'
    return '\n'
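A sketch of that round trip, assuming Python 3. To keep the example self-contained it uses detect_newline, a hypothetical compact stand-in for get_newline that reads the whole file rather than stopping at the first newline; the sample file and the upper-casing edit are also made up.

```python
import os
import tempfile


def detect_newline(path):
    """Hypothetical stand-in for get_newline: classify the first line ending."""
    with open(path, "rb") as f:
        data = f.read()
    cr = data.find(b"\r")
    lf = data.find(b"\n")
    # A CR that comes before any LF is either CRLF or a bare CR.
    if cr != -1 and (lf == -1 or cr < lf):
        return "\r\n" if data[cr + 1:cr + 2] == b"\n" else "\r"
    return "\n"


# Hypothetical DOS-formatted input file.
path = os.path.join(tempfile.mkdtemp(), "dos.txt")
with open(path, "wb") as f:
    f.write(b"alpha\r\nbeta\r\n")

style = detect_newline(path)

# Default text mode hands the content over with "\n" line endings.
with open(path) as f:
    text = f.read()

# newline=style writes each "\n" back in the detected convention.
with open(path, "w", newline=style) as f:
    f.write(text.upper())

with open(path, "rb") as f:
    result = f.read()

print(repr(style), result)
```

Like get_newline, this only looks at the first line ending, so a file with mixed endings would be normalised to whichever style appears first.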
