Endline symbol in Python

Is there something in Python like 'std::endl' in the C++ standard library? Or, how can I get the end-of-line symbol for the current system?
This seems important because the end-of-line symbol may differ between operating systems.

The os module has linesep which is the platform-specific string to end a line. However, quoting the docs:
Do not use os.linesep as a line terminator when writing files opened in text mode (the default); use a single '\n' instead, on all platforms.
The default Python behaviour for text files and file-like objects is that if your program writes '\n', it will be translated into whatever is appropriate for the local system. So, as Mateen Ulhaq wrote, just use '\n'.
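A minimal sketch of the point above, assuming Python 3: write '\n' to a file opened in text mode, then read the raw bytes back to see what was actually stored. The file name demo.txt and the temporary directory are made up for the illustration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode (the default): write a plain '\n', not os.linesep.
with open(path, "w") as f:
    f.write("first line\n")

# Binary mode shows the bytes that ended up on disk.
with open(path, "rb") as f:
    raw = f.read()

# The '\n' was translated to the platform's line separator on the way out.
print(repr(os.linesep), repr(raw))
```

On Windows the stored bytes should end in \r\n, on Linux and macOS in \n, without the program ever mentioning os.linesep when writing.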

Open an external file in python

x = open('Homework','r')
print(x.name)
x.close()
I got this error when I ran the code.
File "C:/Users/LENOVO/Desktop/pythonhome/tobechanged.py", line 16, in <module>
x = open('Homework','r')
FileNotFoundError: [Errno 2] No such file or directory: 'Homework'
So I tried typing the full path:
x = open('C:\Users\LENOVO\Desktop\pythonhome','r')
print(x.name)
x.close()
I got a Unicode error.
By the way, I'm using Windows.
As the comments mentioned, it's usually good to type out the full path to the file, because running a script in IDLE, for example, can cause Python to search for the file in a directory you did not intend. You got the Unicode error because you are using a special character, the backslash (\), which starts what is known as an escape sequence. Escape sequences let you specify special characters, like the newline character: \n. In your path, \U begins a Unicode escape, which is why the error mentions Unicode. You can read more about these in Python's docs.
You have to either use a raw string (a string preceded with r, like this r'C:\Users\...'), or escape these characters with double backslashes, like this: C:\\Users\\....
Additionally, you need to specify the extension of the Homework file, otherwise the file system won't be able to find the file you are referring to, resulting in the FileNotFoundError you encountered. As @tdelaney mentioned, these extensions may be hidden by default in Windows Explorer.
Also, the recommended way in Python to open files is using the with statement, as this handles closing the object for you. Here is a sample (assuming that the extension of the Homework file is .txt):
with open('C:\\Users\\LENOVO\\Desktop\\pythonhome\\Homework.txt', 'r') as x:
    print(x.name)
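To see that the with statement really does close the file, here is a small sketch. It assumes Python 3 and uses a temporary stand-in for Homework.txt so that it runs on any machine; the file contents are invented.

```python
import os
import tempfile

# Hypothetical stand-in for Homework.txt so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "Homework.txt")
with open(path, "w") as f:
    f.write("exercise 1\n")

with open(path, "r") as x:
    name = x.name
    first_line = x.readline()

# Leaving the with block closed the file; no explicit x.close() is needed.
print(name, x.closed)
```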
It is because you forgot the file's extension (the ending of its name). For example, if you have a text file named Homework, you would include the extension like this:
open(r'Homework.txt','r')
For this example, the file must be in the same directory as your script. If you wanted to open a file outside of your script's directory, you would have to find its full path. Here is an example with the Homework.txt file in my Downloads folder:
open(r'C:\Users\USER\Downloads\Homework.txt','r')
You can also see that in this code I use an r in front of the path. This tells Python that the literal is a raw string, in which escape sequences are not parsed.
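A short sketch, assuming Python 3, comparing the two safe spellings of a Windows path with the unescaped one. The unescaped literal is compiled from a string so that the example itself stays valid source; the user name USER comes from the answer above and the variable names are mine.

```python
# Two spellings of the same Windows path: a raw string and doubled backslashes.
raw_path = r'C:\Users\USER\Downloads\Homework.txt'
escaped_path = 'C:\\Users\\USER\\Downloads\\Homework.txt'
same = raw_path == escaped_path

# The text being compiled is:  p = 'C:\Users\USER'
# In Python 3, \U starts a Unicode escape, and \Users is not a valid one.
bad_source = "p = 'C:\\Users\\USER'"
try:
    compile(bad_source, "<example>", "exec")
    literal_error = None
except SyntaxError as exc:
    literal_error = exc

print(same, literal_error)
```

The error is raised while the literal is parsed, before open() is ever called, which is why the traceback talks about Unicode rather than about a missing file.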

Opening Files in Python with rb [duplicate]

I have noticed that, in addition to the documented mode characters, Python 2.7.5.1 in Windows XP and 8.1 also accepts modes U and D at least when reading files. Mode U is used in numpy's genfromtxt. Mode D has the effect that the file is deleted, as per the following code fragment:
f = open('text.txt','rD')
print(f.next())
f.close() # file text.txt is deleted when closed
Does anybody know more about these modes, especially whether they are a permanent feature of the language applicable also on Linux systems?
The D flag seems to be Windows specific. Windows seems to add several flags to the fopen function in its CRT, as described in Microsoft's documentation for fopen.
While Python does filter the mode string to make sure no errors arise from it, it does allow some of the special flags, as can be seen in the Python sources. Specifically, it seems that the N flag is filtered out, while the T and D flags are allowed:
while (*++mode) {
    if (*mode == ' ' || *mode == 'N') /* ignore spaces and N */
        continue;
    s = "+TD"; /* each of this can appear only once */
    ...
I would suggest sticking to the documented options to keep the code cross-platform.
This is a bit misleading.
open() accepts any characters in the mode argument, as long as you also pass a valid one, i.e. one of "w", "r", "b", "+", "a".
Thus you can write: open("fname", "w+ANYTHINGYOUWANT").
It will open the file as open("fname", "w+").
And open("fname", "rANYTHINGYOUWANT")
will open the file as open("fname", "r").
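The behaviour above is what this answer reports for Python 2. As a point of comparison, Python 3's built-in open() validates the mode string itself and rejects unknown characters with a ValueError, so neither the extra letters nor the Windows-only D flag get through. A small sketch; the helper name mode_is_accepted and the placeholder path are my own:

```python
import os
import tempfile

# Placeholder path to a file that does not exist, so nothing is overwritten.
missing = os.path.join(tempfile.mkdtemp(), "fname")

def mode_is_accepted(mode):
    """Return True if Python 3's open() accepts this mode string."""
    try:
        with open(missing, mode):
            pass
    except ValueError:
        # Raised for an invalid mode, before the file is touched.
        return False
    except OSError:
        # The mode was fine; the file simply could not be opened.
        return True
    return True

for mode in ("r", "rb", "rD", "rANYTHINGYOUWANT"):
    print(mode, mode_is_accepted(mode))
```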
Regarding "U" flag:
In addition to the standard fopen() values mode may be 'U' or 'rU'.
Python is usually built with universal newlines support; supplying 'U'
opens the file as a text file, but lines may be terminated by any of
the following: the Unix end-of-line convention '\n', the Macintosh
convention '\r', or the Windows convention '\r\n'. All of these
external representations are seen as '\n' by the Python program. If
Python is built without universal newlines support a mode with 'U' is
the same as normal text mode. Note that file objects so opened also
have an attribute called newlines which has a value of None (if no
newlines have yet been seen), '\n', '\r', '\r\n', or a tuple
containing all the newline types seen.
as you can read in the Python documentation: https://docs.python.org/2/library/functions.html#open
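The quoted text is from the Python 2 documentation. In Python 3 the 'U' mode was deprecated and then removed in 3.11, and the default text mode (newline=None) already applies universal newlines when reading. A minimal sketch of that, assuming Python 3 and an invented file name:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mixed.txt")

# Write three lines in binary mode, each with a different terminator.
with open(path, "wb") as f:
    f.write(b"unix\nold mac\rwindows\r\n")

# Default text mode (newline=None) applies universal newlines on read.
with open(path, "r") as f:
    text = f.read()
    seen = f.newlines  # the terminators encountered while reading

print(repr(text), seen)
```

Every terminator comes back as '\n', and the newlines attribute records which kinds were actually present in the file.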
EDIT:
D: Specifies a file as temporary. It is deleted when the last file
pointer is closed.
as you can read in @tmr232's link.
The c, n, t, S, R, T, and D mode options are Microsoft extensions for
fopen and _fdopen and should not be used where ANSI portability is
desired
Further update:
I propose to submit the phenomenon as a bug, because opening a file read-only, i.e. with the "r" flag, and then having it deleted on close merely by adding a single character like "D", even accidentally, is a serious security issue, I think.
But if this serves some unavoidable purpose, please let me know.
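If delete-on-close is what you actually want, the standard library offers a documented, cross-platform way to get it instead of the Windows-only D flag: tempfile.NamedTemporaryFile. A minimal sketch, assuming Python 3; the suffix and contents are arbitrary.

```python
import os
import tempfile

# delete=True (the default) removes the file as soon as it is closed.
tmp = tempfile.NamedTemporaryFile(mode="w+", suffix=".txt")
tmp.write("scratch data\n")
tmp.flush()
tmp_path = tmp.name
existed_while_open = os.path.exists(tmp_path)

tmp.close()
exists_after_close = os.path.exists(tmp_path)

print(existed_while_open, exists_after_close)
```

This only covers files that the program creates for itself; it does not delete an existing file the way 'rD' does in the question.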

How to create a filename with a trailing period in Windows?

How does one work with filenames that end in a period in Python? According to MSDN's site, such filenames are valid in Windows, but whenever I try to create one in Python, it removes the final period. I even tried creating a raw file descriptor with os.open, but it still removes the period.
For example, this will create a file simply named 'test':
os.open('test.', os.O_CREAT | os.O_WRONLY, 0o777)
Edit: Here is the exact quote
About spaces and dots in filenames and directories. The limits are
in the windows shell -- not in Windows or NT. Using 'bash', you can
create files with spaces (or dots), both, at the beginning and end of
a filename. You can then list and open those files in explorer, and
you can 'list' them in the shell (cmd.exe), but you won't necessarily
be able to open them from the shell (especially trailing spaces and
dots).
I figured out how to do this. Apparently, passing a normal filename will strip the period even when calling the Win API directly from C. In order to create the weird filenames, you must use the \\?\ prefix (this also disables relative paths and slash conversion).
open('\\\\?\\C:\\whatever\\test.','w')
It's ugly and nonportable, but it works.
The \\?\ syntax also works with cmd.exe:
dir>"\\?\C:\whatever\test."
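To avoid typing the prefix by hand, it can be added by a small helper. This is only a sketch: to_extended_path is a hypothetical function of mine, not part of any library, and it does nothing but string handling, so it runs on any platform. The open() call itself only has the described effect on Windows.

```python
import ntpath  # Windows path rules, importable on any platform

EXTENDED_PREFIX = "\\\\?\\"  # the four characters \\?\

def to_extended_path(path):
    """Add the extended-length prefix to an absolute Windows path (hypothetical helper)."""
    if path.startswith(EXTENDED_PREFIX):
        return path
    if not ntpath.isabs(path):
        # The prefix disables relative-path handling, so insist on a full path.
        raise ValueError("path must be absolute: %r" % path)
    # The prefix also disables slash conversion, so use backslashes only.
    return EXTENDED_PREFIX + path.replace("/", "\\")

target = to_extended_path("C:\\whatever\\test.")
print(target)
# On Windows one would then call open(target, 'w'), as in the answer above.
```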
Windows will strip the final trailing period, assuming it is the delimiter between a filename and a blank extension. Try using two periods.

Difference between binary and text I/O in python on Windows

I know that I should open a binary file using "rb" instead of "r" because Windows behaves differently for binary and non-binary files.
But I don't understand what exactly happens if I open a file the wrong way and why this distinction is even necessary. Other operating systems seem to do fine by treating both kinds of files the same.
Well, this is for historical (or, as I like to say, hysterical) reasons. The file open modes are inherited from the C stdio library, and hence we follow it.
For Windows, there is no difference between text and binary files, just like in any of the Unix clones. No, I mean it! There are (were) file systems/OSes in which a text file is a completely different beast from an object file and so on. In some you had to specify the maximum length of lines in advance, and fixed-size records were used... fossils from the times of 80-column paper punch cards and such. Luckily, not so in Unices, Windows and Mac.
However, all other things being equal, Unix, Windows and Mac historically differ in what characters they use in the output stream to mark the end of a line (or, same thing, as a separator between lines). In Unix, \x0A (\n) is used. In Windows, a sequence of two characters, \x0D\x0A (\r\n), is used; on Mac, just \x0D (\r). Here are some clues on the origin of those two symbols: ASCII code 10 is called Line Feed (LF) and, when sent to a teletype, would cause it to move down one line (Y++) without changing its horizontal (X) position. Carriage Return (CR), ASCII 13, on the other hand, would cause the printing carriage to return to the beginning of the line (X=0) without scrolling one line down. So when sending output to the printer, both \r and \n had to be sent so that the carriage would move to the beginning of a new line. Now, when typing on a terminal keyboard, operators naturally expect to press one key and not two for end of line. On the Apple ][, that was the 'Return' key (\r).
At any rate, this is how things settled. C's creators were concerned about portability: much of Unix was written in C, unlike before, when OSes were written in assembler. They did not want to deal with each platform's quirks about text representation, so they added this evil hack to their I/O library: depending on the platform, the input and output to a file will be "patched" on the fly so that the program sees new lines the righteous Unix way, as '\n', no matter if it was '\r\n' from Windows or '\r' from Mac. So the developer need not worry about what OS the program runs on; it can still read and write text files in the native format.
There was a problem, however: not all files are text. There are other formats, and they are very sensitive to replacing one character with another. So they thought, we will call those "binary files" and indicate that to fopen() by including 'b' in the mode, and this will flag the library not to do any behind-the-scenes conversion. And that's how it came to be the way it is :)
So to recap: if a file is opened with 'b', in binary mode, no conversions will take place. If it is opened in text mode, depending on the platform, some conversions of the newline character(s) may occur, towards the Unix point of view. Naturally, on a Unix platform there is no difference between reading/writing a "text" or a "binary" file.
This mode is about conversion of line endings.
When reading in text mode, the platform's native line endings (\r\n on Windows) are converted to Python's Unix-style \n line endings. When writing in text mode, the reverse happens.
In binary mode, no such conversion is done.
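The translation can be observed on any platform by asking for Windows-style output explicitly. This is a sketch assuming Python 3, where open() takes a newline parameter; the file name is invented.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")

# newline='\r\n' forces Windows-style output on any platform, so the
# sketch behaves the same everywhere.
with open(path, "w", newline="\r\n") as f:
    f.write("one\ntwo\n")

with open(path, "rb") as f:
    as_binary = f.read()   # bytes exactly as stored

with open(path, "r") as f:
    as_text = f.read()     # '\r\n' translated back to '\n'

print(as_binary, repr(as_text))
```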
Other platforms usually do fine without the conversion, because they store line endings natively as \n. (An exception is Mac OS, which used to use \r in the old days.) Code relying on this, however, is not portable.
In Windows, text mode will convert the newline \n to a carriage return followed by a newline \r\n.
If you read text in binary mode, there are no problems. If you read binary data in text mode, it will likely be corrupted.
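A small sketch of that corruption, assuming Python 3 (where text-mode reads translate \r\n on every platform, not only on Windows). The four payload bytes are arbitrary; latin-1 is chosen because it maps every byte to exactly one character, so any change comes from newline handling alone.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.bin")
payload = bytes([0x00, 0x0D, 0x0A, 0xFF])  # arbitrary bytes, not text

with open(path, "wb") as f:
    f.write(payload)

# Binary mode returns the bytes untouched.
with open(path, "rb") as f:
    intact = f.read()

# Text mode collapses the 0x0D 0x0A pair into a single 0x0A.
with open(path, "r", encoding="latin-1") as f:
    damaged = f.read().encode("latin-1")

print(intact == payload, damaged == payload)
```

The text-mode result is one byte shorter than the original, which is enough to break most binary formats.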
For reading files there should be no difference. When writing to text-files Windows will automatically mess up your line-breaks (it will add \r's before the \n's). That's why you should use "wb".
