"\n" not working in python while writing to files - python

I wrote python code to write to a file like this:
with codecs.open("wrtieToThisFile.txt",'w','utf-8') as outputFile:
    for k,v in list1:
        outputFile.write(k + "\n")
list1 is a list of (char, int) tuples.
The problem is that when I run this, the entries in the file are not separated by "\n" as expected. Any idea what the problem is? I suspect it has something to do with the with statement.
Any help is appreciated. Thanks in advance.
(I am using Python 3.4 with "Python Tools for Visual Studio" version 2.2)

If you are on Windows, a bare \n doesn't terminate a line; the native line ending is \r\n.
Honestly, I'm surprised you are having a problem: by default, any file opened in text mode automatically converts \n to os.linesep. I have no idea what codecs.open() is, but it must be opening the file in binary mode.
Given that is the case, you need to add os.linesep explicitly:
outputFile.write(k + os.linesep)
Obviously you have to import os somewhere.

Figured it out, from this question:
How would I specify a new line in Python?
I had to use "\r\n", because that is the line ending Windows expects.

Per codecs.open's documentation, codecs.open opens the underlying file in binary mode, without line ending conversion. Frankly, codecs.open is semi-deprecated; in Python 2.7 and onwards, io.open (which is the same thing as the builtin open function in Python 3.x) handles 99% of the cases people used to use codecs.open, but better (faster, and without stupid issues like line endings). If you're reliably running on Python 3, just use plain open; if you need to run on Python 2.7 as well, import io and use io.open.
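The suggestion above could look roughly like the following. This is a minimal sketch, not the asker's actual code: the helper name write_keys, the sample pairs, and the temporary output path are all assumptions made for illustration.

```python
import io
import os
import tempfile


def write_keys(path, pairs):
    """Write the first element of each pair on its own line, UTF-8 encoded."""
    # io.open is the builtin open on Python 3. In text mode with the default
    # newline setting, "\n" is translated to os.linesep when writing.
    with io.open(path, "w", encoding="utf-8") as output_file:
        for key, _count in pairs:
            output_file.write(key + u"\n")


# Hypothetical sample data standing in for list1.
pairs = [(u"a", 1), (u"b", 2)]
path = os.path.join(tempfile.mkdtemp(), "out.txt")
write_keys(path, pairs)

# Read the raw bytes back to see which line ending actually reached the disk.
with open(path, "rb") as raw:
    data = raw.read()
```

On Windows the bytes in data should contain \r\n after each key, while on Linux and macOS they contain a plain \n, without the calling code having to care.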

If you are on Windows, try '\r\n'. Or open the file with an editor that recognizes Unix-style newlines.


How can I make Vim open .py files in different lines?

Previously, I used Spyder to write my files, but recently began making the transition to Vim. When I open a .py file using Vim, all previous lines are blended into the first, but separated with ^M.
My ~/.vimrc file uses filetype plugin indent on which I thought would solve this issue.
Thanks!
A file that has only ^M (also known as <CR>, or carriage return) as line separator is using the file format of mac. That seems to be the format of the file you're opening here.
Since this file format is so unusual, Vim will not try to detect it. You can tell Vim to detect it by adding the following to your vimrc file:
set fileformats+=mac
Alternatively, you can use this format while opening a single file by using:
:e ++ff=mac script.py
You might want to convert these files to the more normal unix file format. You can do so after opening a file, with:
:set ff=unix
And then saving the file, with :w or similar.
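If many files need the same conversion, it can also be done outside Vim. The following is a minimal sketch in Python rather than anything from the answer above; the function name normalize_newlines and the sample bytes are assumptions for illustration.

```python
def normalize_newlines(data):
    """Return a copy of the bytes with CRLF and lone CR converted to LF."""
    # Replace CRLF first so a Windows line ending does not turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical sample: a two-line script saved with CR-only line endings.
sample = b"import os\rprint(os.getcwd())\r"
converted = normalize_newlines(sample)
```

Working on bytes rather than text keeps the sketch independent of the file's encoding, as long as that encoding is ASCII-compatible.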
From what I know, ^M is how Vim displays a carriage return. You should be able to replace it with a line break:
:%s/^M/\r/g
I would imagine that the ^M is there because of some configuration specific to Spyder that doesn't translate well to Vim. I am just speculating here and could be totally wrong.
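Substituting with hexadecimal notation is better, in my opinion:
:%s/\%x0D$//e
Explanation: ^M has to be entered as Ctrl-V followed by Enter. Someone who types a literal caret and M instead will get no match and think the command is wrong. Note that the $ restricts this pattern to a carriage return at the end of a line.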

Special characters like ç and ã aren't decoded when the text is obtained from a file

I'm learning Python and tried to make a hanging game (literal translation - don't know the real name of the game in English. Sorry.). For those who aren't familiar with this game, the player must discover a secret word by guessing one letter at a time.
In my code, I get a collection of secret words which is imported from a txt file using the following code:
words_bank = open('palavras.txt', 'r')
words = []
for line in words_bank:
    words.append(line.strip().lower())
words_bank.close()
print(words)
The output of print(words) is ['ma\xc3\xa7\xc3\xa3', 'a\xc3\xa7a\xc3\xad', 'tucum\xc3\xa3'] but if I try print('maçã, açaí, tucumã') in order to check the special characters, everything is printed correctly. Looks like the issue is in the encoding (or decoding... I'm still reading lots of articles about it to really understand) special characters from files.
The content of line 1 of my code is # coding: utf-8 because after some research I found out that I have to specify the Unicode format that is required for the text to be encoded/decoded. Before adding it, I was receiving the following message when running the code:
File "path/forca.py", line 12
SyntaxError: Non-ASCII character '\xc3' in file path/forca.py on line 12, but no encoding declared
Line 12 content: print('maçã, açaí, tucumã')
Things that I've already tried:
add encode='utf-8' as parameter in open('palavras.txt', 'r')
add decode='utf-8' as parameter in open('palavras.txt', 'r')
same as above but with latin1
substitute line 1 content for #coding: latin1
My OS is Ubuntu 20.04 LTS, my IDE is VS Code.
Nothing works!
I don't know what to search for or what to do anymore.
SOLUTION HERE
Thanks to the help from the people who answered, I found out that the real problem was the combination of a VS Code extension (Code Runner) and the python alias on Ubuntu 20.04 LTS.
In my setup, Code Runner runs code in the terminal, so when it called python, the alias pointed to Python 2.7.x. To fix this, I followed this thread to set Python 3 as the default.
It's done! Whenever python is called, both in the terminal and in VS Code with Code Runner, all special characters work just fine.
Thanks, everybody, for your time and your help =)
This only happens when using Python 2.x.
The escaped output appears because you're printing a list rather than the items in the list.
When you call print(words) (words is a list), Python calls repr on the list object. The list builds its representation by calling repr on each item, then joins the results into a neat string.
In Python 2, repr(string) returns an ASCII representation (with escapes) rather than a version suited to your terminal.
Instead, try:
for x in words:
    print(x)
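Note: the option for open is encoding, e.g.
open('myfile.txt', encoding='utf-8')
You should always, always pass the encoding option to open. Without it, Python 3 uses the locale's preferred encoding: on Linux and macOS that is UTF-8 for most people, while on Windows it is usually a legacy 8-bit code page.
Python 3.9 does not change this; UTF-8 is only guaranteed as the default when UTF-8 mode is enabled (for example with -X utf8 or PYTHONUTF8=1).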
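Putting the advice together, reading the word list might look like the sketch below. It is only an illustration: the helper name load_words is hypothetical, and a temporary file with made-up content stands in for the asker's palavras.txt.

```python
import io
import os
import tempfile


def load_words(path):
    """Read one word per line from a UTF-8 file, stripped and lowercased."""
    # An explicit encoding makes the result independent of the system locale.
    with io.open(path, encoding="utf-8") as words_bank:
        return [line.strip().lower() for line in words_bank]


# Hypothetical demo file standing in for palavras.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "palavras.txt")
with io.open(demo_path, "w", encoding="utf-8") as demo:
    demo.write(u"Ma\u00e7\u00e3\nA\u00e7a\u00ed\nTucum\u00e3\n")

words = load_words(demo_path)
```

Using io.open keeps the same call working on Python 2.7, where the builtin open has no encoding parameter; on Python 3 the builtin open is equivalent.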
See Python 2.x vs 3.x behaviour:
Py2
>>> print ['maçã', 'açaí', 'tucumã']
['ma\xc3\xa7\xc3\xa3', 'a\xc3\xa7a\xc3\xad', 'tucum\xc3\xa3']
>>> repr('maçã')
"'ma\\xc3\\xa7\\xc3\\xa3'"
>>> print 'maçã'
maçã
Py3
>>> print(['maçã', 'açaí', 'tucumã'])
['maçã', 'açaí', 'tucumã']
>>> repr('maçã')
"'maçã'"

Is there any function like iconv in Python?

I have some CSV files that need to be converted from Shift-JIS to UTF-8.
Here is my PHP code, which successfully transcodes them to readable text:
$str = utf8_decode($str);
$str = iconv('shift-jis', 'utf-8'. '//TRANSLIT', $str);
echo $str;
My question is how to do the same thing in Python.
I don't know PHP, but does this work?
mystring.decode('shift-jis').encode('utf-8')
Also, I assume the CSV content comes from a file. There are a few options for opening a file in Python.
with open(myfile, 'rb') as fin:
would be the first, and you would get the data as raw bytes.
with open(myfile, 'r') as fin:
would be the default text-mode opening.
I also tried this on my computer with a Shift-JIS text file, and the following code worked:
with open("shift.txt" , "rb") as fin :
    text = fin.read()
text.decode('shift-jis').encode('utf-8')
result was the following in UTF-8 (without any errors)
' \xe3\x81\xa6 \xe3\x81\xa7 \xe3\x81\xa8'
OK, I validated my solution :)
The first character is indeed correct: "\xe3\x81\xa6" is the byte sequence E3 81 A6.
It gives the correct result.
You can try yourself at this URL
For when Python's built-in encodings are insufficient, there's an iconv package on PyPI.
pip install iconv
Unfortunately, the documentation is nonexistent.
There's also iconv_codecs:
pip install iconv_codecs
e.g.:
>>> import iconv_codecs
>>> iconv_codecs.register('ansi_x3.110-1983')
>>> "foo".encode('ansi_x3.110-1983')
It would be helpful if you could post the string you are trying to convert, since this error suggests a problem with the input data. Older versions of PHP failed silently on broken input strings, which makes this hard to diagnose.
According to the documentation, this might also be due to differences between Shift-JIS dialects; try using 'shift_jisx0213' or 'shift_jis_2004' instead.
If using another dialect does not work, you might get away with asking Python to fail silently by using .decode('shift-jis','ignore') or .decode('shift-jis','replace').
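For whole files, the decode-then-encode idea from the answers above can be sketched as follows. This is an illustration under assumptions, not a drop-in replacement for the PHP code: the function name transcode_file, the file names, and the sample content are hypothetical, and the errors parameter simply forwards the 'ignore' or 'replace' options mentioned above.

```python
import io
import os
import tempfile


def transcode_file(src_path, dst_path, src_encoding="shift_jis", errors="strict"):
    """Re-encode a text file from src_encoding to UTF-8."""
    # newline="" disables newline translation, so the CSV's own line endings
    # pass through unchanged.
    with io.open(src_path, "r", encoding=src_encoding, errors=errors, newline="") as src:
        with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
            for line in src:
                dst.write(line)


# Hypothetical demo: create a small Shift-JIS file, then convert it.
work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, "shift.txt")
dst_path = os.path.join(work_dir, "utf8.txt")
with open(src_path, "wb") as f:
    f.write(u"\u3066,\u3067,\u3068\r\n".encode("shift_jis"))

transcode_file(src_path, dst_path)

with open(dst_path, "rb") as f:
    result = f.read()
```

Passing src_encoding="shift_jis_2004" or errors="replace" would be the file-level equivalent of the dialect and error-handling suggestions above.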

Python command line: editing mistake on previous line?

When using python via the command line, if I see a mistake on a previous line of a nested statement is there any way to remove or edit that line once it has already been entered?
e.g.:
>>> file = open("file1", "w")
>>> for line in file:
...     parts = line.split('|')   <-- example, I meant to type '\' instead
...     print parts[0:1]
...     print ";"
...     print parts[1:]
So rather than retyping the entire thing to fix one character, can I go back and edit something in hindsight?
I know I could just code it up in vim or something and have a persistent copy I can do anything I want with, but I was hoping for a handy-dandy trick with the command line.
-- thanks!
You can't do that in the standard Python interpreter. However, a recent version of IPython provides a lightweight GUI (it looks like a simple shell, but is in fact a GUI) that features multi-line editing, syntax highlighting and a bunch of other things. To use the IPython GUI, run the ipython qtconsole command.
Not that I know of in all the years I've been coding Python. That's what text editors are for =)
If you are an Emacs user, you can set your environment up such that the window is split into the code buffer and Python shell buffer, and then execute your entire buffer to see the changes.
Maybe. The Python Tutorial says:
Perhaps the quickest check to see whether command line editing is supported is typing Control-P to the first Python prompt you get. If it beeps, you have command line editing; see Appendix Interactive Input Editing and History Substitution for an introduction to the keys. If nothing appears to happen, or if ^P is echoed, command line editing isn’t available; you’ll only be able to use backspace to remove characters from the current line.
In addition to @MatToufoutu's suggestion, you might also take a look at DreamPie, though it's just a GUI for the shell without IPython's other extensions.
Now, instead of ipython, use
jupyter console
at the command prompt.

What to do with "The input line is too long" error message?

I am trying to use os.system() to call another program that takes an input and an output file. The command I use is ~250 characters due to the long folder names.
When I try to call the command, I'm getting an error: The input line is too long.
I'm guessing there's a 255-character limit (it's built on the C system call, but I couldn't find the limits for that either).
I tried changing the directory with os.chdir() to reduce the folder trail lengths, but when I try using os.system() with "..\folder\filename" it apparently can't handle relative path names. Is there any way to get around this limit or get it to recognize relative paths?
Even though it's a good idea to use subprocess.Popen(), that does not solve the issue.
Your problem is not the 255-character limit. That was true in the DOS era; the limit was later raised to 2048 for Windows NT/2000, and raised again to 8192 for Windows XP and later.
The real solution is to work around a very old bug in the Windows APIs _popen() and _wpopen().
If you use quotes anywhere in the command line, you have to wrap the entire command in quotes, or you will get the "The input line is too long" error message.
All Microsoft operating systems starting with Windows XP have an 8192-character limit, which is enough for any reasonable command line, but they never fixed this bug.
To overcome the bug, just wrap your entire command in double quotes; if you want to know more, read the MSDN comment on _popen().
Be careful, because these work:
prog
"prog"
""prog" param"
""prog" "param""
But this will not work:
""prog param""
If you need a function that adds the quotes when they are needed, you can take the one from http://github.com/ssbarnea/tendo/blob/master/tendo/tee.py
You should use the subprocess module instead. See this little doc for how to rewrite os.system calls to use subprocess.
You should use subprocess instead of os.system.
subprocess has the advantage of being able to change the directory for you:
import subprocess
my_cwd = r"..\folder"  # a raw string literal cannot end with a backslash
my_process = subprocess.Popen(["command name", "option 1", "option 2"], cwd=my_cwd)
my_process.wait()  # wait for process to end
if my_process.returncode != 0:
    print("Something went wrong!")
The subprocess module contains some helper functions as well if the above looks a bit verbose.
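One of those helpers is subprocess.check_output. The sketch below is only an illustration under assumptions: a temporary directory stands in for the long folder path, and the running Python interpreter stands in for the external program, so that the example runs anywhere.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-ins: a temporary directory plays the role of the long
# folder path, and the current Python interpreter plays the external program.
work_dir = tempfile.mkdtemp()
command = [sys.executable, "-c", "import os; print(os.path.basename(os.getcwd()))"]

# cwd= changes the directory for the child process only. Passing the command
# as a list avoids building one long, quoted command-line string by hand.
# check_output raises CalledProcessError if the exit code is non-zero.
output = subprocess.check_output(command, cwd=work_dir)
reported_dir = output.decode().strip()
```

With cwd set, the input and output files can be passed as short relative names instead of full paths, which keeps the command line short.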
Assuming you're using Windows (judging from the backslashes), you could write a .bat file from Python and then call os.system() on that. It's a hack.
Make sure that when you use '\' in your strings, it is properly escaped.
Python uses '\' as the escape character, so in the string "..\folder\filename" each \f is read as a form feed character, and the result is not the path you intended.
You probably want to use
r"..\folder\filename"
or
"..\\folder\\filename"
I got the same message, but it was strange because the command was not that long (130 characters) and it used to work; it just stopped working one day.
I closed the command window, opened a new one, and it worked.
I had had the command window open for a long time (maybe months; it's a remote virtual machine).
I guess it is some Windows bug with a buffer or something.
