Strange String Formatting In Python - python

I tried to insert some values into a string, but an invisible character gets added automatically.
Run this code and you will see that some invisible characters, \xe2\x80\x8b, have been added.
>>> "LIN+{}+5+​{}:EN'".format(str(5), 2526526545852 or '')
"LIN+5+5+\xe2\x80\x8b2526526545852:EN'"
So it should simply be:
"LIN+5+5+2526526545852:EN'"
Please share suggestions.
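For what it's worth, \xe2\x80\x8b is the UTF-8 encoding of U+200B (ZERO WIDTH SPACE), so the character is most likely sitting invisibly inside the format string itself, probably pasted in along with the text. Below is a minimal Python 3 sketch of how one might spot it and strip it. The variable names are made up for illustration, and the character is written as an escape so that it is visible in the source:

```python
import unicodedata

# The template deliberately contains U+200B, written as an escape
# so that it can be seen in the source code.
template = "LIN+{}+5+\u200b{}:EN'"

# Report every non-ASCII character hiding in the template.
for ch in template:
    if ord(ch) > 127:
        print(hex(ord(ch)), unicodedata.name(ch, "UNKNOWN"))

# Strip the zero width space, then format as before.
clean_template = template.replace("\u200b", "")
result = clean_template.format(5, 2526526545852)
print(result)
```

Retyping the string literal by hand, instead of pasting it, should have the same effect.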

Related

Python prompt printing multi line comment with extra characters

I am using Prompt_Toolkit to create a terminal-like program. One of my commands is help: you enter help and it returns a multi-line comment. But when it is printed, the formatting is interpreted strangely.
As you can see, when I type it at the prompt, it returns the __doc__ string, which is just a formatted string with tabs.
I am not sure what the ^I characters are doing in there, or what I should look up to get rid of them.
Looks like the terminal's encoding differs from yours, so it outputs certain characters as ^I as a means to represent tabs (^I is caret notation for the tab character, Ctrl+I). Nothing to be worried about.
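If the tabs themselves are unwanted, one option is to convert them to spaces before printing. Here is a sketch using only the standard library. The help text below is a hypothetical stand-in for the real __doc__ string, and nothing here is specific to Prompt_Toolkit:

```python
# Hypothetical help text containing tab characters, similar to a
# tab-indented docstring.
doc = "Commands:\n\thelp\tshow this message\n\tquit\tleave the program\n"

# repr() makes the tabs visible as \t, which helps when debugging.
print(repr(doc))

# str.expandtabs() replaces each tab with spaces up to the next tab stop.
cleaned = doc.expandtabs(4)
print(cleaned)
```

Whether a tab width of 4 looks right depends on the help text, so treat that value as a guess.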

Python 3.x and printing Unicode symbols

Hi,
I'm trying to print out some Unicode symbols, let's say from U+2660 to U+2667.
With one there's no problem, I just write:
print('\u2660')
But when I want to print a set of symbols in a loop (or a single one that depends on a variable), something like this doesn't work:
for i in range(2660, 2668):
    print('\u{}'.format(i))
I thought Python would execute the .format function first, replacing {} with the number, and only then look at what is inside the quotes and print it. It doesn't, and I don't understand why. :)
Please help,
TIA
wiktor
The parsing of the Unicode escape is done at compile-time, not runtime.
for i in range(0x2660, 0x2668):
    print(chr(i))
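To spell that out: an escape such as '\u2660' is turned into a single character when the source is compiled, so '\u{}' never reaches .format() (in Python 3 it is rejected as a syntax error). Note also that the code points are hexadecimal, hence 0x2660 rather than 2660. A small sketch of building the characters at runtime instead:

```python
# Build each character from its code point at runtime.
suits = [chr(cp) for cp in range(0x2660, 0x2668)]
print(" ".join(suits))

# If the code point arrives as a hex string, convert it with int(..., 16).
spade = chr(int("2660", 16))
print(spade)
```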

How to recognize special eol character when I see it, using Python?

I'm scraping a set of originally pdf files, using Python. Having gotten them to text, I had a lot of trouble getting the line endings out. I couldn't figure out what the line separator was. The trouble is, I still don't know.
It's not a '\n', or, I don't think, '\r\n'. However, I've managed to isolate one of these special characters. I literally have it in memory, and by doing a call to my_str.replace(eol, ''), I can remove all of these characters from one of my files.
So my question is open-ended. I'm a bit lost when it comes to unicode and such. How can I identify this character in my files without resorting to something ridiculous, like serializing it and then reading it in? Is there a way I can refer to it as a code, perhaps? I can't get Python to yield what it actually IS. All I ever see if I print it, or call unicode(special_eol) is the character in its functional usage as a newline.
Please help! Thanks, and sorry if I'm missing something obvious.
To determine what specific character that is, you can use str.encode('unicode_escape') or repr() to get (in Python 2) an ASCII-printable representation of the character:
>>> print u'☃'.encode('unicode_escape')
\u2603
>>> print repr(u'☃')
u'\u2603'
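The snippet above is Python 2. A rough Python 3 equivalent follows, assuming the mystery character is already held in a variable. Here special_eol is a stand-in set to U+2028 (LINE SEPARATOR) purely as an example, not a claim about what the PDF text actually contains:

```python
import unicodedata

special_eol = "\u2028"  # example value only

# ascii() escapes non-ASCII characters, much like Python 2's repr().
escaped = ascii(special_eol)
print(escaped)

# The unicode_escape codec gives the same information as bytes.
print(special_eol.encode("unicode_escape"))

# The numeric code point and the official Unicode name.
print(hex(ord(special_eol)))
name = unicodedata.name(special_eol, "<unnamed>")
print(name)
```

Once the code point is known, the character can be written in code as an escape, for example my_str.replace('\u2028', '').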

pydoc.render_doc() adds characters - how to avoid that?

There are already some questions touching on this, but none of them seems to actually solve it.
import pydoc
hlpTxt = pydoc.render_doc(help)
already does what I want! It looks flawless when printed to the (right) console, but it has these extra characters included:
_\x08_H\x08He\x08el\x08lp\x08pe\x08er\x08r
In Maya, for instance, it looks like it's filled with ◘ symbols, while help() renders flawlessly there as well.
Removing \x08 leaves me with an extra letter each:
__HHeellppeerr
which is also not very useful.
Someone commented that it works for him when piped to a subprocess or into a file. I have already tried that as well and failed. Is there another way than
hlpFile = open('c:/help.txt', 'w')
hlpFile.write(hlpTxt)
hlpFile.close()
? Because this leaves me with the same problem. Notepad++ actually shows BS symbols at those places. Yes, for backspace, obviously.
Anyway: There must be a reason that these symbols are added and removing them afterwards might work but I can't imagine there isn't a way to have them not created in the first place!
So finally is there another pydoc method I'm missing? Or a str.encode/decode thing I have not yet seen?
btw: I'm not looking for help.__doc__!
In Python 2, you can remove the boldface sequences with pydoc.plain:
pydoc.plain(pydoc.render_doc(help))
>>> help(pydoc.plain)
Help on function plain in module pydoc:
plain(text)
    Remove boldface formatting from text.
In Python 3, pydoc.render_doc accepts a renderer argument:
pydoc.render_doc(help, renderer=pydoc.plaintext)
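As for why the characters are there: \x08 is a backspace, and pydoc marks bold text by overstriking, that is, it emits each character, a backspace, and the character again. That is why deleting only \x08 doubles every letter. Below is a minimal Python 3 sketch of removing the pairs by hand (as far as I know this is essentially what pydoc.plain does), shown next to the renderer approach. Here len is just an arbitrary object to document:

```python
import pydoc
import re

# The overstruck sample from the question: each bold character is
# emitted as "char, backspace, char".
sample = "_\x08_H\x08He\x08el\x08lp\x08pe\x08er\x08r"

# Drop every "any character + backspace" pair, keeping the second copy.
stripped = re.sub(".\x08", "", sample)
print(stripped)

# Asking for the plain-text renderer avoids the overstrikes entirely.
plain_doc = pydoc.render_doc(len, renderer=pydoc.plaintext)
print("\x08" in plain_doc)
```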

Python Printing from python32

I can't get Python to print a word doc. What I am trying to do is to open the Word document, print it and close it. I can open Word and the Word document:
import win32com.client
msword = win32com.client.Dispatch("Word.Application")
msword.Documents.Open("X:\Backoffice\Adam\checklist.docx")
msword.visible = True
Next I tried to print it:
msword.activedocument.printout("X:\Backoffice\Adam\checklist.docx")
I get an error saying "print out not valid".
Could someone shed some light on how I can print this file from Python? I think it might be as simple as changing the word "printout". Thanks, I'm new to Python.
msword.ActiveDocument gives you the current active document. The PrintOut method prints that document: it doesn't take a document filename as a parameter.
From http://msdn.microsoft.com/en-us/library/aa220363(v=office.11).aspx:
expression.PrintOut(Background, Append, Range, OutputFileName, From, To, Item,
Copies, Pages, PageType, PrintToFile, Collate, FileName, ActivePrinterMacGX,
ManualDuplexPrint, PrintZoomColumn, PrintZoomRow, PrintZoomPaperWidth,
PrintZoomPaperHeight)
Specifically, Word is trying to use your filename as the Boolean Background argument, which may be set to True to print in the background.
Edit:
Case matters, and the error is a bit bizarre. msword.ActiveDocument.PrintOut() should print it. msword.ActiveDocument.printout() throws an error complaining that 'PrintOut' is not a property.
I think what happens internally is that Python tries to compensate when you don't match the case on properties, but it doesn't get it quite right for methods. Or something like that, anyway. ActiveDocument and activedocument are interchangeable, but PrintOut and printout aren't.
You probably have to escape the backslash character \ with \\:
msword.Documents.Open("X:\\Backoffice\\Adam\\checklist.docx")
EDIT: Explanation
The backslash is usually used to declare special characters. For example \n is the special character for a new-line. If you want a literal \ you have to escape it.
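A raw string literal (the r prefix) is another way to write such a path without doubling every backslash. In this particular path, \B, \A and \c are not recognized escape sequences, so Python currently leaves those backslashes in place (newer versions warn about it), which may be why the Open call appeared to work. A small sketch comparing the spellings, where the second path is invented to show a case that does break:

```python
# Escaped and raw spellings of the same Windows path.
escaped = "X:\\Backoffice\\Adam\\checklist.docx"
raw = r"X:\Backoffice\Adam\checklist.docx"
print(escaped == raw)

# A made-up path where the difference matters: "\n" and "\t" are
# real escape sequences, so the unescaped version is silently mangled.
wrong = "X:\new\test.docx"
right = r"X:\new\test.docx"
print(repr(wrong))
print(repr(right))
```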
