Python Curses, reading wide character's attribute from screen - python

The problem I'm trying to solve is to get a pair (ch, att) representing the character and the associated attribute currently displayed at a given position.
Now, when the displayed character is not a wide one (i.e. it is an ASCII character), the method .inch does the job, provided the result is masked correctly. The issue arises when the displayed character is wide. More precisely, I know how to get the character itself through .instr, but this function does not return any information about the attribute.
Since, as far as I know, there is no specific function to get the attribute alone, my first attempt was to use .inch, drop the 8 least significant bits and interpret the result as the attribute. This seemed to work to some extent, but on double-checking I realized that reading Greek letters (u"\u03b1" for instance) with no attribute in this way returns att = 11.0000.0000 instead of 0. Is there a better way to approach the problem?
EDIT, a minimal example for Python3
import curses
def bin(x):
    out = ''
    while x > 0:
        out = str(x % 2) + out
        x = x // 2
    return out
def main(s):
    s.addstr(1, 1, u'\u03b1')
    s.refresh()
    chratt = s.inch(1, 1)
    att = chratt & 0xFF00
    s.addstr(2, 1, bin(att))
    s.refresh()
    while True:
        pass
curses.wrapper(main)

In curses, inch and instr are only for ASCII characters, as you suspected. "Complex" or "wide" characters, such as non-ASCII characters encoded in UTF-8, use a separate set of functions, as explained here on Stack Overflow by one of the ncurses creators.
Now for the bad news: they aren't implemented in Python's curses module (yet). A pull request was submitted here and it is very close to merging (90%), so if you really need it, why not contribute yourself?
And if that isn't an option, you could record every change you make to the screen in a variable and then pull the wide characters from there.
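The "record every change" idea could be sketched roughly as follows. ShadowWindow is a hypothetical helper, not part of the curses module, and it assumes that all writes go through its addstr method and that every character occupies exactly one screen cell (true for Greek letters, not for double-width characters such as CJK).

```python
class ShadowWindow:
    """Wrap a curses-like window and remember what was written to each cell.

    Hypothetical helper, not part of the curses module. Assumes each
    character takes exactly one cell and that nothing else writes to
    the wrapped window.
    """

    def __init__(self, window):
        self.window = window
        self.cells = {}  # (y, x) -> (character, attribute)

    def addstr(self, y, x, text, attr=0):
        # Draw on the real window, then record every character with its attribute.
        self.window.addstr(y, x, text, attr)
        for offset, ch in enumerate(text):
            self.cells[(y, x + offset)] = (ch, attr)

    def read(self, y, x):
        """Return (ch, att) for a cell, or (' ', 0) if nothing was recorded."""
        return self.cells.get((y, x), (" ", 0))
```

Inside main(s) from the question, one would build shadow = ShadowWindow(s), write with shadow.addstr(1, 1, u'\u03b1') and later call shadow.read(1, 1) instead of s.inch(1, 1).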

Related

Python | tkinter: What does tkinter.END do?

Learning python through a book, and tkinter.END is used in a block of code without being explained
import tkinter
def count(text, out_data):
    """ Update out_data with the total number of As, Ts, Cs, and Gs found in text."""
    data = text.get('0.0', tkinter.END)
    counts = {}
    for char in 'ATCG':
        counts[char] = data.count(char)
    out_data.set('Num As: {0} Num Ts: {1} Num Cs: {2} Num Gs: {3}'.format(
        counts['A'], counts['T'], counts['C'], counts['G']))
...
I've looked online, but I only run across examples of it, never an explanation of its function.
I tried help(tkinter) in a shell and got END = 'end', which wasn't very useful.
If more code is required, just let me know. I didn't want to post the entire code and make you read more than necessary.
It doesn't "do" anything. It is a constant, the literal string "end". In this context it represents the point immediately after the last character entered by the user. The function get on a text widget requires two values: a starting position and an ending position.
Note: in the line text.get('0.0', tkinter.END), '0.0' is invalid (though tkinter graciously accepts it and treats it the same as '1.0'). Text indexes are of the form line.character. Lines start counting at 1, characters start at zero. So the first character is '1.0', not '0.0'.
It's just a constant.
The Python tkinter library is a wrapper around tk, so you'll want to reference the source documentation, which can be found at: http://www.tkdocs.com/tutorial/text.html#basics.
For your question, see the section on Retrieving the Text. In their Python example, they don't even use the constant:
thetext = text.get('1.0', 'end')

Python - Curses : how to use the inch method to get the attribute of a character

I am learning python and curses.
I am at a point where I want to be able to tell if a specific character is either A_BOLD, A_DIM or A_REVERSE etc... So I could eventually change its attribute accordingly (using for example window.chgat(attr)).
But I do not know how to retrieve this information.
According to the documentation:
window.inch([y, x])
Return the character at the given position in the window.
The bottom 8 bits are the character proper, and upper bits are the
attributes.
I understand that the information about the character attribute is incorporated within the result from inch and as a matter of fact, printing the character obtained displays it with its attributes as well.
But I'm not fluent enough in computer speak to understand how to use this. How do I get and interpret those upper bits?
What should I do, say, to check if the character is printed in bold or not?
You need to use bitwise operators (e.g. &):
attrs = window.inch(y, x)
ch = chr(attrs & 0xFF)
isbold = bool(attrs & curses.A_BOLD)
and so on for the other attributes.
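The masking above can be wrapped in two small helpers. This is a minimal sketch: split_inch and has_attr are hypothetical names, while curses.A_CHARTEXT and curses.A_ATTRIBUTES are the module's own bit masks for the character part and the attribute part. It only applies to single-byte characters, as discussed in the question above.

```python
import curses


def split_inch(value):
    """Split a value returned by window.inch() into (character, attributes)."""
    ch = chr(value & curses.A_CHARTEXT)      # low bits: the character itself
    attrs = value & curses.A_ATTRIBUTES      # remaining bits: attributes and color pair
    return ch, attrs


def has_attr(value, attr):
    """True if the given attribute bit (e.g. curses.A_BOLD) is set in value."""
    return bool(value & attr)
```

With a real window, ch, attrs = split_inch(window.inch(y, x)) followed by has_attr(attrs, curses.A_BOLD) answers the "is it bold?" question. The same test works for curses.A_DIM and curses.A_REVERSE.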

String problems with Python

I am a Python noob so I may be missing something here, but I have a problem with how a string is handled inside my program. When I display it, only the first character is displayed.
# some code
MessageBox = ctypes.windll.user32.MessageBoxA
# some other code
testString = self.statusBar1.GetStatusText(0)
# displays "azertyu"
MessageBox(None, "azertyu", 'COUCOU', 0)
# displays 'M'
MessageBox(None, testString, 'COUCOU3', 0)
# displays 'a'
MessageBox(None, testString[1:], 'COUCOU3', 0)
# displays 'c'
MessageBox(None, testString[2:], 'COUCOU3', 0)
The full string is 'Machine' (it's actually longer than that).
How come Python treats every character as the final one and displays only one character at a time? Am I missing some Python basics here?
PS. GetStatusText reference is available at http://www.wxpython.org/docs/api/wx.StatusBar-class.html#GetStatusText. I have tested GetStatusText with a very long string and it doesn't seem to cut texts.
MessageBoxA is the ANSI (8-bit character) version of the MessageBox Win32 API. Your testString is probably a Unicode value, so the value passed to MessageBoxA ends up looking like an array of bytes with a zero at every other index. In other words, it looks like a one-character string terminated by a NUL character. I bet that if you use str(testString) or switch to MessageBoxW it will work as expected; however, you really should be using wx.MessageBox or wx.MessageDialog instead.
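The byte layout described above can be reproduced without calling the Win32 API at all. The sketch below assumes the Unicode string reaches the function as UTF-16 little-endian data, which is how Windows stores wide strings; as_seen_by_ansi_api is a hypothetical helper that mimics a C function reading bytes up to the first NUL.

```python
def as_seen_by_ansi_api(text):
    """Return the bytes a char* API would read from UTF-16 text.

    A C string ends at the first NUL byte, and every ASCII character
    encoded as UTF-16-LE is followed by one.
    """
    raw = text.encode("utf-16-le")
    return raw.split(b"\x00", 1)[0]


test_string = "Machine"
print(test_string.encode("utf-16-le")[:6])    # b'M\x00a\x00c\x00'
print(as_seen_by_ansi_api(test_string))       # b'M'
print(as_seen_by_ansi_api(test_string[1:]))   # b'a'
print(as_seen_by_ansi_api(test_string[2:]))   # b'c'
```

This matches the symptoms in the question: each call shows only the first character of whatever slice is passed.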
If you are using wxPython, why are you trying to show a message box with ctypes? The wxPython package has its own message dialogs. See the following links:
http://wiki.wxpython.org/MessageBoxes
http://wxpython.org/docs/api/wx.MessageDialog-class.html
http://www.blog.pythonlibrary.org/2010/07/10/the-dialogs-of-wxpython-part-2-of-2/
The wxPython demo package (downloadable from the wxPython website) has examples of MessageDialog and GenericMessageDialog.
Try ctypes.windll.user32.MessageBoxW instead of ctypes.windll.user32.MessageBoxA:
import ctypes
ctypes.windll.user32.MessageBoxW(None, "Hello, world!", "Test", 0)
It's treating the testString as a list
In [214]: for x in "Machine":
   .....:     print x
   .....:
M
a
c
h
i
n
e
Have you tried the following?
MessageBox(None, [testString], 'COUCOU3', 0)
It's as if MessageBox is expecting a list of text, which might make sense:
["DANGER", "Will Robinson"]
That would then give two lines of text in your message.
PURE GUESSWORK

Python’s `str.format()`, fill characters, and ANSI colors

In Python 2, I’m using str.format() to align a bunch of columns of text I’m printing to a terminal. Basically, it’s a table, but I’m not printing any borders or anything—it’s simply rows of text, aligned into columns.
With no color-fiddling, everything prints as expected.
If I wrap an entire row (i.e., one print statement) with ANSI color codes, everything prints as expected.
However: If I try to make each column a different color within a row, the alignment is thrown off. Technically, the alignment is preserved; it’s the fill characters (spaces) that aren’t printing as desired; in fact, the fill characters seem to be completely removed.
I’ve verified the same issue with both colorama and xtermcolor. The results were the same. Therefore, I’m certain the issue has to do with str.format() not playing well with ANSI escape sequences in the middle of a string.
But I don’t know what to do about it! :( I would really like to know if there’s any kind of workaround for this problem.
Color and alignment are powerful tools for improving readability, and readability is an important part of software usability. It would mean a lot to me if this could be accomplished without manually aligning each column of text.
Little help? ☺
This is a very late answer, left as bread crumbs for anyone who finds this page while struggling to format text with built-in ANSI color codes.
byoungb's comment about making padding decisions on the length of pre-colorized text is exactly right. But if you already have colored text, here's a work-around:
See my ansiwrap module on PyPI. Its primary purpose is providing textwrap for ANSI-colored text, but it also exports ansilen() which tells you "how long would this string be if it didn't contain ANSI control codes?" It's quite useful in making formatting, column-width, and wrapping decisions on pre-colored text. Add width - ansilen(s) spaces to the end or beginning of s to left (or respectively, right) justify s in a column of your desired width. E.g.:
from ansiwrap import ansilen
def ansi_ljust(s, width):
    needed = width - ansilen(s)
    if needed > 0:
        return s + ' ' * needed
    else:
        return s
Also, if you need to split, truncate, or combine colored text at some point, you will find that ANSI's stateful nature makes that a chore. You may find ansi_terminate_lines() helpful; it "patches up" a list of sub-strings so that each has independent, self-standing ANSI codes with the same effect as in the original string.
The latest versions of ansicolors also contain an equivalent implementation of ansilen().
Python doesn't distinguish between 'normal' characters and ANSI colour codes; the latter are also just characters, which the terminal happens to interpret.
In other words, printing '\x1b[92m' to a terminal may change the terminal text colour, but Python doesn't see it as anything but a sequence of 5 characters. If you use print repr(line) instead, Python will print the string literal form, using escape codes for non-printable characters (so the ESC ASCII code, 27, is displayed as \x1b), which lets you see how many characters have been added.
You'll need to adjust your column alignments manually to allow for those extra characters.
Without your actual code, that's hard for us to help you with though.
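Since the question's code is not shown, here is a small self-contained illustration of the effect and of one possible workaround: pad the plain text first, then add the colour codes. GREEN, RESET and colored_cell are names made up for this sketch; the escape sequences are the standard SGR codes for bright green and reset.

```python
GREEN = "\x1b[92m"   # 5 characters as far as Python is concerned
RESET = "\x1b[0m"    # 4 characters


def colored_cell(text, width, color=GREEN):
    """Pad the plain text to the column width, then wrap it in colour codes."""
    return color + "{:<{w}}".format(text, w=width) + RESET


# Naive approach: format() counts the 9 invisible characters too, so the
# 11-character string already exceeds the width of 10 and gets no padding.
naive = "{:<10}".format(GREEN + "ok" + RESET)

# Workaround: the padding is computed on the visible text only.
fixed = colored_cell("ok", 10)

print(repr(naive))   # '\x1b[92mok\x1b[0m'
print(repr(fixed))   # '\x1b[92mok        \x1b[0m'
```

This is why the fill characters appear to vanish once each column carries its own colour codes.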
Also late to the party. Had this same issue dealing with color and alignment. Here is a function I wrote which adds padding to a string that has characters that are 'invisible' by default, such as escape sequences.
def ljustcolor(text: str, padding: int, char=" ") -> str:
    import re
    pattern = r'(?:\x1B[#-_]|[\x80-\x9F])[0-?]*[ -/]*[#-~]'
    matches = re.findall(pattern, text)
    offset = sum(len(match) for match in matches)
    return text.ljust(padding + offset, char[0])
The pattern matches all ansi escape sequences, including color codes. We then get the total length of all matches which will serve as our offset when we add it to the padding value in ljust.

How to work with very long strings in Python?

I'm tackling Project Euler's problem 220 (it looked easy in comparison to some of the others, and I thought I'd try a higher-numbered one for a change!)
So far I have:
D = "Fa"
def iterate(D,num):
    for i in range (0,num):
        D = D.replace("a","A")
        D = D.replace("b","B")
        D = D.replace("A","aRbFR")
        D = D.replace("B","LFaLb")
    return D
instructions = iterate("Fa",50)
print instructions
Now, this works fine for low values, but for higher iteration counts you just get a "Memory error". Can anyone suggest a way to overcome this? I really want a string/file that contains instructions for the next step.
The trick is in noticing which patterns emerge as you run the string through each iteration. Try evaluating iterate(D,n) for n between 1 and 10 and see if you can spot them. Also feed the string through a function that calculates the end position and the number of steps, and look for patterns there too.
You can then use this knowledge to simplify the algorithm to something that doesn't use these strings at all.
Python strings are not going to be the answer to this one. Strings are stored as immutable arrays, so each one of those replacements creates an entirely new string in memory. Not to mention, the set of instructions after 10^12 steps will be at least 1TB in size if you store them as characters (and that's with some minor compressions).
Ideally, there should be a way to mathematically (hint, there is) generate the answer on the fly, so that you never need to store the sequence.
Just use the string as a guide to determine a method which creates your path.
If you think about how many "a" and "b" characters there are in D(0), D(1), etc, you'll see that the string gets very long very quickly. Calculate how many characters there are in D(50), and then maybe think again about where you would store that much data. I make it 4.5*10^15 characters, which is 4500 TB at one byte per char.
Come to think of it, you don't have to calculate - the problem tells you there are 10^12 steps at least, which is a terabyte of data at one byte per character, or quarter of that if you use tricks to get down to 2 bits per character. I think this would cause problems with the one-minute time limit on any kind of storage medium I have access to :-)
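The character count can be checked without building any string. The sketch below (dragon_length is a hypothetical name) relies on two observations about the rewriting rules: each rule replaces one letter with five characters, and each rule's output contains exactly one "a" and one "b", so the number of letters doubles at every iteration.

```python
def dragon_length(n):
    """Number of characters in D(n), computed without building the string."""
    length = 2    # D(0) = "Fa"
    letters = 1   # how many 'a'/'b' characters D(0) contains
    for _ in range(n):
        length += 4 * letters   # each letter grows from 1 character to 5
        letters *= 2            # each replacement holds one 'a' and one 'b'
    return length


print(dragon_length(1))    # 6, i.e. len("FaRbFR")
print(dragon_length(50))   # 4503599627370494, about 4.5 * 10^15
```

The closed form is 2^(n+2) - 2, which agrees with the 4.5*10^15 figure quoted above for D(50).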
Since you can't materialize the string, you must generate it. If you yield the individual characters instead of returning the whole string, you might get it to work.
def repl220( string ):
    for c in string:
        if c == 'a': yield "aRbFR"
        elif c == 'b': yield "LFaLb"
        else: yield c
Something like that will do replacement without creating a new string.
Now, of course, you need to call it recursively, and to the appropriate depth. So, each yield isn't just a yield, it's something a bit more complex.
Trying not to solve this for you, so I'll leave it at that.
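For readers who want to see the recursive version spelled out, one possible shape is sketched below, written for Python 3 (it uses yield from). RULES and expand are names chosen for this sketch. It yields the characters of D(depth) one at a time, so memory use stays small, but it does not make 10^12 steps fast; the pattern-based shortcut mentioned in the other answers is still needed for the actual problem.

```python
from itertools import islice

RULES = {"a": "aRbFR", "b": "LFaLb"}


def expand(symbols, depth):
    """Lazily yield the characters obtained by applying RULES depth times."""
    for c in symbols:
        if depth > 0 and c in RULES:
            # Expand this letter one level deeper instead of storing the result.
            yield from expand(RULES[c], depth - 1)
        else:
            yield c


print("".join(expand("Fa", 2)))               # FaRbFRRLFaLbFR
print("".join(islice(expand("Fa", 50), 6)))   # FaRbFR
```

The second call reads only the first six instructions of D(50), which would be impossible with the str.replace approach.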
Just as a word of warning be careful when using the replace() function. If your strings are very large (in my case ~ 5e6 chars) the replace function would return a subset of the string (around ~ 4e6 chars) without throwing any errors.
You could treat D as a byte stream file.
Something like:-
seedfile = open('D1.txt', 'w');
seedfile.write("Fa");
seedfile.close();
n = 0
while (n
warning totally untested
