Character Encoding Python 2.7.6 PyQt Displaying chr(128) - python

I am running on a Windows system, though I would like to be cross-platform. For now I'll be happy just to solve this question.
My test code is:
xStr = ''
for n in range(32, 255):
    if n % 16 == 0:
        xStr = xStr + '\n'
    xStr = xStr + str(n) + ':' + chr(n) + '\t'
form.lstResponse.addItem(xStr)  # form.lstResponse is a PyQt 4.8.5 QListWidget
I am trying to use character 128 of the 'Arial' font, which looks like a 'C' with two horizontal lines through it.
But it appears that the characters between 128 and 160 are missing.
How do I use the characters between 128 and 160?
All this Unicode stuff is quite baffling to me...
Thanks...

First, the line
# -*- coding: utf-8 -*-
has to be at the top of the file. Python only honors the declaration on the first or second line; I have not tried all possibilities, but moving it further down the file breaks it.
With that declaration in place, the following lines work as desired:
s = u'€'
xStr = QString(s)
form.lstResponse.addItem(xStr)
form.lstResponse.addItem(u'€')
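A likely reason the range looked empty: in Python 2, chr(128) is the single byte 0x80, and the euro sign only sits at position 128 in the Windows-1252 ("ANSI") code page. In Unicode the euro sign is U+20AC, while code points 128 to 159 are invisible control characters. The sketch below is written to run on Python 2.7 or 3 and decodes byte values through the cp1252 codec; the helper name cp1252_char is made up for illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def cp1252_char(n):
    """Decode the single byte value n (0-255) as Windows-1252 text.

    Byte values that cp1252 leaves undefined (such as 129) come back as
    U+FFFD, the Unicode replacement character.
    """
    return bytes(bytearray([n])).decode('cp1252', 'replace')


euro = cp1252_char(128)
print(hex(ord(euro)))  # code point of the character behind byte 128
```

The unicode string returned for 128 is the same character as u'€' in the answer above, so it can be handed to addItem() in the same way.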

Related

strange terminal output when using colour/style formatting

I have the code:
import sys
for i in range(0, 20):
    for j in range(0, 20):
        sys.stdout.write('\x1b[1;32;40m' + ' ' + '\x1b[0m')
    sys.stdout.write("\n")
which outputs 400 white squares in a 20x20 grid but also after about 180 squares outputs [1;32;40m in between some of the squares. It doesn't always output in the same place. Why is this happening, and how can it be fixed?
It's probably caused by some buffering bug in the terminal (I guess).
I can't replicate your issue, so here are some suggestions.
Output to sys.stdout is buffered. You could try flushing the output after each line of text:
sys.stdout.flush()
But a better solution might be to collect the escape codes into one string, then write it as a block:
import sys
colour_line = '\x1b[1;32;40m' + (' ' * 20) + '\x1b[0m' + '\n'
for i in range(0, 20):
    sys.stdout.write(colour_line)
This obviously simplifies the code, but this is getting away from your original design.
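Taking the same idea one step further, the whole grid can be built as one string, written once and flushed explicitly. This is only a sketch of the suggestion above, not a confirmed fix for the stray escape codes; build_grid and the two constant names are illustrative.

```python
import sys

GREEN_ON_BLACK = '\x1b[1;32;40m'  # same escape sequence as in the question
RESET = '\x1b[0m'


def build_grid(rows=20, cols=20):
    """Return the whole coloured grid as a single string."""
    line = GREEN_ON_BLACK + ' ' * cols + RESET + '\n'
    return line * rows


# One write and one flush instead of hundreds of small writes.
sys.stdout.write(build_grid())
sys.stdout.flush()
```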
It's not ideal, but adding
time.sleep(0.0005)
(or a different, tuned delay) between each print seems to work.

Piping my Python Program through another program

I'm trying to write a program in Python.
I want to be able to pipe its output through another program, for example:
./my_python | another programme
Here is the code I have so far.
This code saves the output to a file:
#!/usr/bin/env python
import os, random, string
# This is not my own code.
# As far as I know, it belongs to NullUserException. It was found on stackoverflow.com.
length = 8
chars = string.ascii_letters.upper() + string.digits
random.seed(os.urandom(1024))  # call seed(); assigning to random.seed would only overwrite the function
# my code
file_out = open('newRa.txt', 'w')  # create a file to save the generated passwords
list1 = []
while len(list1) < 100000:
    list1.append(''.join(random.choice(chars) for i in range(length)))
for item in list1:
    file_out.write('%s\n' % item)
file_out.close()
file_out1 = open('test.txt', 'w')  # second file: each password reversed
for x in list1:
    file_out1.write('%s\n' % x[::-1])
file_out1.close()
This is the code I have for trying to pipe it through another program:
#!/usr/bin/env python
import os,string,random,sys
length = 8
chars = string.ascii_letters.upper()+string.digits
random.seed(os.urandom(1024))
keep=[]
keep1=[]
while len(keep)<1000:
    keep.append(''.join(random.choice(chars) for i in range(length)))
print '\n',keep[::-1]
for x in keep:
    keep1.append(x[::-1])
while len(keep1) < 1000:
    print keep1
I have tried chmod and running the script as an executable.
OK, sorry for my lack of Google searching.
sys.stdout is the answer:
#!/usr/bin/env python
import os,string,random,sys
length = 8
chars = string.ascii_letters.upper()+string.digits
random.seed(os.urandom(1024))
keep=[]
while len(keep)<1000:
    keep = (''.join(random.choice(chars) for i in range(length)))
    print sys.stdout.write(keep)
    sys.stdout.flush()
I stripped my code down (as it makes it a lot faster), but I'm getting this when I execute my code:
P5DBLF4KNone
DVFV3JQVNone
CIMKZFP0None
UZ1QA3HTNone
How do I get rid of the 'None' on the end?
What have I done to cause this?
Should this be a separate question?
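The stray None most likely comes from the print in front of sys.stdout.write(keep): write() emits the password and returns None in Python 2, and print then prints that return value followed by a newline. Calling write() on its own, with an explicit '\n', avoids it. A minimal sketch follows (it runs on Python 2.7 or 3; random_passwords, emit and CHARS are illustrative names, and the count of 1000 mirrors the loop condition above).

```python
import random
import string
import sys

CHARS = string.ascii_uppercase + string.digits


def random_passwords(count, length=8):
    """Yield `count` random passwords of `length` characters each."""
    for _ in range(count):
        yield ''.join(random.choice(CHARS) for _ in range(length))


def emit(out, count=1000):
    """Write one password per line to the file-like object `out`."""
    for password in random_passwords(count):
        out.write(password + '\n')  # no print, so no return value is echoed
    out.flush()


if __name__ == '__main__':
    emit(sys.stdout)
```

Saved as an executable script, this can be used as ./my_python | another programme, because everything goes to standard output.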

How to re-indent Python code after changing indent width in Emacs?

I changed python-indent from 3 to 4. I then ran mark-whole-buffer and indent-for-tab-command. It gave me garbage.
There is the indent-region function. I'd try marking the whole buffer, then pressing M-x and typing indent-region. It's usually bound to C-M-\, as far as I know.
Edit
Re-indentation does not work for a tab-width change. As I wrote in the comments, changing spaces to tabs and then altering the tab-width is a solution:
"Guessing you are indenting with space and not tabs, you'd first do tabify on the buffer content with your tab-width set to 3. Then change tab-width to 4 and run untabify."
This is kind of a hack, but it won't give you the garbage that indent-region is giving you.
1) Make sure the tab width is set to 4. In a scratch buffer type:
(setq tab-width 4)
And then evaluate it by marking it and using M-x eval-region
2) Globally replace all sets of three spaces with a tab character
M-x replace-regexp [SPC][SPC][SPC][RET] C-q[TAB][RET]
3) Highlight the whole buffer and untabify
M-x mark-whole-buffer M-x untabify
This will convert all tabs into four spaces.
Try indent-region on the buffer instead. It is bound to C-M-\ by default.
This is a bit of a hack, but it worked for me as a quick work-around: do a "M-x replace-string" that turns each run of three spaces into four spaces. Then you have to close and re-open the file if your Emacs does automatic indent detection on it. Then you have to go through and fix multi-line code (with tab) and strings that contain lots of spaces.
This might also help:
http://www.emacswiki.org/emacs/IndentingPython
In particular, PythonTidy is very effective for restructuring messy code, with minor hiccups (unfortunately the tool is not easy to configure):
http://www.emacswiki.org/emacs/PythonProgrammingInEmacs#toc17
This may be off topic or not useful, but I use a script like this.
Run it from the command line (python reindent.py some.py).
Change string_equal and replace_to to suit.
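import sys

file_name = sys.argv[1]
string_equal = "   "  # old indent unit; three spaces here, as in the question
replace_to = "    "   # new indent unit; four spaces here
with open(file_name) as f:
    data = f.readlines()


def create_new_line(i):
    # Rewrite only the leading spaces of line i, one indent unit at a time.
    new_line = ""
    flag = True  # True while we are still inside the leading whitespace
    cur_s = ""
    for k in i:
        if flag and k == " ":
            cur_s += k
            if cur_s == string_equal:
                new_line += replace_to
                cur_s = ""
        else:
            # First non-space character: copy the rest of the line unchanged.
            # Leftover spaces shorter than one indent unit are dropped.
            flag = False
            new_line += k
    return new_line


with open(file_name, "w") as f:
    for i in data:
        l = create_new_line(i)
        f.write(l)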
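A more compact variant of the same idea, as a sketch: it rescales the leading spaces arithmetically and keeps any leftover spaces that do not fill a whole indent level (for example on aligned continuation lines). The function names and the 3 and 4 defaults are assumptions taken from the question. Like the script above, it also touches lines inside multi-line strings.

```python
def reindent_line(line, old_width=3, new_width=4):
    """Rescale the leading spaces of one line from old_width to new_width."""
    stripped = line.lstrip(' ')
    leading = len(line) - len(stripped)
    levels, remainder = divmod(leading, old_width)
    return ' ' * (levels * new_width + remainder) + stripped


def reindent_text(text, old_width=3, new_width=4):
    """Apply reindent_line to every line, keeping the line endings."""
    return ''.join(reindent_line(line, old_width, new_width)
                   for line in text.splitlines(True))
```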

Python kludge to read UCS-2 (UTF-16?) as ASCII

I'm in a little over my head on this one, so please pardon my terminology in advance.
I'm running this using Python 2.7 on Windows XP.
I found some Python code that reads a log file, does some stuff, then displays something.
What, that's not enough detail? Ok, here's a simplified version:
#!/usr/bin/python
import re
import sys


class NotSupportedTOCError(Exception):
    pass


def filter_toc_entries(lines):
    # Skip ahead to the table header row, then skip the separator line after it.
    while True:
        line = lines.next()
        if re.match(r""" \s*
                   .+\s+ \| (?#track)
                   \s+.+\s+ \| (?#start)
                   \s+.+\s+ \| (?#length)
                   \s+.+\s+ \| (?#start sec)
                   \s+.+\s*$ (?#end sec)
                   """, line, re.X):
            lines.next()
            break
    # Yield one dict per track row until the first line that does not match.
    while True:
        line = lines.next()
        m = re.match(r"""
            ^\s*
            (?P<num>\d+)
            \s*\|\s*
            (?P<start_time>[0-9:.]+)
            \s*\|\s*
            (?P<length_time>[0-9:.]+)
            \s*\|\s*
            (?P<start_sector>\d+)
            \s*\|\s*
            (?P<end_sector>\d+)
            \s*$
            """, line, re.X)
        if not m:
            break
        yield m.groupdict()


def calculate_mb_toc_numbers(eac_entries):
    eac = list(eac_entries)
    num_tracks = len(eac)
    tracknums = [int(e['num']) for e in eac]
    if range(1,num_tracks+1) != tracknums:
        raise NotSupportedTOCError("Non-standard track number sequence: %s" % (tracknums,))
    leadout_offset = int(eac[-1]['end_sector']) + 150 + 1
    offsets = [(int(x['start_sector']) + 150) for x in eac]
    return [1, num_tracks, leadout_offset] + offsets


f = open(sys.argv[1])
mb_toc_urlpart = "%20".join(str(x) for x in calculate_mb_toc_numbers(filter_toc_entries(f)))
print mb_toc_urlpart
The code works fine as long as the log file is "simple" text (I'm tempted to say ASCII, although that may not be precise; Notepad++, for example, reports it as ANSI).
However, the script doesn't work on certain log files (in these cases, Notepad++ says "UCS-2 Little Endian").
I get the following error:
Traceback (most recent call last):
File "simple.py", line 55, in <module>
mb_toc_urlpart = "%20".join(str(x) for x in calculate_mb_toc_numbers(filter_
toc_entries(f)))
File "simple.py", line 49, in calculate_mb_toc_numbers
leadout_offset = int(eac[-1]['end_sector']) + 150 + 1
IndexError: list index out of range
This log works
This log breaks
I believe it's the encoding that's breaking the script because if I simply do this at a command prompt:
type ascii.log > scrubbed.log
and then run the script on scrubbed.log, the script works fine (this is actually fine for my purposes since there's no loss of important information and I'm not writing back to a file, just printing to the console).
One workaround would be to "scrub" the log file before passing it to Python (e.g. using the type pipe trick above to a temporary file and then have the script run on that), but I would like to have Python "ignore" the encoding if it's possible. I'm also not sure how to detect what type of log file the script is reading so I can act appropriately.
I'm reading this and this but my eyes are still spinning in my head, so while that may be my longer-term strategy, I'm wondering if there's an interim hack I could use.
codecs.open() will let you open a file using a specific encoding, and it will produce unicode strings. You can try a few encodings, going from most likely to least likely (or the tool could just always produce UTF-16LE, but ha ha, fat chance).
Also, "Unicode In Python, Completely Demystified".
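As a sketch of how codecs.open() could be combined with detecting the file type: the function below looks for a byte order mark (BOM) at the start of the file and picks the codec from it. The names detect_encoding and read_lines are illustrative, and the cp1252 fallback is an assumption based on Notepad++ reporting the working log as ANSI on Windows. Files without a BOM in some other encoding are not handled.

```python
import codecs


def detect_encoding(path, default='cp1252'):
    """Guess a codec name from the BOM at the start of the file, if any."""
    with open(path, 'rb') as f:
        head = f.read(4)
    if head.startswith(codecs.BOM_UTF8):
        return 'utf-8-sig'
    if head.startswith(codecs.BOM_UTF16_LE) or head.startswith(codecs.BOM_UTF16_BE):
        return 'utf-16'  # this codec reads the BOM itself to pick the byte order
    return default  # assumption: BOM-less logs are plain "ANSI" text


def read_lines(path):
    """Return the file's lines as unicode strings without line endings."""
    with codecs.open(path, 'r', encoding=detect_encoding(path)) as f:
        return f.read().splitlines()
```

In the script from the question, the lines could then be fed to the parser with something like filter_toc_entries(iter(read_lines(sys.argv[1]))), since the regular expressions only match ASCII characters anyway.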
works.log appears to be encoded in ASCII:
>>> data = open('works.log', 'rb').read()
>>> all(d < '\x80' for d in data)
True
breaks.log appears to be encoded in UTF-16LE -- it starts with the 2 bytes '\xff\xfe'. None of the characters in breaks.log are outside the ASCII range:
>>> data = open('breaks.log', 'rb').read()
>>> data[:2]
'\xff\xfe'
>>> udata = data.decode('utf16')
>>> all(d < u'\x80' for d in udata)
True
If these are the only two possibilities, you should be able to get away with the following hack. Change your mainline code from:
f = open(sys.argv[1])
mb_toc_urlpart = "%20".join(
    str(x) for x in calculate_mb_toc_numbers(filter_toc_entries(f)))
print mb_toc_urlpart
to this:
f = open(sys.argv[1], 'rb')
data = f.read()
f.close()
if data[:2] == '\xff\xfe':
    data = data.decode('utf16').encode('ascii')
# ilines is a generator which produces newline-terminated strings
ilines = (line + '\n' for line in data.splitlines())
mb_toc_urlpart = "%20".join(
    str(x) for x in calculate_mb_toc_numbers(filter_toc_entries(ilines)))
print mb_toc_urlpart
Python 2.x treats normal strings (str) as sequences of single bytes, not as Unicode text. Try this:
Put this at the top of your Python source file:
from __future__ import unicode_literals
And change all the str to unicode.
[edit]
And as Ignacio Vazquez-Abrams wrote, try codecs.open() to open the input file.

Python: Retrieve Image from MSSQL

I'm working on a Python project that retrieves an image from MSSQL. My code retrieves the images successfully, but only up to a fixed size of 63 KB. If the image is larger than that, it just brings back the first 63 KB of the image!
The following is my code:
#!/usr/bin/python
import _mssql
mssql=_mssql.connect('<ServerIP>','<UserID>','<Password>')
mssql.select_db('<Database>')
x=1
while x==1:
    query="select TOP 1 * from table;"
    if mssql.query(query):
        rows=mssql.fetch_array()
        rowNumbers = rows[0][1]
        #print "Number of rows fetched: " + str(rowNumbers)
        for row in rows:
        for i in range(rowNumbers):
            FILE=open('/home/images/' + str(row[2][i][1]) + '-' + str(row[2][i][2]).strip() + ' (' + str(row[2][i][0]) + ').jpg','wb')
            FILE.write(row[2][i][4])
            FILE.close()
            print 'Successfully downloaded image: ' + str(row[2][i][0]) + '\t' + str(row[2][i][2]).strip() + '\t' + str(row[2][i][1])
    else:
        print mssql.errmsg()
        print mssql.stdmsg()
mssql.close()
It's kind of hard to tell what the problem is when you're using a database like this. Your query isn't explicitly selecting any columns, so we have no idea what your table structure is, or what types the columns are. I suspect the table format is not what you're expecting, or the column type is incorrect for your data.
Also, your code doesn't even look like it would run. You have "for row in rows:" and then don't indent after that. Maybe post your schema?
If you're using FreeTDS (I think you are), look in your freetds.conf for the 'text size' setting. By default it is 63 KB.
