I'm learning Python and had a small issue. I have this loop:
found = None
print('Before', found)
for value in [41, 5, 77, 3, 21, 55, 6]:
    if value == 21:
        found = True
    else:
        found = False
    print(found, value)
print('After', found)
The code runs, but the problem is the final print('After', found): I want it to tell me that a True value was found in the loop. Is there a way to keep the code mostly as it is and fix this?
You don't want to reset found to False once you've set it to True. Initialize it to False, then only set it to True if value == 21; don't do anything if value != 21.
found = False
print('Before', found)
for value in [41, 5, 77, 3, 21, 55, 6]:
    if value == 21:
        found = True
    print(found, value)
print('After', found)
Ignoring the print statement in the loop, you could just use any:
found = any(value == 21 for value in [ 41, 5, 77, 3, 21, 55, 6])
or even
found = 21 in [ 41, 5, 77, 3, 21, 55, 6]
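As a quick check that the three approaches agree, here is a minimal sketch. The helper name found_with_flag is made up for illustration, and the early break is an optional addition that is not in the code above.

```python
values = [41, 5, 77, 3, 21, 55, 6]

def found_with_flag(values, target):
    """Flag pattern: start at False and only ever flip to True."""
    found = False
    for value in values:
        if value == target:
            found = True
            break  # optional: stop scanning once the target is seen
    return found

flag_result = found_with_flag(values, 21)
any_result = any(value == 21 for value in values)
in_result = 21 in values
print(flag_result, any_result, in_result)
```

All three expressions answer the same question; the flag version is only needed when the loop body has other work to do, such as the print in the original.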
I'm working on an XOR cipher app to benchmark a Python function against a Rust function, but I don't know how to call the Rust function from Python code. Can anyone help? I need to save the output in buf. Here is my Rust code:
use std::convert::TryInto;
/*
    Apply a simple XOR cipher using the specified `key` of size
    `key_len`, to the `msg` char/byte array of size `msg_len`.
    Writes the ciphertext to the externally allocated buffer `buf`.
*/
#[no_mangle]
pub unsafe fn cipher(msg: *const i8, key: *const i8, buf: *mut i8, msg_len: usize, key_len: usize)
{
    let mut i: isize = 0;
    while i < msg_len.try_into().unwrap() {
        let key_len_i8: i8 = key_len.try_into().unwrap();
        *buf.offset(i) = *msg.offset(i) ^ (*key.offset(i) % key_len_i8);
        i = i + 1;
    }
}
Let's say your Rust code is in a file named rs_cipher.rs.
If you compile it with this command
rustc --crate-type dylib rs_cipher.rs
you obtain a native dynamic library for your system.
(Of course, this can also be done with cargo and the crate-type = ["cdylib"] option.)
Here is Python code that loads this library, finds the function and calls it (see the comments for the different stages).
import sys
import os
import platform
import ctypes
import traceback
# choose application specific library name
lib_name='rs_cipher'
# determine system-specific library name
if platform.system().startswith('Windows'):
    sys_lib_name='%s.dll'%lib_name
elif platform.system().startswith('Darwin'):
    sys_lib_name='lib%s.dylib'%lib_name
else:
    sys_lib_name='lib%s.so'%lib_name
# try to load native library from various locations
# (current directory, python file directory, system path...)
ntv_lib=None
for d in [os.path.curdir,
          os.path.dirname(sys.modules[__name__].__file__),
          '']:
    path=os.path.join(d, sys_lib_name)
    try:
        ntv_lib=ctypes.CDLL(path)
        break
    except OSError:
        # loading failed at this location, try the next one
        # traceback.print_exc()
        pass
if ntv_lib is None:
    sys.stderr.write('cannot load library %s\n'%lib_name)
    sys.exit(1)
# try to find native function
fnct_name='cipher'
ntv_fnct=None
try:
    ntv_fnct=ntv_lib[fnct_name]
except AttributeError:
    # the symbol is not exported by the library
    # traceback.print_exc()
    pass
if ntv_fnct is None:
    sys.stderr.write('cannot find function %s in library %s\n'%
                     (fnct_name, lib_name))
    sys.exit(1)
# describe native function prototype
ntv_fnct.restype=None # no return value
ntv_fnct.argtypes=[ctypes.c_void_p, # msg
                   ctypes.c_void_p, # key
                   ctypes.c_void_p, # buf
                   ctypes.c_size_t, # msg_len
                   ctypes.c_size_t] # key_len
# use native function
msg=(ctypes.c_int8*10)()
key=(ctypes.c_int8*4)()
buf=(ctypes.c_int8*len(msg))()
for i in range(len(msg)):
    msg[i]=1+10*i
for i in range(len(key)):
    key[i]=~(20*(i+2))
sys.stdout.write('~~~~ first initial state ~~~~\n')
sys.stdout.write('msg: %s\n'%[v for v in msg])
sys.stdout.write('key: %s\n'%[v for v in key])
sys.stdout.write('buf: %s\n'%[v for v in buf])
sys.stdout.write('~~~~ first call ~~~~\n')
ntv_fnct(msg, key, buf, len(msg), len(key))
sys.stdout.write('msg: %s\n'%[v for v in msg])
sys.stdout.write('key: %s\n'%[v for v in key])
sys.stdout.write('buf: %s\n'%[v for v in buf])
(msg, buf)=(buf, msg)
for i in range(len(buf)):
    buf[i]=0
sys.stdout.write('~~~~ second initial state ~~~~\n')
sys.stdout.write('msg: %s\n'%[v for v in msg])
sys.stdout.write('key: %s\n'%[v for v in key])
sys.stdout.write('buf: %s\n'%[v for v in buf])
sys.stdout.write('~~~~ second call ~~~~\n')
ntv_fnct(msg, key, buf, len(msg), len(key))
sys.stdout.write('msg: %s\n'%[v for v in msg])
sys.stdout.write('key: %s\n'%[v for v in key])
sys.stdout.write('buf: %s\n'%[v for v in buf])
Running this Python code shows this result:
~~~~ first initial state ~~~~
msg: [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]
key: [-41, -61, -81, -101]
buf: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
~~~~ first call ~~~~
msg: [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]
key: [-41, -61, -81, -101]
buf: [-42, -56, -70, -124, -2, -16, -110, -36, -122, -104]
~~~~ second initial state ~~~~
msg: [-42, -56, -70, -124, -2, -16, -110, -36, -122, -104]
key: [-41, -61, -81, -101]
buf: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
~~~~ second call ~~~~
msg: [-42, -56, -70, -124, -2, -16, -110, -36, -122, -104]
key: [-41, -61, -81, -101]
buf: [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]
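If the message starts out as a Python bytes object rather than hand-filled integers, the buffers could be prepared roughly as follows. This is a sketch only: the helper names to_i8_array and to_bytes are hypothetical, and the native call itself is left out so that the snippet runs without the compiled library.

```python
import ctypes

def to_i8_array(data):
    """Copy a bytes object into a new ctypes array of signed 8-bit ints."""
    return (ctypes.c_int8 * len(data)).from_buffer_copy(data)

def to_bytes(arr):
    """Read the raw bytes back out of a ctypes array."""
    return bytes(arr)

msg = to_i8_array(b"hello")
buf = (ctypes.c_int8 * len(msg))()  # zero-initialised output buffer
print(list(msg), list(buf), to_bytes(msg))
```

After a call such as ntv_fnct(msg, key, buf, len(msg), len(key)) from the script above, to_bytes(buf) would then hold the ciphertext as a bytes object.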
Note that I replaced these two lines in your Rust code
    let key_len_i8: i8 = key_len.try_into().unwrap();
    *buf.offset(i) = *msg.offset(i) ^ (*key.offset(i) % key_len_i8);
with these
    let key_len_isize: isize = key_len.try_into().unwrap();
    *buf.offset(i) = *msg.offset(i) ^ (*key.offset(i % key_len_isize));
because the offset on key was going out of bounds: the index i, not the key byte, has to be taken modulo key_len.
Instead of dealing with pointer arithmetic and unsigned/signed conversions, you could build ordinary slices from the raw pointers and use them in the safe way:
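#[no_mangle]
// extern "C" gives the function the C calling convention that ctypes expects
pub extern "C" fn cipher(
    msg_ptr: *const i8,
    key_ptr: *const i8,
    buf_ptr: *mut i8,
    msg_len: usize,
    key_len: usize,
) {
    // The caller must pass valid pointers: msg_len readable bytes for msg,
    // key_len readable bytes for key, msg_len writable bytes for buf.
    let msg = unsafe { std::slice::from_raw_parts(msg_ptr, msg_len) };
    let key = unsafe { std::slice::from_raw_parts(key_ptr, key_len) };
    let buf = unsafe { std::slice::from_raw_parts_mut(buf_ptr, msg_len) };
    for i in 0..msg_len {
        buf[i] = msg[i] ^ key[i % key_len];
    }
}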
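For the Python side of the benchmark, a pure-Python counterpart of the same loop could look like the sketch below. The function name xor_cipher is an assumption for illustration; it works on lists of signed integers so that its results can be compared directly with the buf values printed above.

```python
def xor_cipher(msg, key):
    """Pure-Python counterpart of buf[i] = msg[i] ^ key[i % key_len]."""
    if not key:
        raise ValueError("key must not be empty")
    return [m ^ key[i % len(key)] for i, m in enumerate(msg)]

# Same test data as the ctypes script above
msg = [1 + 10 * i for i in range(10)]
key = [~(20 * (i + 2)) for i in range(4)]
buf = xor_cipher(msg, key)
print(buf)
print(xor_cipher(buf, key) == msg)  # applying the cipher twice restores msg
```

Both this function and the ctypes call could then be timed with the standard timeit module on the same inputs.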
import re
import os
import sys
class Marks:
    def __init__(self):
        self.marks = []
        self.marks_file = '/root/projectpython/mark.txt'
    def loadAll(self):
        file = open(self.marks_file, 'r')
        for line in file.readlines():
            name,math,phy,chem = line.strip().split()
            name=name
            math=int(math)
            phy=int(phy)
            chem=int(chem)
            self.marks=[name,math,phy,chem]
            print(self.marks)
        file.close()
    def percent(self):
        dash = '-' * 40
        self.loadAll()
        for n in self.marks:
            print(n)
Book_1 = Marks()
Book_1.percent()
Output:
['gk', 50, 40, 30]
['rahul', 34, 54, 30]
['rohit', 87, 45, 9]
rohit
87
45
9
But I want to print all the values in tabular format; it shows only the last record.
Also, is a list the right way to store student data (name and marks)?
The problem is with this line:
self.marks=[name,math,phy,chem]
It replaces the whole list each time a record is read.
Instead use:
self.marks.append([name,math,phy,chem])
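Once each record is appended as its own list, printing a table is mostly a matter of formatting. The sketch below is one way to do it, under some assumptions: the function names parse_marks and format_table and the column widths are invented for illustration, and the sample lines stand in for the contents of mark.txt.

```python
def parse_marks(lines):
    """Turn 'name math phy chem' lines into a list of records."""
    records = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, math, phy, chem = line.split()
        records.append([name, int(math), int(phy), int(chem)])
    return records

def format_table(records):
    """Return the records as a fixed-width text table."""
    row_format = "{:<10}{:>6}{:>6}{:>6}"
    header = row_format.format("name", "math", "phy", "chem")
    rows = [row_format.format(*record) for record in records]
    return "\n".join([header, "-" * len(header)] + rows)

sample = ["gk 50 40 30", "rahul 34 54 30", "rohit 87 45 9"]
records = parse_marks(sample)
table = format_table(records)
print(table)
```

A list of lists works for this; a list of dictionaries or named tuples would make the fields easier to read by name if the data grows.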
You keep reassigning the list inside the for loop,
so only the values from the last line survive.
I think you can remove that assignment and use append instead.
import re
import os
import sys
class Marks:
    def __init__(self):
        self.marks = []
        self.marks_file = '/root/projectpython/mark.txt'
    def loadAll(self):
        file = open(self.marks_file, 'r')
        for line in file.readlines():
            name,math,phy,chem = line.strip().split()
            name=name
            math=int(math)
            phy=int(phy)
            chem=int(chem)
            self.marks.append(name)
            self.marks.append(math)
            self.marks.append(phy)
            self.marks.append(chem)
            # self.marks=[name,math,phy,chem]
            print(self.marks)
        file.close()
    def percent(self):
        dash = '-' * 40
        self.loadAll()
        for n in self.marks:
            print(n)
Book_1 = Marks()
Book_1.percent()
Change self.marks=[name,math,phy,chem] to self.marks.append([name,math,phy,chem]).
Then the easiest solution is to transpose the self.marks list and print the result.
Suppose your marks list is [['gk', 50, 40, 30],['rahul', 34, 54, 30],['rohit', 87, 45, 9]]; then simply transpose it:
print(marks)
transposed=list(zip(*marks))
print(transposed)
for x in transposed:
    print(x)
output :
[['gk', 50, 40, 30], ['rahul', 34, 54, 30], ['rohit', 87, 45, 9]] #marks list
[('gk', 'rahul', 'rohit'), (50, 34, 87), (40, 54, 45), (30, 30, 9)] #transposed list
('gk', 'rahul', 'rohit') # output the way you want
(50, 34, 87)
(40, 54, 45)
(30, 30, 9)
It's working now.
My earlier mistake was only here: self.marks.append([name,math,phy,chem])
[['gk', 50, 40, 30], ['rahul', 34, 54, 30], ['rohit', 87, 45, 9]]
I connect Python to a MySQL server with MySQL Connector (in PyCharm); I can read from the server and write the rows to a text file. The output looks like this:
(1, 'PENELOPE', 'GUINESS', datetime.datetime(2006, 2, 15, 4, 34, 33))
(2, 'NICK', 'WAHLBERG', datetime.datetime(2006, 2, 15, 4, 34, 33))
(3, 'ED', 'CHASE', datetime.datetime(2006, 2, 15, 4, 34, 33))
(4, 'JENNIFER', 'DAVIS', datetime.datetime(2006, 2, 15, 4, 34, 33))
(5, 'JOHNNY', 'LOLLOBRIGIDA', datetime.datetime(2006, 2, 15, 4, 34, 33))
I need to change the commas between fields from , to ~ but I can't work out where to do this in the source code. Could you help me? Which class should I change?
This is my Python code:
import mysql.connector
mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd="2153417",
    database="sakila"
)
tweets = open("keywords.txt", "w")
mycursor = mydb.cursor()
mycursor.execute("SELECT * FROM actor")
with open("C:/Users/Erhan/PycharmProjects/MySQL/keywords.txt", "w", newline='') as f:
    for row in mycursor:
        print(row, file=f)
This works correctly; I just need to change the commas (,) between the name, surname and datetime,
like this:
(1~ 'PENELOPE' ~ 'GUINESS' ~ datetime.datetime(2006, 2, 15, 4, 34, 33))
(2~ 'NICK' ~ 'WAHLBERG' ~ datetime.datetime(2006, 2, 15, 4, 34, 33))
Assuming you want the string representation of every object (e.g. 2006-02-15 04:34:33) instead of the repr form datetime.datetime(2006, 2, 15, 4, 34, 33), map str over the row instead of repr in the lines below.
To keep the repr form shown in your requirement, try this:
print(' ~ '.join(map(repr, row)), file=f)
Alternatively, write to the file directly; f.write does not add a newline, so append one:
f.write(' ~ '.join(map(repr, row)) + '\n')
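To make the difference between the two forms concrete, here is a small sketch that joins one sample row both ways. The row is copied from the output in the question; no database connection is involved.

```python
import datetime

row = (1, 'PENELOPE', 'GUINESS', datetime.datetime(2006, 2, 15, 4, 34, 33))

# repr keeps the quotes and the datetime.datetime(...) constructor form
repr_line = ' ~ '.join(map(repr, row))
# str gives the plain, human-readable values
str_line = ' ~ '.join(map(str, row))

print(repr_line)
print(str_line)
```

Neither line includes the surrounding parentheses from the question's sample; they could be added with '(' + repr_line + ')' if that exact shape is required.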
I have loaded HTML into PyQt and would like to create a list of all the content on the page.
I then need to be able to get the position of the text, using .geometry().
I would like a list of objects, where the following would be possible:
for i in list_of_content_in_html:
    print i.toPlainText(), i.geometry() #prints the text, and the position.
In case I am unclear, by "contents" I mean, for the HTML below,
'c', 'r1 c1', 'r1 c2', 'row2 c2', 'more contents': the text the web user sees in the browser, basically.
c
<table border="1">
<tr>
<td>r1 c1</td>
<td>r1 c2</td>
</tr>
<tr>
<td></td>
<td>row2 c2</td>
</tr>
</table>
more contents
This doesn't seem to be possible using QtWebKit on pages like this one, which nest objects but don't wrap the text outside the table in <p>...</p>. As a result, c and more contents don't go into separate QWebElements; they are only found in the BODY-level block. As a solution, one could run the page through a parser first. Simply traversing the children of the currentFrame documentElement yields the following elements:
# position in element tree, bounding box, tag, text:
(0, 0) [0, 0, 75, 165] HTML - u'c\nr1 c1\tr1 c2\nrow2 c2\nmore contents'
(1, 1) [8, 8, 67, 157] BODY - u'c\nr1 c1\tr1 c2\nrow2 c2\nmore contents'
(2, 0) [8, 27, 75, 119] TABLE - u'r1 c1\tr1 c2\nrow2 c2'
(3, 0) [9, 28, 74, 118] TBODY - u'r1 c1\tr1 c2\nrow2 c2'
(4, 0) [9, 30, 74, 72] TR - u'r1 c1\tr1 c2'
(5, 0) [11, 30, 32, 72] TD - u'r1 c1'
(5, 1) [34, 30, 72, 72] TD - u'r1 c2'
(4, 1) [9, 74, 74, 116] TR - u'row2 c2'
(5, 1) [34, 74, 72, 116] TD - u'row2 c2'
Code for this:
import sys
from PySide.QtCore import *
from PySide.QtGui import *
from PySide.QtWebKit import *
class WebPage(QObject):
    finished = Signal()
    def __init__(self, data, parent=None):
        super(WebPage, self).__init__(parent)
        self.output = []
        self.data = data
        self.page = QWebPage()
        self.page.loadFinished.connect(self.process)
    def start(self):
        self.page.mainFrame().setHtml(self.data)
    #Slot(bool)
    def process(self, something=False):
        self.page.setViewportSize(self.page.mainFrame().contentsSize())
        frame = self.page.currentFrame()
        elem = frame.documentElement()
        self.gather_info(elem)
        self.finished.emit()
    def gather_info(self, elem, i=0):
        if i > 200: return
        cnt = 0
        while cnt < 100:
            s = elem.toPlainText()
            rect = elem.geometry()
            name = elem.tagName()
            dim = [rect.x(), rect.y(),
                   rect.x() + rect.width(), rect.y() + rect.height()]
            if s: self.output.append(dict(pos=(i, cnt), dim=dim, tag=name, text=s))
            child = elem.firstChild()
            if not child.isNull():
                self.gather_info(child, i+1)
            elem = elem.nextSibling()
            if elem.isNull():
                break
            cnt += 1
webpage = None
def print_strings():
    for s in webpage.output:
        print s['pos'], s['dim'], s['tag'], '-', repr(s['text'])
if __name__ == '__main__':
    app = QApplication(sys.argv)
    data = open(sys.argv[1]).read()
    webpage = WebPage(data)
    webpage.finished.connect(print_strings)
    webpage.start()
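To illustrate why c and more contents have no element of their own, here is a sketch that uses the standard library's html.parser instead of QtWebKit. It is not the method used above and gives no geometry; it only lists each visible text chunk with the tag that directly encloses it. The class name TextCollector is made up for illustration.

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collect (enclosing tag, text) pairs for every non-blank text chunk."""
    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.chunks = []  # (enclosing tag or None, stripped text)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # pop everything up to and including the matching open tag
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            parent = self.stack[-1] if self.stack else None
            self.chunks.append((parent, text))

html = """c
<table border="1">
<tr>
<td>r1 c1</td>
<td>r1 c2</td>
</tr>
<tr>
<td></td>
<td>row2 c2</td>
</tr>
</table>
more contents
"""
parser = TextCollector()
parser.feed(html)
parser.close()  # flush any text buffered at the end of the input
print(parser.chunks)
```

The two chunks reported with None as their enclosing tag are the ones that end up only in the BODY-level block once a browser engine builds the DOM.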
A different approach
The right course of action depends on what you want to achieve. You can get all the strings from the QWebPage using webpage.currentFrame().documentElement().toPlainText(), but that just returns the whole page as one string, with no positioning information for the individual tags. Browsing the QWebElement tree gives you the desired information, but it has the drawbacks I mentioned above.
If you really want to know the position of all text, the only accurate way to do this (other than rendering the page and using OCR) is to break the text into characters and save their individual bounding boxes. Here's how I did it:
First I parsed the page with BeautifulSoup4 and enclosed every non-space text character X in a <span class="Nd92KSx3u2">X</span>. Then I ran a PyQt script (actually a PySide script) which loads the altered page, looks the characters up using findAllElements('span[class="Nd92KSx3u2"]') and prints them out with their bounding boxes.
parser.py:
import sys, cgi, re
from bs4 import BeautifulSoup, element
magical_class = "Nd92KSx3u2"
restricted_tags="title script object embed".split()
# replace_with() inserts plain text, so str(soup) escapes the inserted span
# markup; match that escaped form so it can be turned back into real tags
re_my_span = re.compile(r'&lt;span class="%s"&gt;(.+?)&lt;/span&gt;' % magical_class)
def no_nl(s): return str(s).replace("\r", "").replace("\n", " ")
if len(sys.argv) != 3:
    print "Usage: %s <input_html_file> <output_html_file>" % sys.argv[0]
    sys.exit(1)
def process(elem):
    for x in elem.children:
        if isinstance(x, element.Comment): continue
        if isinstance(x, element.Tag):
            if x.name in restricted_tags:
                continue
        if isinstance(x, element.NavigableString):
            if not len(no_nl(x.string).strip()):
                continue # it's just empty space
            print '[', no_nl(x.string).strip(), ']', # debug output of found strings
            s = ""
            for c in x.string:
                if c in (' ', '\r', '\n', '\t'): s += c
                else: s += '<span class="%s">%s</span>' % (magical_class, c)
            x.replace_with(s)
            continue
        process(x)
soup = BeautifulSoup(open(sys.argv[1]))
process(soup)
output = re_my_span.sub(r'<span class="%s">\1</span>' % magical_class, str(soup))
with open(sys.argv[2], 'w') as f:
    f.write(output)
charpos.py:
import sys
from PySide.QtCore import *
from PySide.QtGui import *
from PySide.QtWebKit import *
magical_class = "Nd92KSx3u2"
class WebPage(QObject):
    def __init__(self, data, parent=None):
        super(WebPage, self).__init__(parent)
        self.output = []
        self.data = data
        self.page = QWebPage()
        self.page.loadFinished.connect(self.process)
    def start(self):
        self.page.mainFrame().setHtml(self.data)
    #Slot(bool)
    def process(self, something=False):
        self.page.setViewportSize(self.page.mainFrame().contentsSize())
        frame = self.page.currentFrame()
        elements = frame.findAllElements('span[class="%s"]' % magical_class)
        for e in elements:
            s = e.toPlainText()
            rect = e.geometry()
            dim = [rect.x(), rect.y(),
                   rect.x() + rect.width(), rect.y() + rect.height()]
            if s and rect.width() > 0 and rect.height() > 0: print dim, s
if __name__ == '__main__':
    app = QApplication(sys.argv)
    data = open(sys.argv[1]).read()
    webpage = WebPage(data)
    webpage.start()
input.html (slightly altered to show more problems with simple string dumping):
a<span>b<span>c</span></span>
<table border="1">
<tr><td>r1 <font>c1</font> </td><td>r1 c2</td></tr>
<tr><td></td><td>row2 & c2</td></tr>
</table>
more <b>contents</b>
and the test run:
$ python parser.py input.html temp.html
[ a ] [ b ] [ c ] [ r1 ] [ c1 ] [ r1 c2 ] [ row2 & c2 ] [ more ] [ contents ]
$ charpos.py temp.html
[8, 8, 17, 26] a
[17, 8, 26, 26] b
[26, 8, 34, 26] c
[13, 48, 18, 66] r
[18, 48, 27, 66] 1
[13, 67, 21, 85] c
[21, 67, 30, 85] 1
[36, 48, 41, 66] r
[41, 48, 50, 66] 1
[36, 67, 44, 85] c
[44, 67, 53, 85] 2
[36, 92, 41, 110] r
[41, 92, 50, 110] o
[50, 92, 61, 110] w
[61, 92, 70, 110] 2
[36, 111, 47, 129] &
[51, 111, 59, 129] c
[59, 111, 68, 129] 2
[8, 135, 21, 153] m
[21, 135, 30, 153] o
[30, 135, 35, 153] r
[35, 135, 44, 153] e
[8, 154, 17, 173] c
[17, 154, 27, 173] o
[27, 154, 37, 173] n
[37, 154, 42, 173] t
[42, 154, 51, 173] e
[51, 154, 61, 173] n
[61, 154, 66, 173] t
[66, 154, 75, 173] s
Looking at the bounding boxes, it is (in this simple case without changes in font size and things like subscripts) quite easy to glue them back into words if you wish.
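The gluing step could be sketched as follows, assuming the simple case described above: characters belong to the same word when they share the same top coordinate and the next box starts where the previous one ends. The function name glue_words and the max_gap parameter are invented for illustration, and the sample data is a subset of the output above.

```python
def glue_words(chars, max_gap=1):
    """Merge ([x1, y1, x2, y2], char) pairs into ([box], word) pairs.

    A character joins the previous word when it sits on the same line
    (same y1) and starts at most max_gap pixels after the word ends.
    """
    words = []
    for (x1, y1, x2, y2), ch in chars:
        if words:
            (wx1, wy1, wx2, wy2), text = words[-1]
            if y1 == wy1 and 0 <= x1 - wx2 <= max_gap:
                # extend the current word's box to cover this character
                words[-1] = ([wx1, wy1, x2, max(wy2, y2)], text + ch)
                continue
        words.append(([x1, y1, x2, y2], ch))
    return words

sample = [
    ([8, 8, 17, 26], 'a'), ([17, 8, 26, 26], 'b'), ([26, 8, 34, 26], 'c'),
    ([13, 48, 18, 66], 'r'), ([18, 48, 27, 66], '1'),
    ([36, 111, 47, 129], '&'), ([51, 111, 59, 129], 'c'), ([59, 111, 68, 129], '2'),
]
words = glue_words(sample)
for box, text in words:
    print(box, text)
```

The 4-pixel gap between & and c is what separates them here; pages with mixed font sizes or subscripts would need a looser notion of "same line" than an exact match on y1.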
I worked it out.
for elem in QWebView().page().currentFrame().documentElement().findAll('*'):
    print unicode(elem.toPlainText()), unicode(elem.geometry().getCoords()), '\n'
The '*' selector matches every element, and the loop then iterates over what is found, thereby walking the DOM tree.