I'm writing a pluma plugin (in Python) to automate HTML markup of selected text.
According to the (poor and scarce) documentation, the selected text in the editor should be found in os.environ["PLUMA_SELECTED_TEXT"].
However, when I select some text, run my plugin, and examine the environment, there is no variable called "PLUMA_SELECTED_TEXT".
I do find 'PLUMA_CURRENT_LINE', but it contains only the last line of the selected text.
Here is the plugin itself (with debugging stuff...):
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import os
import re

print(os.environ)
try:
    ptext = os.environ["PLUMA_SELECTED_TEXT"]
except KeyError:
    ptext = "SELECTION NOT FOUND"
print(ptext)
#ptext = re.sub('\n', '<br/>\n', ptext)
#ptext = "<p>\n%s\n</p>\n" % ptext
#print(ptext)
Has anyone run into this?
I found the solution, for the benefit of whoever runs into this.
The selected text is actually sent to the script on standard input, so it needs to be read from there.
Hence the code looks like this:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import re
import sys

try:
    ptext = sys.stdin.read()
except:
    ptext = "SELECTION NOT FOUND"
ptext = re.sub('\n', '<br/>\n', ptext)
ptext = "<p>\n%s\n</p>\n" % ptext
print(ptext)
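A slightly fuller sketch of the same stdin approach (the function name and paragraph handling are mine, not anything pluma provides): split the selection into blank-line-separated paragraphs and wrap each in its own <p>:

```python
import re

def selection_to_html(text):
    """Wrap each blank-line-separated block in <p> and turn inner newlines into <br/>."""
    paras = [p for p in re.split(r'\n\s*\n', text.strip()) if p]
    return '\n'.join('<p>\n%s\n</p>' % p.replace('\n', '<br/>\n') for p in paras)
```

In the plugin you would call it as `print(selection_to_html(sys.stdin.read()))`.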
I'm trying to connect Zoho Analytics and Python via the Zoho client library here: https://www.zoho.com/analytics/api/#python-library
I downloaded the client library file but now have no idea how to use it. What I want to do is import data from Zoho Analytics into Python, and the code suggested by Zoho is:
from __future__ import with_statement
from ReportClient import ReportClient
import sys
class Sample:
    LOGINEMAILID = "abc@zoho.com"
    AUTHTOKEN = "************"
    DATABASENAME = "Workspace Name"
    TABLENAME = "Table Name"
    rc = None
    rc = ReportClient(self.AUTHTOKEN)

    def importData(self, rc):
        uri = rc.getURI(self.LOGINEMAILID, self.DATABASENAME, self.TABLENAME)
        try:
            with open('StoreSales.csv', 'r') as f:
                importContent = f.read()
        except Exception, e:
            print "Error Check if file StoreSales.csv exists in the current directory"
            print "(" + str(e) + ")"
            return
        impResult = rc.importData(uri, "APPEND", importContent, None)
        print "Added Rows :" + str(impResult.successRowCount) + " and Columns :" + str(impResult.selectedColCount)

obj = Sample()
obj.importData(obj.rc)
How do I make from ReportClient import ReportClient work?
Also, how does rc = ReportClient(self.AUTHTOKEN) work if self wasn't predefined?
On the site you linked, you can download a zip file containing the file Zoho/ZohoReportPythonClient/com/adventnet/zoho/client/report/python/ReportClient.py. I'm not sure why it's so deeply nested, or why most of the folders contain an __init__.py file which only has #$Id$ in it.
You'll need to extract that file, and place it somewhere where your Python interpreter can find it. For more information about where Python will look for the module (ReportClient.py), see this question: How does python find a module file if the import statement only contains the filename?
Please note that the file is Python 2 code. You'll need to use a Python 2 interpreter, or convert it to Python 3 code. Once you've got it importing properly, you can use their API reference to start writing code with it: https://css.zohostatic.com/db/api/v7_m2/docs/python/
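One quick way to make the import resolve, without permanently installing anything, is to append the extracted folder to sys.path before importing (the path below is hypothetical; use wherever you actually extracted ReportClient.py):

```python
import sys

# hypothetical path: wherever you extracted the zip's ReportClient.py
sys.path.append('/path/to/Zoho/ZohoReportPythonClient/com/adventnet/zoho/client/report/python')

# with the path in place, this import should now resolve
# (commented out here since the file isn't present in this sketch):
# from ReportClient import ReportClient
```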
I have some code which I would like decoded, but I'm not having much luck guessing what code page, if any, is being used. Any help would be much appreciated.
I am using the Python command line on a Windows 7 PC. If any Python guru can guide me on how to decode and see the code, that would be appreciated.
exec("import re;import base64");exec((lambda p,y:(lambda o,b,f:re.sub(o,b,f))(r"([0-9a-f]+)",lambda m:p(m,y),base64.b64decode("NTQgYgo1NCA3CjU0IDMKNTQgMWUKNTQgOQo1NCAxOAozZiAgICAgICA9IGIuMTAoKQoxNiAgID0gIjQzOi8vMTIuM2QvNGMvMWQuMjUuZi00ZC4zZSIKYSA9ICIxZC4yNS5mIgoyYSA4KDYpOgoJMzMgMy5jKCczYy4yZSglNTIpJyAlIDYpID09IDEKCjJhIDE1KDM1KToKCTUgPSAzLjQoMWUuNS4xYygnMTM6Ly8yZC8xZicsJzMwJykpCgkyMyA1CgkyMSA9IDcuMTQoKQoJMjEuMzgoIjEwIDI4IiwiMjAgMTAuLiIsJycsICczNiA0MCcpCgkxMT0xZS41LjFjKDUsICdlLjNlJykKCTM5OgoJCTFlLjFhKDExKQoJMWI6CgkJMmMKCQk5LmUoMzUsIDExLCAyMSkKCQkyID0gMy40KDFlLjUuMWMoJzEzOi8vMmQnLCcxZicpKQoJCTIzIDIKCQkyMS4zNCgwLCIiLCAiM2IgNDciKQoJCTE4LjQ4KDExLDIsMjEpCgkJCgkJMy41MygnMjIoKScpOyAKCQkzLjUzKCcyNigpJyk7CgkJMy41MygiNDUuZCgpIik7IAoJCTE5PTcuMzcoKTsgMTkuNTAoIjMyISIsIjJmIDNhIDQ5IDQxIDI5IiwiICAgWzI0IDQ2XTMxIDUxIDRhIDRlIDE3LjNkWy8yNF0iKQoJCSIiIgoJCTM5OgoJCQkxZS4xYSgxMSkKCQkxYjoKCQkJMmMKCQkJIzI3KCkKCQk0Mjo0NCgpCgkJIiIiCgoyYSAyYigpOgoJNGYgNGIgOChhKToKCQkxNSgxNikKCQoKCjJiKCk=")))(lambda a,b:b[int("0x"+a.group(1),16)],"0|1|addonfolder|xbmc|translatePath|path|script_name|xbmcgui|script_chk|downloader|scriptname|xbmcaddon|getCondVisibility|UpdateLocalAddons|download|supermax|Addon|lib|supermaxwizard|special|DialogProgress|INSTALL|website|SuperMaxWizard|extract|dialog|remove|except|join|plugin|os|addons|Installing|dp|UnloadSkin|print|COLOR|video|ReloadSkin|FORCECLOSE|Installer|Installed|def|Main|pass|home|HasAddon|SuperMax|packages|Brought|Success|return|update|url|Please|Dialog|create|try|Wizard|Nearly|System|com|zip|addon|Wait|been|else|http|quit|XBMC|gold|Done|all|has|You|not|sm|MP|By|if|ok|To|s|executebuiltin|import".split("|")))
The code is obfuscated. You can deobfuscate it yourself by executing the contents of the outer exec(...) in your Python shell:
import re
import base64
print ((lambda p,y.....split("|")))
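A toy example of the same pattern, to show the general idea (the payload here is mine, not the one from the question): when a script hides its source behind exec(base64...), printing the decoded string instead of exec'ing it reveals the source without running it.

```python
import base64

# stand-in for an obfuscated script: source hidden behind base64
payload = base64.b64encode(b"print('hello')").decode()

# instead of exec(base64.b64decode(payload)), just decode and print:
decoded = base64.b64decode(payload).decode()
print(decoded)  # shows the hidden source rather than executing it
```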
EDIT: As snakecharmerb says, it is generally not safe to execute unknown code. I analysed this code and found that running the inside of the exec will only decode it; leaving off the exec itself just produces a string. This procedure ("execute the stuff inside exec") is by no means a generally safe way to decode obfuscated code, and you need to actually analyse what it does. At this point I was asking you to trust my judgement, which, if wrong, could in theory expose you to an attack. In addition, it seems you have problems getting it to run on your Python, so here's what I'm getting from the above:
import xbmcaddon
import xbmcgui
import xbmc
import os
import downloader
import extract

addon = xbmcaddon.Addon()
website = "http://supermaxwizard.com/sm/plugin.video.supermax-MP.zip"
scriptname = "plugin.video.supermax"

def script_chk(script_name):
    return xbmc.getCondVisibility('System.HasAddon(%s)' % script_name) == 1

def INSTALL(url):
    path = xbmc.translatePath(os.path.join('special://home/addons', 'packages'))
    print path
    dp = xbmcgui.DialogProgress()
    dp.create("Addon Installer", "Installing Addon..", '', 'Please Wait')
    lib = os.path.join(path, 'download.zip')
    try:
        os.remove(lib)
    except:
        pass
    downloader.download(url, lib, dp)
    addonfolder = xbmc.translatePath(os.path.join('special://home', 'addons'))
    print addonfolder
    dp.update(0, "", "Nearly Done")
    extract.all(lib, addonfolder, dp)
    xbmc.executebuiltin('UnloadSkin()')
    xbmc.executebuiltin('ReloadSkin()')
    xbmc.executebuiltin("XBMC.UpdateLocalAddons()")
    dialog = xbmcgui.Dialog()
    dialog.ok("Success!", "SuperMax Wizard has been Installed", "   [COLOR gold]Brought To You By SuperMaxWizard.com[/COLOR]")
    """
    try:
        os.remove(lib)
    except:
        pass
    #FORCECLOSE()
    else:quit()
    """

def Main():
    if not script_chk(scriptname):
        INSTALL(website)

Main()
UTF-8 doesn't work on my computer. I tried the exact same code on another computer and it worked, but on mine it doesn't. It's in Python.
My program starts like this:
# -*- coding: utf-8 -*- # Needed in Python 2 for åäö
from Tkinter import *
class Kryssruta(Button):
    """ Button that is checked/unchecked when clicked """
    def __init__(self, master, nr=0, rad=0, kolumn=0):
        # Constructor; note master
        Button.__init__(self, master)
        self.master = master
        self.rad = rad
        self.kolumn = kolumn
        self.markerad = False
        self.kryssad = False
        self.cirklad = False
        self["command"] = self.kryssa

    def kryssa(self):
        if self.markerad == False:
            self.master.klickat(self)
On one computer it works like a charm, but on my own computer I get this message:
SyntaxError: Non-ASCII character '\xc3' in file 'blah' but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details
I'm on a PC, running it in PowerShell.
Does anyone know what the problem is?
You have a (number of) blank line(s) above the coding: line. From the document listed in the error message:
To define a source code encoding, a magic comment must
be placed into the source files either as first or second
line in the file, such as:
You declare that the source file uses the utf-8 encoding, but it actually doesn't; it uses your system's default Windows code page.
Open the file in Notepad and save it out again with Save As, choosing UTF-8 in the Encoding dropdown.
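The same conversion can also be scripted. A minimal sketch, assuming the file was saved in cp1252 (the usual Western Windows code page; the filename and demo content here are made up):

```python
import pathlib

# demo file standing in for the real script, written in cp1252
src = pathlib.Path('kryssruta_demo.py')
src.write_bytes('# Behövs i python 2 för åäö\n'.encode('cp1252'))

# decode with the actual encoding, re-save as UTF-8 so it matches the coding: line
text = src.read_bytes().decode('cp1252')
src.write_bytes(text.encode('utf-8'))
```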
I wrote some code to connect to IMAP, parse the body information, and insert it into a database. But I am having some problems with accents.
From email header I got this information:
Content-Type: text/html; charset=ISO-8859-1
But I am not sure I can trust this information...
The email was written in Portuguese, so there are a lot of words with accents. For example, I extracted the following phrase from the email source code (using my browser):
"...instalação de eletrônicos..."
So, I connected to imap and fetched some emails:
... typ, data = M.fetch(num, '(RFC822)') ...
When I print the content, I get the following:
print data[0][1]
instala+º+úo de eletr+¦nicos
I tried to use .decode('utf-8') but had no success. The text should read:
instalação de eletrônicos
How can I make it human-readable? My database is in utf-8.
The header says it is using the "ISO-8859-1" charset, so you need to decode the string with that encoding.
Try this:
data[0][1].decode('iso-8859-1')
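In Python 3, where fetched data arrives as bytes, the same fix looks like this (the byte string below is a stand-in for the fetched payload):

```python
# the raw bytes as the server sent them, in ISO-8859-1
raw = b'instala\xe7\xe3o de eletr\xf4nicos'

# decode using the charset declared in the Content-Type header
text = raw.decode('iso-8859-1')
print(text)
```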
Specifying the source code encoding worked for me. It's the declaration at the top of my example code below; it should go at the top of your Python file.
#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
value = """...instalação de eletrônicos...""".decode("iso-8859-15")
print value
# prints: ...instalação de eletrônicos...
import unicodedata
value = unicodedata.normalize('NFKD', value).encode('ascii','ignore')
print value
# prints: ...instalacao de eletronicos...
And now you can do str(value) without an exception as well.
See: http://docs.python.org/2/library/unicodedata.html
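For a Python 3 version of the same accent-stripping idea (no source-encoding declaration needed there, since UTF-8 is the default), a small helper can be sketched as:

```python
import unicodedata

def strip_accents(s):
    # decompose accented characters into base char + combining mark,
    # then drop the non-ASCII combining marks
    return unicodedata.normalize('NFKD', s).encode('ascii', 'ignore').decode('ascii')
```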
This seems to keep all accents:
#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
import unicodedata
value = """...instalação de eletrônicos...""".decode("iso-8859-15")
value = unicodedata.normalize('NFKC', value).encode('utf-8')
print value
print str(value)
# prints (without exceptions/errors):
# ...instalação de eletrônicos...
# ...instalação de eletrônicos...
EDIT:
Do note that with the last version, even though the outcome looks the same, the two values do not compare equal. For example:
#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
import unicodedata
inValue = """...instalação de eletrônicos...""".decode("iso-8859-15")
normalizedValue = unicodedata.normalize('NFKC', inValue).encode('utf-8')
try:
    print inValue == normalizedValue
except UnicodeWarning:
    pass
# False
EDIT2:
This returns the same:
normalizedValue = unicode("""...instalação de eletrônicos...""".decode("iso-8859-15")).encode('utf-8')
print normalizedValue
print str(normalizedValue )
# prints (without exceptions/errors):
# ...instalação de eletrônicos...
# ...instalação de eletrônicos...
Though I'm not sure this will actually be valid for a utf-8 encoded database. Probably not?
Thanks to Martijn Pieters. We figured out that the email had two different encodings. I had to split those parts and treat them individually.
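For reference, the stdlib email package can pull the declared charset per message (or per part, via msg.walk(), when parts use different encodings). A minimal sketch with a made-up single-part message:

```python
import email

# made-up raw message; a real one may be multipart with a different
# charset per part, in which case you iterate msg.walk()
raw = (b'Content-Type: text/plain; charset=ISO-8859-1\r\n\r\n'
       b'instala\xe7\xe3o de eletr\xf4nicos')

msg = email.message_from_bytes(raw)
charset = msg.get_content_charset() or 'utf-8'  # fall back if none declared
body = msg.get_payload(decode=True).decode(charset)
print(body)
```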
I have a bunch of improperly formatted Chinese html files. They contain unnecessary spaces and line breaks which will be displayed as extra spaces in the browser. I've written a script using lxml to modify the html files. It works fine on simple tags, but I'm stuck on nested ones. For example:
<p>祝你<span>19</span>岁
生日快乐。</p>
will be displayed in the browser as:
祝你19岁 生日快乐。
Notice the extra space. This is what needs to be deleted. The result html should be like this:
<p>祝你<span>19</span>岁生日快乐。</p>
How do I do this?
Note that the nesting (like the span tag) could be arbitrary, but I don't need to consider the content of the nested elements; they should be preserved as they are. Only the text in the outer element needs to be formatted.
This is what I've got:
# -*- coding: utf-8 -*-
import lxml.html
import re
s1 = u"""<p>祝你19岁
生日快乐。</p>"""
p1 = lxml.html.fragment_fromstring(s1)
print p1.text # I get the whole line.
p1.text = re.sub("\s+", "", p1.text)
print lxml.html.tostring(p1) # spaces are removed (elements have no .tostring() method)
s2 = u"""<p>祝你<span>19</span>岁
生日快乐。</p>"""
p2 = lxml.html.fragment_fromstring(s2)
print p2.text # I get "祝你"
print p2.tail # I get None
i = p2.itertext()
print i.next() # I get "祝你"
print i.next() # I get "19" from <span>
print i.next() # I get the tailed text, but how do I assemble them back?
print p2.text_content() # The whole text, but how do I put <span> back?
With lxml.etree you can strip the newline from the tail text of each child element:
>>> root = etree.fromstring('<p>祝你<span>19</span>岁\n生日快乐。</p>')
>>> etree.tostring(root)
b'<p>祝你<span>19</span>岁\n生日快乐。</p>'
>>> for e in root.xpath('/p/*'):
... if e.tail:
... e.tail = e.tail.replace('\n', '')
...
>>> etree.tostring(root)
b'<p>祝你<span>19</span>岁生日快乐。</p>'
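The same idea generalizes to arbitrary nesting: clean the outer element's own text and each child's tail, leaving the children's content alone. A sketch using the stdlib ElementTree for illustration (note that `\s+` removes every whitespace run, which is fine for Chinese text but would also eat legitimate spaces):

```python
import re
import xml.etree.ElementTree as ET

def strip_linebreaks(elem):
    # clean the outer element's leading text...
    if elem.text:
        elem.text = re.sub(r'\s+', '', elem.text)
    # ...and the tail text after each child; children themselves are untouched
    for child in elem:
        if child.tail:
            child.tail = re.sub(r'\s+', '', child.tail)

root = ET.fromstring('<p>祝你<span>19</span>岁\n生日快乐。</p>')
strip_linebreaks(root)
print(ET.tostring(root, encoding='unicode'))
```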
Conversely, I wonder whether it is possible to do this without an HTML/XML parser, considering that the problem appears to be caused by line wrapping.
I built a regular expression to look for whitespace between Chinese characters with the help of this solution: https://stackoverflow.com/a/2718268/267781
I don't know whether the catch-all of any whitespace between characters or the more specific [char]\n\s*[char] is better suited to your problem.
# -*- coding: utf-8 -*-
import re
# Whitespace in Chinese HTML
## Used this solution to create regexp: https://stackoverflow.com/a/2718268/267781
## \s+
fixwhitespace2 = re.compile(u'[\u2e80-\u2e99\u2e9b-\u2ef3\u2f00-\u2fd5\u3005\u3007\u3021-\u3029\u3038-\u303a\u303b\u3400-\u4db5\u4e00-\u9fc3\uf900-\ufa2d\ufa30-\ufa6a\ufa70-\ufad9\U00020000-\U0002a6d6\U0002f800-\U0002fa1d](\s+)[\u2e80-\u2e99\u2e9b-\u2ef3\u2f00-\u2fd5\u3005\u3007\u3021-\u3029\u3038-\u303a\u303b\u3400-\u4db5\u4e00-\u9fc3\uf900-\ufa2d\ufa30-\ufa6a\ufa70-\ufad9\U00020000-\U0002a6d6\U0002f800-\U0002fa1d]',re.M)
## \n\s*
fixwhitespace = re.compile(u'[\u2e80-\u2e99\u2e9b-\u2ef3\u2f00-\u2fd5\u3005\u3007\u3021-\u3029\u3038-\u303a\u303b\u3400-\u4db5\u4e00-\u9fc3\uf900-\ufa2d\ufa30-\ufa6a\ufa70-\ufad9\U00020000-\U0002a6d6\U0002f800-\U0002fa1d](\n\s*)[\u2e80-\u2e99\u2e9b-\u2ef3\u2f00-\u2fd5\u3005\u3007\u3021-\u3029\u3038-\u303a\u303b\u3400-\u4db5\u4e00-\u9fc3\uf900-\ufa2d\ufa30-\ufa6a\ufa70-\ufad9\U00020000-\U0002a6d6\U0002f800-\U0002fa1d]',re.M)
sample = u'<html><body><p>\u795d\u4f6019\u5c81\n \u751f\u65e5\u5feb\u4e50\u3002</p></body></html>'
fixwhitespace.sub(lambda m: m.group(0).replace(m.group(1), '', 1), sample)
Yielding (the replacement keeps the two matched characters and drops only the captured whitespace; a plain sub('', ...) would delete those two characters as well):
<html><body><p>祝你19岁生日快乐。</p></body></html>
However, here's how you might do it using the parser and xpath to find linefeeds:
# -*- coding: utf-8 -*-
from lxml import etree
import re
fixwhitespace = re.compile(u'[\u2e80-\u2e99\u2e9b-\u2ef3\u2f00-\u2fd5\u3005\u3007\u3021-\u3029\u3038-\u303a\u303b\u3400-\u4db5\u4e00-\u9fc3\uf900-\ufa2d\ufa30-\ufa6a\ufa70-\ufad9\U00020000-\U0002a6d6\U0002f800-\U0002fa1d](\n\s*)[\u2e80-\u2e99\u2e9b-\u2ef3\u2f00-\u2fd5\u3005\u3007\u3021-\u3029\u3038-\u303a\u303b\u3400-\u4db5\u4e00-\u9fc3\uf900-\ufa2d\ufa30-\ufa6a\ufa70-\ufad9\U00020000-\U0002a6d6\U0002f800-\U0002fa1d]',re.M)
sample = u'<html><body><p>\u795d\u4f6019\u5c81\n \u751f\u65e5\u5feb\u4e50\u3002</p></body></html>'
doc = etree.HTML(sample)
for t in doc.xpath("//text()[contains(.,'\n')]"):
    if t.is_tail:
        t.getparent().tail = fixwhitespace.sub(lambda m: m.group(0).replace(m.group(1), '', 1), t)
    elif t.is_text:
        t.getparent().text = fixwhitespace.sub(lambda m: m.group(0).replace(m.group(1), '', 1), t)
print etree.tostring(doc)
Yields:
<html><body><p>祝你19岁生日快乐。</p></body></html>
I'm curious what the best match to your working data is.