How do I use python-WikEdDiff? - python

I recently installed the python-WikEdDiff package on my system. I understand it is a Python port of the original JavaScript WikEdDiff tool. I tried to use it but couldn't find any documentation for it. I am stuck after calling WikEdDiff.diff(). I want to use the other methods of this class, such as getFragments(), but when I call them I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.4/dist-packages/WikEdDiff/diff.py", line 1123, in detectBlocks
self.getSameBlocks()
File "/usr/local/lib/python3.4/dist-packages/WikEdDiff/diff.py", line 1211, in getSameBlocks
while j is not None and self.oldText.tokens[j].link is None:
IndexError: list index out of range
On checking, I found that the tokens[] list in the object remains empty, whereas it should have been initialized.
Is there an initialize function that I need to call apart from the default constructor? Or does it have something to do with the `WikEdDiffConfig` config structure I passed to the constructor?

You get this error because the WikEdDiff object was cleared internally inside diff(), as shown in this section of the code:
def diff( self, oldString, newString ):
    ...
    # Free memory
    self.newText.tokens.clear()
    self.oldText.tokens.clear()
    # Assemble blocks into fragment table
    fragments = self.getDiffFragments()
    # Free memory
    self.blocks.clear()
    self.groups.clear()
    self.sections.clear()
    ...
    return fragments
If you just need the fragments, use the returned variable of diff() like this:
import WikEdDiff as WED
config = WED.WikEdDiffConfig()
w = WED.WikEdDiff(config)
f = w.diff("abc", "efg")
# do whatever you want with f, but don't use w
print(' '.join([i.text+i.type for i in f]))
# outputs '{ [ (> abc- ) abc< efg+ ] }'

Related

Python - Additional "members" appended to JSON object when passing it to function

I have the following JSON object located in its own file called build.json:
{
"name": "utils",
"version": "1.0.0",
"includes": [],
"libraries": [],
"testLibraries": []
}
I obtain this object in my Python program using the following method:
def getPackage(packageName):
    jsonFilePath = os.path.join(SRCDIR, packageName, "build.json")
    packageJson = None
    try:
        with open(jsonFilePath, "r") as jsonFile:
            packageJson = json.load(jsonFile)
    except:
        return None
    return packageJson
I verify that the JSON object for the current package (which is one of many packages I am iterating over) did not come back None in the following method. Note that I am temporarily printing out the keys of the dictionary:
def compileAllPackages():
    global COMPILED_PACKAGES
    for packageName in os.listdir(SRCDIR):
        package = getPackage(packageName)
        if package == None:
            continue
        # TEMP ==============
        for i in package:
            print(i)
        # ===================
        compiledSuccessfully = compilePackage(package)
        if not compiledSuccessfully:
            return False
    return True
Lastly, I am currently also printing out the keys of the dictionary once it is received in the compilePackage function:
def compilePackage(package):
    global COMPILED_PACKAGES, INCLUDE_TESTS
    # TEMP ==============
    for i in package:
        print(i)
    # ===================
    ...
Output from compileAllPackages function:
name
version
includes
libraries
testLibraries
Output from compilePackage function:
name
version
includes
libraries
testLibraries
u
t
i
l
s
I cannot for the life of me figure out what is happening to my dictionary during that function call. Please note that the build.json file is located within a directory named "utils".
Edit:
The Python script is located separately from the build.json file and works with absolute paths. It should also be noted that after getting that strange output, I also get the following exception when trying to access a valid key later (it seems to think the dictionary is a string):
Traceback (most recent call last):
File "/Users/nate/bin/BuildTool/unix/build.py", line 493, in <module>
main()
File "/Users/nate/bin/BuildTool/unix/build.py", line 481, in main
compiledSuccessfully = compileAllPackages()
File "/Users/nate/bin/BuildTool/unix/build.py", line 263, in compileAllPackages
compiledSuccessfully = compilePackage(package)
File "/Users/nate/bin/BuildTool/unix/build.py", line 287, in compilePackage
compiledSuccessfully = compilePackage(include)
File "/Users/nate/bin/BuildTool/unix/build.py", line 279, in compilePackage
includes = getPackageIncludes(package)
File "/Users/nate/bin/BuildTool/unix/build.py", line 194, in getPackageIncludes
includes = [package["name"]] # A package always includes itself
TypeError: string indices must be integers
Edit: If I change the parameter name to something other than 'package', I no longer get that weird output or an exception later on. This is not necessarily a fix, however, as I do not know what could be wrong with the name 'package'. There are no globals named as such either.
The answer ended up being very stupid. compilePackage() has the possibility of being called recursively, due to any dependencies the package may rely on. In recursive calls to the function, I was passing a string to the function rather than a dictionary.
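To illustrate the cause described above, here is a minimal sketch. The names compile_package and registry are hypothetical stand-ins, not the asker's actual code. It shows why iterating a string prints single characters, and how a type check at the top of a recursive function would surface the mistake immediately:

```python
def compile_package(package, registry):
    """Hypothetical stand-in for compilePackage(); expects a package dict."""
    if not isinstance(package, dict):
        # Fail early instead of silently iterating over a str.
        raise TypeError(
            "expected a package dict, got %s: %r" % (type(package).__name__, package)
        )
    compiled = [package["name"]]
    for include_name in package["includes"]:
        # Look the dependency up first; passing include_name itself would
        # hand a str to the recursive call, which is the bug described above.
        compiled.extend(compile_package(registry[include_name], registry))
    return compiled


# Hypothetical package data keyed by name.
registry = {
    "utils": {"name": "utils", "includes": []},
    "app": {"name": "app", "includes": ["utils"]},
}

print(list("utils"))                               # ['u', 't', 'i', 'l', 's']
print(compile_package(registry["app"], registry))  # ['app', 'utils']
```

With a check like this, the recursive call that receives "utils" instead of a dictionary fails at the call site rather than later with "string indices must be integers".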
I tried your code and the result is like this
Output from compileAllPackages function:
name
version
includes
libraries
testLibraries
Output from compilePackage function:
name
version
includes
libraries
testLibraries
My directory structure is like this
├── test.py
└── tt
    └── cc
        └── utils
            └── build.json
I think your code is correct; the problem is most likely that the path parameter you passed is incorrect.

Don't understand this ConfigParser.InterpolationSyntaxError

So I have tried to write a small config file for my script, which should specify an IP address, a port and a URL which should be created via interpolation using the former two variables. My config.ini looks like this:
[Client]
recv_url : http://%(recv_host):%(recv_port)/rpm_list/api/
recv_host = 172.28.128.5
recv_port = 5000
column_list = Name,Version,Build_Date,Host,Release,Architecture,Install_Date,Group,Size,License,Signature,Source_RPM,Build_Host,Relocations,Packager,Vendor,URL,Summary
In my script I parse this config file as follows:
config = SafeConfigParser()
config.read('config.ini')
column_list = config.get('Client', 'column_list').split(',')
URL = config.get('Client', 'recv_url')
If I run my script, this results in:
Traceback (most recent call last):
File "server_side_agent.py", line 56, in <module>
URL = config.get('Client', 'recv_url')
File "/usr/lib64/python2.7/ConfigParser.py", line 623, in get
return self._interpolate(section, option, value, d)
File "/usr/lib64/python2.7/ConfigParser.py", line 691, in _interpolate
self._interpolate_some(option, L, rawval, section, vars, 1)
File "/usr/lib64/python2.7/ConfigParser.py", line 716, in _interpolate_some
"bad interpolation variable reference %r" % rest)
ConfigParser.InterpolationSyntaxError: bad interpolation variable reference '%(recv_host):%(recv_port)/rpm_list/api/'
I have tried debugging, which resulted in giving me one more line of error code:
...
ConfigParser.InterpolationSyntaxError: bad interpolation variable reference '%(recv_host):%(recv_port)/rpm_list/api/'
Exception AttributeError: "'NoneType' object has no attribute 'path'" in <function _remove at 0x7fc4d32c46e0> ignored
Here I am stuck. I don't know where this _remove function is supposed to be... I tried searching for what the message is supposed to tell me, but quite frankly I have no idea. So...
Is there something wrong with my code?
What does '< function _remove at ... >' mean?
There was indeed a mistake in my config.ini file: I had not realized that the s at the end of %(...)s is a required part of the syntax. It is the same string conversion type used in Python's %-style formatting, so the references have to be written as %(recv_host)s and %(recv_port)s.
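As a quick check of the fix above, here is a small sketch. It assumes Python 3's configparser module rather than the Python 2 SafeConfigParser from the question, and it reads the config from a string instead of config.ini so that it is self-contained:

```python
import configparser

# Corrected config: every reference ends with the trailing "s".
GOOD = """
[Client]
recv_host = 172.28.128.5
recv_port = 5000
recv_url = http://%(recv_host)s:%(recv_port)s/rpm_list/api/
"""

# Same config with the trailing "s" dropped, as in the question.
BAD = GOOD.replace(")s", ")")


def read_url(text):
    """Parse the config text and return the interpolated recv_url."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("Client", "recv_url")


print(read_url(GOOD))  # http://172.28.128.5:5000/rpm_list/api/

try:
    read_url(BAD)
except configparser.InterpolationSyntaxError as exc:
    print("rejected:", type(exc).__name__)
```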
My .ini file for starting the Python Pyramid server had a similar problem.
To use the variable from the .env file, I needed to write it like this: %%(VARIABLE_FOR_EXAMPLE)s
I then ran into other problems, which I solved with this: How can I use a system environment variable inside a pyramid ini file?

TypeError: cannot concatenate 'str' and 'SFType' objects (simple-salesforce)

My project is to extract the contents of all my Salesforce tables, including the custom ones. To do this, I need to know the names of the columns (fields), since SOQL does not support "SELECT * from TABLENAME".
With simple-salesforce, I know that the following works:
sf = Salesforce(username='foo@bar.com', password='abcd', security_token='ZCdsdPdE4eI2DZMl5gwCFIGEFU')
field_data = sf.Contact.describe()["fields"]
My problem is that I need to parameterize the "Contact" name in the method call above, so that I can call this method for objects whose names I do not know in advance (i.e., objects not defined in standard Salesforce). For example, I need to do:
field_data = sf.CustomTableName.describe()["fields"]
When I try and use the SFType class:
contact = SFType('Contact',sf.sessionid,sf.sf_instance)
f = contact.describe()
I get this error:
Traceback (most recent call last):
File "./simple-example.py", line 13, in <module>
f = contact.describe()["fields"]
File "/Library/Python/2.7/site-packages/simple_salesforce/api.py", line 430, in describe
result = self._call_salesforce('GET', self.base_url + 'describe')
File "/Library/Python/2.7/site-packages/simple_salesforce/api.py", line 570, in _call_salesforce
'Authorization': 'Bearer ' + self.session_id,
TypeError: cannot concatenate 'str' and 'SFType' objects
Thanks in advance for any advice.
If you look at the source code for simple-salesforce (as of 2015-11-12), you'll see that Salesforce.__init__() stores the session as self.session_id and the instance as self.sf_instance.
In your case, you're using sf.sessionid. simple-salesforce is set up to return an SFType() object whenever a method or property does not exist on Salesforce(), and sessionid does not exist on Salesforce(), so you're actually passing an SFType() object into the __init__() of your SFType().
SFType.__init__() doesn't validate that its arguments are strings, so the error you're getting comes from simple-salesforce trying to use the SFType() object you passed in as a string.
Try this code:
contact = SFType('Contact', sf.session_id, sf.sf_instance)
f = contact.describe()
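The mechanism described above can be reproduced without Salesforce at all. The sketch below uses two hypothetical stand-in classes, FakeClient and FakeType, which are not simple-salesforce's real code. They only mimic the behavior where any unknown attribute on the client is returned as a new object:

```python
class FakeType:
    """Hypothetical stand-in for an SFType-like object."""

    def __init__(self, name, session_id):
        self.name = name
        self.session_id = session_id  # not validated, as described above

    def auth_header(self):
        # Fails if session_id is not a str.
        return "Bearer " + self.session_id


class FakeClient:
    """Hypothetical client: unknown attributes become FakeType objects."""

    def __init__(self, session_id):
        self.session_id = session_id

    def __getattr__(self, name):
        # Only called when normal lookup fails, so typos land here too.
        return FakeType(name, self.session_id)


client = FakeClient("token-123")

typo = FakeType("Contact", client.sessionid)  # misspelled attribute
print(type(typo.session_id).__name__)         # FakeType
try:
    typo.auth_header()
except TypeError as exc:
    print("TypeError:", exc)

ok = FakeType("Contact", client.session_id)   # correct attribute
print(ok.auth_header())                       # Bearer token-123

# The same fallback allows lookup by a name held in a variable.
custom = getattr(client, "CustomTableName")
print(custom.name)                            # CustomTableName
```

If the real Salesforce object behaves the same way for unknown attributes, as the answer above describes, then getattr(sf, table_name).describe() may be a way to parameterize the object name. Treat that as an assumption to verify against the installed version of the library.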
I ran into the same issue and seem to have fixed it by removing the protocol ("https://") from the instance_url. This is odd, but it works for me now and I can call contact.describe().
Something like this:
contact = SFType(sf_object, session_id, instance_url.replace("https://",''))
contact.describe()

how to use TTreeReader in PyROOT

I'm trying to get up and running using the TTreeReader approach to reading TTrees in PyROOT. As a guide, I am using the ROOT 6 Analysis Workshop (http://root.cern.ch/drupal/content/7-using-ttreereader) and its associated ROOT file (http://root.cern.ch/root/files/tutorials/mockupx.root).
from ROOT import *
fileName = "mockupx.root"
file = TFile(fileName)
tree = file.Get("MyTree")
treeReader = TTreeReader("MyTree", file)
After this, I am a bit lost. I attempt to access variable information using the TTreeReader object and it doesn't quite work:
>>> rvMissingET = TTreeReaderValue(treeReader, "missingET")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user/ROOT/v6-03-01/root/lib/ROOT.py", line 198, in __call__
result = _root.MakeRootTemplateClass( *newargs )
SystemError: error return without exception set
Where am I going wrong here?
TTreeReaderValue is a templated class, as shown in the example on the TTreeReader documentation, so you need to specify the template type.
You can do this with
rvMissingET = ROOT.TTreeReaderValue(ROOT.Double)(treeReader, "missingET")
The Python built-ins can be used for int and float types, e.g.
rvInt = ROOT.TTreeReaderValue(int)(treeReader, "intBranch")
rvFloat = ROOT.TTreeReaderValue(float)(treeReader, "floatBranch")
Also note that using TTreeReader in PyROOT is not recommended. (If you're looking for faster ntuple branch access in Python, you might look into the Ntuple class I wrote.)

lxml order of attributes

As stated in this question:
lxml preserves attributes order?
And taking @abarnet's suggestion, I wrote the following line of code:
root = ET.Element('{%s}Catalogo' % SATNS, OrderedDict([("Ano","2014"),("Mes","02"),("TotalCtas","219"),("RFC","XXX010101XXX"),("Version","1.0")]), nsmap={'catalogocuentas':SATNS})
I get this:
<catalogocuentas:Catalogo xmlns:catalogocuentas="http://www.sat.gob.mx/catalogocuentas" Ano="2014" Mes="02" TotalCtas="219" RFC="XXX010101XXX" Version="1.0"/>
which is great (it preserves the desired order), but when I want to add the missing information:
xmlns:xsi="link_2" xsi:schemaLocation="http://www.sat.gob.mx/catalogocuentas"
as part of my xml and then I add this info in my python code:
attrib={location_attribute: 'http://www.sat.gob.mx/catalogocuentas'}
so that it becomes:
root = ET.Element('{%s}Catalogo' % SATNS, OrderedDict([("Ano","2014"),("Mes","02"),("TotalCtas","219"),("RFC","XXX010101XXX"),("Version","1.0")]), nsmap={'catalogocuentas':SATNS}, attrib={location_attribute: 'http://www.sat.gob.mx/catalogocuentas'})
I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "lxml.etree.pyx", line 2558, in lxml.etree.Element (src/lxml/lxml.etree.c:52829)
TypeError: Element() got multiple values for keyword argument 'attrib'
How can I fix it?
Thanks in advance!!
The problem is that you are passing the same argument to Element() twice. The second positional argument is used as attrib, which in this case is your OrderedDict() containing all your attributes. You then supply attrib again as a keyword, and that is where the collision happens. You can remedy this in one of two ways.
You can add this attribute to your attribute OrderedDict(), using the same location_attribute variable as in your question, like so:
root = ET.Element('{%s}Catalogo' % SATNS, OrderedDict([("Ano","2014"),("Mes","02"),("TotalCtas","219"),("RFC","XXX010101XXX"),("Version","1.0"),(location_attribute,"http://www.sat.gob.mx/catalogocuentas")]), nsmap={'catalogocuentas':SATNS})
Alternatively, you could add it on the next line by doing this:
root.attrib[location_attribute] = "http://www.sat.gob.mx/catalogocuentas"
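The collision itself is plain Python argument handling and can be seen without lxml. The sketch below uses a hypothetical element() function whose signature only resembles Element(); it is not lxml's implementation, and the attribute names are shortened for illustration:

```python
from collections import OrderedDict


def element(tag, attrib=None, nsmap=None, **extra):
    """Hypothetical stand-in with an Element-like signature (not lxml)."""
    merged = OrderedDict(attrib or {})
    merged.update(extra)
    return tag, merged, nsmap


base = OrderedDict([("Ano", "2014"), ("Mes", "02")])

# The positional dict already fills `attrib`, so the keyword collides.
try:
    element("Catalogo", base, attrib={"schemaLocation": "x"})
except TypeError as exc:
    print("TypeError:", exc)

# Fix: put every attribute into one ordered mapping and pass it once.
attrs = OrderedDict(base)
attrs["schemaLocation"] = "http://www.sat.gob.mx/catalogocuentas"
tag, merged, _ = element("Catalogo", attrs)
print(list(merged))  # ['Ano', 'Mes', 'schemaLocation']
```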
