I'm trying to extract values from JSON input using python. There are many tag that I need to extract and not all JSON files have the same structure as the sources are multiple. Sometimes there is a possibility that a tag might be missing. So, a KeyError is bound to happen. If a tag is missing then respective variable will by default be None and it will be returned as a list (members) to the main call.
I tried calling a function to pass each tags into an individual try/except. But, I get hit by an error on the function call itself where the tag is being passed. So, instead I tried the below code but it skips any subsequent lines even if the tags are present. Is there a better way to do this?
def extract(self):
try:
self.data_version = self.data['meta']['data_version']
self.created = self.data['meta']['created']
self.revision = self.data['meta']['revision']
self.gender = self.data['info']['gender']
self.season = self.data['info']['season']
self.team_type = self.data['info']['team_type']
self.venue = self.data['info']['venue']
status = True
except KeyError:
status = False
members = [attr for attr in dir(self) if
not callable(getattr(self, attr)) and not attr.startswith("__") and getattr(self, attr) is None]
return status, members
UPDATED:
Thanks Barmar & John! .get() worked really well.
I am trying to make design metadata driven data pipelines, so I define in external textual metadata (json, yaml) things like:
dataFunctionSequence = DataFunctionSequence(
functions=[
Function(
functionClass="Function1",
functionPackage="pipelines.experiments.meta_driven.function_1",
parameters=[
Parameter(name="param1", dataType="str", value="this is my function str value")
]
)
]
)
Now with help of importlib i can get class:
functionMainClass = getattr(
importlib.import_module(
functionMeta.functionPackage), functionMeta.functionClass)
But I want real instance, so I got it working with this piece of code which utilizes on top of importlib as well eval builtin function:
def create_instance(class_str:str):
"""
Create a class instance from a full path to a class constructor
:param class_str: module name plus '.' plus class name and optional parens with arguments for the class's
__init__() method. For example, "a.b.ClassB.ClassB('World')"
:return: an instance of the class specified.
"""
try:
if "(" in class_str:
full_class_name, args = class_name = class_str.rsplit('(', 1)
args = '(' + args
else:
full_class_name = class_str
args = ()
# Get the class object
module_path, _, class_name = full_class_name.rpartition('.')
mod = importlib.import_module(module_path)
klazz = getattr(mod, class_name)
# Alias the the class so its constructor can be called, see the following link.
# See https://www.programiz.com/python-programming/methods/built-in/eval
alias = class_name + "Alias"
instance = eval(alias + args, { alias: klazz})
return instance
except (ImportError, AttributeError) as e:
raise ImportError(class_str)
And I can construct a string that will construct the class and it works like charm.
Now my problem is that the class requires another parameter which is complex Spark DataFrame object which is not loaded from metadata but from some database or s3 bucket for example. Here I fail to be able to create dynamically instance with non-string variable.
I am failing here:
instance = eval(alias + args, { alias: klazz})
I tried to extend the create_instance() fnc with **kwargs, so I can dynamically search for parameter by name eg kwargs["dataFrame"], but how to assign it dynamically to init?
evel is not the right way probably or my expression is not correct?
NOTE: Another possible approach was to iterate somehoe over init object where I will get all params of constructor, but still I don't know what python reflection fuction to use to make a real instance.
Workaround: What I can do is that I can simply remove the dataFrame from class constructor, create instance only on simple params (eg. strings) and call method:
instance.setDataFrame(dataFrame)
And it will work. but I wanted some reflection base approach if possible.
Thank you very much.
Ladislav
[EDIT: I'm running Python 2.7.3]
I'm a network engineer by trade, and I've been hacking on ncclient (the version on the website is old, and this was the version I've been working off of) to make it work with Brocade's implementation of NETCONF. There are some tweaks that I had to make in order to get it to work with our Brocade equipment, but I had to fork off the package and make tweaks to the source itself. This didn't feel "clean" to me so I decided I wanted to try to do it "the right way" and override a couple of things that exist in the package*; three things specifically:
A "static method" called build() which belongs to the HelloHandler class, which itself is a subclass of SessionListener
The "._id" attribute of the RPC class (the original implementation used uuid, and Brocade boxes didn't like this very much, so in my original tweaks I just changed this to a static value that never changed).
A small tweak to a util function that builds XML filter attributes
So far I have this code in a file brcd_ncclient.py:
#!/usr/bin/env python
# hack on XML element creation and create a subclass to override HelloHandler's
# build() method to format the XML in a way that the brocades actually like
from ncclient.xml_ import *
from ncclient.transport.session import HelloHandler
from ncclient.operations.rpc import RPC, RaiseMode
from ncclient.operations import util
# register brocade namespace and create functions to create proper xml for
# hello/capabilities exchange
BROCADE_1_0 = "http://brocade.com/ns/netconf/config/netiron-config/"
register_namespace('brcd', BROCADE_1_0)
brocade_new_ele = lambda tag, ns, attrs={}, **extra: ET.Element(qualify(tag, ns), attrs, **extra)
brocade_sub_ele = lambda parent, tag, ns, attrs={}, **extra: ET.SubElement(parent, qualify(tag, ns), attrs, **extra)
# subclass RPC to override self._id to change uuid-generated message-id's;
# Brocades seem to not be able to handle the really long id's
class BrcdRPC(RPC):
def __init__(self, session, async=False, timeout=30, raise_mode=RaiseMode.NONE):
self._id = "1"
return super(BrcdRPC, self).self._id
class BrcdHelloHandler(HelloHandler):
def __init__(self):
return super(BrcdHelloHandler, self).__init__()
#staticmethod
def build(capabilities):
hello = brocade_new_ele("hello", None, {'xmlns':"urn:ietf:params:xml:ns:netconf:base:1.0"})
caps = brocade_sub_ele(hello, "capabilities", None)
def fun(uri): brocade_sub_ele(caps, "capability", None).text = uri
map(fun, capabilities)
return to_xml(hello)
#return super(BrcdHelloHandler, self).build() ???
# since there's no classes I'm assuming I can just override the function itself
# in ncclient.operations.util?
def build_filter(spec, capcheck=None):
type = None
if isinstance(spec, tuple):
type, criteria = spec
# brocades want the netconf prefix on subtree filter attribute
rep = new_ele("filter", {'nc:type':type})
if type == "xpath":
rep.attrib["select"] = criteria
elif type == "subtree":
rep.append(to_ele(criteria))
else:
raise OperationError("Invalid filter type")
else:
rep = validated_element(spec, ("filter", qualify("filter")),
attrs=("type",))
# TODO set type var here, check if select attr present in case of xpath..
if type == "xpath" and capcheck is not None:
capcheck(":xpath")
return rep
And then in my file netconftest.py I have:
#!/usr/bin/env python
from ncclient import manager
from brcd_ncclient import *
manager.logging.basicConfig(filename='ncclient.log', level=manager.logging.DEBUG)
# brocade server capabilities advertising as 1.1 compliant when they're really not
# this will stop ncclient from attempting 1.1 chunked netconf message transactions
manager.CAPABILITIES = ['urn:ietf:params:netconf:capability:writeable-running:1.0', 'urn:ietf:params:netconf:base:1.0']
# BROCADE_1_0 is the namespace defined for netiron configs in brcd_ncclient
# this maps to the 'brcd' prefix used in xml elements, ie subtree filter criteria
with manager.connect(host='hostname_or_ip', username='username', password='password') as m:
# 'get' request with no filter - for brocades just shows 'show version' data
c = m.get()
print c
# 'get-config' request with 'mpls-config' filter - if no filter is
# supplied with 'get-config', brocade returns nothing
netironcfg = brocade_new_ele('netiron-config', BROCADE_1_0)
mplsconfig = brocade_sub_ele(netironcfg, 'mpls-config', BROCADE_1_0)
filterstr = to_xml(netironcfg)
c2 = m.get_config(source='running', filter=('subtree', filterstr))
print c2
# so far it only looks like the supported filters for 'get-config'
# operations are: 'interface-config', 'vlan-config' and 'mpls-config'
Whenever I run my netconftest.py file, I get timeout errors because in the log file ncclient.log I can see that my subclass definitions (namely the one that changes the XML for hello exchange - the staticmethod build) are being ignored and the Brocade box doesn't know how to interpret the XML that the original ncclient HelloHandler.build() method is generating**. I can also see in the generated logfile that the other things I'm trying to override are also being ignored, like the message-id (static value of 1) as well as the XML filters.
So, I'm kind of at a loss here. I did find this blog post/module from my research, and it would appear to do exactly what I want, but I'd really like to be able to understand what I'm doing wrong via doing it by hand, rather than using a module that someone has already written as an excuse to not have to figure this out on my own.
*Can someone explain to me if this is "monkey patching" and is actually bad? I've seen in my research that monkey patching is not desirable, but this answer and this answer are confusing me quite a bit. To me, my desire to override these bits would prevent me from having to maintain an entire fork of my own ncclient.
**To give a little more context, this XML, which ncclient.transport.session.HelloHandler.build() generates by default, the Brocade box doesn't seem to like:
<?xml version='1.0' encoding='UTF-8'?>
<nc:hello xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
<nc:capabilities>
<nc:capability>urn:ietf:params:netconf:base:1.0</nc:capability>
<nc:capability>urn:ietf:params:netconf:capability:writeable-running:1.0</nc:capability>
</nc:capabilities>
</nc:hello>
The purpose of my overridden build() method is to turn the above XML into this (which the Brocade does like:
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<capabilities>
<capability>urn:ietf:params:netconf:base:1.0</capability>
<capability>urn:ietf:params:netconf:capability:writeable-running:1.0</capability>
</capabilities>
</hello>
So it turns out that the "meta info" should not have been so hastily removed, because again, it's difficult to find answers to what I'm after when I don't fully understand what I want to ask. What I really wanted to do was override stuff in a package at runtime.
Here's what I've changed brcd_ncclient.py to (comments removed for brevity):
#!/usr/bin/env python
from ncclient import manager
from ncclient.xml_ import *
brcd_new_ele = lambda tag, ns, attrs={}, **extra: ET.Element(qualify(tag, ns), attrs, **extra)
brcd_sub_ele = lambda parent, tag, ns, attrs={}, **extra: ET.SubElement(parent, qualify(tag, ns), attrs, **extra)
BROCADE_1_0 = "http://brocade.com/ns/netconf/config/netiron-config/"
register_namespace('brcd', BROCADE_1_0)
#staticmethod
def brcd_build(capabilities):
hello = brcd_new_ele("hello", None, {'xmlns':"urn:ietf:params:xml:ns:netconf:base:1.0"})
caps = brcd_sub_ele(hello, "capabilities", None)
def fun(uri): brcd_sub_ele(caps, "capability", None).text = uri
map(fun, capabilities)
return to_xml(hello)
def brcd_build_filter(spec, capcheck=None):
type = None
if isinstance(spec, tuple):
type, criteria = spec
# brocades want the netconf prefix on subtree filter attribute
rep = new_ele("filter", {'nc:type':type})
if type == "xpath":
rep.attrib["select"] = criteria
elif type == "subtree":
rep.append(to_ele(criteria))
else:
raise OperationError("Invalid filter type")
else:
rep = validated_element(spec, ("filter", qualify("filter")),
attrs=("type",))
if type == "xpath" and capcheck is not None:
capcheck(":xpath")
return rep
manager.transport.session.HelloHandler.build = brcd_build
manager.operations.util.build_filter = brcd_build_filter
And then in netconftest.py:
#!/usr/bin/env python
from brcd_ncclient import *
manager.logging.basicConfig(filename='ncclient.log', level=manager.logging.DEBUG)
manager.CAPABILITIES = ['urn:ietf:params:netconf:capability:writeable-running:1.0', 'urn:ietf:params:netconf:base:1.0']
with manager.connect(host='host', username='user', password='password') as m:
netironcfg = brcd_new_ele('netiron-config', BROCADE_1_0)
mplsconfig = brcd_sub_ele(netironcfg, 'mpls-config', BROCADE_1_0)
filterstr = to_xml(netironcfg)
c2 = m.get_config(source='running', filter=('subtree', filterstr))
print c2
This gets me almost to where I want to be. I still have to edit the original source code to change the message-id's from being generated with uuid1().urn because I haven't figured out or don't understand how to change an object's attributes before __init__ happens at runtime (chicken/egg problem?); here's the offending code in ncclient/operations/rpc.py:
class RPC(object):
DEPENDS = []
REPLY_CLS = RPCReply
def __init__(self, session, async=False, timeout=30, raise_mode=RaiseMode.NONE):
self._session = session
try:
for cap in self.DEPENDS:
self._assert(cap)
except AttributeError:
pass
self._async = async
self._timeout = timeout
self._raise_mode = raise_mode
self._id = uuid1().urn # Keeps things simple instead of having a class attr with running ID that has to be locked
Credit goes to this recipe on ActiveState for finally cluing me in on what I really wanted to do. The code I had originally posted I don't think was technically incorrect - if what I wanted to do was fork off my own ncclient and make changes to it and/or maintain it, which wasn't what I wanted to do at all, at least not right now.
I'll edit my question title to better reflect what I had originally wanted - if other folks have better or cleaner ideas, I'm totally open.
hard to word the question so ill go right to the point, i wrote the following template tag
def do_simple_tag(parser, token):
try:
tag_name, name = token.split_contents()
except ValueError:
raise template.TemplateSyntaxError("%r tag requires exactly one argument" % token.contents.split()[0])
if not (name[0] == name[-1] and name[0] in ('"', "'")):
raise template.TemplateSyntaxError("%r tag's argument should be in quotes" % tag_name)
return SimpleTagNode(name[1:-1])
class SimpleTagNode(template.Node):
def __init__(self, name):
self.name = name
def render(self, context):
content = get_content(context, request, name)
return content
register.tag('simple_tag', do_simple_tag)
then i wrote a function that scans for this tag in a template and gets all instances of this tag within said template in a list like so
def get_tags(template):
compiled_template = get_template(template)
simple_tag_instances = _scan_tag(compiled_template.nodelist)
def _scan_tag(nodelist, current_block=None, ignore_blocks=[]):
tags = []
for node in nodelist:
if isinstance(node, SimpleTagNode):
tags.append(node.get_name())
so, my question is why does the isinstance fail if node is infact an instance of SimpleTagNode ( or so i believe ) , i checked nodelist and saw that indeed there were instances of SimpleTagNode, but they would all return false in the isinstance condition, i have spent a long time trying to figure this one out, but found nothing, i even used the shell running the funcions above and still returned fals, any help is much a appreciated
So i finally solved it, basically in the module that contained the _scan_tag function at the top of the file i was importing the SimpleTagNode class like so
from simple_tag.templatetags.simple_tag import SimpleTagNode
simple_tag being the name of my app, and also the name of the template file, for some reason this was conflicted with isinstance , so i tried
from paulo.simple_tag.templatetags.simple_tag import SimpleTagNode
paulo being my project app, and it worked.
This code is copy from http://code.google.com/p/closure-library/source/browse/trunk/closure/bin/build/source.py
The Source class's __str
__method referred self._path
Is it a special property for self?
Cuz, i couldn't find the place define this variable at Source Class
import re
_BASE_REGEX_STRING = '^\s*goog\.%s\(\s*[\'"](.+)[\'"]\s*\)'
_PROVIDE_REGEX = re.compile(_BASE_REGEX_STRING % 'provide')
_REQUIRES_REGEX = re.compile(_BASE_REGEX_STRING % 'require')
# This line identifies base.js and should match the line in that file.
_GOOG_BASE_LINE = (
'var goog = goog || {}; // Identifies this file as the Closure base.')
class Source(object):
"""Scans a JavaScript source for its provided and required namespaces."""
def __init__(self, source):
"""Initialize a source.
Args:
source: str, The JavaScript source.
"""
self.provides = set()
self.requires = set()
self._source = source
self._ScanSource()
def __str__(self):
return 'Source %s' % self._path #!!!!!! what is self_path !!!!
def GetSource(self):
"""Get the source as a string."""
return self._source
def _ScanSource(self):
"""Fill in provides and requires by scanning the source."""
# TODO: Strip source comments first, as these might be in a comment
# block. RegExes can be borrowed from other projects.
source = self.GetSource()
source_lines = source.splitlines()
for line in source_lines:
match = _PROVIDE_REGEX.match(line)
if match:
self.provides.add(match.group(1))
match = _REQUIRES_REGEX.match(line)
if match:
self.requires.add(match.group(1))
# Closure's base file implicitly provides 'goog'.
for line in source_lines:
if line == _GOOG_BASE_LINE:
if len(self.provides) or len(self.requires):
raise Exception(
'Base files should not provide or require namespaces.')
self.provides.add('goog')
def GetFileContents(path):
"""Get a file's contents as a string.
Args:
path: str, Path to file.
Returns:
str, Contents of file.
Raises:
IOError: An error occurred opening or reading the file.
"""
fileobj = open(path)
try:
return fileobj.read()
finally:
fileobj.close()
No, _path is just an attribute that may or me not be set on an object like any other attribute. The leading underscore simply means that the author felt it was an internal detail of the object and didn't want it regarded as part of the public interface.
In this particular case, unless something is setting the attribute from outside that source file, it looks like it's simply a mistake. It won't do any harm unless anyone ever tries to call str() on a Source object and probably nobody ever does.
BTW, you seem to be thinking there is something special about self. The name self isn't special in any way: it's a convention to use this name for the first parameter of a method, but it is just a name like any other that refers to the object being processed. So if you could access self._path without causing an error you could access it equally well through any other name for the object.