Is creating multiple instances of a class dynamically good practice? Alternatives? - python

Background
I'm working with API calls that return network element data (telecommunications industry, FYI).
I'm dynamically creating 'x' class instances based on a 'meta' field in the API response, which tells me how many of the queried elements exist in the network. This is not something I can know beforehand (it may be 1 element, or 10, or 100, etc.).
Each element has several possible attributes (name, lat/long, ip_address, MAC, type, etc.), which is why I thought a class structure would be best. This is my first time using classes.
Question
Is this best practice? What are the alternatives to this approach?
Code
import json

from requests import request


class Node:
    """
    A class to represent a network node.
    """
    # Shared class variables
    num_nodes = 0
    # Shared session instance (defined elsewhere; provides HOST and auth)
    session = Session()

    @classmethod
    def get_number_of_nodes(cls):
        """
        API call to ### which fetches the number-of-nodes metadata.
        """
        path = "####"
        url = f"{cls.session.HOST}{path}"
        headers = {'Authorization': #####,}
        params = {'limit': 10000,}
        resp = request(
            "GET", url, headers=headers, params=params, verify=False
        )
        resp = json.loads(resp.text)
        cls.num_nodes = resp['meta']['total']

    def get_node_ids(self):
        """
        For an instance of Node, get its attributes from ###.
        """
        pass


# Update the Node class's 'num_nodes' class-level variable
Node.get_number_of_nodes()
# Create 'num_nodes' Node instances and store them in a list
nodes = [Node() for _ in range(Node.num_nodes)]
# Grab unique node ids and assign them to the instances of Node
for node in nodes:
    node.get_node_ids()
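One alternative worth considering is letting the class build its own instances directly from the API response via classmethod factories, so the call, the count, and the construction live in one place and no count needs to be stored as class state. A minimal sketch, not tied to the real API, so the endpoint and field names below are assumptions:

import requests


class Node:
    """A network node built from a single API record."""

    def __init__(self, node_id=None, name=None, ip_address=None):
        self.node_id = node_id
        self.name = name
        self.ip_address = ip_address

    @classmethod
    def from_record(cls, record):
        # Map one API record onto a Node; the field names are assumptions.
        return cls(
            node_id=record.get("id"),
            name=record.get("name"),
            ip_address=record.get("ip_address"),
        )

    @classmethod
    def fetch_all(cls, url):
        # One request returns both the count and the records, so there is
        # no need to create empty instances first and fill them in later.
        resp = requests.get(url, params={"limit": 10000}, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        return [cls.from_record(rec) for rec in payload.get("objects", [])]


# nodes = Node.fetch_all("https://example.net/api/nodes")  # hypothetical URL

If the elements are mostly passive attribute bags, a dataclasses.dataclass (or even a plain list of dicts) does the same job with less ceremony; a full class earns its keep once the nodes grow behavior.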

Related

Query based on Embedded Document List fields in mongoengine

I'm running into an issue using mongoengine. A raw query that works in Compass isn't working using __raw__ in mongoengine. I'd like to rewrite it using mongoengine's methods, but I'd also like to understand why it's not working with __raw__ either.
I'm using an embedded document list field that has inheritance. The query is "give me all sequences that have a 'TypeA' Assignment".
My schema:
from mongoengine import Document, EmbeddedDocument, EmbeddedDocumentListField, StringField


class Sample(EmbeddedDocument):
    name = StringField()


class Assignment(EmbeddedDocument):
    name = StringField()
    meta = {'allow_inheritance': True}


class TypeA(Assignment):
    pass


class TypeB(Assignment):
    other_field = StringField()


class Sequence(Document):
    seq = StringField(required=True)
    samples = EmbeddedDocumentListField(Sample)
    assignments = EmbeddedDocumentListField(Assignment)
Writing {'assignments._cls': 'TypeA'} into Compass returns a list, but with mongoengine I get an empty result:
from mongo_objects import Sequence


def get_samples_assigned_as_class(cls: str):
    query_raw = Sequence.objects(__raw__={'assignments._cls': cls})  # raw query, fails
    # query2 = Sequence.objects(assignments___cls=cls)  # first attempt, failed
    # query3 = Sequence.objects.get().assignments.filter(cls=cls)  # second attempt, also failed; didn't like that it queried everything first
    print(query_raw)  # empty queryset; iterating does nothing


get_samples_assigned_as_class('TypeA')
"Assignments" is a list because one sequence may have multiples of the same class. An in depth awnser on how to query these lists for categorical information would be ideal, as I'm not sure how to properly go about it. I'm mostly filtering on the inheritence _cls, but eventually I'd like to do nested queries (cls : TypeA, sample : Sample_1)
Thanks
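One thing worth checking here (an assumption, since the stored documents aren't shown): with allow_inheritance, mongoengine writes _cls as the full class path, e.g. 'Assignment.TypeA', rather than the bare leaf name, so a raw match on 'TypeA' can come back empty against mongoengine-written data. A sketch of both the raw form and mongoengine's __match ($elemMatch) operator:

# Raw query using the full hierarchy path mongoengine stores by default.
hits = Sequence.objects(__raw__={'assignments._cls': 'Assignment.TypeA'})

# The __match operator maps to $elemMatch, which also covers the nested
# case of matching several fields on the same embedded list element.
hits = Sequence.objects(
    assignments__match={'_cls': 'Assignment.TypeA', 'name': 'some-name'}
)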

Node added to neo4j DB but prints out as "None"

I have the following code:
import py2neo
from py2neo import Graph, Node, Relationship


def createRelationshipWithProperties():
    print("Start - Creating Relationships")

    # Authenticate the user using py2neo.authenticate
    # Ensure that you change the password 'sumit' as per your database configuration.
    py2neo.authenticate("localhost:7474", "neo4j", "")

    # Connect to the graph and get the instance of Graph
    graph = Graph("http://localhost:7474/db/data/")

    # Create Nodes with Properties
    amy = Node("FEMALE", name="Amy")
    kristine = Node("FEMALE", name="Kristine")
    sheryl = Node("FEMALE", name="Sheryl")

    kristine_amy = Relationship(kristine, "FRIEND", amy, since=2005)
    print(kristine_amy)
    amy_sheryl = Relationship(sheryl, "FRIEND", amy, since=2001)

    # Finally use the graph object to create the Nodes and Relationships.
    # When we create a Relationship, its Nodes are also created.
    resultNodes = graph.create(kristine_amy)
    resultNodes1 = graph.create(amy_sheryl)

    # Print the results (relationships)
    print("Relationship Created - ", resultNodes)
    print("Relationship Created - ", resultNodes1)


if __name__ == '__main__':
    createRelationshipWithProperties()
The resultNodes = graph.create(...) line seems to commit the nodes and relationships to the server, because I can see them when I run MATCH (n) RETURN n. However, when the code prints resultNodes, I get None, as if they don't exist. This is the output that I get:
Start - Creating Relationships
(kristine)-[:FRIEND {since:2005}]->(amy)
Relationship Created - None
Relationship Created - None
You're using the API incorrectly. The create method doesn't return nodes but instead updates the supplied argument. Therefore to get the relationship nodes, you need to interrogate the relationship object after performing the create.
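A sketch of what "interrogate the relationship object" can look like (accessor spellings differ across py2neo versions, so treat this as indicative):

graph.create(kristine_amy)

# After create(), the objects passed in are bound to the database;
# inspect them instead of the return value.
print(kristine_amy)             # the bound relationship
print(kristine_amy.start_node)  # a property in py2neo v4+; a method in v3
print(kristine_amy.end_node)
print(dict(amy))                # node properties, e.g. {'name': 'Amy'}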

How can I show arbitrary pages based on the URL requested in CherryPy?

For example, if the user went to localhost:8080/foobar228, I wouldn't want to hardcode a handler for the foobar228 page by defining a function for it. I simply want a function that receives the requested URL ("foobar228") and other information like GET and POST data, and returns the HTML to be served. Is there any way to do this?
Use the default method. Here is an example.
import cherrypy


class Root:

    @cherrypy.expose
    def default(self, *url_parts, **params):
        req = cherrypy.request
        print(url_parts)
        body = [
            'Page served with method: %s' % req.method,
            'Query string from req: %s' % req.query_string,
            'Path info from req: %s' % req.path_info,
            'Params (query string + body for POST): %s' % params,
            'Body params (only for POST): %s' % req.body.params
        ]
        if url_parts:  # url_parts is path_info but divided by "/" as a tuple
            if url_parts[0] == 'foobar228':
                body.append('Special foobar page')
            else:
                body.append('Page for %s' % '/'.join(url_parts))
        else:
            body.append("No path, regular page")
        return '<br>\n'.join(body)


cherrypy.quickstart(Root())
The URL segments become the positional arguments, and any query string (?foo=bar) becomes part of the keyword arguments of the method. The body parameters of a POST request are also included in the keyword arguments (in this case under the name params in the method definition).
The special _cp_dispatch method
_cp_dispatch is a special method you declare in any of your controllers to massage the remaining segments before CherryPy gets to process them. This gives you the ability to remove, add, or otherwise handle any segment you wish and even entirely change the remaining parts.
import cherrypy


class Band(object):
    def __init__(self):
        self.albums = Album()

    def _cp_dispatch(self, vpath):
        if len(vpath) == 1:
            cherrypy.request.params['name'] = vpath.pop()
            return self

        if len(vpath) == 3:
            cherrypy.request.params['artist'] = vpath.pop(0)  # /band name/
            vpath.pop(0)  # /albums/
            cherrypy.request.params['title'] = vpath.pop(0)  # /album title/
            return self.albums

        return vpath

    @cherrypy.expose
    def index(self, name):
        return 'About %s...' % name


class Album(object):
    @cherrypy.expose
    def index(self, artist, title):
        return 'About %s by %s...' % (title, artist)


if __name__ == '__main__':
    cherrypy.quickstart(Band())
Notice how the controller defines _cp_dispatch: it takes a single argument, the URL path info broken into its segments.
The method can inspect and manipulate the list of segments, removing any or adding new segments at any position. The new list of segments is then sent to the dispatcher which will use it to locate the appropriate resource.
In the above example, you should be able to go to the following URLs:
http://localhost:8080/nirvana/
http://localhost:8080/nirvana/albums/nevermind/
The /nirvana/ segment is associated with the band and the /nevermind/ segment relates to the album.
To achieve this, our _cp_dispatch method works on the idea that the default dispatcher matches URLs against page handler signatures and their position in the tree of handlers.
Taken from the CherryPy docs.

django-tables2 - How to assign total item count with non-queryset data pagination?

I am fetching data from an API with the "requests" library and I want to show 10 items per page in an HTML table. So I fetch 10 items from the API along with the total object count (assume there are 1000 items). When I push the data to the HTML table, pagination is not created, because I don't know how to assign the total item count to the table.
# tables.py
import django_tables2 as tables
from django_tables2.utils import A


class CustomerTable(tables.Table):
    id = tables.Column()
    name = tables.LinkColumn('customer:edit', kwargs={'id': A('id')})

    class Meta:
        order_by = 'name'


# views.py
# content of a view
data = {'total_count': 1000, "objects": [{'id': 1, 'name': 'foo'}, {'id': 2, 'name': 'bar'}, {'id': 3, 'name': 'baz'}]}
table = CustomerTable(data['objects'])
table.paginate(page=self.request.GET.get('page', 1), per_page=1)
self.render_to_response({'table': table})
Question: how do I assign the total item count (data['total_count']) to the table for pagination?
From the documentation here:
Tables are compatible with a range of input data structures. If you've seen the tutorial you'll have seen a queryset being used, however any iterable that supports len() and contains items that expose key-based access to column values is fine.
So you can create your own wrapper class around your API calls which requests the length of your data when len() is called.
Something like this might work, although you'd probably want to optimize it to access the API for just the items needed rather than fetching the entire data set, as the code below does.
class ApiDataset(object):
    def __init__(self, api_addr):
        self.http_api_addr = api_addr
        self.data = None

    def cache_results(self):
        # Access API and cache returned data on the object.
        if self.data is None:
            self.data = get_data_from_api()

    def __iter__(self):
        self.cache_results()
        for item in self.data['objects']:
            yield item

    def __len__(self):
        self.cache_results()
        return self.data['total_count']
Using this setup, you would pass an ApiDataset instance to the django-tables2 Table constructor.
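Wired into the view from the question, usage might look like this (the endpoint is hypothetical, and get_data_from_api is the stub from the sketch above):

# views.py (sketch)
dataset = ApiDataset('https://api.example.com/customers')
table = CustomerTable(dataset)
# len(dataset) now reports total_count, so the paginator can size itself.
table.paginate(page=self.request.GET.get('page', 1), per_page=10)
self.render_to_response({'table': table})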

Python OOP Project Organization

I'm a bit new to Python development -- I'm creating a larger project for some web scraping. I want to approach this as "Pythonically" as possible, and would appreciate some help with the project structure. Here's how I'm doing it now:
Basically, I have a base class for an object whose purpose is to go to a website and parse some specific data from it into its own list, jobs[].
minion.py
class minion:
    # Empty getJobs() function to be defined by object pre-instantiation
    def getJobs(self):
        pass

    # Constructor for a minion that requires site authorization
    # Ex: minCity1 = minion('Some city', 'http://portal.com/somewhere', 'user', 'password')
    # or  minCity2 = minion('Some city', 'http://portal.com/somewhere')
    def __init__(self, title, URL, user='', password=''):
        self.title = title
        self.URL = URL
        self.user = user
        self.password = password
        self.jobs = []
        if user == '' and password == '':
            self.reqAuth = 0
        else:
            self.reqAuth = 1

    def displayjobs(self):
        for j in self.jobs:
            j.display()
I'm going to have about 100 different data sources. The way I'm doing it now is to create a separate module for each "minion", each of which defines (and binds) a more tailored getJobs() function for that object.
Example: minCity1.py
from minion import minion
from BeautifulSoup import BeautifulSoup
import urllib2
from job import job

# MINION CONFIG
minTitle = 'Some city'
minURL = 'http://www.somewebpage.gov/'


# Here we define a function that will be bound to this object's getJobs function
def getJobs(self):
    page = urllib2.urlopen(self.URL)
    soup = BeautifulSoup(page)
    # For each row
    for tr in soup.findAll('tr'):
        tJob = job()
        span = tr.findAll(['span', 'class="content"'])
        # If row has 5 spans, pull data from spans 2 and 3 ( [1] and [2] )
        if len(span) == 5:
            tJob.title = span[1].a.renderContents()
            tJob.client = 'Some City'
            tJob.source = minURL
            tJob.due = span[2].div.renderContents().replace('<br />', '')
            self.jobs.append(tJob)


# Don't forget to bind the function to the object!
minion.getJobs = getJobs

# Instantiate the object
mCity1 = minion(minTitle, minURL)
I also have a separate module which simply contains a list of all the instantiated minion objects (which I have to update each time I add one):
minions.py
from minion_City1 import mCity1
from minion_City2 import mCity2
from minion_City3 import mCity3
from minion_City4 import mCity4

minionList = [mCity1,
              mCity2,
              mCity3,
              mCity4]
main.py references minionList for all of its operations on the aggregated data.
This seems a bit chaotic to me, and was hoping someone might be able to outline a more Pythonic approach.
Thank you, and sorry for the long post!
Instead of creating functions and assigning them to objects (or whatever minion is, I'm not really sure), you should use classes instead. Then you'll have one class for each of your data sources.
If you want, you can even have these classes inherit from a common base class, but that isn't absolutely necessary.
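A minimal sketch of that layout, reusing the names from the question (the parsing body is a placeholder, not a working scraper):

class Minion(object):
    """Base class: shared constructor and display logic."""

    def __init__(self, title, URL, user='', password=''):
        self.title = title
        self.URL = URL
        self.user = user
        self.password = password
        self.jobs = []

    def getJobs(self):
        raise NotImplementedError  # each data source implements its own parser

    def displayjobs(self):
        for j in self.jobs:
            j.display()


class SomeCityMinion(Minion):
    def getJobs(self):
        # Source-specific fetching and parsing goes here.
        pass


minionList = [
    SomeCityMinion('Some city', 'http://www.somewebpage.gov/'),
    # ...one instance per data source
]

This removes both the manual function binding and the hand-maintained import list: adding a source becomes defining a subclass and appending one entry.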
