Python Pyramid(cornice) with Elasticsearch DSL - python

Using python pyramid and ElastiSearch. I looked at pythonelasticsearch-dsl which offers a nice ORM but I'm not sure how to integrate it with pyramid.
So far I made a "global connection" as per pythonelasticsearch-dsl and expose the connection via an attribute into pyramid's request.
Do you see anything wrong with this code ?!
from elasticsearch_dsl import connections
def _create_es_connection(config):
registry = config.registry
settings = registry.settings
es_servers = settings.get('elasticsearch.' + 'servers', ['localhost:9200'])
es_timeout = settings.get('elasticsearch.' + 'timeout', 20)
registry.es_connection = connections.create_connection(
hosts=es_servers,
timeout=es_timeout)
def get_es_connection(request):
return getattr(request.registry, 'es_connection',
connections.get_connection())
# main
def main(global_config, **settings):
...
config = Configurator(settings=settings)
config.add_request_method(
get_es_connection,
'es',
reify=True)
I use the connection as
#view
request.es ...
If there are any other ways I would appreciate any pointers - thank you.

A few things look weird, but I guess it comes from the copy/paste from your project (missing type cast in settings, connections undefined, etc.)
What you are trying to do is very similar to what you'd do with SQLAlchemy:
https://docs.pylonsproject.org/projects/pyramid_cookbook/en/latest/database/sqlalchemy.html
But according the docs of pythonelasticsearch-dsl you don't even have to bother with all that, since the lib allows you define a global default connection:
https://elasticsearch-dsl.readthedocs.io/en/latest/configuration.html#default-connection

Related

Set Spark Config property for spark-testing-base

While I was trying to use spark-testing-base in Python, I needed to test a function which writes on a Postgres DB.
To do so is necessary to provide to the Spark Session the Driver to connect to Posgtres; to achieve that I first tried to override the getConf() method (as reported in the comment Override this to specify any custom configuration.). But apparently it doesn't work. Probably I'm not passing the value with the required syntax or whatever but after many attempts I anyway get the error java.lang.ClassNotFoundException: org.postgresql.Driver (typical of when the Driver Jar was not correctly downloaded through the conf parameter).
Attempted getConf override:
def getConf(self):
return ("spark.jars.packages", "org.postgresql:postgresql:42.1.1")
def getConf(self):
return {"spark.jars.packages", "org.postgresql:postgresql:42.1.1"}
def getConf(self):
return SparkConf()\
.setMaster("local[*]")\
.setAppName("test")\
.set("spark.jars.packages", "org.postgresql:postgresql:42.1.1")
So I even tried to Override the setUp() method like that:
def setUp(self):
try:
from pyspark.sql import Session
self.session = Session.Builder.config("spark.jars.packages", "org.postgresql:postgresql:42.1.1")
self.sqlCtx = self.session._wrapped
except Exception:
self.sqlCtx = SQLContext(self.sc)
But still no luck. So what I am doing wrong? How am I supposed to override the getConf() method?
Not exactly sure how to do this in python. In scala, using sbt, it is quite straight forward. But anyways, the System.setProperty("spark.jars.packages", "org.postgresql:postgresql:42.1.1") method found here: https://github.com/holdenk/spark-testing-base/issues/187 worked for me.
So I would rec looking up how to do that with python + spark.
It was necessary to override the setUpClass method:
#classmethod
def setUpClass(cls):
"""Setup a basic Spark context for testing"""
class_name = cls.__name__
conf = SparkConf().set("spark.jars.packages", "org.postgresql:postgresql:42.1.1")
cls.sc = SparkContext(cls.getMaster(), appName=class_name, conf=conf)
quiet_py4j()
And in this way is then possible to pass to the Spark test library external jars.
Credits to Leonardo Noleto: https://github.com/holdenk/spark-testing-base/issues/281#event-2200108290

Configuring eulexistdb with python bringing errors in django setting module

I have following code written in python in order to communicate with ExistDB using eulexistdb module.
from eulexistdb import db
class TryExist:
def __init__(self):
self.db = db.ExistDB(server_url="http://localhost:8899/exist")
def get_data(self, query):
result = list()
qresult = self.db.executeQuery(query)
hits = self.db.getHits(qresult)
for i in range(hits):
result.append(str(self.db.retrieve(qresult, i)))
return result
query = '''
let $x:= doc("/db/sample/books.xml")
return $x/bookstore/book/author/text()
'''
a = TryExist()
response = a.get_data(query)
print response
I am amazed that this code runs fine in Aptana Studio 3 giving me the output I want, but when running from other IDE or using command "python.exe myfile.py" brings following error:
django.core.exceptions.ImproperlyConfigured: Requested setting EXISTDB_TIMEOUT, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.
I used my own localsetting.py to solve the problem using following code:
import os
# must be set before importing anything from django
os.environ['DJANGO_SETTINGS_MODULE'] = 'localsettings'
... writing link for existdb here...
Then I get error as:
django.core.exceptions.ImproperlyConfigured: The SECRET_KEY setting must not be empty.
How do I configure the setting in Django to suit for ExistDB? Help me here please..
Never Mind. I found the answer with little research from this site. What I did was created a localsetting.py file with following configurations.
EXISTDB_SERVER_USER = 'user'
EXISTDB_SERVER_PASSWORD = 'admin'
EXISTDB_SERVER_URL = "http://localhost:8899/exist"
EXISTDB_ROOT_COLLECTION = "/db"
and in my main file myfile.py I used :
from localsettings import EXISTDB_SERVER_URL
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'localsettings.py'
and In the class TryExist I changed in __ init __() as:
def __init__(self):
self.db = db.ExistDB(server_url=EXISTDB_SERVER_URL)
PS: Using only os.environ['DJANGO_SETTINGS_MODULE'] = 'localsettings' brings the django.core.exceptions.ImproperlyConfigured: The SECRET_KEY setting must not be empty..
The reason your code works in an IDE but not at the command line is probably that you have a difference in what Python environments are used to run your code.
I've done a couple of tests:
Virtualenv with eulexistdb installed but not Django. eulexistdb tries to load django.conf but fails and so does not try to get its configuration from a Django configuration. Ultimately, your code runs without error.
Virtualenv with 'eulexistdb*and* Django:eulexistdbtries to loaddjango.conf` and succeed. I then tries to get is configuration from the Django configuration but fails. I get the same error you describe in your question.
To prevent the error in the presence of a Django installation, the problem can be fixed by adding a Django configuration like you did in your accepted self-answer. But if the code you are writing does not otherwise use Django, that's a bit of a roundabout way to get your code to run. The most direct way to fix the problem is to simply add a timeout parameter to the code that creates the ExistDB instance:
self.db = db.ExistDB(
server_url="http://localhost:8080/exist", timeout=None)
If you do this, then there won't be any error. Setting the timeout to None leaves the default behavior in place but prevents eulexistdb from looking for a Django configuration.

Implementing Sqlalchemy beaker caching in pyramid framework

As per the example provided by sqlalchemy documentation to cache a sqlalchemy query we are suppose to do this
from caching_query import FromCache
# load Person objects. cache the result under the namespace "all_people".
print "loading people...."
people = Session.query(Person).options(FromCache("default", "all_people")).all()
I have the following configuration for beaker in development.ini
cache.regions = day, hour, minute, second
cache.type = file
cache.data_dir = %(here)s/cache/sess/data
cache.lock_dir = %(here)s/cache/sess/lock
cache.second.expire = 1
cache.minute.expire = 60
cache.hour.expire = 3600
cache.day.expire = 86400
When i use the above example code in my application data is not cached in the cache folder, so i am assuming memory based caching is the default, Is it possible to switch sqlalchemy cache type to file based? or am i getting it wrong?
Your question is missing some details, but let me try:
the first parameter passed to FromCache() is a name of a Beaker cache region, it should match one of the configured regions, which is not the case here. Or perhaps you configure default region in the code (I'd expect BeakerException being thrown if region is unknown)?
you need pyramid_beaker module installed and included in Pyramid's project configuration. I suggest you follow pyramid_beaker manual's Setup section.
you need some extra code in __init__.py of your application in order to propagate .ini file settings to Beaker. This is described in Beaker cache region support section of the manual.
And here's a working sample from my current project, configuring both Beaker-based sessions and caching (all irrelevant parts removed):
from pyramid.config import Configurator
from pyramid_beaker import set_cache_regions_from_settings
from pyramid_beaker import session_factory_from_settings
def main(global_config, **settings):
# Configure Beaker caching/sessions
set_cache_regions_from_settings(settings)
session_factory = session_factory_from_settings(settings)
config = Configurator(settings=settings)
config.set_session_factory(session_factory)
config.include('pyramid_beaker')
config.add_static_view('static', 'static', cache_max_age=3600)
config.add_route('home', '/')
config.scan()
return config.make_wsgi_app()

Pyramid error: AttributeError: No session factory registered

I'm trying to learn Pyramid and having problems getting the message flash to work. I'm totally new but read the documentation and did the tutorials.
I did the tutorial on creating a wiki(tutorial here, Code here ). It worked great and was pretty easy so I decided to try to apply the flash message I saw in todo list tutorial I did(tutorial here, full code is in a single file at the bottom of the page). Basically when a todo list is created, the page is refreshed with a message saying 'New task was successfully added!'. I wanted to do that everytime someone updated a wiki article in the wiki tutorial.
So I re-read the session section in the documentaion and it says I really just need to do this:
from pyramid.session import UnencryptedCookieSessionFactoryConfig
my_session_factory = UnencryptedCookieSessionFactoryConfig('itsaseekreet')
from pyramid.config import Configurator
config = Configurator(session_factory = my_session_factory)
then in my code I need to add: request.session.flash('New wiki was successfully added!') but I get a error everytime: Pyramid error: AttributeError: No session factory registered
Here's my function(its the exact same from the tutorial except for the request.session.flash part):
#view_config(route_name='edit_page', renderer='templates/edit.pt', permission='edit')
def edit_page(request):
name = request.matchdict['pagename']
page = DBSession.query(Page).filter_by(name=name).one()
if 'form.submitted' in request.params:
page.data = request.params['body']
DBSession.add(page)
request.session.flash('page was successfully edited!')
return HTTPFound(location = request.route_url('view_page',
pagename=name))
return dict(
page=page,
save_url = request.route_url('edit_page', pagename=name),
logged_in=authenticated_userid(request),
)
(note: One thing that I think I could be doing wrong is in the todo example, all the data is in one file, but in the wiki example there are several files..I added my session imports in the view.py file because the flash message is being generated by the view itself).
What am I doing wrong? Any suggestions?
The code you provided is just an example, of course you need to apply it in a correct place. In Pyramid you should (in simple cases ;) have only 1 place in your code where you create just 1 Configurator instance, in the tutorial it is in the main function. A Configurator does not do anything by itself, except create a WSGI application with make_wsgi_app.
Thus, to add sessions there, modify wiki2/src/views/tutorial/__init__.py as follows:
from pyramid.config import Configurator
from sqlalchemy import engine_from_config
from pyramid.session import UnencryptedCookieSessionFactoryConfig
from .models import DBSession
def main(global_config, **settings):
""" This function returns a Pyramid WSGI application.
"""
engine = engine_from_config(settings, 'sqlalchemy.')
DBSession.configure(bind=engine)
my_session_factory = UnencryptedCookieSessionFactoryConfig('itsaseekreet')
config = Configurator(settings=settings, session_factory=my_session_factory)
...

what is the best way to make my folders invisible / restricted in twistd?

a fews days ago, i tried to learn the python twisted..
and this is how i make my webserver :
from twisted.application import internet, service
from twisted.web import static, server, script
from twisted.web.resource import Resource
import os
class NotFound(Resource):
isLeaf=True
def render(self, request):
return "Sorry... the page you're requesting is not found / forbidden"
class myStaticFile(static.File):
def directoryListing(self):
return self.childNotFound
#root=static.file(os.getcwd()+"/www")
root=myStaticFile(os.getcwd()+"/www")
root.indexNames=['index.py']
root.ignoreExt(".py")
root.processors = {'.py': script.ResourceScript}
root.childNotFound=NotFound()
application = service.Application('web')
sc = service.IServiceCollection(application)
i = internet.TCPServer(8080, server.Site(root))##UndefinedVariable
i.setServiceParent(sc)
in my code, i make an instance class for twisted.web.static.File and override the directoryListing.
so when user try to access my resource folder (http://localhost:8080/resource/ or http://localhost:8080/resource/css), it will return a notFound page.
but he can still open/read the http://localhost:8080/resource/css/style.css.
it works...
what i want to know is.. is this the correct way to do that???
is there another 'perfect' way ?
i was looking for a config that disable directoryListing like root.dirListing=False. but no luck...
Yes, that's a reasonable way to do it. You can also use twisted.web.resource.NoResource or twisted.web.resource.Forbidden instead of defining your own NotFound.

Categories