Does anybody have an idea what is happening? I'm not advanced enough in coding to understand the reason behind the following behavior:
I have a short Python script that connects to Snowflake and gets the DDL definition of a table. I then want to parse that DDL using simple-ddl-parser.
import snowflake.connector
from simple_ddl_parser import DDLParser

ctx = snowflake.connector.connect(
    user='xxx',
    account='xxx',
    warehouse='xxx',
    database='xxx',
    schema='xxx',
    role='xxx',
    authenticator="externalbrowser")
cs = ctx.cursor()
try:
    cs.execute("select get_ddl('table', 'xxx');")
    ddl = cs.fetchone()
    # print(ddl[0])
    result = DDLParser(ddl[0]).run()
finally:
    cs.close()
ctx.close()
When I comment out the line result = DDLParser(ddl[0]).run() and simply print the DDL, everything works fine. In the terminal I get the message that my browser will open (because of the externalbrowser authentication) and I can see the DDL.
However, as soon as I use DDLParser to parse the DDL, I get a lot of output from snowflake-connector. It looks like debug information with all the details about the connection:
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
connection.py: 557:closed
telemetry.py: 151:Closing telemetry client.
telemetry.py: 116:Sending 1 logs to telemetry. Data is {'logs': [{'message': {'type': 'client_time_consume_first_result', 'source': 'PythonConnector', 'query_id': '01a7449c-0c03-a272-0000-ce55d70a2ffe', 'value': 493}, 'timestamp': '1664357536706'}]}.
network.py:1147:Session status for SessionPool 'xxx', SessionPool 1/1 active sessions
network.py: 827:remaining request timeout: 5, retry cnt: 1
network.py: 808:Request guid: 8524f554-bc67-4209-af13-d87249a7fae6
network.py:1006:socket timeout: 60
connectionpool.py: 456:https://xxx:443 "POST /telemetry/send?request_guid=8524f554-bc67-4209-af13-d87249a7fae6 HTTP/1.1" 200 86
network.py:1032:SUCCESS
network.py:1152:Session status for SessionPool 'xxx', SessionPool 0/1 active sessions
network.py: 715:ret[code] = None, after post request
telemetry.py: 140:Successfully uploading metrics to telemetry.
connection.py: 560:No async queries seem to be running, deleting session
network.py:1147:Session status for SessionPool 'xxx', SessionPool 1/1 active sessions
network.py: 827:remaining request timeout: 5, retry cnt: 1
network.py: 808:Request guid: 33d96b5f-f8db-4fbc-b88f-54a5c330615c
network.py:1006:socket timeout: 60
connectionpool.py: 456:https://xxx:443 "POST /session?delete=true&request_guid=33d96b5f-f8db-4fbc-b88f-54a5c330615c HTTP/1.1" 200 76
network.py:1032:SUCCESS
network.py:1152:Session status for SessionPool 'xxx', SessionPool 0/1 active sessions
network.py: 715:ret[code] = None, after post request
connection.py: 571:Session is closed
connection.py: 548:Rest object has been destroyed, cannot close session
_api.py: 172:Attempting to acquire lock 1617266024480 on C:\Users\xxx\AppData\Local\Snowflake\Caches\ocsp_cache.lock
_api.py: 176:Lock 1617266024480 acquired on C:\Users\xxx\AppData\Local\Snowflake\Caches\ocsp_cache.lock
_api.py: 209:Attempting to release lock 1617266024480 on C:\Users\xxx\AppData\Local\Snowflake\Caches\ocsp_cache.lock
_api.py: 212:Lock 1617266024480 released on C:\Users\xxx\AppData\Local\Snowflake\Caches\ocsp_cache.lock
My DDL is parsed successfully, but why am I getting all this additional information on the screen?
Are snowflake-connector and simple-ddl-parser somehow interfering with each other? Is it possible that some variables or functions share a name, or that something in the simple-ddl-parser code switches on debug output from snowflake-connector? I have no idea, but I don't want to see this "debug" info on my screen and I do want to use simple-ddl-parser.
I will be happy to get any feedback! Thanks!
Thank you @Sergiu!
Your solution works!
So, just to understand: if I use multiple Python packages in my code and one of them changes the logging level to e.g. DEBUG, I will get debug info from all of my other packages as well? So this one setting affects the logging behavior of my whole program? Interesting.
I have found a way to change the logging level for my whole program, independent of the settings in a particular package (put it at the beginning of your code):
import logging
logging.basicConfig(level=logging.CRITICAL)
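To illustrate why one package's logging setup can surface messages from another: Python's standard logging module keeps all loggers in one process-wide hierarchy, and records from any logger propagate up to the handlers on the root logger. The following is a minimal sketch using only the standard library. The logger name some_connector is a hypothetical stand-in for a third-party package, and ListHandler is a small helper defined here only to capture the output.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in a list instead of printing them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(f"{record.name}: {record.getMessage()}")


root = logging.getLogger()
capture = ListHandler()
root.addHandler(capture)

# Hypothetical third-party logger; it never sets a level of its own.
noisy = logging.getLogger("some_connector.network")

# Some code (for example another package) switches the root logger to DEBUG...
root.setLevel(logging.DEBUG)
noisy.debug("socket timeout: 60")            # ...so this record now gets through.

# Raising the level on the package's top-level logger quiets only that package.
logging.getLogger("some_connector").setLevel(logging.WARNING)
noisy.debug("remaining request timeout: 5")  # dropped
noisy.warning("retrying request")            # still recorded

print(capture.lines)
```

For the script in the question, the equivalent would presumably be logging.getLogger("snowflake.connector").setLevel(logging.WARNING), assuming the connector's loggers live under that name. This keeps your own logging untouched while silencing just the connector.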
PS.
I'm using another function from simple-ddl-parser (below), but it does not seem to accept a log_level parameter. I will ask the creator whether it would be a good idea to add this feature there too; for now I will add the code above to my program (I think the WARNING level is also sufficient).
from simple_ddl_parser import parse_from_file
result = parse_from_file('tests/sql/test_one_statement.sql')
print(result)
DDLParser has its default log level set to DEBUG, as I can see here.
You can set the DDLParser log level to CRITICAL and these messages won't appear on the screen anymore.
So, in your existing script, add:
import logging
...
result = DDLParser(ddl[0], log_level=logging.CRITICAL).run()
See more information here.
When the load balancer in front of the tested HTTPS web site fails over, this generates some HTTPError 500 responses for a few seconds, and then Locust hangs:
The response time graph stops (empty graph)
The total requests per second turns into an incorrect flat green line.
If I just stop and start the test, Locust resumes monitoring the response time properly.
We can see some HTTPError 500 in the Failures tab
Is this a bug?
How can I make sure Locust kills and restarts users, either manually or on a timeout?
My attempt to regularly raise RescheduleTaskImmediately did not help.
My locustfile.py:
#!/usr/bin/env python
import time
import random
from locust import HttpUser, task, between, TaskSet
from locust.exception import InterruptTaskSet, RescheduleTaskImmediately

URL_LIST = [
    "/url1",
    "/url2",
    "/url3",
]

class QuickstartTask(HttpUser):
    wait_time = between(0.1, 0.5)
    connection_timeout = 15.0
    network_timeout = 20.0

    def on_start(self):
        # Required to use the http_proxy & https_proxy
        self.client.trust_env = True
        print("New user started")
        self.client.timeout = 5
        self.client.get("/")
        self.client.get("/favicon.ico")
        self.getcount = 0

    def on_stop(self):
        print("User stopped")

    @task
    def track_and_trace(self):
        url = URL_LIST[random.randrange(0,len(URL_LIST))]
        self.client.get(url, name=url[:50])
        self.getcount += 1
        if self.getcount > 50 and (random.randrange(0,1000) > 990 or self.getcount > 200):
            print(f'Reschedule after {self.getcount} requests.')
            self.client.cookies.clear()
            self.getcount = 0
            raise RescheduleTaskImmediately
Each Locust user runs in its own greenlet (a lightweight, cooperatively scheduled thread). If it gets blocked waiting on a request, that user takes no further action.
self.client.get(url, name=url[:50], timeout=.1)
Something like this is probably what you need, potentially wrapped in a try/except so you can do something different when you get an HTTP timeout exception.
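The timeout-plus-try/except idea can be tried out without Locust. The sketch below swaps in the standard library's urllib and a deliberately slow local test server; SlowHandler and fetch are hypothetical names for this illustration, not part of Locust. It only shows that a request with a timeout returns control to the caller instead of blocking.

```python
import socket
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    """Local stand-in for a backend that stalls during a fail-over."""

    def do_GET(self):
        time.sleep(0.5)
        try:
            self.send_response(200)
            self.end_headers()
        except OSError:
            pass  # the client has already given up

    def log_message(self, *args):
        pass  # keep the demo quiet


# Ignore any proxy environment variables so the request really hits localhost.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url, timeout):
    """Return the HTTP status, or None if the request times out or fails."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except (socket.timeout, urllib.error.URLError):
        return None


server = HTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

result = fetch(url, timeout=0.1)  # gives up after 0.1 s instead of hanging
server.shutdown()
server.server_close()
print(result)
```

In a Locust task the same shape would be the timeout= argument shown above on self.client.get(...), wrapped in a try/except that decides what the user does next.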
In my experience, the problem you're describing with the charts in the Locust UI has nothing to do with the errors your Locust users are hitting. I've seen this behavior when multiple people attempt to access the Locust UI simultaneously. Locust uses Flask to create and serve the UI. Flask by itself (at least the way Locust is using it) doesn't do well with multiple connections.
If Person A starts using Locust UI and starts a test, they'll see stats and everything working fine until Person B loads the Locust UI. Person B will then see things working fine but Person A will experience issues as you describe, with the test seemingly stalling and charts not updating properly. In that state, sometimes starting a new test would resolve it temporarily, other times you need to refresh. Either way, A and B would be fighting between each other for a working UI.
The solution in this case would be to put Locust behind a reverse proxy using something such as Nginx. Nginx then maintains a single connection to Locust and all users connect through Nginx. Locust's UI should then continue to work for all connected users with correctly updating stats and charts.
I am saving a user's database connection. On the first time they enter in their credentials, I do something like the following:
self.conn = MySQLdb.connect(
    host='aaa',
    user='bbb',
    passwd='ccc',
    db='ddd',
    charset='utf8'
)
cursor = self.conn.cursor()
cursor.execute("SET NAMES utf8")
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')
I then have the conn ready to go for all the user's queries. However, I don't want to re-connect every time the view is loaded. How would I store this "open connection" so I can just do something like the following in the view:
def do_queries(request, sql):
    user = request.user
    conn = request.session['conn']
    cursor = request.session['cursor']
    cursor.execute(sql)
Update: it seems like the above is not possible and not good practice, so let me re-phrase what I'm trying to do:
I have a SQL editor that a user can use after they enter their credentials (think of something like Navicat or SequelPro). Note this is NOT the default Django db connection -- I do not know the credentials beforehand. Once the user has 'connected', I would like them to be able to run as many queries as they like without me having to reconnect every time, just as in Navicat or SequelPro. How would this be done using Python, Django, or MySQL? Perhaps I don't really understand what is necessary here (caching the connection? connection pooling? etc.), so any suggestions or help would be greatly appreciated.
You could use an IoC container to store a singleton provider for you. Essentially, instead of constructing a new connection every time, it will only construct it once (the first time ConnectionContainer.connection_provider() is called) and thereafter it will always return the previously constructed connection.
You'll need the dependency-injector package for my example to work:
import MySQLdb
import dependency_injector.containers as containers
import dependency_injector.providers as providers

class ConnectionProvider():
    def __init__(self, host, user, passwd, db, charset):
        # The connection is opened once, when the singleton is first built
        self.conn = MySQLdb.connect(
            host=host,
            user=user,
            passwd=passwd,
            db=db,
            charset=charset
        )

class ConnectionContainer(containers.DeclarativeContainer):
    connection_provider = providers.Singleton(ConnectionProvider,
                                              host='aaa',
                                              user='bbb',
                                              passwd='ccc',
                                              db='ddd',
                                              charset='utf8')

def do_queries(request, sql):
    user = request.user
    conn = ConnectionContainer.connection_provider().conn
    cursor = conn.cursor()
    cursor.execute(sql)
I've hardcoded the connection string here, but it is also possible to make it variable depending on a changeable configuration. In that case you could also create a container for the configuration file and have the connection container read its config from there. You then set the config at runtime. As follows:
import MySQLdb
import dependency_injector.containers as containers
import dependency_injector.providers as providers

class ConnectionProvider():
    def __init__(self, connection_config):
        self.conn = MySQLdb.connect(**connection_config)

class ConfigContainer(containers.DeclarativeContainer):
    connection_config = providers.Configuration("connection_config")

class ConnectionContainer(containers.DeclarativeContainer):
    connection_provider = providers.Singleton(ConnectionProvider, ConfigContainer.connection_config)

def do_queries(request, sql):
    user = request.user
    conn = ConnectionContainer.connection_provider().conn
    cursor = conn.cursor()
    cursor.execute(sql)

# run code
my_config = {
    'host':'aaa',
    'user':'bbb',
    'passwd':'ccc',
    'db':'ddd',
    'charset':'utf8'
}
ConfigContainer.connection_config.override(my_config)
request = ...
sql = ...
do_queries(request, sql)
I don't see why you need a cached connection here, rather than just reconnecting on every request and caching the user's credentials somewhere, but I'll try to outline a solution that might fit your requirements.
I'd suggest looking at a more generic task first: caching something between subsequent requests that your app needs and that can't be serialized into Django's sessions.
In your particular case this shared value would be a database connection (or multiple connections).
Let's start with the simple task of sharing a counter variable between requests, just to understand what's actually happening under the hood.
Amazingly, neither answer has mentioned anything about the web server you might use!
Actually, there are multiple ways to handle concurrent connections in web apps:
1. Having multiple processes; every request comes into one of them at random
2. Having multiple threads; every request is handled by a random thread
3. p.1 and p.2 combined
4. Various async techniques, where a single process plus an event loop handles requests, with the caveat that request handlers shouldn't block for a long time
From my own experience, p.1-2 are fine for the majority of typical web apps.
Apache1.x could only work with p.1; Apache2.x can handle all of 1-3.
Let's start with the following Django app and run a single-process gunicorn web server.
I'm going to use gunicorn because, unlike Apache, it's fairly easy to configure (personal opinion :-)
views.py
import time
from django.http import HttpResponse

c = 0  # module-level counter, lives in the memory of one worker process

def main(request):
    global c
    c += 1
    return HttpResponse('val: {}\n'.format(c))

def heavy(request):
    time.sleep(10)
    return HttpResponse('heavy done')
urls.py
from django.contrib import admin
from django.urls import path
from . import views
urlpatterns = [
    path('admin/', admin.site.urls),
    path('', views.main, name='main'),
    path('heavy/', views.heavy, name='heavy')
]
Running it in a single process mode:
gunicorn testpool.wsgi -w 1
Here's our process tree - there's only 1 worker that would handle ALL requests
pstree 77292
-+= 77292 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1
\--- 77295 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1
Trying to use our app:
curl 'http://127.0.0.1:8000'
val: 1
curl 'http://127.0.0.1:8000'
val: 2
curl 'http://127.0.0.1:8000'
val: 3
As you can see, you can easily share the counter between subsequent requests.
The problem here is that you can only serve a single request at a time. If you request /heavy/ in one tab, / won't respond until /heavy/ is done.
Let's now use 2 worker processes:
gunicorn testpool.wsgi -w 2
This is what the process tree looks like:
pstree 77285
-+= 77285 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
|--- 77288 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
\--- 77289 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
Testing our app:
curl 'http://127.0.0.1:8000'
val: 1
curl 'http://127.0.0.1:8000'
val: 2
curl 'http://127.0.0.1:8000'
val: 1
The first two requests were handled by the first worker process, and the third one by the second worker process, which has its own memory space, so you see 1 instead of 3.
Note that your output may differ because processes 1 and 2 are selected at random. But sooner or later you'll hit a different process.
That's not very helpful for us, because we need to handle multiple concurrent requests and we would need to get each request handled by a specific process, which can't be done in the general case.
Most pooling techniques that come out of the box only cache connections within the scope of a single process; if your request gets served by a different process, a NEW connection needs to be made.
Let's move to threads
gunicorn testpool.wsgi -w 1 --threads 2
Again - only 1 process
pstree 77310
-+= 77310 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1 --threads 2
\--- 77313 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1 --threads 2
Now if you run /heavy/ in one tab, you'll still be able to query / and your counter will be preserved between requests!
Even if the number of threads grows or shrinks depending on your workload, it should still work fine.
Problems: you'll need to synchronize access to a shared variable like this using Python's thread synchronization techniques (read more).
Another problem is that the same user may need to issue multiple queries in parallel, e.g. by opening multiple tabs.
To handle that, you can open multiple connections on the first request, when you have the db credentials available.
If a user needs more connections than that, your app can wait on a lock until a connection becomes available.
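As a minimal sketch of the synchronization mentioned above, here is the shared counter protected by a threading.Lock, with plain threads standing in for the web server's request threads. SharedCounter and simulate are hypothetical names for this illustration.

```python
import threading


class SharedCounter:
    """A counter that is safe to increment from several threads."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self._value += 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def simulate(counter, n_threads=8, requests_per_thread=1000):
    """Pretend each thread is a worker thread handling many requests."""

    def worker():
        for _ in range(requests_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = simulate(SharedCounter())
print(total)  # 8000
```

The same pattern applies to any shared structure, including a dictionary of per-session connections: take the lock, touch the shared state, release the lock.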
Back to your question
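You can create a class along the following lines (the queue-based bookkeeping is just one way to fill in the details):
import queue
from contextlib import contextmanager
import MySQLdb

class ConnectionPool(object):
    def __init__(self, max_connections=4):
        self._pool = dict()  # session_id -> queue.Queue of idle connections
        self._max_connections = max_connections

    def preconnect(self, session_id, user, password, **connect_kwargs):
        # create multiple connections and put them into self._pool;
        # connect_kwargs carries the remaining MySQLdb.connect() arguments (host, db, ...)
        idle = queue.Queue()
        for _ in range(self._max_connections):
            idle.put(MySQLdb.connect(user=user, passwd=password, **connect_kwargs))
        self._pool[session_id] = idle

    @contextmanager
    def get_connection(self, session_id):
        # take an available connection (removing it from the queue marks it as allocated);
        # if there is none, wait until another thread returns one to the pool
        idle = self._pool[session_id]
        connection = idle.get()
        try:
            yield connection
        finally:
            # put it back to the pool
            idle.put(connection)

pool = ConnectionPool(4)

def some_view(request):
    session_id = ...
    with pool.get_connection(session_id) as conn:
        conn.query(...)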
This is not a complete solution: you'll also need to delete outdated connections that haven't been used for a long time.
If a user comes back after a long time and their connection has been closed, they'll need to provide their credentials again. Hopefully that's OK from your app's perspective.
Also keep in mind that Python threads have their performance penalties; I'm not sure whether this is an issue for you.
I haven't checked it for apache2 (too much configuration burden; I haven't used it for ages and generally use uwsgi), but it should work there too. I'd be happy to hear back from you if you manage to run it :)
And don't forget about p.4 (the async approach). You're unlikely to be able to use it on Apache, but it's worth investigating; keywords: django + gevent, django + asyncio. It has its pros and cons and may greatly affect your app's implementation, so it's hard to suggest a solution without knowing your app's requirements in detail.
It is not a good idea to do this synchronously in a web app context. Remember that your application may need to run in a multi-process/multi-thread fashion, and you normally cannot share a connection between processes. So if you create a connection for your user in one process, there is no guarantee that the query request will arrive at the same one. A better idea may be to have a single-process background worker that handles connections in multiple threads (one thread per session), runs the queries against the database, and returns the results to the web app. Your application should assign a unique ID to each session, and the background worker should track each thread by session ID. You may use Celery or any other task queue that supports async results. The design would be something like this:
             |<--|        |<--------------|                   |<--|
user (id: x) |   | webapp |   | queue |   | worker (thread x) |   | DB
             |-->|        |-->|       |-->|                   |-->|
You could also create a queue for each user for as long as they have an active session; that way you could run a separate background process for each session.
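The background-worker design described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: an in-memory sqlite3 database stands in for the user's MySQL connection, an in-process queue.Queue stands in for a real task queue such as Celery, and SessionWorker and get_worker are hypothetical names.

```python
import queue
import sqlite3
import threading


class SessionWorker(threading.Thread):
    """One thread per user session; the thread owns that session's connection."""

    def __init__(self, session_id):
        super().__init__(daemon=True)
        self.session_id = session_id
        self.requests = queue.Queue()

    def run(self):
        # The connection is created once and reused for every query of this session.
        conn = sqlite3.connect(":memory:")  # stand-in for the user's real database
        while True:
            item = self.requests.get()
            if item is None:  # stop signal
                break
            sql, reply = item
            try:
                reply.put(("ok", conn.execute(sql).fetchall()))
            except sqlite3.Error as exc:
                reply.put(("error", str(exc)))
        conn.close()

    def query(self, sql, timeout=5):
        """Called from the web app side: enqueue the SQL and wait for the result."""
        reply = queue.Queue(maxsize=1)
        self.requests.put((sql, reply))
        return reply.get(timeout=timeout)

    def stop(self):
        self.requests.put(None)


workers = {}
workers_lock = threading.Lock()


def get_worker(session_id):
    """Return the worker thread for a session, starting it on first use."""
    with workers_lock:
        worker = workers.get(session_id)
        if worker is None:
            worker = workers[session_id] = SessionWorker(session_id)
            worker.start()
        return worker


w = get_worker("session-x")
w.query("CREATE TABLE t (n INTEGER)")
w.query("INSERT INTO t VALUES (1), (2)")
status, rows = w.query("SELECT n FROM t ORDER BY n")
bad = w.query("SELECT * FROM missing")
w.stop()
w.join(timeout=5)
print(status, rows, bad[0])
```

Because each connection is only ever touched by its own thread, no locking is needed around the connection itself; only the workers dictionary is shared.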
I actually shared my solution to this exact issue. What I did here was create a pool of connections that you can specify the max with, and then queued query requests async through this channel. This way you can leave a certain amount of connections open, but it will queue and pool async and keep the speed you are used to.
This requires gevent and postgres.
Python Postgres psycopg2 ThreadedConnectionPool exhausted
I'm no expert in this field, but I believe that PgBouncer would do the job for you, assuming you're able to use a PostgreSQL back-end (that's one detail you didn't make clear). PgBouncer is a connection pooler, which allows you to re-use connections and avoid the overhead of connecting on every request.
According to their documentation:
user, password
If user= is set, all connections to the destination database will be done with the specified user, meaning that there will be only one pool for this database.
Otherwise PgBouncer tries to log into the destination database with client username, meaning that there will be one pool per user.
So, you can have a single pool of connections per user, which sounds just like what you want.
In MySQL land, the mysql.connector.pooling module allows you to do some connection pooling, though I'm not sure if you can do per-user pooling. Given that you can set up the pool name, I'm guessing you could use the user's name to identify the pool.
Regardless of what you use, you will likely have occasions where reconnecting is unavoidable (a user connects, does a few things, goes away for a meeting and lunch, comes back and wants to take more action).
I am just sharing my knowledge over here.
Install PyMySQL to use MySQL:
For Python 2.x
pip install PyMySQL
For Python 3.x
pip3 install PyMySQL
1. If you are open to using the Django framework, then it's very easy to run a SQL query without any re-connection.
In the settings.py file, add the lines below:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'test',
        'USER': 'test',
        'PASSWORD': 'test',
        'HOST': 'localhost',
        'OPTIONS': {'charset': 'utf8mb4'},
    }
}
In the views.py file, add these lines to get the data. You can customize the query according to your needs.
from django.db import connection

def connect(request):
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM Tablename")
    results = cursor.fetchall()
    return results
You will get the desired results.
Click here for more information about it
2. For python Tkinter
from Tkinter import *
import MySQLdb
db = MySQLdb.connect("localhost","root","root","test")
# prepare a cursor object using cursor() method
cursor = db.cursor()
cursor.execute("SELECT * FROM Tablename")
if cursor.fetchone() is not None:
    print("In If")
else:
    print("In Else")
cursor.close()
Refer to this for more information
PS: You can check this link for your question to reusing a DB connection for later.
How to enable MySQL client auto re-connect with MySQLdb?
I'm new-ish to Python and Pyramid so apologies if I'm trying to do the wrong thing here.
I'm currently running a Pyramid application inside of a Docker container, with the following entrypoint:
pipenv run pserve development.ini --reload
This serves my application correctly and I can then edit code directly inside the container. This all works fine. I then attempted to register this service to an instance of Netflix's Eureka Service Registry so I could then proxy to this service with a gateway (such as Netflix Zuul). I used Eureka's REST API to achieve this and again, this all worked fine.
However, when I go to shutdown the Pyramid service, I would like to send an additional HTTP request to Eureka to DELETE the registered service - This is ideal so I don't have to wait for expiry on Eureka and there will never be a window where Zuul might be proxying requests to a downed service.
The problem is that I cannot find a reliable way to run a shutdown event in Pyramid. Basically, when I stop the Docker container, the service exits with code 137 (which I believe is the result of a kill -9) and nothing ever happens. I've attempted using atexit as well as signal handlers for SIGKILL, SIGTERM, SIGINT, etc., and nothing ever happens. I've also tried running pserve without the --reload flag, but that still doesn't work.
Is there any way for me to reliably send this DELETE request right before the server and Docker container shut down?
This is the development.ini file I'm using:
[app:main]
use = egg:my-app
pyramid.reload_templates = true
pyramid.includes =
    pyramid_debugtoolbar
    pyramid_redis_sessions
    pyramid_tm
debugtoolbar.hosts = 0.0.0.0/0
sqlalchemy.url = mysql://root:root@mysql/keyblade
my-app.secret = secretkey
redis.sessions.secret = secretkey
redis.sessions.host = redis
redis.sessions.port = 6379
[server:main]
use = egg:waitress#main
listen = 0.0.0.0:8000
# Logging Configuration
[loggers]
keys = root, debug, sqlalchemy.engine.base.Engine
[logger_debug]
level = DEBUG
handlers =
qualname = debug
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = INFO
handlers = console
[logger_sqlalchemy.engine.base.Engine]
level = INFO
handlers =
qualname = sqlalchemy.engine.base.Engine
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic
[formatter_generic]
format = %(asctime)s %(levelname)-5.5s [%(name)s][%(threadName)s] %(message)s
There is no shutdown protocol/api for a WSGI application (there technically isn't one for startup either despite people using/hoping that application-creation is close to the same time the server starts handling requests). You may be able to find a WSGI server that provides some hooks (for example gunicorn provides http://docs.gunicorn.org/en/stable/settings.html#worker-exit), but the better approach is to have your upstream handle your servers disappearing via health checks. Expecting that you'll be able to send a DELETE reliably when things go wrong is very unlikely to be a robust solution.
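If a best-effort hook is still wanted, note that SIGKILL can never be caught by a process, whereas SIGTERM can; an exit code of 137 (128 + 9) suggests the process did not exit on SIGTERM and was then killed with SIGKILL. The following is a minimal, POSIX-only sketch of a SIGTERM handler using the standard library. deregister_from_registry is a hypothetical placeholder for the DELETE request, and the sketch says nothing about whether the signal actually reaches the Python process inside the container.

```python
import os
import signal
import time

cleanup_log = []


def deregister_from_registry():
    """Hypothetical cleanup; a real app would send the DELETE request here."""
    cleanup_log.append("deregistered")


def handle_sigterm(signum, frame):
    deregister_from_registry()
    # A real handler would typically finish by raising SystemExit so the
    # process actually stops; this demo keeps running to show the result.


previous = signal.signal(signal.SIGTERM, handle_sigterm)

# SIGKILL cannot be intercepted: the OS rejects the attempt to install a handler.
try:
    signal.signal(signal.SIGKILL, handle_sigterm)
    sigkill_catchable = True
except (OSError, ValueError):
    sigkill_catchable = False

# Simulate a stop request delivering SIGTERM to this process.
os.kill(os.getpid(), signal.SIGTERM)
time.sleep(0.1)  # give the interpreter a moment to run the handler

signal.signal(signal.SIGTERM, previous or signal.SIG_DFL)  # restore earlier handler
print(cleanup_log, sigkill_catchable)
```

Even with such a handler in place, a crash, a power loss, or a forced kill skips it entirely, which is why the health-check approach remains the more robust one.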
However, when I go to shutdown the Pyramid service, I would like to send an additional HTTP request to Eureka to DELETE the registered service - This is ideal so I don't have to wait for expiry on Eureka and there will never be a window where Zuul might be proxying requests to a downed service.
This is web-server specific, and Pyramid cannot provide an abstraction for it, as "your mileage may vary". Web server workers themselves cannot know when they are killed, because the kill is forced externally.
I would take an approach where an external process monitors the web server and performs the clean-up actions when it detects that the web server is no longer running. The definition of "no longer running" could be "not a single process alive". Then you just need a scheduled background job (cron) to check for this condition. Or, even better, run it on a separate monitoring instance on a different server, so it can also act when the server itself goes down.
When I use flash() in @app.before_request, I get what seems like a random number of repeated entries. Refreshing the page over and over gives me between 1 and 4 repeated messages.
There aren't any redirects.
My code is simply:
if app.config['INSTANCE'] == 'DEV':
    flash("This data is from the development DB")
Alternatively, I wasn't able to figure out how to access/modify the array of messages that flash() seems to append to other than in the template via get_flashed_messages(). Anyone know how?
You can access the list of waiting messages via flashes = session.get('_flashes', []). You can view the code on Github
On the note of why you're getting a few messages flashing, it's because you're making multiple requests (but probably don't know it). Your web-browser is probably asking for favicon.ico which is a request, so causes a flash, etc. If you're running in debug mode, your console window will show all the requests being handled. For example loading a simple flask example in Chrome causes this to show:
127.0.0.1 - - [21/Jun/2013 16:35:05] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [21/Jun/2013 16:35:05] "GET /favicon.ico HTTP/1.1" 404 -
One is my request to view the homepage, the other is Chrome asking for the favicon (and it being told it doesn't exist).
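Building on the session.get('_flashes', []) access mentioned above, one way to avoid the repeats would be to check whether the message is already waiting before flashing it again. The helper below is a hypothetical sketch: already_flashed is not a Flask function, a plain dict stands in for flask.session, and it assumes the queued entries are (category, message) pairs with "message" as the default category.

```python
def already_flashed(session, message, category="message"):
    """Return True if this (category, message) pair is already queued."""
    queued = session.get("_flashes", [])
    return any(tuple(item) == (category, message) for item in queued)


# A plain dict stands in for flask.session in this sketch.
dev_notice = "This data is from the development DB"
fake_session = {"_flashes": [("message", dev_notice)]}

print(already_flashed(fake_session, dev_notice))        # True
print(already_flashed(fake_session, "Something else"))  # False
print(already_flashed({}, dev_notice))                  # False
```

In the before_request function, the flash() call would then be guarded with "if not already_flashed(session, msg):". Skipping the flash for requests such as /favicon.ico is another option.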