Creating a Dask client results in an endless loop of errors - python

When I run this code:
from dask.distributed import Client
client = Client(n_workers=2, threads_per_worker=2, memory_limit='2GB', silence_logs='error')
client
I get an endless loop of these errors:
tornado.application - ERROR - Exception in callback <bound method Nanny.memory_monitor of <Nanny: None, threads: 2>>
Traceback (most recent call last):
File "C:\Users\szlau\anaconda3\lib\site-packages\tornado\ioloop.py", line 907, in _run
return self.callback()
File "C:\Users\szlau\anaconda3\lib\site-packages\distributed\nanny.py", line 414, in memory_monitor
process = self.process.process
Note: I have also tried reinstalling the whole Anaconda environment, and the error still persists.
I really have no idea what is causing this or how to eliminate it.
Any suggestions are welcome!

At a guess, you are running your code in a script. That means that when dask creates the cluster and its associated worker processes, those will, in turn, also run the script and attempt to create new clusters and processes.
You should guard your client and cluster creation with

if __name__ == "__main__":
    client = Client(...)
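For example, a minimal sketch of a guarded script (the submitted work is a placeholder):

from dask.distributed import Client

def main():
    # Cluster creation happens only in the parent process; worker
    # processes re-import this module but never enter this function.
    client = Client(n_workers=2, threads_per_worker=2,
                    memory_limit='2GB', silence_logs='error')
    result = client.submit(sum, [1, 2, 3]).result()  # placeholder work
    print(result)
    client.close()

if __name__ == "__main__":
    main()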

Related

apscheduler: returned more than one DjangoJobExecution -- it returned 2

In my project, the scheduler returns this error when executing a job. Please help.
This is the error in my console when the program runs:
Error notifying listener
Traceback (most recent call last):
File "C:\Users\angel\project\venv\lib\site-packages\apscheduler\schedulers\base.py", line 836, in _dispatch_event
cb(event)
File "C:\Users\angel\project\venv\lib\site-packages\django_apscheduler\jobstores.py", line 53, in handle_submission_event
DjangoJobExecution.SENT,
File "C:\Users\angel\project\venv\lib\site-packages\django_apscheduler\models.py", line 157, in atomic_update_or_create
job_id=job_id, run_time=run_time
File "C:\Users\angel\project\venv\lib\site-packages\django\db\models\query.py", line 412, in get
(self.model._meta.object_name, num)
django_apscheduler.models.DjangoJobExecution.MultipleObjectsReturned: get() returned more than one DjangoJobExecution -- it returned 2!
This is my code:

import logging

from django.conf import settings
from django.core.management.base import BaseCommand
from apscheduler.schedulers.background import BackgroundScheduler
from django_apscheduler.jobstores import DjangoJobStore

logger = logging.getLogger(__name__)

class Command(BaseCommand):
    help = "Runs apscheduler."
    scheduler = BackgroundScheduler(timezone=settings.TIME_ZONE, daemon=True)
    scheduler.add_jobstore(DjangoJobStore(), "default")

    def handle(self, *args, **options):
        self.scheduler.add_job(
            delete_old_job_executions,  # defined elsewhere in the project
            'interval', seconds=5,
            id="delete_old_job_executions",
            max_instances=1,
            replace_existing=True,
        )
        try:
            logger.info("Starting scheduler...")
            self.scheduler.start()
        except KeyboardInterrupt:
            logger.info("Stopping scheduler...")
            self.scheduler.shutdown()
            logger.info("Scheduler shut down successfully!")
Not sure if you're still having this issue. I had the same error and found your question. It turned out this happens only in the dev environment.
Because python3 manage.py runserver starts two processes by default, the code
seems to register two job records and then find two entries at the next run time.
With the --noreload option, it starts only one scheduler thread and works well. As the name implies, though, it won't automatically reload changes you make.
python3 manage.py runserver --noreload
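If the scheduler is also started when the app loads, another option (a sketch; MyAppConfig and the scheduler module are hypothetical names) is to start it only in the autoreloader's serving child process, which Django marks with the RUN_MAIN environment variable:

import os

from django.apps import AppConfig

class MyAppConfig(AppConfig):  # hypothetical app config
    name = 'myapp'

    def ready(self):
        # Plain `runserver` runs two processes; only the serving child
        # has RUN_MAIN='true'. Skipping the watcher process means the
        # scheduler and its job records are created exactly once.
        if os.environ.get('RUN_MAIN') != 'true':
            return
        from . import scheduler  # hypothetical module that starts the scheduler
        scheduler.start()

Note this guard only matters while the autoreloader is active; with --noreload or a production server, RUN_MAIN is unset and the scheduler should be started unconditionally.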
Not sure if you're still having this issue. I think you can use a socket to solve it.

kochat in use RuntimeError: main thread is not in main loop

kochat is a Korean chatbot, and I ran into a problem while practicing with it.
GitHub information: https://github.com/hyunwoongko/kochat
This is the environment setup:
Python 3.8
pip install kochat
JPype reinstalled using JPype1-1.2.0-cp38-cp38-win_amd64.whl (downloaded)
PyTorch version matching CUDA 11.1
Starting code:

### from kochat.proc import DistanceClassifier ###
from kochat.data import Dataset
from kochat.proc import GensimEmbedder, DistanceClassifier
from kochat.model import intent, embed
from kochat.loss import CenterLoss

dataset = Dataset(ood=True)
emb = GensimEmbedder(model=embed.Word2Vec())

# create the processor
clf = DistanceClassifier(
    model=intent.CNN(dataset.intent_dict),
    loss=CenterLoss(dataset.intent_dict)
)

# If possible, please use a margin-based loss function with DistanceClassifier.
# Distance-based metric-learning loss functions such as CenterLoss,
# COCOLoss, Cosface, and GaussianMixture are currently supported.

# train the model
clf.fit(dataset.load_intent(emb))

# model inference (intent classification)
clf.predict(dataset.load_predict("오늘 서울 날씨 어떨까", emb))  # "What will the weather in Seoul be like today?"
Error message:
C:\projectkyc\kochat3.8\Scripts\python.exe "C:/projectkyc/kochat3.8/test3.8/!from kochat.proc import DistanceClassifier.py"
Exception ignored in: <function Image.__del__ at 0x000001E7E1566D30>
Traceback (most recent call last):
File "c:\users\user\appdata\local\programs\python\python38\lib\tkinter\__init__.py", line 4014, in __del__
Exception ignored in: <function Variable.__del__ at 0x000001E7E154B430>
Traceback (most recent call last):
File "c:\users\user\appdata\local\programs\python\python38\lib\tkinter\__init__.py", line 351, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x000001E7E1566D30>
Traceback (most recent call last):
File "c:\users\user\appdata\local\programs\python\python38\lib\tkinter\__init__.py", line 4014, in __del__
self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (os_windows_x86.cpp:144), pid=11208, tid=0x00000000000037f4
# guarantee(result == EXCEPTION_CONTINUE_EXECUTION) failed: Unexpected result from topLevelExceptionFilter
#
# JRE version: Java(TM) SE Runtime Environment (8.0_281-b09) (build 1.8.0_281-b09)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.281-b09 mixed mode windows-amd64 compressed oops)
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
# An error report file with more information is saved as:
# C:\projectkyc\kochat3.8\test3.8\hs_err_pid11208.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Exception ignored in: <function Image.__del__ at 0x000001E7E1566D30>
Traceback (most recent call last):
File "c:\users\user\appdata\local\programs\python\python38\lib\tkinter\__init__.py", line 4014, in __del__
self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Process finished with exit code 1
If I have left out any information that is needed, I will add it immediately upon request.
This is such an annoying issue, and for me setting matplotlib.use('Agg') does not cut it. I don't want to compromise on this.
The only thing that seems to fix it somewhat consistently is closing the figure via plt.close() before creating a new plot. This is annoying, and if I forget, it happens again.
This is the same error as in the question "Matplotlib - Tcl_AsyncDelete: async handler deleted by the wrong thread?", but it was not clear where to apply the fix.
The places to apply it are kochat\utils\visualizer.py and kochat\proc\utils\visualizer.py. In both files, replace the line
from matplotlib import pyplot as plt
with:
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot as plt
This solves the problem.
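To illustrate why this works, here is a minimal standalone sketch: once the non-interactive Agg backend is selected before pyplot is imported, figures can be created and saved from a worker thread without ever touching Tkinter:

import threading

import matplotlib
matplotlib.use('Agg')  # must run before pyplot is imported anywhere
from matplotlib import pyplot as plt

def save_plot(path):
    # Build and save a figure entirely off the main thread.
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    fig.savefig(path)
    plt.close(fig)  # release the figure; no GUI event loop is involved

t = threading.Thread(target=save_plot, args=('out.png',))
t.start()
t.join()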

JRC JJ1000 + dronekit >> ERROR:dronekit.mavlink:Exception in MAVLink input loop

I am trying to connect to a JRC JJ1000 drone using dronekit + Python.
When executing the connect command:
dronekit.connect('com3', baud=115200, heartbeat_timeout=30)
I am getting the following error:
ERROR:dronekit.mavlink:Exception in MAVLink input loop
Traceback (most recent call last):
File "C:\Python37\lib\site-packages\dronekit\mavlink.py", line 211, in mavlink_thread_in
fn(self)
File "C:\Python37\lib\site-packages\dronekit\__init__.py", line 1371, in listener
self._heartbeat_error)
dronekit.APIException: No heartbeat in 5 seconds, aborting.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python37\lib\site-packages\dronekit\__init__.py", line 3166, in connect
vehicle.initialize(rate=rate, heartbeat_timeout=heartbeat_timeout)
File "C:\Python37\lib\site-packages\dronekit\__init__.py", line 2275, in initialize
raise APIException('Timeout in initializing connection.')
dronekit.APIException: Timeout in initializing connection.
I left no stone unturned, but made no progress. I also tried both Python 2.7 and 3.7 with the same result.
I have been getting the same error. I am using some custom code in a Docker container to run simulations with dronekit and ArduPilot. The error is intermittent. So far it seems like the only way to get the error to stop is to:
Close all Docker containers.
Open Windows Task Manager and wait for vmmem to lower its memory usage (5-10 minutes).
Try again.
Maybe the problems are related somehow. To me it seems like the connection might be in use by a previous instance that was not properly closed, since waiting for vmmem to free up resources appears to fix it. I would prefer a better solution if anyone finds one!
We are using Python code like this to connect:

from dronekit import connect
...
# try to connect 5 times
while connected == False and fails < 5:
    try:
        vehicle = connect(connection_string, wait_ready=True)
    except:
        fails += 1
        time.sleep(3)
        print("Failed to connect to local mavlink sleeping for 3 seconds")
    else:
        connected = True
Where the connection_string is of the form:
"tcp:host:port"
Also, the documentation states "If the baud rate is not set correctly, connect may fail with a timeout error. It is best to set the baud rate explicitly." Are you sure that you have the correct baud rate?
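For example (a sketch; 57600 is the usual rate for telemetry radios and 115200 for USB, but check your hardware's documentation):

from dronekit import connect

# Set the baud rate explicitly; if 115200 times out, a telemetry
# radio link is often 57600 instead.
vehicle = connect('com3', baud=57600, wait_ready=True, heartbeat_timeout=30)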

Perforce (P4) Python API complains about too many locks

I wrote an application that opens several subprocesses, each of which initiates its own connection to a Perforce server. After a while I get this error message in almost all of these child processes:
Traceback (most recent call last):
File "/Users/peter/Desktop/test_app/main.py", line 76, in p4_execute
p4.run_login()
File "/usr/local/lib/python3.7/site-packages/P4.py", line 665, in run_login
return self.run("login", *args, **kargs)
File "/usr/local/lib/python3.7/site-packages/P4.py", line 611, in run
raise e
File "/usr/local/lib/python3.7/site-packages/P4.py", line 605, in run
result = P4API.P4Adapter.run(self, *flatArgs)
P4.P4Exception: [P4#run] Errors during command execution( "p4 login" )
[Error]: "Fatal client error; disconnecting!
Operation 'client-SetPassword' failed.
Too many trys to get lock /Users/peter/.p4tickets.lck."
Does anyone have any idea what could cause this? I open my connections properly and have double-checked at all source locations that I disconnect from the server properly via disconnect.
Only deleting the .p4tickets.lck file manually helps, and then the error comes back after a few seconds.
The relevant code is here:
https://swarm.workshop.perforce.com/projects/perforce_software-p4/files/2018-1/support/ticket.cc#200
https://swarm.workshop.perforce.com/projects/perforce_software-p4/files/2018-1/sys/filetmp.cc#147
I can't see that there's any code path where the ticket.lck file would fail to get cleaned up without throwing some other error.
Is there anything unusual about the home directory where the tickets file lives? Like, say, it's on a network filer with some latency and some kind of backup process? Or maybe one that doesn't properly enforce file locks between all these subprocesses you're spawning?
How often are your scripts running "p4 login" to refresh and re-write the ticket? Many times a second? If you change them to not do that (e.g. only login if there's not already a ticket) does the problem persist?
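As a sketch of that last idea, "p4 login -s" reports ticket status without rewriting the ticket file, so each subprocess could check before logging in (assumes the password is supplied via p4.password or the environment):

from P4 import P4, P4Exception

p4 = P4()
p4.connect()
try:
    p4.run_login('-s')   # succeeds only if a valid ticket already exists
except P4Exception:
    p4.run_login()       # touches .p4tickets only when actually needed
p4.disconnect()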

ValueError: Unknown type <class 'redis.client.StrictPipeline'>

I develop locally on Windows 10, which is a problem for using the RQ task queue: it only works on Linux systems because it requires the ability to fork processes. I'm trying to extend the flask-base project https://github.com/hack4impact/flask-base/tree/master/app which can use RQ. I came across https://github.com/michaelbrooks/rq-win and love the idea of this repo; if I can get it working it will really simplify my life, since I develop on Windows 10 (64-bit).
After installing this library, I can queue a job in my views by running something like:
@login_required
@main.route('/selected')
def selected():
    messages = 'abcde'
    j = get_queue().enqueue(render_png, messages, result_ttl=5000)
    return j.get_id()
This returns a job_code correctly.
I changed the code in manage.py to:

from redis import Redis
from rq import Connection, Queue
from rq_win import WindowsWorker

@manager.command
def run_worker():
    """Initializes a slim rq task queue."""
    listen = ['default']
    REDIS_URL = 'redis://localhost:6379'
    conn = Redis.from_url(REDIS_URL)

    with Connection(conn):
        # worker = Worker(map(Queue, listen))
        worker = WindowsWorker(map(Queue, listen))
        worker.work()
When I try to run it with:
$ python -u manage.py run_worker
09:40:44
09:40:44 *** Listening on default...
09:40:58 default: app.main.views.render_png('{"abcde"}') (8c1b6186-39a5-4daf-9c45-f60e4241cd1f)
...\lib\site-packages\rq\job.py:161: DeprecationWarning: job.status is deprecated. Use job.set_status() instead
  DeprecationWarning
09:40:58 ValueError: Unknown type <class 'redis.client.StrictPipeline'>
Traceback (most recent call last):
File "...\lib\site-packages\rq_win\worker.py", line 87, in perform_job
queue.enqueue_dependents(job, pipeline=pipeline)
File "...\lib\site-packages\rq\queue.py", line 322, in enqueue_dependents
for job_id in pipe.smembers(dependents_key)]
File "...\lib\site-packages\rq\queue.py", line 322, in <listcomp>
for job_id in pipe.smembers(dependents_key)]
File "...\lib\site-packages\rq\compat\__init__.py", line 62, in as_text
raise ValueError('Unknown type %r' % type(v))
ValueError: Unknown type <class 'redis.client.StrictPipeline'>
So in summary, I think the jobs are being queued correctly within redis. However, when the worker process tries to grab a job off of the queue to process it, this error occurs. How can I fix this?
So after some digging, it looks like the root of the error is here, where the job_id being sent to the as_text function is, somehow, a StrictPipeline object. However, I have been unable to replicate the error locally; can you post more of your code? Also, I would try re-installing the redis, rq, and rq-win modules, and possibly try importing rq.compat.
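As a quick diagnostic before reinstalling, it may help to print the installed versions from Python, since a redis-py/rq version mismatch is a plausible culprit (a sketch):

import redis
from rq.version import VERSION as RQ_VERSION

print('redis-py:', redis.__version__)
print('rq:', RQ_VERSION)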
