I'm running Redis 2.0.2, 32-bit, from the Cygwin compilation here: http://code.google.com/p/servicestack/wiki/RedisWindowsDownload
I run it from the terminal. It works great for about 24 hours and then it crashes: no errors, it just closes. My config file has the defaults except:
# save 900 1
# save 300 10
# save 60 10000
appendonly no
appendfsync no
I tried using a newer version of Redis, the 2.2.5 win32 build from here: https://github.com/dmajkic/redis/downloads
However, while this one runs, it throws up an 'unpacking too many values' error when tasks are added to it with Celery 2.2.6.
I haven't run it long enough to see whether it experiences the same crash that 2.0.2 has after roughly 24 hours.
Also, I run FLUSHDB on Redis at 1am every day, but the crash can happen at any point in the day, normally around 24 hours after the last crash.
Any thoughts?
Thanks!
Additions
Sorry, I forgot to mention that Twisted is polling data every 20 seconds and storing it into Redis, which translates to roughly 700 thousand records a day, or 4-5 GB of RAM used. There is no problem with Twisted; I just thought it might be relevant to the question.
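For context, the polling loop is essentially the standard Twisted LoopingCall pattern; the sketch below is a simplified stand-in, not our actual code (the key name, payload and connection details are placeholders):

import json
import time

import redis
from twisted.internet import reactor, task

r = redis.Redis(host="localhost", port=6379)

def poll_and_store():
    # Placeholder for the real data source; each real poll stores a batch of
    # records, which is how the daily volume adds up.
    record = {"ts": time.time(), "value": 42}
    r.rpush("records", json.dumps(record))

task.LoopingCall(poll_and_store).start(20.0)  # poll every 20 seconds
reactor.run()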
Follow-up question
Thanks Dhaivat Pandya!
Are there key-value databases that are more supportive of the Windows environment?
Redis is not supposed to work on Windows, and the projects that try to make it work on Windows all have numerous bugs that make them unstable.
I got a problem when running a Python script with a MongoDB aggregation pipeline. The error said:
errno:24 Too many open files, full error: {'ok': 0.0, 'errmsg': 'error opening file "D:\MongoDB\Server\4.4\data/_tmp/extsort-doc-group.463": errno:24 Too many open files', 'code': 16814, 'codeName': 'Location16814'}
The server that hosts the MongoDB database is Windows Server 2016.
The problem goes away when I limit the amount of data by reducing the span from 7 days to 3 days; the script then runs successfully and gives me a result.
The script had been running for a couple of weeks with the 7-day setting and there was no problem.
As per the MongoDB docs here, it is recommended to set a high open file limit. On Ubuntu we generally do this by changing the limits in /etc/security/limits.conf, which are specific to user and limit type; different distros have different ways. For checking the current limits, a simple ulimit -a can be very helpful.
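If the script runs on Linux, the limit can also be checked (and, up to the hard limit, raised) from Python itself with the resource module; this is just a sketch and does not apply to Windows, where the limit is managed differently:

import resource

# Current soft/hard limits on open file descriptors (Linux/Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open files: soft={soft}, hard={hard}")

# Raise the soft limit up to the hard limit for this process; going beyond
# the hard limit would require root.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))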
IMO machines running databases should have high limits for open files and process count. There is also a bunch of recommendations from MongoDB related to paging and the disk types to use; I would recommend going through them to use MongoDB up to its potential.
I haven't worked on a Windows machine in a long time, but I am sure that if you look up how to increase the open file limit you will find it. Also, when you reduced the query from 7 days to 3 days, the number of files WiredTiger had to access to fetch the indexes, and the associated disk operations, also dropped, which is probably what allowed the query to run. Note that, unlike some databases, the file system organisation in MongoDB's WiredTiger is a bit different.
How come after 2 hours of running a model, I get a popup window saying:
Runtime disconnected
The connection to the runtime has timed out.
CLOSE RECONNECT
I had restarted my runtime and thought I had 12 hours to train a model. Any idea how to avoid this? My other question: is it possible to find out how much time is left before the runtime gets disconnected, using a TF or Python API?
The runtime gets disconnected when the notebook stays in "idle" mode for more than 90 minutes. This is an unofficial number, as Google Colab has published nothing official about it. This is how Colab gets away with it, answering rather cheekily:
An extract from the Official Colab FAQ
Where is my code executed? What happens to my execution state if I close the browser window?
Code is executed in a virtual machine dedicated to your account. Virtual machines are recycled when idle for a while, and have a maximum lifetime enforced by the system.
So to avoid this, keep your browser open and don't let your system sleep for a time greater than 90 minutes.
This also means that if you close your browser and then reopen the notebook within 90 minutes, you will still have all your running processes and session variables intact!
Also, note that currently you can run a notebook for a maximum of 12 hours (in the "non-idle" state, of course).
To answer your second question: this "idle state" behaviour is a Colab thing, so I don't think TF or Python will have anything to do with it.
So it is good practice to save your models to a folder periodically. This way, in the unfortunate event of your runtime getting disconnected, your work will not be lost, and you can simply restart your training from the latest saved model!
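For example, if you are training with tf.keras, a minimal sketch of periodic saving could look like the following; the model, data and paths here are just placeholders, not anything from your notebook:

import os

import numpy as np
import tensorflow as tf

# Dummy data just to make the sketch runnable; substitute your own dataset.
x_train = np.random.rand(100, 4).astype("float32")
y_train = np.random.rand(100, 1).astype("float32")

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Save a checkpoint after every epoch so a runtime disconnect does not lose
# the training progress.
os.makedirs("checkpoints", exist_ok=True)
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/model_{epoch:02d}.h5"
)

model.fit(x_train, y_train, epochs=5, callbacks=[checkpoint_cb])

Pointing filepath at a mounted Google Drive folder means the checkpoints also survive a full runtime reset, not just a disconnect.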
PS: I got the 90-minute figure from an experiment done by a fellow user.
This was originally seen in Python, but has since been replicated in C++. Here is the unit test that distills the problem down and reproduces the behavior on my new laptop. These are just local socket connections.
def test_zmq_publisher_duration(self):
    max_duration = 1.0
    t0 = time.time()
    socket = zmq.Context.instance().socket(zmq.PUB)
    duration = time.time() - t0
    print(socket)
    self.assertLess(duration, max_duration, msg="socket() took too long.")
On other computers, and on my old laptop, this runs in a fraction of a second. However, on my new laptop (beefy Dell Precision 7730) this takes about 44 seconds. I get similar results when creating a zmq.SUB (subscriber) socket.
If I step down into the socket() call, the two statements which consume all the time are as follows:
zmq/sugar/context.py

class Context:
    def instance(cls, io_threads=1):
        ...
        cls._instance = cls(io_threads=io_threads)
        ...

    def socket(self, socket_type, **kwargs):
        ...
        s = self._socket_class(self, socket_type, **kwargs)
        ...
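To narrow things down, here is a small stand-alone sketch (not part of our test suite) that times the two steps separately, so it is clear whether the Context construction or the socket creation is the slow part:

import time
import zmq

t0 = time.time()
ctx = zmq.Context.instance()   # first call constructs the singleton Context
t1 = time.time()
sock = ctx.socket(zmq.PUB)     # then the PUB socket itself is created
t2 = time.time()

print(f"Context.instance(): {t1 - t0:.3f}s")
print(f"socket(zmq.PUB):    {t2 - t1:.3f}s")

sock.close()
ctx.term()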
I am perplexed and baffled. Everything else on the laptop seems to be fine. Perhaps I have pip installed my dependent modules in some slightly different way? Could a previously installed zmq module versus pyzmq be causing problems? Perhaps it is something in the laptop setup from our IT department? I have tried running as administrator, running from within PyCharm, running from the command-line, and unplugging the network cable while running.
I am relatively new to Python and ZMQ, but we have been developing on this project for months with no performance issues. In production code, we have a MessageBroker class that contains most of the pub/sub architecture. The unit test above was created by simply pulling the first significant line of code out of our MessageBroker.Publisher constructor (which creates the socket). Even though the socket creation is SLOW on this computer, our application does still come up and run properly after the sockets get created. It just takes 7 minutes to start the application.
I suspect Ed's Law of Debugging: "The more bizarre the behavior, the more stupid the mistake."
This was apparently a Windows 10 or laptop firmware issue. Some updates got pushed by the IT department and things worked normally the next day. Here are the items that were installed, per the Event Viewer:
Installed KB4456655: Servicing stack update for Windows 10, version 1803: September 11, 2018 (stability improvements)
Installed KB4462930: Update for Adobe Flash Player
Installed KB4100347: Intel microcode updates
Installed KB4485449: Servicing stack update for Windows 10 v1803 - Feb. 12
Installed KB4487017: (Same description as KB4485449)
Installed KB4487038: Security update for Adobe Flash Player
We have been running PostgreSQL server 9.6 on FreeBSD 11.2 and tried to upgrade to version 10.4 last month. This week we tried again, upgrading to 10.5, hoping the issue might have been solved.
After the upgrade, we experience the following behavior running this select query:
SELECT assets_daily.close
FROM assets_daily
INNER JOIN assets_daily_conditions ON (assets_daily.id = assets_daily_conditions.daily_id)
WHERE (assets_daily.asset_id = 139 AND assets_daily_conditions.condition_id = 117 AND assets_daily.close_diff_10 IS NOT NULL)
ORDER BY assets_daily.date DESC
Whether we use Python libraries or just the psql command-line client that comes with PostgreSQL, we sporadically get different assets_daily.close values. Instead of the correct value, 10.8, we would get 0 or, even stranger, >.8.
Running the same query again and again, this happens about every 10,000 runs. Sometimes right after 50 runs, sometimes only after 20,000 or more. So it's very sporadic.
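To make the sporadic failures easier to catch, we run the query in a loop along these lines (a simplified sketch using psycopg2; the connection details and the expected value are placeholders):

import psycopg2

QUERY = """
SELECT assets_daily.close
FROM assets_daily
INNER JOIN assets_daily_conditions ON (assets_daily.id = assets_daily_conditions.daily_id)
WHERE (assets_daily.asset_id = 139 AND assets_daily_conditions.condition_id = 117
       AND assets_daily.close_diff_10 IS NOT NULL)
ORDER BY assets_daily.date DESC
"""

conn = psycopg2.connect(dbname="mydb", user="myuser")  # placeholder credentials
conn.autocommit = True  # mirror psql's autocommit behaviour
expected = 10.8         # the value the first row should always return

with conn.cursor() as cur:
    for i in range(50000):
        cur.execute(QUERY)
        value = cur.fetchone()[0]
        if value != expected:
            print(f"run {i}: got {value!r} instead of {expected}")

conn.close()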
Here's some more information: assets_daily.close is a float8 field. The tables are not small (assets_daily has 8 million records, assets_daily_conditions about 45 million). There are no other clients or processes accessing the database at the same time. We updated FreeBSD to the latest 11.2R and tried installing postgresql-server10 both from the binary package and from ports. The server logs show no errors and there is plenty of free memory.
We first tried to upgrade using pg_upgrade, but after we got the errors we also tried a pg_dump from 9.6 and a re-import from scratch into 10.5. Same results.
I know this is hard or impossible to reproduce unless we provide a really detailed example with a dataset, etc. But my hope is that someone out there has had similar issues and maybe even a solution, or has some hints or ideas about where we could start looking.
I'm having trouble with SaltStack since I started two different developments with Python using its API. Sometimes the services crash, and when I try to start them again or reboot the servers, it takes more than 24 hours for them to start. The logs are empty, and if I start salt-master in debug mode nothing happens.
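For reference, the Python API usage is essentially the standard LocalClient pattern; the sketch below is simplified (the target and function are placeholders, the real code runs application-specific modules):

import salt.client

# Runs on the master; LocalClient publishes the job to the targeted minions.
local = salt.client.LocalClient()
result = local.cmd("*", "test.ping", timeout=30)
print(result)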
# dpkg -l| grep salt
ii salt-common 2014.1.5+ds-1~bpo70+1
ii salt-master 2014.1.5+ds-1~bpo70+1
Note: it's happening to me on two different machines. OS: Debian sid.
Whoa, 24 hours is a ridiculous amount of time to start up.
Have you added any custom grains, modules or external pillars?
Have you tried upgrading? 2014.1.10 is now out.