Frozen Python Application with cx_Freeze is very slow to start - python

I have a fairly complex python application that uses numpy, pandas, PySide, pyqtgraph, and matplotlib, among other packages. When I bundle the application with cx_Freeze on Windows, it comes in at 349MB.
My problem is the resulting executable has a very long startup time of about 15 seconds. When I say startup time, I mean the amount of time before any code gets executed. I have a simple script that prints "Hello" to the console, and even that takes about 15 seconds to run.
Does anyone know of a solution to this problem, or any ways to debug it? Is it slow because there are so many .dll files from so many packages?
EDIT:
Using a great tool called Process Monitor, I have narrowed down my problem to the pytz module. On one particular load, 20 seconds were spent querying the library.zip (where cx_Freeze puts all of the compiled bytecode) for pytz zoneinfo! I recently added pandas as a dependency, and pandas uses pytz.
See this picture for a sampling of the Process Monitor output:

The solution I found is to use Process Monitor to see if cx_Freeze is loading a module for an unreasonable amount of time. Using that tool, I also found that it was taking a long time (maybe 4 seconds) to load a particular matplotlib font. I removed it and my application worked fine.
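As a complement to Process Monitor, the same kind of question ("which module is taking an unreasonable amount of time?") can be roughed out from inside Python. The following is only a sketch: `time_imports` is a hypothetical helper, not part of cx_Freeze, and the module names in the demo are standard-library placeholders for whatever your application bundles. It measures import time only, so it would not catch file lookups that happen later at run time.

```python
import importlib
import sys
import time


def time_imports(module_names):
    """Import each module in order and return {name: seconds spent importing}."""
    timings = {}
    for name in module_names:
        start = time.perf_counter()
        importlib.import_module(name)
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Placeholder names: substitute the packages your application bundles.
    results = time_imports(["json", "decimal", "email"])
    # Slowest first, written to stderr so it does not mix with normal output.
    for name, seconds in sorted(results.items(), key=lambda item: item[1], reverse=True):
        print(f"{seconds:8.3f}s  {name}", file=sys.stderr)
```

A module that was already imported by an earlier entry reports close to zero, so the order of the list matters when packages depend on each other.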

Related

How to compile Python to exe with PyInstaller, leaving out the modules / dependencies?

I have a program written in Python using multiple large modules such as Tensorflow, tesseract, cv2, and others. I need to compile the program fairly often, and the problem is that all those large modules are compiled again and again, taking a lot of time and resources.
Is there a way to compile my code only?
That way I could just replace the old my_program.exe with the new my_program.exe, because those modules stay the same. Recompiling them is a waste of time.

Long import times for some modules in VS Code

When I run a Python script in VS Code, it delays execution by more than a second if, for example, Pandas or Numpy are part of the script's import statement.
If only libraries from the Python Standard Library are used in the imports, the script will start immediately.
A second doesn't sound like a lot, but it is to me because I've used Spyder so far, where the same script starts immediately without spending any noticeable time on imports. I was wondering if this is normal in VS Code or if there are configuration parameters to speed up import time.
Edit
A minimal example would be a script with content
import collections
import pandas
print("a string")
which in my opinion should only take a few milliseconds (not noticeable) for complete processing after clicking the "run" button. Without the pandas import, it actually does.
I think this is an important aspect, because slow "import speeds" hinder the unit-testing workflow.
pandas invokes numpy, and both of those are very large packages with many C DLLs. It takes many seconds for them to load. Once the DLLs have been loaded into Windows file cache, it should load quicker until they age out. It's just a fact.
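One way to check the claim about the file cache is to time a fresh interpreter several times in a row, outside any editor. This is a minimal sketch: `time_fresh_run` is a hypothetical helper, and `import json` is a standard-library stand-in so the example runs anywhere; replace it with `import pandas` to reproduce the case described above.

```python
import subprocess
import sys
import time


def time_fresh_run(code):
    """Return the seconds a brand-new interpreter needs to start and run `code`."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    baseline = time_fresh_run("pass")        # interpreter startup alone
    first = time_fresh_run("import json")    # placeholder for "import pandas"
    second = time_fresh_run("import json")   # files may now be cached by the OS
    print(f"baseline {baseline:.3f}s, first {first:.3f}s, second {second:.3f}s")
```

If the first run is much slower than the second, the time is going into reading files from disk rather than into anything the editor does.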

In setup.py, how can I code a function that will run at install time (on the installing computer) NOT at build time?

Q: In creating a python distribution using setup.py, how can I define Python code that will be run by pip at installation time (NOT at build time!) and runs on the installation target machine (NOT on the build machine!)
I have spent the past week searching the web for answers, reading contradictory documentation pages, and viewing multiple videos about setup.py. But in all this research I can't find even one working example of how install-time tasks can be specified.
Can someone point me at a complete working example?
Background: I am writing Python code for an application that is controlling a specialized USB peripheral my company is making, the processor where this will be installed is embedded/bundled with the peripheral and control software.
What's Needed: During the installation of the controlling application, I need the installing program (pip?) to write a configuration file on the install target machine. This file needs to include machine specific information about the target machine acquired using calls to functions imported from Lib/platform.py.
What I tried: Everything I've tried so far either runs at build time on the build machine (when setup.py runs and thus picks up the WRONG information for the target machine), or it merely installs the code I want to run on the target, but does not run it. It thus requires manual intervention by the user after the pip installation but prior to attempting to run the program they think they just installed, to run the auxiliary program that creates the installation config file. Only after this 2 step process can the user actually run the installed (and now properly configured) application.
Source code: Sorry. All my failed attempts to put functions in setup.py (which only run on the build machine, at build time) would only further confuse any readers and encourage more misleading wild goose chases down pointless rat holes.
If my users were sophisticated Python developers who are comfortable with command line error messages, the link that @sinoroc has provided in the previous comment would have been an interesting solution.
Given that my users are barely comfortable installing packages from the App Store or Google Play store, the referenced work around is probably not right for me.
But given that install time functions are regarded as bad practice, my workaround is to alter the installed program so that its first action is to check for the presence of the necessary configuration file every time the program runs.
While this checking is seemingly unnecessary after the first run, it consumes only minimal CPU resources and would be more robust if the configuration file is ever accidentally deleted.
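A minimal sketch of that first-run check might look like the following. The function name `ensure_config`, the JSON format, and the particular `platform` fields are assumptions made for illustration; the real application would record whatever machine-specific values it needs.

```python
import json
import platform
from pathlib import Path


def ensure_config(config_path):
    """Return the config, creating it from this machine's details if it is missing."""
    path = Path(config_path)
    if path.exists():
        # Normal case after the first run: just read what is already there.
        return json.loads(path.read_text(encoding="utf-8"))
    # First run (or the file was deleted): gather values on the target machine.
    config = {
        "system": platform.system(),
        "machine": platform.machine(),
        "node": platform.node(),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Calling `ensure_config(...)` as the first action of the program means the file is always written on the machine that runs it, never on the build machine.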

Launch times of compiled Python vs IronPython

I'm currently writing some Windows GUI applications in Python 2.7 using pyQT4 and pyInstaller. These compiled executables are taking about 3-6 seconds to open. I've eliminated UPX already and seen some time shaved off the load, and some more by not packaging it as a single exe but I'd rather have a single distributable file.
I was wondering if IronPython would be better at providing a quick-loading application. Or any of the other versions of Python, or better yet, if there is anything else I can do to minimize the wait for the GUI to be drawn on the screen.
IronPython does add a performance hit in a number of specific areas compared to Python. However, I would not draw the conclusion that it is always slower. There has already been a post on that topic: Why is IronPython faster than the Official Python Interpreter
The most important aspect to consider is have you fully optimized everything in your Python project? As in, have you performed any code profiling?
If the answer is yes then you could attempt to migrate to a different toolkit or language. If you chose IronPython (link) it would be wise to migrate your GUI to .NET (winforms, XAML, etc). Another option (albeit unpopular to most) would be to use tkinter. Tkinter is very powerful in that one can fully-customize the code to be nothing more and nothing less than the project requires. Of course that will add considerable typing time. But I have made Python apps that load and draw a window almost immediately and quite honestly almost looked like it came out of VS2012.

App created with PyInstaller has a slow startup

I have an application written in Python and 'compiled' with PyInstaller. It also uses PyQt for the GUI framework.
Running this application has a delay of about 10 seconds before the main window loads and is shown. As far as I can tell, this is not due to slowness in my code. Instead, I suspect this is due to the Python runtime initializing.
The problem is that this application is started with a custom launcher / taskbar application. The user will click the button to launch the app, see nothing appear to happen, and click elsewhere on another application. When my application shows its window, it cannot come to the foreground due to the rules for SetForegroundWindow.
I have access to the source for the PyInstaller win32 loader, the Python code, and even the launcher code.
My questions are:
How can I make this application start faster?
How can I measure the time spent in the first few seconds of the process's lifetime?
What is the generally accepted technique for reducing time until the first window is shown?
I'd like to avoid adding a splash screen for two reasons - one, I expect it won't help (the overhead is before Python code runs) and two, I just don't like splash screens :)
If I need to, I could probably edit the PyInstaller loader stub to create a window, but that's another route I'd rather not take.
I suspect that you're using pyinstaller's "one file" mode -- this mode means that it has to unpack all of the libraries to a temporary directory before the app can start. In the case of Qt, these libraries are quite large and take a few seconds to decompress. Try using the "one directory" mode and see if that helps?
Tell PyInstaller to create a console-mode executable. This gives you a working console you can use for debugging.
At the top of your main script, even before the first import is run, add a print "Python Code starting". Then run your packaged executable from the command line. This way you can get a clear picture whether the time is spent in PyInstaller's bootloader or in your application.
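In Python 3 syntax, that marker could be sketched as below. The `mark` helper and the `json` import are placeholders for illustration; the point is only that the first call runs before any heavy import. The printed time counts from when the script itself starts executing, so the bootloader's share is the wait you observe between launching the executable and the first line appearing.

```python
import sys
import time

_START = time.perf_counter()  # taken as early as possible in the main script


def mark(label, stream=sys.stderr):
    """Print `label` with the seconds elapsed since this script began executing."""
    elapsed = time.perf_counter() - _START
    print(f"[{elapsed:7.3f}s] {label}", file=stream, flush=True)
    return elapsed


mark("Python code starting")  # runs before any heavy import
import json                   # stand-in for the application's heavy imports
mark("imports finished")
```

If the first line shows up late but the gap between the two marks is small, the delay is in the bootloader or the operating system, not in the application's imports.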
PyInstaller's bootloader is usually quite fast in one-dir mode, but it can be much slower in one-file mode, because it depacks everything into a temporary directory. On Windows, I/O is very slow, and then you have antiviruses that will want to double check all those DLL files.
PyQt itself is a non-issue. PyQt is generated by SIP which generates very fast lazy bindings; importing the whole PyQt is faster than any other GUI library because it basically does nothing: all bindings to classes/functions are dynamically created when (and if!) you access them, saving lots of memory too.
If your application is slow at coming up, that will be true without PyInstaller as well. In that case, your only solution is either a splash screen (import just PyQt, create QApplication, create and display the splash screen, then import the rest of your program and run it), or rework your code. I can't help you much without details.
I agree with the above answers. My Qt Python program needed about 5 seconds to start up on a decent PC when using onefile mode. After I changed to --onedir, it only took around one second to start, almost immediately after the user double-clicks the exe file. The drawback is that there are many files in that directory, which is not so neat.
For my application, the long startup time almost entirely was caused by the antivirus system. Switching it off reduced the startup in my case from 3 minutes to less than 10secs!
To put these measurements into perspective: my application was bundled with extra data files (about 150 files with a payload of 250MB), besides carrying around Qt and numpy dependencies (numpy may depend on Intel MKL, which alone adds another 200MB to the bundle!). It didn't even help much that the tested system was running with a solid state drive...
In conclusion: if you have a large application with lots of dependencies, the startup time may be affected strongly by the antivirus system!
In case anyone is still having this issue, I resolved mine by running the exe locally and not from any shared drives. This took the startup time from over a minute to under 10 seconds.
I solved this by adding an exception to the anti-virus monitoring software (F-Secure). It removed the wait of several minutes before running.
I have 'compiled' a few wxPython apps using py2exe and cx_Freeze. None of them take more than 4 seconds to start.
Are you sure it's not your code?
Maybe some network or I/O resource call is holding up your app?
Have you tried a machine other than yours? Even the fastest hardware can be slow sometimes with the wrong software config, apps or OS, so try it.
Try timing it with timeit module.
I never used pyQT, but with wxPython the startup speed is OK, and after the first initialize if I close and open again, it's faster than the first time.
