Bokeh Server Files: load_from_config = False - python

on Bokeh 0.7.1
I've noticed that when I run the bokeh-server, files appear in the directory that look like bokeh.data, bokeh.server, and bokeh.sets if I use the default backend, or redis.db if I'm using Redis. I'd like to run my server from a clean start each time, because I've found that if these files persist, performance degrades severely over time.
While looking through the API, I found the option to turn "load_from_config" from True to False. However, tinkering with this didn't seem to resolve the situation (on 0.7.1 it seems to only control log-in information?). Is there a good way to resolve this and eliminate the need to manually remove these files each time? And what is the advantage of having these files in the first place?

These files store data and plots persistently on a bokeh-server, for instance if you want to publish a plot so that it will always be available. If you are just using the server locally and always want a "clean slate", you can run with --backend=memory to use the in-memory data store for Bokeh objects.

Related

How can one download the outputs of historical Azure ML experiment Runs via the python API

I'm trying to write a script which can download the outputs from an Azure ML experiment Run after the fact.
Essentially, I want to know how I can get a Run by its runId property (or some other identifier).
I am aware that I have access to the Run object when I create it for the purposes of training. What I want is a way to recreate this Run object later in a separate script, possibly from a completely different environment.
What I've found so far is a way to get a list of ScriptRun objects from an experiment via the get_runs() function. But I don't see a way to use one of these ScriptRun objects to create a Run object representing the original Run and allowing me to download the outputs.
Any help appreciated.
I agree that this could probably be better documented, but fortunately, it's a simple implementation.
This is how you get a Run object for an already-submitted run with azureml-sdk>=1.16.0 (for the older approach, see my answer here):
from azureml.core import Workspace
ws = Workspace.from_config()
run = ws.get_run('YOUR_RUN_ID')
Once you have the run object, you can call methods like:
.get_file_names() to see what files are available (the logs in azureml-logs/ and logs/azureml/ will also be listed)
.download_file() to download an individual file
.download_files() to download all files that match a given prefix (or all the files)
See the Run object docs for more details.
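For example, here is a minimal sketch that recovers a run and pulls its outputs (the run ID, file name, and local paths below are placeholders):
import tarfile  # not needed here; see below for the actual imports
from azureml.core import Workspace

# Recreate the run object from a separate script or environment
ws = Workspace.from_config()
run = ws.get_run('YOUR_RUN_ID')  # placeholder run ID

# List everything the run produced (also lists azureml-logs/ and logs/azureml/)
print(run.get_file_names())

# Download a single file
run.download_file(name='outputs/model.pkl', output_file_path='model.pkl')

# Download everything under outputs/ into a local directory
run.download_files(prefix='outputs/', output_directory='./run_outputs')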

Display MPLD3 graph offline

I have a couple of graphs that need to be shown on a local website, they're made with MPLD3, and I used the save_html option. However, I've just been told that the graphs need to be able to be viewed offline, so I wanted to know if there was a way to do this without mpld3.show(), because I need the graphs embedded in the website.
Please elaborate if possible what you mean by "local website". It sounds like you have an index.html file on your hard drive that you're rendering in a browser.
If that's the case and you want this to work with no internet connection, then you'll likely have to embed the D3 JavaScript dependency and the mpld3 JavaScript dependency into the HTML file after you save it. I think the default behavior is to retrieve those libraries from a CDN rather than embedding them in full.
Another option would be to try using the fig_to_html() function kwargs d3_url= and mpld3_url= to set the paths to your locally stored D3 and mpld3 libraries, using a "file://" prefix rather than the "https://" prefix (again, this just avoids loading the dependencies via a CDN).
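A minimal sketch of that second option (the file:// paths below are placeholders for wherever you keep your local copies of d3 and mpld3):
import matplotlib.pyplot as plt
import mpld3

# Build a simple figure
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [4, 5, 6])

# Point the generated HTML at local copies of the JavaScript libraries
html = mpld3.fig_to_html(
    fig,
    d3_url='file:///path/to/d3.v3.min.js',
    mpld3_url='file:///path/to/mpld3.min.js',
)

# Embed the returned HTML fragment in your page
with open('plot.html', 'w') as f:
    f.write(html)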

Treating a directory like a file in Python

We have a tool which is designed to allow vendors to deliver files to a company and update their database. These files (generally of predetermined types) are sent through our web-based transport system, a new record is created in the db for each one, and the files are moved into a new structure when delivered.
We have a new request from a client to use this tool to pass entire directories through without parsing every record. Imagine the client makes digital cars: this tool allows the delivery of the digital nuts and bolts and tracks each part, but they also want to deliver a directory with all of the assets that went into creating a digital bolt, without adding each asset as a new record.
The issue is that the original code doesn't have a nice way to handle these passthrough folders, and making it work would require a lot of rewriting. We'd obviously need to create a new function that runs around the time of the directory walk, pulls out each folder that matches this passthrough, and handles it separately. The problem is that all the tools which do the transport, db entry, and delivery expect files, not folders.
My thinking: what if we could treat that entire folder as a file? That way the current file-level tools don't need to be modified; we'd just need to add the "conversion" step. After generating the manifest, what if we used a library to turn the folder into a "file", send that, and then turn it back into a "folder" after ingest? The most obvious way to do that is ZIP files, and the current delivery tool does handle ZIPs, but that is slow and some of these deliveries are very large, which means that if something goes wrong during transport the entire ZIP fails.
Is there a method which we can use which doesn't necessarily compress the files but just somehow otherwise can treat a directory and all its contents like a file, so the rest of the code doesn't need to be rewritten? Or something else I'm missing entirely?
Thanks!
You could use tar files. Python has great support for them, and it is customary in *nix environments to use them as backup files. For compression you could use gzip (also supported by the standard library and great for streaming).
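A minimal sketch of the pack/unpack round trip with the standard library's tarfile module (the directory and archive names here are just placeholders):
import tarfile

# Pack the whole directory into a single archive.
# Mode 'w' writes an uncompressed tar; 'w:gz' adds gzip compression.
with tarfile.open('bolt_assets.tar.gz', 'w:gz') as tar:
    tar.add('bolt_assets/', arcname='bolt_assets')

# ... transport 'bolt_assets.tar.gz' like any other file ...

# After ingest, turn it back into a folder.
with tarfile.open('bolt_assets.tar.gz', 'r:gz') as tar:
    tar.extractall(path='delivered/')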

How can I reduce the access time on large Excel files?

I would like to process a large data set of a mechanical testing device with Python. The software of this device only allows to export the data as an Excel file. Therefore, I use the xlrd package which works fine for small *.xlsx files.
The problem I have is that when I open a typical data set (3-5 MB) with
xlrd.open_workbook(path_wb)
the access time is about 30 to 60 seconds. Is there a more effective and faster way to access Excel files?
You could access the file as a database via PyPyODBC instead, which may (or may not) be faster - you'd have to try it out and compare the results.
This method should work for both .xls and .xlsx files. Unfortunately, it comes with a couple of caveats:
As far as I am aware, this will only work on Windows machines, since you're relying on the Microsoft Jet database driver.
The Microsoft Jet database driver can be rather buggy, especially with dates.
It's not possible to create or modify Excel files (a note in the PyPyODBC exceltests.py file says: I have not been able to successfully create or modify Excel files.). Your question seems to indicate that you're only interested in reading files, though, so hopefully this will not be a problem.
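A rough sketch of what reading a sheet this way looks like (the driver name, file path, and sheet name are assumptions; adjust them to the ODBC driver actually installed on your machine and to your workbook):
import pypyodbc

# Placeholder connection string: driver name and DBQ path must match your setup
conn_str = (
    r'Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};'
    r'DBQ=C:\data\measurements.xlsx;'
)

conn = pypyodbc.connect(conn_str)
cursor = conn.cursor()

# Worksheets are addressed as [SheetName$]
cursor.execute('SELECT * FROM [Sheet1$]')
rows = cursor.fetchall()

cursor.close()
conn.close()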
I just figured out that the problem wasn't actually the access time; I was creating an object in the same step. Now, by creating the object separately, everything works fast and nicely.

Trying to automate the fpga build process in Xilinx using python scripts

I want to automate the entire process of creating ngs, bit, and mcs files in Xilinx, and have these files automatically associated with certain folders in the SVN repository. What I need to know is whether there is a log file created in the back end of the Xilinx GUI which records all the commands I run, e.g. open project, load file, synthesize, etc.
The other thing I have not been able to find is a log file that records the entire process of synthesis, map, place and route, and generate programming file, and in particular records any errors that the tool encountered during these processes.
If any of you can point me to such files, if they exist, that would be great. I haven't gotten much out of my search, but maybe I didn't look hard enough.
Thanks!
Well, it is definitely a nice project idea, but a good amount of work. There's always a reason why an IDE was built: a simple search yields the "Command Line Tools User Guide" for various versions of Xilinx ISE, e.g. for 14.3, 380 pages covering
Overview and list of features
Input and output files
Command line syntax and options
Report and message information
ISE is a GUI for various command line executables, most of them are located in the subfolder 14.5/ISE_DS/ISE/bin/lin/ (in this case: Linux executables for version 14.5) of your ISE installation root. You can review your current parameters for each action by right clicking the item in the process tree and selecting "Process properties".
On the Python side, consider using the subprocess module:
The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
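For example, a minimal sketch of driving one of those executables and capturing its console output (the executable path, script file, and log name below are placeholders; take the real invocation from the Command Line Tools User Guide):
import subprocess

# Placeholder invocation of an ISE command-line tool (e.g. xst); substitute
# the real executable and options for your flow.
cmd = ['/opt/Xilinx/14.5/ISE_DS/ISE/bin/lin/xst', '-ifn', 'synth.scr']

result = subprocess.run(cmd, capture_output=True, text=True)

# Keep the console output as a log file and fail loudly on errors
with open('xst_console.log', 'w') as f:
    f.write(result.stdout)
    f.write(result.stderr)

if result.returncode != 0:
    raise RuntimeError('xst failed with exit code %d' % result.returncode)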
Is this the entry point you were looking for?
As phineas said, what you are trying to do is quite an undertaking.
I've been there, done that, and there are countless challenges along the way. For example, if you want to move generated files to specific folders, how do you classify these files in order to figure out which files are which? I've created a project called X-MimeTypes that attempts to classify the files, but you then need a tool to parse the EDA mime type database and use that to determine which files are which.
However, there is hope, so to answer the two main questions you've pointed out:
To be able to automatically move generated files to predetermined paths. From what you are saying, it seems like you want to do this to make the versioning process easier? There is already a tool that does this for you based on "design structures" that you create and that can be shared within a team. The tool is called Scineric Workspace, so check it out. It also has built-in Git and SVN support, which ignores things according to the design structure and, in most cases, filters out everything generated by vendor tools without you having to worry about it.
You are looking for a log file that shows all commands that were run. As phineas said, you can check out the Command Line Tools User Guides for ISE, but be aware that the commands have changed again in Vivado. The log file of each process usually states the exact command, with its parameters, that was called; this should be close to the top of the report. If you are looking for one log file that contains everything, it does not exist. Again, Scineric Workspace supports invoking flows from the major vendors (ISE, Vivado, Quartus) and produces one log file for all processes together, while still allowing each process to create its own log file. Errors, warnings, etc. are also marked properly in this big report. Scineric has a Tcl shell mode as well, so your Python tool can run it in the background and parse the complete log file it creates.
If you have more questions on the above, I will be happy to help.
Hope this helps,
Jaco
