Python: Windows registry hive access NOT using registry APIs

Python: Windows registry hive access NOT using registry APIs - python

I am trying to extract some data out of the Windows registry, both the software hive and ntuser.dat from XP computers. Currently I'm using reg.exe to load the hive and _winreg to extract the data. I need to use reg.exe as the computers I'm backing up data from are usually offline and I'm putting the hard drive from them in an external drive bay and loading the hives from that in another Windows session. It's not feasible to boot up the computers being backed up as they are often failing hard drives or otherwise unbootable.
I've seen a utility called hivex which runs under Linux which combines a c-module with a python wrapper to allow for read-only (limited write) access to the Windows registry, without using the Windows Registry APIs. Sadly there doesn't appear to be a Windows version of hivex, assumingly because no one figured a need to access the Windows registry under Windows by directly accessing the hive files.
I'd love to drop the dependency of reg.exe being called by subprocess.Popen() as calling an external executable has a host of issues, plus it makes the backup utility platform limited.
Does anyone know of a python module which allows for direct access of the hive files themselves? I already know of, and am currently using _winreg, so suggesting that would be less than helpful. Thanks in advance.

I'm not sure how much better it is, but the pywin32 library supplies bindings to most of the windows API. I don't know the windows API well enough to know if you can open arbitrary hive files, however it could be worth a quick look (the release contains a CHM with the full API mapping).

Did you have a look to regobj it provides pythonic access to registry value (but it is still based on _winreg)

Is your problem with calling an external application or using the registry APIs? If it is the former you can load and unload hives yourself using RegLoadKey / RegUnLoadKey. If it is the latter then I'm sure somebody has written a C library to parse hives directly. A quick Google search gave me Microsoft's Offline Registry Library.

Related

Transform a python file, that uses packages like pandas, to be usable by someone that has no python environment

I have a client for whom I have created a program that utilizes a variety of data and machine learning packages. The client would like for the program to be easily run without installing any type of python environment. Is this possible?
I am assuming the best bet would be to transform the .py file into a .exe file but am unsure of how to do this if I have packages that need to be installed before the program can be run.
Are there websites that exist that allow you to easily host complex .py files on them to be run by anyone that accesses the URL?

I think you are looking for "freezing", which package everything including the interpreter, libs and packages into a single executable file.
There are several tools for this purpose:
https://wiki.python.org/moin/Freeze
https://docs.python-guide.org/shipping/freezing/

I think the use of colaboratory that is cloud service provided by Google might be better. Your client who has to sign up for Google account can not only run the python program, but also utilize any major python packages on the cloud (of course, it's possible to install the necessary packages into the client's cloud space), without constructing the python environment on client's local PC. What's more, it's at free!

python copying directory and reading text files Remotely

I'm about to start working on a project where a Python script is able to remote into a Windows Server and read a bunch of text files in a certain directory. I was planning on using a module called WMI as that is the only way I have been able to successfully remotely access a windows server using Python, But upon further research I'm not sure i am going to be using this module.
The only problem is that, these text files are constantly updating about every 2 seconds and I'm afraid that the script will crash if it comes into an MutEx error where it tries to open the file while it is being rewritten. The only thing I can think of is creating a new directory, copying all the files (via script) into this directory in the state that they are in and reading them from there; and just constantly overwriting these ones with the new ones once it finishes checking all of the old ones. Unfortunately I don't know how to execute this correctly, or efficiently.
How can I go about doing this? Which python module would be best for this execution?

There is Windows support in Ansible these days. It uses winrm. There are plenty of Python libraries that utilize winrm, just google it, but Ansible is very versatile.
http://docs.ansible.com/intro_windows.html
https://msdn.microsoft.com/en-us/library/aa384426%28v=vs.85%29.aspx

I've done some work with WMI before (though not from Python) and I would not try to use it for a project like this. As you said WMI tends to be obscure and my experience says such things are hard to support long-term.
I would either work at the Windows API level, or possibly design a service that performs the desired actions access this service as needed. Of course, you will need to install this service on each machine you need to control. Both approaches have merit. The WinAPI approach pretty much guarantees you don't invent any new security holes and is simpler initially. The service approach should make the application faster and required less network traffic. I am sure you can think of others easily.
You still have to have the necessary permissions, network ports, etc. regardless of the approach. E.g., WMI is usually blocked by firewalls and you still run as some NT process.
Sorry, not really an answer as such -- meant as a long comment.
ADDED
Re: API programming, though you have no Windows API experience, I expect you find it familiar for tasks such as you describe, i.e., reading and writing files, scanning directories are nothing unique to Windows. You only need to learn about the parts of the API that interest you.
Once you create the appropriate security contexts and start your client process, there is nothing service-oriented in the, i.e., your can simply open and close files, etc., ignoring that fact that the files are remote, other than server name being included in the UNC name of the file/folder location.

COMException accessing WMP from Plex Plugin (using Python for .NET)

My Goal
I'm trying to create a plugin for Plex Media Server (PMS) that will interface with WMP (Windows Media Player) to get metadata about Windows media library items.
The Setup
PMS uses Python 2.7 as its primary script host. Plex Plugins are
written in Python, though they operate in a sandboxed capacity. Unfortunately there's sad little documentation on what exactly the boundaries are on this sandboxed functionality.
I decided to use Python for .NET to access the Windows SDK to interface with WMPLib.
Python for .NET (http://pythonnet.github.io/) is a Python
library used to access functionality from .NET assemblies from within
the Python runtime.
I created a .NET assembly to access WMPLib, which
is a part of the Windows SDK designed to programmatically access the
functionality of WMP. WMPLib is basically a COM Interop wrapper for
.NET targeting wmp.dll.
What's Working?
The whole chain from Python on through the COM-based WMP access is working. If I fire up the embedded Plex Script Host that comes with Plex Media Server (a version of Python 2.7), I can readily access data from WMP. That means that the following links in the chain are all working:
Python is loading Python for .NET
Python for .NET is loading my .NET assembly
My .NET assembly is loading WMPLib (Interop.WMPLib.dll, a .NET assembly for COM Interop)
WMPLib is successfully opening and utilizing wmp.dll (accessed from C:\Windows\System32)
What's Not Working?
Activating the COM Interop part of this chain is not working from within the sandboxed Plex plugin. Again, this plugin is written in standard Python, but something is subtly different about the Python execution environment once the sandboxing code has run. I get the following exception when running the WMP access code from within the plugin:
COMException: Exception from HRESULT: 0xC00D1327
at WMPLib.IWMPPlayer4.get_mediaCollection()
In this scenario I know that Python for .NET is working, because I've already loaded and accessed other things from my .NET assembly at this point.
C:\Windows\System32 is at the front of PATH variable. I'm assuming that COM dlls should be located via the PATH environment variable (This seems to say so), but I'm not entirely certain of that. How a COM assembly should be located in this unique scenario (Python accessing .NET accessing COM) is one of the biggest unknowns for me.
The Questions
How might the Plex plugin sandbox be changing the Python execution environment such that accessing the COM assembly is no longer working?
How should the COM assembly be located and accessed by the environment in this case?
Does it require specific permissions that the Plex sandbox may have locked down?
Maybe I should at least win some kind of prize for arriving at a question that intersects so many different technologies in a uniquely confusing way...
Edit 1
I've ruled out any .NET related issues entirely, thanks to #Paulo's suggestion below. I'm now doing all interop with WMPLib through the comtypes Python library. Now I'm getting the following error:
COMError: (-1072884953, None, (None, None, None, 0, None))
Though -1072884953 is a different error code, a little digging around makes it appear that this error is associated with (maybe equivalent to?) the same error that I was getting through .NET interop (this post makes it appear so).
So now the facts that I'm stuck with are these:
wmp.dll is loading in all cases (which #Paulo helped me to figure
out below).
When the code accessing WMP is run outside the Plex sandbox environment, library items can be accessed from WMP just fine.
When the code accessing WMP is run inside the Plex sandbox environment, library items cannot be accessed from WMP.
The error code that I get (whether from .NET or Python based COM interop is NS_E_CURL_INVALIDPATH: The URL contains a path that is not valid. This error appears to be involved with attempted playback in most cases.
This is odd because I've never gotten as far as playback in my scenario... I'm only attempting to call wmp.mediaCollection
So the Plex sandbox truly seems to be key in this scenario. Any further ideas?
Edit 2
Minimally, this is the code that it takes to fail:
from comtypes.client import CreateObject
wmp = CreateObject("{6BF52A52-394A-11d3-B153-00C04F79FAA6}")
collection = wmp.mediaCollection
That collection = wmp.mediaCollection is where the error happens.
So there really aren't any parameters being passed in that could be causing the failure. To reiterate, this code runs fine in the general Python 2.7 context. It only fails within the Plex plugin sandbox. I don't know how to get details on how the Plex sandbox may be changing the execution environment. I'd imagine my answer lies in that direction.

Let me get this straight, and correct me if I'm wrong:
You have a running instance of Python 2.7, Plex Media Server
You're using a library, Python for .NET, to load a .NET into your process
You're loading WMPLib, an imported COM interop assembly, in .NET to use the Windows Media Player library through Python for .NET
Let's clear this out:
0xC00D1327 is NS_E_CURL_INVALIDPATH:
The URL contains a path that is not valid.
It seems like a legitimate object error, not a COM error.
The DLL search order has little to do with it, since wmp.dll is registered with a full path under InProcServer32 registry keys for each provided class, and that's what ultimately matters.
In fact and as you stated, if you've reached this point, it's clearly not a problem about loading .NET assemblies or COM not finding a DLL.
Now, to the questions:
Since the error seems legit, you might not have access to the media collection (Internet zone?), or WMP is not correctly registered/installed, or some codec is missing or not correctly registered/installed, etc.
What are you loading into WMP? Try with basic things, such as local .WAV, .MP3, .AVI and .MPG files, then try more advanced formats e.g. MPEG4-encoded videos, or maybe remote locations.
You should attempt a more direct approach, although I can't really vouch which is better: win32com (part of pywin32) or comtypes.
It was a long while ago since I've looked at them, so take this with a grain of salt: with comtypes, you're able to use your COM objects much like regular Python objects with properties and methods, while win32com seems more inclined to do things by runtime name dispatching.
At least you'll be punching something you really have to put up with (Python) instead of loading .NET for something that doesn't even require it (WMP).
I don't know what that sandboxing is about, but my guess is that it's a Python-only sandbox, not something that restricts usage of the operating system.
EDIT: Are you providing a filename (e.g. C:\path\to\file.mp4) where a URL is expected (file:///C:/path/to/file.mp4), or vice-versa? I guess you must show the failing code and what values are being provided.

Accessing a JET (.mdb) database in Python

Is there a way to access a JET database from Python? I'm on Linux. All I found was a .mdb viewer in the repositories, but it's very faulty. Thanks

MDB Tools is a set of open source libraries and utilities to facilitate exporting data from MS Access databases (mdb files) without using the Microsoft DLLs. Thus non Windows OSs can read the data. Or, to put it another way, they are reverse engineering the layout of the MDB file.
Jackcess is a pure Java library for reading from and writing to MS Access databases. It is part of the OpenHMS project from Health Market Science, Inc. . It is not an application. There is no GUI. It's a library, intended for other developers to use to build Java applications.
ACCESSdb is a JavaScript library used to dynamically connect to and query locally available Microsoft Access database files within Internet Explorer.
Both Jackcess and ACCESSdb are much newer than MDB tools, are more active and have write support.

Install your distribution's packaged version of mdbtools, use mdb-export to export the Jet data to text files, import the data into a SQLite database, and have a combination of code and data that works in almost any computing environment you might get your hands on.

Probably the most simple solution:
Download VirtualBox and install Windows and MS access in it.
Write a small Python server which use ODBC to access the database and which receives commands from a network socket.
On Linux, connect to the server in the virtual machine and access the database this way.
This gives you full access to all features. Every other solution will either limit the features you can use (for example, you won't be able to modify the data) or be pretty unsafe.

If you build the CVS version of mdb-tools, it works rather well. It fixed a lot of issues I had trying to use the one in the repositories related to memo field size. mdb-tools is basically a dead project, but people have still been occasionally contributing code to the CVS. The build in Ubuntu is from 2004 I think.
CVS instructions here:
http://sourceforge.net/scm/?type=cvs&group_id=2294
If using Ubuntu, before downloading the sources you'll want to enable source repositories and do:
apt-get build-dep mdbtools
That will get the required packages you'll need to manually build the sources from CVS.

portable non-relational database

I want to experiment/play around with non-relational databases, it'd be best if the solution was:
portable, meaning it doesn't require an installation. ideally just copy-pasting the directory to someplace would make it work. I don't mind if it requires editing some configuration files or running a configuration tool for first time usage.
accessible from python
works on both windows and linux
What can you recommend for me?
Essentially, I would like to be able to install this system on a shared linux server where I have little user privileges.

I recommend you consider BerkelyDB with awareness of the licensing issues.
I am getting very tired of people recommending BerkleyDB without qualification - you can only distribute BDB systems under GPL or some unknown and not publicly visible licensing fee from Oracle.
For "local" playing around where it is not in use by external parties, it's probably a good idea. Just be aware that there is a license waiting to bite you.
This is also a reminder that it is a good idea when asking for technology recommendations to say whether or not GPL is acceptable.
From my own question about a portable C API database, whilst a range of other products were suggested, none of the embedded ones have Python bindings.

Metakit is an interesting non-relational embedded database that supports Python.
Installation requires just copying a single shared library and .py file. It works on Windows, Linux and Mac and is open-source (MIT licensed).

BerkleyDB

If you're used to thinking a relational database has to be huge and heavy like PostgreSQL or MySQL, then you'll be pleasantly surprised by SQLite.
It is relational, very small, uses a single file, has Python bindings, requires no extra priviledges, and works on Linux, Windows, and many other platforms.

Have you looked at CouchDB? It's non-relational, data can be migrated with relative ease and it has a Python API in the form of couchdb-python. It does have some fairly unusual dependencies in the form of Spidermonkey and Erlang though.
As for pure python solutions, I don't know how far along PyDBLite has come but it might be worth checking out nonetheless.

BerkeleyDB : (it seems that there is an API binding to python : http://www.jcea.es/programacion/pybsddb.htm)

Have you looked at Zope Object Database?
Also, SQLAlchemy or Django's ORM layer makes schema management over SQLite almost transparent.
Edit
Start with http://www.sqlalchemy.org/docs/05/ormtutorial.html#define-and-create-a-table
to see how to create SQL tables and how they map to Python objects.
While your question is vague, your comments seem to indicate that you might want to define the Python objects first, get those to work, then map them to relational schema objects via SQLAlchemy.

If you're only coming and going from Python you might think about using Pickle to serialize the objects. Not going to work if you're looking to use other tools to access the same data of course. It's built into python, so you shouldn't have any privileged problems, but it's not a true database so it may not suit the needs of your experiment.

Adding a reference to TinyDB here since this page is showing at the top of many searches. It is a portable non-relational database in python. It stores python dicts into a local json file and makes them available for database ops similar to mongodb. It also has an extension to port to mongodb's commands, the difference being that instead of working on another system server you'll be operating on a local json file.
And unlike the presently chosen answer, it is under a permissive MIT open license.
Links:
TinyDB site
Tinydb on Github
Basic Usage
Usage
Support forum
Implementations

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.