can linux command line programs see python temporary files?

I have a simple web-server written using Python Twisted. Users can log in and use it to generate certain reports (pdf-format), specific to that user. The report is made by having a .tex template file where I replace certain content depending on user, including embedding user-specific graphs (.png or similar), then use the command line program pdflatex to generate the pdf.
Currently the graphs are saved in a tmp folder, and that path is then put into the .tex template before calling pdflatex. But this probably opens up a whole pile of problems when the number of users increases, so I want to use temporary files (tempfile module) instead of a real tmp folder. Is there any way I can make pdflatex see these temporary files? Or am I doing this the wrong way?

Without any code it's hard to tell you how, but:
Is there any way I can make pdflatex see these temporary files?
Yes. A named temporary file has a real path on disk, which you can read from its name attribute:
>>> import tempfile
>>> with tempfile.NamedTemporaryFile() as temp:
...     print(temp.name)
...
/tmp/tmp7gjBHU

As commented you can use tempfile.NamedTemporaryFile. The problem is that this will be deleted once it is closed. That means you have to run pdflatex while the file is still referenced within python.
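A minimal sketch of that ordering, assuming Python 3.7+ on Linux: the external command is started inside the with-block, so the file still exists when the child process opens it. The helper name run_with_temp_file and the stand-in reader command are hypothetical. In the real server the command would be the pdflatex call, with temp.name written into the .tex template first.

```python
import subprocess
import sys
import tempfile


def run_with_temp_file(data, command_prefix, suffix=".png"):
    """Write data to a named temporary file and run an external command on it.

    The command runs inside the with-block, so the file is still on disk.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix) as temp:
        temp.write(data)
        temp.flush()  # make sure the bytes are on disk before the child reads them
        result = subprocess.run(
            command_prefix + [temp.name],
            capture_output=True,
            check=True,
        )
    # The temporary file has been deleted by the time we get here.
    return result.stdout


# Stand-in for pdflatex: a child Python process that reports the file size.
reader = [sys.executable, "-c", "import os, sys; print(os.path.getsize(sys.argv[1]))"]
output = run_with_temp_file(b"fake image bytes", reader)
print(output.decode().strip())
```

The flush() call matters: without it the data may still sit in Python's buffer when the external program reads the file.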
As an alternative way you could just save the picture with a randomly generated name. The tempfile is designed to allow you to create temporary files on various platforms in a consistent way. This is not what you need, since you'll always run the script on the same webserver I guess.
You could generate random file names using the uuid module:
import uuid
for i in range(3):
    print(str(uuid.uuid4()))
Then you save the pictures explicitly using the random name and insert that name into the tex-file.
After running pdflatex you explicitly have to delete the file, which is the drawback of that approach.
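A rough sketch of this approach, with the explicit cleanup placed in a finally clause so the file is removed even if the report step fails. The helper unique_graph_path and the placeholder bytes are hypothetical, and a scratch directory from tempfile.mkdtemp stands in for the server's real tmp folder only to keep the example self-contained.

```python
import os
import tempfile
import uuid


def unique_graph_path(directory, extension=".png"):
    """Build a path with a random, collision-resistant file name."""
    return os.path.join(directory, uuid.uuid4().hex + extension)


work_dir = tempfile.mkdtemp()  # stand-in for the server's tmp folder
graph_path = unique_graph_path(work_dir)
try:
    with open(graph_path, "wb") as handle:
        handle.write(b"placeholder graph data")
    # ... insert graph_path into the .tex template and run pdflatex here ...
    existed_during_run = os.path.exists(graph_path)
finally:
    # Explicit cleanup: nothing deletes the file for us.
    os.remove(graph_path)
    os.rmdir(work_dir)
```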

Related

How to Open and Save Files in Parallel Directory Without Knowing Full Path

I am wondering if there is an easy way to access 'parallel' directories (See photo for what I am talking about... I don't know what else to call them, please correct me if they are called something else!) from a Python file without having to input the string path.
The basic structure I intend to use is shown in the picture. The structure will be used across different computers, so I need to avoid just typing in "C:\stuff_to_get_there\parent_directory\data\file.txt" because "C:\stuff_to_get_there" will not be the same on different computers.
I want to store the .py files in their own directory, then access the data files in data directory, and save figures to figures directory. I was thinking of trying os module but not sure if that's the correct way to go.
parent directory
    scripts
        .py files
    figures
        save files here
    data
        .txt files stored here
Thanks for any help!
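One possible sketch, assuming Python 3 and the layout above: derive the sibling directories from the location of the running script instead of typing the full path. The function name project_dirs and the example path are hypothetical. Inside a real script you would pass __file__.

```python
from pathlib import Path


def project_dirs(script_path):
    """Return the data and figures directories next to the scripts directory."""
    scripts_dir = Path(script_path).resolve().parent
    parent_dir = scripts_dir.parent
    return {"data": parent_dir / "data", "figures": parent_dir / "figures"}


# A made-up path is used here; in a script this would be project_dirs(__file__).
dirs = project_dirs("/stuff_to_get_there/parent_directory/scripts/analysis.py")
print(dirs["data"] / "file.txt")
print(dirs["figures"])
```

Because everything is computed relative to the script, the same code works wherever the parent directory ends up on another computer.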

Restrict the Python file to read and write

I'm trying to restrict write and read access to a Python file. Suppose I have the following code:
with open('test.py', 'w+') as file:
    file.write('''
open("document.txt", "w+").write("Hello, World!")
open("document.txt", "r+").read()
''')
Executing this code creates a new file, test.py, containing two lines of code that write and read another file.
I want test.py to hit a PermissionError when it runs, so that it cannot create a new file or read one. It should still be executable, and normal commands should work in it, but it must not be able to access other files.

If I read you correctly, this is not a python problem, but an environment problem. I understand the question as something like 'how do I prevent python code from executing arbitrary reads or writes?'. There would be a trivial solution (modifying the generated test.py so it throws an error) but presumably that's not what you want.
The easiest way to make python hit a PermissionError... is to make sure it doesn't have permissions. So run your code as a user with extremely limited permissions (specifically, no write permissions anywhere), or perhaps no default permissions at all, and use something like file ACLs to grant permission to read specific files explicitly from a more privileged sentinel process. (This assumes you are running Linux, but there are likely other ways to do this on different OSs.)
Alternatively, look into various sandboxing techniques to give you a python interpreter with the relevant modules replaced with modules which throw errors, or an environment where outside modification is impossible.
It would help if you made it clearer why this is important, and why you are writing a python script with another python script (is this just an example of malicious action?).

You could technically change the permissions of the file you're trying to access, on the filesystem itself.
Check the previous thread about changing permissions
os.chmod(path, <permission value>)
where a permission value of 0o000 removes read, write, and execute access for everyone. On Linux, root can still access the file.
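A small sketch of what that looks like in practice, assuming Python 3 on Linux. It creates a scratch file, strips all permission bits, and then tries to read it. The variable names are illustrative only, and the result of the read attempt depends on the user running it, since root bypasses these checks.

```python
import os
import stat
import tempfile

# Create a scratch file to experiment on.
fd, path = tempfile.mkstemp()
os.close(fd)

# Remove read, write, and execute permission for owner, group, and others.
os.chmod(path, 0o000)
mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))

try:
    with open(path) as handle:
        handle.read()
    outcome = "opened (probably running as root)"
except PermissionError:
    outcome = "PermissionError"
print(outcome)

# Deleting depends on the directory's permissions, not the file's.
os.remove(path)
```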

share files and functions through multiple projects

it's a kind of open question but please bear with me.
I am working on several projects (mainly with pandas) and I have created my standard approach to manage them:
1. create a main folder for all files in a project
2. create a data folder
3. have all the output in another folder
and so on.
One of my main activities is data cleaning, and in order to standardize it I have created a dictionary file where I store the various translation of the same entity, e.g. USA, US, United States, and so on, so that the files I am producing are consistent.
Every time I create a new project, I copy the dictionary file in the data directory and then:
xls = pd.ExcelFile(r"data/dictionary.xlsx")
df_area = xls.parse("area")
and after, to translate the country name into my standard, I call:
join_column, how_join = "country", "inner"
df_ct = pd.concat([
    df_ct.merge(df_area, left_on=join_column, right_on="country_name", how=how_join),
    df_ct.merge(df_area, left_on=join_column, right_on="alternative01", how=how_join),
])
and finally I check that I am not losing any records to a failed join.
Over and over the same thing.
I would like to have a way to remove all this unnecessary cut and paste (of the file and of the code). Also, the file I used on the first projects are already deprecated and I need to update them (and sometime the code) when I need to process new data. Sometimes I also lose track of where is the latest dictionary file! Overall it's a lot of maintenance, which I believe might be saved.
Creating my own package is the way to go or is it a little too much ambitious?
Is there another shortcut? Overall it's not a lot of code, but multiplied by several projects.
Thanks for any insight, your time going through this is appreciated.

In the end I decided to create my own package.
It took some time, so I am happy to share the details of the process (I run Python in Jupyter on Windows).
The first step is to decide where to store the code.
In my case it was C:\Users\my_user\Documents
You need to add this directory to the list of directories where Python looks for packages. This is achieved by running the following statements:
import sys
sys.path.append("C:\\Users\\my_user\\Documents")
In order to run the above statement each time you start python, it must be included into a file in the directory (this directory might vary depending on your installation):
C:\Users\my_user\.ipython\profile_default\startup
the file can be named "00-first.py" ("50-middle.py" or "99-last.py" will also work)
To verify everything is working, restart python and run the command:
print(sys.path)
you should be able to see your directory at this point.
create a folder with the package name in your directory, and a subfolder (I prefer not to have code in the main package folder)
C:\Users\my_user\Documents\my_package\my_subfolder
Put an empty file named "__init__.py" in each of the two folders: my_package and my_subfolder. At this point you should already be able to import your empty package from Python:
import my_package as my_pack
inside my_subfolder create a file (my_code.py) which will store the actual code
def my_function(name):
    print("Hallo " + name)
Modify the outer __init__.py file to include shortcuts. Add the following:
from my_package.my_subfolder.my_code import my_function
You should be able now to run the following in python:
my_pack.my_function("World!")
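The steps above can be sketched end to end as a script, assuming Python 3. Two changes are made purely for illustration: a scratch directory from tempfile.mkdtemp stands in for C:\Users\my_user\Documents, and my_function returns the greeting instead of printing it so the result can be checked.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Scratch directory standing in for the directory that holds your packages.
base = Path(tempfile.mkdtemp())
subfolder = base / "my_package" / "my_subfolder"
subfolder.mkdir(parents=True)

# An empty __init__.py in the subfolder, and the actual code next to it.
(subfolder / "__init__.py").write_text("")
(subfolder / "my_code.py").write_text(
    'def my_function(name):\n    return "Hallo " + name\n'
)

# The outer __init__.py exposes the shortcut.
(base / "my_package" / "__init__.py").write_text(
    "from my_package.my_subfolder.my_code import my_function\n"
)

# Make the directory visible to the import system, then import the package.
sys.path.append(str(base))
importlib.invalidate_caches()
import my_package as my_pack

greeting = my_pack.my_function("World!")
print(greeting)  # Hallo World!
```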
Hope you find it useful!

How do I open/convert .pkz files?

A python package that I'm using has data stored under a single file with a .pkz extension. How would I unzip (?) this file to view the format of data within?

Looks like what you are referencing is just a one-off file format used in sample data in scikit-learn. The .pkz is just a compressed version of a Python pickle file which usually has the extension .pkl.
Specifically you can see this in one of their sample files here along with the fact they are using the zlib_codec. To open it, you can go in reverse or try uncompressing from the command line.
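A hedged sketch of "going in reverse", assuming the file really is a zlib-compressed pickle as described above. The helper name load_pkz is hypothetical, and the demo writes its own small file instead of using the scikit-learn data. Only unpickle files you trust, since loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile
import zlib


def load_pkz(path):
    """Decompress a zlib-compressed pickle file and return the stored object."""
    with open(path, "rb") as handle:
        compressed = handle.read()
    return pickle.loads(zlib.decompress(compressed))


# Round-trip demo: build a small .pkz-style file, then read it back.
sample = {"faces": [1, 2, 3]}
fd, demo_path = tempfile.mkstemp(suffix=".pkz")
with os.fdopen(fd, "wb") as handle:
    handle.write(zlib.compress(pickle.dumps(sample)))

restored = load_pkz(demo_path)
print(type(restored), restored)
os.remove(demo_path)
```

Printing the type of the loaded object is usually the quickest way to see what format the data is in.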

Before attempting to open a PKZ file, you'll need to determine what kind of file you are dealing with and whether it is even possible to open or view the file format.
Files which are given the .PKZ extension are known as Winoncd Images Mask files, however other file types may also use this extension. If you are aware of any additional file formats that use the PKZ extension, please let us know.
How to open a PKZ file:
The best way to open a PKZ file is to simply double-click it and let the default associated application open the file. If you are unable to open the file this way, it may be because you do not have the correct application associated with the extension to view or edit the PKZ file.
If that works, you have a program installed that can open the file. Let's say that program is called pkzexecutor.exe. From Python, you just have to do:
import subprocess
path_to_program = 'C:\\Windows\\System32\\pkzexecutor.exe'
path_to_file = 'C:\\Users\\Desktop\\yourfile.pkz'
# Launch the program with the file as its only argument
subprocess.call([path_to_program, path_to_file])

From the source code for fetch_olivetti_faces, the file appears to be downloaded from http://cs.nyu.edu/~roweis/data/ and originally has a .mat file extension, meaning it is actually a MATLAB file. If you have access to MATLAB or another program which can read those files, try opening it from there with the original file extension and see what that gives you.
(If you want to try opening this file in Python itself, then perhaps give this question a look: Read .mat files in Python )

Python watchdog event not returning entire src_path

I'm using python watchdog to keep track of what files have been changed locally. Because I'm not keeping track of an entire directory but specific files, I'm using watchdog's event.src_path to check if the changed file is the one I'm looking for.
I'm using the FileSystemEventHandler and on_modified, printing the src_path. However, when I edit a file that should have the path /home/user/project/test in gedit, I get two paths, one that looks like /home/user/project/.goutputstream-XXXXXX and one that looks something like this: home/user/project/. I never get the path I'm expecting. I thought there may have been something wrong with watchdog or my own code, but I tested the exact same process in vi, nano, my IDE (PyCharm), Sublime Text, Atom...and they all gave me the src_path I'm expecting.
I'm wondering if there is a workaround for gedit, since gedit is the default text editor for many Linux distributions...Thanks in advance.

From the Watchdog GitHub readme:
Vim does not modify files unless directed to do so. It creates backup files
and then swaps them in to replace the files you are editing on the
disk. This means that if you use Vim to edit your files, the
on-modified events for those files will not be triggered by watchdog.
You may need to configure Vim appropriately to disable this
feature.
As the quote says, your issue is due to how these text editors modify files. Rather than directly modifying the file, they create "buffer" files that store the edited data. In your case this file is probably .goutputstream-XXXXXX. When you hit save, your original file is deleted and the buffer file is renamed into its place. So your second path is probably the result of the original file being deleted. Sometimes these files serve as backups instead, but they still cause similar issues.
By far the easiest way to solve this issue is to disable this way of saving in your chosen text editor. In gedit this is done by unchecking the "Create a backup copy of file before saving" option within preferences. This will stop those backup files from being created and simplify life for watchdog.
Image and preference info shamelessly stolen from this AskUbuntu question
For more information (and specific information for solving vim/vi) see this issue on the watchdog GitHub.
Basically for Vim you need to run these commands to disable the backup/swapping in feature:
:set nobackup
:set nowritebackup
You can add them to your .vimrc to automate the task
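If changing editor settings is not an option, another possibility is to account for the rename in your own handler. The sketch below is plain Python and does not import watchdog. The helper event_touches is hypothetical, and it assumes you call it from on_modified with event.src_path, and from on_moved with both event.src_path and event.dest_path. Whether gedit's save actually arrives as a moved event on your system is an assumption worth checking by printing the events first.

```python
import os


def event_touches(target, src_path, dest_path=None):
    """Return True if either path reported by an event is the watched file."""
    wanted = os.path.abspath(target)
    candidates = [src_path]
    if dest_path:
        candidates.append(dest_path)
    return any(os.path.abspath(path) == wanted for path in candidates)


watched = "/home/user/project/test"

# A direct modification, as reported for vi, nano, and similar editors.
print(event_touches(watched, "/home/user/project/test"))

# A buffer file renamed into place: only the destination matches.
print(event_touches(watched,
                    "/home/user/project/.goutputstream-XXXXXX",
                    "/home/user/project/test"))
```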
