I'm working on a webapp that uses SCORM so it can be included in our clients' learning management systems. This works by building a zip file that contains several files. Two of the files depend on the particular resource they want to include and the client themselves. I'd therefore like to generate these zip files automatically, on demand.
So imagine I have a "template" version of the ZIP, extracted to a directory:
/zipdir/fileA.html
/zipdir/fileB.xml
/zipdir/static-file.jpg
Let's imagine I use Django's template syntax in fileA and fileB. I know how to run a file through the template loader and render it, but how do I add that file to a ZIP file?
Could I create a base zip file (that doesn't have fileA and fileB in) and add the two renders to it? Otherwise, how would you go about cloning the zipdir to a temporary location and then rendering those two files to it before zipping it?
Using zipfile with StringIO will allow you to create a zip file in memory that you can later send to the client.
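A minimal sketch of that in-memory approach, using io.BytesIO (the Python 3 counterpart of StringIO) together with Django's render_to_string; the template names, context, and static file path are illustrative:

import io
import zipfile

from django.http import HttpResponse
from django.template.loader import render_to_string

def scorm_package(request):
    # Render the two client-specific files through the template engine.
    file_a = render_to_string("zipdir/fileA.html", {"client": request.user})
    file_b = render_to_string("zipdir/fileB.xml", {"client": request.user})

    # Build the archive entirely in memory.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("fileA.html", file_a)
        archive.writestr("fileB.xml", file_b)
        # Static files are copied in from the template directory unchanged.
        archive.write("/zipdir/static-file.jpg", arcname="static-file.jpg")

    response = HttpResponse(buffer.getvalue(), content_type="application/zip")
    response["Content-Disposition"] = 'attachment; filename="package.zip"'
    return response

This avoids the temporary-directory clone entirely: the static files are written straight from the template directory and only the two rendered files are generated per request.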
I have an Apache NiFi workflow which takes some files from an FTP server and puts them into a separate folder, using ListFTP, FetchFTP, and PutFile.
The problem is:
Those files, which were copied into the new folder, need to be decoded, which will result in several other files. From one file I will generate around 300 other files, and the original file will be deleted. (This is how it works and I cannot modify the behavior when it comes to the deletion of the file.)
Now I want to have these files decoded via a Python script, called from an ExecuteScript processor in NiFi, and eventually put the new files into another folder, either via PutFile or PutHDFS (irrelevant at this point).
As far as I know, NiFi executes its entire flow based on the UUID of each FlowFile. In my case, the original UUID of the FlowFile will no longer exist, and 300 other files will be generated without a UUID, as these files were never acknowledged by NiFi.
Is it possible to generate a new UUID for each of these new files and then send them to REL_SUCCESS?
When you call session.create(original), where original is the incoming FlowFile, the returned FlowFile is a child of the original. It will have its own UUID, and possibly the parent's UUID as a separate attribute (although that might be up to the processor to do, as SplitText does, for example).
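A rough Jython sketch of an ExecuteScript body along those lines; decode_flowfile() is a placeholder for your own decoding logic and is assumed to yield (filename, bytes) pairs for the ~300 generated files:

# ExecuteScript (Jython) sketch: create one child FlowFile per decoded result.
from org.apache.nifi.processor.io import OutputStreamCallback

class WriteBytes(OutputStreamCallback):
    def __init__(self, data):
        self.data = data

    def process(self, outputStream):
        outputStream.write(self.data)

original = session.get()
if original is not None:
    # decode_flowfile() is a placeholder for your decoding logic.
    for name, payload in decode_flowfile(original):
        child = session.create(original)                 # child gets its own UUID
        child = session.putAttribute(child, "filename", name)
        child = session.write(child, WriteBytes(payload))
        session.transfer(child, REL_SUCCESS)
    session.remove(original)                             # drop the consumed original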
Is there any way to save a file to a dictionary in Python?
(Indeed I am not asking how to export dictionaries to files here.)
Maybe a file could be pickled or transformed into a Python object and then saved.
Is this generally advisable?
Or should I only save the file's path to the dictionary?
How would I retrieve the file later on?
The background of my question relates to my usage of dictionaries as databases. I use the handy little module sqliteshelf as a form of permanent dictionary: https://github.com/shish/sqliteshelf
Each dataset includes a unique config file (~500 kB) which is retrieved from an application. When the respective dataset is opened, the config file is copied into and back out of the working directory of the application. I could instead use a folder where I save the config files. Yet it strikes me as more elegant to save them together with the other data.
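If storing the raw file contents directly in the dictionary turns out to be acceptable, a minimal sketch might look like this; the shelf keys and the working-directory handling are illustrative:

import os

def store_config(shelf, dataset_id, config_path):
    # Keep the config file's name and raw bytes as the dictionary value.
    with open(config_path, "rb") as f:
        shelf[dataset_id] = {
            "config_name": os.path.basename(config_path),
            "config_bytes": f.read(),
        }

def restore_config(shelf, dataset_id, working_dir):
    # Write the stored bytes back into the application's working directory.
    entry = shelf[dataset_id]
    target = os.path.join(working_dir, entry["config_name"])
    with open(target, "wb") as f:
        f.write(entry["config_bytes"])
    return target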
I'm using django-storages to store media files in an S3 bucket. However, I am occasionally converting or otherwise fiddling with the file to create new files, and this fiddling has to use files actually on my server (most of the conversion happens using process calls). When done I'd like to save the files back to S3.
In an ideal world, I would not have to do any changes to my functions when moving from local to S3. However, I'm unsure how I would do this considering that I have to create these intermediate local files to fiddle with, and then at the end know that the resultant file (which would also be stored on the local machine) needs to then be copied over to S3.
The best I can come up with is using a pair of context managers, one for the source file and one for the destination file. The source one would create a temporary file, copy the contents of the source file into it, and then that temporary file would be used, manipulated, etc. The destination one would take the final desired destination path on S3 and create a temporary local file; on exit it would create a key in the S3 bucket, copy over the contents of the temporary file, and delete it.
But this seems pretty complicated to me. It also requires me to wrap every single function that manipulates these files in two "with" clauses.
The only other solution I can think of is switching over to utilities that only deal with file-like objects rather than filenames, but this means I can't do subprocess calls.
Take a look at the built-in file storage API - this is exactly the use-case for it.
If you're using django-storages and uploading to S3, then you should have a line in your settings module that looks like
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
When you're developing locally and don't want to upload your media files to S3, in your local settings module, just leave it out so it defaults to django.core.files.storage.FileSystemStorage.
In your application code, for the media files that will get saved to S3 when you move from local development to staging, instantiate a Storage object using the class returned from the get_storage_class function, and use this object to manipulate the file. For the temporary local files you're "fiddling" with, don't use this Storage object (i.e. use Python's built-in file-handling functions) unless it's a file you're going to want to save to S3.
When you're ready to start saving stuff on S3, all you have to do is set DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage' again, and your code will work without any other tweaks. When that setting is not set, those media files will get saved to the local filesystem under MEDIA_ROOT, again without any need to change your application logic.
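A minimal sketch of that pattern; the helper name and file paths are illustrative, and it assumes django-storages is configured as above when running in production:

from django.core.files.base import ContentFile
from django.core.files.storage import get_storage_class

# Resolves to FileSystemStorage locally and S3BotoStorage when
# DEFAULT_FILE_STORAGE points at the django-storages backend.
media_storage = get_storage_class()()

def save_converted_file(name, local_path):
    # local_path is the file produced by your subprocess-based conversion;
    # hand its contents to whichever storage backend is configured.
    with open(local_path, "rb") as f:
        return media_storage.save(name, ContentFile(f.read()))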
I have a simple web-server written using Python Twisted. Users can log in and use it to generate certain reports (pdf-format), specific to that user. The report is made by having a .tex template file where I replace certain content depending on user, including embedding user-specific graphs (.png or similar), then use the command line program pdflatex to generate the pdf.
Currently the graphs are saved in a tmp folder, and that path is then put into the .tex template before calling pdflatex. But this probably opens up a whole pile of problems when the number of users increases, so I want to use temporary files (tempfile module) instead of a real tmp folder. Is there any way I can make pdflatex see these temporary files? Or am I doing this the wrong way?
Without any code it's hard to tell you exactly how, but as for:
Is there any way I can make pdflatex see these temporary files?
Yes, you can get the path to the temporary file by using a named temporary file:
>>> import tempfile
>>> with tempfile.NamedTemporaryFile() as temp:
...     print temp.name
...
/tmp/tmp7gjBHU
As commented you can use tempfile.NamedTemporaryFile. The problem is that this will be deleted once it is closed. That means you have to run pdflatex while the file is still referenced within python.
As an alternative, you could just save the picture under a randomly generated name. The tempfile module is designed to let you create temporary files on various platforms in a consistent way. That is not really what you need here, since I guess you'll always run the script on the same web server.
You could generate random file names using the uuid module:
import uuid

for i in xrange(3):
    print(str(uuid.uuid4()))
Then you save the pictures explicitly under the random names and insert them into the .tex file. After running pdflatex you have to delete the files explicitly, which is the drawback of this approach.
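A rough sketch of that approach; the placeholder string, working directory, and cleanup strategy are all assumptions:

import os
import subprocess
import uuid

def render_report(tex_template, graph_bytes, workdir="/tmp"):
    # Save the graph under a random, collision-free name.
    graph_path = os.path.join(workdir, "%s.png" % uuid.uuid4())
    with open(graph_path, "wb") as f:
        f.write(graph_bytes)

    # Insert the graph path into the template and write the .tex file.
    tex_path = os.path.join(workdir, "%s.tex" % uuid.uuid4())
    with open(tex_path, "w") as f:
        f.write(tex_template.replace("GRAPH_PLACEHOLDER", graph_path))

    try:
        subprocess.check_call(["pdflatex", "-output-directory", workdir, tex_path])
        return tex_path[:-4] + ".pdf"
    finally:
        # The drawback: temporary artefacts must be removed explicitly.
        os.remove(graph_path)
        os.remove(tex_path)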
I have a Django app that needs to create a file in google drive: in FolderB/Sub1/Sub2/file.pdf. I have the id for FolderB but I don't know if Sub1 or Sub2 even exist. If not it should be created and the file.pdf should be put in it.
I figure I can look at the children at each level and create the folder at each level if it's not there, but this seems like a lot of checks and API calls just to create one file. It's also harder to accommodate multiple folder structures (i.e., one Python function that can accept any path of any depth and upload a file there).
The solution you have presented is the correct one. As you have realized, the Drive file system is not exactly like a hierarchical file system, so you will have to perform these checks.
One optimization you could perform is to try to find the grand-child folder (Sub2) first, so you will save a number of calls.
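A sketch of that lookup-then-create walk using the Drive v3 client library; service is assumed to be an authenticated googleapiclient resource, and the folder names are just the ones from the question (the Sub2-first optimization mentioned above would simply reorder the lookups):

FOLDER_MIME = "application/vnd.google-apps.folder"

def find_or_create_folder(service, name, parent_id):
    # Look for an existing, non-trashed folder with this name under the parent.
    query = (
        "name = '{0}' and '{1}' in parents and "
        "mimeType = '{2}' and trashed = false"
    ).format(name, parent_id, FOLDER_MIME)
    found = service.files().list(q=query, fields="files(id)").execute().get("files", [])
    if found:
        return found[0]["id"]
    # Not there yet: create it.
    metadata = {"name": name, "mimeType": FOLDER_MIME, "parents": [parent_id]}
    return service.files().create(body=metadata, fields="id").execute()["id"]

def ensure_path(service, folder_b_id, parts=("Sub1", "Sub2")):
    # Walk FolderB -> Sub1 -> Sub2, creating whatever is missing,
    # and return the id of the deepest folder (where file.pdf goes).
    parent = folder_b_id
    for part in parts:
        parent = find_or_create_folder(service, part, parent)
    return parent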