Is there any way to get the uploaded file's date and name, which we have stored in the database using forms?
Right now I am just creating two more database columns for name and date, and storing them like this: file_name = request.FILES['file'].name for the file name, and upload_date = datetime.datetime.now() for the date.
You can get the date by reading the file's metadata using the stat module:
http://docs.python.org/release/2.5.2/lib/module-stat.html
It is OS-specific, but ST_CTIME should give you approximately what you're looking for.
For the name, you can easily get it from the way you store the file. Specify a custom handler that stores the file at /your/file/path/filename.extension, and just manipulate that string to extract the filename.
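As a minimal sketch of the idea (using a temporary file to stand in for the uploaded one, since the real path depends on your upload handler):

```python
import os
import stat
import tempfile

# Stand-in for the uploaded file; your handler would supply the real path
fd, path = tempfile.mkstemp(suffix='.extension')
os.close(fd)

st = os.stat(path)
created = st[stat.ST_CTIME]    # OS-specific: inode change time on Unix, creation time on Windows
name = os.path.basename(path)  # the stored filename, extension included

print(name, created > 0)
os.remove(path)
```

Note that `os.stat(path).st_ctime` is equivalent and avoids indexing into the result tuple.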
Just read this in the Flask docs. Not sure how applicable it is in Django, but pasting it here for reference:
*If you want to know how the file was named on the client before it was uploaded to your application, you can access the filename attribute. However please keep in mind that this value can be forged so never ever trust that value. If you want to use the file-name of the client to store the file on the server, pass it through the secure_filename() function that Werkzeug provides for you*
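If you're not on Flask/Werkzeug, the same idea can be sketched in plain Python (this is a simplified illustration, not Werkzeug's actual implementation):

```python
import re

def sanitize_filename(name):
    # Drop any directory components the client may have smuggled in
    name = name.replace('\\', '/').rsplit('/', 1)[-1]
    # Replace anything outside a conservative character set
    name = re.sub(r'[^A-Za-z0-9._-]', '_', name)
    # Never return an empty or dot-only name
    return name.strip('._') or 'unnamed'

print(sanitize_filename('../../etc/passwd'))   # passwd
print(sanitize_filename('my report.pdf'))      # my_report.pdf
```

The point is the same as in the quote: never trust the client-supplied name as a path, only as raw material for one.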
You can use the original file's name as part of the file name when storing it on disk, and you can probably use the file's creation/modification date for the upload date. IMO, though, you should just store both explicitly in the database.
How do I access the tags attribute here in the Windows File Properties panel?
Are there any modules I can use? Most Google searches yield results about media-file properties or file access times, but not much about metadata properties like Tags, Description, etc.
The exif module was able to access a lot more properties than most of what I've been able to find, but still, it wasn't able to read the 'Tags' property.
The Description -> Tags property is what I want to read and write to a file.
There's an entire module dedicated to exactly what I wanted: IPTCInfo3.
import iptcinfo3, os, sys, random, string

# Random string generator
rnd = lambda length=3: ''.join(random.choices(string.ascii_letters, k=length))

# Path to the file; open an IPTCInfo object
path = os.path.join(sys.path[0], 'DSC_7960.jpg')
info = iptcinfo3.IPTCInfo(path)

# Show the keywords
print(info['keywords'])

# Add a keyword and save
info['keywords'] = [rnd()]
info.save()

# Remove the weird ghost file created after saving
os.remove(path + '~')
I'm not particularly sure what the ghost file is or does; it looks to be an exact copy of the original file, since the file size remains the same. Regardless, I remove it, since it's completely useless for the metadata read/write purposes I need.
There have been some weird behaviours I've noticed while setting the keywords: some get swallowed up into the file (the file size changes, so I know they're there, but Windows doesn't acknowledge this), and only after manually deleting the keywords do they suddenly reappear. Very strange.
How can I get the actual path of the file uploaded from the FileUpload widget? In my case, I am uploading a binary file and just need to know where it is so I can process it using an internal application.
When I iterate on the FileUpload value I see there is a file name as a key and a dict as the value:
up_value = self.widgets['file_upload'].value
for file_name, file_dict in up_value.items():
    print(file_dict.keys())              # -> dict_keys(['metadata', 'content'])
    print(file_dict['metadata'].keys())  # -> dict_keys(['name', 'type', 'size', 'lastModified'])
I know the content of the file is uploaded, but I really don't need that. I'm also not sure how I would pass that content to my internal processing application. I just want to create a dict that stores the filename as a key and the file path as the value.
thx
Once 'uploaded', the file exists as data in memory, so there is no concept of the originating 'path' in the widget.
If you want to read or write the file, access the 'content' entry (file_dict['content'] in your loop). Your processing application needs to be able to handle bytes input, or you could wrap the data in a BytesIO object.
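For example (the payload bytes here are a stand-in; in your loop they would come from file_dict['content']):

```python
import io
import tempfile

content = b'binary payload from the upload widget'  # stand-in for file_dict['content']

# Wrap the bytes in a file-like object for code that expects .read()
buf = io.BytesIO(content)
print(buf.read(6))  # prints b'binary'

# Or persist the bytes to a real path that an external application can open
with tempfile.NamedTemporaryFile(delete=False, suffix='.bin') as f:
    f.write(content)
    print(f.name)
```

Writing the bytes to a temporary file is the usual workaround when the downstream tool insists on a filesystem path rather than a stream.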
I'm writing several related Python programs that need to access the same file. However, this file will be updated/replaced intermittently, and I need them all to access the new file. My current idea is to have a specific folder where the latest file is placed whenever it needs to be replaced, and I was curious how I could have Python select whatever text file is in that folder.
Or, would I be better off creating a program with a class entirely dedicated to holding the information of the file, and having each program reference the file through that class? I could have the class use tkinter.filedialog to select a new file whenever necessary, and perhaps keep a text file holding the path or name of the file I need to access, and have the other programs reference that.
Edit: I don't need to write to the file at all, just read from it. However, I would like it so that I do not need to manually update the path every time I run the program or the file is replaced.
Edit2: Changed title to suit the question more
If the requirement is to get the most recently modified file in a specific directory:
import os

mypath = r'C:\path\to\wherever'
myfiles = [(f, os.stat(os.path.join(mypath, f)).st_mtime)
           for f in os.listdir(mypath)
           if os.path.isfile(os.path.join(mypath, f))]
mysortedfiles = sorted(myfiles, key=lambda x: x[1], reverse=True)
print('Most recently updated: %s' % mysortedfiles[0][0])
Basically: get a list of the files in the directory together with their modified times as a list of tuples, sort on modified date, then take the one you want. (The isfile check keeps subdirectories out of the list.)
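The same idea reads a little more compactly with pathlib (a sketch; the directory is the current one here so it runs anywhere, swap in your drop folder):

```python
from pathlib import Path

mypath = Path('.')  # replace with your drop folder
files = [p for p in mypath.iterdir() if p.is_file()]
newest = max(files, key=lambda p: p.stat().st_mtime)
print('Most recently updated:', newest.name)
```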
It sounds like you're looking for a singleton pattern, which is a neat way of hiding a lot of logic in an 'only one instance' object.
This means the logic for identifying, retrieving, and delivering the file is all in one place, and your programs interact with it by saying 'give me the one instance of that thing'. If you need to alter how it identifies, retrieves, or delivers what that one thing is, you can keep that hidden.
It's worth noting that the singleton pattern can be considered an antipattern, as it's a form of global state; whether that's a deal-breaker depends on the context of the program.
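A minimal sketch of that pattern (the class and method names are illustrative, not from any library):

```python
import os

class LatestFile:
    """Single shared object that hides how the current data file is found."""
    _instance = None

    def __new__(cls, directory='.'):
        # Create the one instance on first use; reuse it afterwards
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.directory = directory
        return cls._instance

    def path(self):
        # Identify the most recently modified regular file in the folder
        files = [os.path.join(self.directory, f) for f in os.listdir(self.directory)]
        files = [f for f in files if os.path.isfile(f)]
        return max(files, key=os.path.getmtime) if files else None

# Every program gets the same instance, and the lookup logic stays hidden
a = LatestFile('.')
b = LatestFile()
print(a is b)  # True
```

If you later switch from 'newest file in a folder' to 'path stored in a config file', only `path()` changes; the callers don't.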
To "have Python select whatever text file is in the folder", you could use the glob library to get a list of the file(s) in the directory, see: https://docs.python.org/2/library/glob.html
You can also use os.listdir() to list all of the files in a directory, without matching pattern names.
Then, open() and read() whatever file or files you find in that directory.
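A short sketch of that approach (a temporary directory stands in for your drop folder so it runs anywhere):

```python
import glob
import os
import tempfile

# Stand-in drop folder with one text file in it
drop = tempfile.mkdtemp()
with open(os.path.join(drop, 'latest.txt'), 'w') as f:
    f.write('newest data')

# Pick up whatever .txt files are currently in the folder and read them
for path in glob.glob(os.path.join(drop, '*.txt')):
    with open(path) as f:
        print(os.path.basename(path), '->', f.read())
```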
All,
I am working on creating an interface for dealing with some massive data and generating ARFF files for doing some machine learning with. I can currently collect the features, but I have no way of associating them with the files they were derived from. I am currently using Dumbo:
def mapper(key, value):
#do stuff to generate features
Is there any convenient method for determining the filename that was opened and had its contents passed to the mapper function?
Thanks again.
-Sam
If you're able to access the job configuration properties, then the mapreduce.job.input.file property should contain the file name of the current file.
I'm not sure how you get at these properties in Dumbo/mrjob though. The docs specify that periods in the conf names are replaced with underscores, and looking through the source for PipeMapRed.java, it looks like every single job conf property is set as an environment variable, so try accessing an env variable named mapreduce_job_input_file:
http://hadoop.apache.org/mapreduce/docs/r0.21.0/mapred_tutorial.html#Configured+Parameters
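A sketch of what that would look like inside the mapper (the env variable name follows the property above; depending on the Hadoop version it may instead be map_input_file or mapreduce_map_input_file):

```python
import os

def mapper(key, value):
    # Hadoop streaming exports job conf properties as environment variables,
    # with dots replaced by underscores
    input_file = (os.environ.get('mapreduce_job_input_file')
                  or os.environ.get('map_input_file', 'unknown'))
    # Emit (filename, value) so each feature stays tied to its source file
    yield input_file, value
```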
As described here, you can use the -addpath yes option:
-addpath yes (replace each input key by a tuple consisting of the path of the corresponding input file and the original key)
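With that option, the key arriving at the mapper is a (path, original_key) tuple, so a sketch of the mapper becomes:

```python
def mapper(key, value):
    # With -addpath yes, key is (input_file_path, original_key)
    path, orig_key = key
    # Emit the path alongside each feature so it stays traceable to its file
    yield path, value
```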
I have managed to create a simple app which deletes (bypassing the recycle bin) any files I want to. It can also upload files. The problem I am having is that I cannot specify which collection the new file should be uploaded to.
def UploadFile(folder, filename, local_file, client):
    print "Upload Resource"
    doc = gdata.docs.data.Resource(type='document', title=filename)
    path = _GetDataFilePath(local_file)
    media = gdata.data.MediaSource()
    media.SetFileHandle(path, 'application/octet-stream')
    create_uri = gdata.docs.client.RESOURCE_UPLOAD_URI + '?convert=false'
    collection_resource = folder
    upload_doc = client.CreateResource(doc, create_uri=create_uri, collection=collection_resource, media=media)
    print 'Created, and uploaded:', upload_doc.title, doc.resource_id
From what I understand, the function CreateResource requires a resource object representing the collection. How do I get this object? The variable folder is currently just the string 'daily', which is the name of the collection; it is this variable which I need to replace with the collection resource.
From various sources, snippets, and generally stuff all over the place, I managed to work this out. You need to pass a URI to the FindAllResources function (one which I found no mention of in the sample code from gdata).
I have written up in more detail how I managed to upload, delete (bypassing the bin), search for, and move files into collections here.