I've built an application whose file upload process follows (more or less) the Flask file upload documentation found here: https://flask.palletsprojects.com/en/1.1.x/patterns/fileuploads/.
In this portion of the code, UPLOAD_FOLDER = '/path/to/the/uploads' points to a single directory where all file uploads will live. The problem I'm trying to solve is that when I deploy my app to a server there will be multiple simultaneous users. With a single upload directory, users will collide when they upload files with the same names, a situation that will occur in my app.
What I want to do is create a temp directory that is unique to each browser session. So user 1 would have their own temp directory, user 2 would have their own temp directory, and so on.
In that case, I think there would be no user collisions. Can anyone suggest how I would create such unique temp directories, associated with each browser session, in the file upload process? Something along the lines of UPLOAD_FOLDER = '/path/to/the/uploads/user1_session', etc., for each unique user?
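For reference, a minimal sketch of what the question describes: a per-session upload directory keyed off a random token stored in the Flask session. The upload root is the placeholder path from the question, and the route and helper names are hypothetical.

```python
import os
import uuid
from flask import Flask, request, session
from werkzeug.utils import secure_filename

UPLOAD_ROOT = '/path/to/the/uploads'  # placeholder path from the question

app = Flask(__name__)
app.secret_key = 'change-me'  # sessions require a secret key


def session_upload_dir(root=UPLOAD_ROOT):
    """Return (creating it if needed) an upload directory unique to this session."""
    if 'upload_dir' not in session:
        # a random token per browser session, collision-resistant
        session['upload_dir'] = uuid.uuid4().hex
    path = os.path.join(root, session['upload_dir'])
    os.makedirs(path, exist_ok=True)
    return path


@app.route('/upload', methods=['POST'])
def upload():
    f = request.files['file']
    f.save(os.path.join(session_upload_dir(), secure_filename(f.filename)))
    return 'ok'
```

Note the answer below argues against this directory-per-session layout; it is shown only to make the question concrete.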
OK, so lacking further information and any view of what your code/program looks like, this is what I would recommend at the moment.
I am relatively new to programming as well, so this might not be the best answer. But in my experience you really, really do not want to be creating multiple directories per user/per session. That is a bad idea. This is where databases come in handy.
Now, in regard to your problem, the easiest/fastest way to resolve this issue is to look into how password salting and hashing are done.
Just hash and salt your filenames.
Here is a link that provides a simple yet thorough explanation of how it is done.
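The idea above could look something like this: derive a collision-free stored name per file (a random salt means two uploads of report.pdf never clash), and keep the original filename in the database alongside it. The function name is hypothetical.

```python
import hashlib
import os
import uuid


def unique_stored_name(original_filename):
    """Derive a collision-free name to store the file under, keeping the
    extension. The original name would live in the database next to it."""
    salt = uuid.uuid4().hex  # random salt, so identical filenames never collide
    digest = hashlib.sha256((salt + original_filename).encode()).hexdigest()
    _, ext = os.path.splitext(original_filename)
    return digest + ext
```

All uploads can then live safely in one flat directory, and the database maps each stored name back to the name the user gave it.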
Related
I'm working on a project that has all its data written as Python files (code). Authenticated users can upload data through a web page, and this directly adds and changes code on the production server. This causes trouble every time I want to git pull changes from the git repo to the production server, since the code added by users directly on production is untracked.
I wonder if anyone has some other ideas. I know this is ill-designed, but this is what I was given from the beginning, and implementing a database would require a lot of effort since all the current code is designed around Python files. I guess the people who wrote it didn't know much about databases, and it works only because there is relatively little data.
The only two solutions I can think of are:
use a database instead of having all the data stored as code
git add/commit/push all changes on the production server every time a user uploads data
Details added:
There is a 'data' folder of python files, with each file storing the information of a book. For instance, latin_stories.py has a dictionary variable text, an integer variable section, a string variable language, etc. When users upload a csv file according to a certain format, a program will automatically add and change the python files in the data folder, directly on the production server.
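The second solution could be scripted, assuming the upload program runs somewhere it can invoke git on the production checkout. This is only a sketch: the function name is hypothetical, and 'data/' is the folder described above.

```python
import subprocess


def commit_uploaded_data(repo_dir, message='Auto-commit user-uploaded data',
                         push=True):
    """Stage, commit, and (optionally) push whatever the upload program
    changed in the data folder, so production and the repo stay in sync."""
    subprocess.run(['git', 'add', 'data/'], cwd=repo_dir, check=True)
    # 'git diff --cached --quiet' exits non-zero only if something is staged,
    # so we avoid creating empty commits
    staged = subprocess.run(['git', 'diff', '--cached', '--quiet'], cwd=repo_dir)
    if staged.returncode != 0:
        subprocess.run(['git', 'commit', '-m', message], cwd=repo_dir, check=True)
        if push:
            subprocess.run(['git', 'push'], cwd=repo_dir, check=True)
```

This keeps the files tracked, but the database route is still the cleaner long-term fix.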
I'm finally at the stage of deploying my Django web app on Heroku. The web app retrieves and performs financial analysis on a subset of public companies. Most of the content of the web app is stored in .csv and .xlsx files. There's a master .xlsx file that I need to open and manually update on a daily basis. Then I run scripts which, based on info in that .xlsx file, retrieve financial data, news, etc. and store the data in .csv files. Then my Django views and HTML templates refer to those files (and not to a SQL/Postgres database). This is my setup in a nutshell.
Now I'm wondering what is the best way to make this run smoothly in production.
Shall I store these .xlsx and .csv files on AWS S3 and have Django access and update them from there? I assume that this way I can easily open and edit the master .xlsx file anytime. Is that a good idea, and do I run into any performance or security issues?
Is it better to convert all this data into a Postgres database on Heroku? Is there a good guideline for how I can technically do that? In that scenario, wouldn't it be much more challenging for me to edit the data in the master .xlsx file?
Are there any better ways you would suggest to handle this?
I'd highly appreciate any advice on this matter.
You need to make a trade-off between ease of use (accessing/updating the source XLSX file) and maintainability (storing the data safely and efficiently).
Option #1 is more convenient if you need to quickly open and change the file using your Excel/Numbers application. On the other hand your application needs to access physical files to perform the logic and render the views.
By the way, some time ago I created a repository, Heroku Files, that presents some options for using external files.
Option #2 is typically better from a design point of view: the data is organised in the database and can be more efficiently queried and manipulated. The challenge in this case is that you need a way to view/edit the data, and this normally requires more development (creating new screens, etc.).
Involving a database is normally the preferred approach, as you can scale up to large datasets without problems (which is not the case with files).
On the other hand, if the XLSX file stays small and you only need simple (quick) updates, your current architecture can work.
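To make option #2 concrete: the .csv files can be loaded into database tables with a short script. The sketch below uses sqlite3 just to stay self-contained; the same idea applies to Postgres on Heroku (e.g. via psycopg2 or the Django ORM). The function and table names are hypothetical.

```python
import csv
import sqlite3


def load_csv_into_db(csv_path, conn, table):
    """Load one of the .csv data files into a database table, using the
    CSV header row as the column names."""
    with open(csv_path, newline='') as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ', '.join(f'"{c}"' for c in header)
        marks = ', '.join('?' for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS {table} ({cols})')
        conn.executemany(f'INSERT INTO {table} VALUES ({marks})', reader)
    conn.commit()
```

Editing the master data then happens through the database (or the Django admin) rather than by opening an .xlsx file by hand, which is the extra development cost mentioned above.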
I'm currently working on a kind of cloud file manager. You can basically upload your files through a website. This website is connected to a Python backend which will store your files and manage them using a database. It will basically put each of your files inside a folder and rename it with its hash. The database will associate the name of the file and its categories (a kind of folder) with the hash so that you can retrieve the file easily.
My problem is that I would like the file upload to be really user friendly. I have a quite bad connection, and when I try to download or upload a file on the internet I often get problems like the upload failing at 90% and having to be restarted. I'd like to avoid that.
I'm using aiohttp to achieve this. How could I allow a file to be uploaded in multiple parts? What should I use to upload large files?
In previous code which managed really small files (less than 10 MB), I was using something like this:
data = await request.post()
data['file'].file.read()
Should I continue to do it this way for large files?
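For large files, one improvement over request.post() (which buffers the whole body) is to stream the upload to disk chunk by chunk with aiohttp's multipart reader. A minimal sketch, with a hypothetical handler name and destination path; hashing each chunk as it arrives fits the hash-based naming described above:

```python
import hashlib
import os
import tempfile

from aiohttp import web


async def upload_handler(request):
    """Stream an uploaded file to disk chunk by chunk instead of loading
    the whole body into memory."""
    reader = await request.multipart()
    field = await reader.next()  # first part of the form: the file
    h = hashlib.sha256()
    dest = os.path.join(tempfile.gettempdir(), 'incoming.part')
    with open(dest, 'wb') as out:
        while True:
            chunk = await field.read_chunk()  # 8 KiB per call by default
            if not chunk:
                break
            h.update(chunk)   # build the content hash used to name the file
            out.write(chunk)
    return web.Response(text=h.hexdigest())
```

This keeps memory flat regardless of file size. True resumable uploads (surviving a drop at 90%) additionally need client-side chunking and a resumption protocol, e.g. something along the lines of tus, where the client re-sends only the missing byte ranges.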
I have a server which generates some data on a daily basis. I am using D3 to visualise the data (d3.csv("path")).
The problem is I can only access the files if they are under my static_dir in the project.
However, if I put them there, they do eventually get cached and I stop seeing the updates, which is fine for css and js files but not for the underlying data.
Is there a way to put these files maybe in a different folder and prevent caching on them? Under what path will I be able to access them?
Or alternatively, how would it be advisable to structure my project differently in order to avoid this problem in the first place? At the moment, I have a separate process that generates the data and stores it in the given folder, independent of the server.
Many thanks,
Tony
When accessing the files you can always add ?t=RANDOM to the request in order to get "new" data every time.
Because the request is "new" on the server side, there will be no cache, and from the client side it doesn't really matter.
To get a new random value you can use Date.now():
url = "myfile.csv?t="+Date.now()
I am trying to serve up some user-uploaded files with Flask, and have an odd problem, or at least one that I couldn't turn up any solutions for by searching. I need the files to retain their original filenames after being uploaded, so they will have the same name when the user downloads them.

Originally I did not want to deal with databases at all, and solved the problem of filename conflicts by storing each file in a randomly named folder, and just pointing to that location for the download. However, stuff came up later that required me to use a database to store some info about the files, but I still kept my old method of handling filename conflicts. I have a model for my files now, and storing the name would be as simple as adding another field, so that shouldn't be a big problem.

I decided, pretty foolishly after I had written the implementation, on using Amazon S3 to store the files. Apparently S3 does not deal with folders the way a traditional filesystem does, and I do not want to deal with the surely convoluted task of figuring out how to create folders programmatically on S3. In retrospect, this was a stupid way of dealing with the problem in the first place, when stuff like SQLAlchemy exists that makes databases easy as pie.

Anyway, I need a way to store multiple files with the same name on S3, without using folders. I thought of renaming the files with a random UUID after they are uploaded, and then when they are downloaded (the user visits a page and presses a download button, so I need not have the filename in the URL), telling the browser to save the file as its original name retrieved from the database. Is there a way to implement this in Python with Flask? When it is deployed I am planning on having the web server handle the serving of files; will it be possible to do something like this with the server? Or is there a smarter solution?
I'm stupid. Right in the Flask API docs it says you can include the parameter attachment_filename in send_from_directory if it differs from the filename in the filesystem.