How to store all files universally in memory (variable) in Python

Essentially, I want to be able to go through a folder of text files, jpg files, csv files, png files, or any other kind of file, and load each one into memory as some kind of object. When necessary, I would then like to be able to save that object and recreate the file on disk. This needs to work for any file type.
I would create a class that would contain the file data itself as well as metadata, but that is not necessary for my question.
Is this possible, and if so, how can I do it?
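A minimal sketch of one way to approach this, treating every file as raw bytes; the FileBlob class and function names here are purely illustrative, not from any particular library:

```python
from pathlib import Path

class FileBlob:
    """Illustrative container: raw file bytes plus a little metadata."""
    def __init__(self, name: str, data: bytes):
        self.name = name
        self.data = data

def load_folder(folder: str) -> dict:
    """Read every file in a folder into memory as raw bytes."""
    blobs = {}
    for path in Path(folder).iterdir():
        if path.is_file():
            blobs[path.name] = FileBlob(path.name, path.read_bytes())
    return blobs

def save_blob(blob: FileBlob, target_dir: str) -> None:
    """Write the in-memory bytes back out as a file on disk."""
    Path(target_dir, blob.name).write_bytes(blob.data)
```

Because the data is kept as uninterpreted bytes, the same code handles text, images, CSVs, or anything else; interpreting the contents (e.g. parsing the CSV) is a separate step.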

Related

Saving multiple pandas dataframes to one file and then reading them back

I am working on a Python app and am trying to plan how saving and loading files will work.
In this app multiple data sheets will be loaded and all of them will be used in some capacity. The problem I am having is that I want users to be able to save the changes to the files they have imported.
I was thinking of a single save file that holds the content of all the files they have loaded into the app, but I have no idea how to structure this. I've done some research and heard that Parquet/Feather are good formats for saving data, but I don't know whether they support saving multiple data frames to the same file.
The most important part is that the files need to be loadable by pandas (or another library) so that a user can save the changes they've made and load them back up later if they were so inclined.
Any advice is appreciated!
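One option (a sketch, assuming the optional PyTables dependency is installed) is an HDF5 store, which lets pandas keep several named DataFrames in a single file:

```python
import pandas as pd

# Illustrative frames; in the app these would be the user's imported sheets.
sales = pd.DataFrame({"item": ["a", "b"], "qty": [3, 5]})
costs = pd.DataFrame({"item": ["a", "b"], "price": [1.5, 2.0]})

# Write several DataFrames into one file, each under its own key
# (requires the optional PyTables package: pip install tables).
with pd.HDFStore("savefile.h5") as store:
    store["sales"] = sales
    store["costs"] = costs

# Load them back later.
with pd.HDFStore("savefile.h5") as store:
    sales_again = store["sales"]
    costs_again = store["costs"]
```

A dictionary of DataFrames pickled to one file would also work; HDF5 has the advantage that individual keys can be read back without loading everything.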

Storing file-like objects and writing to multiple files

I have learned that file-like objects created in Python from io.BytesIO or io.StringIO are not stored on disk. Are they stored in memory like variables? If not, where?
Also, is there a way to store them on the disk?
My code writes mp3 data to files, which takes a long time (30+ seconds to write ~3.5 MB). I planned to split the input data and write to several files simultaneously, but I have no idea how to do this. Do I need to run multiple Python scripts? I don't mind writing to different files and later reading and editing the content. Can you point me to any references to get started?
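For what it's worth, a BytesIO or StringIO buffer lives in the process's memory, not on disk; writing it out is an ordinary file write. Below is a rough sketch, with made-up file names and an arbitrary chunk size, of persisting a buffer and of splitting the data across several files using threads:

```python
import io
from concurrent.futures import ThreadPoolExecutor

buffer = io.BytesIO(b"...mp3 bytes held in memory...")

# Persist the whole in-memory buffer to disk.
with open("track.mp3", "wb") as f:
    f.write(buffer.getvalue())

# Split the data and write the chunks to separate files concurrently.
def write_chunk(index: int, chunk: bytes) -> None:
    with open(f"track_part{index}.mp3", "wb") as f:
        f.write(chunk)

data = buffer.getvalue()
chunk_size = 1024 * 1024  # 1 MB per part, an arbitrary choice
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

with ThreadPoolExecutor() as pool:
    for i, chunk in enumerate(chunks):
        pool.submit(write_chunk, i, chunk)
```

No separate scripts are needed; threads are enough here because the work is I/O-bound.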

Extract pickled data

Is there a way to extract the data from a file generated by pickle when you no longer have the classes from the original project that pickled it?
I'm trying to read a pickled file generated by an older project. I don't have the source code of the old project, but I would like to retrieve the data from the file, even as plain dictionaries. Is there a solution, or should I just use a binary editor?
Thanks,
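One commonly suggested approach (a sketch, not guaranteed to work for every pickle) is to subclass pickle.Unpickler and substitute a stub for any class that can no longer be imported:

```python
import pickle

class _Stub:
    """Placeholder standing in for any class from the old project."""
    def __setstate__(self, state):
        # Keep whatever state the pickle carried, whatever its shape.
        self.state = state

class ForgivingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # The original class is gone: substitute a stub so loading continues.
            return type(name, (_Stub,), {"__module__": module})

with open("old_data.pkl", "rb") as f:  # illustrative file name
    data = ForgivingUnpickler(f).load()
# Unknown objects come back as stubs whose .state holds the pickled attributes.
```

If you only want to inspect the contents without loading them, the standard-library pickletools.dis() function can also dump the pickle's opcodes in a readable form.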

Store pkl / binary in MetaData

I am writing a function that is supposed to store a text representation of a custom class object, cl.
I have some code that writes to a file and takes the necessary information out of cl.
Now I need to go backwards: read the file and return a new instance of cl. The problem is that the file doesn't keep all of the important parts of cl, because for the purposes of this text document some of them are unnecessary.
A .jpg file lets you store metadata such as shutter speed and location. I would like to store the parts of cl that don't belong in the text portion in the metadata of a .txt or .csv file. Is there a way to explicitly write something to the metadata of a text file in Python?
Additionally, would it be possible to write the pickled (.pkl) byte representation of the entire object into that metadata?
Text files don't have metadata in the same way that a JPEG file does. A JPEG file is specifically designed with ways of including metadata as extra structured information in the image; text files aren't: every character in a text file is generally displayed to the user.
Similarly, everything in a CSV file is part of one cell of the table the file represents.
That said, there are some things similar to text-file metadata that have existed over the years and might give you ideas. I don't think any of them is ideal, but these examples should give you a sense of how complex the area of metadata is and what people have done in similar situations.
Some filesystems attach metadata to each file that can be extended. For example, NTFS has alternate data streams; HFS and HFS+ have resource forks and other attributes; most Linux filesystems have extended attributes. You could potentially store your pickle information in that filesystem metadata. There are disadvantages: some filesystems don't offer such metadata, and some tools for copying and manipulating files will not preserve it (or will intentionally strip it).
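For example, on Linux the os module exposes extended attributes directly; a minimal sketch, assuming the text file already exists and the filesystem supports xattrs (the attribute name and data are arbitrary):

```python
import os
import pickle

# The parts of cl that were kept out of the text file (illustrative data).
extra = {"hidden_field": 42}

# Store the pickled bytes in a "user." extended attribute next to the text content.
os.setxattr("notes.txt", "user.cl_pickle", pickle.dumps(extra))

# Later: read the attribute back and unpickle it.
restored = pickle.loads(os.getxattr("notes.txt", "user.cl_pickle"))
```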
You could keep a .txt file and a .pkl file side by side, where the .txt file contains your text representation and the .pkl file contains the other information.
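A rough sketch of that two-file approach, with an illustrative stand-in class (the to_text() helper is assumed, not part of any library):

```python
import pickle

class Cl:
    """Stand-in for the custom class; to_text() is an assumed helper."""
    def __init__(self, text, extra):
        self.text = text
        self.extra = extra   # the parts that don't belong in the text file
    def to_text(self):
        return self.text

cl = Cl("visible text portion", {"hidden": 42})

# Write the human-readable part and the complete object side by side.
with open("report.txt", "w") as f:
    f.write(cl.to_text())
with open("report.pkl", "wb") as f:
    pickle.dump(cl, f)

# Going backwards: rebuild cl from the .pkl file; the .txt copy is just for reading.
with open("report.pkl", "rb") as f:
    cl_again = pickle.load(f)
```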
Back in the day, some DOS programs would stop reading a text file at a DOS EOF character (decimal 26). I don't think anything still behaves that way, but it's an example of a format that let you end the file and still have extra data afterwards that programs could use.
With a format like HTML, or an actual spreadsheet format instead of CSV, there are straightforward ways to include metadata.

Saving File to Dictionary - Python

Is there any way to save a file to a dictionary in Python?
(To be clear, I am not asking how to export dictionaries to files.)
Maybe a file could be pickled or transformed into a Python object and then saved.
Is this generally advisable?
Or should I only save the file's path to the dictionary?
How would I retrieve the file later on?
The background of my question is that I use dictionaries as databases. I use the handy little module sqliteshelf as a form of persistent dictionary: https://github.com/shish/sqliteshelf
Each dataset includes a unique config file (~500 kB) which is retrieved from an application. When the respective dataset is opened, the config file is copied into, and later back out of, the application's working directory. I could instead use a folder where I save the config files, but it strikes me as more elegant to save them together with the other data.
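As a rough sketch of the "store the file itself in the dictionary" idea, using the standard-library shelve module as a stand-in for sqliteshelf (file names are illustrative):

```python
import shelve
from pathlib import Path

# Store the config file's raw bytes alongside the rest of the dataset.
with shelve.open("dataset_store") as db:
    db["config_file"] = Path("config.cfg").read_bytes()
    db["other_data"] = {"name": "dataset 1"}

# Later, when opening the dataset, write the bytes back out for the application.
with shelve.open("dataset_store") as db:
    Path("restored_config.cfg").write_bytes(db["config_file"])
```

Since the values are just bytes, the same pattern works with any dict-like persistent store; storing only the path is lighter, but then the dataset is no longer self-contained.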
