Can you permanently change python code by input? - python

I'm still learning Python and am currently developing an API (artificial personal assistant, e.g. Siri or Cortana). I was wondering if there is a way to update code by input. For example, if I had a list, would it be possible to PERMANENTLY add a new item so that it is still there after the program has finished running?
I read that you would have to use SQLite, is that true? And are there any other ways?

Hello J Nowak
I think what you want to do is save the input data to a file (e.g. a .txt file).
You can view the link below which will show you how to read and write to a text file.
How to read and write to text file in Python
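As a minimal sketch of that idea, the list can live in a text file with one item per line. The file name items.txt and the helper names load_items and add_item are made up for illustration, and a temporary directory is used only to keep the example self-contained:

```python
import os
import tempfile

# Hypothetical file location; in a real program this would be a fixed path.
ITEMS_FILE = os.path.join(tempfile.mkdtemp(), "items.txt")

def load_items(path):
    """Return the saved items, or an empty list if nothing was saved yet."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

def add_item(path, item):
    """Append one item to the file so it is still there on the next run."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(item + "\n")

add_item(ITEMS_FILE, "milk")
add_item(ITEMS_FILE, "eggs")
items = load_items(ITEMS_FILE)
print(items)
```

This only works cleanly while every item is a single line of text; anything more structured is better served by one of the formats discussed in the other answers.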

There are plenty of ways to make your data persistent.
It depends on the task, on the environment etc.
Just a couple examples:
Files (JSON, DBM, Pickle)
NoSQL Databases (Redis, MongoDB, etc.)
SQL Databases (both serverless and client server: sqlite, MySQL, PostgreSQL etc.)
The simplest approach is to use files.
There are even modules that let you do it transparently.
You just work with your data as always.
See shelve for example.
From the documentation:
A “shelf” is a persistent, dictionary-like object. The difference with
“dbm” databases is that the values (not the keys!) in a shelf can be
essentially arbitrary Python objects — anything that the pickle module
can handle. This includes most class instances, recursive data types,
and objects containing lots of shared sub-objects. The keys are
ordinary strings.
Example of usage:
import shelve
s = shelve.open('test_shelf.db')
try:
    s['key1'] = {'int': 10, 'float': 9.5, 'string': 'Sample data'}
finally:
    s.close()
You work with s normally, as if it were just a regular dictionary.
And it is automatically saved on disk (in file test_shelf.db in this case).
In this case your dictionary is persistent
and will not lose its values after the program restart.
More on it:
https://docs.python.org/2/library/shelve.html
https://pymotw.com/2/shelve/
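Applied to the list from the question, a sketch might look like the following. The key name "items" and the file location are assumptions for illustration. One detail worth knowing: a shelf does not notice in-place changes to a stored list, so the list is reassigned after appending (opening the shelf with writeback=True is the alternative).

```python
import os
import shelve
import tempfile

# Hypothetical location; a temp directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "assistant_shelf")

# First "run": create the list and add an item.
with shelve.open(path) as s:
    items = s.get("items", [])
    items.append("first item")
    s["items"] = items  # reassign so the shelf stores the changed list

# Second "run": the list is still there and can grow further.
with shelve.open(path) as s:
    items = s.get("items", [])
    items.append("second item")
    s["items"] = items

# Third "run": just read it back.
with shelve.open(path) as s:
    saved = s["items"]
print(saved)
```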
Another option is to use pickle, which also gives you persistence,
but not magically: you will need to read and write the data on your own.
Comparison between shelve and pickle:
What is the difference between pickle and shelve?
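To make "read and write on your own" concrete, here is a small sketch of the load, change, save cycle with pickle. The helper names load_list and save_list and the file name are hypothetical:

```python
import os
import pickle
import tempfile

# Hypothetical file location for the example.
path = os.path.join(tempfile.mkdtemp(), "items.pickle")

def load_list(path):
    """Load the pickled list, or start with an empty one on the first run."""
    if not os.path.exists(path):
        return []
    with open(path, "rb") as f:
        return pickle.load(f)

def save_list(path, items):
    """Overwrite the file with the current contents of the list."""
    with open(path, "wb") as f:
        pickle.dump(items, f)

items = load_list(path)        # explicit load at startup
items.append("remember this")  # normal in-memory work
save_list(path, items)         # explicit save before exiting

reloaded = load_list(path)
print(reloaded)
```

Only load pickle files that your own program wrote; unpickling data from an untrusted source can run arbitrary code.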

Related

How to modify variables and instances in modules and save it at runtime in Python

I have main.py, header.py and var.py
header.py
import var
class table():
    def __init__(self, name):
        self.name = name
var.py
month = "jen"
table = "" # tried to make empty container which can save table instance but don't know how
main.py
import header
import var
var.table = header.table(var.month)
var.month = "feb"
After this program has ended, I want the modified var.table and var.month to be saved in var.py.
When your program ends, all your values are lost—unless you save them first, and load them on the next run. There are a variety of different ways to do this; which one you want depends on what kind of data you have and what you're doing with it.
The one thing you never, ever want to do is print arbitrary objects to a file and then try to figure out how to parse them later. If the answer to any of your questions is ast.literal_eval, you're saving things wrong.
One important thing to consider is when you save. If someone quits your program with ^C, and you only save during clean shutdowns, all your changes are gone.
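One way to act on this is to save right after each change instead of waiting for shutdown. The sketch below is only an illustration: it uses JSON as the format, the names save_state, load_state and state.json are made up, and it writes to a temporary file first and then swaps it in so that an interrupted save does not leave a half-written file behind.

```python
import json
import os
import tempfile

# Hypothetical file location for the example.
STATE_FILE = os.path.join(tempfile.mkdtemp(), "state.json")

def load_state(path):
    """Load saved values, falling back to defaults on the first run."""
    if not os.path.exists(path):
        return {"month": "jen"}
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def save_state(state, path):
    """Write to a temporary file, then replace the real file with it."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)

state = load_state(STATE_FILE)
state["month"] = "feb"
save_state(state, STATE_FILE)  # save immediately, not only on clean exit

restored = load_state(STATE_FILE)
print(restored)
```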
Numpy/Pandas
Numpy and Pandas have their own built-in functions for saving data. See the Numpy docs and Pandas docs for all of the options, but the basic choices are:
Text (e.g., np.savetxt): Portable formats, editable in a spreadsheet.
Binary (e.g., np.save): Small files, fast saving and loading.
Pickle (see below, but also builtin functions): Can save arrays with arbitrary Python objects.
HDF5. If you need HDF5 or NetCDF, you probably already know that you need it.
List of strings
If all you have is a list of single-line strings, you just write them to a file and read them back line by line. It's hard to get simpler, and it's obviously human-readable.
If you need a short name for each value, or need separate sections, but your values are still all simple strings, you may want to look at configparser for CFG/INI files. But as soon as you get more complicated than that, look for a different format.
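For the named-values case, a configparser round trip might look like this sketch. The section name "assistant", the keys, and the file name are invented for the example:

```python
import configparser
import os
import tempfile

# Hypothetical file location for the example.
path = os.path.join(tempfile.mkdtemp(), "settings.ini")

# Save: one section holding simple string values.
config = configparser.ConfigParser()
config["assistant"] = {"name": "Jarvis", "month": "feb"}
with open(path, "w", encoding="utf-8") as f:
    config.write(f)

# Load: every value comes back as a string.
loaded = configparser.ConfigParser()
loaded.read(path, encoding="utf-8")
month = loaded["assistant"]["month"]
print(month)
```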
Python source
If you don't need to save anything, only load data (that your users might want to edit), you can use Python itself as a format—either a module that you import, or a script file that you exec. This can of course be very dangerous, but for a config file that's only being edited by people who already have your entire source code on their computer, that may not be a problem.
JSON and friends
JSON can save a single dict or list to a file and load it back. JSON is built into the Python standard library, and most other languages can also load and save it. JSON files are human-editable, although not beautiful.
JSON dicts and lists can be nested structures with other dicts and lists inside, and can also contain strings, floats, bools, and None, but nothing else. You can extend the json library with converters for other types, but it's a bit of work.
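A short sketch of both halves of that statement: an unsupported type (here a set) is rejected, and a converter passed through the default parameter of json.dumps can handle it. The converter name encode_extra and the choice to store sets as sorted lists are assumptions for illustration.

```python
import json

data = {"tags": {"python"}}  # a set is not one of the JSON types

# Without help, the json module refuses to serialize the set.
try:
    json.dumps(data)
    failed = False
except TypeError:
    failed = True

def encode_extra(obj):
    """Hypothetical converter: store sets as sorted lists."""
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

text = json.dumps(data, default=encode_extra)
loaded = json.loads(text)
print(failed, loaded)
```

Note that the set comes back as a list; turning it into a set again on load is the extra work the paragraph above refers to.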
YAML is (almost) a superset of JSON that's easier to extend, and allows for prettier human-editable files. It doesn't have builtin support in the standard library, but there are a number of solid libraries on PyPI, like ruamel.yaml.
Both JSON and YAML can only save one dict or list per file. (The library will let you save multiple objects, but you won't be able to load them back, so be careful.) The simplest way around this is to create one big dict or list with all of your data packed into it. But JSON Lines allows you to save multiple JSON dicts in a single file, at the cost of human readability. You can load it just by for line in file: obj = json.loads(line), and you can save it with just the standard library if you know what you're doing, but you can also find third-party libraries like json-lines to do it for you.
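Using only the standard library, reading and writing JSON Lines could be sketched as follows; the record contents and file name are made up:

```python
import json
import os
import tempfile

# Hypothetical file location for the example.
path = os.path.join(tempfile.mkdtemp(), "records.jsonl")

records = [{"uid": 1, "count": 3}, {"uid": 2, "count": 1}]

# Save: one JSON document per line.
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Load: parse each line on its own.
loaded = []
with open(path, encoding="utf-8") as f:
    for line in f:
        loaded.append(json.loads(line))
print(loaded)
```

Opening the file in "a" mode instead of "w" lets a program append new records without rewriting the ones already saved.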
Key-value stores
If what you want to store fits into a dict, but you want to have it on disk all the time instead of explicitly saving and loading, you want a key-value store.
dbm is an old but still functional format, as long as your keys and values are all small-ish strings and you don't have tons of them. Python makes a dbm look like a dict, so you don't need to change most of your code at all.
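A minimal dbm sketch, assuming a made-up database name; which on-disk format gets used depends on what is installed on the system:

```python
import dbm
import os
import tempfile

# Hypothetical database location for the example.
path = os.path.join(tempfile.mkdtemp(), "cache")

# "c" opens the database for reading and writing, creating it if needed.
with dbm.open(path, "c") as db:
    db["month"] = "feb"
    db["greeting"] = "hello"

# Reopen it, as a later run of the program would.
with dbm.open(path, "r") as db:
    month = db["month"]  # values come back as bytes
print(month)
```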
shelve extends dbm to let you save arbitrary values instead of just strings. It does this by using Pickle (see below), meaning it has the same safety issues, and it can also be slow.
More powerful key-value stores (and related things) are generally called NoSQL databases. There are lots of them nowadays; Redis is one of the popular choices. There's more to learn, but it can be worth it.
CSV
CSV stands for "comma-separated values", although there are variations that use whitespace or other characters. CSV is built into the standard library.
It's a great format when you have a list of objects all with the same fields, as long as all of the members are strings or numbers. But don't try to stretch it beyond that.
CSV files are just barely human-editable as text—but they can be edited very easily in spreadsheet programs like Excel or Google Sheets.
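A small sketch with the csv module, using invented field names and rows. Everything read back from a CSV file is a string, which is why the numbers are written as strings here:

```python
import csv
import os
import tempfile

# Hypothetical file location for the example.
path = os.path.join(tempfile.mkdtemp(), "users.csv")

rows = [{"uid": "1", "name": "Ann"}, {"uid": "2", "name": "Bob"}]

# Save: a header line followed by one line per row.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["uid", "name"])
    writer.writeheader()
    writer.writerows(rows)

# Load: each row becomes a dict keyed by the header names.
with open(path, newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))
print(loaded)
```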
Pickle
Pickle is designed to save and load just about anything. This can be dangerous if you're reading arbitrary pickle files supplied by users, but it can also be very convenient. Pickle actually can't quite save and load everything unless you do a lot of work to add support to some of your types, but there's a third-party library named dill that extends support a lot further.
Pickle files are not at all human-readable, and are only compatible with Python, and sometimes not even with older versions of Python.
SQL
Finally, you can always build a full relational database. This isn't quite as scary as it sounds.
Python has a database, SQLite, built into the standard library as the sqlite3 module.
If that looks too complicated, you may want to consider SQLAlchemy, which lets you store and query data without having to learn the SQL language. Or, if you search around, there are a number of fancier ORMs, and libraries that let you run custom list comprehensions directly against databases, and so on.
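As a rough sketch of how little code sqlite3 needs for a simple persistent list (the table name items, the column name, and the helper functions are hypothetical):

```python
import os
import sqlite3
import tempfile

# Hypothetical database location for the example.
DB_PATH = os.path.join(tempfile.mkdtemp(), "assistant.db")

def add_item(path, name):
    """Insert one row; the table is created on first use."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT NOT NULL)")
            conn.execute("INSERT INTO items (name) VALUES (?)", (name,))
    finally:
        conn.close()

def list_items(path):
    """Return all saved names in insertion order."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("SELECT name FROM items ORDER BY rowid").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]

add_item(DB_PATH, "milk")
add_item(DB_PATH, "eggs")
items = list_items(DB_PATH)
print(items)
```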
Other formats
There are zillions of other standards out there for data files; a few even come with support in the standard library. They can be useful for special cases: plist files match what Apple uses for preferences on macOS and iOS; netrc files are a long-established way to store a list of server logins; XML is perfect if you have a time machine that can only travel to the year 2000; etc. But usually, you're better off using one of the common formats mentioned above.

How to store a dictionary in a file?

I'm rather new to Python and coding in general.
I'm writing my own chat statistics bot for a Russian social network (vk.com).
My question is: can I store a dictionary in a file and work with it?
For example:
Userlist = open('userlist.txt', 'r+')
if lastmessage['uid'] not in Userlist.read():
    Userlist.read()[lastmessage['uid']] = 1
Userlist.close()
Or do I have to use an additional module like JSON?
Thank you
(Amended answer in light of clarifying comment: in the while True cycle I want to check if a user's id is in the 'userlist' dictionary (as a key) and, if not, add it to this dictionary with value 1. Then I want to rewrite the file with the new dictionary. The file is opened as soon as the program is launched, before the cycle):
For robustly using data on disk as though it were a dictionary you should consider either one of the dbm modules or just using the SQLite3 support.
A dbm file is simply a set of keys and values stored with transparently maintained and used indexing. Once you've opened your dbm file you simply use it exactly like you would any other Python dictionary (with strings as keys). Any changes can simply be flushed and written before closing the file. This is very simple though it offers no special features for locking (or managing consistency in the case where you might have multiple processes writing to the file concurrently) and so on.
On the other hand, the incredibly powerful SQLite subsystem, which has been included in the Python standard library for many years, allows you to easily treat a local file as an SQL database management system ... with all of the features you'd expect from a client/server based system (foreign keys, data type and referential integrity constraint management, views and triggers, indexes, etc).
In your case you could simply have a single table containing a single column. Binding to that database (by its filename) would allow you to query for a user's name with SELECT and add the user's name with INSERT. As your application grows and changes you could add other columns to track when the account was created and when it was most recently used or checked (a couple of time/date stamp columns) and you could create other tables with related data (selected using JOINs, for example).
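A sketch of that suggestion for the chat bot, with an extra counter column so that a new user starts at 1 and later messages increment the value. The table layout, the file name, and the function name count_message are assumptions, not part of the original answer:

```python
import os
import sqlite3
import tempfile

# Hypothetical database location for the example.
DB_PATH = os.path.join(tempfile.mkdtemp(), "chatstats.db")

conn = sqlite3.connect(DB_PATH)
with conn:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "uid INTEGER PRIMARY KEY, messages INTEGER NOT NULL)"
    )

def count_message(conn, uid):
    """Add the user if unknown, then count one more message for them."""
    with conn:  # each call is committed to disk before returning
        conn.execute(
            "INSERT OR IGNORE INTO users (uid, messages) VALUES (?, 0)", (uid,)
        )
        conn.execute(
            "UPDATE users SET messages = messages + 1 WHERE uid = ?", (uid,)
        )

for uid in (101, 202, 101):
    count_message(conn, uid)

counts = dict(conn.execute("SELECT uid, messages FROM users"))
conn.close()
print(counts)
```

Because every change is committed as it happens, there is no separate step of rewriting the whole file at the end of the loop.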
(Original answer):
In general the process of storing any internal data structure as a file, or transmitting it over a network connection, is referred to as "serialization." The complementary process of loading or receiving such data and instantiating its contents into a new data structure is referred to (unsurprisingly) as "deserialization."
That's true of all programming languages.
There are many ways to serialize and deserialize data in Python. In particular we have the native (standard library) pickle module, which produces files (or strings) that are only intended for use with other processes running Python, or we can, as you said, use JSON ... the JavaScript Object Notation which has become the de facto cross-language data structure serialization standard. (There are others such as YAML and XML ... but JSON has come to predominate).
The caveat about using JSON vs. Pickle is that JavaScript (and a number of other programming and scripting languages) uses different semantics for some sorts of "dictionary" (associative array) keys than Python. In particular Python (and Ruby and Lua) treats keys such as "1" (a string containing the digit "one") and 1 or 1.0 (numeric values equal to one) as distinct keys. JavaScript, Perl and some others treat the keys as "scalar" values in which strings like "1" and the number 1 will evaluate into the same key.
There are some other nuances which can affect the fidelity of your serialization. But that's the easiest to understand. Dictionaries with strings as keys are fine ... mixtures of numeric and string keys are the most likely cause of any troubles you'll encounter using JSON serialization/deserialization in lieu of pickling.
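The key issue is easy to see with Python's own json module, which writes every dictionary key as a string. A small illustration with made-up data:

```python
import json

original = {1: "one", "2": "two"}

# Round trip through JSON text.
restored = json.loads(json.dumps(original))

print(restored)
print(1 in restored, "1" in restored)
```

After the round trip the integer key 1 has become the string "1", so code that looks up restored[1] fails even though original[1] worked.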

Python serialization - Why pickle?

I understood that Python pickling is a way to 'store' a Python Object in a way that does respect Object programming - different from an output written in txt file or DB.
Do you have more details or references on the following points:
where are pickled objects 'stored'?
why is pickling preserving object representation more than, say, storing in DB?
can I retrieve pickled objects from one Python shell session to another?
do you have significant examples when serialization is useful?
does serialization with pickle imply data 'compression'?
In other words, I am looking for a doc on pickling - Python.doc explains how to implement pickle but does not seem to dive into details about the use and necessity of serialization.
Pickling is a way to convert a Python object (list, dict, etc.) into a byte stream. The idea is that this byte stream contains all the information necessary to reconstruct the object in another Python script.
As for where the pickled information is stored, usually one would do:
import pickle
var = {1: 'a', 2: 'b'}
with open('filename', 'wb') as f:
    pickle.dump(var, f)  # write the pickled dict to the file
That would store the pickled version of our var dict in the 'filename' file. Then, in another script, you could load from this file into a variable and the dictionary would be recreated:
import pickle
with open('filename', 'rb') as f:
    var = pickle.load(f)  # rebuild the dictionary from the file
Another use for pickling is if you need to transmit this dictionary over a network (perhaps with sockets or something). You first need to convert it into a byte stream, then you can send it over a socket connection.
Also, there is no "compression" to speak of here... it's just a way to convert from one representation (in RAM) to another (a serialized byte stream).
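For the in-memory case, pickle.dumps and pickle.loads do the same conversion without a file. A brief sketch; the actual socket code is left out:

```python
import pickle

var = {1: "a", 2: "b"}

payload = pickle.dumps(var)       # a bytes object, ready to send or store
restored = pickle.loads(payload)  # the receiving side rebuilds the dict

print(type(payload).__name__, len(payload))
print(restored)
```

Unlike a JSON round trip, the integer keys come back as integers.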
About.com has a nice introduction of pickling here.
Pickling is absolutely necessary for distributed and parallel computing.
Say you wanted to do a parallel map-reduce with multiprocessing (or across cluster nodes with pyina), then you need to make sure the function you want to have mapped across the parallel resources will pickle. If it doesn't pickle, you can't send it to the other resources on another process, computer, etc. Also see here for a good example.
To do this, I use dill, which can serialize almost anything in python. Dill also has some good tools for helping you understand what is causing your pickling to fail when your code fails.
And, yes, people use pickling to save the state of a calculation, or your ipython session, or whatever. You can also extend pickle's Pickler and Unpickler to do compression with bz2 or gzip if you'd like.
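A simpler route to compressed pickles than subclassing is to hand pickle a compressed file object. The sketch below uses gzip.open for that, which is a substitution for the approach named above rather than the same technique; the data and file name are made up:

```python
import gzip
import os
import pickle
import tempfile

# Hypothetical file location for the example.
path = os.path.join(tempfile.mkdtemp(), "state.pickle.gz")

state = {"step": 42, "values": list(range(1000))}

# Save: pickle writes into the gzip stream, which compresses on the fly.
with gzip.open(path, "wb") as f:
    pickle.dump(state, f)

# Load: the gzip stream decompresses while pickle reads from it.
with gzip.open(path, "rb") as f:
    restored = pickle.load(f)

print(restored["step"], len(restored["values"]))
```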
I find it to be particularly useful with large and complex custom classes. In a particular example I'm thinking of, "Gathering" the information (from a database) to create the class was already half the battle. Then that information stored in the class might be altered at runtime by the user.
You could have another group of tables in the database and write another function to go through everything stored and write it to the new database tables. Then you would need to write another function to be able to load something saved by reading all of that info back in.
Alternatively, you could pickle the whole class as is and then store that to a single field in the database. Then when you go to load it back, it will all load back in at once as it was before. This can end up saving a lot of time and code when saving and retrieving complicated classes.
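It is a kind of serialization. In Python 2, use cPickle, which is much faster than pickle; in Python 3, the pickle module already uses the C implementation when it is available.
import pickle
# Write the pickle file (corpus is any picklable object defined earlier)
with open('pickles/corups.pickle', 'wb') as handle:
    pickle.dump(corpus, handle)
# Read the pickle file back
with open('pickles/corups.pickle', 'rb') as handle:
    corpus = pickle.load(handle)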

How to save big "database-like" class in python

I'm doing a project with a reasonably big DataBase. It's not a proper DB file, but a class with the following format:
DataBase.Nodes.Data=[[] for i in range(1,1000)], for example. Altogether this DataBase is something like a few thousand rows. First question: is the way I'm doing it efficient, or is it better to use SQL or any other "proper" DB, which I've never actually used?
And the main question: I'd like to save my DataBase class with all its records, and then re-open it with Python in another session. Is that possible, and what tool should I use? cPickle seems to be only for strings; is there any other?
In Matlab there's a very useful functionality named save workspace: it saves all your variables to a file that you can open in another session. This would be very useful in Python!
Pickle (cPickle) can handle any (picklable) Python object. So as long as you're not trying to pickle a thread, a file handle or something like that, you're OK.
Pickle should be able to serialise the data for you so that you can save it to file.
Alternatively, if you don't need the features of a full-featured RDBMS, you could use a lightweight solution like SQLite or a document store like MongoDB.
The pickle.dump function is very much like Matlab's save feature. You just give it an object you wish to serialize and a file-like object to write it to. See the documentation for info and examples on how to use it.
The cPickle module is just like pickle, but it is implemented in C so it can be much faster. In Python 2 you should probably use cPickle; in Python 3 the plain pickle module uses the C implementation automatically.
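A rough "save workspace" equivalent is to collect the variables you care about in one dict and pickle that. This is only a sketch: the variable names, the small stand-in for DataBase.Nodes.Data, and the file name are invented for the example.

```python
import os
import pickle
import tempfile

# Hypothetical file location for the example.
path = os.path.join(tempfile.mkdtemp(), "workspace.pickle")

# Stand-in for the nested-list data from the question, kept small here.
nodes_data = [[] for _ in range(5)]
nodes_data[0].append("record")

# Gather everything to keep under one name, like a workspace.
workspace = {"nodes_data": nodes_data, "version": 1}

with open(path, "wb") as f:
    pickle.dump(workspace, f, protocol=pickle.HIGHEST_PROTOCOL)

# In another session: load the file and pull the variables back out.
with open(path, "rb") as f:
    restored = pickle.load(f)
nodes_again = restored["nodes_data"]
print(nodes_again)
```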
