best way to handle the input data in python

best way to handle the input data in python - python

I have some input data that is user configurable, so i do not want to hard code it. Like the data path, result path etc.
Can you please suggest the best way to handle this data? Should i keep them in an excel or notepad and then read at run time? Or is there a better way to handle it?
Thanks

There are a lot of ways to do it.
Configuration file
You can store configuration in separate file in YAML, JSON, INI or any other format. There are a lot of tools and libraries for parsing and loading such configurations. Take a look on this article. Such approach is good for rarely changed configuration like services credentials, but it's not really good for configuration that changes very often.
Environment variables
Also, you can store configuration inside environment variables. Take a look on py-env-config. You can hard-code default configuration values but allow user to override them using environment variables.
Script arguments
If you are writing a script, you can always pass all configuration as command-line arguments/options. Manuals. Such approach is good of configs that changes very often (almost every script execution).
EDIT
I'll suggest you to use configuration file for this constants.

Related

How to save program settings to computer?

I'm looking to store some individual settings to each user's computer. Things like preferences and a license key. From what I know, saving to the registry could be one possibility. However, that won't work on Mac.
One of the easy but not so proper techniques are just saving it to a settings.txt file and reading that on load.
Is there a proper way to save this kind of data? I'm hoping to use my wx app on Windows and Mac.

There is no proper way. Use whatever works best for your particular scenario. Some common ways for storing user data include:
Text files (e.g. Windows INI, cfg files)
binary files (sometimes compressed)
Windows registry
system environment variables
online profiles
There's nothing wrong with using text files. A lot of proper applications uses them exactly for the reason that they are easy to implement, and additionally human readable. The only thing you need to worry about is to make sure you have some form of error handling in place, in case the user decides to replace you config file content with some rubbish.

Take a look at Data Persistence on python docs. One option a you said could be persist them to a simple text file. Or you can save your data using some serialization format as pickle (see previous link) or json but it will be pretty ineficient if you have several keys and values or it will be too complex.
Also, you could save user preferences in an .ini file using python's ConfigParser module as show in this SO answer.
Finally, you can use a database like sqlite3 which is simpler to handle from your code in order to save and retrieve preferences.

Python Unit Testing with PyTables and HDF5

What is a proper way to do Unit Testing with file IO, especially if it involves PyTables and HDF5?
My application evolves around storage and retrieval of python data into and from hdf5 files. So far I simply write the hdf5 files in the unit tests myself and load them for comparison. The problem is that I, of course, cannot be sure when some one else runs the test that he has privileges to actually write files to hard disk. (This probably gets even worse when I want to use automated test frameworks like Jenkins, but I haven't checked that, yet).
What is a proper way to handle these situations? Is it best practice to create a /tmp/ folder at a particular place where write access is very likely to be granted? If so, where is that? Or is there an easy and straight forward way to mock PyTables writing and reading?
Thanks a lot!

How about using the module "tempfile" to create the files?
http://docs.python.org/2/library/tempfile.html
I don't know if it's guaranteed to work on all platforms but I bet it does work on most common ones. It would certainly be better practice than hardcoding "/tmp" as the destination.
Another way would be to create an HDF5 database in memory so that no file I/O is required.
http://pytables.github.io/cookbook/inmemory_hdf5_files.html
I obtained that link by googling "hdf5 in memory" so I can't say for sure how well it works.
I think the best practice would be writing all test cases to run against both an in-memory database and a tempfile database. This way, even if one of the above techniques fails for the user, the rest of the tests will still run. Also you can separately identify whether bugs are related to file-writing or something internal to the database.

Fundmentally, HDF5 and Pytables are I/O libraries. They provide an API for file system manipulation. Therefore if you really want to test PyTables / HDF5 you have to hit the file system. There is no way around this. If a user does not have write access on a system, they cannot run the tests. Or at least they cannot run realistic tests.
You can use the in memory file driver to do testing. This is useful for speeding up most tests and testing higher level functionality. However, even if you go this route you should still have a few tests which actually write out real files. If these fail you know that something is wrong.
Normally, people create the temporary h5 files in the tests directory. But if you are truly worried about the user not having write access to this dir, you should use tempfile.gettempdir() to find their environment's correct /tmp dir. Note that this is cross-platform so should work everywhere. Put the h5 files that you create there and remember to delete them afterwards!

How to save application settings in a config file?

I am developing a program that has a settings window in which I can change various parameters for my program. What is the best way to read/save them in some kind of config file? I know that some software and games use .ini files or similar system. How can I achieve this in Python?

The Python standard library includes the ConfigParser module, which handles ini-style configuration files for you. It's more than adequate for most uses.

Another popular option for configuration files is JSON - it's a simple notation which has good support from a wide range of languages.
Python has the json module in the standard library, which makes it very easy.

Since you introduced the term config file in your question, the previous answers concentrated on means for creating plain text files, which also could be manipulated using a standard text editor. Depending on the sort of settings to store this might not be desired, since it requires strict plausibility checks after reading back the config file at the very least. So I add the proposal of the shelves module which is a straight-forward way to make information persistent in files.

Python configuration file generator

I want to use Python to make a configuration file generator. My roughly idea is feeding input with template files and some XML files with the real settings. Then use the program to generate the real configuration files.
Example:
[template file]
server_IP = %serverip%
server_name = %servername%
[XML file]
<serverinfo>
<server ip="x.x.x.x" name="host1" />
<server ip="x.x.x.x" name="host2" />
</serverinfo>
and then get output configuration file like this
[server.ini]
[server1]
server_IP = x.x.x.x
server_name = host1
[server2]
server_IP = x.x.x.x
server_name = host2
I got several questions:
Is there any open source configuration generator program? (what could be the keyword), I wonder if there's anything can be added/modified in the design.
Does Python have good XML parser module?
Is it good idea to use XML file to save the original settings? I've been thinking to use Excel since it's more intuitive to maintain, but harder for program to parse. Not sure how people deal with this.
Hope the community can give me some suggestions. Thanks a lot!
EDIT:
In scenario that there are dozens of these output ini files. I am concerning 2 things.
there are dozens of ip/hostname and related informations, that may requires to be managed by human, so XML format would be a bit inconvenient. What could be the most convenient structure to manage those information? (Excel would be a handy tool to batch modify and look up info)
In case of need to add some extra line into ini files, I need a efficient way by just modify the template file and add extra info into the source file (may be the Excel file or whatever), then whole bunches of ini files can be generated quickly.

I recommend using excellent ConfigObj library by Michael Foord. It can read/write configuration files, even with nested sections.

I don't know if there are any open source configuration generators.
Python has several xml parser modules, the newest (and perhaps most pythonic) of which is ElementTree. You can find additional documentation for it at the developer's site.
I recommend avoiding xml in your configuration files if possible. A simple flat file full of name/value pairs is much easier for humans to work with. If you need a level or two of structure, ini-style files ought to do nicely, and python comes with a built-in parser for them.

It's terrific to use xml for any configuration file. Best choice in any interpreted language is to use code file and simply exec or import it. Pickle is also good, but not human readable.

I've used the minidom module. That would probably work for you.

You need some template engine, look at string.Template

template files evaluation in python

I am trying to use python for translating a set of templates to a set of configuration files based on values taken from a main configuration file. However, I am having certain issues. Consider the following example of a template file.
file1.cfg.template
%(CLIENT1)s %(HOST1)s %(PORT1)d C %(COMPID1)s
%(CLIENT2)s %(HOST2)s %(PORT2)d C %(COMPID2)s
This file contains an entry for each client. There are hundreds of config files like this and I don't want to have logic for each type of config file. Python should do the replacements and generate config files automatically given a set of global values read from a main xml config file. However, in the above example, if CLIENT2 does not exist, how do I delete that line? I expect Python would generate the config file using something like this:
os.open("file1.cfg.template").read() % myhash
where myhash is hash of configuration parameters from the main config file which may not contain CLIENT2 at all. In the case it does not contain CLIENT2, I want that line to disappear from the file. Is it possible to insert some 'IF' block in the file and have python evaluate it?
Thanks for your help. Any suggestions most welcome.

Sounds like you may have outgrown your originally simple home-grown templating solution. Maybe you should move to something like Jinja? It might be less of a headache to simply implement a third-party solution than it would be to create/continue to maintain your own solution.
Other options:
cheetah
mako

Maybe you can use a standalone Django template.
How do I use Django templates without the rest of Django? - Stack Overflow

Given that the files already exist, I would set default values for things like CLIENT2 (assuming you know ahead of time all possible keys). You can probably set the default value to something unusual so you can do
config = os.open("file1.cfg.template").read() % myhash
config = [l for l in config.split('\n') if <l does not have unusual text>].join('\n')
I agree with others that in the long term a more robust template would be better.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

best way to handle the input data in python - python

I have some input data that is user configurable, so i do not want to hard code it. Like the data path, result path etc. Can you please suggest the best way to handle this data? Should i keep them in an excel or notepad and then read at run time? Or is there a better way to handle it? Thanks

Related

How to save program settings to computer?

Python Unit Testing with PyTables and HDF5

How to save application settings in a config file?

Python configuration file generator

template files evaluation in python

Categories

Resources