How to format numbers by (several different) locales in Python 3 - python

Is there a correct way to format numbers by locale (getting the correct decimal separator) without modifying global state? This is for text generation server-side, so setlocale is not a good idea, and Babel does not yet support Python 3

Unfortunately, the locale module gets and sets global state. This is intrinsic to the design of locale.
The various workarounds include setting locks or calling a subprocess as a service.

Related

How to parse floats in a local format without changing the current locale?

How to parse floats in Python from a source which uses a special known local format that has commas as decimal separators?
locale can do that, of course. But its documentation says:
It is generally a bad idea to call setlocale() in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.
Simply replacing , by . is not such a great solution either.
Can a special locale in Python be used but only for a single parse operation?

zoneinfo: what are the new right and posix zone-name prefixes?

The most recent zoneinfo database, as maintained by the Internet Assigned Numbers Authority, holds two new prefixes, 'posix' and 'right'. For example where there used to be just Asia/Kolkata the new database has added posix/Asia/Kolkata and right/Asia/Kolkata.
This database is also known as the Olson database after its first developer, or the tz database.
What do these newly added prefixes mean, and what's their practical effect? Can any of them safely be filtered out of timezone-choice picklists presented to users?
Globalized web apps (such as WordPress) use these zoneinfo names for user-preference picklists. They're in MySQL's timezone support setup.
right (or also leap, also zoneinfo-leaps) is about using times including leap seconds. Posix (also zoneinfo-posix, often just `zoneinfo) is about using POSIX times, so without considering leap seconds.
In theory you should not choose, the system choose it for you, considering how the time is stored in the system.
But no, for user you should not use data in zoneinfo.
To quote Python documentation:
Note
These values are not designed to be exposed to end-users; for user facing elements, applications should use something like CLDR (the Unicode Common Locale Data Repository) to get more user-friendly strings. See also the cautionary note on ZoneInfo.key.
Note: some zones are obsolete, and the text is simple ASCII, so it cannot represent correctly the city names, and the Englished version is not always the best one.
The right/ prefix marks timezones taking leap seconds into account.
The posix/ prefix marks timezones using, well, POSIX time. Those timezones are, practically, the same as the unprefixed ones.
It's probably fine to filter out the prefixed names when presenting timezone choices to users (the way WordPress, for example, does). If you're an astronomer your parsecage may vary.

Locale sensitive date time parsing in Python

I need to parse a date time in a non-English language, on a machine running with an en-us locale. The easy solution to this problem would be to do setlocale and then proceed to strptime.
The problem is that if you setlocale, it gets set program-wide. However, due to the nature of my program, many threads are running and run locale sensitive processes.
This means setlocale is not an option, because it messes with it globally. How can I strptime with a different locale?

Counting number of symbols in Python script

I have a Telit module which runs [Python 1.5.2+] (http://www.roundsolutions.com/techdocs/python/Easy_Script_Python_r13.pdf)!. There are certain restrictions in the number of variable, module and method names I can use (< 500), the size of each variable (16k) and amount of RAM (~ 1MB). Refer pg 113&114 for details. I would like to know how to get the number of symbols being generated, size in RAM of each variable, memory usage (stack and heap usage).
I need something similar to a map file that gets generated with gcc after the linking process which shows me each constant / variable, symbol, its address and size allocated.
Python is an interpreted and dynamically-typed language, so generating that kind of output is very difficult, if it's even possible. I'd imagine that the only reasonable way to get this information is to profile your code on the target interpreter.
If you're looking for a true memory map, I doubt such a tool exists since Python doesn't go through the same kind of compilation process as C or C++. Since everything is initialized and allocated at runtime as the program is parsed and interpreted, there's nothing to say that one interpreter will behave the same as another, especially in a case such as this where you're running on such a different architecture. As a result, there's nothing to say that your objects will be created in the same locations or even with the same overall memory structure.
If you're just trying to determine memory footprint, you can do some manual checking with sys.getsizeof(object, [default]) provided that it is supported with Telit's libs. I don't think they're using a straight implementation of CPython. Even still, this doesn't always work and with raise a TypeError when an object's size cannot be determined if you don't specify the default parameter.
You might also get some interesting results by studying the output of the dis module's bytecode disassembly, but that assumes that dis works on your interpreter, and that your interpreter is actually implemented as a VM.
If you just want a list of symbols, take a look at this recipe. It uses reflection to dump a list of symbols.
Good manual testing is key here. Your best bet is to set up the module's CMUX (COM port MUXing), and watch the console output. You'll know very quickly if you start running out of memory.
This post makes me recall my pain once with Telit GM862-GPS modules. My code was exactly at the point that the number of variables, strings, etc added up to the limit. Of course, I didn't know this fact by then. I added one innocent line and my program did not work any more. I drove me really crazy for two days until I look at the datasheet to find this fact.
What you are looking for might not have a good answer because the Python interpreter is not a full fledged version. What I did was to use the same local variable names as many as possible. Also I deleted doc strings for functions (those count too) and replace with #comments.
In the end, I want to say that this module is good for small applications. The python interpreter does not support threads or interrupts so your program must be a super loop. When your application gets bigger, each iteration will take longer. Eventually, you might want to switch to a faster platform.

using emacs CEDET completion for python

In default installation of cedet-1.0 completion can only track global scope symbols in current file. This is not much differs from built-in completion functions (dabbrev-expand or hippie-expand).
It can complete symbols from neither imported modules, nor class properties.
Not saying it cannot handle 'self'.
Is it possible to tweak semantic to do the things?
P.S.
ECB code browser sucesfully sees all imports/base classess and stuff.
It is symbol completion workd incorrectly, or not properly set up.
CEDET support for each language is slightly different. In the case of python, the 1.0 release for CEDET hadn't been configured to convert a python import into a file-name. In addition, 'self' is similar to 'this' in c++, which needs to be added by completion logic since it isn't declared. These two features were added to the bzr repository in January of this year. I am not a python programmer, but I recall reports that this fixed a range of the most basic features of smart completion so that symbols from imported libraries works. There was also new code in bzr for python system paths.
Thus, I recommend downloading CEDET from bzr to get these features to see if it now does what you would expect for smart completion.

Categories