Update value in a JSONObject without pointers in python - python

I'm writing Python code in which I use JSONObjects to communicate with a Java application. My problem is that I want to change a value in the JSONObject (in this example called py_json), and the dimension of that JSONObject is not fixed but known.
varName[x] is the input of the method and the length of varName is the dimension/size of the JSONObjects.
The code would work like that but I can't copy and paste the code 100 times to be sure that there are no bigger JSONObjects.
if length == 1:
    py_json[VarName[0]] = newValue
elif length == 2:
    py_json[VarName[0]][VarName[1]] = newValue
elif length == 3:
    py_json[VarName[0]][VarName[1]][VarName[2]] = newValue
In C I would solve it with pointers like that:
int *pointer = NULL;
pointer = &py_json;
for (i=0; i<length; i++){
    pointer = &(*pointer[VarName[i]]);
}
*pointer = varValue;
But there are no pointers in python.
Do you know a way to have a dynamic solution in Python?

Python's "variables" are just names pointing to objects (instead of symbolic names for memory addresses as in C), and Python assignment doesn't copy a variable's value to a new memory location; it only makes the name point to a different object. So you don't need pointers to get the same result (you probably want to read up on Python's names / variables for more on this).
IOW, the solution is basically the same: just use a for loop to get the desired target (actually: the parent of the desired target), then assign to it:
target = py_json
for i in range(0, length - 1):
    target = target[VarName[i]]
target[VarName[length - 1]] = newValue
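The same loop can be wrapped in a small helper. This is only a sketch: the name set_nested and the sample data are made up for illustration, and it assumes every intermediate key already exists in the nested structure.

```python
def set_nested(container, keys, new_value):
    """Set container[keys[0]][keys[1]]...[keys[-1]] = new_value in place."""
    if not keys:
        raise ValueError("keys must contain at least one entry")
    target = container
    # Walk down to the parent of the final key.
    for key in keys[:-1]:
        target = target[key]
    # Assign through the parent so the original structure is modified.
    target[keys[-1]] = new_value


# Hypothetical sample data, standing in for the parsed JSON object.
py_json = {"a": {"b": {"c": 1}}, "x": 0}
set_nested(py_json, ["a", "b", "c"], 42)
set_nested(py_json, ["x"], 7)
print(py_json)
```

Because target always refers to an object inside py_json, the final assignment changes py_json itself, whatever the length of the key list.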

Related

Add an element to a list past the end of the list in python

Is there a pythonic way to add to a list at a known index that is past the end of the list? I cannot use append, as I'm looking to add at an index that is more than 1 past the end. (For example, I want to put a value at x[6] when len(x) == 3).
I have a code that performs actions for sequential steps, and each step has a set of inputs. The users create an input file with these inputs. I store those inputs as a dictionary for each step, then a list of dictionaries to keep the order of the steps. I had just been reading the inputs for each step, then appending the dictionary to the list. I want to harden the code against the steps being out of order in the input files. If the user puts step 6 before step 3, I can't just append. I do not know the total number of steps until after the file has been read. I have a method worked out, but it seems clunky and involves multiple copies.
My kludgy attempt. In this case InputSpam and CurrentStep would actually be read from the user file
import copy
AllInputs = []
InputSpam = {'Key': 999}  # a dict, as described above
for i in xrange(0,3):
    AllInputs.append(InputSpam.copy())
CurrentStep = 7
if CurrentStep - 1 == len(AllInputs):
    AllInputs.append(InputSpam.copy())
elif CurrentStep - 1 < len(AllInputs):
    AllInputs[CurrentStep-1] = InputSpam.copy()
elif CurrentStep - 1 > len(AllInputs):
    Spam = [{}]*CurrentStep
    Spam[:len(AllInputs)] = copy.deepcopy(AllInputs)
    AllInputs = copy.deepcopy(Spam)
    AllInputs[CurrentStep-1] = InputSpam.copy()
    del Spam
Only after I wrote the answer did I notice that you use Python 2. Python 2 has been unsupported for a long time now. You should switch to Python 3. (The following solution is only valid for Python 3.)
You can use collections.UserList to create your own variation of a list like this:
from collections import UserList
class GappedList(UserList):
    PAD_VALUE = object() # You may use None instead
    def __setitem__(self, index, value):
        self.data.extend(self.PAD_VALUE for _ in range(len(self.data), index+1))
        self.data[index] = value
Inheriting from UserList makes the whole structure behave mostly like a regular list, unless specified otherwise. The data attribute gives access to the "raw" underlying list. The only thing we need to redefine here is the __setitem__ method, which handles assignments like my_list[idx] = val. We redefine it to first fill in the gap between the end of the current list and the index you want to write to. (Actually, it pads the list up to and including the index you want to write to and then overwrites that value; this makes the code a bit simpler.)
You might also need to redefine the __getitem__ method if you want to handle access to indices in the gaps differently.
Usage:
my_list = GappedList([0,1,2])
my_list.append(3)
my_list[6] = 6
my_list.append(7)
my_list[5] = 5
print(my_list)
# output:
[0, 1, 2, 3, <object object at 0x7f42cbd5ec80>, 5, 6, 7]
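Applied to the step inputs from the question, the idea might look like the sketch below. The step numbers and the 'Key' dictionaries are invented sample data, None is used as the padding value (an assumption, so that unread steps are easy to detect), and only non-negative integer indices are handled.

```python
from collections import UserList


class GappedList(UserList):
    PAD_VALUE = None  # None marks a step that has not been read yet

    def __setitem__(self, index, value):
        # Pad up to and including `index`, then overwrite that slot.
        self.data.extend(self.PAD_VALUE for _ in range(len(self.data), index + 1))
        self.data[index] = value


# Hypothetical (step number, inputs) pairs, deliberately out of order.
steps_from_file = [(6, {"Key": 6}), (3, {"Key": 3}), (1, {"Key": 1})]

all_inputs = GappedList()
for step, inputs in steps_from_file:
    all_inputs[step - 1] = dict(inputs)  # steps are 1-based, indices 0-based

# Steps that never appeared in the file are still padded with None.
missing_steps = [i + 1 for i, inputs in enumerate(all_inputs) if inputs is None]
print(len(all_inputs), missing_steps)
```

Collecting the missing steps after reading the whole file gives one place to report gaps to the user, instead of checking the list length on every insertion.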

Pulling variables out of functions in Python

I have been self-teaching Python for a few weeks now and have the aim to create a script to run an equation and keep hitting walls. What I basically want to do is take an input with a unit attached i.e. 6M being 6,000,000, convert the unit into a numerical format and put that into an equation with an output.
So far I have defined a function:
def replaceunit(body):
    body = body.replace(str(body[-1]),str(units.get(body[-1])))
    return body
I have asked for the input and have a dictionary of units (shortened dictionary below):
T = input("T = ")
B = input("B = ")
units = {'M': 1e6, # mega
         'G': 1e9  # giga
         }
I then try to replace the unit if an M or G appears in the T or B variables:
if str(T[-1]).isalpha() == True:
    replaceunit(T)
if str(B[-1]).isalpha() == True:
    replaceunit(B)
After this I would like the updated T and B to be put into an equation that I define.
If I add a print action to my function I can see the values have been replaced, but have been unable to pull the corrected values through outside of the function and into another equation.
As I say, I'm very new to this, so if there's any help you can lend I'd very much appreciate it. Apologies also if this has been asked elsewhere; I haven't really understood the few similar answers I have seen.
Strings are immutable in Python, meaning they cannot be changed in place, but rather you have to create a new string for every change. That is exactly what you did in replaceunit - you wrote body = body.replace(...) and you replaced the old reference with a new one that replace gave you.
replaceunit is also returning a new reference, so calling it should be done as T = replaceunit(T) and B = replaceunit(B) to save changes. You must not use the same variable if you want to save both the replaced and non-replaced versions of the string.
If you want the value you returned from replaceunit to be the new value of T, you need to assign it:
T = replaceunit(T)
Note that you could skip the step of assigning body inside the function itself and simply return the value:
def replaceunit(body):
    return body.replace(str(body[-1]),str(units.get(body[-1])))
I would also suggest that it might be more useful to have a function that turns the user-inputted number into an actual number:
def parse_number(body: str) -> float:
    """Converts a string like '2G' into a value like 2000000000."""
    units = {
        'M': 1e6, # mega
        'G': 1e9, # giga
    }
    return float(body[:-1]) * units[body[-1]]
This will be necessary if you want to do any actual math with that value!
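As a rough sketch of how this could fit together, the variant below also accepts inputs without a unit suffix and then feeds the results into a calculation. The fixed input strings and the T / B "equation" are placeholders for illustration, not the asker's actual formula.

```python
UNITS = {
    "M": 1e6,  # mega
    "G": 1e9,  # giga
}


def parse_number(body: str) -> float:
    """Convert '6M' to 6000000.0; plain numbers like '250' pass through."""
    body = body.strip()
    if body and body[-1].isalpha():
        # Split the numeric part from the one-letter unit suffix.
        return float(body[:-1]) * UNITS[body[-1]]
    return float(body)


# In the real script these strings would come from input("T = ") etc.
T = parse_number("6M")
B = parse_number("250")

# Placeholder equation: any arithmetic works now that T and B are floats.
result = T / B
print(T, B, result)
```

The key point is that the return value is assigned back to T and B; calling the function without using its result changes nothing outside the function.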

Using c-like arrays in python

Is the following ever done in python to minimize the "allocation time" of creating new objects in a for loop in python? Or, is this considered bad practice / there is a better alternative?
for row in rows:
    data_saved_for_row = [] # re-initializes every time (takes a while)
    for item in row:
        do_something()
    do_something()
vs. the "c-version" --
data_saved_for_row = []
for row in rows:
    for index, item in enumerate(row):
        do_something()
    data_saved_for_row[index + 1] = '\0' # now we have a crude way of knowing
    do_something_with_row()              # when it ends without having
                                         # to always reinitialize
Normally the second approach seems like a terrible idea, but I've run into situations when iterating million+ items where the initialization time of the row:
data_saved_for_row = []
has taken a second or more to do.
Here's an example:
>>> print timeit.timeit(stmt="l = list();", number=int(1e8))
7.77035903931
If you want functionality for this sort of performance, you may as well just write it in C yourself and import it with ctypes or something. But then, if you're writing this kind of performance-driven application, why are you using Python to do it in the first place?
You can use list.clear() as a middle-ground here, not having to reallocate anything immediately:
data_saved_for_row = []
for row in rows:
    data_saved_for_row.clear()
    for item in row:
        do_something()
    do_something()
but this isn't a perfect solution, as shown by the CPython source for this (comments omitted):
static int
_list_clear(PyListObject *a)
{
    Py_ssize_t i;
    PyObject **item = a->ob_item;
    if (item != NULL) {
        i = Py_SIZE(a);
        Py_SIZE(a) = 0;
        a->ob_item = NULL;
        a->allocated = 0;
        while (--i >= 0) {
            Py_XDECREF(item[i]);
        }
        PyMem_FREE(item);
    }
    return 0;
}
I'm not perfectly fluent in C, but this code looks like it's freeing the memory stored by the list, so that memory will have to be reallocated every time you add something to that list anyway. This strongly implies that the python language just doesn't natively support your approach.
Or you could write your own python data structure (as a subclass of list, maybe) that implements this paradigm (never actually clearing its own list, but maintaining a continuous notion of its own length), which might be a cleaner solution to your use case than implementing it in C.
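A minimal sketch of such a structure is shown below. The class name ReusableBuffer and its methods are invented for illustration, and whether it is actually faster than creating a fresh list has not been measured here; it would need profiling on the real workload.

```python
class ReusableBuffer:
    """Append-only buffer that keeps its storage between resets."""

    def __init__(self):
        self._items = []   # backing storage, never shrunk
        self._length = 0   # logical number of valid items

    def reset(self):
        # Forget the contents without releasing the backing list.
        self._length = 0

    def append(self, value):
        if self._length < len(self._items):
            self._items[self._length] = value  # reuse an existing slot
        else:
            self._items.append(value)          # grow only when needed
        self._length += 1

    def __len__(self):
        return self._length

    def __iter__(self):
        for i in range(self._length):
            yield self._items[i]


rows = [[1, 2, 3], [4, 5]]
buffer = ReusableBuffer()
row_sums = []
for row in rows:
    buffer.reset()
    for item in row:
        buffer.append(item * 2)
    row_sums.append(sum(buffer))
print(row_sums)
```

One trade-off of this design: slots beyond the logical length still hold references to old values, so those objects stay alive until they are overwritten.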

Python list being appended when not (directly) referenced

Okay, so this is just a rough bit of code I made when trying to make a Guess Who(TM) for a class challenge, and I wanted to make a random character generator function (it's only a proof of concept and I would expand its complexity later! Please don't judge!). However, the character's template feature list seems to be appended to on every iteration (and so skews my other loops) when it ought not to be. It should be adding an item to the end of each newly generated list, not the template. Yet the template variable is not appended to in the code; only a temporary copy is/should be. Here's the code:
tempfeatures = characters = []
for i in range(len(characternames)):
    tempfeatures = []
    charactername = characternames[random.randint(0,len(characternames)-1)]
    characternames.remove(charactername)
    a = features
    tempfeatures = a
    ### "Debug bit" ###
    print(features)
    print("loooooop")
    for y in range(len(features)):
        print(len(features))
        temp = random.randint(0,1)
        if temp == 1:
            tempfeatures[y][1] = True
        else:
            tempfeatures[y][1] = False
    tempfeatures.append(["Dead",True])
    characters.append([charactername,tempfeatures])
print(characters)
Thank you!
Apparently assigning a list to the tempfeature variable makes it a reference to the same list rather than a copy of its value. - thanks python.
So when duplicating a list, one must add this to the end of the variable name:
tempfeature = feature[:]
(the [:] bit)
Thanks all for your comments!
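Plain assignment (tempfeature = feature) does not copy anything; it just binds another variable to the same list. The different kinds of copies are described here:
https://docs.python.org/2/library/copy.html
You need to make an independent copy. tempfeature = list(feature) (or feature[:]) makes a shallow copy, so appending to tempfeature won't interfere with feature. However, the inner lists are still shared, so an assignment like tempfeature[y][1] = True still changes feature; for a fully independent deep copy, use copy.deepcopy(feature).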
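The difference between sharing a list, copying only the outer list, and copying everything can be checked with a short experiment. The feature names below are made-up sample data.

```python
import copy

# Hypothetical template: a list of [feature name, flag] pairs.
features = [["Hat", False], ["Glasses", False]]

alias = features                 # same list object, no copy at all
shallow = features[:]            # new outer list, inner lists shared
deep = copy.deepcopy(features)   # new outer list and new inner lists

shallow.append(["Dead", True])   # does not change features
shallow[0][1] = True             # changes features[0] too (shared inner list)
deep[1][1] = True                # does not change features

print(alias is features)   # the alias is the very same object
print(features)
```

Since the code in the question both appends to the list and assigns to the inner lists, only the deep copy keeps the template completely untouched.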

Python module savReaderWriter causing Segmentation fault

I am using Python 2.7 on Ubuntu. I have a script that writes an SPSS .sav file.
If I use ValueLabels with numbers as keys like this:
{1: 'yes', 2: 'no'}
the following line causes a Segmentation fault:
with savReaderWriter.SavWriter(sav_file_name, varNames, varTypes, valueLabels=value_labels, ioUtf8=True) as writer:
However, if my keys are strings like this:
{'1': 'yes', '2': 'no'}
I do not get the Segmentation fault, and my script runs fine. The problem, of course, is that I need the keys to be numbers. How can I fix or work around this?
Thank you in advance.
-RLS
Depending on whether you specify a numerical variable (varType == 0) or a string variable (varType > 0, where varType is the length in bytes of the string value), one of the following two C functions of the SPSS I/O library is called:
int spssSetVarNValueLabel(int handle, const char * varName, double value, const char * label)
int spssSetVarCValueLabel(int handle, const char * varName, const char * value, const char * label)
Note that ctypes.c_double accepts both floats and ints, so the values of numerical variables do not necessarily have to be specified as floats (doubles), they can also be ints.
It appears that you specified a varType > 0 (indicating a string variable), but a 'value label' value which is an int (suggesting a numerical variable). The fix is to make the two consistent. One way is already stated above (use string keys); the other way is to set the varType for the variable in question to zero.
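A consistency check along these lines could be run before the dictionaries are handed to the writer. This is a plain-Python sketch: check_value_labels is a hypothetical helper, not part of savReaderWriter, and it relies only on the convention described above (varType 0 is numeric, varType > 0 is a string).

```python
def check_value_labels(var_types, value_labels):
    """Return (variable, key) pairs whose key type contradicts var_types."""
    mismatches = []
    for var_name, labels in value_labels.items():
        is_string_var = var_types[var_name] > 0  # 0 means numeric
        for key in labels:
            key_is_string = isinstance(key, (str, bytes))
            if key_is_string != is_string_var:
                mismatches.append((var_name, key))
    return mismatches


var_types = {"a_string": 1, "a_numeric": 0}

# Keys swapped between the two variables: both entries are inconsistent.
bad_labels = {"a_numeric": {"1": "male"}, "a_string": {1: "male"}}
# String keys for the string variable, numeric keys for the numeric one.
good_labels = {"a_numeric": {1: "male"}, "a_string": {"1": "male"}}

print(check_value_labels(var_types, bad_labels))
print(check_value_labels(var_types, good_labels))
```

Failing early with a readable list of mismatches is easier to debug than a crash inside the C library.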
That said, it is ugly to get this segfault. I put it on my to-do list to specify the argtypes attribute for all the setter functions (see 15.17.1.6 on https://docs.python.org/2/library/ctypes.html), so you would get a nice, understandable ArgumentError instead of this nasty segfault.
If the problem persists, could you please open an issue at https://bitbucket.org/fomcl/savreaderwriter/issues?status=new&status=open, preferably with a minimal example?
@ekhumoro: savReaderWriter has not been tested for Python 2.6 or earlier (I would be surprised if it works), so a dict comprehension should be fine.
UPDATE:
@RLS: You are welcome. Thank you too, it inspired me to correct this. As of commit 5c11704 this now throws a ctypes.ArgumentError (see https://bitbucket.org/fomcl/savreaderwriter). Here is an example that I might also use to write a unittest for this (the b prefixes are needed for Python 3):
import savReaderWriter as rw, tempfile, os, pprint
savFileName = os.path.join(tempfile.gettempdir(), "some_file.sav")
varNames = [b"a_string", b"a_numeric"]
varTypes = {b"a_string": 1, b"a_numeric": 0}
records = [[b"x", 1], [b"y", 777], [b"z", 10 ** 6]]
# Incorrect, but now raises ctypes.ArgumentError:
valueLabels = {b"a_numeric": {b"1": b"male", b"2": b"female"},
               b"a_string": {1: b"male", 2: b"female"}}
# Correct
#valueLabels = {b"a_numeric": {1: b"male", 2: b"female"},
#               b"a_string": {b"1": b"male", b"2": b"female"}}
kwargs = dict(savFileName=savFileName, varNames=varNames,
              varTypes=varTypes, valueLabels=valueLabels)
with rw.SavWriter(**kwargs) as writer:
    writer.writerows(records)
# Check if the valueLabels look all right
with rw.SavHeaderReader(savFileName) as header:
    metadata = header.dataDictionary(True)
    pprint.pprint(metadata.valueLabels)
Just convert the dict before passing it to SavWriter:
labels = {str(key): value for key, value in value_labels.items()}
or for earlier versions of python:
labels = dict((str(key), value) for key, value in value_labels.items())
The best long-term solution, though, is to re-factor your code so that the keys don't have to be numbers.
UPDATE:
If the dicts are nested, then try this:
labels = {str(var_name): {str(key): label for key, label in var_labels.items()}
          for var_name, var_labels in value_labels.items()}
