Appending to tuple during for in loop - python

I need to modify a tuple during a for in loop, such that the iterator iterates on the tuple.
From my understanding, tuples are immutable; so tup = tup + (to_add,) is just reassigning tup, not changing the original tuple. So this is tricky.
Here is a test script:
tup = ({'abc': 'a'}, {'2': '2'})
blah = True
to_add = {'goof': 'abcde'}
for i in tup:
    if blah:
        tup = tup + (to_add,)
        blah = False
    print(i)
Which prints:
{'abc': 'a'}
{'2': '2'}
What I would like is for it to print:
{'abc': 'a'}
{'2': '2'}
{'goof': 'abcde'}
From what I understand, I need to "repoint" the implicit tuple iterator mid-script so that it is pointing at the new tuple instead. (I know this is a seriously hacky thing to be doing).
This script accesses the tuple_iterator in question:
import gc
tup = ({'abc': 'a'}, {'2': '2'})
blah = True
to_add = {'goof': 'abcde'}
for i in tup:
    if blah:
        tup = tup + (to_add,)
        blah = False
    refs = gc.get_referrers(i)
    for ref in refs:
        if type(ref) == tuple and ref != tup:
            refs_to_tup = gc.get_referrers(ref)
            for j in refs_to_tup:
                if str(type(j)) == "<class 'tuple_iterator'>":
                    tuple_iterator = j
    print(i)
How can I modify this tuple_iterator so that it points at the new tup, and not the old? Is this even possible?
I am aware that this is a really strange situation. I cannot change the fact that tup is a tuple, or that I need to use an implicit for in, as I am trying to plug into code that I cannot change.

There is no way, either portably or specifically within CPython, to do what you're trying to do from within Python, even via undocumented internals of the tuple_iterator object. The tuple reference is stored in a variable that isn't exposed to Python, and (unlike the stored index) isn't modified by __setstate__ or any other method.
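As a small illustration of that asymmetry, here is a sketch (CPython-specific, and relying on the pickling-support method __setstate__ of tuple iterators; the variable names are just for illustration) showing that the stored index can be moved from Python, while nothing comparable is offered for the stored tuple:

```python
tup = (10, 20, 30)
it = iter(tup)

first = next(it)        # reads index 0

# In CPython, a tuple iterator's state is the next index to read.
it.__setstate__(0)      # rewind to the start
again = next(it)

it.__setstate__(2)      # jump straight to the last element
last = next(it)

print(first, again, last)
```

This only repositions the iterator within the tuple it already holds; it does not help with pointing it at a different tuple.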
However, if you're willing to start monkeying with C pointers behind CPython's back, and you know how to debug the inevitable segfaults…
Under the covers, there's a C struct representing tuple_iterator. I think it's either seqiterobject, or a struct with the exact same shape, but you should read through the tupleobject source code to make sure.
Here's what that type looks like in C:
typedef struct {
    PyObject_HEAD
    Py_ssize_t it_index;
    PyObject *it_seq; /* Set to NULL when iterator is exhausted */
} seqiterobject;
So, what happens if you create a ctypes.Structure subclass that's the same size as this, something like this:
import ctypes
class seqiterobject(ctypes.Structure):
    _fields_ = (
        ('ob_refcnt', ctypes.c_ssize_t),
        ('ob_type', ctypes.c_void_p),
        ('it_index', ctypes.c_ssize_t),
        # PyObject *, kept as a raw address so that an id() value can be assigned
        ('it_seq', ctypes.c_void_p))
… and then do this:
seqiter = seqiterobject.from_address(id(j))
… and then do this:
seqiter.it_seq = id(other_tuple)
…? Well, you probably corrupt the heap by underreferencing the new value (and also leak the old one), so you'll need to incref the new value and decref the old value first.
But, if you do that… most likely, either it'll segfault the next time you call __next__, or it'll work.
If you want more example code that does similar things, see superhackyinternals. Other than the fact that seqiterobject is not even a public type, so this is even more hacky, everything else is basically the same.

You could write your own coroutine and send the new tup to it.
import itertools as it
def coro(iterable):
    iterable = iter(iterable)
    while True:
        try:
            v = next(iterable)
            i = yield v
        except StopIteration:
            break
        if i:
            # answer the send() call, then chain the new items on
            yield v
            iterable = it.chain(iterable, i)
Then this works as you describe:
In []:
blah = True
tup = ({'abc': 'a'}, {'2': '2'})
to_add = {'goof': 'abcde'}
c = coro(tup)
for i in c:
    if blah:
        i = c.send((to_add,))
        blah = False
    print(i)
Out[]:
{'abc': 'a'}
{'2': '2'}
{'goof': 'abcde'}
I'm sure there are lots of edge cases I'm missing in the above but it should give you an idea of how it can be done.

Since you plan on modifying the tuple inside the loop, you are probably better off using a while loop that keeps track of the current index rather than relying on an iterator. Iterators are only suitable for looping over collections that do not have items added or removed during the loop.
If you run the example below, the resulting tup has the item added to it, and the loop body runs 3 times.
tup = ({'abc': 'a'}, {'2': '2'})
blah = True
to_add = {'goof': 'abcde'}
i = 0
while i < len(tup):
    cur = tup[i]
    if blah:
        tup = tup + (to_add,)
        blah = False
    i += 1
print(tup)
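If a for loop is still wanted on the consuming side, the same index-tracking idea can be wrapped in a generator. This is only a sketch under assumptions: iter_by_index and the state dict are hypothetical names, and the tuple has to be reachable through a callable so the generator can re-read it on every step.

```python
def iter_by_index(get_seq):
    """Yield items by position, re-reading the sequence on every step."""
    i = 0
    while i < len(get_seq()):
        yield get_seq()[i]
        i += 1

# The tuple lives in a mutable container so that rebinding it is visible.
state = {'tup': ({'abc': 'a'}, {'2': '2'})}
to_add = {'goof': 'abcde'}

seen = []
for item in iter_by_index(lambda: state['tup']):
    if len(state['tup']) == 2:
        # Build a new, longer tuple; the generator picks it up next step.
        state['tup'] = state['tup'] + (to_add,)
    seen.append(item)

print(seen)
```

The loop visits all three dictionaries because the length and the items are looked up again on each iteration instead of being fixed when the loop starts.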

Related

How to call arbitrary deep python object?

I'm working on a script that uses the API of a piece of software I use. I lack documentation for the available methods, so I have little control over what I have. I'm looking to call a method on a value that can be buried arbitrarily deep in an object. I know the depth beforehand, but it can vary widely.
Example:
index = ['foo','bar']
object['foo']['bar'].method()
I've tried something like:
temp = object[index[0]]
for ind in index[1:]:
    temp = temp[ind]
temp.method()
But this makes a copy (it does not according to the replies I've gotten) of the object and does not apply the method correctly. Index can be arbitrarily long.
My only working solution is to hardcode this by using:
if lengthIndex == 1:
    object[index[0]].method()
if lengthIndex == 2:
    object[index[0]][index[1]].method()
if lengthIndex == 3:
    object[index[0]][index[1]][index[2]].method()
# and so on ...
What is the proper way to code this?
Your first code sample doesn't copy the object. In Python, variables are references (at least that's how it effectively works), and when you assign something to a variable, you're just copying a memory address, not copying the thing itself. So if your only reason not to use that was that you wanted to avoid making unnecessary copies of objects, you have nothing to worry about.
What you do in your code sample is the way I would handle repeated indexing.
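A quick way to convince yourself that no copy is made is to compare identities and mutate through the loop variable. This is a minimal sketch with made-up data (data and index are illustrative names, not from your API):

```python
data = {'foo': {'bar': []}}
index = ['foo', 'bar']

# Walk down the structure one key at a time.
temp = data
for key in index:
    temp = temp[key]

# temp is the very same list object that is stored inside data.
same_object = temp is data['foo']['bar']

# A change made through temp is therefore visible through data as well.
temp.append('changed')

print(same_object, data)
```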
If the main point of the question is asking for a function to index arbitrarily nested dictionaries then see the following:
def f():
    print('nested func')
def g():
    print('nested func 2')
def nested_value(nested_dict, keys):
    for k in keys:
        nested_dict = nested_dict[k]
    return nested_dict
nested_dict = {
    'foo': {
        'bar': {
            'baz': f
        },
        'taco': g
    }
}
keys = ['foo', 'bar', 'baz']
val = nested_value(nested_dict, keys)
val()
keys = ['foo', 'taco']
val = nested_value(nested_dict, keys)
val()
Output:
nested func
nested func 2
In [5]: def get_deep_object(obj, index):
   ...:     for i in index:
   ...:         obj = obj[i]
   ...:     return obj
   ...:
In [6]: l = ["foo", "bar"]
In [7]: get_deep_object(l, [1])
Out[7]: 'bar'
In [8]: get_deep_object(l, [1,1])
Out[8]: 'a'
In [9]: get_deep_object(l, [1,1]).capitalize()
Out[9]: 'A'
You can get a shorter (but not necessarily more readable) solution via functional programming:
from operator import getitem
from functools import reduce
reduce(getitem, index, object).method()
getitem from operator is equivalent to lambda data, key: data[key].
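For a self-contained version of the same idea, here is a sketch on made-up data (nested and get_deep are illustrative names). Note that reduce returns the starting object unchanged when the path is empty:

```python
from functools import reduce
from operator import getitem

nested = {'foo': {'bar': ['x', 'y', 'z']}}

def get_deep(obj, path):
    """Apply obj[path[0]][path[1]]... one step at a time."""
    return reduce(getitem, path, obj)

# Dict keys and list indices can be mixed in one path.
value = get_deep(nested, ['foo', 'bar', 1])

# An empty path gives back the object itself.
whole = get_deep(nested, [])

print(value.upper())
```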

Extending python dictionary and changing key values

Assume I have a python dictionary with 2 keys.
dic = {0:'Hi!', 1:'Hello!'}
What I want to do is to extend this dictionary by duplicating itself, but change the key value.
For example, if I have a code
dic = {0:'Hi!', 1:'Hello'}
multiplier = 3
def DictionaryExtend(number_of_multiplier, dictionary):
    "Function code"
then the result should look like
>>> DictionaryExtend(multiplier, dic)
>>> dic
>>> dic = {0:'Hi!', 1:'Hello', 2:'Hi!', 3:'Hello', 4:'Hi!', 5:'Hello'}
In this case, I changed the keys by adding the multiplier at each duplication step. What's an efficient way of doing this?
Plus, I'm also planning to do the same for a list, that is, extend a list by duplicating itself and changing some values as in the example above. Any suggestions for this would be helpful, too!
You can try itertools to repeat the values and OrderedDict to maintain input order.
import itertools as it
import collections as ct
def extend_dict(multiplier, dict_):
    """Return a dictionary of repeated values."""
    return dict(enumerate(it.chain(*it.repeat(dict_.values(), multiplier))))
d = ct.OrderedDict({0:'Hi!', 1:'Hello!'})
multiplier = 3
extend_dict(multiplier, d)
# {0: 'Hi!', 1: 'Hello!', 2: 'Hi!', 3: 'Hello!', 4: 'Hi!', 5: 'Hello!'}
Regarding handling other collection types, it is not clear what output is desired, but the following modification reproduces the latter and works for lists as well:
def extend_collection(multiplier, iterable):
    """Return a collection of repeated values."""
    repeat_values = lambda x: it.chain(*it.repeat(x, multiplier))
    try:
        iterable = iterable.values()
    except AttributeError:
        result = list(repeat_values(iterable))
    else:
        result = dict(enumerate(repeat_values(iterable)))
    return result
lst = ['Hi!', 'Hello!']
multiplier = 3
extend_collection(multiplier, lst)
# ['Hi!', 'Hello!', 'Hi!', 'Hello!', 'Hi!', 'Hello!']
It's not immediately clear why you might want to do this. If the keys are always consecutive integers then you probably just want a list.
Anyway, here's a snippet:
def dictExtender(multiplier, d):
    return dict(zip(range(multiplier * len(d)), list(d.values()) * multiplier))
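If the dictionary should be modified in place, as the example in the question suggests, one possible reading is to add shifted copies of the existing items. The sketch below assumes the keys are the integers 0 to len(d) - 1; extend_in_place is a hypothetical name.

```python
def extend_in_place(multiplier, d):
    """Add shifted copies of d's items, offsetting keys by len(d) per copy."""
    n = len(d)
    original = list(d.items())   # snapshot, because d is mutated below
    for step in range(1, multiplier):
        for key, value in original:
            d[key + step * n] = value
    return d

dic = {0: 'Hi!', 1: 'Hello'}
extend_in_place(3, dic)
print(dic)
```

Taking the snapshot first matters: adding keys to a dictionary while iterating over it directly raises a RuntimeError.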
I don't think you need to use inheritance to achieve that. It's also unclear what the keys should be in the resulting dictionary.
If the keys are always consecutive integers, then why not use a list?
origin = ['Hi', 'Hello']
extended = origin * 3
extended
>> ['Hi', 'Hello', 'Hi', 'Hello', 'Hi', 'Hello']
extended[4]
>> 'Hi'
If you want to perform a different operation with the keys, then simply:
mult_key = lambda key: [key,key+2,key+4] # just an example, this can be any custom implementation but beware of duplicate keys
dic = {0:'Hi', 1:'Hello'}
extended = { mkey:dic[key] for key in dic for mkey in mult_key(key) }
extended
>> {0:'Hi', 1:'Hello', 2:'Hi', 3:'Hello', 4:'Hi', 5:'Hello'}
You don't need to extend anything, you need to pick a better input format or a more appropriate type.
As others have mentioned, you need a list, not an extended dict or OrderedDict. Here's an example with lines.txt:
1:Hello!
0: Hi.
2: pylang
And here's a way to parse the lines in the correct order:
def extract_number_and_text(line):
    number, text = line.split(':')
    return (int(number), text.strip())
with open('lines.txt') as f:
    lines = f.readlines()
data = [extract_number_and_text(line) for line in lines]
print(data)
# [(1, 'Hello!'), (0, 'Hi.'), (2, 'pylang')]
sorted_text = [text for i,text in sorted(data)]
print(sorted_text)
# ['Hi.', 'Hello!', 'pylang']
print(sorted_text * 2)
# ['Hi.', 'Hello!', 'pylang', 'Hi.', 'Hello!', 'pylang']
print(list(enumerate(sorted_text * 2)))
# [(0, 'Hi.'), (1, 'Hello!'), (2, 'pylang'), (3, 'Hi.'), (4, 'Hello!'), (5, 'pylang')]

Python convert string to array assignment

In my application I am receiving a string 'abc[0]=123'
I want to convert this string to an array of items. I have tried eval(), but it didn't work for me. I know the array name abc, but the number of items will be different each time.
I could split the string, get the array index, and do the assignment myself, but I would like to know if there is a direct way to apply this string as an array insert.
I would greatly appreciate any suggestions.
Are you looking for something like this?
In [36]: s = "abc[0]=123"
In [37]: vars()[s[:3]] = []
In [38]: vars()[s[:3]].append(eval(s[s.find('=') + 1:]))
In [39]: abc
Out[39]: [123]
But this is not a good way to create a variable
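A safer alternative to creating variables through vars() and eval() is to keep the arrays in an ordinary dictionary and parse the string yourself. This is only a sketch: apply_assignment and ASSIGNMENT are hypothetical names, values are kept as strings, and only the simple name[index]=value form is handled.

```python
import re

# name[index]=value, with a non-negative integer index
ASSIGNMENT = re.compile(r'^(\w+)\[(\d+)\]=(.*)$')

def apply_assignment(text, arrays):
    """Store value at arrays[name][index], growing the list if needed."""
    match = ASSIGNMENT.match(text)
    if match is None:
        raise ValueError('not an indexed assignment: %r' % text)
    name, index, value = match.group(1), int(match.group(2)), match.group(3)
    items = arrays.setdefault(name, [])
    if index >= len(items):
        # Pad with None so that the requested index exists.
        items.extend([None] * (index + 1 - len(items)))
    items[index] = value
    return arrays

arrays = {}
apply_assignment('abc[0]=123', arrays)
apply_assignment('abc[2]=789', arrays)
print(arrays)
```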
Here's a function for parsing urls according to php rules (i.e. using square brackets to create arrays or nested structures):
import urlparse, re
def parse_qs_as_php(qs):
    def sint(x):
        try:
            return int(x)
        except ValueError:
            return x
    def nested(rest, base, val):
        curr, rest = base, re.findall(r'\[(.*?)\]', rest)
        while rest:
            curr = curr.setdefault(
                sint(rest.pop(0) or len(curr)),
                {} if rest else val)
        return base
    def dtol(d):
        if not hasattr(d, 'items'):
            return d
        if sorted(d) == range(len(d)):
            return [d[x] for x in range(len(d))]
        return {k:dtol(v) for k, v in d.items()}
    r = {}
    for key, val in urlparse.parse_qsl(qs):
        id, rest = re.match(r'^(\w+)(.*)$', key).groups()
        r[id] = nested(rest, r.get(id, {}), val) if rest else val
    return dtol(r)
Example:
qs = 'one=1&abc[0]=123&abc[1]=345&foo[bar][baz]=555'
print parse_qs_as_php(qs)
# {'abc': ['123', '345'], 'foo': {'bar': {'baz': '555'}}, 'one': '1'}
Your other application is doing it wrong. It should not be specifying index values in the parameter keys. The correct way to specify multiple values for a single key in a GET is to simply repeat the key:
http://my_url?abc=123&abc=456
The Python server side should correctly resolve this into a dictionary-like object: you don't say what framework you're running, but for instance Django uses a QueryDict which you can then access using request.GET.getlist('abc') which will return ['123', '456']. Other frameworks will be similar.
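Outside of a framework, the standard library handles repeated keys the same way. A minimal sketch using Python 3's urllib.parse (the URL is the one from the example above):

```python
from urllib.parse import parse_qs, urlsplit

url = 'http://my_url?abc=123&abc=456'

# parse_qs collects every value for a repeated key into one list.
params = parse_qs(urlsplit(url).query)

print(params['abc'])
```

Values come back as strings, so convert them with int() if numbers are needed.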

searching and adding a python list

I have a TList, which is a list of lists. I would like to add new items to the list if they are not already present. For instance, if item I is not present, add it to TList; otherwise skip it. Is there a more Pythonic way of doing it? Note: at first TList may be empty, and elements are added in this code. After adding Z, for example, TList = [ [A,B,C],[D,F,G],[H,I,J],[Z,aa,bb]]. The other elements are based on calculations on Z.
item = 'C' # for example this item will given by user
TList = [ [A,B,C],[D,F,G],[H,I,J]]
if not TList:
    ## do something
# check if files not previously present in our TList and then add to our TList
elif item not in zip(*TList)[0]:
    ## do something
Since it would appear that the first entry in each sublist is a key of some sort, and the remaining entries are somehow derived from that key, a dictionary might be a more suitable data structure:
vals = {'A': ['B','C'], 'D':['F','G'], 'H':['I','J']}
if 'Z' in vals:
    print 'found Z'
else:
    vals['Z'] = ['aa','bb']
@aix made a good suggestion to use a dict as your data structure; it seems to fit your use case well.
Consider wrapping up the value checking (i.e. 'Does it exist?') and the calculation of the derived values ('aa' and 'bb' in your example?) in a class:
class TList(object):
    def __init__(self):
        self.data = {}
    def __iter__(self):
        return iter(self.data)
    def set(self, key):
        if key not in self:
            self.data[key] = self.do_something(key)
    def get(self, key):
        return self.data[key]
    def do_something(self, key):
        print('Calculating values')
        return ['aa', 'bb']
    def as_old_list(self):
        return [[k, v[0], v[1]] for k, v in self.data.iteritems()]
t = TList()
## Add some values. If new, `do_something()` will be called
t.set('aval')
t.set('bval')
t.set('aval') ## Note, do_something() is not called
## Get a value
t.get('aval')
## 'in ' tests work
'aval' in t
## Give you back your old data structure
t.as_old_list()
If you need to keep the same data structure, something like this should work:
# create a set of already seen items (the first element of each sublist)
seen = set(sub[0] for sub in TList)
# now start adding new items
if item not in seen:
    seen.add(item)
    # add new sublist to TList
Here is a method using sets and set.union:
a = set([1, 2, 3])
b = set([4, 5, 6])
c = set()
master = [a, b, c]
if 2 in set.union(*master):
    pass  # Found it, do something
else:
    pass  # Not in set, do something else
If the reason for testing for membership is simply to avoid adding an entry twice, the set structure uses a.add(12) to add something to a set, but only add it once, thus eliminating the need to test. Thus the following:
>>> a=set()
>>> a.add(1)
>>> a
set([1])
>>> a.add(1)
>>> a
set([1])
If you need the set elsewhere as a list you simply say "list(a)" to get "a" as a list, or "tuple(a)" to get it as a tuple.

Map list onto dictionary

Is there a way to map a list onto a dictionary? What I want to do is give it a function that will return the name of a key, and the value will be the original value. For example;
somefunction(lambda a: a[0], ["hello", "world"])
=> {"h":"hello", "w":"world"}
(This isn't a specific example that I want to do, I want a generic function like map() that can do this)
In Python 3 (and in Python 2.7) you can use this dictionary comprehension syntax:
def foo(somelist):
    return {x[0]:x for x in somelist}
I don't think a standard function exists that does exactly that, but it's very easy to construct one using the dict builtin and a comprehension:
def somefunction(keyFunction, values):
    return dict((keyFunction(v), v) for v in values)
print somefunction(lambda a: a[0], ["hello", "world"])
Output:
{'h': 'hello', 'w': 'world'}
But coming up with a good name for this function is more difficult than implementing it. I'll leave that as an exercise for the reader.
If I understand your question correctly, I believe you can accomplish this with a combination of map, zip, and the dict constructor:
def dictMap(f, xs):
    return dict(zip(map(f, xs), xs))
And a saner implementation:
def dictMap(f, xs):
    return dict((f(i), i) for i in xs)
Taking hints from other answers I achieved this using map operation. I am not sure if this exactly answers your question.
mylist = ["hello", "world"]
def convert_to_dict( somelist ):
    return dict( map( lambda x: (x[0], x), somelist ) )
final_ans = convert_to_dict( mylist )
print final_ans
If you want a general function to do this, then you're asking almost the right question. Your example doesn't specify what happens when the key function produces duplicates, though. Do you keep the last one? The first one? Do you actually want to make a list of all the words that start with the same letter? These questions are probably best answered by the user of the function, not the designer.
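To make those choices concrete, here is a sketch of three possible policies for duplicate keys, using a word list where two words share a first letter. The names keep_last, keep_first and grouped are illustrative only.

```python
from collections import defaultdict

words = ['hello', 'house', 'world']
key = lambda w: w[0]

# Policy 1: later items overwrite earlier ones.
keep_last = {}
for w in words:
    keep_last[key(w)] = w

# Policy 2: the first item seen for a key wins.
keep_first = {}
for w in words:
    keep_first.setdefault(key(w), w)

# Policy 3: collect every item that maps to the same key.
grouped = defaultdict(list)
for w in words:
    grouped[key(w)].append(w)

print(keep_last, keep_first, dict(grouped))
```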
Parametrizing over these results in a more complicated, but very general, function. Here's one that I've used for several years:
def reduce_list(key, update_value, default_value, l):
    """Reduce a list to a dict.
    key :: list_item -> dict_key
    update_value :: list_item * existing_value -> updated_value
    default_value :: initial value passed to update_value
    l :: The list
    default_value comes before l. This is different from functools.reduce,
    because functools.reduce's order is wrong.
    """
    d = {}
    for k in l:
        j = key(k)
        d[j] = update_value(k, d.get(j, default_value))
    return d
Then you can write your function by saying:
reduce_list(lambda s: s[0], lambda s, old: s, '', ['hello', 'world'])
# OR
reduce_list(lambda s: s[0], lambda s, old: old or s, '', ['hello', 'world'])
depending on whether you want to keep the last or the first word starting with, for example, 'h'.
This function is very general, though, so most of the time it's the basis for other functions, like group_dict or histogram:
def group_dict(l):
    return reduce_list(lambda x:x, lambda x,old: [x] + old, [], l)
def histogram(l):
    return reduce_list(lambda x:x, lambda x,total: total + 1, 0, l)
>>> dict((a[0], a) for a in "hello world".split())
{'h': 'hello', 'w': 'world'}
If you want to use a function instead of subscripting, use operator.itemgetter:
>>> from operator import itemgetter
>>> first = itemgetter(0)
>>> dict((first(x), x) for x in "hello world".split())
{'h': 'hello', 'w': 'world'}
Or as a function:
>>> dpair = lambda x : (first(x), x)
>>> dict(dpair(x) for x in "hello world".split())
{'h': 'hello', 'w': 'world'}
Finally, if you want more than one word per letter as a possibility, use collections.defaultdict
>>> from collections import defaultdict
>>> words = defaultdict(set)
>>> addword = lambda x : words[first(x)].add(x)
>>> for word in "hello house home hum world wry wraught".split():
...     addword(word)
...
>>> print words['h']
set(['house', 'hello', 'hum', 'home'])
