Iterate a list with unknown sublist inside in Python? - python

I am working on a clustering algorithm. After it clusters the data, it returns a structure that contains single documents (class Document) and groups (class Group) of documents, for example:
Group(Document(id='NSVcteD-5', name=u'1332410487000-2ed0728e9015028e7c41341011a1bd82'), Group(Document(id='NSVcteD-11', name=u'1332410485000-18ae371b18b3790874fb886085c770af'), Group(Document(id='NSVcteD-12', name=u'1332410484000-dc544efc146674289b126062b000a302'), Group(Document(id='NSVcteD-6', name=u'1332410487000-25e815a47779642df2a416495bd5174c'), Group(Document(id='NSVcteD-7', name=u'1332410485000-eb66881f5b1c633dd1609ad6fc18a45c'), Group(Document(id='NSVcteD-2', name=u'1332410487000-a39e2076ca4477e8a324081732bd36c0'), Group(Document(id='NSVcteD-9', name=u'1332410485000-db1acc63d72a63f65623610242394877'), Group(Group(Document(id='NSVcteD-13', name=u'1332410152000-13ea7da3c74917b86bb70e59ff356397'), Document(id='NSVcteD-3', name=u'1332410487000-6287c3d86e6416cb421b6f176a367e23')), Group(Document(id='NSVcteD-10', name=u'1332410485000-508937f6a4cae9ed79dbd54f016ca61c'), Group(Document(id='NSVcteD-4', name=u'1332410487000-4b16fa5633a9df1341690d9a32a4f06d'), Group(Document(id='NSVcteD-1', name=u'1332410487000-b6696b10ad4415c87e41e5367fd4bcfa'), Group(Document(id='NSVcteD-8', name=u'1332410485000-e3f77be9cddcb9efc07914654454d817'), Group(Document(id='NSVcteD-14', name=u'1332410151000-cc13783d0980106d686d64082121f6ac'), Document(id='NSVcteD-15', name=u'1332410151000-a91330e828e41ed3b8503f3133f61fc7'))))))))))))))
To make it easy to understand, I have pasted a real object generated by my script. It is a multi-tiered list, and I have no idea how to iterate over it in order to manipulate it, for example to convert it to a JSON-style string.
Any help is much appreciated.

Do you mean something like this (Python 2)? Note that it is inefficient, and it will hit the recursion limit once the nesting gets deep enough.
>>> def recursive_iterate(iterable):
...     iterated_object = []
...     for elem in iterable:
...         if hasattr(elem, "__iter__"):
...             iterated_object.append(recursive_iterate(elem))
...         else:
...             iterated_object.append(elem)
...     return iterated_object
...
>>> recursive_iterate([1,2,3,[4,5,6]])
[1, 2, 3, [4, 5, 6]]
>>> recursive_iterate([1,2,3,xrange(10)])
[1, 2, 3, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]
>>> recursive_iterate([1,2,3,[4,5,6,[xrange(10)]]])
[1, 2, 3, [4, 5, 6, [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]]]
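For the JSON part of the question, here is a minimal sketch in Python 3. Document and Group below are hypothetical stand-ins, since the real class definitions are not shown, and to_plain is a name made up for this example. The idea is the same recursion as above, except that each Document becomes a dict and each Group becomes a list.

```python
import json
from collections import namedtuple

# Hypothetical stand-ins for the asker's classes; the real ones may differ.
Document = namedtuple("Document", ["id", "name"])


class Group(tuple):
    """A group modelled as a tuple of Documents and/or nested Groups."""

    def __new__(cls, *members):
        return super().__new__(cls, members)


def to_plain(node):
    """Turn a Document/Group tree into dicts and lists that json can dump."""
    if isinstance(node, Document):
        return {"id": node.id, "name": node.name}
    # Anything else is assumed to be a Group: recurse into its members.
    return [to_plain(child) for child in node]


tree = Group(Document("d1", "a"), Group(Document("d2", "b"), Document("d3", "c")))
as_json = json.dumps(to_plain(tree))
print(as_json)
```

If the real Group is not iterable, the list comprehension would have to walk whatever attribute holds its members instead.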

Related

Relative size in list python

I have a list of integers. Each number can appear several times, the list is unordered.
I want to get the list of relative sizes. Meaning, if for example the original list is [2, 5, 7, 7, 3, 10] then the desired output is [0, 2, 3, 3, 1, 4]
Because 2 is the zero'th smallest number in the original list, 3 is one'th, etc.
Any clear easy way to do this?
Try a list comprehension with a dictionary that maps each unique value (obtained with set) to its rank:
>>> lst = [2, 5, 7, 7, 3, 10]
>>> newl = dict(zip(sorted(set(lst)), range(len(set(lst)))))
>>> [newl[i] for i in lst]
[0, 2, 3, 3, 1, 4]
>>>
Or use index:
>>> lst = [2, 5, 7, 7, 3, 10]
>>> newl = sorted(set(lst))
>>> [newl.index(i) for i in lst]
[0, 2, 3, 3, 1, 4]
>>>
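The index version rescans the sorted list for every element, which gets slow for long lists. A small sketch of the dictionary idea wrapped in a function (relative_sizes is just an illustrative name), where each look-up is a single dict access:

```python
def relative_sizes(values):
    """Map each value to its rank among the distinct values, smallest first."""
    rank = {value: position for position, value in enumerate(sorted(set(values)))}
    return [rank[value] for value in values]


print(relative_sizes([2, 5, 7, 7, 3, 10]))  # [0, 2, 3, 3, 1, 4]
```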

Python move all elements of a list one position back with same len

def list_move_back(new_value, value_list):
    for i in reversed(value_list):
        if value_list.index(i) != len(value_list)-1:
            value_list[value_list.index(i)+1] = i
    value_list[0] = new_value
    return value_list
I want to get the following result:
list_example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list_example = list_move_back(12, list_example)
print(list_example)
>>>[12, 1, 2, 3, 4, 5, 6, 7, 8, 9]
It works if I run the function two times:
list_example = list_move_back(12, list_example)
print(list_example)
>>>[12, 12, 1, 2, 3, 4, 5, 6, 7, 8]
but if I want to run it a third time, the result looks like that:
list_example = list_move_back(12, list_example)
print(list_example)
>>>[12, 12, 1, 1, 3, 4, 5, 6, 7, 8]
The first 1 should be a 12. I have no idea why it doesn't work.
Just use list slicing:
def list_move_back(new_value, list_of_values):
    return [new_value] + list_of_values[:-1]
Explanation: list_of_values[:-1] returns all the elements except for the last. By appending it to the new value, you get the wanted result. This answer has a pretty cool explanation of how list slicing works.
Also, if for some reason you'd like the "verbose" way to do this (maybe for an exercise or whatever), here's a way to go about it:
def list_move_back(new_value, list_of_values):
    for i in range(len(list_of_values)-1, 0, -1):
        list_of_values[i] = list_of_values[i-1]
    list_of_values[0] = new_value
    return list_of_values
I'd recommend list slicing over this method 9/10 times but again, I'm just leaving this here because there might be a case where someone wants to do this as some sort of mental exercise for indexing.
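As for why the function in the question misbehaves on the third call: list.index(i) returns the position of the first element equal to i, so once the list holds two 12s, both look-ups land on position 0. A small Python 3 sketch of that effect (the variable names are only for illustration):

```python
values = [12, 12, 1, 2]

# list.index always reports the FIRST matching position, so the second 12
# is "found" at position 0 rather than at position 1.
positions_via_index = [values.index(v) for v in values]

# enumerate yields the true position of every element, duplicates included.
positions_via_enumerate = [i for i, _ in enumerate(values)]

print(positions_via_index)      # [0, 0, 2, 3]
print(positions_via_enumerate)  # [0, 1, 2, 3]
```

Looping over positions, as the range-based version does, sidesteps the problem because it never has to search for a value.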
If you need the list to change in place, you can use the list methods .pop() to remove the last item and .insert(0,value) to add an item to the front:
>>> L = list(range(1,11))
>>> L
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> id(L)
1772071032392
>>> L.pop();L.insert(0,12)
10
>>> L
[12, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> id(L) # same list id, modified in place...
1772071032392
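If a different container is acceptable, collections.deque with maxlen gives the same "push at the front, drop from the back" behaviour without an explicit pop. This is a sketch of an alternative, not something the question asked for; window is an illustrative name.

```python
from collections import deque

# A bounded deque discards items from the opposite end once it is full.
window = deque(range(1, 11), maxlen=10)
window.appendleft(12)  # the 10 at the right-hand end is dropped automatically

print(list(window))  # [12, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```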

cant iterate nested for loop as wanted -python -maybe a simple mistake

I have tried the code below: the purpose is to generate a dictionary where each key has a list as a value. The first iteration goes well and generates the item as I want it, but the second loop, the nested for loop, doesn't generate the list as expected.
Please help me with this simple code. There must be something wrong with it, the code is as below:
schop = [1, 3, 1, 5, 6, 2, 1, 4, 3, 5, 6, 6, 2, 2, 3, 4, 4, 5]
mop = [1, 1, 2, 1, 1, 1, 3, 1, 2, 2, 2, 3, 2, 3, 3, 2, 3, 3]
mlist = ["1","2","3"]
wmlist=zip(mop,schop)
title ={}
for m in mlist:
    m = int(m)
    k=[]
    for a,b in wmlist:
        if a == m:
            k.append(b)
    title[m]=k
print(title)
The results are like:
title: {1: [1, 3, 5, 6, 2, 4], 2: [], 3: []}
Why do the second key and the third key have an empty list?
Thanks!
Your code would have worked as you expect in Python 2, where zip creates a list of tuples.
In Python 3, zip is an iterator. Once you iterate over it, it gets exhausted, so your second and third for loops won't have anything left to iterate over.
The simplest solution here would be to create a list from the iterator:
wmlist = list(zip(mop,schop))
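A short Python 3 sketch of the exhaustion effect, using made-up data rather than the lists from the question:

```python
pairs = zip([1, 2, 1], ["a", "b", "c"])
first_pass = [b for a, b in pairs if a == 1]
second_pass = [b for a, b in pairs if a == 2]  # the iterator is already spent

# Materialising the pairs once makes them reusable.
reusable = list(zip([1, 2, 1], ["a", "b", "c"]))
first_again = [b for a, b in reusable if a == 1]
second_again = [b for a, b in reusable if a == 2]

print(first_pass, second_pass)    # ['a', 'c'] []
print(first_again, second_again)  # ['a', 'c'] ['b']
```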
I think the first thing to consider is which version of Python you have installed.
This is the output I got from your code in Python 2:
{1: [1, 3, 5, 6, 2, 4], 2: [1, 3, 5, 6, 2, 4], 3: [1, 6, 2, 3, 4, 5]}
But in Python 3 this is what I got:
{1: [1, 3, 5, 6, 2, 4], 2: [], 3: []}
If you are sure you are running the right version, the only other thing to check is the indentation of your code. Good luck!
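Another way to avoid the problem is to walk the zip only once and group as you go. This is a sketch of an alternative approach using collections.defaultdict, with the data from the question; it needs neither mlist nor the nested loop.

```python
from collections import defaultdict

schop = [1, 3, 1, 5, 6, 2, 1, 4, 3, 5, 6, 6, 2, 2, 3, 4, 4, 5]
mop = [1, 1, 2, 1, 1, 1, 3, 1, 2, 2, 2, 3, 2, 3, 3, 2, 3, 3]

grouped = defaultdict(list)
for key, value in zip(mop, schop):  # a single pass, so exhaustion is not an issue
    grouped[key].append(value)

title = dict(grouped)
print(title)
```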

Python list extend functionality using slices

I'm teaching myself Python ahead of starting a new job. It's a Django job, so I have to stick to 2.7. As such, I'm reading Beginning Python by Hetland and don't understand his example of using slices to replicate list.extend() functionality.
First, he shows the extend method by
a = [1, 2, 3]
b = [4, 5, 6]
a.extend(b)
produces [1, 2, 3, 4, 5, 6]
Next, he demonstrates extend by slicing via
a = [1, 2, 3]
b = [4, 5, 6]
a[len(a):] = b
which produces the exact same output as the first example.
How does this work? a has a length of 3, and the terminating slice index is empty, signifying that the slice runs to the end of the list. How do the b values get added to a?
Python's slice-assignment syntax means "make this slice equal to this value, expanding or shrinking the list if necessary". To fully understand it you may want to try out some other slice values:
a = [1, 2, 3]
b = [4, 5, 6]
First, let's replace part of a with b (each snippet below starts again from the original a and b):
a[1:2] = b
print(a) # prints [1, 4, 5, 6, 3]
Instead of replacing some values, you can add them by assigning to a zero-length slice:
a[1:1] = b
print(a) # prints [1, 4, 5, 6, 2, 3]
Any slice that is "out of bounds" instead simply addresses an empty area at one end of the list or the other (too large positive numbers will address the point just off the end while too large negative numbers will address the point just before the start):
a[200:300] = b
print(a) # prints [1, 2, 3, 4, 5, 6]
Your example code simply uses the most "accurate" out of bounds slice at the end of the list. I don't think that is code you'd use deliberately for extending, but it might be useful as an edge case that you don't need to handle with special logic.
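To try these cases side by side, here is a small Python 3 sketch. assign_slice is a helper made up for this example; it rebuilds a fresh a and b on every call so the cases do not affect one another.

```python
def assign_slice(start, stop):
    """Assign b to a[start:stop] on a fresh copy of the example lists."""
    a = [1, 2, 3]
    b = [4, 5, 6]
    a[start:stop] = b
    return a


print(assign_slice(1, 2))        # [1, 4, 5, 6, 3]     replaces one element
print(assign_slice(1, 1))        # [1, 4, 5, 6, 2, 3]  inserts at position 1
print(assign_slice(200, 300))    # [1, 2, 3, 4, 5, 6]  clamped to the end
print(assign_slice(-200, -100))  # [4, 5, 6, 1, 2, 3]  clamped to the start
print(assign_slice(3, None))     # [1, 2, 3, 4, 5, 6]  same as a[len(a):] = b
```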
It's simply an extension of normal indexing.
>>> L
[1, 2, 3, 4, 5]
>>> L[2] = 42
>>> L
[1, 2, 42, 4, 5]
The __setitem__() method detects that a slice is being used instead of a normal index and behaves appropriately.
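A sketch of what that looks like when spelled out by hand in Python 3: the slice syntax builds a slice object, and slice.indices shows how an out-of-range slice is clamped for a list of a given length.

```python
a = [1, 2, 3]
b = [4, 5, 6]

# a[len(a):] = b is shorthand for passing a slice object to __setitem__.
a.__setitem__(slice(len(a), None), b)
print(a)  # [1, 2, 3, 4, 5, 6]

# For a list of length 3, a far-out-of-range slice collapses to the empty
# region at the end: (start, stop, step) == (3, 3, 1).
print(slice(200, 300).indices(3))
```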
a = [1, 2, 3]
b = [4, 5, 6]
a[len(a):] = b
means that the elements of a from position len(a) onward are replaced by the elements of b. Since there is nothing at or after position len(a), this extends a with b.
For a demonstration, consider looking at a subclass of list:
from __future__ import print_function # so I can run on Py 3 and Py 2
class EdList(list):
    def __setitem__(self,index,value):
        print('setitem: index={}, value={}'.format(index,value))
        list.__setitem__(self,index,value)
        print(self)
    def __setslice__(self,i,j,seq):
        print('setslice: i:{}, j:{}, seq:{}'.format(i,j,seq))
        self.__setitem__(slice(i,j),seq)
Running on Python 3:
>>> a=EdList(range(10))
>>> a[300000:]=[1,2,3]
setitem: index=slice(300000, None, None), value=[1, 2, 3]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3]
>>> a[1:1]=[4,5,6]
setitem: index=slice(1, 1, None), value=[4, 5, 6]
[0, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3]
Running on Python 2:
>>> a=EdList(range(10))
>>> a[300000:]=[1,2,3]
setslice: i:300000, j:9223372036854775807, seq:[1, 2, 3]
setitem: index=slice(300000, 9223372036854775807, None), value=[1, 2, 3]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3]
>>> a[1:1]=[4,5,6]
setslice: i:1, j:1, seq:[4, 5, 6]
setitem: index=slice(1, 1, None), value=[4, 5, 6]
[0, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3]
It is confusing when you are first learning it, but you will learn to love it I think.

Identifying References in Python

I just spent a very long time debugging an issue in python, using the web.py framework, and it has me wondering about a way to check this kind of thing in the future.
In short, one of the methods of web.py's database class was returning a storage object that, I can only surmise, was a reference to a value in memory, instead of a copy. The result was that no matter what I tried to do, I could not append multiple results of a query into a list, because the list just wound up with copies of the same row of data.
To solve it, I had to create another dictionary and convert all values in the web.py storage object explicitly to another datatype (I just used strings to get it working) so that it would be forced to create a new object.
In code, this is what was happening:
>>> result = db.query(query_params)
>>> for row in result:
...     print row
...     result_list.append(row)
...
{"result" : "from", "row" : "one"}
{"result" : "from", "row" : "two"}
>>> print result_list
[{"result" : "from", "row" : "two"}, {"result" : "from", "row" : "two"}]
This is what clued me into the fact that this was some sort of reference issue. The list was storing the location of "row", instead of a copy of it.
The first thing I tried was something along the lines of:
copy = {}
for row in result:
    for value in row:
        copy[value] = row[value]
    result_list.append(copy)
But this led to the same problem.
I only found a solution by tweaking the above to read:
copy[value] = str(row[value])
So, my question is two-fold, really:
Is there an easy way to tell how something is being stored and/or passed around?
Is there a way to explicitly request a copy instead of a reference?
Use copy. It allows simple shallow- and deep-copying of objects in python:
import copy
result_list.append(copy.copy(row))
Welcome to how Python works. It's easy to identify which objects are passed by reference in Python: all of them are. It's just that many (integers, strings, tuples) are immutable, so you don't really notice when more than one name is pointing to them since you can't change them.
If you need a copy, make a copy explicitly. Usually this can be done using the type's constructor:
newlist = list(oldlist)
newdict = dict(olddict)
Lists can also do it with a slice, which you'll often see:
newlist = oldlist[:]
Dictionaries have a copy() method:
newdict = olddict.copy()
These are "shallow" copies: a new container is made, but the items inside are still the original references. This can bite you, for example, with lists of lists. A module called copy exists that contains functions for copying nearly any object, and a deepcopy function that will also copy the contained objects to any depth.
I don't know anything about web.py, but most likely a result_list.append(dict(row)) is what you were looking for.
See also: copy.copy() and copy.deepcopy()
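A minimal Python 3 sketch of the difference between the two, using a list of lists (the variable names are only for illustration):

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)    # new outer list, same inner lists
deep = copy.deepcopy(original)   # new outer list AND new inner lists

original[0].append(99)

print(shallow[0])  # [1, 2, 99]  the inner list is shared with original
print(deep[0])     # [1, 2]      the deep copy is unaffected
```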
Well, if you're curious about identity you can always check using the is operator:
>>> help('is')
The reason your attempt didn't work is because you were creating the dictionary outside the loop, so of course you're going to have problems:
>>> mydict = {}
>>> for x in xrange(3):
...     d = mydict
...     print d is mydict
...
True
True
True
No matter how many times you add things to d or mydict, mydict will always be the same object.
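Here is a sketch of the difference in Python 3, with a plain list of dicts standing in for the query result. The rows data is made up; real web.py rows are storage objects, which this example only assumes behave like dicts.

```python
rows = [{"row": "one"}, {"row": "two"}]  # hypothetical stand-in for the query result

# One dict created outside the loop: every append stores the same object.
shared = {}
bad = []
for row in rows:
    for key in row:
        shared[key] = row[key]
    bad.append(shared)

# A new dict created inside the loop: every append stores a distinct object.
good = []
for row in rows:
    good.append(dict(row))

print(bad)   # [{'row': 'two'}, {'row': 'two'}]
print(good)  # [{'row': 'one'}, {'row': 'two'}]
```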
Others have commented that you should use copy, but they've failed to address your underlying issue - you don't seem to understand reference types.
In Python, everything is an object. There are two basic types - mutable and immutable. Immutable objects are things like strings, numbers, and tuples. You're not allowed to change these objects in memory:
>>> x = 3
>>> y = x
>>> x is y
True
Now x and y refer to the same object (not just have the same value).
>>> x += 4
Because 3 is an integer and therefore immutable, this operation does not modify the object that x refers to. Instead it adds 3 and 4, gets 7, and rebinds x so that it now "points" to 7.
>>> y
3
>>> x
7
>>> x is y
False
>>> y = x
>>> x is y
True
With mutable objects, you can modify them in place:
>>> mylist = [1,2,3]
>>> mylist[0] = 3
>>> mylist
[3, 2, 3]
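The practical consequence shows up once two names refer to the same mutable object. A short Python 3 sketch contrasting that with the integer case above (names are illustrative):

```python
first = [1, 2, 3]
second = first          # a second name for the same list, not a copy
second.append(4)        # modifies the one shared list in place

print(first)            # [1, 2, 3, 4]
print(first is second)  # True

x = 3
y = x
x += 4                  # rebinds x to a different int; y is untouched
print(x, y)             # 7 3
```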
You'll have a much easier time if you stop thinking of variables in Python as boxes that store values, and instead think of them as name tags - you know, the ones that say "Hello, My Name Is"?
So essentially what's happening in your code is that the object being returned through your loop is the same one each time.
Here's an example:
>>> def spam_ref():
...     thing = []
...     count = 0
...     while count < 10:
...         count += 1
...         thing.append(count)
...         yield thing
...
>>> for a in spam_ref():
...     print a
...
[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 7]
[1, 2, 3, 4, 5, 6, 7, 8]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> spam = []
>>> for a in spam_ref():
...     spam.append(a)
...
So if you look at the output of the loop, you'd think that spam now contains
[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 7]
[1, 2, 3, 4, 5, 6, 7, 8]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
But it doesn't:
>>> for a in spam:
...     print a
...
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
This happens because a list is mutable, so the generator function was able to keep modifying the same list in place after yielding it. In your case it is a dictionary that's being returned, so as others have mentioned, you'll need to use some method of copying the dictionary.
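For comparison, a Python 3 sketch of the same generator yielding a copy each time (spam_copy is a name made up for this example, and the loop is shortened to three items):

```python
def spam_copy():
    thing = []
    for count in range(1, 4):
        thing.append(count)
        yield list(thing)  # hand out a snapshot, not the list being mutated


spam = list(spam_copy())
print(spam)  # [[1], [1, 2], [1, 2, 3]]
```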
For further reading, check out Python Objects and Call By Object over at effbot.
Either of these two should work:
result = db.query(query_params).list()
result = list(db.query(query_params))
db.query and db.select return an iterator, so you need to convert it into a list.
