Related
If I have a nested list, e.g. x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]], how can I calculate the difference between all of them? Let's called the lists inside x - A, B, and C. I want to calculate the difference of A from B & C, then B from A & C, then C from A & B, then put them in a list diff = [].
My problem is correctly indexing the numbers and using them to do maths with corresponding elements in other lists.
This is what I have so far:
for i in range(len(x)):
diff = []
for j in range(len(x)):
if x[i]!=x[j]:
a = x[i]
b = x[j]
for h in range(len(a)):
d = a[h] - b[h]
diff.append(d)
Essentially for the difference of A to B it is ([1-2] + [2-4] + [3-6])
I would like it to return: diff = [[diff(A,B), diff(A,C)], [diff(B,A), diff(B,C)], [diff(C,A), diff(C,B)]] with the correct differences between points.
Thanks in advance!
Your solution is actually not that far off. As Aniketh mentioned, one issue is your use of x[i] != x[j]. Since x[i] and x[j] are arrays, that will actually always evaluate to false.
The reason is that python will not do a useful comparison of arrays by default. It will just check if the array reference is the same. This is obviously not what you want, you are trying to see if the array is at the same index in x. For that use i !=j.
Though there are other solutions posted here, I'll add mine below because I already wrote it. It makes use of python's list comprehensions.
def pairwise_diff(x):
diff = []
for i in range(len(x)):
A = x[i]
for j in range(len(x)):
if i != j:
B = x[j]
assert len(A) == len(B)
item_diff = [A[i] - B[i] for i in range(len(A))]
diff.append(sum(item_diff))
# Take the answers and group them into arrays of length 2
return [diff[i : i + 2] for i in range(0, len(diff), 2)]
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
print(pairwise_diff(x))
This is one of those problems where it's really helpful to know a bit of Python's standard library — especially itertools.
For example to get the pairs of lists you want to operate on, you can reach for itertools.permutations
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
list(permutations(x, r=2))
This gives the pairs of lists your want:
[([1, 2, 3], [2, 4, 6]),
([1, 2, 3], [3, 5, 7]),
([2, 4, 6], [1, 2, 3]),
([2, 4, 6], [3, 5, 7]),
([3, 5, 7], [1, 2, 3]),
([3, 5, 7], [2, 4, 6])]
Now, if you could just group those by the first of each pair...itertools.groupby does just this.
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
list(list(g) for k, g in groupby(permutations(x, r=2), key=lambda p: p[0]))
Which produces a list of lists grouped by the first:
[[([1, 2, 3], [2, 4, 6]), ([1, 2, 3], [3, 5, 7])],
[([2, 4, 6], [1, 2, 3]), ([2, 4, 6], [3, 5, 7])],
[([3, 5, 7], [1, 2, 3]), ([3, 5, 7], [2, 4, 6])]]
Putting it all together, you can make a simple function that subtracts the lists the way you want and pass each pair in:
from itertools import permutations, groupby
def sum_diff(pairs):
return [sum(p - q for p, q in zip(*pair)) for pair in pairs]
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
# call sum_diff for each group of pairs
result = [sum_diff(g) for k, g in groupby(permutations(x, r=2), key=lambda p: p[0])]
# [[-6, -9], [6, -3], [9, 3]]
This reduces the problem to just a couple lines of code and will be performant on large lists. And, since you mentioned the difficulty in keeping indices straight, notice that this uses no indices in the code other than selecting the first element for grouping.
Here is the code I believe you're looking for. I will explain it below:
def diff(a, b):
total = 0
for i in range(len(a)):
total += a[i] - b[i]
return total
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
differences = []
for i in range(len(x)):
soloDiff = []
for j in range(len(x)):
if i != j:
soloDiff.append(diff(x[i],x[j]))
differences.append(soloDiff)
print(differences)
Output:
[[-6, -9], [6, -3], [9, 3]]
First off, in your explanation of your algorithm, you are making it very clear that you should use a function to calculate the differences between two lists since you will be using it repeatedly.
Your for loops start off fine, but you should have a second list to append diff to 3 times. Also, when you are checking for repeats you need to make sure that i != j, not x[i] != x[j]
Let me know if you have any other questions!!
this is the simplest solution i can think:
import numpy as np
x = [[1, 2, 3], [2, 4, 6], [3, 5, 7]]
x = np.array(x)
vectors = ['A','B','C']
for j in range(3):
for k in range(3):
if j!=k:
print(vectors[j],'-',vectors[k],'=', x[j]-x[k])
which will return
A - B = [-1 -2 -3]
A - C = [-2 -3 -4]
B - A = [1 2 3]
B - C = [-1 -1 -1]
C - A = [2 3 4]
C - B = [1 1 1]
a = [1,2,3,4]
b = [5,6,7,8]
c = [9,10,11,12]
If I want to search for 6, the code should return b
Similarly
a = [[1, 2], [3, 4], [5, 6], [8, 7]]
If I want to search for [2,5], the code should return 0 and 2 because the elements 2 and 5 are in a[0] and a[2] respectively.
This is what I have done so far.
x = []
x.append(a)
x.append(b)
x.append(c)
print(x)
Output:[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
If I want to search for 5, the following code will return the index of 5. I want to know if there is a better way to solve this & want to know how to solve the second part.
for i in range(len(x)):
if(5 in x[i]):
print(i)
else:
continue
Output:1
for 1st part:
a = [1,2,3,8]
b = [5,6,7,8]
c = [9,10,11,12]
l = np.array(list(zip(a,b,c))).T
res = np.where(np.isin(l, [2,5]))[0]
l:
array([[ 1, 2, 3, 8],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
res:
array([0, 1])
for 2nd part:
a = [[1, 2], [3, 4], [5, 6], [8, 7]]
res = np.where(np.isin(a, [2,5,1]))[0]
res:
array([0, 0, 2])
Try the below code. I believe this should give the correct output
x = [[1, 2], [3, 4], [5, 6], [8, 7]]
y = [2,5]
for i in y:
for c, j in enumerate(x):
for k in j:
if i == k:
print(c)
It basically traverses the each list in x to find the values in y list.
How can i create a multidimensional list while iterating through a 1d list based upon some condition.
I am iterating over a 1d list and whenever i find a '\n' i should append the so created list with a new list, for example,
a = [1,2,3,4,5,'\n',6,7,8,9,0,'\n',3,45,6,7,2]
so I want it to be as,
new_list = [[1,2,3,4],[6,7,8,9,0],[3,45,6,7,2]]
how should i do it? Please help
def storeData(k):
global dataList
dlist = []
for y in k:
if y != '\n':
dlist.append(y)
else:
break
return dlist
This is what i have tried.
example code:
lst = [[]]
for x in a:
if x != '\n':
lst[-1].append(x)
else:
lst.append([])
print(lst)
output:
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 0], [3, 45, 6, 7, 2]]
Using itertools.groupby would do the job (grouping by not being a linefeed):
import itertools
a = [1,2,3,4,5,'\n',6,7,8,9,0,'\n',3,45,6,7,2]
new_list = [list(x) for k,x in itertools.groupby(a,key=lambda x : x!='\n') if k]
print(new_list)
We compare the key truth value to filter out the occurrences of \n
result:
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 0], [3, 45, 6, 7, 2]]
This is how I did it but there must be better solutions.
x = 0
output = [[]]
for item in a:
if item != "\n":
output[x].append(item)
else:
x += 1
output.append([])
print(output)
Here is the basic approach:
EDIT: Good heavens! Silly bug... here's a fix:
>>> sub = []
>>> final = []
>>> for e in a:
... if e == '\n':
... final.append(sub)
... sub = []
... else:
... sub.append(e)
... else:
... final.append(sub)
...
>>> final
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 0], [3, 45, 6, 7, 2]]
>>>
There are other ways, but this is the naive imperative way.
In a project I am currently working on I have implemented about 80% of what I want my program to do and I am very happy with the results.
In the remaining 20% I am faced with a problem which puzzles me a bit on how to solve.
Here it is:
I have come up with a list of lists which contain several numbers (arbitrary length)
For example:
listElement[0] = [1, 2, 3]
listElement[1] = [3, 6, 8]
listElement[2] = [4, 9]
listElement[4] = [6, 11]
listElement[n] = [x, y, z...]
where n could reach up to 40,000 or so.
Assuming each list element is a set of numbers (in the mathematical sense), what I would like to do is to derive all the combinations of mutually exclusive sets; that is, like the powerset of the above list elements, but with all non-disjoint-set elements excluded.
So, to continue the example with n=4, I would like to come up with a list that has the following combinations:
newlistElement[0] = [1, 2, 3]
newlistElement[1] = [3, 6, 8]
newlistElement[2] = [4, 9]
newlistElement[4] = [6, 11]
newlistElement[5] = [[1, 2, 3], [4, 9]]
newlistElement[6] = [[1, 2, 3], [6, 11]]
newlistElement[7] = [[1, 2, 3], [4, 9], [6, 11]]
newlistElement[8] = [[3, 6, 8], [4, 9]]
newlistElement[9] = [[4, 9], [6, 11]
An invalid case, for example would be combination [[1, 2, 3], [3, 6, 8]] because 3 is common in two elements.
Is there any elegant way to do this? I would be extremely grateful for any feedback.
I must also specify that I would not like to do the powerset function, because the initial list could have quite a large number of elements (as I said n could go up to 40000), and taking the powerset with so many elements would never finish.
I'd use a generator:
import itertools
def comb(seq):
for n in range(1, len(seq)):
for c in itertools.combinations(seq, n): # all combinations of length n
if len(set.union(*map(set, c))) == sum(len(s) for s in c): # pairwise disjoint?
yield list(c)
for c in comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
print c
This produces:
[[1, 2, 3]]
[[3, 6, 8]]
[[4, 9]]
[[6, 11]]
[[1, 2, 3], [4, 9]]
[[1, 2, 3], [6, 11]]
[[3, 6, 8], [4, 9]]
[[4, 9], [6, 11]]
[[1, 2, 3], [4, 9], [6, 11]]
If you need to store the results in a single list:
print list(comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]))
The following is a recursive generator:
def comb(input, lst = [], lset = set()):
if lst:
yield lst
for i, el in enumerate(input):
if lset.isdisjoint(el):
for out in comb(input[i+1:], lst + [el], lset | set(el)):
yield out
for c in comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
print c
This is likely to be a lot more efficient than the other solutions in situations where a lot of sets have common elements (of course in the worst case it still has to iterate over the 2**n elements of the powerset).
The method used in the program below is similar to a couple of previous answers in excluding not-disjoint sets and therefore usually not testing all combinations. It differs from previous answers by greedily excluding all the sets it can, as early as it can. This allows it to run several times faster than NPE's solution. Here is a time comparison of the two methods, using input data with 200, 400, ... 1000 size-6 sets having elements in the range 0 to 20:
Set size = 6, Number max = 20 NPE method
0.042s Sizes: [200, 1534, 67]
0.281s Sizes: [400, 6257, 618]
0.890s Sizes: [600, 13908, 2043]
2.097s Sizes: [800, 24589, 4620]
4.387s Sizes: [1000, 39035, 9689]
Set size = 6, Number max = 20 jwpat7 method
0.041s Sizes: [200, 1534, 67]
0.077s Sizes: [400, 6257, 618]
0.167s Sizes: [600, 13908, 2043]
0.330s Sizes: [800, 24589, 4620]
0.590s Sizes: [1000, 39035, 9689]
In the above data, the left column shows execution time in seconds. The lists of numbers show how many single, double, or triple unions occurred. Constants in the program specify data set sizes and characteristics.
#!/usr/bin/python
from random import sample, seed
import time
nsets, ndelta, ncount, setsize = 200, 200, 5, 6
topnum, ranSeed, shoSets, shoUnion = 20, 1234, 0, 0
seed(ranSeed)
print 'Set size = {:3d}, Number max = {:3d}'.format(setsize, topnum)
for casenumber in range(ncount):
t0 = time.time()
sets, sizes, ssum = [], [0]*nsets, [0]*(nsets+1);
for i in range(nsets):
sets.append(set(sample(xrange(topnum), setsize)))
if shoSets:
print 'sets = {}, setSize = {}, top# = {}, seed = {}'.format(
nsets, setsize, topnum, ranSeed)
print 'Sets:'
for s in sets: print s
# Method by jwpat7
def accrue(u, bset, csets):
for i, c in enumerate(csets):
y = u + [c]
yield y
boc = bset|c
ts = [s for s in csets[i+1:] if boc.isdisjoint(s)]
for v in accrue (y, boc, ts):
yield v
# Method by NPE
def comb(input, lst = [], lset = set()):
if lst:
yield lst
for i, el in enumerate(input):
if lset.isdisjoint(el):
for out in comb(input[i+1:], lst + [el], lset | set(el)):
yield out
# Uncomment one of the following 2 lines to select method
#for u in comb (sets):
for u in accrue ([], set(), sets):
sizes[len(u)-1] += 1
if shoUnion: print u
t1 = time.time()
for t in range(nsets-1, -1, -1):
ssum[t] = sizes[t] + ssum[t+1]
print '{:7.3f}s Sizes:'.format(t1-t0), [s for (s,t) in zip(sizes, ssum) if t>0]
nsets += ndelta
Edit: In function accrue, arguments (u, bset, csets) are used as follows:
• u = list of sets in current union of sets
• bset = "big set" = flat value of u = elements already used
• csets = candidate sets = list of sets eligible to be included
Note that if the first line of accrue is replaced by
def accrue(csets, u=[], bset=set()):
and the seventh line by
for v in accrue (ts, y, boc):
(ie, if parameters are re-ordered and defaults given for u and bset) then accrue can be invoked via [accrue(listofsets)] to produce its list of compatible unions.
Regarding the ValueError: zero length field name in format error mentioned in a comment as occurring when using Python 2.6, try the following.
# change:
print "Set size = {:3d}, Number max = {:3d}".format(setsize, topnum)
# to:
print "Set size = {0:3d}, Number max = {1:3d}".format(setsize, topnum)
Similar changes (adding appropriate field numbers) may be needed in other formats in the program. Note, the what's new in 2.6 page says “Support for the str.format() method has been backported to Python 2.6”. While it does not say whether field names or numbers are required, it does not show examples without them. By contrast, either way works in 2.7.3.
using itertools.combinations, set.intersection and for-else loop:
from itertools import *
lis=[[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]
def func(lis):
for i in range(1,len(lis)+1):
for x in combinations(lis,i):
s=set(x[0])
for y in x[1:]:
if len(s & set(y)) != 0:
break
else:
s.update(y)
else:
yield x
for item in func(lis):
print item
output:
([1, 2, 3],)
([3, 6, 8],)
([4, 9],)
([6, 11],)
([1, 2, 3], [4, 9])
([1, 2, 3], [6, 11])
([3, 6, 8], [4, 9])
([4, 9], [6, 11])
([1, 2, 3], [4, 9], [6, 11])
Similar to NPE's solution, but it's without recursion and it returns a list:
def disjoint_combinations(seqs):
disjoint = []
for seq in seqs:
disjoint.extend([(each + [seq], items.union(seq))
for each, items in disjoint
if items.isdisjoint(seq)])
disjoint.append(([seq], set(seq)))
return [each for each, _ in disjoint]
for each in disjoint_combinations([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
print each
Result:
[[1, 2, 3]]
[[3, 6, 8]]
[[1, 2, 3], [4, 9]]
[[3, 6, 8], [4, 9]]
[[4, 9]]
[[1, 2, 3], [6, 11]]
[[1, 2, 3], [4, 9], [6, 11]]
[[4, 9], [6, 11]]
[[6, 11]]
One-liner without employing the itertools package.
Here's your data:
lE={}
lE[0]=[1, 2, 3]
lE[1] = [3, 6, 8]
lE[2] = [4, 9]
lE[4] = [6, 11]
Here's the one-liner:
results=[(lE[v1],lE[v2]) for v1 in lE for v2 in lE if (set(lE[v1]).isdisjoint(set(lE[v2])) and v1>v2)]
I have a list of lists, each containing a different number of strings. I'd like to (efficiently) convert these all to ints, but am feeling kind of dense, since I can't get it to work out for the life of me. I've been trying:
newVals = [int(x) for x in [row for rows in values]]
Where 'values' is the list of lists. It keeps saying that x is a list and can therefore not be the argument if int(). Obviously I'm doing something stupid here, what is it? Is there an accepted idiom for this sort of thing?
This leaves the ints nested
[map(int, x) for x in values]
If you want them flattened, that's not hard either
for Python3 map() returns an iterator. You could use
[list(map(int, x)) for x in values]
but you may prefer to use the nested LC's in that case
[[int(y) for y in x] for x in values]
How about:
>>> a = [['1','2','3'],['4','5','6'],['7','8','9']]
>>> [[int(j) for j in i] for i in a]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Another workaround
a = [[1, 2, 3], [7, 8, 6]]
list(map(lambda i: list(map(lambda j: j - 1, i)), a))
[[0, 1, 2], [6, 7, 5]] #output
You simply use incorrect order and parenthesis - should be:
inputVals = [['1','2','3'], ['3','3','2','2']]
[int(x) for row in inputVals for x in row]
Or if you need list of list at the output then:
map(lambda row: map(int, row), inputVals)
an ugly way is to use evalf:
>>> eval(str(a).replace("'",""))
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
if you don't mind all your numbers in one array you could go:
>>> a = [['1','2','3'],['4','5','6'],['7','8','9']]
>>> map(int,sum(a,[]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
In order to map list with any number of dimensions you could use numpy.apply_over_axes
import numpy as np
np.apply_over_axes(lambda x,_:x*2, np.array([[1,2,3],[5,2,1]]),[0])
--------------------
array([[ 2, 4, 6],
[10, 4, 2]])
Unfortunately that doesn't work if you also need to change variable type. Didn't find any library solution for this, so here is the code to do that:
def map_multi_dimensional_list(l, transform):
if type(l) == list and len(l) > 0:
if type(l[0]) != list:
return [transform(v) for v in l]
else:
return [map_multi_dimensional_list(v, transform) for v in l]
else:
return []
map_multi_dimensional_list([[[1,2,3],[5,2,1]],[[10,20,30],[50,20,10]]], lambda x:x*2)
------------------
[[[2, 4, 6], [10, 4, 2]], [[20, 40, 60], [100, 40, 20]]]