I have two lists, and I want to multiply each number in the first list by all numbers in the second list:
[1,2]x[1,2,3]
I want my result to be like this [(1x1)+(1x2)+(1x3),(2x1)+(2x2)+(2x3)]
numpy
import numpy as np
a = np.array([1,2])
b = np.array([1,2,3])
c = (a[:,None]*b).sum(1)
output: array([ 6, 12])
python
a = [1,2]
b = [1,2,3]
c = [sum(x*y for y in b) for x in a]
output: [6, 12]
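A side note on the arithmetic: each requested sum factors as x * (1 + 2 + 3), so the full table of products is not strictly needed. The sketch below checks that the broadcasting route and the factored route agree; the names via_broadcast and via_factoring are illustrative and not part of the answer.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([1, 2, 3])

# Broadcasting version: build the 2x3 table of products, then sum each row.
via_broadcast = (a[:, None] * b).sum(axis=1)

# Factored version: sum b once, then scale every element of a by that total.
via_factoring = a * b.sum()

print(via_broadcast.tolist())  # [6, 12]
print(via_factoring.tolist())  # [6, 12]
```

The factored form avoids allocating the intermediate len(a) x len(b) array, which may matter for long inputs.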
old answer (product per element)
numpy
a = np.array([1,2])
b = np.array([1,2,3])
c = (a[:,None]*b).ravel()
output: array([1, 2, 3, 2, 4, 6])
python
a = [1,2]
b = [1,2,3]
c = [x*y for x in a for y in b]
## OR
from itertools import product
c = [x*y for x,y in product(a,b)]
output: [1, 2, 3, 2, 4, 6]
def multiplyLists(list1: list, list2: list) -> list:
    toReturn = []
    for i in list1:
        temp_sum = 0
        for j in list2:
            temp_sum += i * j
        toReturn.append(temp_sum)
    return toReturn
Another way using numpy (that you can extend to many other functions between two lists):
import numpy as np
a = [1,2]
b = [1,2,3]
np.multiply.outer(a,b).ravel()
#array([1, 2, 3, 2, 4, 6])
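To illustrate the remark about extending this to other functions: every binary NumPy ufunc has an outer method, so the same pattern should work for addition, maximum and so on. A small sketch, with variable names chosen here for illustration:

```python
import numpy as np

a = [1, 2]
b = [1, 2, 3]

# Every pairwise sum a[i] + b[j], as a 2x3 table.
pairwise_sums = np.add.outer(a, b)

# Every pairwise maximum max(a[i], b[j]).
pairwise_max = np.maximum.outer(a, b)

print(pairwise_sums.tolist())  # [[2, 3, 4], [3, 4, 5]]
print(pairwise_max.tolist())   # [[1, 2, 3], [2, 2, 3]]
```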
As the comments point out, a pure Python solution will be very different from a numpy solution.
Python
Here it would be straightforward to use a nested loop or a list comprehension:
list1 = [1, 2]
list2 = [1, 2, 3]
lst_output = []
for i in list1:
    for j in list2:
        lst_output.append(i*j)
# equivalent alternative
lst_output = [i*j for i in list1 for j in list2]
Numpy
There are many ways to go about it with numpy as well. Here's one example:
arr1 = np.array([1, 2])
arr2 = np.array([1, 2, 3])
xx, yy = np.meshgrid(arr1, arr2)
arr_output = xx * yy
# optionally (to get a 1d array)
arr_output_flat = arr_output.flatten()
Edit: Reading your question again, I noticed you state that you actually want the output to be 2 sums (of 3 products each). I suggest you phrase more precisely what you want and what you've tried. To provide that, here's what you can do with the lists or arrays from above:
# Pure Python
lst_output = [sum(i*j for j in list2) for i in list1]
# Numpy
xx, yy = np.meshgrid(arr1, arr2)
arr_output = np.sum(xx * yy, axis=0)
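One detail about the meshgrid examples that may be worth checking: with the default indexing="xy" the grids have shape (len(arr2), len(arr1)), so the flattened products come out in a different order than the nested loop gives. The sketch below compares both indexing modes; the names flat_xy and flat_ij are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2])
arr2 = np.array([1, 2, 3])

# Default indexing="xy": grids have shape (len(arr2), len(arr1)) == (3, 2).
xx, yy = np.meshgrid(arr1, arr2)
flat_xy = (xx * yy).flatten()

# indexing="ij": grids have shape (len(arr1), len(arr2)) == (2, 3),
# which matches the order of "for i in list1: for j in list2".
ii, jj = np.meshgrid(arr1, arr2, indexing="ij")
flat_ij = (ii * jj).flatten()

print(flat_xy.tolist())  # [1, 2, 2, 4, 3, 6]
print(flat_ij.tolist())  # [1, 2, 3, 2, 4, 6]
```

This is also why the sum in the edit uses axis=0: with the default layout, each column belongs to one element of arr1.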
I read a similar topic here, but I think my question is different, or at least .index() could not solve my problem.
Here is some simple R code and its output:
x <- c(1:4, 0:5, 11)
x
#[1] 1 2 3 4 0 1 2 3 4 5 11
which(x==2)
# [1] 2 7
min(which(x==2))
# [1] 2
which.min(x)
#[1] 5
These simply return the indices of the items that meet the condition.
If x is the input in Python, how can I get the indices of the elements that meet the criterion x==2, and the index of the smallest element in the array (the equivalent of which.min)?
x = [1,2,3,4,0,1,2,3,4,11]
x=np.array(x)
x[x>2].index()
##'numpy.ndarray' object has no attribute 'index'
Numpy does have built-in functions for it
x = [1,2,3,4,0,1,2,3,4,11]
x=np.array(x)
np.where(x == 2)
np.min(np.where(x==2))
np.argmin(x)
np.where(x == 2)
Out[9]: (array([1, 6], dtype=int64),)
np.min(np.where(x==2))
Out[10]: 1
np.argmin(x)
Out[11]: 4
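Two practical details, shown as a hedged sketch: np.where with a single argument returns a tuple of arrays, while np.flatnonzero gives the 1-D index array directly; and np.min raises a ValueError when nothing matches. The helper first_index below is hypothetical, written only for this illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])

# Plain 1-D array of matching positions (0-based, unlike R's which()).
matches = np.flatnonzero(x == 2)

def first_index(mask):
    """Hypothetical helper: index of the first True in mask, or None."""
    hits = np.flatnonzero(mask)
    return int(hits[0]) if hits.size else None

print(matches.tolist())      # [1, 6]
print(first_index(x == 2))   # 1
print(first_index(x == 99))  # None
```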
A simple loop will do:
res = []
x = [1,2,3,4,0,1,2,3,4,11]
for i in range(len(x)):
    if check_condition(x[i]):
        res.append(i)
One liner with comprehension:
res = [i for i, v in enumerate(x) if check_condition(v)]
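check_condition is not defined in the snippets above. A minimal runnable version, assuming it stands for the x==2 test from the question (this definition is a placeholder, not part of the answer):

```python
def check_condition(value):
    """Hypothetical predicate: True for the values we are looking for."""
    return value == 2

x = [1, 2, 3, 4, 0, 1, 2, 3, 4, 11]

# Collect the positions whose value satisfies the predicate.
res = [i for i, v in enumerate(x) if check_condition(v)]
print(res)  # [1, 6]
```

The indices are 0-based, so they are one less than the 2 and 7 that R's which(x==2) reports.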
NumPy for R provides you with a bunch of R functionalities in Python.
As to your specific question:
import numpy as np
x = [1,2,3,4,0,1,2,3,4,11]
arr = np.array(x)
print(arr)
# [ 1 2 3 4 0 1 2 3 4 11]
print(arr.argmin(0)) # R's which.min()
# 4
print((arr==2).nonzero()) # R's which()
# (array([1, 6]),)
With a pandas DataFrame, you can combine positional indexing with numpy to get the value of one column in the row where another column has its minimum (use np.argmax for the maximum):
df.iloc[np.argmin(df['column1'].values)]['column2']
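A self-contained sketch of that one-liner, using made-up data (only the column names column1 and column2 come from the snippet). It also shows a label-based alternative with idxmin, which may read more naturally in pandas:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"column1": [3, 1, 2], "column2": ["a", "b", "c"]})

# Positional lookup: row position of the minimum, then the other column.
by_position = df.iloc[np.argmin(df["column1"].values)]["column2"]

# Label-based alternative: idxmin returns the index label of the minimum.
by_label = df.loc[df["column1"].idxmin(), "column2"]

print(by_position, by_label)  # b b
```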
The built-in list.index method can be used for this purpose:
x = [1,2,3,4,0,1,2,3,4,11]
print(x.index(min(x)))
#4
print(x.index(max(x)))
#9
However, for indexes based on a condition, np.where or a manual loop with enumerate may work:
index_greater_than_two1 = [idx for idx, val in enumerate(x) if val>2]
print(index_greater_than_two1)
# [2, 3, 7, 8, 9]
# OR
index_greater_than_two2 = np.where(np.array(x)>2)
print(index_greater_than_two2)
# (array([2, 3, 7, 8, 9], dtype=int64),)
You could also use heapq to find the index of the smallest element. You can then choose to find several (for example, the indices of the 2 smallest).
import heapq
import numpy as np
x = np.array([1,2,3,4,0,1,2,3,4,11])
heapq.nsmallest(2, (range(len(x))), x.take)
Returns
[4, 0]
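For comparison, a NumPy-only sketch of the same idea, assuming a full sort is acceptable for the array size at hand. A stable sort is requested so that ties (the two 1s here) resolve to the earlier index:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])

# Indices that would sort x; the first two are the positions of the
# two smallest values. kind="stable" keeps equal values in original order.
two_smallest = np.argsort(x, kind="stable")[:2]
print(two_smallest.tolist())  # [4, 0]
```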
I have two lists a=[10,5,6,8] and b=[1,3]. How can I use the latter as a subscript of the former? I.e. I would like to extract the second and fourth element of a.
Put otherwise, in Matlab I would use
v = [16 5 9 4 2 11 7 14];
v([1 5 6]) % Extract the first, fifth, and sixth elements
>> ans =
16 2 11
How can I do the same in Python?
You can use operator.itemgetter to do it:
from operator import itemgetter
a=[10,5,6,8]
b=[1,3]
res = itemgetter(*b)(a)
# (5, 8)
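One caveat with itemgetter, sketched below: with several indices it returns a tuple, but with exactly one index it returns the bare item. The helper pick is hypothetical and only shows one way to get a consistent list back.

```python
from operator import itemgetter

a = [10, 5, 6, 8]

several = itemgetter(*[1, 3])(a)  # two indices: a tuple
single = itemgetter(*[1])(a)      # one index: the item itself, not a 1-tuple

def pick(seq, indices):
    """Hypothetical helper that always returns a list."""
    return [seq[i] for i in indices]

print(several)       # (5, 8)
print(single)        # 5
print(pick(a, [1]))  # [5]
```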
You can use a list comprehension like so:
>>> a = [10, 5, 6, 8]
>>> b = [1, 3]
>>> [a[x] for x in b]
[5, 8]
>>>
numpy supports indexing with arrays, as well as a bunch of other array and matrix operations, in Matlab style. Consider using it for computationally intensive tasks:
In [1]: import numpy as np
In [2]: a = np.array([10,5,6,8])
In [3]: b = np.array([1,3])
In [4]: a[b]
Out[4]: array([5, 8])
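When porting the Matlab example from the question, keep in mind that Matlab indices start at 1 and Python indices start at 0. A small sketch, where matlab_indices is an illustrative name:

```python
import numpy as np

v = np.array([16, 5, 9, 4, 2, 11, 7, 14])
matlab_indices = np.array([1, 5, 6])  # 1-based, as in v([1 5 6])

# Subtract 1 to convert to Python's 0-based positions.
picked = v[matlab_indices - 1]
print(picked.tolist())  # [16, 2, 11]
```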
l = [0, 4, 5]  # zero-based equivalents of Matlab's [1 5 6]
v = [16, 5, 9, 4, 2, 11, 7, 14]
[v[i] for i in l]
# [16, 2, 11]
You can try it like this.
It can be explained like this:
for i in l:
    print(v[i])
a=[10,5,6,8]
b=[1,3]
ex = [a[i] for i in b]
print(ex) # [5, 8]
In R, you could split a vector according to the factors of another vector:
> a <- 1:10
[1] 1 2 3 4 5 6 7 8 9 10
> b <- rep(1:2,5)
[1] 1 2 1 2 1 2 1 2 1 2
> split(a,b)
$`1`
[1] 1 3 5 7 9
$`2`
[1] 2 4 6 8 10
This groups a list (in Python terms) according to the values of another list (in the order of the factors).
Is there anything handy like that in Python, apart from the itertools.groupby approach?
From your example, it looks like each element in b contains the 1-indexed list in which the node will be stored. Python lacks the automatic numeric variables that R seems to have, so we'll return a tuple of lists. If you can use zero-indexed lists and you only need two lists (i.e., where your R use case has 1 and 2 as the only values, in Python they'll be 0 and 1), the desired usage looks like this:
>>> a = range(1, 11)
>>> b = [0,1] * 5
>>> split(a, b)
([1, 3, 5, 7, 9], [2, 4, 6, 8, 10])
Then you can use itertools.compress:
import itertools
def split(x, f):
    # group 0 first (selectors negated), then group 1
    return list(itertools.compress(x, (not i for i in f))), list(itertools.compress(x, f))
If you need more general input (multiple numbers), something like the following will return an n-tuple:
def split(x, f):
    count = max(f) + 1
    return tuple(list(itertools.compress(x, (el == i for el in f))) for i in range(count))
>>> split([1,2,3,4,5,6,7,8,9,10], [0,1,1,0,2,3,4,0,1,2])
([1, 4, 8], [2, 3, 9], [5, 10], [6], [7])
Edit: warning, this is a groupby solution, which is not what the OP asked for, but it may be of use to someone looking for a less specific way to split the R way in Python.
Here's one way with itertools.
import itertools
# make your sample data
a = range(1,11)
# list(...) is needed on Python 3, where zip returns an iterator
b = list(zip(*zip(range(len(a)), itertools.cycle((1,2)))))[1]
{k: list(zip(*g))[1] for k, g in itertools.groupby(sorted(zip(b,a)), lambda x: x[0])}
# {1: (1, 3, 5, 7, 9), 2: (2, 4, 6, 8, 10)}
This gives you a dictionary, which is analogous to the named list that you get from R's split.
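The sorted call matters here: itertools.groupby only merges consecutive items with equal keys. The sketch below, with illustrative names, shows what happens with and without sorting on the same sample data:

```python
from itertools import groupby

a = list(range(1, 11))
b = [1, 2] * 5
pairs = list(zip(b, a))

def first(pair):
    """Grouping key: the factor value stored first in each pair."""
    return pair[0]

# Without sorting, the alternating keys produce one run per element.
unsorted_runs = [k for k, _ in groupby(pairs, key=first)]

# With sorting, equal keys are adjacent and each key forms one group.
grouped = {k: [v for _, v in g] for k, g in groupby(sorted(pairs), key=first)}

print(len(unsorted_runs))  # 10
print(grouped)             # {1: [1, 3, 5, 7, 9], 2: [2, 4, 6, 8, 10]}
```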
As a long time R user I was wondering how to do the same thing. It's a very handy function for tabulating vectors. This is what I came up with:
a = [1,2,3,4,5,6,7,8,9,10]
b = [1,2,1,2,1,2,1,2,1,2]
from collections import defaultdict
def split(x, f):
    res = defaultdict(list)
    for v, k in zip(x, f):
        res[k].append(v)
    return res
>>> split(a, b)
defaultdict(list, {1: [1, 3, 5, 7, 9], 2: [2, 4, 6, 8, 10]})
You could try:
a = [1,2,3,4,5,6,7,8,9,10]
b = [1,2,1,2,1,2,1,2,1,2]
split_1 = [a[k] for k in (i for i,j in enumerate(b) if j == 1)]
split_2 = [a[k] for k in (i for i,j in enumerate(b) if j == 2)]
results in:
In [22]: split_1
Out[22]: [1, 3, 5, 7, 9]
In [24]: split_2
Out[24]: [2, 4, 6, 8, 10]
To make this generalise you can simply iterate over the unique elements in b:
splits = {}
for index in set(b):
    splits[index] = [a[k] for k in (i for i,j in enumerate(b) if j == index)]
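If the data is already in NumPy arrays, the same grouping can be sketched with boolean masks. This is an alternative to the answers above rather than a restatement of any of them, and it scans a once per distinct value in b:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2])

# np.unique returns the distinct factor values in sorted order;
# a[b == k] selects the elements of a whose factor equals k.
splits = {int(k): a[b == k].tolist() for k in np.unique(b)}
print(splits)  # {1: [1, 3, 5, 7, 9], 2: [2, 4, 6, 8, 10]}
```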
The question is how to completely remove elements that appear more than once in an array. Below is an approach that is very slow for bigger arrays.
Any idea how to do this the numpy way? Thanks in advance.
import numpy as np
count = 0
result = []
input = np.array([[1,1], [1,1], [2,3], [4,5], [1,1]]) # array with points [x, y]
# count appearance of elements with same x and y coordinate
# append to result if element appears just once
for i in input:
    for j in input:
        if (j[0] == i[0]) and (j[1] == i[1]):
            count += 1
    if count == 1:
        result.append(i)
    count = 0
print np.array(result)
UPDATE: BECAUSE OF FORMER OVERSIMPLIFICATION
Again, to be clear: how can I remove elements that appear more than once with respect to a certain attribute from an array/list? Here: a list with elements of length 6; if the first and second entries of an element appear together more than once in the list, remove all such elements from the list. Hope I'm not too confusing. Eumiro helped me a lot on this, but I can't manage to flatten the output list as it should be :(
import numpy as np
import collections
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
# here, from input there should be removed input[0], input[1] and input[4] because
# first and second entry appears more than once in the list, got it? :)
d = {}
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a[2:])
outputDict = [list(k)+list(v) for k,v in d.iteritems() if len(v) == 1 ]
result = []
def flatten(x):
    if isinstance(x, collections.Iterable):
        return [a for i in x for a in flatten(i)]
    else:
        return [x]
# I took flatten(x) from http://stackoverflow.com/a/2158522/1132378
# And I need it, because output is a nested list :(
for i in outputDict:
    result.append(flatten(i))
print np.array(result)
So, this works, but it's impractical with big lists.
First I got
RuntimeError: maximum recursion depth exceeded in cmp
and after applying
sys.setrecursionlimit(10000)
I got
Segmentation fault
How could I implement Eumiro's solution for big lists (> 100000 elements)?
np.array(list(set(map(tuple, input))))
returns
array([[4, 5],
[2, 3],
[1, 1]])
UPDATE 1: If you want to remove the [1, 1] too (because it appears more than once), you can do:
from collections import Counter
np.array([k for k, v in Counter(map(tuple, input)).items() if v == 1])
returns
array([[4, 5],
[2, 3]])
UPDATE 2: with input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]:
input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]
d = {}
for a in input:
d.setdefault(tuple(a[:2]), []).append(a[2])
d is now:
{(1, 1): [2, 3, 7],
(2, 3): [4],
(4, 5): [5]}
so we want to take all key-value pairs, that have single values and re-create the arrays:
np.array([k+tuple(v) for k,v in d.items() if len(v) == 1])
returns:
array([[4, 5, 5],
[2, 3, 4]])
UPDATE 3: For larger arrays, you can adapt my previous solution to:
import numpy as np
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
d = {}
for a in input:
d.setdefault(tuple(a[:2]), []).append(a)
np.array([v for v in d.values() if len(v) == 1])
returns:
array([[[456, 6, 5, 343, 435, 5]],
[[ 1, 3, 4, 5, 6, 7]],
[[ 3, 4, 6, 7, 7, 6]],
[[ 3, 3, 3, 3, 3, 3]]])
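The result above has an extra axis because each kept value is a one-element list of rows. A sketch of a variant that takes the single row out of each group, giving a plain 2-D array. The names rows, groups and kept are illustrative, and the row order shown assumes Python 3.7+, where dicts keep insertion order:

```python
import numpy as np

rows = [[1, 1, 3, 5, 6, 6], [1, 1, 4, 4, 5, 6], [1, 3, 4, 5, 6, 7],
        [3, 4, 6, 7, 7, 6], [1, 1, 4, 6, 88, 7], [3, 3, 3, 3, 3, 3],
        [456, 6, 5, 343, 435, 5]]

# Group whole rows by their first two entries.
groups = {}
for row in rows:
    groups.setdefault(tuple(row[:2]), []).append(row)

# Each kept group holds exactly one row, so take group[0], not group.
kept = np.array([group[0] for group in groups.values() if len(group) == 1])

print(kept.shape)  # (4, 6)
```

This stays a single pass over the data with no recursion, so it should not run into the recursion limit mentioned in the question.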
This is a corrected, faster version of Hooked's answer. count_unique counts the number of occurrences for each unique key in keys.
import numpy as np
input = np.array([[1,1,3,5,6,6],
[1,1,4,4,5,6],
[1,3,4,5,6,7],
[3,4,6,7,7,6],
[1,1,4,6,88,7],
[3,3,3,3,3,3],
[456,6,5,343,435,5]])
def count_unique(keys):
    """Finds an index to each unique key (row) in keys and counts the number of
    occurrences for each key"""
    order = np.lexsort(keys.T)
    keys = keys[order]
    diff = np.ones(len(keys)+1, 'bool')
    diff[1:-1] = (keys[1:] != keys[:-1]).any(-1)
    count = np.where(diff)[0]
    count = count[1:] - count[:-1]
    ind = order[diff[1:]]
    return ind, count
key = input[:, :2]
ind, count = count_unique(key)
print(key[ind])
#[[ 1 1]
# [ 1 3]
# [ 3 3]
# [ 3 4]
# [456 6]]
print(count)
#[3 1 1 1 1]
ind = ind[count == 1]
output = input[ind]
print(output)
#[[ 1 3 4 5 6 7]
# [ 3 3 3 3 3 3]
# [ 3 4 6 7 7 6]
# [456 6 5 343 435 5]]
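On newer NumPy versions (1.13 and later, where np.unique accepts an axis argument), a similar count can be sketched without a hand-written helper. This is an alternative under that version assumption, not the method of the answer above; data, keep and output are illustrative names, and np.ravel is applied to the inverse index because its shape has varied between NumPy releases.

```python
import numpy as np

data = np.array([[1, 1, 3, 5, 6, 6],
                 [1, 1, 4, 4, 5, 6],
                 [1, 3, 4, 5, 6, 7],
                 [3, 4, 6, 7, 7, 6],
                 [1, 1, 4, 6, 88, 7],
                 [3, 3, 3, 3, 3, 3],
                 [456, 6, 5, 343, 435, 5]])

keys = data[:, :2]

# inverse maps every row to its unique key; counts holds occurrences per key.
_, inverse, counts = np.unique(keys, axis=0, return_inverse=True,
                               return_counts=True)

# Keep the rows whose key occurs exactly once.
keep = counts[np.ravel(inverse)] == 1
output = data[keep]
print(output)
```

Because the mask is applied to the original array, this variant keeps the rows in their original order.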
Updated Solution:
From the comments below, the new solution is:
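# assumes A is a NumPy array and that the NumPy names used here are imported,
# e.g. from numpy import argsort, where, sum, unique, concatenate, arange
idx = argsort(A[:, 0:2], axis=0)[:,1]
kidx = where(sum(A[idx,:][:-1,0:2]!=A[idx,:][1:,0:2], axis=1)==0)[0]
kidx = unique(concatenate((kidx,kidx+1)))
for n in arange(0,A.shape[0],1):
    if n not in kidx:
        print(A[idx,:][n])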
> [1 3 4 5 6 7]
[3 3 3 3 3 3]
[3 4 6 7 7 6]
[456 6 5 343 435 5]
kidx is an index list of the elements you don't want. This preserves rows whose first two elements do not match the first two elements of any other row. Since everything is done with indexing, it should be fast(ish), though it requires a sort on the first two elements. Note that the original row order is not preserved, though I don't think this is a problem.
Old Solution:
If I understand it correctly, you simply want to filter out the results of a list of lists where the first element of each inner list is equal to the second element.
With your input from your update A=[[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]], the following line removes A[0],A[1] and A[4]. A[5] is also removed since that seems to match your criteria.
[x for x in A if x[0]!=x[1]]
If you can use numpy, there is a really slick way of doing the above. Assume that A is an array, then
A[A[:,0] != A[:,1]]
will pull out the same values. This is probably faster than the list comprehension above.
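As a runnable sketch of the NumPy variant, assuming the intent is the same as the list comprehension (keep rows whose first two entries differ), the comparison can be written on the first two columns, A[:, 0] and A[:, 1]:

```python
import numpy as np

A = np.array([[1, 1, 3, 5, 6, 6],
              [1, 1, 4, 4, 5, 6],
              [1, 3, 4, 5, 6, 7],
              [3, 4, 6, 7, 7, 6],
              [1, 1, 4, 6, 88, 7],
              [3, 3, 3, 3, 3, 3],
              [456, 6, 5, 343, 435, 5]])

# Boolean mask with one entry per row: True where column 0 != column 1.
kept = A[A[:, 0] != A[:, 1]]
print(kept)
```

Note that this drops A[5] as well, as the text above mentions, so it is not the same filter as removing rows whose key occurs more than once.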
Why not create another array to hold the output?
Iterate through your main list and for each i check if i is in your other array and if not append it.
This way, your new array will not contain more than one of each element
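A minimal sketch of the approach just described, using the small input from the question (the names points and seen are illustrative). Note that it keeps one copy of each repeated element rather than removing all copies, which is what the question asks for:

```python
points = [[1, 1], [1, 1], [2, 3], [4, 5], [1, 1]]

# Append each point only if it has not been collected yet.
seen = []
for point in points:
    if point not in seen:
        seen.append(point)

print(seen)  # [[1, 1], [2, 3], [4, 5]]
```

The membership test scans seen on every iteration, so this is quadratic in the worst case and is unlikely to help with the large inputs mentioned in the question.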