Sum values of array using zip - python

I have an array of arrays like the one below.
I want to sum a specific field (for example, the 3rd item in each sub-list).
data = [['d', 408.56087701, 87.26907024],
        ['b', 277.95015117, 75.19386881],
        ['b', 385.41416264, 84.73488504],
        ['b', 380.31630662, 71.23504808],
        ['b', 392.10729207, 83.80720357],
        ['b', 399.70877373, 76.59640833],
        ['b', 350.93124656, 79.34979059],
        ['b', 330.09702335, 79.37166555]]
I am trying this, but it fails because I need to select only the 3rd field of each sub-list (the first field is a string):
data = [sum(x) for x in zip(*data)]
I need to add a condition saying that x is the third item of each sub-list.

the_sum = sum(x[2] for x in data)
Or if you're wondering why you thought zip(*...) seemed like a good idea in the first place:
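the_sum = sum(list(zip(*data))[2])
Although that's more wasteful of memory. (In Python 3, zip returns an iterator, so it has to be turned into a list before it can be indexed.)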
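To make the difference between the two approaches concrete, here is a minimal sketch on the first three rows of the question's data. The variable names (sum_by_index, columns, sum_by_zip) are only illustrative.

```python
data = [
    ["d", 408.56087701, 87.26907024],
    ["b", 277.95015117, 75.19386881],
    ["b", 385.41416264, 84.73488504],
]

# Generator expression: reads only the third field of each row.
sum_by_index = sum(row[2] for row in data)

# zip(*data) transposes rows into columns. In Python 3 it is a lazy
# iterator, so it is materialised here before picking a column by position.
columns = list(zip(*data))
sum_by_zip = sum(columns[2])

print(sum_by_index, sum_by_zip)
```

The zip version builds every column, including the string column that is never used, which is why it costs more memory.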


next item of list inside dataframe

I have a dataframe that has a column where each row has a list.
I want to get the next element after the value I am looking for (in another column).
For example:
Let's say I am looking for 'b':
|lists |next_element|
|---------|------------|
|[a,b,c,d]| c | #(c is the next value after b)
|[c,b,a,e]| a | #(a is the next value after b)
|[a,e,f,b]| [] | #(empty, because there is no next value after b)
*All lists have the element. There are no lists without the value I am looking for
Thank you
Try writing a function and use apply.
value = 'b'
def get_next(x):
    get_len = len(x)-1
    for i in x:
        if value.lower() == i.lower():
            curr_idx = x.index(i)
            if curr_idx == get_len:
                return []
            else:
                return x[curr_idx+1]
df["next_element"] = df["lists"].apply(get_next)
df
Out[649]:
lists next_element
0 [a, b, c, d] c
1 [c, b, a, e] a
2 [a, e, f, b] []
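The per-row logic can also be checked without pandas. Below is a minimal sketch using list.index. The helper name next_after is hypothetical, it returns None rather than [] when nothing follows, and unlike get_next it compares case-sensitively.

```python
def next_after(items, target):
    """Return the element following the first occurrence of target, or None."""
    try:
        pos = items.index(target)
    except ValueError:
        return None  # target is not in the list at all
    # None when target is the last element, otherwise the following element.
    return items[pos + 1] if pos + 1 < len(items) else None

rows = [["a", "b", "c", "d"], ["c", "b", "a", "e"], ["a", "e", "f", "b"]]
next_elements = [next_after(row, "b") for row in rows]
print(next_elements)
```

With the dataframe from the question, the same helper could be applied as df["lists"].apply(lambda row: next_after(row, "b")).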
First, since you want the next element of a list of strings, the expected data type for that column should be a string, not a list.
So instead of the next_element column being [c, a, []], it's better to use [c, a, None].
Second, you should try to avoid calling apply directly on a series and instead use the str methods that pandas provides for series, which is a vectorized way of solving such problems quickly.
With the above in mind, let's try this vectorized one-liner:
element = 'b'
df['next_element'] = df.lists.str.join('').str.split(element).str[-1].str[0]
lists next_element
0 [a, b, c, d] c
1 [c, b, a, e] a
2 [a, e, f, b] NaN
First I combine each row into a single string: [a,b,c,d] -> 'abcd'. Note that this relies on every element being a single character.
Next I split this by 'b' to get substrings.
I pick the last element from this list and finally the first character of that, for each row, using str functions which are vectorized over each row.
Read more about the pandas.Series.str methods in the official documentation.
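The three steps above can be traced on a single row in plain Python. This is only a sketch of what the one-liner does per row. The function name next_char_after is hypothetical, and it returns None where pandas would give NaN.

```python
def next_char_after(chars, element):
    joined = "".join(chars)            # ['a', 'b', 'c', 'd'] -> 'abcd'
    tail = joined.split(element)[-1]   # text after the last occurrence of element
    return tail[:1] or None            # first character of the tail, None if empty

print(next_char_after(["a", "b", "c", "d"], "b"))
print(next_char_after(["a", "e", "f", "b"], "b"))
# Multi-character elements break the approach: 'abbcd' is split on every 'b'.
print(next_char_after(["ab", "b", "cd"], "b"))
```

The last call returns 'c' rather than 'cd', which shows why the join/split trick is only safe when each list element is exactly one character long.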
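df = df.assign(next_element="")
print(df)
for ind in df.index:
    c = df["lists"][ind]
    for i, v in enumerate(c):
        # Only assign when "b" is not the last item; otherwise c[i+1] raises IndexError.
        if v == "b" and i + 1 < len(c):
            df.at[ind, "next_element"] = c[i + 1]
print(df)
Try this one. It gives the expected output, with an empty string where nothing follows 'b'.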

loops list slicing + elements allocation Python

I am a beginner in Python and trying to do the following:
main_list = [80,80,30,30,30,30,20,10,5,4,3,2,1] # list of integers
- slice main_list into multiple lists, for example list1, list2, list3, ..., listn, where the sum of each sub-list is < 100
for i in range of n:
print(list(i))
list1[80,20], list2[80,10,5,4,1], list3[30,30,30], listn[30,3,2]
Thanks!
It's not really clear to me what you consider an acceptable output, so I'm assuming it's any list whose elements sum to less than 100.
The solution I found uses recursion. For the list [a, b, c, d] we are going to check the condition for these sublists:
[a]
[a, b] (if the condition for [a] is met)
[a, b, c] (if the condition for [a, b] is met)
[a, b, c, d] (if the condition for [a, b, c] is met)
[a, c] (if the condition for [a] is met)
[a, c, d] (if the condition for [a, c] is met)
[a, d] (if the condition for [a] is met)
[b]
[b, c] (if the condition for [b] is met)
[b, c, d] (if the condition for [b, c] is met)
[b, d] (if the condition for [b] is met)
[c]
[c, d] (if the condition for [c] is met)
[d]
The idea is that, for each element in the list, we look for the sublists that start with that element and meet the requirement. These sublists are built from the elements to the right of the element being studied in each iteration, so for the first 30, the sublist to use would be [30, 30, 30, 20, 10, 5, 4, 3, 2, 1].
This process of finding the sublists for each element is the part that uses recursion. The function calls itself for each element of the sublist, checking whether it meets the condition. For the example above, if the condition is met for [a, b] then it will also try [a, b, c] and [a, b, d] (by calling itself with the sum of (a, b) and the sublist [c, d]).
I've added a few prints so you can study how it works, but you should just use the results variable at the end of the script to get your results.
main_list = [80,80,30,30,30,30,20,10,5,4,3,2,1]
def less_than_hundred(input) -> bool:
    return input < 100
def sublists_meet_condition(condition, input):
    """
    This function is used to call the sublists_for_element function with every element in the original list and its sublist:
    - For the first element (80) it calls the second function with the sublist [80,30,30,30,30,20,10,5,4,3,2,1]
    - For the fifth element (30) it calls the second function with the sublist [30,20,10,5,4,3,2,1]
    Its purpose is to collect all the sublists that meet the requirements for each element
    """
    results = []
    for index, element in enumerate(input):
        print('Iteration {} - Element {}'.format(index, element))
        if condition(element):
            results.append([element])
            print('{} = {}'.format([element], element))
            num_elements = len(input) - index
            main_element = element
            sublist = input[index+1:]
            for result in sublists_for_element(condition, main_element, sublist):
                new_result = [element] + result
                sum_new_result = sum(new_result)
                results.append(new_result)
                print('{} = {}'.format([element] + result, sum_new_result))
    return results
def sublists_for_element(condition, sum_main_elements, sublist):
    """
    This function is used to check every sublist with the given condition.
    The variable sum_main_elements allows the function to call itself and check if for a given list of numbers that meet the conditions [30, 30, 4] for example, any of the elements of the remaining sublists also meets the condition for example adding the number 3 still meets the condition.
    Its purpose is to return all the sublists that meet the requirements for the given sum of main elements and remaining sublist
    """
    num_elements = '{}{}'.format('0' if len(sublist) + 1 < 10 else '',len(sublist) + 1)
    #print('Elements: {} | Main element: {} | Sublist: {}'.format(num_elements, sum_main_elements, sublist))
    result = []
    for index, element in enumerate(sublist):
        if condition(sum_main_elements + element):
            result.append([element])
            sublist_results = sublists_for_element(condition, sum_main_elements + element, sublist[index+1:])
            for sublist_result in sublist_results:
                result.append([element] + sublist_result)
    return result
results = sublists_meet_condition(less_than_hundred, main_list)
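For reference, the same enumeration can be sketched more compactly as a recursive generator without the debugging prints. The name sublists_below and its parameters are hypothetical. Like the code above, it stops extending a selection as soon as the sum reaches the limit, which is only valid if all numbers are non-negative.

```python
def sublists_below(items, limit, prefix=(), total=0):
    """Yield every order-preserving selection of items whose sum stays below limit."""
    for i, value in enumerate(items):
        if total + value < limit:
            chosen = prefix + (value,)
            yield list(chosen)
            # Extend the current selection using only the elements to the right.
            yield from sublists_below(items[i + 1:], limit, chosen, total + value)

small_results = list(sublists_below([60, 50, 30, 10], 100))
for sublist in small_results:
    print(sublist, sum(sublist))
```

Because it is a generator, the selections can be consumed one at a time instead of being collected in a list, which matters for the full main_list, where the number of valid selections grows quickly.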

How do I enumerate all *maximal* cliques in a graph using networkx + python?

If you look at https://en.wikipedia.org/wiki/Clique_problem, you'll notice there is a distinction between cliques and maximal cliques. A maximal clique is one that is not contained in any larger clique. I want those cliques, but networkx seems to only provide:
networkx.algorithms.clique.enumerate_all_cliques(G)
So I tried a simple for loop filtering mechanism (see below).
def filter_cliques(self, cliques):
    # TODO: why do we need this? Post in forum...
    res = []
    for C in cliques:
        C = set(C)
        for D in res:
            if C.issuperset(D) and len(C) != len(D):
                res.remove(D)
                res.append(C)
                break
            elif D.issuperset(C):
                break
        else:
            res.append(C)
    res1 = []
    for C in res:
        for D in res1:
            if C.issuperset(D) and len(C) != len(D):
                res1.remove(D)
                res1.append(C)
            elif D.issuperset(C):
                break
        else:
            res1.append(C)
    return res1
I want to filter out all the proper subcliques. But as you can see, it's clumsy: I had to filter twice, which is not very elegant. So the problem is this: enumerate_all_cliques(G) returns a list of lists of objects (integers, strings), which are the node labels in the graph. Given this list of lists, filter out all proper subcliques. For instance:
[[a, b, c], [a, b], [b, c, d]] => [[a, b, c], [b, c, d]]
What's the quickest pythonic way of doing that?
There's a function for that: networkx.algorithms.clique.find_cliques, and yes, it does return only maximal cliques, despite the absence of "maximal" from the name. It should run a lot faster than any filtering approach.
If you find the name confusing (I do), you can rename it:
from networkx.algorithms.clique import find_cliques as maximal_cliques
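If you already have the full list of cliques and only want the filtering step from the question, a minimal pure-Python sketch is shown below. The function name keep_maximal is hypothetical. It compares every pair, so it is quadratic in the number of cliques and is not a substitute for find_cliques on large graphs.

```python
def keep_maximal(cliques):
    """Drop every clique that is a proper subset of another clique in the list."""
    sets = [frozenset(c) for c in cliques]
    return [
        list(clique)
        for clique, as_set in zip(cliques, sets)
        # `<` on sets tests for a proper subset, so equal sets are not dropped.
        if not any(as_set < other for other in sets)
    ]

filtered = keep_maximal([["a", "b", "c"], ["a", "b"], ["b", "c", "d"]])
print(filtered)
```

A single pass is enough here because each clique is compared against the whole input rather than against a partially built result.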

Indexing values in another array using a list of tuples

Hi, I want to look up the values in another array using a list of tuples as the indexes.
Code:
import numpy as np
elevation_array = np.random.rand(5,5) # creates a random array 5 by 5
sort_idx = np.argsort(elevation_array, axis=None)
new_idx = zip(*np.unravel_index(sort_idx[::-1], elevation_array.shape))
for r, c in new_idx:
    r, c = [r, c]
    for [x, y], elevation in np.ndenumerate(elevation_array):
        print(elevation[r, c]) # I will want to do other stuff here later on
I have also tried it this way:
for r, c in new_idx:
    r, c = [r, c]
    for elevation in np.ndenumerate(elevation_array):
        print(elevation[r, c])
With the first version I get the error:
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index
With the 2nd I get the error:
tuple indices must be integers, not tuple
Any help would be great, and an explanation would be really useful as I am new to Python.
ANSWER:
for r, c in new_idx:
    print(elevation_array[r, c])
I got it. It's so simple I can't believe I did not know how to do that! :)
You can get the same result by doing:
print(np.sort(elevation_array.flat)[::-1])

Swap position of entities in the list

I have the following example list:
x= [['True_304', 'false_2'], ['True_702', 'false_2_1'], ['True_204', 'false_222_2']]
I would like to swap the positions of entities so that the second entity is first and first one is second. Basically, something like:
x= [['false_2', 'True_304'], ['false_2_1', 'True_702'], ['false_222_2', 'True_204']]
Is there any easier way to do this? Any ideas would be helpful. Thanks.
You can use list comprehensions:
>>> x = [['True_304', 'false_2'], ['True_702', 'false_2_1'], ['True_204', 'false_222_2']]
>>> [[b, a] for [a, b] in x]
[['false_2', 'True_304'], ['false_2_1', 'True_702'], ['false_222_2', 'True_204']]
And in case each sub-list is longer than 2 elements, e.g. 12 elements, you can reverse each one in place:
for e in x:
    e.reverse()
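The two approaches differ in whether the original list is modified. A minimal sketch of that difference follows. The names swapped and in_place are only illustrative.

```python
x = [['True_304', 'false_2'], ['True_702', 'false_2_1'], ['True_204', 'false_222_2']]

# Slicing with [::-1] builds new reversed sub-lists and leaves x untouched.
swapped = [pair[::-1] for pair in x]

# reverse() mutates each sub-list, so it is run on a copy here to keep x intact.
in_place = [list(pair) for pair in x]
for pair in in_place:
    pair.reverse()

print(swapped)
print(x)
```

Use the slicing form when other code still needs the original order, and reverse() when modifying x directly is acceptable.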
