I am trying to compute the Euclidean distance between the dictionary elements, as shown below:
#!/usr/bin/python
import itertools
from scipy.spatial import distance

def Distance(data):
    for subset in itertools.combinations(data, 2):
        print(subset)
        # This shows a tuple of two elements instead of the elements of the dictionary.
        dst = distance.euclidean(subset)

if __name__ == "__main__":
    data = {}
    data['1'] = [5, 3, 4, 4, 6]
    data['2'] = [1, 2, 3, 4, 5]
    data['3'] = [3, 1, 2, 3, 3]
    data['4'] = [4, 3, 4, 3, 3]
    data['5'] = [3, 3, 1, 5, 4]
    Distance(data)
The issue is that when I try to compute the combination of the dictionary elements I get something that I do not expect, as commented in the code. I think I am doing something wrong with itertools.combinations...
You're taking combinations of an iteration, and the iteration is:
for element in data:
    print(element)
If you run that code, I think you'll see the issue: you're iterating over the keys of the dictionary, not the elements. Try again with .values():
for subset in itertools.combinations(data.values(), 2):
    print(subset)
You have a couple of other issues in your code, though. Firstly, distance.euclidean takes two parameters, so you'll need to do something like this:
dst = distance.euclidean(*subset)
...or this...
dst = distance.euclidean(subset[0], subset[1])
...and secondly your function isn't doing anything with dst, so it will be overwritten every time through your loop.
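Putting those fixes together, a minimal sketch might look like the following. It assumes Python 3.8+ and uses the standard library's `math.dist` as a stand-in for `distance.euclidean` so that it runs without SciPy. The function name `pairwise_distances` and the choice to key the results by pairs of dictionary keys are illustrative assumptions, not part of the original code.

```python
import itertools
import math

def pairwise_distances(data):
    """Return {(key_a, key_b): distance} for every pair of entries in data."""
    result = {}
    # Iterate over (key, value) pairs so each distance can be labelled by its keys.
    for (key_a, vec_a), (key_b, vec_b) in itertools.combinations(data.items(), 2):
        # math.dist stands in for scipy.spatial.distance.euclidean here.
        result[(key_a, key_b)] = math.dist(vec_a, vec_b)
    return result

data = {
    '1': [5, 3, 4, 4, 6],
    '2': [1, 2, 3, 4, 5],
    '3': [3, 1, 2, 3, 3],
    '4': [4, 3, 4, 3, 3],
    '5': [3, 3, 1, 5, 4],
}

distances = pairwise_distances(data)
for pair, dst in distances.items():
    print(pair, round(dst, 3))
```

Storing each result in a dictionary means no distance is overwritten on the next pass through the loop.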
Related
I can generate combinations from a list of numbers using itertools.combinations, such as the following:
from itertools import combinations
l = [1, 2, 3, 4, 5]
for i in combinations(l, 2):
    print(list(i))
This generates the following:
[1, 2]
[1, 3]
[1, 4]
[1, 5]
[2, 3]
[2, 4]
[2, 5]
[3, 4]
[3, 5]
[4, 5]
How can I generate just one of these list pairs at a time and save it to a variable? I want to use each pair of numbers, one pair at a time, and then go to the next pair of numbers. I don't want to generate all of them at once.
Most functions in the itertools library (including itertools.combinations()) return a lazy iterator that behaves like a generator (see the Python documentation on generators for more). Basically, it calculates values one at a time, on demand, instead of producing everything at once.
You can get a single value out of such an iterator with the built-in next() function. Take a look at the following snippet:
import itertools
l = [1, 2, 3, 4, 5]
comb_generator = itertools.combinations(l, 2)
temp = next(comb_generator) #Get the first combo into a var
print(temp)
#Do some other stuff
temp = next(comb_generator) #Get another combo
print(temp)
Whenever you want a new combination, you can get it by calling next() on your generator.
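One detail worth knowing: once the pairs run out, next() raises StopIteration unless a default is supplied. The short sketch below illustrates this on a smaller list; the variable names are only for illustration.

```python
import itertools

comb_iter = itertools.combinations([1, 2, 3], 2)

first = next(comb_iter)            # only the first pair is computed here
rest = list(comb_iter)             # consumes whatever is left
sentinel = next(comb_iter, None)   # exhausted: returns the default instead of raising

print(first)
print(rest)
print(sentinel)
```

A plain for loop over the iterator also pulls one pair at a time and stops cleanly at the end.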
So, I have an iterable in Input like this:
[4, 6, 2, 2, 6, 4, 4, 4]
And I want to sort it in order of decreasing frequency, so that the result will be this:
[4, 4, 4, 4, 6, 6, 2, 2]
So when two elements have the same frequency, they should keep the order in which they first appeared (6 appeared first, so the 6s go before the 2s).
I tried to implement this mechanism using the sorted function but I have a big problem.
def frequency_sort(items):
    return sorted(items, key=lambda elem: sum([True for i in items if i == elem]), reverse=True)
I know this short version is difficult to read, but it just sorts the array using the key parameter to extract the frequency of each number. However, the output is this:
[4, 4, 4, 4, 6, 2, 2, 6]
As you can see the output is a little different from what it should be. And that happened (I think) because sorted() is a function that does a "stable sort" i.e. a sort that will keep the order as it is if there are same keys.
So what is happening here is like a strong stable sort. I want more like a soft-sort that will take into account the order but will put the same elements next to each other.
You could use collections.Counter and use most_common that returns in descending order of frequency:
from collections import Counter
def frequency_sorted(lst):
    counts = Counter(lst)
    return [k for k, v in counts.most_common() for _ in range(v)]
result = frequency_sorted([4, 6, 2, 2, 6, 4, 4, 4])
print(result)
Output
[4, 4, 4, 4, 6, 6, 2, 2]
From the documentation on most_common:
Return a list of the n most common elements and their counts from the
most common to the least. If n is omitted or None, most_common()
returns all elements in the counter. Elements with equal counts are
ordered in the order first encountered
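If you would rather keep using sorted(), a possible sketch is to sort on a two-part key: the negated count first, then the position where the value was first seen. The helper dictionary `first_seen` is an illustrative name, not something from the question.

```python
from collections import Counter

def frequency_sort(items):
    counts = Counter(items)
    # Remember the first position of each distinct value to break ties.
    first_seen = {}
    for pos, item in enumerate(items):
        first_seen.setdefault(item, pos)
    # Higher counts sort first; equal counts fall back to first appearance.
    return sorted(items, key=lambda x: (-counts[x], first_seen[x]))

result = frequency_sort([4, 6, 2, 2, 6, 4, 4, 4])
print(result)
```

Because the tie-breaker is the same for every copy of a value, equal elements always end up next to each other.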
I'm working on a probability-related problem. I need to sum only specific items on a certain list.
I've tried using "for" loops and it hasn't worked. I'm looking for a way to select items based on their positions in the list and sum them.
You can use operator.itemgetter to select only certain indices in a list or keys in a dict.
from operator import itemgetter
data = [1, 2, 3, 4, 5, 6, 7, 8]
get_indexes = itemgetter(2, 5, 7)
#this will return indexes 2, 5, 7 from a sequence
sum(get_indexes(data)) #3+6+8
#returns 17
That example is for lists, but you can use itemgetter for dict keys too: just use itemgetter('key2', 'key5', 'key7')(some_dict).
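As a quick illustration of the dict case, here is a small sketch. The dictionary `scores` and its keys are made up for the example.

```python
from operator import itemgetter

# Hypothetical data: only three of the four keys should be summed.
scores = {'key2': 3, 'key5': 6, 'key7': 8, 'other': 100}

pick = itemgetter('key2', 'key5', 'key7')
total = sum(pick(scores))
print(total)
```

Note that itemgetter returns a tuple only when given two or more keys; with a single key it returns the bare value, so sum() would not apply directly.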
To get only even or odd indexes, use slicing rather than enumerate and a loop; it's much more efficient and easier to read:
even = sum(data[::2])
odd = sum(data[1::2])
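You can also use filter, but I wouldn't suggest it for selecting by index: data.index(n) rescans the list for every element and returns only the first match, so it gives wrong results when the list contains duplicate values: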
sum(filter(lambda n: data.index(n) % 2 == 0, data))
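To see why the index()-based filter is risky, here is a small sketch comparing it with slicing and enumerate on a list that contains a duplicate. The sample list is made up for the example.

```python
data = [2, 7, 7, 4]  # the value 7 appears at index 1 and index 2

# Slicing and enumerate both look at real positions: indices 0 and 2.
by_slice = sum(data[::2])
by_enumerate = sum(x for i, x in enumerate(data) if i % 2 == 0)

# data.index(7) is always 1, so the 7 at index 2 is wrongly dropped.
by_index_lookup = sum(filter(lambda n: data.index(n) % 2 == 0, data))

print(by_slice, by_enumerate, by_index_lookup)
```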
You really should have put more into your question, but:
stuff = [1, 2, 3, 4, 5, 6, 7, 8]
# sum the numbers that have even indices:
funny_total = sum([x for i, x in enumerate(stuff) if i % 2 == 0 ])
funny_total
# 16
That should get you started. An approach with a for loop would have worked as well. You likely just have a bug in your code.
stuff = [1, 2, 3, 4, 5, 6, 7, 8]
indices_to_include = [1, 3, 4, 5, 6]
funny_total = 0
for i, x in enumerate(stuff):
    if i in indices_to_include:
        funny_total += x
You could also:
def keep_every_third(i):
    return i % 3 == 0

# variable definitions as above...
for i, x in enumerate(stuff):
    if keep_every_third(i):
        pass  # do stuff
I have a file which contains a number of lists. I want to access the index of the values retrieved from each of these lists. I use the random function as shown below. It retrieves the values perfectly well, but I need to get the index of the values obtained.
for i in range(M):
    print(krr[i])
    print(krr[i].index(random.sample(krr[i],2)))
    nrr[i]=random.sample(krr[i],2)
    outf13.write(str(nrr[i]))
    outf13.write("\n")
I got ValueError saying the two values retrieved are not in the list even though they exist...
To retrieve the index of the randomly selected value in your list you could use enumerate that will return the index and the value of an iterable as a tuple:
import random
l = list(range(10)) # example list (in Python 3, a range must be converted to a list before shuffling)
random.shuffle(l) # we shuffle the list
print(l) # outputs [4, 1, 5, 0, 6, 7, 9, 2, 8, 3]
index_value = random.sample(list(enumerate(l)), 2)
print(index_value) # outputs [(4, 6), (6, 9)]
Here the value 6 at index 4 and the value 9 at index 6 were selected - of course each run will return something different.
Also, your code calls random.sample twice: once inside the print and again on the next line when assigning to nrr[i]. Those two calls return different samples. In addition, random.sample returns a list, and krr[i].index(...) searches for that whole list as a single element, which is what raises your ValueError.
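If you prefer to stay close to your original loop, one possible sketch is to draw the sample once and then look up each value's position separately. The list `row` is a made-up stand-in for one krr[i]. Keep in mind that list.index returns the first match, so with duplicate values the enumerate approach above is safer.

```python
import random

row = [10, 20, 30, 40, 50]  # hypothetical stand-in for one krr[i]

picked = random.sample(row, 2)               # sample only once and reuse it
positions = [row.index(v) for v in picked]   # look up each value on its own

print(picked)
print(positions)
```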
EDIT after OP's comment
The most explicit way to then separate the values from the indexes is:
indexes = []
values = []
for idx, val in index_value:
    indexes.append(idx)
    values.append(val)
print(indexes) # [4, 6]
print(values) # [6, 9]
Note that indexes and values are in the same order as index_value.
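A more compact alternative, shown here only as a sketch, is to unzip the pairs with zip(*...). The hard-coded `index_value` mirrors the example output above rather than a fresh random draw.

```python
index_value = [(4, 6), (6, 9)]  # (index, value) pairs, as in the example above

# zip(*pairs) regroups the first items together and the second items together.
indexes, values = zip(*index_value)

print(list(indexes))
print(list(values))
```

This needs at least one pair in index_value; with an empty list the unpacking fails.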
If you need to reproduce the results, you can seed the random generator, for instance with random.seed(123). This way, every time you run the code you get the same random result.
In this case, the accepted solution offered by bvidal would look like this:
import random
l = list(range(10)) # example list (please notice the explicit call to 'list')
random.seed(123)
random.shuffle(l) # shuffle the list
print(l) # outputs [8, 7, 5, 9, 2, 3, 6, 1, 4, 0]
index_value = random.sample(list(enumerate(l)), 2)
print(index_value) # outputs [(8, 4), (9, 0)]
Another approach is to use random.sample from the standard library to draw a list of random indices and then use those indices to pick elements from the list. One convenient way to access the elements is to convert the list to a NumPy array:
import numpy as np
import random
l = [1, -5, 4, 2, 7, 4, 8, 0, 9, 3]
print(l) # prints the list
random.seed(1234) # seed the random generator for reproducing the results
random_indices = random.sample(range(len(l)), 2) # get 2 random indices
print(random_indices) # prints the indices
a = np.asarray(l) # convert to array
print(list(a[random_indices])) # prints the elements
The output of the code is:
[1, -5, 4, 2, 7, 4, 8, 0, 9, 3]
[7, 1]
[0, -5]
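If NumPy is not available, the same idea works with a plain list comprehension. This is only a sketch; no seed is set here, so the indices differ from run to run.

```python
import random

l = [1, -5, 4, 2, 7, 4, 8, 0, 9, 3]

random_indices = random.sample(range(len(l)), 2)   # 2 distinct random positions
elements = [l[i] for i in random_indices]          # values at those positions

print(random_indices)
print(elements)
```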
You could try using enumerate() on your list objects.
According to the official Python documentation:
enumerate() : Return an enumerate object. sequence must be a sequence, an iterator,
or some other object which supports iteration. The next() method of
the iterator returned by enumerate() returns a tuple containing a
count (from start which defaults to 0) and the values obtained from
iterating over sequence
A simple example is this :
my_list=['a','b','c']
for index, element in enumerate(my_list):
    print(index, element)
# 0 a
# 1 b
# 2 c
Don't know if I understood the question though.
You are getting the random sample twice, which results in two different random samples.
First of all, I couldn't find the answer in other questions.
I have a NumPy array of integers called ELEM. Its three columns hold the element number, node 1 and node 2; it describes a one-dimensional mesh. I need to renumber the nodes. I have the old and new node numbering tables, so the algorithm should replace every value in the ELEM array according to these tables.
The code should look like this
old_num = np.array([2, 1, 3, 6, 5, 9, 8, 4, 7])
new_num = np.arange(1,10)
ELEM = np.array([ [1, 1, 3], [2, 3, 6], [3, 1, 3], [4, 5, 6]])
From here, every integer in the second and third columns of the ELEM array should be replaced by the corresponding integer from the new_num table.
If you're doing a lot of these, it makes sense to encode the renumbering in a dictionary for fast lookup.
lookup_table = dict( zip( old_num, new_num ) ) # create your translation dict
vect_lookup = np.vectorize( lookup_table.get ) # create a function to do the translation
ELEM[:, 1:] = vect_lookup( ELEM[:, 1:] ) # Reassign the elements you want to change
np.vectorize is just there to make things nicer syntactically. All it does is allow us to map over the values of the array with our lookup_table.get function
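Another option, sketched here under the assumption that the node numbers are small non-negative integers and that every node in ELEM appears in old_num, is to build a lookup array and use NumPy's integer indexing instead of a dictionary. The name `mapping` is illustrative.

```python
import numpy as np

old_num = np.array([2, 1, 3, 6, 5, 9, 8, 4, 7])
new_num = np.arange(1, 10)
ELEM = np.array([[1, 1, 3], [2, 3, 6], [3, 1, 3], [4, 5, 6]])

# mapping[old] == new; slot 0 is unused because node numbers start at 1.
mapping = np.zeros(old_num.max() + 1, dtype=ELEM.dtype)
mapping[old_num] = new_num

# Index the lookup array with the node columns to translate them all at once.
ELEM[:, 1:] = mapping[ELEM[:, 1:]]
print(ELEM)
```

This avoids calling a Python function per element, which can matter for large meshes.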
I couldn't quite work out what your problem is, but I have tried to help as far as I understood it...
I think you need to replace, for example, 2 with 1, or 7 with 9, right? In that case, you can create a dictionary for the numbers that are to be replaced. The 'dict' below is for that purpose. It could also be done with tuples or lists, but dictionaries are better suited to this. Afterwards, just replace each element by looking it up in the dictionary.
The code below is very basic and relatively easy to understand. There are certainly more Pythonic ways to do this, but if you are new to Python, the code below is probably the most appropriate.
import numpy as np

# Data you provided
old_num = np.array([2, 1, 3, 6, 5, 9, 8, 4, 7])
new_num = np.arange(1,10)
ELEM = np.array([ [1, 1, 3], [2, 3, 6], [3, 1, 3], [4, 5, 6]])

# Create a dict for the elements to be replaced
# (note: the name 'dict' shadows the built-in type)
dict = {}
for i_num in range(len(old_num)):
    num = old_num[i_num]
    dict[num] = new_num[i_num]

# Replace the elements
for element in ELEM:
    element[1] = dict[element[1]]
    element[2] = dict[element[2]]

print(ELEM)