Related
I'm coding with python 3.6 and am working on a Genetic Algorithm. When generating a new population, when I append the new values to the array all the values in the array are changed to the new value. Is there something wrong with my functions?
Code:
from fuzzywuzzy import fuzz
import numpy as np
import random
import time
def mutate(parent):
x = random.randint(0,len(parent)-1)
parent[x] = random.randint(0,9)
print(parent)
return parent
def gen(cur_gen, pop_size, fittest):
if cur_gen == 1:
population = []
for _ in range(pop_size):
add_to = []
for _ in range(6):
add_to.append(random.randint(0,9))
population.append(add_to)
return population
else:
population = []
for _ in range(pop_size):
print('\n')
population.append(mutate(fittest))
print(population)
return population
def get_fittest(population):
fitness = []
for x in population:
fitness.append(fuzz.ratio(x, [9,9,9,9,9,9]))
fittest = fitness.index(max(fitness))
fittest_fitness = fitness[fittest]
fittest = population[fittest]
return fittest, fittest_fitness
done = False
generation = 1
population = gen(generation, 10, [0,0,0,0,0,0])
print(population)
while not done:
generation += 1
time.sleep(0.5)
print('Current Generation: ',generation)
print('Fittest: ',get_fittest(population))
if get_fittest(population)[1] == 100:
done = True
population = gen(generation, 10, get_fittest(population)[0])
print('Population: ',population)
Output:
Fittest: ([7, 4, 2, 7, 8, 9], 72)
[3, 4, 2, 7, 8, 9]
[[3, 4, 2, 7, 8, 9]]
[3, 4, 2, 7, 5, 9]
[[3, 4, 2, 7, 5, 9], [3, 4, 2, 7, 5, 9]]
[3, 4, 2, 7, 4, 9]
[[3, 4, 2, 7, 4, 9], [3, 4, 2, 7, 4, 9], [3, 4, 2, 7, 4, 9]]
[3, 1, 2, 7, 4, 9]
[[3, 1, 2, 7, 4, 9], [3, 1, 2, 7, 4, 9], [3, 1, 2, 7, 4, 9], [3, 1, 2, 7, 4, 9]]
[3, 1, 2, 7, 4, 2]
[[3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2], [3, 1, 2, 7, 4, 2]]
[3, 1, 2, 5, 4, 2]
[[3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2]]
[3, 1, 2, 5, 4, 2]
[[3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2], [3, 1, 2, 5, 4, 2]]
[3, 1, 2, 5, 4, 5]
[[3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5], [3, 1, 2, 5, 4, 5]]
[3, 1, 2, 5, 4, 3]
[[3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3], [3, 1, 2, 5, 4, 3]]
You keep passing fittest, a reference to the same list, to mutate, which modifies the list in-place, and append it to population, so changes to the list in the next iteration are reflected across all the references to fittest in the population list.
You should pass a copy of the fittest list to mutate instead:
population.append(mutate(fittest[:]))
It's right there in the name:
def mutate(parent):
x = random.randint(0,len(parent)-1)
parent[x] = random.randint(0,9)
print(parent)
return parent
mutate isn't making a new list, it's modifying the existing list in place and returning a reference to the same list it was passed. population.append(mutate(fittest)) is happily storing aliases to fittest over and over, and since fittest is never replaced, it just keeps getting mutated and new aliases to it stored.
If the goal is to store a snapshot of the list at a given stage each time, copy it, changing:
population.append(mutate(fittest))
to:
population.append(mutate(fittest)[:])
where a complete slice of the list makes a shallow copy. population.append(mutate(fittest).copy()) would also work on modern Python, as would import copy, then doing population.append(copy.copy(mutate(fittest))) or (if the contents might themselves be mutable, though they aren't in this case) population.append(copy.deepcopy(mutate(fittest))).
Note that as a rule, Python functions aren't really supposed to both mutate and return the mutated value, as it leads to confusion like this.
Pythonic code intended to mutate in place returns None (implicitly usually, by not returning at all), so the code you'd use would actually be:
def mutate(parent):
x = random.randint(0,len(parent)-1)
parent[x] = random.randint(0,9)
print(parent)
and used with:
mutate(fittest) # Mutate fittest in place
population.append(fittest[:]) # Append copy of most recently updated fittest
I need to compare the elements of a list of list. My code is for two items inside of the list of list but when I have more than two I don't know how proceed.
My inputs have the same len ever. And I need to compare d[][:1] and if it is repeated check the d[][:-1] and print the d[] with the less d[][:-1]
The print I need: d = [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
Code:
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
if d[0][:-1] == d[1][:-1]:
if d[0][-1] < d[1][-1]:
d.remove(d[1])
else:
d.remove(d[0])
>>> print d
[[1, 2, 3, 4, 5], [1, 2, 3, 4, 4], [3, 2, 4, 2, 5], [3, 2, 4, 2, 1]]
Edited:
from operator import itemgetter
from itertools import groupby
d = [['4027221', 'MX', '0.4', 3],
['4027221', 'MX', '30', 1],
['4027222', 'MX', '0.4', 3],
['4027222', 'MX', '30', 1]]
d.sort()
d = [min(g, key=lambda s: s[-2]) for _, g in groupby(d, key=lambda s: s[:-2])]
[['4027221', 'MX', '0.4', 3], ['4027222', 'MX', '0.4', 3]]
You can use itertools.groupby to group the list by all but the last item first, and then sort the sub-lists by the last item with min:
from operator import itemgetter
from itertools import groupby
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
print([min(g, key=itemgetter(-1)) for _, g in groupby(d, key=lambda s: s[:-1])])
This outputs:
[[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
if I understood well what you want, this should do the trick:
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
mins = {}
for a_list in d:
list_key = ','.join(map(str, a_list[:-1]))
list_orderer = a_list[-1]
if list_key not in mins or mins[list_key] > list_orderer:
mins[list_key] = a_list
print(sorted(mins.values())) # [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
It works in Python 2 and 3, it does not require the input to be sorted and it does not require any dependency (which is not a real argument).
You can do it this way too:
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
sublists = list(set(tuple(i[:-1]) for i in d))
mins = [min([elem for elem in d if elem[:-1]==list(s)])for s in sublists]
print(mins)
Output:
[[3, 2, 4, 2, 1], [1, 2, 3, 4, 4]]
Scaling up blhsing's solution. For bigger data and skipping the need to sort.
import pandas as pd
cols = ['v1', 'v2', 'v3', 'v4', 'v5']
df = pd.DataFrame(d, columns=cols)
ndf = df.groupby(cols[:-1], as_index=False).min()
out = ndf.values.tolist()
print(out)
[[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
You can use a dictionary, taking advantage of the fact that, as you iterate, only the last value will be attached to any given key. The solution does not require sorting.
d2 = {tuple(key): val for *key, val in d}
res = [list(k) + [v] for k, v in d2.items()]
print(res)
[[1, 2, 3, 4, 4],
[3, 2, 4, 2, 1]]
Note tuple conversion is required since lists are not hashable, so they cannot be used as dictionary keys.
Edit: as #JonClements suggests, you can write this more simply as:
res = list({tuple(el[:-1]): el for el in d}.values())
I have a list of lists like this: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]. How can I count the lists which are sublists of more than two lists? For example, here [2, 3] and [3, 4] would be the lists that are sublists of first 3 lists. I want to get rid of them.
This comprehension should do it:
data = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
solution = [i for i in data if sum([1 for j in data if set(i).issubset(set(j))]) < 3]
set_list = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
check_list = [[2, 3], [3, 4]]
sublist_to_list = {}
for set in set_list:
for i, sublist in enumerate(check_list):
count = 0
for element in sublist:
if element in set:
count += 1
if count == len(sublist):
if i not in sublist_to_list:
sublist_to_list[i] = [set]
else:
sublist_to_list[i].append(set)
print(sublist_to_list)
Output: {0: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3]], 1: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [3, 4]]}
which means [2, 3] is subset of [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3]]
and [3, 4] is subset of [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [3, 4]]
You can first make a function that gets sub lists of a list:
def sublists(lst):
length = len(lst)
for size in range(1, length + 1):
for start in range(length - size + 1):
yield lst[start:start+size]
Which works as follows:
>>> list(sublists([1, 2, 3, 4, 5]))
[[1], [2], [3], [4], [5], [1, 2], [2, 3], [3, 4], [4, 5], [1, 2, 3], [2, 3, 4], [3, 4, 5], [1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 4, 5]]
Then you can use this to collect all the sublists list indices into a collections.defaultdict:
from collections import defaultdict
lsts = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
d = defaultdict(list)
for i, lst in enumerate(lsts):
subs = sublists(lst)
while True:
try:
curr = tuple(next(subs))
d[curr].append(i)
except StopIteration:
break
Which will have tuple keys for the sublists, and the list indices as the values.
Then to determine sub lists that occur more than twice in all the lists, you can check if the set of all the indices has a length of more than two:
print([list(k) for k, v in d.items() if len(set(v)) > 2])
Which will give the following sublists:
[[2], [3], [4], [5], [2, 3], [3, 4], [4, 5], [2, 3, 4], [3, 4, 5], [2, 3, 4, 5]]
for i in file:
file = [i.strip() for i in file]
return [file[i:i+1] for i in range(0, len(file),1)]
OUTPUT:
[['1,3,4,5,2'], ['4,2,5,3,1'], ['1,3,2,5,4'], ['1,2,4,3,5'], ['1,3,4,5,2'], ['2,1,3,5,4'], ['1,3,4,5,2'], ['3,5,2,4,1'], ['1,4,5,2,3'], ['5,1,4,3,2'], ['3,2,5,4,1'], ['3,1,2,5,4'], ['2,5,1,4,3'], ['3,2,1,4,5'], ['4,5,3,1,2'], ['1,5,4,3,2'], ['1,5,3,4,2'], ['2,1,4,3,5'], ['4,1,2,5,3']]
This is my List with sub-lists.
How can I take off the ' '? So ['1,3,4,5,2'] -> [1,3,4,5,2]
Thanks! (:
You can try this:
s = [['1,3,4,5,2'], ['4,2,5,3,1'], ['1,3,2,5,4'], ['1,2,4,3,5'], ['1,3,4,5,2'], ['2,1,3,5,4'], ['1,3,4,5,2'], ['3,5,2,4,1'], ['1,4,5,2,3'], ['5,1,4,3,2'], ['3,2,5,4,1'], ['3,1,2,5,4'], ['2,5,1,4,3'], ['3,2,1,4,5'], ['4,5,3,1,2'], ['1,5,4,3,2'], ['1,5,3,4,2'], ['2,1,4,3,5'], ['4,1,2,5,3']]
final_data = [map(int, i[0].split(",")) for i in s]
Output:
[[1, 3, 4, 5, 2], [4, 2, 5, 3, 1], [1, 3, 2, 5, 4], [1, 2, 4, 3, 5], [1, 3, 4, 5, 2], [2, 1, 3, 5, 4], [1, 3, 4, 5, 2], [3, 5, 2, 4, 1], [1, 4, 5, 2, 3], [5, 1, 4, 3, 2], [3, 2, 5, 4, 1], [3, 1, 2, 5, 4], [2, 5, 1, 4, 3], [3, 2, 1, 4, 5], [4, 5, 3, 1, 2], [1, 5, 4, 3, 2], [1, 5, 3, 4, 2], [2, 1, 4, 3, 5], [4, 1, 2, 5, 3]]
try this:
import ast
l = []
s = [['1,3,4,5,2'], ['4,2,5,3,1'], ['1,3,2,5,4'], ['1,2,4,3,5'], ['1,3,4,5,2'], ['2,1,3,5,4'], ['1,3,4,5,2'], ['3,5,2,4,1'], ['1,4,5,2,3'], ['5,1,4,3,2'], ['3,2,5,4,1'], ['3,1,2,5,4'], ['2,5,1,4,3'], ['3,2,1,4,5'], ['4,5,3,1,2'], ['1,5,4,3,2'], ['1,5,3,4,2'], ['2,1,4,3,5'], ['4,1,2,5,3']]
for li in s:
l.append(list(ast.literal_eval(li[0])))
note: ast is a built-in (standard) module if you don't know it, it cames with python installation
One line solution :
list_1=[['1,3,4,5,2'], ['4,2,5,3,1'], ['1,3,2,5,4'], ['1,2,4,3,5'], ['1,3,4,5,2'], ['2,1,3,5,4'], ['1,3,4,5,2'], ['3,5,2,4,1'], ['1,4,5,2,3'], ['5,1,4,3,2'], ['3,2,5,4,1'], ['3,1,2,5,4'], ['2,5,1,4,3'], ['3,2,1,4,5'], ['4,5,3,1,2'], ['1,5,4,3,2'], ['1,5,3,4,2'], ['2,1,4,3,5'], ['4,1,2,5,3']]
print([[int(item_3) for item_3 in item_1 if item_3.isdigit()] for item in list_1 for item_1 in item])
output:
[[1, 3, 4, 5, 2], [4, 2, 5, 3, 1], [1, 3, 2, 5, 4], [1, 2, 4, 3, 5], [1, 3, 4, 5, 2], [2, 1, 3, 5, 4], [1, 3, 4, 5, 2], [3, 5, 2, 4, 1], [1, 4, 5, 2, 3], [5, 1, 4, 3, 2], [3, 2, 5, 4, 1], [3, 1, 2, 5, 4], [2, 5, 1, 4, 3], [3, 2, 1, 4, 5], [4, 5, 3, 1, 2], [1, 5, 4, 3, 2], [1, 5, 3, 4, 2], [2, 1, 4, 3, 5], [4, 1, 2, 5, 3]]
in detailed :
for item in list_1:
for item_1 in item:
new=[]
for item_3 in item_1:
if item_3.isdigit():
new.append(int(item_3))
print(new)
I am writing some python script to generate JSON. I constructed the jSON successfully. but stuck in getting selecting number in cyclic order.
let us say, i have a list of 1,2,3,4,5. I need to select first 4 numbers (1,2,3,4) here for first first item and 2,3,4,5 for second and 3,4,5,1 for third and it should go on till 30 times.
import json
import random
json_dict = {}
number = []
brokers = [1,2,3,4,5]
json_dict["version"] = version
json_dict["partitions"] = [{"topic": "topic1", "name": i,"replicas":
random.choice(brokers)} for i in range(0, 30)]
with open("output.json", "w") as outfile:
json.dump(json_dict, outfile, indent=4)
Output
"version": "1",
"partitions": [
{
"topic": "topic1",
"name": 0,
"replicas": 1,2,3,4
},
{
"topic": "topic1",
"name": 1,
"replicas": 2,3,4,5
},
{
"topic": "topic1",
"name": 3,
"replicas": 3,4,5,1
Anyway, how can i achieve this?
In order to get a cyclic elements from your brokers list you can use deque from collections module and do a deque.rotation(-1) like this example:
from collections import deque
def grouper(iterable, elements, rotations):
if elements > len(iterable):
return []
b = deque(iterable)
for _ in range(rotations):
yield list(b)[:elements]
b.rotate(-1)
brokers = [1,2,3,4,5]
# Pick 4 elements from brokers and yield 30 cycles
cycle = list(grouper(brokers, 4, 30))
print(cycle)
Output:
[[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5,1, 2], [5, 1, 2, 3]]
Also, this is a way how to implement this solution to your final dict:
# in this example i'm using only 5 cycles
cycles = grouper(brokers, 4, 5)
partitions = [{"topic": "topic1", "name": i, "replicas": cycle_elem} for i, cycle_elem in zip(range(5), cycles)]
final_dict = {"version": "1", "partitions": partitions}
print(final_dict)
Output:
{'partitions': [{'name': 0, 'replicas': [1, 2, 3, 4], 'topic': 'topic1'}, {'name': 1, 'replicas': [2, 3, 4, 5], 'topic': 'topic1'}, {'name': 2, 'replicas': [3, 4, 5, 1], 'topic': 'topic1'}, {'name': 3, 'replicas': [4, 5, 1, 2], 'topic': 'topic1'}, {'name': 4, 'replicas': [5, 1, 2, 3], 'topic': 'topic1'}], 'version': '1'}
Here's a purely procedural solution which also adds flexibility of selecting any number of groups, of any size (even bigger than the original 'brokers' list) with any offset:
def get_subgroups(groups, base, size, offset=1):
# cover the group size > len(base) case by expanding the base
# this step is completely optional if your group size will never be bigger
base *= -(-size // len(base))
result = [] # storage for our groups
base_size = len(base) # no need to call len() all the time
current_offset = 0 # tracking current cycle offset
for i in range(groups): # use xrange() on Python 2.x instead
tail = current_offset + size # end index for our current slice
end = min(tail, base_size) # normalize to the base size
group = base[current_offset:end] + base[:tail - end] # get our slice
result.append(group) # append it to our result storage
current_offset = (current_offset + offset) % base_size # increase our current offset
return result
brokers = [1, 2, 3, 4, 5]
print(get_subgroups(5, brokers, 4)) # 5 groups of size 4, with default offset
# prints: [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3]]
print(get_subgroups(3, brokers, 7, 2)) # 3 groups of size 7, with offset 2
# prints: [[1, 2, 3, 4, 5, 1, 2], [3, 4, 5, 1, 2, 3, 4], [5, 1, 2, 3, 4, 5, 1]]
And it does it in a O(N) time with a single loop.
If you're planning to run this on a very large generator, you can turn get_subgroups() function into a generator by forgoing the result collection and doing yield group instead of result.append(group). That way you can call it in a loop as: for group in get_subgroups(30, broker, 4): and store the group in whatever structure you want.
UPDATE
If memory is not an issue, we can optimize (processing-wise) this even more by expanding the whole base (or brokers in your case) to fit the whole set:
def get_subgroups(groups, base, size, offset=1): # warning, heavy memory usage!
base *= -(-(offset * groups + size) // len(base))
result = [] # storage for our groups
current_offset = 0 # tracking current cycle offset
for i in range(groups): # use xrange() on Python 2.x instead
result.append(base[current_offset:current_offset+size])
current_offset += offset
return result
Or we can make it even faster with list comprehension if we don't need the ability to turn it into a generator:
def get_subgroups(groups, base, size, offset=1): # warning, heavy memory usage!
base *= -(-(offset * groups + size) // len(base))
return [base[i:i+size] for i in range(0, groups * offset, offset)]
# as previously mentioned, use xrange() on Python 2.x instead
This is a pretty cool problem, and I think I have a pretty cool solution:
items = [1, 2, 3, 4, 5]
[(items * 2)[x:x+4] for i in range(30) for x in [i % len(items)]]
which gives
[[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3]]
What this is doing is taking your set of things and appending it to itself (items * 2 -> [1, 2, 3, 4, 5, 1, 2, 3 ,4, 5]), and then picking a starting place (x) by taking our loop iteration (i) and modulating it (probably not the right word) by the number of items we have (i in [x % len(items)]).