I am trying to read strings from a line, add a number to the beginning of each string, and add each result to an array. However, my code adds a number to EACH character of the string.
infile = open("milkin.txt","r").readlines()
outfile = open("milkout.txt","w")
number = infile[0]
arrayLoc = infile[1].split( )
array = infile[2].split( )
for i in infile[2]:
    counter = 1
    countered = str(counter)
    i = countered + i
    array.append(i)
output:
['2234567', '3222222', '4333333', '5444444', '6555555', '11', '12', '13', '14', '15', '16', '17', '1 ', '12' .... etc
intended output:
['12234567', '23222222', '34333333', '45444444', '56555555']
infile:
5
1 3 4 5 2
2234567 3222222 4333333 5444444 6555555
You need to loop over the array that you read from your file, and since it looks like you want to add sequential numbers to each element, you can use enumerate(array) to get the index of each element as you loop. You can add an argument to enumerate to tell it what number to start at (default is 0):
new_arr = []
for i, a in enumerate(array, 1):
    # 'i' will go 1, 2, ..., n where 'n' is the number of elements in 'array'
    # 'a' will be the ith element of 'array' (counting from 1)
    new_arr.append(str(i) + a)
print(new_arr)
['12234567', '23222222', '34333333', '45444444', '56555555']
As pointed out in a comment, this can be done much more concisely using a list comprehension, which is the more pythonic way to loop:
new_arr = [str(i) + a for i, a in enumerate(array, 1)]
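The original loop misbehaved because iterating over a string visits one character at a time, while iterating over the result of split() visits whole tokens. A minimal sketch of the difference, assuming the third line of the file is the string shown in the question (the variable `line` is a stand-in for `infile[2]`, not something from the original code):

```python
# Stand-in for infile[2]; the real value would come from milkin.txt.
line = "2234567 3222222 4333333 5444444 6555555\n"

# Iterating the string itself visits every character, spaces included.
chars = [c for c in line]

# Splitting first gives one item per whitespace-separated token.
tokens = line.split()

numbered = [str(i) + token for i, token in enumerate(tokens, 1)]

print(chars[:3])  # ['2', '2', '3']
print(numbered)   # ['12234567', '23222222', '34333333', '45444444', '56555555']
```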
I am writing code to mutate a list of strings. The list contains 6 elements: the first 4 are the attributes, the fifth is the class, and the last is the fitness. I want to pick a random attribute position and replace that position's attribute with a value picked from a value_range.
Below is the code :
import random, math
prob_mut = 0.1
no_of_attr = 4
def mutate(pos):
    pos_random = random.randint(0, value_range[pos])
    if pos_random == 0:
        pos_random = '*'
    return pos_random
# p stands for probability of mutation
# a stands for the number of attributes
# c is the list from which the attributes will be mutated
def mutation(p, a, c):
    result = []
    if isinstance(c, list):
        range_ = math.ceil(a * p)
        for _ in range(range_):
            pos = random.randint(0, a - 1)
            value = mutate(pos)
            c[pos] = value
        c = [str(i) for i in c]
        result = c
    return result
res = mutation(prob_mut, no_of_attr, ['1', '*', '1', '2', '1', '0.57'])
My value_range looks like: {0: 3, 1: 3, 2: 3, 3: 4}
Here I am passing a single list, and it gives me the correct output for that case. However, I want to pass a list of lists as the third argument of mutation and get back a list of lists as well. How can I do it?
My desired function call :
res = mutation(prob_mut, no_of_attr, [['1', '*', '1', '2', '1', '0.57'],['1', '*', '2', '*', '1', '0.97']])
The requirement is quite confusing, but I'll answer with the following understanding:
You want to pass a list of lists, and get mutation for each list in the list.
If this understanding is wrong, you'll have to explain your requirement better.
From a purely code perspective, you can simply change your mutation function as follows:
def mutation(p, a, cs):
    results = []
    for c in cs:
        if isinstance(c, list):
            range_ = math.ceil(a * p)
            for _ in range(range_):
                pos = random.randint(0, a - 1)
                value = mutate(pos)
                c[pos] = value
            c = [str(i) for i in c]
            results.append(c)
    return results
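Note that this version still overwrites the inner lists that were passed in. If that is not wanted, a possible variant works on a copy of each inner list. This is only a sketch: the name `mutation_copy` is hypothetical, and `value_range` is taken from the question.

```python
import math
import random

# Upper bounds per attribute position, as given in the question.
value_range = {0: 3, 1: 3, 2: 3, 3: 4}

def mutate(pos):
    value = random.randint(0, value_range[pos])
    return '*' if value == 0 else value

def mutation_copy(p, a, cs):
    """Return mutated copies of each inner list; the inputs stay untouched."""
    results = []
    for c in cs:
        mutated = list(c)  # shallow copy is enough for a flat list of strings
        for _ in range(math.ceil(a * p)):
            pos = random.randint(0, a - 1)
            mutated[pos] = mutate(pos)
        results.append([str(i) for i in mutated])
    return results

population = [['1', '*', '1', '2', '1', '0.57'],
              ['1', '*', '2', '*', '1', '0.97']]
res = mutation_copy(0.1, 4, population)
print(res)
```

Only positions 0 to 3 can change, so the class and fitness entries at the end of each list are carried over as they are.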
I have a string bar:
bar = 'S17H10E7S5H3E2S105H90E15'
I take this string and form groups that start with the letter S:
groups = ['S' + elem for elem in bar.split('S') if elem != '']
groups
['S17H10E7', 'S5H3E2', 'S105H90E15']
Without using the mini-language RegEx, I'd like to be able to get the integer values that follow the different letters S, H, and E in these groups. To do so, I'm using:
code = 'S'
temp_num = []
for elem in groups:
    start = elem.find(code)
    for char in elem[start + 1: ]:
        if not char.isdigit():
            break
        else:
            temp_num.append(char)
num_tests = ','.join(temp_num)
This gives me:
print(groups)
['S17H10E7', 'S5H3E2', 'S105H90E15']
print(temp_num)
['1', '7', '5', '1', '0', '5']
print(num_tests)
1,7,5,1,0,5
How would I take these individual digits 1, 7, 5, 1, 0, and 5 and put them back together to form a list of the integers following the code S? For example:
[17, 5, 105]
UPDATE:
In addition to the accepted answer, here is another solution:
def count_numbers_after_code(string_to_read, code):
    index_values = [i for i, char in enumerate(string_to_read) if char == code]
    temp_1 = []
    temp_2 = []
    for idx in index_values:
        temp_number = []
        for character in string_to_read[idx + 1: ]:
            if not character.isdigit():
                break
            else:
                temp_number.append(character)
        temp_1 = ''.join(temp_number)
        temp_2.append(int(temp_1))
    return sum(temp_2)
Would something like this work?
def get_numbers_after_letter(letter, bar):
    # Start as False so digits seen before the first 'letter' are ignored
    current = False
    out = []
    for x in bar:
        if x == letter:
            out.append('')
            current = True
        elif x.isnumeric() and current:
            out[-1] += x
        elif x.isalpha() and x != letter:
            current = False
    return list(map(int, out))
Output:
>>> get_numbers_after_letter('S', bar)
[17, 5, 105]
>>> get_numbers_after_letter('H', bar)
[10, 3, 90]
>>> get_numbers_after_letter('E', bar)
[7, 2, 15]
I think it's better to get all the numbers after every letter, since we're making a pass over the string anyway, but if you don't want to do that, this should work.
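A minimal sketch of that single-pass idea, collecting the numbers after every letter into a dict. The function name `numbers_by_letter` is hypothetical, and the sample string is the one that matches the groups shown in the question.

```python
def numbers_by_letter(text):
    """Map each letter to the list of integers that directly follow it."""
    result = {}
    current_letter = None
    digits = ''
    for ch in text:
        if ch.isdigit():
            digits += ch
            continue
        # A non-digit ends the number in progress, if there is one.
        if current_letter is not None and digits:
            result.setdefault(current_letter, []).append(int(digits))
        current_letter = ch if ch.isalpha() else None
        digits = ''
    # Flush the number that runs up to the end of the string.
    if current_letter is not None and digits:
        result.setdefault(current_letter, []).append(int(digits))
    return result

bar = 'S17H10E7S5H3E2S105H90E15'
print(numbers_by_letter(bar))
# {'S': [17, 5, 105], 'H': [10, 3, 90], 'E': [7, 2, 15]}
```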
The question states that you would favour a solution without using regex ("unless absolutely necessary" from the comments)
It is not necessary of course, but as an alternative for future readers you can match S and capture 1 or more digits using (\d+) in a group that will be returned by re.findall.
import re
bar = 'S17H10E7S5H3E2S105H90E15'
print(re.findall(r"S(\d+)", bar))
Output
['17', '5', '105']
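Note that re.findall returns strings, so getting the integers asked for takes one more step. The sketch below does that conversion and, as an assumption about what might be useful, also captures the letter in a second group so all codes are collected at once (the names `s_values` and `by_letter` are made up for this example):

```python
import re
from collections import defaultdict

bar = 'S17H10E7S5H3E2S105H90E15'

# Convert the captured digit strings to integers.
s_values = [int(n) for n in re.findall(r"S(\d+)", bar)]
print(s_values)  # [17, 5, 105]

# With two groups, findall returns (letter, digits) tuples.
by_letter = defaultdict(list)
for letter, number in re.findall(r"([A-Z])(\d+)", bar):
    by_letter[letter].append(int(number))
print(dict(by_letter))
# {'S': [17, 5, 105], 'H': [10, 3, 90], 'E': [7, 2, 15]}
```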
I want to fill an array with word suffixes while making a dictionary with their indexes.
In a loop I do the following:
for i in range(len(s)):
    suf = s[:j]
    suff_dict.update({suf: i})
    suff_arr[i][0] = suf
    suff_arr[i][1] = 0
    j -= 1
The dictionary is filled right, however, the array is filled only with the 1st letter.
[['H', 0], ['H', 0], ['H', 0], ['H', 0], ['H', 0], ['H', 0]]
{'HELLO': 1, 'HELL': 2, 'HEL': 3, 'HE': 4, 'H': 5}
Could you help me to find a problem?
I think maybe this is what you are looking for.
s='HELLO'
suff_arr=[]
suff_dict={}
for i in range(len(s)):
    suf = s[i:]
    suff_dict.update({suf: i})
    suff_arr.append(suf)
print(suff_arr, suff_dict)
I do not really understand why you would want nested lists with a zero, but if you do, you could do it like this:
s='HELLO'
suff_arr=[]
suff_dict={}
for i in range(len(s)):
    suf = s[i:]
    suff_dict.update({suf: i})
    suff_arr.append([suf,0])
print(suff_arr, suff_dict)
Also you said you wanted the word suffixes not prefixes, so I changed that too. If you want the prefixes, simply replace s[i:] with s[:i+1]
Since the data in this question is unclear, I can't tell exactly what you are trying to do. But from what I understand, this might help you.
s = 'HELLO'
suff_dict = {}
j=len(s)
suff_arr = []
for i in range(len(s)):
    suf = s[:j]
    suff_dict.update({suf: i})
    suff_arr.append([suf,0])
    j -= 1
First off, as the previous answers have indicated, this is a case for building your list with "append()". Here is some explanation for the unexpected results you were seeing. It has to do with how Python stores and refers to objects in memory.
Copy and run the code below. It's my attempt to show why you were getting unexpected results in your list. The "id()" function in Python returns a unique identifier for an object, and I use it to show where list values are being stored in memory.
print('**** Values stored directly in list. ****')
arr = [0] * 3
print('All the items in the list refer to the same memory address.')
c = 0
for item in arr:
    print(f'arr[{c}] id = ', id(item))
    c += 1
print('\n')
for i in range(len(arr)):
    arr[i] = i
print('As values are updated, new objects are created in memory:')
c = 0
for item in arr:
    print(f'arr[{c}] id = ', id(item))
    c += 1
print('And we see the results we expect:')
print(arr)
print('\n')
print('**** Values stored in sub list. ****')
arr = [[0]] * 3
print('All the items in the list refer to the same memory location.')
c = 0
for item in arr:
    print(f'arr[{c}] id = ', id(item))
    c += 1
print('\n')
for i in range(len(arr)):
    arr[i][0] = i
print('The same memory address is repeatedly overwritten.')
c = 0
for item in arr:
    print(f'arr[{c}] id = ', id(item))
    c += 1
print('\n')
print('And we say "Wut??"')
print(arr)
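If a pre-sized list of sub lists is really wanted, the usual way around the sharing shown above is to build each sub list separately, for example with a list comprehension. A small sketch (the names `shared` and `independent` are only for illustration, and it assumes the array in the question was created with the `*` operator):

```python
# One inner list repeated three times: every slot is the same object.
shared = [[0]] * 3
shared[0][0] = 99
print(shared)       # [[99], [99], [99]]

# A fresh inner list per iteration: every slot is its own object.
independent = [[0] for _ in range(3)]
independent[0][0] = 99
print(independent)  # [[99], [0], [0]]
```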
I have to search all elements in a list and replace all occurrences of one element with another. What is the best way to do this?
For example, suppose my list has the following elements:
data = ['a34b3f8b22783cf748d8ec99b651ddf35204d40c',
'baa6cb4298d90db1c375c63ee28733eb144b7266',
'CommitTest.txt',
'=>',
'text/CommitTest.txt',
'0',
'README.md',
'=>',
'text/README.md',
'0']
and I need to replace all occurrences of character '=>' with the combined value from elements before and after the character '=>', so the output I need is:
data = ['a34b3f8b22783cf748d8ec99b651ddf35204d40c',
'baa6cb4298d90db1c375c63ee28733eb144b7266',
'CommitTest.txt=>text/CommitTest.txt',
'0',
'README.md=>text/README.md',
'0']
This is my code I wrote so far:
ind = data.index("=>")
item_to_replace = data[ind]
combine = data[ind-1]+data[ind]+data[ind+1]
replacement_value = combine
indices_to_replace = [i for i,x in enumerate(data) if x==item_to_replace]
for i in indices_to_replace:
    data[i] = replacement_value
data
However, the unwanted output is like this :
data = ['a34b3f8b22783cf748d8ec99b651ddf35204d40c',
'baa6cb4298d90db1c375c63ee28733eb144b7266',
'CommitTest.txt',
'CommitTest.txt=>text/CommitTest.txt',
'text/CommitTest.txt',
'0',
'README.md',
'CommitTest.txt=>text/CommitTest.txt',
'text/README.md',
'0']
Is there a better way?
Your general algorithm is correct.
However, data.index("=>") will only find the index of the first occurrence of "=>".
You need to find all occurrences of "=>", store their indices in a list, then combine the elements and replace for each occurrence.
To find the indices of all occurrences of "=>", you can use:
indices = [i for i, x in enumerate(data) if x == "=>"]
As @alpha_989 suggested, first find the indices of the "=>" elements and then replace each occurrence. Hope this helps:
>>> indices = [i for i, x in enumerate(data) if x == "=>"]
>>> for i in indices:  # merge the elements before and after "=>" into the element before it
...     data[i-1] = data[i-1] + data[i] + data[i+1]
...
>>> for elem in data:
...     if elem == "=>":
...         del data[data.index("=>")+1]
...         del data[data.index("=>")]
...
>>> data
['a34b3f8b22783cf748d8ec99b651ddf35204d40c', 'baa6cb4298d90db1c375c63ee28733eb144b7266', 'CommitTest.txt=>text/CommitTest.txt', '0', 'README.md=>text/README.md', '0']
It was correctly pointed out to you that data.index will only return the index of the first occurrence of an element. Furthermore, your code does not remove the entries before and after the "=>".
For a solution that mutates your list, you could use del, but I recommend using this neat slicing syntax that Python offers.
indices = [i for i, val in enumerate(data) if val == '=>']
for i in reversed(indices):
    data[i-1: i+2] = [data[i-1] + data[i] + data[i+1]]
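The reversed() matters: each slice assignment shrinks the list by two, so every index to the right of the merge shifts. Working from the end keeps the remaining indices valid. A small sketch of the difference, using a made-up list and a hypothetical helper `merge_arrows` that takes the iteration order as a parameter:

```python
def merge_arrows(items, order):
    """Merge 'x', '=>', 'y' triples, visiting the '=>' indices in the given order."""
    data = list(items)
    indices = [i for i, val in enumerate(data) if val == '=>']
    for i in order(indices):
        data[i-1:i+2] = [data[i-1] + data[i] + data[i+1]]
    return data

sample = ['a', '=>', 'b', '0', 'c', '=>', 'd', '0', 'x']

# Back to front: earlier indices are unaffected by later merges.
print(merge_arrows(sample, reversed))  # ['a=>b', '0', 'c=>d', '0', 'x']

# Front to back: the second index is stale after the first merge.
print(merge_arrows(sample, list))      # ['a=>b', '0', 'c', '=>', 'd0x']
```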
I also suggest you attempt an implementation that generates a new list in a single pass. Mutating a list can be a bad practice and has no real advantage over creating a new list like so.
new_data = []
i = 0
while i < len(data):
    if i + 1 < len(data) and data[i + 1] == "=>":
        new_data.append(data[i] + data[i+1] + data[i+2])
        i += 3
    else:
        new_data.append(data[i])
        i += 1
Below is my little experiment, I added a function to call. You can check for it:
data = ['a34b3f8b22783cf748d8ec99b651ddf35204d40c',
'baa6cb4298d90db1c375c63ee28733eb144b7266',
'CommitTest.txt',
'=>',
'text/CommitTest.txt',
'0',
'README.md',
'=>',
'text/README.md',
'0']
def convert_list():
    ind = [i for i, x in enumerate(data) if x == "=>"]
    # "=>" needs a neighbor on both sides, so it cannot be the first or last element
    if 0 in ind or len(data) - 1 in ind:
        print("Invalid element location")
        return
    new_data = []
    index_start = 0
    while index_start < len(data):
        for ind_index in ind:
            if index_start == ind_index - 1:
                index_start += 3
                new_data.append(data[ind_index - 1] + data[ind_index] + data[ind_index + 1])
        # Guard against running past the end when a merged triple closes the list
        if index_start < len(data):
            new_data.append(data[index_start])
        index_start += 1
    return new_data
print(convert_list())
The indexes that need to be deleted are saved first, then those elements are skipped when building the new list.
delete_index=[]
for i,d in enumerate(data):
    if(d=="=>"):
        data[i]=data[i-1]+data[i]+data[i+1]
        delete_index.append(i-1)
        delete_index.append(i+1)
new_data=[]
for i,d in enumerate(data):
    if i not in delete_index:
        new_data.append(d)
print(new_data)
Although poorly written, this code:
marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
marker_array_DS = []
for i in range(len(marker_array)):
    if marker_array[i-1][1] != marker_array[i][1]:
        marker_array_DS.append(marker_array[i])
print marker_array_DS
Returns:
[['hard', '2', 'soft'], ['fast', '3'], ['turtle', '4', 'wet']]
It accomplishes part of the task which is to create a new list containing all nested lists except those that have duplicate values in index [1]. But what I really need is to concatenate the matching index values from the removed lists creating a list like this:
[['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]
The values in index [1] must not be concatenated. I kind of managed to do the concatenation part using a tip from another post:
newlist = [i + n for i, n in zip(list_a, list_b)]
But I am struggling with figuring out the way to produce the desired result. The "marker_array" list will be already sorted in ascending order before being passed to this code. All like-values in index [1] position will be contiguous. Some nested lists may not have any values beyond [0] and [1] as illustrated above.
Quick stab at it... use itertools.groupby to do the grouping for you, but do it over a generator that converts the 2 element list into a 3 element.
from itertools import groupby
from operator import itemgetter
marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
def my_group(iterable):
    # Pad 2-element lists with '' so every item has exactly 3 elements
    temp = ((el + [''])[:3] for el in iterable)
    for k, g in groupby(temp, key=itemgetter(1)):
        fst, snd = map(' '.join, zip(*map(itemgetter(0, 2), g)))
        yield filter(None, [fst, k, snd])
print list(my_group(marker_array))
from collections import defaultdict
d1 = defaultdict(list)
d2 = defaultdict(list)
for pxa in marker_array:
    d1[pxa[1]].extend(pxa[:1])
    d2[pxa[1]].extend(pxa[2:])
res = [[' '.join(d1[x]), x, ' '.join(d2[x])] for x in sorted(d1)]
If you really need 2-tuples (which I think is unlikely):
for p in res:
    if not p[-1]:
        p.pop()
marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
marker_array_DS = []
marker_array_hit = []
for i in range(len(marker_array)):
    if marker_array[i][1] not in marker_array_hit:
        marker_array_hit.append(marker_array[i][1])
for i in marker_array_hit:
    lists = [item for item in marker_array if item[1] == i]
    temp = []
    first_part = ' '.join([str(item[0]) for item in lists])
    temp.append(first_part)
    temp.append(i)
    second_part = ' '.join([str(item[2]) for item in lists if len(item) > 2])
    if second_part != '':
        temp.append(second_part)
    marker_array_DS.append(temp)
print marker_array_DS
I learned python for this because I'm a shameless rep whore
marker_array = [
    ['hard','2','soft'],
    ['heavy','2','light'],
    ['rock','2','feather'],
    ['fast','3'],
    ['turtle','4','wet'],
]
data = {}
for arr in marker_array:
    if len(arr) == 2:
        arr.append('')
    (first, index, last) = arr
    firsts, lasts = data.setdefault(index, [[],[]])
    firsts.append(first)
    lasts.append(last)
results = []
for key in sorted(data.keys()):
    current = [
        " ".join(data[key][0]),
        key,
        " ".join(data[key][1])
    ]
    if current[-1] == '':
        current = current[:-1]
    results.append(current)
print results
--output:--
[['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]
A different solution based on itertools.groupby:
from itertools import groupby
# normalizes the list of markers so all markers have 3 elements
def normalized(markers):
    for marker in markers:
        yield marker + [""] * (3 - len(marker))
def concatenated(markers):
    # use groupby to iterate over lists of markers sharing the same key
    for key, markers_in_category in groupby(normalized(markers), lambda m: m[1]):
        # get separate lists of left and right words
        lefts, rights = zip(*[(m[0],m[2]) for m in markers_in_category])
        # remove empty strings from both lists
        lefts, rights = filter(bool, lefts), filter(bool, rights)
        # yield the concatenated entry for this key (also removing the empty string at the end, if necessary)
        yield filter(bool, [" ".join(lefts), key, " ".join(rights)])
The generator concatenated(markers) will yield the results. This code correctly handles the ['fast', '3'] case and doesn't return an additional third element in such cases.
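The snippets in this thread were written for Python 2, where filter returns a list. Under Python 3 it returns a lazy iterator, so the yielded entries would not print as lists. A possible Python 3 adaptation of the same groupby idea is sketched below; the name `concatenated_py3` is hypothetical, and like the original it relies on the input already being sorted by the key.

```python
from itertools import groupby

def concatenated_py3(markers):
    # Pad every marker to 3 elements so index 2 always exists.
    normalized = (m + [""] * (3 - len(m)) for m in markers)
    for key, group in groupby(normalized, key=lambda m: m[1]):
        lefts, rights = zip(*[(m[0], m[2]) for m in group])
        left = " ".join(w for w in lefts if w)
        right = " ".join(w for w in rights if w)
        # Drop empty parts, e.g. the missing third element of ['fast', '3'].
        yield [part for part in (left, key, right) if part]

marker_array = [['hard', '2', 'soft'], ['heavy', '2', 'light'],
                ['rock', '2', 'feather'], ['fast', '3'], ['turtle', '4', 'wet']]
result = list(concatenated_py3(marker_array))
print(result)
# [['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]
```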