Turning a string into a list with specifications - python

I want to create a list from a string in Python that shows how many times each letter appears in a row inside the string.
For example:
my_string = "google"
I want to create a list that looks like this:
[['g', 1], ['o', 2], ['g', 1], ['l', 1], ['e', 1]]
Thanks!

You could use groupby from itertools:
from itertools import groupby
my_string= "google"
[(c, len(list(i))) for c, i in groupby(my_string)]
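If you need lists rather than tuples, as in the question, a small variation should work. This is only a sketch; the function name run_lengths is made up for illustration:

```python
from itertools import groupby

def run_lengths(text):
    # groupby yields (key, group) for each run of equal consecutive characters
    return [[char, sum(1 for _ in group)] for char, group in groupby(text)]

print(run_lengths("google"))  # [['g', 1], ['o', 2], ['g', 1], ['l', 1], ['e', 1]]
```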

You can use a regular expression and a dictionary to find and store the longest run of each letter, like this:
s = 'google'
nodubs = [s[0]] + [s[x] if s[x-1] != s[x] else '' for x in range(1,len(s))]
nodubs = ''.join(nodubs)
import re
dic = {}
for letter in set(s):
    # re.escape keeps characters such as '.' or '+' from being read as regex syntax
    matches = re.findall('%s+' % re.escape(letter), s)
    longest = max([len(x) for x in matches])
    dic[letter] = longest
print([[n,dic[n]] for n in nodubs])
Result:
[['g', 1], ['o', 2], ['g', 1], ['l', 1], ['e', 1]]
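Be aware that the dictionary above keeps only the longest run per letter, so a string where the same letter has runs of different lengths (for example "googgle") would report the longest run for every occurrence. If that matters, one possible alternative is to let the regex walk the runs in order. A sketch, with runs as a hypothetical helper name:

```python
import re

def runs(text):
    # (.) captures one character and \1* extends the match over its repeats
    return [[m.group(1), len(m.group(0))]
            for m in re.finditer(r"(.)\1*", text, re.DOTALL)]

print(runs("googgle"))  # [['g', 1], ['o', 2], ['g', 2], ['l', 1], ['e', 1]]
```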

Related

How to merge two lists by alternating elements from both lists? [duplicate]

I have two lists, the first of which is guaranteed to contain exactly one more item than the second. I would like to know the most Pythonic way to create a new list whose even-index values come from the first list and whose odd-index values come from the second list.
# example inputs
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
# desired output
['f', 'hello', 'o', 'world', 'o']
This works, but isn't pretty:
list3 = []
while True:
    try:
        list3.append(list1.pop(0))
        list3.append(list2.pop(0))
    except IndexError:
        break
How else can this be achieved? What's the most Pythonic approach?
If you need to handle lists of mismatched length (e.g. the second list is longer, or the first has more than one element more than the second), some solutions here will work while others will require adjustment. For more specific answers, see How to interleave two lists of different length? to leave the excess elements at the end, or How to elegantly interleave two lists of uneven length in python? to try to intersperse elements evenly.
Here's one way to do it by slicing:
>>> list1 = ['f', 'o', 'o']
>>> list2 = ['hello', 'world']
>>> result = [None]*(len(list1)+len(list2))
>>> result[::2] = list1
>>> result[1::2] = list2
>>> result
['f', 'hello', 'o', 'world', 'o']
There's a recipe for this in the itertools documentation (note: for Python 3):
from itertools import cycle, islice
def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))
import itertools
print([x for x in itertools.chain.from_iterable(itertools.zip_longest(list1,list2)) if x is not None])
I think this is the most Pythonic way of doing it. Note that it assumes neither list contains None.
In Python 2, this should do what you want:
>>> iters = [iter(list1), iter(list2)]
>>> print list(it.next() for it in itertools.cycle(iters))
['f', 'hello', 'o', 'world', 'o']
Without itertools and assuming l1 is 1 item longer than l2:
>>> sum(zip(l1, l2+[0]), ())[:-1]
('f', 'hello', 'o', 'world', 'o')
In python 2, using itertools and assuming that lists don't contain None:
>>> filter(None, sum(itertools.izip_longest(l1, l2), ()))
('f', 'hello', 'o', 'world', 'o')
If both lists have equal length, you can do:
[x for y in zip(list1, list2) for x in y]
As the first list has one more element, you can add it post hoc:
[x for y in zip(list1, list2) for x in y] + [list1[-1]]
To illustrate what is happening in that first list comprehension, this is how you would spell it out as a nested for loop:
result = []
for y in zip(list1, list2): # y is a 2-tuple, containing one element from each list
    for x in y: # iterate over the 2-tuple
        result.append(x)
I know the question asks about two lists with one having one item more than the other, but I figured I would post this for others who may find this question.
Here is Duncan's solution adapted to work with two lists of different sizes.
list1 = ['f', 'o', 'o', 'b', 'a', 'r']
list2 = ['hello', 'world']
num = min(len(list1), len(list2))
result = [None]*(num*2)
result[::2] = list1[:num]
result[1::2] = list2[:num]
result.extend(list1[num:])
result.extend(list2[num:])
result
This outputs:
['f', 'hello', 'o', 'world', 'o', 'b', 'a', 'r']
Here's a one liner that does it:
list3 = [ item for pair in zip(list1, list2 + [0]) for item in pair][:-1]
Here's a one liner using list comprehensions, w/o other libraries:
list3 = [sub[i] for i in range(len(list2)) for sub in [list1, list2]] + [list1[-1]]
Here is another approach, if you allow alteration of your initial list1 by side effect:
[list1.insert((i+1)*2-1, list2[i]) for i in range(len(list2))]
This one is based on Carlos Valiente's contribution above
with an option to alternate groups of multiple items and make sure that all items are present in the output :
A=["a","b","c","d"]
B=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
def cyclemix(xs, ys, n=1):
    for p in range(0,int((len(ys)+len(xs))/n)):
        for g in range(0,min(len(ys),n)):
            yield ys[0]
            ys.append(ys.pop(0))
        for g in range(0,min(len(xs),n)):
            yield xs[0]
            xs.append(xs.pop(0))
print [x for x in cyclemix(A, B, 3)]
This will interlace lists A and B by groups of 3 values each:
['a', 'b', 'c', 1, 2, 3, 'd', 'a', 'b', 4, 5, 6, 'c', 'd', 'a', 7, 8, 9, 'b', 'c', 'd', 10, 11, 12, 'a', 'b', 'c', 13, 14, 15]
It might be a bit late, but here is yet another Python one-liner. It works when the two lists have equal or unequal size. One thing worth noting is that it modifies a and b. If that's an issue, you need to use another solution.
a = ['f', 'o', 'o']
b = ['hello', 'world']
sum([[a.pop(0), b.pop(0)] for i in range(min(len(a), len(b)))],[])+a+b
['f', 'hello', 'o', 'world', 'o']
from itertools import chain
list(chain(*zip('abc', 'def'))) # Note: this only works for lists of equal length
['a', 'd', 'b', 'e', 'c', 'f']
itertools.zip_longest returns an iterator of tuple pairs with any missing elements in one list replaced with fillvalue=None (passing fillvalue=object lets you use None as a value). If you flatten these pairs, then filter fillvalue in a list comprehension, this gives:
>>> from itertools import zip_longest
>>> def merge(a, b):
...     return [
...         x for y in zip_longest(a, b, fillvalue=object)
...         for x in y if x is not object
...     ]
...
>>> merge("abc", "defgh")
['a', 'd', 'b', 'e', 'c', 'f', 'g', 'h']
>>> merge([0, 1, 2], [4])
[0, 4, 1, 2]
>>> merge([0, 1, 2], [4, 5, 6, 7, 8])
[0, 4, 1, 5, 2, 6, 7, 8]
Generalized to arbitrary iterables:
>>> def merge(*its):
...     return [
...         x for y in zip_longest(*its, fillvalue=object)
...         for x in y if x is not object
...     ]
...
>>> merge("abc", "lmn1234", "xyz9", [None])
['a', 'l', 'x', None, 'b', 'm', 'y', 'c', 'n', 'z', '1', '9', '2', '3', '4']
>>> merge(*["abc", "x"]) # unpack an iterable
['a', 'x', 'b', 'c']
Finally, you may want to return a generator rather than a list comprehension:
>>> def merge(*its):
...     return (
...         x for y in zip_longest(*its, fillvalue=object)
...         for x in y if x is not object
...     )
...
>>> merge([1], [], [2, 3, 4])
<generator object merge.<locals>.<genexpr> at 0x000001996B466740>
>>> next(merge([1], [], [2, 3, 4]))
1
>>> list(merge([1], [], [2, 3, 4]))
[1, 2, 3, 4]
If you're OK with other packages, you can try more_itertools.roundrobin:
>>> from more_itertools import roundrobin
>>> list(roundrobin('ABC', 'D', 'EF'))
['A', 'D', 'E', 'B', 'F', 'C']
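My take:
a = "hlowrd"
b = "el ol"
def func(xs, ys):
    ys = iter(ys)
    for x in xs:
        yield x
        try:
            yield next(ys)
        except StopIteration:
            # ys is exhausted, so stop (a bare StopIteration inside a
            # generator is an error in Python 3.7+)
            return
print([x for x in func(a, b)])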
def combine(list1, list2):
    lst = []
    len1 = len(list1)
    len2 = len(list2)
    for index in range( max(len1, len2) ):
        if index+1 <= len1:
            lst += [list1[index]]
        if index+1 <= len2:
            lst += [list2[index]]
    return lst
How about numpy? It works with strings as well:
import numpy as np
np.array([[a,b] for a,b in zip([1,2,3],[2,3,4,5,6])]).ravel()
Result:
array([1, 2, 2, 3, 3, 4])
Stops on the shortest:
from collections.abc import Iterable
from itertools import cycle
def interlace(*iters, next = next) -> Iterable:
    """
    interlace(i1, i2, ..., in) -> (
        i1-0, i2-0, ..., in-0,
        i1-1, i2-1, ..., in-1,
        .
        .
        .
        i1-n, i2-n, ..., in-n,
    )
    """
    return map(next, cycle([iter(x) for x in iters]))
Sure, resolving the next/__next__ method may be faster.
Multiple one-liners inspired by answers to another question:
import itertools
list(itertools.chain.from_iterable(itertools.izip_longest(list1, list2, fillvalue=object)))[:-1]
[i for l in itertools.izip_longest(list1, list2, fillvalue=object) for i in l if i is not object]
[item for sublist in map(None, list1, list2) for item in sublist][:-1]
An alternative in a functional & immutable way (Python 3):
from itertools import zip_longest
from functools import reduce
reduce(lambda lst, zipped: [*lst, *zipped] if zipped[1] is not None else [*lst, zipped[0]], zip_longest(list1, list2),[])
We can also achieve this easily with a for loop:
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
list3 = []
for i in range(len(list1)):
    list3.append(list1[i])
    if i < len(list2):
        list3.append(list2[i])
print(list3)
output :
['f', 'hello', 'o', 'world', 'o']
This could be shortened with a list comprehension, but the explicit loop is easier to follow.
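For reference, one way the loop might be written as a list comprehension is sketched below. It assumes, as the loop does, that list1 is at least as long as list2:

```python
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']

# For each index, take the item from list1 and, if one exists, the item from list2
list3 = [
    item
    for i, first in enumerate(list1)
    for item in ([first, list2[i]] if i < len(list2) else [first])
]
print(list3)  # ['f', 'hello', 'o', 'world', 'o']
```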
My approach looks as follows:
from itertools import chain, zip_longest
def intersperse(*iterators):
    # A random object not occurring in the iterators
    filler = object()
    r = (x for x in chain.from_iterable(zip_longest(*iterators, fillvalue=filler)) if x is not filler)
    return r
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
print(list(intersperse(list1, list2)))
It works for an arbitrary number of iterators and yields an iterator, so I applied list() in the print line.
def alternate_elements(small_list, big_list):
    mew = []
    count = 0
    for i in range(len(small_list)):
        mew.append(small_list[i])
        mew.append(big_list[i])
        count +=1
    return mew+big_list[count:]
if len(l2)>len(l1):
    res = alternate_elements(l1,l2)
else:
    res = alternate_elements(l2,l1)
print(res)
Here we swap the lists based on size so that the shorter one is always passed first. Can someone provide a better solution with time complexity O(len(l1)+len(l2))?
I'd do the simple:
chain.from_iterable( izip( list1, list2 ) )
It'll come up with an iterator without creating any additional storage needs.
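In Python 3, izip is gone and the built-in zip is already lazy. Note that zip stops at the shorter list, so for the lists in the question the trailing item of list1 has to be added separately. A rough sketch, assuming list1 is the longer list:

```python
from itertools import chain

list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']

# Lazily interleave the paired-up items; zip stops at the shorter list
interleaved = chain.from_iterable(zip(list1, list2))

# Append whatever is left over from list1 (assumes list1 is the longer one)
result = list(interleaved) + list1[len(list2):]
print(result)  # ['f', 'hello', 'o', 'world', 'o']
```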
This is nasty but works no matter the size of the lists:
list3 = [
    element for element in
    list(itertools.chain.from_iterable([
        val for val in itertools.izip_longest(list1, list2)
    ]))
    if element is not None
]
Obviously late to the party, but here's a concise one for equal-length lists:
output = [e for sub in zip(list1,list2) for e in sub]
It generalizes for an arbitrary number of equal-length lists, too:
output = [e for sub in zip(list1,list2,list3) for e in sub]
etc.
I'm too old to be down with list comprehensions, so:
import operator
list3 = reduce(operator.add, zip(list1, list2))

Find the first repeated letter in a string and the times it is repeated

I have the following string: "WPCOPEO" and I need to find the first repeated letter and the times it is repeated. I would appreciate some help with the coding.
string = "WPCOPEO"
def is_repeated(letter):
    for letter in String:
        if letter == letter
            print (letter)
It's pretty easy if you think about it: check whether each character is already in a set, and add it to the set if it is not.
>>> s=set()
>>> for i in string:
...     if i in s:
...         c=i
...         break
...     else:
...         s.add(i)
...
>>> c
'P'
>>> string.count(c)
2
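The same idea can be wrapped in a function. This is just a sketch; the name first_repeated and the choice to return None when nothing repeats are my own assumptions:

```python
def first_repeated(text):
    seen = set()
    for char in text:
        if char in seen:
            # char is the first character to show up a second time
            return char, text.count(char)
        seen.add(char)
    return None  # no character repeats

print(first_repeated("WPCOPEO"))  # ('P', 2)
```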
One way to get it is with a generator expression passed to next():
word = "WPCOPEO"
print (next([letter, word.count(letter)] for pos, letter in enumerate(word) if letter in word[pos+1:]))
Using a for loop:
for pos, letter in enumerate(word):
    if letter in word[pos+1:]:
        print (letter, word.count(letter))
        break
You can find the answer by first storing each letter's initial appearance along with its frequency.
{'O': [3, 2], 'E': [4, 1], 'P': [1, 2], 'W': [0, 1], 'C': [2, 1]}
Next, you can transform that into a list of [appearance, letter, frequency] entries.
[[0, 'W', 1], [1, 'P', 2], [2, 'C', 1], [3, 'O', 2], [4, 'E', 1]]
Then you can sort by frequency and appearance and grab the first item.
[[1, 'P', 2], [3, 'O', 2], [0, 'W', 1], [2, 'C', 1], [4, 'E', 1]]
Example
#! /usr/bin/env python3
def letter_frequency_by_initial_pos(word):
    occurs = {}
    first_appearance = 0
    for letter in word:
        if letter in occurs:
            occurs[letter][1] += 1
        else:
            occurs[letter] = [ first_appearance, 1 ]
            first_appearance += 1
    return occurs
def appearance_to_frequency(occurs):
    freq_by_appearance = [None] * len(occurs)
    for letter in occurs:
        freq_by_appearance[occurs[letter][0]] = [ occurs[letter][0], letter, occurs[letter][1] ]
    return sorted(freq_by_appearance, key = lambda x: (-x[2], x[0], x[1]))
if __name__ == '__main__':
    freq = appearance_to_frequency(letter_frequency_by_initial_pos('WPCOPEO'))
    print('The letter {} appears {} times.'.format(freq[0][1], freq[0][2]))
Output:
The letter P appears 2 times.
def is_repeated(string):
    for i in range(len(string)):
        check=string[i]
        # report the first character that also occurs later in the string
        if check in string[i+1:]:
            print("This character is repeated:", check, string.count(check))
            return
string = "WPCOPEO"
is_repeated(string)

Combinations with repetition of letters with weight from list

I have four letters with different weights as
letters = ['C', 'N', 'O', 'S']
weights_of_l = [1, 1, 2, 2]
I want to get the combinations of letters whose total weight is 2. A letter can be chosen repeatedly, and order is not important. The result can be a list, an array, or any other form, as long as it contains these combinations:
comb_w2 = ['CC','NN','NC','O','S']
Here C and N each have weight 1, so combining two of them gives weight 2. The possible combinations are 'CC', 'NN', 'NC'.
O and S already have weight 2, so they cannot be combined with other letters. Are there any libraries for calculating this? I saw itertools, but it gives only the number of possibilities, not the combinations.
Yours is a problem of partitioning (not easy stuff).
You can use this post to generate all possible combination outcomes for a given weight. Then you can delete the ones that contain keys which you don't have in weights_of_l. Finally, you substitute the numbers with the letters and create permutations for the letters that have the same weight.
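A rough sketch of that idea follows. It is not taken from the linked post: the helper names weight_partitions and letter_combos are hypothetical, and it assumes all weights are positive integers. It first splits the goal into sums of the available weights, then picks letters (with repetition) for each weight used:

```python
from itertools import combinations_with_replacement, product

def weight_partitions(goal, weights, start=0):
    """Yield non-decreasing tuples of available weights that sum to goal."""
    if goal == 0:
        yield ()
        return
    for i in range(start, len(weights)):
        w = weights[i]
        if w <= goal:
            for rest in weight_partitions(goal - w, weights, i):
                yield (w,) + rest

def letter_combos(letters, letter_weights, goal):
    # Group the letters by weight, e.g. {1: ['C', 'N'], 2: ['O', 'S']}
    by_weight = {}
    for letter, w in zip(letters, letter_weights):
        by_weight.setdefault(w, []).append(letter)
    results = []
    for part in weight_partitions(goal, sorted(by_weight)):
        # How many letters of each weight this partition needs
        counts = {w: part.count(w) for w in set(part)}
        # For each weight, choose that many letters, repetition allowed
        choices = [list(combinations_with_replacement(by_weight[w], n))
                   for w, n in sorted(counts.items())]
        for combo in product(*choices):
            results.append(''.join(ch for group in combo for ch in group))
    return results

print(letter_combos(['C', 'N', 'O', 'S'], [1, 1, 2, 2], 2))
# ['CC', 'CN', 'NN', 'O', 'S']
```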
My answer ended up being very similar to Turksarama's. Namely, if you need the combination of results, you have to sort the letters and use a set to get rid of the duplicates. My approach is more succinct, but requires calling set() with the function call.
letters = ['C', 'N', 'O', 'S']
weights = [1, 1, 2, 2]
items = list(zip(weights, letters))
def combinations(items, max_weight, weight=0, word=''):
    if weight == max_weight:
        yield ''.join(sorted(word))
    items_allowed = [(w, l) for w, l in items if max_weight - weight >= w]
    for w, l in items_allowed:
        for result in combinations(items_allowed, max_weight, weight+w, word+l):
            yield result
print(set(combinations(items, 2)))
@Sneha has a nice and succinct answer, but if you're going to have a lot of combinations, it might be better not to go too far in creating them. This solution is longer but will run faster for long lists of letters with large goal scores:
letters = ['C', 'N', 'O', 'S']
weights_of_l = [1, 1, 2, 2]
def get_combos(letters, weights, goal):
    weighted_letters = list(zip(letters, weights))
    combos = set()
    def get_combos(letters, weight):
        for letter, next_weight in weighted_letters:
            total = weight + next_weight
            if total == goal:
                combos.add(''.join(sorted(letters + letter)))
            elif total > goal:
                pass
            else:
                get_combos(letters + letter, weight+next_weight)
    get_combos('',0)
    return combos
print(get_combos(letters, weights_of_l, 3))
EDIT: I think this one might be even faster:
letters = ['C', 'N', 'O', 'S']
weights_of_l = [1, 1, 2, 2]
def get_combos(letters, weights, goal):
    weighted_letters = sorted(zip(weights, letters))
    combos = []
    def get_combos(letters, weight, weighted_letters):
        for i, (next_weight, letter) in enumerate(weighted_letters):
            total = weight + next_weight
            if total == goal:
                combos.append(letters + letter)
            elif total > goal:
                return
            else:
                get_combos(letters+letter, weight+next_weight, weighted_letters[i:])
    get_combos('',0,weighted_letters)
    return combos
print(get_combos(letters, weights_of_l, 3))
Create all combinations of the letters and use filter function to remove all combinations with combined weight not equal to 2.
from itertools import combinations_with_replacement
letters = ['C', 'N', 'O', 'S']
weights_of_l = [1, 1, 2, 2]
y=dict(zip(letters,weights_of_l)) # creates a dict mapping each letter to its weight
print(list(map(lambda x:''.join(x),filter(lambda x:y[x[0]]+y[x[1]]==2,combinations_with_replacement(letters,2)))))
Or you can first filter the letters list to keep only those whose weight is no greater than the weight you require, and then create all combinations.
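A possible sketch of the filter-first variant is below. Unlike the pairs-only snippet, it also tries combinations of other lengths, so single letters that already reach the target are included. The names goal, usable and max_len are illustrative, and it assumes positive integer weights and at least one usable letter:

```python
from itertools import combinations_with_replacement

letters = ['C', 'N', 'O', 'S']
weights_of_l = [1, 1, 2, 2]
goal = 2

weight_of = dict(zip(letters, weights_of_l))

# Keep only letters that can fit into the target weight at all
usable = [l for l in letters if weight_of[l] <= goal]

# No combination can be longer than goal divided by the lightest weight
max_len = goal // min(weight_of[l] for l in usable)

result = [
    ''.join(combo)
    for size in range(1, max_len + 1)
    for combo in combinations_with_replacement(usable, size)
    if sum(weight_of[l] for l in combo) == goal
]
print(result)  # ['O', 'S', 'CC', 'CN', 'NN']
```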
Try the following code:
def find_combinations(letter_list, weight_list, weight_sum):
    output_list = []
    letter_weight_dict = dict(zip(letter_list,weight_list))
    for key, val in letter_weight_dict.items():
        for key1, val1 in letter_weight_dict.items():
            if val+val1 == weight_sum:
                if (key + key1)[::-1] not in output_list:
                    output_list.append(key+key1)
        if val == weight_sum:
            output_list.append(key)
    return set(output_list)
letters = ['C', 'N', 'O', 'S']
weights_of_l = [1, 1, 2, 2]
combinations = find_combinations(letters, weights_of_l, 2)
print(combinations)
I got the following output:
['CC', 'S', 'NN', 'CN', 'O']
This may not be the best way to do this.

Building a list inside a list in python

I have been trying to add some data in a python list. I am actually going to store the data as a list inside a list. Now, the data is not coming index-wise.
To explain, let's say I have a list of lists 'a'. I have data for a[2] before a[1], and both a[1] and a[2] are lists themselves. Obviously I can't assign anything to a[2] before assigning a[1], and I don't know how many lists there will be; this is supposed to be dynamic.
Any solution to this, so that I can successfully build the list?
You could append empty lists until you have enough to access the index you have data for:
while len(outerlist) <= idx:
    outerlist.append([])
However, you may want to use a dictionary instead, which lets you implement a sparse object. A collections.defaultdict() object is especially useful here:
from collections import defaultdict
data = defaultdict(list)
data[2].append(3)
data[5].append(42)
data now has keys 2 and 5, each a list with one element. No entries for 0, 1, 3, or 4 exist yet.
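If you later need an ordinary list of lists, the sparse dictionary can be expanded once all the data has arrived. A minimal sketch, where dense and size are illustrative names; data.get() is used so that reading a missing index does not create a new key:

```python
from collections import defaultdict

data = defaultdict(list)
data[2].append(3)
data[5].append(42)

# One slot per index up to the largest key seen so far
size = max(data) + 1 if data else 0
dense = [list(data.get(i, [])) for i in range(size)]
print(dense)  # [[], [], [3], [], [], [42]]
```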
I had the same problem: filling an empty list with a fixed number of lists.
Here is my way out.
I made a 6x6 "board" filled with O, for instance:
board = []
for i in range(6): # create a list with nested lists
    board.append([])
    for n in range(6):
        board[i].append("O") # fills nested lists with data
Result:
[['O', 'O', 'O', 'O', 'O', 'O'],
['O', 'O', 'O', 'O', 'O', 'O'],
['O', 'O', 'O', 'O', 'O', 'O'],
['O', 'O', 'O', 'O', 'O', 'O'],
['O', 'O', 'O', 'O', 'O', 'O'],
['O', 'O', 'O', 'O', 'O', 'O']]
I think this solution solves your problem.
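The same board can probably be built more compactly with a list comprehension. In this sketch each row is created inside the comprehension, so the rows are separate lists:

```python
# Each pass through the comprehension builds a new row of six "O" strings
board = [["O"] * 6 for _ in range(6)]

board[0][0] = "X"  # changing one row leaves the others untouched
print(board[0])    # ['X', 'O', 'O', 'O', 'O', 'O']
print(board[1])    # ['O', 'O', 'O', 'O', 'O', 'O']
```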
Create the secondary list (inside_list) local to the for loop:
outside_list=[]
for i in range(0,5):
    inside_list=[]
    inside_list.append(i)
    inside_list.append(i+1)
    outside_list.append(inside_list)
#you can access any inside_list from the outside_list and append
outside_list[1].append(100)
print(outside_list)
Output:
[[0, 1], [1, 2, 100], [2, 3], [3, 4], [4, 5]]
You can do it, there is no problem appending an element.
>>> a = [[1,2,3], [10,20,30], [100,200,300]]
>>> a[2].append(400)
>>> a
[[1, 2, 3], [10, 20, 30], [100, 200, 300, 400]]
>>> a[1].append(40)
>>> a
[[1, 2, 3], [10, 20, 30, 40], [100, 200, 300, 400]]
We can use the first item in the list as an index and then make use of it for adding the right value to it at runtime using the list.
def unclean_(index, val, unclean=None):
    is_added = False
    if unclean is None:
        unclean = []
    length = len(unclean)
    if length == 0:
        unclean.append([index, val])
    else:
        for x in range(length):
            if unclean[x][0] == index:
                unclean[x].append(val)
                is_added = True
        if not is_added:
            unclean.append([index, val])
def terminate_even(x):
    if x % 2 == 0:
        raise Exception("Its even number")
def terminate_odd(x):
    if x % 2 != 0:
        raise Exception("Its odd number")
def fun():
    unclean = []
    for x in range(10):
        try:
            terminate_even(x)
        except:
            unclean_("terminate_even", x, unclean)
    for x in range(10):
        try:
            terminate_odd(x)
        except:
            unclean_("terminate_odd", x, unclean)
    for y in unclean:
        print(y)
def main():
    fun()
if __name__ == "__main__":
    main()
Output:
-------
['terminate_even', 0, 2, 4, 6, 8]
['terminate_odd', 1, 3, 5, 7, 9]
Simply
I am going to make it as simple as it gets, with no need for fancy modules.
Just make an empty list at the start of the file, and then append an empty list to it whenever you need one.
Example
Until the function get_data gives a certain output, add empty lists within a list.
#!/usr/bin/python3
#coding:utf-8
empty_list = []
r = ''
def get_data():
    # does something here and gives the output
    return output
imp_list = []
while r == %something%:
    r = get_data()
    # append a copy so the entries do not all share one list object
    imp_list.append(list(empty_list))
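When appending empty lists in a loop, it matters whether each entry is a new list or the same object appended repeatedly. A small sketch of the difference (the names template, shared and fresh are illustrative):

```python
template = []
shared = []
fresh = []
for _ in range(3):
    shared.append(template)  # the same list object every time
    fresh.append([])         # a new list object every time

shared[0].append('x')
fresh[0].append('x')
print(shared)  # [['x'], ['x'], ['x']]
print(fresh)   # [['x'], [], []]
```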

Counting consecutive characters in a string

I need to write code that slices the string (which is an input), appends it to a list, and counts the number of each letter. If a letter is identical to the one before it, it should not be added to the list again; instead, the count of the previous entry should be increased.
This is how it should look:
assassin [['a', 1], ['s', 2], ['a', 1], ['s', 2], ['i', 1], ['n', 1]]
The word assassin is just an example.
My code so far goes like this:
userin = raw_input("Please enter a string :")
inputlist = []
inputlist.append(userin)
biglist = []
i=0
count = {}
while i<(len(userin)):
    slicer = inputlist[0][i]
    for s in userin:
        if count.has_key(s):
            count[s] += 1
        else:
            count[s] = 1
    biglist.append([slicer,s])
    i = i+1
print biglist
Thanks!
Use collections.Counter(); a dictionary is a better way to store this:
>>> from collections import Counter
>>> strs="assassin"
>>> Counter(strs)
Counter({'s': 4, 'a': 2, 'i': 1, 'n': 1})
or using itertools.groupby():
>>> from itertools import groupby
>>> [[k, len(list(g))] for k, g in groupby(strs)]
[['a', 1], ['s', 2], ['a', 1], ['s', 2], ['i', 1], ['n', 1]]
last = ''
results = []
word = 'assassin'
for letter in word:
    if letter == last:
        results[-1] = (letter, results[-1][1] +1)
    else:
        results.append((letter, 1))
        last = letter
print(results) # [('a', 1), ('s', 2), ('a', 1), ('s', 2), ('i', 1), ('n', 1)]
Using only builtins:
def cnt(s):
current = [s[0],1]
out = [current]
for c in s[1:]:
if c == current[0]:
current[1] += 1
else:
current = [c, 1]
out.append(current)
return out
print cnt('assassin')
