How to print the strings having repeating characters? - python

The question is this:
Suppose I have a string S = 'ABC'. I want the output to be the list ['AAA','BBB','CCC','AAB','ABB','AAC','ACC','BBC','BCC'].
How do I achieve this result?
Edit: Thanks to @Breno Monteiro, I came up with a solution based on the example he showed. First I produced the list ['AAA','BBB','CCC'] by repeating each character 3 times. Then, for each element of that list, I replaced the first character, and then the first two characters, with the next character of the string (wrapping around), i.e. 'A' is replaced by 'B', 'B' by 'C', and 'C' by 'A'. So the actual output came out to be ['AAA', 'BBB', 'CCC', 'BAA', 'BBA', 'CBB', 'CCB', 'ACC', 'AAC']
My code:
string='ABC'
K=3
output=['AAA','BBB','CCC','AAB','ABB','AAC','ACC','BBC','BCC']
s=""
exp_output,temp=[],[]
ind=1
#including all repeating characters in the string
for i in string:
    s+=i*K
    exp_output.append(s)
    s=""
#including all repeating characters by the first and second index
for i in exp_output:
    for j in range(K-1):
        i=i.replace(i[j],string[ind%len(string)],1)
        temp.append(i)
        #print(temp)
    ind+=1
exp_output.extend(temp)
print(exp_output)

The simplest way to repeat characters in Python is:
character = 'A'
repeat_times = 3
print(character * repeat_times)
Output: AAA
You can also iterate over a Python string character by character, like this:
characters = 'ABC'
repeat_times = 3
for character in characters:
    print(character*repeat_times)
Output (one string per line): AAA, BBB, CCC
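The list asked for in the question appears to contain every unordered selection of 3 characters from 'ABC' in which at least one character repeats. Under that reading (an assumption, since the question does not state the rule), a minimal sketch with itertools could look like this. The function name repeated_char_strings is made up for illustration, and the result comes out in a different order than the list in the question.

```python
from itertools import combinations_with_replacement

def repeated_char_strings(chars, k):
    """Length-k selections from chars (order ignored) with at least one repeat."""
    return [
        "".join(combo)
        for combo in combinations_with_replacement(chars, k)
        # fewer distinct characters than positions means something repeats
        if len(set(combo)) < k
    ]

result = repeated_char_strings("ABC", 3)
print(result)
# ['AAA', 'AAB', 'AAC', 'ABB', 'ACC', 'BBB', 'BBC', 'BCC', 'CCC']
```

Sorting both lists shows they hold the same nine strings; only 'ABC', which has no repeated character, is filtered out.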

Related

count words from list in another list in entry one

Hi,
I want to count how often the given strings from one list occur in the first entry (position zero) of another list.
list_given_atoms= ['C', 'Cl', 'Br']
list_of_molecules= ['C(B2Br)[Cl{H]Cl}P' ,'NAME']
When Python finds a match, it should be saved in a dictionary like
countdict = [ 'Cl : 2', 'C : 1', 'Br : 1']
I already tried
re.findall(r'\w+', list_of_molecules[0])
but that results in words like "B2Br", which is definitely not what I want.
Can someone help me?
[a-zA-Z]+ should be used instead of \w+ because \w+ will match both letters and numbers, while you are just looking for letters:
import re
list_given_atoms= ['C', 'Cl', 'Br']
list_of_molecules= ['C(B2Br)[Cl{H]Cl}P' ,'NAME']
molecules = re.findall('[a-zA-Z]+', list_of_molecules[0])
final_data = {i:molecules.count(i) for i in list_given_atoms}
Output:
{'C': 1, 'Br': 1, 'Cl': 2}
You could use something like this:
>>> import re
>>> from collections import Counter
>>> Counter(re.findall('|'.join(sorted(list_given_atoms, key=len, reverse=True)), list_of_molecules[0]))
Counter({'Cl': 2, 'C': 1, 'Br': 1})
You have to sort the elements by length, longest first, so that 'Cl' is tried before 'C'.
Short re.findall() solution:
import re
list_given_atoms = ['C', 'Cl', 'Br']
list_of_molecules = ['C(B2Br)[Cl{H]Cl}P' ,'NAME']
d = { a: len(re.findall(r'' + a + '(?=[^a-z]|$)', list_of_molecules[0], re.I))
      for a in list_given_atoms }
print(d)
The output:
{'C': 1, 'Cl': 2, 'Br': 1}
I tried your solutions and figured out that there can also be several C's right after each other. So I came up with this:
counter_dict = {}
for element in re.findall(r'([A-Z])([a-z])?', list_of_molecules[0]):
    # element is a (capital, optional lowercase) tuple; the second item is '' if absent
    if element[1].islower():
        counter = element[0] + element[1]
        if counter not in counter_dict:
            counter_dict[counter] = 1
        else:
            counter_dict[counter] += 1
In the same way I checked for elements consisting of just one capital letter and added them to the dictionary. There is probably a better way.
You can't use \w, as a word character is equivalent to:
[a-zA-Z0-9_]
which clearly includes digits, so "B2Br" is matched.
You also can't just use the regex:
[a-zA-Z]+
as that would produce one atom for something like "CO2", which should produce 2 separate atoms: C and O.
However, the regex I came up with just checks for a capital letter followed by between 0 and 1 (so optional) lowercase letters.
Here it is:
[A-Z][a-z]{0,1}
and it will correctly produce the atoms.
So to incorporate this into your original lists of:
list_given_atoms= ['C', 'Cl', 'Br']
list_of_molecules= ['C(B2Br)[Cl{H]Cl}P' ,'NAME']
we want to first find all the atoms in list_of_molecules and then create a dictionary of the counts of the atoms in list_given_atoms.
So to find all the atoms, we can use re.findall on the first element in the molecules list:
atoms = re.findall("[A-Z][a-z]{0,1}", list_of_molecules[0])
which gives a list:
['C', 'B', 'Br', 'Cl', 'H', 'Cl', 'P']
then, to get the counts in a dictionary, we can use a dictionary-comprehension:
counts = {a: atoms.count(a) for a in list_given_atoms}
which gives the desired result of:
{'Cl': 2, 'C': 1, 'Br': 1}
And would also work when we have molecules like CO2 etc.
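The two steps above (find all element symbols, then count the ones of interest) can be folded into one small helper. This is only a sketch: the name count_atoms is invented here, and it counts how often a symbol appears in the string, not the numeric subscripts, so "CO2" counts O once.

```python
import re
from collections import Counter

def count_atoms(formula, atoms_of_interest):
    """Count occurrences of each requested element symbol in formula."""
    # One capital letter optionally followed by one lowercase letter
    found = Counter(re.findall(r"[A-Z][a-z]?", formula))
    # A Counter returns 0 for symbols that never occur
    return {atom: found[atom] for atom in atoms_of_interest}

print(count_atoms("C(B2Br)[Cl{H]Cl}P", ["C", "Cl", "Br"]))
# {'C': 1, 'Cl': 2, 'Br': 1}
print(count_atoms("CO2", ["C", "O"]))
# {'C': 1, 'O': 1}
```

Using a Counter means the string is scanned once, instead of calling list.count() once per atom.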

Sort text based on last 3rd character

I am using the sorted() function to sort strings by their last character,
which works perfectly:
def sort_by_last_letter(strings):
    def last_letter(s):
        return s[-1]
    return sorted(strings,key=last_letter)
print(sort_by_last_letter(["hello","from","last","letter","a"]))
Output
['a', 'from', 'hello', 'letter', 'last']
My requirement is to sort by the third-to-last character. The problem is that a few of the words are shorter than 3 characters; in that case they should be sorted by the next available position (the second-to-last character if present, otherwise the last). I am looking for a Pythonic way to do this.
Presently I am getting
IndexError: string index out of range
def sort_by_last_letter(strings):
    def last_letter(s):
        return s[-3]
    return sorted(strings,key=last_letter)
print(sort_by_last_letter(["hello","from","last","letter","a"]))
You can use:
return sorted(strings,key=lambda x: x[max(0,len(x)-3)])
Here we first compute len(x) - 3, the index of the third-to-last character. If the string is shorter than three characters this index is negative, so max(0,..) clamps it to 0 and we take the first character instead, which is the second-to-last character of a two-character string or the only character of a one-character string.
This works provided every string has at least one character. It produces:
>>> sorted(["hello","from","last","letter","a"],key=lambda x: x[max(0,len(x)-3)])
['last', 'a', 'hello', 'from', 'letter']
If you do not care about tie-breakers (in other words, if 'a' and 'abc' may be reordered), you can use a more elegant approach:
from operator import itemgetter
return sorted(strings,key=itemgetter(slice(-3,None)))
Here we build a slice containing the last three characters (or fewer, for short strings) and compare those substrings. This produces:
>>> sorted(strings,key=itemgetter(slice(-3,None)))
['a', 'last', 'hello', 'from', 'letter']
since the comparison keys are:
['a', 'last', 'hello', 'from', 'letter']
# ['a', 'ast', 'llo', 'rom', 'ter'] (comparison key)
You can simply use the minimum of the string length and 3:
def sort_by_last_letter(strings):
    def last_letter(s):
        return s[-min(len(s), 3)]
    return sorted(strings,key=last_letter)
print(sort_by_last_letter(["hello","from","last","letter","a"]))
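The index-based answers above raise IndexError on an empty string. One possible variant, shown here only as a sketch, relies on the fact that slicing never raises: s[-3:] is the last three characters (or fewer), and [:1] takes the first of those, or '' for an empty string. The function name sort_by_third_last is made up for this example.

```python
def sort_by_third_last(strings):
    """Sort by the third-to-last character, falling back for short strings."""
    # s[-3:] keeps at most the last three characters; [:1] keeps the first
    # of those, so the key is '' for an empty string instead of an error.
    return sorted(strings, key=lambda s: s[-3:][:1])

print(sort_by_third_last(["hello", "from", "last", "letter", "a"]))
# ['last', 'a', 'hello', 'from', 'letter']
print(sort_by_third_last(["hello", "", "a"]))
# ['', 'a', 'hello']
```

Because sorted() is stable, strings with the same key keep their original relative order, which is why 'last' still comes before 'a'.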

Split string into array in Python

I have a string with the following structure.
string = "[abcd, abc, a, b, abc]"
I would like to convert that into an array. I keep using the split function in Python but I get spaces and the brackets on the start and the end of my new array. I tried working around it with some if statements but I keep missing letters in the end from some words.
Keep in mind that I don't know the length of the elements in the string. It could be 1, 2, 3 etc.
Assuming your elements never end or start with spaces or square brackets, you could just strip them out (the bracket can be stripped out before splitting):
arr = [ x.strip() for x in string.strip('[]').split(',') ]
It gives as expected
print (arr)
['abcd', 'abc', 'a', 'b', 'abc']
The nice part with strip is that it leaves all inner characters untouched. With:
string = "[ab cd, a[b]c, a, b, abc]"
You get: ['ab cd', 'a[b]c', 'a', 'b', 'abc']
You can also do this
>>> s = string[1:len(string)-1].split(", ")
>>> s
['abcd', 'abc', 'a', 'b', 'abc']
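If the values in this list are variables themselves (it looks like it, because they're not quoted), the easiest way to convert this string to the equivalent list is
string = eval(string)
Caution: If the values in your list should be strings, this will not work, because eval will look up names such as abcd and raise a NameError if they are not defined. Also note that eval executes arbitrary code, so it should not be used on untrusted input.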
Another way to solve this problem:
string = "[abcd, abc, a, b, abc]"
result = string[1:len(string)-1].split(", ")
print(result)
Hope this helps.
First remove [ and ] from your string, then split on commas, then remove spaces from resulting items (using strip).
If you do not want to use strip, it can be done in the following, rather clumsy, way:
arr = [e[1:] for e in string.split(',')]
arr[-1] = arr[-1].replace(']', '')
print(arr)
['abcd', 'abc', 'a', 'b', 'abc']
I would suggest the following:
[list_element.strip() for list_element in string.strip("[]").split(",")]
First remove brackets and then split it accordingly.
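The strip-then-split idea from the answers above can be wrapped in a small function. This is a sketch under the same assumption as before (elements never start or end with spaces or square brackets); the name parse_bracketed_list is invented here, and the empty-list check is an addition not discussed in the answers.

```python
def parse_bracketed_list(text):
    """Turn a string like "[abcd, abc, a]" into ['abcd', 'abc', 'a']."""
    # Drop surrounding whitespace, then the outer square brackets
    inner = text.strip().strip("[]")
    # "[]" or "[ ]" should give an empty list rather than ['']
    if not inner.strip():
        return []
    # Split on commas and trim the spaces around each element
    return [item.strip() for item in inner.split(",")]

print(parse_bracketed_list("[abcd, abc, a, b, abc]"))
# ['abcd', 'abc', 'a', 'b', 'abc']
print(parse_bracketed_list("[]"))
# []
```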

Python: Creating a new list that is dependent on the content of another list

I am passing a function a list populated with strings. I want this function to take each of those strings and iterate through them, executing two different actions depending on the letters found in each string, then displaying them as the separate and now changed strings in a new list.
Specifically, when the program iterates through each string and finds a consonant, it should write that consonant in the order that it was found, into the new list. If the program finds a vowel in the current string, it should append 'xy' before the vowel, then the vowel itself.
As an example:
If the user input: "how now brown cow", the output of the function should be: "hxyow nxyow brxyown cxyow". I've tried nested for loops, nested while loops, and variations between. What's the best way to accomplish this? Cheers!
For every character in the old string, check whether it is a vowel or a consonant and build the new string accordingly.
old = "how now brown cow"
new = ""
for character in old:
    if character in ('a', 'e', 'i', 'o', 'u'):
        new = new + "xy" + character
    else:
        new = new + character
print(new)
I gave you the idea; I leave it as an exercise to make it work for a list of strings. Also make the appropriate changes if you are using Python 2.
>>> def xy(st):
...     my_list,st1 =[],''
...     for x in st:
...         if x in 'aeiou':
...             st1 += 'xy'+x
...         else:
...             if x in 'cbdgfhkjmlnqpsrtwvyxz':
...                 my_list.append(x)
...             st1 += x
...     return my_list,st1
...
>>> my_string="how now brown cow"
>>> xy(my_string)
(['h', 'w', 'n', 'w', 'b', 'r', 'w', 'n', 'c', 'w'], 'hxyow nxyow brxyown cxyow')
In the function above, the for loop iterates through the string. When it finds a vowel, it appends 'xy' plus the vowel to the result string; otherwise it copies the character unchanged and, if the character is a consonant, also appends it to the list. At the end it returns both the list and the string.
A simple way to do this using a list comprehension:
old_str = "how now brown cow"
new_str = ''.join(["xy" + c if c in "aeiou" else c for c in old_str])
print(new_str)
But if you're processing a lot of data it'd be more efficient to use a set of vowels, e.g.
vowels = set("aeiou")
old_str = "how now brown cow"
new_str = ''.join(["xy" + c if c in vowels else c for c in old_str])
print(new_str)
Note that these programs simply copy all characters that aren't vowels, i.e., spaces, numbers and punctuation get treated as if they were consonants.
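Since the question passes a list of strings to a function, here is one way the per-string approach could be extended to a list. It is a sketch only: the names add_xy and add_xy_to_all are invented for this example, and only lowercase vowels are handled, as in the answers above.

```python
VOWELS = set("aeiou")  # lowercase only, matching the answers above

def add_xy(text):
    """Insert 'xy' before every vowel; all other characters are copied."""
    return "".join("xy" + ch if ch in VOWELS else ch for ch in text)

def add_xy_to_all(strings):
    """Apply add_xy to each string and return the results as a new list."""
    return [add_xy(s) for s in strings]

print(add_xy_to_all(["how", "now", "brown", "cow"]))
# ['hxyow', 'nxyow', 'brxyown', 'cxyow']
print(add_xy("how now brown cow"))
# hxyow nxyow brxyown cxyow
```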

String split formatting in python 3

I'm trying to format this string below where one row contains five words. However, I keep getting this as the output:
I love cookies yes I do Let s see a dog
First, I am not getting 5 words in one line, but instead, everything in one line.
Second, why does the "Let's" get split? I thought in splitting the string using "words", it will only split if there was a space in between?
Suggestions?
import re
string = """I love cookies. yes I do. Let's see a dog."""
# split string
words = re.split(r'\W+',string)
words = [i for i in words if i != '']
counter = 0
output=''
for i in words:
    if counter == 0:
        output +="{0:>15s}".format(i)
    # if counter == 5, new row
    elif counter % 5 == 0:
        output += '\n'
        output += "{0:>15s}".format(i)
    else:
        output += "{0:>15s}".format(i)
    # Increase the counter by 1
    counter += 1
print(output)
As a start, avoid calling a variable "string", since it shadows the standard-library module with the same name.
Secondly, use split() to do your word splitting:
>>> s = """I love cookies. yes I do. Let's see a dog."""
>>> s.split()
['I', 'love', 'cookies.', 'yes', 'I', 'do.', "Let's", 'see', 'a', 'dog.']
From re-module
\W
Matches any character which is not a Unicode word character. This is the opposite of \w. If the ASCII flag is used this becomes the equivalent of [^a-zA-Z0-9_] (but the flag affects the entire regular expression, so in such cases using an explicit [^a-zA-Z0-9_] may be a better choice).
Since ' is not a word character, \W matches it, so the regexp splits the "Let's" string into two parts:
>>> words = re.split('\W+', s)
>>> words
['I', 'love', 'cookies', 'yes', 'I', 'do', 'Let', 's', 'see', 'a', 'dog', '']
This is the output I get using the split() approach above:
$ ./sp3.py
              I           love       cookies.            yes              I
            do.          Let's            see              a           dog.
The code could probably be simplified to this, since the counter == 0 branch and the else clause do the same thing. I threw in an enumerate as well to get rid of the counter:
#!/usr/bin/env python3
s = """I love cookies. yes I do. Let's see a dog."""
words = s.split()
output = ''
for n, i in enumerate(words):
    if n % 5 == 0:
        output += '\n'
    output += "{0:>15s}".format(i)
print(output)
words = string.split()
while words:
    for word in words[:5]:
        print(word, end=" ")
    print()
    words = words[5:]
That's the basic concept: split the string using the split() method,
then use slice notation to get the first 5 words,
then slice off the first 5 words and loop again.
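The split-and-slice idea can also be combined with the 15-character right alignment from the question. The following is a sketch, not taken from any of the answers: the function name format_rows and its per_row and width parameters are made up for illustration.

```python
def format_rows(text, per_row=5, width=15):
    """Split text on whitespace and lay the words out per_row to a line."""
    words = text.split()
    rows = []
    # Step through the word list in chunks of per_row
    for start in range(0, len(words), per_row):
        chunk = words[start:start + per_row]
        # Right-align every word in a field of `width` characters
        rows.append("".join(f"{word:>{width}}" for word in chunk))
    return "\n".join(rows)

print(format_rows("I love cookies. yes I do. Let's see a dog."))
```

Stepping through the list with range() avoids re-slicing the remaining words on every pass, and joining the rows with "\n" avoids a leading blank line.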
