Separate each letter from a text without saving the same letter twice - python

I wanted to know how I could split a text into the different letters it contains without saving the same letter twice in Python. So the output for a text like "hello" would be {'h','e','l','o'}, counting the letter l only once.

As the comments say, put your word in a set to remove duplicates:
>>> set("hello")
set(['h', 'e', 'l', 'o'])
Iterate through it (sets don't have order, so don't count on that):
>>> h = set("hello")
>>> for c in h:
...     print(c)
...
h
e
l
o
Test if a character is in it:
>>> 'e' in h
True
>>> 'x' in h
False

There are a few ways to do this...
word = set('hello')
Or the following...
letters = []
for letter in "hello":
    if letter not in letters:
        letters.append(letter)
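If the order in which the letters first appear matters, a set alone won't preserve it. Here is a minimal sketch, assuming Python 3.7+ (where dicts keep insertion order); the function name unique_letters is only illustrative:

```python
def unique_letters(text):
    """Return each distinct character of text once, in first-seen order."""
    # dict keys are unique and (since Python 3.7) keep insertion order
    return list(dict.fromkeys(text))


print(unique_letters("hello"))  # ['h', 'e', 'l', 'o']
```

This does the same job as the loop with `letters.append(letter)`, but without the repeated list membership test.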

Related

Getting a word from numbers with combinations

I'll give an example: I have the word lol, which can be written as 1o1. The number "1" can stand for l, i, or j. I have a list of blacklisted words, blacklisted = ['lol']. I want to replace the "1" characters in 1o1 to check whether the result is equal to lol. Example: result = ['ioi', 'joj', 'lol', 'joi']...
Is there a way to do that?
blacklisted = ['lol']
check = ['1o1']
substitute = ['l', 'i', 'j']
result = [ check[0].replace('1', char) for char in substitute ]
print(result)
Obviously, you can generalize the snippet above to values other than '1' and use it in a loop that checks many words.
Use a dictionary to store these similar values. For example, here's a function to implement this functionality:
def check(string, blacklisted):
    translate_dict = {'1': ['l', 'i', 'j']} # etc. for similar corresponding values
    possible_strings = []
    for num, letter in translate_dict.items(): # this loop replaces all numbers with letters
        for i in letter:
            possible_strings.append(string.replace(num, i))
    for string in possible_strings:
        if string in blacklisted: # checks whether the string is in blacklisted
            return False
    return True
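Note that str.replace swaps every "1" for the same letter, so mixed variants such as 'joi' from the question are never produced. One possible sketch that tries every combination uses itertools.product; the names expand_variants and is_blacklisted are hypothetical, and unlike check() above, is_blacklisted returns True when a match is found:

```python
from itertools import product


def expand_variants(word, substitutions):
    """Return every string obtained by replacing each mapped character
    independently with one of its candidate letters."""
    # For each position: the candidate letters, or the character itself
    choices = [substitutions.get(ch, ch) for ch in word]
    return {''.join(combo) for combo in product(*choices)}


def is_blacklisted(word, blacklisted, substitutions):
    """True if any variant of word appears in blacklisted."""
    return not expand_variants(word, substitutions).isdisjoint(blacklisted)


subs = {'1': ['l', 'i', 'j']}  # example mapping taken from the question
print(sorted(expand_variants('1o1', subs)))
print(is_blacklisted('1o1', ['lol'], subs))  # True
```

The number of variants grows exponentially with the number of substitutable characters, so this is only reasonable for short words.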

Matching String in a List of Strings

I basically want to create a new list 'T' containing every element of the list 'Z' in which an element of the list 'Word' appears as a separate word.
i.e. I want the output of 'T' in the following case to be T = ['Hi x']
Word = ['x']
Z = ['Hi xo xo','Hi x','yoyo','yox']
I tried the following code, but it gives me all sentences containing words with 'x' in them, whereas I only want the sentences that have 'x' as a separate word.
for i in Z:
    for v in i:
        if v in Word:
            print (i)
Just another pythonic way
[phrase for phrase in Z for w in Word if w in phrase.split()]
['Hi x']
You can do it with list comprehension.
>>> [i for i in Z if any(w.lower() == j.lower() for j in i.split() for w in Word)]
['Hi x']
Edit:
Or you can do:
>>> [i for i in Z for w in Word if w.lower() in map(lambda x:x.lower(),i.split())]
['Hi x']
If you want to print all strings from Z that contain a word from Word:
Word = ['xo']
Z = ['Hi xo xo','Hi x','yoyo','yox']
res = []
for i in Z:
    for v in i.split():
        if v in Word:
            res.append(i)
            break
print(res)
Notice the break. Without it, a string from Z could be added twice if two of its words match, like the xo in the example.
The i.split() expression splits i into words on whitespace.
words = ['x']
phrases = ['Hi xo xo','Hi x','yoyo','yox']
for phrase in phrases:
    for word in words:
        if word in phrase.split():
            print(phrase)
If you store Word as a set instead of a list, you can use set operations for the check. The following splits every string on whitespace, builds a set out of its words, and checks whether Word is a subset of it.
>>> Z = ['Hi xo xo','Hi x','yoyo','yox']
>>> Word = {'x'}
>>> [s for s in Z if Word <= set(s.split())]
['Hi x']
>>> Word = {'Hi', 'x'}
>>> [s for s in Z if Word <= set(s.split())]
['Hi x']
In the above, <= is the same as set.issubset.
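The subset test requires every word to be present, while the loop-based answers accept a phrase as soon as any one word matches. A small sketch of both behaviours side by side; the function names are made up for illustration:

```python
def phrases_with_all(phrases, words):
    """Keep phrases that contain every word as a separate token."""
    words = set(words)
    return [p for p in phrases if words <= set(p.split())]


def phrases_with_any(phrases, words):
    """Keep phrases that contain at least one word as a separate token."""
    words = set(words)
    # isdisjoint is True when the two collections share no element
    return [p for p in phrases if not words.isdisjoint(p.split())]


Z = ['Hi xo xo', 'Hi x', 'yoyo', 'yox']
print(phrases_with_all(Z, ['Hi', 'x']))  # ['Hi x']
print(phrases_with_any(Z, ['x', 'xo']))  # ['Hi xo xo', 'Hi x']
```

Which one is wanted depends on whether Word is meant as "all of these" or "any of these"; with a single word the two are equivalent.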

Python: Creating a new list that is dependent on the content of another list

I am passing a function a list populated with strings. I want this function to take each of those strings and iterate through them, executing two different actions depending on the letters found in each string, then displaying them as the separate and now changed strings in a new list.
Specifically, when the program iterates through each string and finds a consonant, it should write that consonant in the order that it was found, into the new list. If the program finds a vowel in the current string, it should append 'xy' before the vowel, then the vowel itself.
As an example:
If the user input: "how now brown cow", the output of the function should be: "hxyow nxyow brxyown cxyow". I've tried nested for loops, nested while loops, and variations between. What's the best way to accomplish this? Cheers!
For every character in the old string, check whether it is a vowel or a consonant and build the new string accordingly.
old = "how now brown cow"
new = ""
for character in old:
    if character in ('a', 'e', 'i', 'o', 'u'):
        new = new + "xy" + character
    else:
        new = new + character
print(new)
I gave you the idea and now I leave it as an exercise to make it work for a list of strings. Also make appropriate changes if you are using Python 2.
>>> def xy(st):
...     my_list, st1 = [], ''
...     for x in st:
...         if x in 'aeiou':
...             st1 += 'xy' + x
...         else:
...             if x in 'cbdgfhkjmlnqpsrtwvyxz':
...                 my_list.append(x)
...             st1 += x  # spaces and other characters are copied unchanged
...     return my_list, st1
...
>>> my_string="how now brown cow"
>>> xy(my_string)
(['h', 'w', 'n', 'w', 'b', 'r', 'w', 'n', 'c', 'w'], 'hxyow nxyow brxyown cxyow')
In the function above, the for loop iterates through the string. When it finds a vowel, it appends 'xy' plus the vowel to the new string; otherwise it copies the character unchanged and, if it is a consonant, also appends it to the list. At the end it returns the list and the string.
A simple way to do this using a list comprehension:
old_str = "how now brown cow"
new_str = ''.join(["xy" + c if c in "aeiou" else c for c in old_str])
print(new_str)
But if you're processing a lot of data it'd be more efficient to use a set of vowels, e.g.
vowels = set("aeiou")
old_str = "how now brown cow"
new_str = ''.join(["xy" + c if c in vowels else c for c in old_str])
print(new_str)
Note that these programs simply copy all characters that aren't vowels, i.e., spaces, numbers and punctuation get treated as if they were consonants.
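Since the question passes a list of strings to a function, the per-string transformation can be wrapped in a helper and applied to each element. A sketch under that reading; add_xy and add_xy_all are illustrative names, and only lowercase vowels are handled, as in the answers above:

```python
VOWELS = frozenset("aeiou")


def add_xy(text):
    """Insert 'xy' before every lowercase vowel; copy everything else."""
    return ''.join('xy' + c if c in VOWELS else c for c in text)


def add_xy_all(strings):
    """Apply add_xy to each string and return the results as a new list."""
    return [add_xy(s) for s in strings]


print(add_xy("how now brown cow"))   # hxyow nxyow brxyown cxyow
print(add_xy_all(["how", "now"]))    # ['hxyow', 'nxyow']
```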

Counting Instances of Consecutive Duplicate Letters in a Python String

I'm trying to figure out how I can count the number of letters in a string that occur 3 times. The string is from raw_input().
For example, if my input is:
abceeedtyooo
The output should be: 2
This is my current code:
print 'Enter String:'
x = str(raw_input (""))
print x.count(x[0]*3)
To count the number of consecutive duplicate letters that occur exactly 3 times:
>>> from itertools import groupby
>>> sum(len(list(dups)) == 3 for _, dups in groupby("abceeedtyooo"))
2
To count the chars in the string, you can use collections.Counter:
>>> from collections import Counter
>>> counter = Counter("abceeedtyooo")
>>> print(counter)
Counter({'e': 3, 'o': 3, 'a': 1, 'd': 1, 'y': 1, 'c': 1, 'b': 1, 't': 1})
Then you can filter the result as follows:
>>> result = [char for char in counter if counter[char] == 3]
>>> print(result)
['e', 'o']
If you want to match consecutive characters only, you can use regex (cf. re):
>>> import re
>>> result = re.findall(r"(.)\1\1", "abceeedtyooo")
>>> print(result)
['e', 'o']
>>> result = re.findall(r"(.)\1\1", "abcaaa")
>>> print(result)
['a']
This will also match if the same character appears three consecutive times in several places (e.g. on "aaabcaaa", it will match 'a' twice). Matches are non-overlapping, so on "aaaa" it will only match once, but on "aaaaaa" it will match twice. If you don't want multiple matches within one run of the same character, change the regex to r"(.)\1\1(?!\1)". To avoid matching any characters that appear more than 3 consecutive times, use (.)(?<!(?=\1)..)\1{2}(?!\1). This works around a limitation of Python's re module, which cannot handle (?<!\1).
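To see the difference between "at least three in a row" and "exactly three in a row", the basic regex can be compared with a groupby-based count. This is only a sketch: count_runs is a hypothetical helper, and it uses itertools.groupby rather than the lookbehind regex.

```python
import re
from itertools import groupby


def count_runs(text, length=3):
    """Count maximal runs of one repeated character that are exactly
    `length` characters long."""
    return sum(1 for _, run in groupby(text) if len(list(run)) == length)


print(count_runs("abceeedtyooo"))               # 2
print(count_runs("aaaa"))                       # 0, the run has length 4
print(len(re.findall(r"(.)\1\1", "aaaa")))      # 1, non-overlapping match
print(len(re.findall(r"(.)\1\1", "aaaaaa")))    # 2
```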
We can count the characters in the string with a for loop:
s="abbbaaaaaaccdaaab"
for i in set(s):
    print(i+str(s.count(i)),end='')
Output (the order may vary, since sets are unordered): a10c2b4d1

Count the number of occurrences of a character in a string

How do I count the number of occurrences of a character in a string?
e.g. 'a' appears in 'Mary had a little lamb' 4 times.
str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
>>> sentence = 'Mary had a little lamb'
>>> sentence.count('a')
4
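The optional start and end arguments and the "non-overlapping" wording are easy to overlook, so here is a short illustration; the variable names are arbitrary:

```python
sentence = 'Mary had a little lamb'

# Count only within a slice: 'Mary had' is sentence[0:8]
first_part = sentence.count('a', 0, 8)
rest = sentence.count('a', 8)
print(first_part, rest)  # 2 2

# Occurrences are non-overlapping: 'aaaa' contains 'aa' twice, not three times
pairs = 'aaaa'.count('aa')
print(pairs)  # 2
```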
You can use .count() :
>>> 'Mary had a little lamb'.count('a')
4
To get the counts of all letters, use collections.Counter:
>>> from collections import Counter
>>> counter = Counter("Mary had a little lamb")
>>> counter['a']
4
Regular expressions maybe?
import re
my_string = "Mary had a little lamb"
len(re.findall("a", my_string))
Python-3.x:
"aabc".count("a")
str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
myString.count('a');
str.count(a) is the best solution to count a single character in a string. But if you need to count more characters you would have to read the whole string as many times as characters you want to count.
A better approach for this job would be:
from collections import defaultdict
text = 'Mary had a little lamb'
chars = defaultdict(int)
for char in text:
    chars[char] += 1
So you'll have a dict that returns the number of occurrences of every letter in the string and 0 if it isn't present.
>>> chars['a']
4
>>> chars['x']
0
For a case insensitive counter you could override the mutator and accessor methods by subclassing defaultdict (base class' ones are read-only):
class CICounter(defaultdict):
    def __getitem__(self, k):
        return super().__getitem__(k.lower())
    def __setitem__(self, k, v):
        super().__setitem__(k.lower(), v)
chars = CICounter(int)
for char in text:
    chars[char] += 1
>>> chars['a']
4
>>> chars['M']
2
>>> chars['x']
0
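If subclassing feels heavy, a simpler alternative (not what the answer above does) is to lowercase the text once and feed it to collections.Counter, which also returns 0 for missing keys. Note that lookups then have to use lowercase keys:

```python
from collections import Counter

text = 'Mary had a little lamb'

# Normalise case once, then count every character in a single pass
ci_counts = Counter(text.lower())

print(ci_counts['a'])  # 4
print(ci_counts['m'])  # 2  ('M' and 'm')
print(ci_counts['x'])  # 0  (missing keys count as zero)
```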
This easy and straightforward function might help:
def check_freq(x):
    freq = {}
    for c in set(x):
        freq[c] = x.count(c)
    return freq
check_freq("abbabcbdbabdbdbabababcbcbab")
{'a': 7, 'b': 14, 'c': 3, 'd': 3}
If a comprehension is desired:
def check_freq(x):
    return {c: x.count(c) for c in set(x)}
Regular expressions are very useful if you want case-insensitivity (and of course all the power of regex).
my_string = "Mary had a little lamb"
# simplest solution, using count, is case-sensitive
my_string.count("m") # yields 1
import re
# case-sensitive with regex
len(re.findall("m", my_string))
# three ways to get case insensitivity - all yield 2
len(re.findall("(?i)m", my_string))
len(re.findall("m|M", my_string))
len(re.findall(re.compile("m",re.IGNORECASE), my_string))
Be aware that the regex version takes on the order of ten times as long to run, which will likely be an issue only if my_string is tremendously long, or the code is inside a deep loop.
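The actual ratio depends on the Python version, the machine and the input, so it is worth measuring rather than trusting a fixed number. A rough sketch with timeit; the string length and repeat count are arbitrary choices:

```python
import re
import timeit

# A longer input so the measurement is not dominated by call overhead
my_string = "Mary had a little lamb" * 1000


def count_with_str():
    return my_string.count("m")


def count_with_regex():
    return len(re.findall("m", my_string))


str_time = timeit.timeit(count_with_str, number=200)
regex_time = timeit.timeit(count_with_regex, number=200)
print(f"str.count: {str_time:.4f}s  re.findall: {regex_time:.4f}s")
```

Both functions return the same count; only the elapsed times differ.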
I don't know about 'simplest' but simple comprehension could do:
>>> my_string = "Mary had a little lamb"
>>> sum(char == 'a' for char in my_string)
4
This takes advantage of the built-in sum, a generator expression, and the fact that bool is a subclass of int: it counts how many times a character is equal to 'a'.
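The same trick works for any condition, not just equality with one character. A small sketch; count_matching is a made-up helper name and the predicate is expected to return a bool:

```python
def count_matching(text, predicate):
    """Count the characters of text for which predicate returns True."""
    # True counts as 1 and False as 0 because bool is a subclass of int
    return sum(predicate(ch) for ch in text)


my_string = "Mary had a little lamb"
print(issubclass(bool, int))                                # True
print(count_matching(my_string, lambda ch: ch == 'a'))      # 4
print(count_matching(my_string, lambda ch: ch in "aeiou"))  # 6
```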
a = 'have a nice day'
symbol = 'abcdefghijklmnopqrstuvwxyz'
for key in symbol:
    print(key, a.count(key))
An alternative way to get all the character counts without using Counter(), count and regex
counts_dict = {}
for c in list(sentence):
    if c not in counts_dict:
        counts_dict[c] = 0
    counts_dict[c] += 1
for key, value in counts_dict.items():
    print(key, value)
I am a fan of the pandas library, in particular the value_counts() method. You could use it to count the occurrence of each character in your string:
>>> import pandas as pd
>>> phrase = "I love the pandas library and its `value_counts()` method"
>>> pd.Series(list(phrase)).value_counts()
8
a 5
e 4
t 4
o 3
n 3
s 3
d 3
l 3
u 2
i 2
r 2
v 2
` 2
h 2
p 1
b 1
I 1
m 1
( 1
y 1
_ 1
) 1
c 1
dtype: int64
count is definitely the most concise and efficient way of counting the occurrences of a character in a string, but I tried to come up with a solution using lambda, something like this:
sentence = 'Mary had a little lamb'
sum(map(lambda x : 1 if 'a' in x else 0, sentence))
This will result in :
4
There is one more advantage: if the sentence is a list of sub-strings containing the same characters as above, this still gives the correct result because of the use of in, as long as no sub-string contains the character more than once. Have a look:
sentence = ['M', 'ar', 'y', 'had', 'a', 'little', 'l', 'am', 'b']
sum(map(lambda x : 1 if 'a' in x else 0, sentence))
This also results in :
4
Of course, this only works when checking for the occurrence of a single character, such as 'a' in this particular case.
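One caveat with the list version: the in test only tells whether a sub-string contains the character, so a sub-string with several occurrences is still counted once. A short comparison, using a made-up list for illustration:

```python
parts = ['ban', 'ana']  # three 'a' characters in total

# Counts sub-strings that contain 'a' (each at most once)
containing = sum(map(lambda x: 1 if 'a' in x else 0, parts))

# Counts every occurrence of 'a' across all sub-strings
occurrences = sum(part.count('a') for part in parts)

print(containing)   # 2
print(occurrences)  # 3
```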
a = "I walked today,"
c=['d','e','f']
count=0
for i in a:
    if str(i) in c:
        count+=1
print(count)
I know the question asks to count a particular letter. Here is generic code that does not use any counting method.
sentence1 =" Mary had a little lamb"
count = {}
for i in sentence1:
    if i.lower() in count:  # compare the lowercased key so 'M' and 'm' share one entry
        count[i.lower()] = count[i.lower()] + 1
    else:
        count[i.lower()] = 1
print(count)
output
{' ': 5, 'm': 2, 'a': 4, 'r': 1, 'y': 1, 'h': 1, 'd': 1, 'l': 3, 'i': 1, 't': 2, 'e': 1, 'b': 1}
Now if you want any particular letter frequency, you can print like below.
print(count['m'])
2
The easiest way is to code it in one line:
'Mary had a little lamb'.count("a")
But if you want, you can use this too:
sentence ='Mary had a little lamb'
count=0
for letter in sentence :
    if letter=="a":
        count+=1
print (count)
To find the occurrences of each character in a sentence you may use the code below.
First, I take the unique characters from the sentence, and then I count the occurrences of each character in the sentence, including blank spaces.
ab = set("Mary had a little lamb")
test_str = "Mary had a little lamb"
for i in ab:
    counter = test_str.count(i)
    if i == ' ':
        i = 'Space'
    print(counter, ':', i, ',')
Output of the above code is below.
1 : r ,
1 : h ,
1 : e ,
1 : M ,
4 : a ,
1 : b ,
1 : d ,
2 : t ,
3 : l ,
1 : i ,
4 : Space ,
1 : y ,
1 : m ,
A method that finds the character you want in a string without using count:
import re
def main():
    # Python 2; use input() instead of raw_input() on Python 3
    s = raw_input ("Enter a string, for example 'welcome': ")
    ch = raw_input ("Enter the character to count (works best with a single character): " )
    print ( len (re.findall ( ch, s ) ) )
main()
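One thing to watch with re.findall on user input: the input is interpreted as a pattern, so characters such as "." or "*" will not be matched literally. A possible safeguard is re.escape; count_literal below is a hypothetical helper, not part of the answer above:

```python
import re


def count_literal(s, ch):
    """Count occurrences of ch in s, treating ch as plain text."""
    # re.escape neutralises regex metacharacters such as '.' or '*'
    return len(re.findall(re.escape(ch), s))


print(len(re.findall(".", "a.b.c")))  # 5, '.' matches any character
print(count_literal("a.b.c", "."))    # 2, only the literal dots
```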
Python 3
There are two ways to achieve this:
1) With built-in function count()
sentence = 'Mary had a little lamb'
print(sentence.count('a'))
2) Without using a function
sentence = 'Mary had a little lamb'
count = 0
for i in sentence:
    if i == "a":
        count = count + 1
print(count)
Use count:
sentence = 'A man walked up to a door'
print(sentence.count('a'))
# 3 (count is case-sensitive, so the leading 'A' is not counted)
Taking up a comment of this user:
import numpy as np
sample = 'samplestring'
np.unique(list(sample), return_counts=True)
Out:
(array(['a', 'e', 'g', 'i', 'l', 'm', 'n', 'p', 'r', 's', 't'], dtype='<U1'),
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1]))
Check 's'. If you assign the result to a variable a, you can filter this tuple of two arrays as follows:
a[1][a[0]=='s']
Side-note: It works like Counter() of the collections package, just in numpy, which you often import anyway. You could as well count the unique words in a list of words instead.
This is an extension of the accepted answer, should you look for the count of all the characters in the text.
# Objective: count only non-whitespace characters
text = "count a character occurrence"
unique_letters = set(text)
result = dict((x, text.count(x)) for x in unique_letters if x.strip())
print(result)
# {'a': 3, 'c': 6, 'e': 3, 'u': 2, 'n': 2, 't': 2, 'r': 4, 'h': 1, 'o': 2}
No more than this IMHO - you can add the upper or lower methods
def count_letter_in_str(string,letter):
    return string.count(letter)
You can use a loop and a dictionary.
def count_letter(text):
    result = {}
    for letter in text:
        if letter not in result:
            result[letter] = 0
        result[letter] += 1
    return result
spam = 'have a nice day'
var = 'd'
def count(spam, var):
    found = 0
    for key in spam:
        if key == var:
            found += 1
    return found
count(spam, var)
print('count %s is: %s ' % (var, count(spam, var)))
