I'm trying to create a program without importing anything. The program lets the user input a passage, then prints how many A's there are in the message, how many B's, etc.
So it works...it's just VERY long. I'm new to coding, and I know that there is a way to simplify the code below with def but I'm not really sure how. Can anyone help?
You don't need to define any functions, but you can definitely cut it short:
A string can be used as an array of characters.
You can use the index method to find the position of a letter in the alphabet.
You can iterate over the zipped pairs from the alphabet and the counter list to produce the output.
Use if letter in alphabet as a guard to ensure the letter is valid for the alphabet, instead of hard-coding a check for each letter. That way you can even expand your alphabet. (Note that the length of the counter list is set to the length of the alphabet.)
Here is a suggestion:
message = input('what is your message? ').upper()
alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
counter = [0] * len(alphabet)
for letter in message:
    if letter in alphabet:
        counter[alphabet.index(letter)] += 1
for letter, count in zip(alphabet, counter):
    print(letter, ':', count)
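Since the question asks about def, here is a minimal sketch of the same logic wrapped in a function. The name count_letters and its alphabet parameter are made up for illustration; nothing is imported.

```python
def count_letters(message, alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """Return a list of (letter, count) pairs, one per letter of the alphabet."""
    message = message.upper()
    counter = [0] * len(alphabet)
    for letter in message:
        if letter in alphabet:  # skip spaces, digits, punctuation
            counter[alphabet.index(letter)] += 1
    return list(zip(alphabet, counter))


for letter, count in count_letters('Hello, World'):
    if count:  # only show letters that actually occur
        print(letter, ':', count)
```

Returning the pairs instead of printing inside the function keeps the counting reusable; the caller decides how to display the result.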
It can also be done in a single statement, which makes use of:
the count method of a string, which returns the number of times a substring occurs in it
the chr function, which gives the character for an integer code: chr(65) gives 'A', chr(66) gives 'B', ...
the join method, which concatenates the strings of a list
The result looks like
message = input('what is your message? ').upper()
print('\n'.join([chr(65+i)+':'+str(message.count(chr(65+i))) for i in range(26)]))
For a very short and elegant solution, use the Counter class from the collections module (shown here in Python 2 syntax):
from collections import Counter
message=raw_input("what is your message?")
message=message.upper()
c = Counter(message)
print c.most_common()
This counts every kind of letter in the message. And it can even sort the result for you quickly. Here is a sample dialog:
"what is your message?Hi there, new Pythonist!
[(' ', 3), ('E', 3), ('H', 3), ('T', 3), ('I', 2), ('N', 2), ('!', 1), (',', 1), ('O', 1), ('P', 1), ('S', 1), ('R', 1), ('W', 1), ('Y', 1)]"
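For Python 3, and if only letters should be counted, the same idea might look like the sketch below. The helper name letter_counts is made up for illustration, and the input is hard-coded instead of read from the user.

```python
from collections import Counter


def letter_counts(message):
    """Count only alphabetic characters, ignoring case."""
    return Counter(ch for ch in message.upper() if ch.isalpha())


counts = letter_counts("Hi there, new Pythonist!")
# Print in alphabetical order rather than by frequency
for letter, count in sorted(counts.items()):
    print(letter, ':', count)
```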
I want to print the capital letters and their index numbers in word="WElCMMerC". For example: [(0,W),(1,E),(3,C),(4,M),(5,M)...]
def cap(word):
    w=list(enumerate(i) for i in word if i!=i.lower())
    print (w)
print(cap("WElCMMerC"))
You can loop over the result of enumerate and keep only the pairs whose letter is uppercase (using isupper to check for that). Return the list w instead of printing inside the function:
def cap(word):
    w = [i for i in enumerate(word) if i[1].isupper()]
    return w
print(cap("WElCMMerC"))
Output:
[(0, 'W'), (1, 'E'), (3, 'C'), (4, 'M'), (5, 'M'), (8, 'C')]
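You made a list of enumerate objects, one for each letter. Read the documentation: enumerate produces its (index, item) pairs lazily, much like range. Instead of wrapping each letter in its own enumerate, enumerate the word once and use the pairs it yields.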
return [(idx, letter)
        for idx, letter in enumerate(word)
        if letter.isupper()]
In English:
Return the pair of index and letter
for each index, letter pair in the word
but only when the letter is upper-case.
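Put together as a complete function, that return statement could be used as in this sketch (the function name cap is taken from the question):

```python
def cap(word):
    """Return (index, letter) pairs for every upper-case letter in word."""
    return [(idx, letter)
            for idx, letter in enumerate(word)
            if letter.isupper()]


print(cap("WElCMMerC"))
# [(0, 'W'), (1, 'E'), (3, 'C'), (4, 'M'), (5, 'M'), (8, 'C')]
```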
I'm very new to this so have mercy on my poor, stupid soul.
If you wanted to generate every two-character ".com" name, where each character can be either a letter or a digit (i.e. "ab.com" and "a7.com"), how would you do it?
I've been looking at other code and I'm literally getting Japanese characters returned.
while letter1 <= 'z': # Outer loop
    letter2 = 'a'
    while letter2 <= 'z': # Inner loop
        print('%s%s.com' % (letter1, letter2))
        letter2 = chr(ord(letter2) + 1)
    letter1 = chr(ord(letter1) + 1)
letter2 should be returning either a letter a-z or a number but it only gives me back whatever letter1 is ('aa, bb, cc, etc...')
If I understand what you're looking for, you can use itertools.product to generate this:
First, generate your alphabet. For this, I'm going to use alphabet = string.ascii_lowercase + string.digits (in other words, a-z and 0-9).
If I say: list(itertools.product(alphabet, repeat=2)), we start getting what we're looking for:
[('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'd'), ('a', 'e'), ('a', 'f'), ('a', 'g'), ...]
So for example, your entire code could look like:
import itertools
import string
def domain_generator(alphabet, length, suffix):
    # Join each tuple of characters so that any length works, not just 2
    for chars in itertools.product(alphabet, repeat=length):
        yield '{}.{}'.format(''.join(chars), suffix)
where you can now iterate over the domain generator with:
for domain in domain_generator(string.ascii_lowercase + string.digits, 2, 'com'):
    print(domain)
Sounds like what you want is basically a list of all domains matching the regex [a-z0-9][a-z0-9]\.com.
Fiddling with the ASCII character value could be one way to implement this, but I think it would be more pythonic to try something like this instead:
import string
for letter1 in string.ascii_lowercase+string.digits:
    for letter2 in string.ascii_lowercase+string.digits:
        print('%s%s.com' % (letter1, letter2))
The first function is able to separate each letter of a string and list how many times that letter appears. For example:
print(rlencode("Hello!"))
[('H', 1), ('e', 1), ('l', 2), ('o', 1), ('!', 1)]
How do I get rldecode(rle) to do the complete opposite of rlencode(s), so that rldecode(rlencode(x)) == x returns True?
def rlencode(s):
    """
    signature: str -> list(tuple(str, int))
    """
    string=[]
    count=1
    for i in range(1,len(s)):
        if s[i] == s[i-1]:
            count += 1
        else:
            string.append((s[i-1], count))
            count=1
        if i == len(s)-1:
            string.append((s[i], count))
    return string
def rldecode(rle):
    """
    #signature: list(tuple(str, int)) -> str
    #"""
    string=" "
    count=1
    for i in rle:
        if i == rle:
            string += i
    return string
You can use the fact that multiplying a string by a number repeats it, and use ''.join() to bring the elements of the list together.
To show the effect of string multiplication, I multiplied "a" by 5
"a"*5 #'aaaaa'
Using that in a comprehension will give you
rle = [('H', 1), ('e', 1), ('l', 2), ('o', 1), ('!', 1)]
parts = [char[0]*char[1] for char in rle] #['H', 'e', 'll', 'o', '!']
Then add in the ''.join() and you have your answer.
decoded = ''.join(char[0]*char[1] for char in rle) #'Hello!'
So your function would be
def rldecode(rle):
    """
    signature: list(tuple(str, int)) -> str
    """
    return ''.join(char[0]*char[1] for char in rle)
Also, if you would like to make your rlencode a little cleaner, you can simplify it a little bit by using enumerate to help you keep your position in the string and check if you're about to hit either a new character or the end of the string. You just have to increment the counter on each loop.
def rlencode(s):
    output = []
    count = 0
    for i, char in enumerate(s):
        count += 1
        if (i == (len(s)-1)) or (char != s[i+1]):
            output.append((char, count))
            count = 0
    return output
Use join:
b = [('H', 1), ('e', 1), ('l', 2), ('o', 1), ('!', 1)]
''.join([c[0] * c[1] for c in b])
Hello!
You can also use list comprehensions for your initial function.
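One way to do that, as a sketch: itertools.groupby (a standard library helper that is not mentioned in this answer) groups consecutive equal characters, so the encoder fits in a single list comprehension. The decoder is the join-based one, included so the round trip can be checked.

```python
from itertools import groupby


def rlencode(s):
    """str -> list of (char, run_length) tuples, one per run of equal chars."""
    return [(char, len(list(run))) for char, run in groupby(s)]


def rldecode(rle):
    """Inverse of rlencode: repeat each char by its run length and join."""
    return ''.join(char * count for char, count in rle)


print(rlencode("Hello!"))
# [('H', 1), ('e', 1), ('l', 2), ('o', 1), ('!', 1)]
print(rldecode(rlencode("Hello!")) == "Hello!")
# True
```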
You can use collections.Counter.elements():
from collections import Counter
l = [('H', 1), ('e', 1), ('l', 2), ('o', 1), ('!', 1)]
print(''.join(Counter(dict(l)).elements()))
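This outputs the following (note that it only round-trips when each character occurs in a single run, since dict(l) keeps just one count per character):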
Hello!
A simple, readable solution is to iterate over all of the tuples in the list returned by rlencode and build up a new string from each letter and its count, like so:
def rldecode(rle):
    string = ''
    for letter, n in rle:
        string += letter*n
    return string
An answer that's easy to read but also accounts for ordering in the problem:
def rlencode(s):
    """
    signature: str -> list(tuple(str, int, list(int)))
    """
    result=[]
    frequency=1
    for i in range(len(s)):
        letters = [item[0] for item in result]
        if s[i] in letters:
            idx = letters.index(s[i])
            frequency=result[idx][1]
            frequency+=1
            positions= result[idx][2]
            positions.append(i)
            # store the updated frequency and positions for this letter
            result[idx] = (s[i],frequency,positions)
        else:
            result.append((s[i],1,[i]))
    return result
def rldecode(rle):
    """
    signature: list(tuple(str, int, list(int))) -> str
    """
    frequencies = [i[1] for i in rle]
    total_length = sum(frequencies)
    char_list=[None]*total_length
    for c in rle:
        for pos in c[2]:
            char_list[pos] = c[0]
    return "".join(char_list)
text = "This is a lot of text where ordering matters"
encoded = rlencode(text)
print(encoded)
decoded = rldecode(encoded)
print(decoded)
I adapted it from the answer posted by @Brian Cohan.
It should be noted that this answer is computationally expensive because of .index() if the letters list grows really long, as explained in this SO post.
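A possible way around the repeated list scans, sketched under the assumption of Python 3.7 or later (where dicts keep insertion order): collect the positions in a dict keyed by character, so each lookup is a hash lookup rather than an .index() search. The function names mirror the ones above; this is a variation, not the original answer's code.

```python
def rlencode(s):
    """str -> list of (char, frequency, positions), in first-seen order."""
    positions = {}  # char -> list of indices; insertion-ordered in 3.7+
    for i, char in enumerate(s):
        positions.setdefault(char, []).append(i)
    return [(char, len(idxs), idxs) for char, idxs in positions.items()]


def rldecode(rle):
    """Rebuild the string by writing each char back at its positions."""
    total_length = sum(freq for _, freq, _ in rle)
    char_list = [None] * total_length
    for char, _, idxs in rle:
        for pos in idxs:
            char_list[pos] = char
    return ''.join(char_list)


text = "This is a lot of text where ordering matters"
print(rldecode(rlencode(text)) == text)
```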
I want to find the most occurring substring in a CSV row either by itself, or by using a list of keywords for lookup.
I've found a way to find out the top 5 most occurring words in each row of a CSV file using Python using the below responses, but, that doesn't solve my purpose. It gives me results like -
[(' Trojan.PowerShell.LNK.Gen.2', 3),
(' Suspicious ZIP!lnk', 2),
(' HEUR:Trojan-Downloader.WinLNK.Powedon.a', 2),
(' TROJ_FR.8D496570', 2),
('Trojan.PowerShell.LNK.Gen.2', 1),
(' Trojan.PowerShell.LNK.Gen.2 (B)', 1),
(' Win32.Trojan-downloader.Powedon.Lrsa', 1),
(' PowerShell.DownLoader.466', 1),
(' malware (ai score=86)', 1),
(' Probably LNKScript', 1),
(' virus.lnk.powershell.a', 1),
(' Troj/LnkPS-A', 1),
(' Trojan.LNK', 1)]
Whereas, I would want something like 'Trojan', 'Downloader', 'Powershell' ... as the top results.
The matching words can be a substring of a value (cell) in the CSV or can be a combination of two or more words. Can someone help fix this either by using a keywords list or without.
Thanks!
Let my_values = ['A', 'B', 'C', 'A', 'Z', 'Z' ,'X' , 'A' ,'X','H','D' ,'A','S', 'A', 'Z'] be the list of words you want to count.
Now take a dictionary that will store the number of occurrences of every word.
count_dict={}
Populate the dictionary with appropriate values :
for i in my_values:
    if count_dict.get(i) is None: #If the value is not present in the dictionary then this is the first occurrence of the value
        count_dict[i]=1
    else:
        count_dict[i] = count_dict[i]+1 #If previously found then increment its value
Now sort the values of dict according to their occurrences :
import operator
sorted_items= sorted(count_dict.items(),key=operator.itemgetter(1),reverse=True)
Now you have your expected results!
The most occurring 3 values are:
print(sorted_items[:3])
output :
[('A', 5), ('Z', 3), ('X', 2)]
The most occurring 2 values are :
print(sorted_items[:2])
output:
[('A', 5), ('Z', 3)]
and so on.
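To get at substrings such as 'Trojan' rather than whole cell values, one option is to split each cell into tokens before counting. The sketch below is only that: the helper names top_tokens and keyword_hits, the letters-only tokenization, and the min_len cut-off are all assumptions made for illustration, and the sample row reuses a few values from the question.

```python
import re
from collections import Counter


def top_tokens(cells, n=3, min_len=3):
    """Count alphabetic tokens across the cells of one row, ignoring case."""
    counts = Counter()
    for cell in cells:
        # Split on anything that is not a letter: '.', '-', ':', '/', digits...
        for token in re.findall(r'[A-Za-z]+', cell):
            if len(token) >= min_len:
                counts[token.lower()] += 1
    return counts.most_common(n)


def keyword_hits(cells, keywords):
    """Count how many cells contain each keyword as a substring."""
    return Counter({kw: sum(kw.lower() in cell.lower() for cell in cells)
                    for kw in keywords})


row = [' Trojan.PowerShell.LNK.Gen.2', ' Suspicious ZIP!lnk',
       ' HEUR:Trojan-Downloader.WinLNK.Powedon.a', ' Troj/LnkPS-A']
print(top_tokens(row))
print(keyword_hits(row, ['Trojan', 'Downloader', 'PowerShell']).most_common())
```

The first function needs no keyword list; the second does a plain substring lookup, so it also catches keywords embedded in longer names.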
Okay so I need to make a python program that takes an encrypted string and from this works out the English plain text using letter frequency. Now from what I gather I should be taking the string and using string.count to get the frequency although I am stuck from here.
After getting the frequency how can I then say the most frequent letter in the cipher is 'e' so print all of the most frequent letter as 'e', the 2nd most frequent is 't' and so on?
Can anyone give me a few things to look at which could help with the creation of this?
from collections import Counter
code_string = "abcdhjshslsldjhdjh"
letters = Counter(code_string)
print(letters.most_common())
results in
[('h', 4), ('d', 3), ('j', 3), ('s', 3), ('l', 2), ('a', 1), ('c', 1), ('b', 1)]
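From there, a first guess at the plain text can be made by mapping the most common cipher letter to 'e', the next to 't', and so on. In the sketch below, the ENGLISH_BY_FREQUENCY ordering is an assumed, approximate ranking (published tables differ slightly), and guess_plaintext is a made-up helper name. On a short message the guess will usually be wrong for many letters, so treat it as a starting point.

```python
import string
from collections import Counter

# Assumed ranking of English letters from most to least frequent;
# published tables differ slightly, so this is only an approximation.
ENGLISH_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def guess_plaintext(cipher_text):
    """Map the n-th most common cipher letter to the n-th most common English letter."""
    text = cipher_text.lower()
    letters = Counter(ch for ch in text if ch in string.ascii_lowercase)
    ranked = ''.join(letter for letter, _ in letters.most_common())
    # Build a translation table: ranked cipher letters -> ranked English letters
    table = str.maketrans(ranked, ENGLISH_BY_FREQUENCY[:len(ranked)])
    return text.translate(table)


code_string = "abcdhjshslsldjhdjh"
print(guess_plaintext(code_string))
```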