This question already has answers here:
Printing Lists as Tabular Data
(20 answers)
Closed 5 years ago.
import re
from collections import Counter
words = re.findall(r'\w+', open('test01_cc_sharealike.txt').read().lower())
count = Counter(words).most_common(10)
print(count)
How can I change the code so that the output is formatted like this:
Word number
word number
instead of being printed as a list?
I want each line to show the word first, then 4 spaces, then the number of times that word appears in the text, and so on.
Just use a for loop. Instead of print(count), you could use:
for p in count:
    print(p[0] + "    " + str(p[1]))
However, for formatting purposes you would probably prefer to align the numbers in a column, so pad each word to a common width:
# Width of the longest word plus one space of separation.
indent = 1 + max(len(p[0]) for p in count)
for p in count:
    print(p[0].ljust(indent) + str(p[1]))
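The column alignment can also be sketched with f-string format specifiers. This is only an illustration: the sample list `count` below is made up, standing in for what `Counter.most_common()` returns, and the 4-space gap follows the format asked for in the question.

```python
# Hypothetical sample data in the shape returned by Counter.most_common().
count = [("the", 12), ("and", 7), ("python", 3)]

# Width of the longest word, so every number starts in the same column.
width = max(len(word) for word, _ in count)

# "<" left-aligns the word inside a field of `width` characters,
# then 4 spaces separate it from the number.
lines = [f"{word:<{width}}    {n}" for word, n in count]
print("\n".join(lines))
```

Left-aligning the words keeps them flush with the margin; use `>` instead of `<` in the format specifier if right-aligned words are preferred.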
This question already has answers here:
How to split strings into text and number?
(11 answers)
Closed 2 years ago.
I have strings like 'S10', 'S11', etc.
How can I split them into ['S','10'], ['S','11']?
example:
import re
str = 'S10'
re.compile(...)
result = re.split(str)
result:
print(result)
// ['S','10']
Resolved at: How to split strings into text and number?
This should do the trick.
It uses capture groups (the parentheses) to match the alphabetical part into the first group and the digits into the second group.
Code:
import re
str_data = 'S10'
exp = r"([A-Za-z]+)(\d+)"  # group 1: letters, group 2: digits
match = re.match(exp, str_data)
result = match.groups()
print(result)
Output:
('S', '10')
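As a hedged sketch of the same capture-group idea, the helper below (the name `split_text_number` is hypothetical, not from the original answer) returns a list as the question asked, and raises an error when the input is not letters followed by digits, since `re.match` returns None in that case.

```python
import re

def split_text_number(s):
    """Split a string such as 'S10' into its leading letters and trailing digits."""
    # fullmatch requires the whole string to have the <letters><digits> shape.
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", s)
    if match is None:
        raise ValueError(f"not in <letters><digits> form: {s!r}")
    return list(match.groups())

print(split_text_number("S10"))
print(split_text_number("S11"))
```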
This question already has answers here:
Removing duplicate characters from a string
(15 answers)
Closed 3 years ago.
I have a string like 'AABA'. I want to remove repeated characters so that each character appears only once. The result should be 'AB'.
Sample Input: AABA
Sample Output: AB
If the order doesn't matter, use a set.
word = "AABA"
new_word = "".join(set(word))
If the order DOES matter, use an OrderedDict (from the collections module).
from collections import OrderedDict
word = "AABA"
new_word = "".join(OrderedDict.fromkeys(word))
EDIT: Consult the link posted in the comments above - it gives the same advice, but explains it better.
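A minimal sketch of the order-preserving variant, assuming Python 3.7 or later, where a plain dict is guaranteed to keep insertion order and so can stand in for OrderedDict:

```python
word = "AABA"

# dict.fromkeys keeps the first occurrence of each character, in order
# (insertion order is guaranteed for plain dicts from Python 3.7 on).
new_word = "".join(dict.fromkeys(word))
print(new_word)  # AB
```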
This question already has answers here:
Count number of occurrences of a substring in a string
(36 answers)
Closed 4 years ago.
I want to count the number of times \n appears in the string "(Student Copy)\nfor\nspecial school\n...Shaping the Future\n408,)" before the phrase "Shaping the Future". Is there a way to do it without splitting the string?
Output in this case should be 3
You can slice the string up to your substring of interest and then use count:
s = """(Student Copy)\nfor\nspecial school\n...Shaping the Future\n408,)"""
s[:s.index("Shaping the Future")].count('\n')
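Note that `s.index(...)` raises ValueError when the phrase is absent. A hedged sketch that handles that case with `str.partition` follows; the function name `count_before` and the choice to return None for a missing phrase are assumptions made for illustration.

```python
def count_before(text, phrase, char="\n"):
    """Count occurrences of `char` in `text` before the first `phrase`.

    Returns None when `phrase` does not occur (an illustrative choice).
    """
    before, found, _ = text.partition(phrase)
    if not found:
        return None
    return before.count(char)

s = "(Student Copy)\nfor\nspecial school\n...Shaping the Future\n408,)"
print(count_before(s, "Shaping the Future"))  # 3
```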
This question already has answers here:
Finding all possible permutations of a given string in python
(27 answers)
Best way to randomize a list of strings in Python
(6 answers)
Closed 4 years ago.
Let's say, for example, that I have 100 random words (not real words, just strings of letters)
like "ABCD", and I want to write a program that takes such a word and prints all of its permutations in random order.
For example, the word "ABC" will print: "ABC", "BAC", "CAB", "BCA", "CBA".
I could do it manually, but with 100 words I can't.
So how do I write code that does this in Python?
You can do this by using itertools:
import itertools
import random
words = ['word1', 'word2', 'word3']
for word in words:
    permutations_list = [''.join(x) for x in itertools.permutations(word)]
    random.shuffle(permutations_list)
    print(permutations_list)
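Two caveats are worth sketching. `itertools.permutations` treats positions as distinct, so a word with repeated letters yields duplicate strings, and the number of permutations grows factorially with word length. The helper below is an assumption-laden illustration (the name `shuffled_permutations` and the optional `seed` parameter are hypothetical) that removes duplicates before shuffling.

```python
import itertools
import random

def shuffled_permutations(word, seed=None):
    """Return the distinct permutations of `word` in random order."""
    # A set drops duplicates that arise from repeated letters;
    # sorting first makes the result reproducible for a given seed.
    unique = sorted({"".join(p) for p in itertools.permutations(word)})
    random.Random(seed).shuffle(unique)
    return unique

print(shuffled_permutations("ABC"))
print(shuffled_permutations("AAB"))
```

For long words this is impractical: a 10-letter word with distinct letters already has 3,628,800 permutations.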
This question already has answers here:
String count with overlapping occurrences [closed]
(25 answers)
Closed 7 years ago.
So I have a little problem:
I want to count how many times the string "aa" occurs in my longer string "aaatattgg", which looks like a DNA sequence.
Here, for example, I expect 2 (overlap is allowed).
There is the .count method, but it does not count overlapping occurrences.
PS: Excuse my English, I'm French.
Use the re module. Put your pattern inside a positive lookahead, (?=...), so that overlapping matches are found.
>>> import re
>>> s = "aaatattgg"
>>> re.findall(r'(?=(aa))', s)
['aa', 'aa']
>>> len(re.findall(r'(?=(aa))', s))
2
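For comparison, overlapping occurrences can also be counted without regular expressions. The sketch below is not part of the original answer; the function name `count_overlapping` is hypothetical. It restarts the search one character after each hit, so matches are allowed to overlap.

```python
def count_overlapping(text, sub):
    """Count occurrences of `sub` in `text`, allowing overlaps."""
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        # Resume one character later so overlapping matches are seen.
        start = text.find(sub, start + 1)
    return count

print(count_overlapping("aaatattgg", "aa"))  # 2
```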