Printing a String in Reverse After Extracting [duplicate] - python

This question already has answers here:
How do I reverse a string in Python?
(19 answers)
Closed 2 years ago.
I am trying to create a program in which the user inputs a statement containing two '!' surrounding a string. (example: hello all! this is a test! bye.) I am to grab the string within the two exclamation points, and print it in reverse letter by letter. I have been able to find the start and endpoints that contain the statement, however I am having difficulties creating an index that would cycle through my variable userstring in reverse and print.
test = input('Enter a string with two "!" surrounding portion of the string:')
expoint = test.find('!')
#print (expoint)
twoexpoint = test.find('!', expoint+1)
#print (twoexpoint)
userstring = test[expoint+1 : twoexpoint]
#print(userstring)
number = 0
while number < len(userstring) :
    letter = [twoexpoint - 1]
    print (letter)
    number += 1

twoexpoint - 1 is the last index of the string you need relative to the input string. So what you need is to start from that index and reduce. In your while loop:
letter = test[twoexpoint- number - 1]
Each iteration you increase number which will reduce the index and reverse the string.
But this way you don't actually use the userstring you already found (except for the length...). Instead of caring for indexes, just reverse the userstring:
for letter in userstring[::-1]:
    print(letter)
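Putting the pieces of this answer together, a minimal sketch might look like the following. The function name `reversed_between` and its `delim` parameter are made up for illustration, and the sketch assumes that input with fewer than two delimiters should simply yield an empty string.

```python
def reversed_between(text, delim="!"):
    """Return the text between the first two delimiters, reversed."""
    start = text.find(delim)
    end = text.find(delim, start + 1) if start != -1 else -1
    if end == -1:
        # Assumption: fewer than two delimiters means nothing to reverse.
        return ""
    return text[start + 1:end][::-1]


# Print the extracted portion in reverse, one letter per line.
for letter in reversed_between("hello all! this is a test! bye."):
    print(letter)
```

Returning the reversed string instead of printing inside the function keeps the extraction step separate from the output step.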

Explanation: we use a regex to find the pattern, then we loop over every occurrence and reverse it. You can reverse a string in Python with mystring[::-1] (this works for lists too).
The Python re documentation is very useful and you will need it all the time down the coder road :). Happy coding!
Very useful article. Check it out!
import re # I recommend using regex
def reverse_string(a):
    matches = re.findall(r'\!(.*?)\!', a)
    for match in matches:
        print("Match found", match)
        print("Match reversed", match[::-1])
        for i in match[::-1]:
            print(i)
In [3]: reverse_string('test test !test! !123asd!')
Match found test
Match reversed tset
t
s
e
t
Match found 123asd
Match reversed dsa321
d
s
a
3
2
1
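If the goal is to put the reversed text back into the original sentence rather than print it, `re.sub` accepts a function as the replacement. The sketch below is one way to do that; the name `reverse_marked` is hypothetical, and keeping the surrounding exclamation marks in the result is an assumption.

```python
import re


def reverse_marked(text):
    """Replace each !...! span with its contents reversed, keeping the marks."""
    # The lambda receives each match object; group(1) is the text between marks.
    return re.sub(r"!(.*?)!", lambda m: "!" + m.group(1)[::-1] + "!", text)


print(reverse_marked("test test !test! !123asd!"))
# prints: test test !tset! !dsa321!
```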

You're overcomplicating it. Don't bother with an index, simply use reversed() on userstring to cycle through the characters themselves:
userstring = test[expoint+1:twoexpoint]
for letter in reversed(userstring):
    print(letter)
Or use a reversed slice:
userstring = test[twoexpoint-1:expoint:-1]
for letter in userstring:
    print(letter)

Related

Check if any character in one string appears in another [duplicate]

This question already has answers here:
Find common characters between two strings
(5 answers)
Closed 2 months ago.
I have a string of text
hfHrpphHBppfTvmzgMmbLbgf
I have separated this string into two halves
hfHrpphHBppf,TvmzgMmbLbgf
I'd like to check if any of the characters in the first string, also appear in the second string, and would like to class lowercase and uppercase characters as separate (so if string 1 had a and string 2 had A this would not be a match).
and the above would return:
f
split_text = ['hfHrpphHBppf', 'TvmzgMmbLbgf']
for char in split_text[0]:
    if char in split_text[1]:
        print(char)
There is probably a better way to do it, but this is a quick and simple way to do what you want.
Edit:
split_text = ['hfHrpphHBppf', 'TvmzgMmbLbgf']
found_chars = []
for char in split_text[0]:
    if char in split_text[1] and char not in found_chars:
        found_chars.append(char)
        print(char)
There is almost certainly a better way of doing this, but this is a way of doing it with the answer I already gave
You could use the "in" keyword, something like this:
for i in range(len(word1)):
    if word1[i] in word2:
        print(word1[i])
Not optimal, but it should print all the letters in common.
You can achieve this using set() and intersection
text = "hfHrpphHBppf,TvmzgMmbLbgf"
text = text.split(",")
print(set(text[0]).intersection(set(text[1])))
You can use a list comprehension to check whether letters from string a appear in string b.
a='hfHrpphHBppf'
b='TvmzgMmbLbgf'
c=[x for x in a if x in b]
print(' '.join(set(c)))
then output will be:
f
But you can use a for loop, too. Like:
a='hfHrpphHBppf'
b='TvmzgMmbLbgf'
c=[]
for i in a:
    if i in b:
        c.append(i)
print(set(c))
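The set-based answers lose the order in which the characters appear, and the loop-based answers rescan the second string for every character. A small sketch that combines the two ideas is shown below; the function name `common_chars` is illustrative only, and the comparison stays case-sensitive as the question asks.

```python
def common_chars(first, second):
    """Characters of `first` that also occur in `second`, each listed once."""
    lookup = set(second)  # set membership tests are O(1) on average
    seen = set()
    result = []
    for ch in first:
        if ch in lookup and ch not in seen:
            seen.add(ch)
            result.append(ch)
    return result


print(common_chars("hfHrpphHBppf", "TvmzgMmbLbgf"))
# prints: ['f']
```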

Return first word in sentence? [duplicate]

This question already has answers here:
How to extract the first and final words from a string?
(7 answers)
Closed 5 years ago.
Here's the question I have to answer for school:
For the purposes of this question, we will define a word as ending a sentence if that word is immediately followed by a period. For example, in the text “This is a sentence. The last sentence had four words.”, the ending words are ‘sentence’ and ‘words’. In a similar fashion, we will define the starting word of a sentence as any word that is preceded by the end of a sentence. The starting words from the previous example text would be “The”. You do not need to consider the first word of the text as a starting word. Write a program that has:
An endwords function that takes a single string argument. This function must return a list of all sentence ending words that appear in the given string. There should be no duplicate entries in the returned list and the periods should not be included in the ending words.
The code I have so far is:
def startwords(astring):
    mylist = astring.split()
    if mylist.endswith('.') == True:
        return my list
but I don't know if I'm using the right approach. I need some help
Several issues with your code. The following would be a simple approach. Create a list of bigrams and pick the second token of each bigram where the first token ends with a period:
def startwords(astring):
    mylist = astring.split() # a list! Has no 'endswith' method
    bigrams = zip(mylist, mylist[1:])
    return [b[1] for b in bigrams if b[0].endswith('.')]
zip and list comprehensions are two things worth reading up on.
mylist = astring.split()
if mylist.endswith('.')
that cannot work, one of the reasons being that mylist is a list, and doesn't have endswith as a method.
Another answer fixed your approach so let me propose a regular expression solution:
import re
print(re.findall(r"\.\s*(\w+)","This is a sentence. The last sentence had four words."))
match all words following a dot and optional spaces
result: ['The']
def endwords(astring):
    mylist = astring.split('.')
    temp_words = [x.rpartition(" ")[-1] for x in mylist if len(x) > 1]
    return list(set(temp_words))
This creates a set so there are no duplicates. It loops over a list of sentences (split by "."), splits each sentence into words, then uses [-1:] to make a list of the last word only and takes item [0] of that list.
print (set([ x.split()[-1:][0] for x in s.split(".") if len(x.split())>0]))
The if is needed because splitting on "." leaves an empty string after the final period, and indexing into its empty word list would raise an IndexError.
This works as well:
print (set([ x.split() [len(x.split())-1] for x in s.split(".") if len(x.split())>0]))
This is one way to do it ->
#!/usr/bin/env python
sentence = 'This is a sentence. The last sentence had four words.'
uniq_end_words = set()
for word in sentence.split():
    if '.' in word:
        # check if period (.) is at the end
        if '.' == word[len(word) -1]:
            uniq_end_words.add(word.rstrip('.'))
print(list(uniq_end_words))
Output (list of all the end words in a given sentence) ->
['words', 'sentence']
If your input string has a period in one of its word (lets say the last word), something like this ->
'I like the documentation of numpy.random.rand.'
The output would be - ['numpy.random.rand']
And for input string 'I like the documentation of numpy.random.rand a lot.'
The output would be - ['lot']
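For reference, the same idea can be wrapped in the `endwords` function the assignment asks for. This is only a sketch: it assumes words are separated by whitespace, and it keeps the words in order of first appearance, which the assignment does not require.

```python
def endwords(text):
    """Return the words that are immediately followed by a period, no duplicates."""
    found = []
    for token in text.split():
        if token.endswith("."):
            word = token.rstrip(".")
            # Skip bare periods and words that were already collected.
            if word and word not in found:
                found.append(word)
    return found


print(endwords("This is a sentence. The last sentence had four words."))
# prints: ['sentence', 'words']
```

A list is used instead of a set so that the output order is predictable; for long texts a set alongside the list would make the duplicate check faster.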

Finding patterns in HEX data with regex but getting duplicates

I have a regex python script to go over Hex data and find patterns which looks like this
r"(.{6,}?)\1{2,}"
All it does is look for hex strings at least 6 characters long that are immediately followed by at least two more repetitions of themselves. My issue is that it also finds substrings inside larger strings it has already found. For example:
If the data were "a00b00a00b00a00b00a00b00a00b00a00b00" it would find 2 instances of "a00b00a00b00a00b00" and 6 instances of "a00b00". How could I go about keeping only the longest patterns found, and not even looking for shorter patterns, without adding more hardcoded parameters?
#!/usr/bin/python
import fnmatch
pattern_string = "abcdefabcdef"
def print_pattern(pattern, num):
    n = num
    # splits the pattern into chunks of n characters, in this case 6
    new_pat = [pattern[i:i+n] for i in range(0, len(pattern), n)]
    # this is the hit counter for matches
    match = 0
    # stores the new value of the match
    new_match = ""
    # loops through the list to see if it matches more than once
    for new in new_pat:
        new_match = new
        print(new)
        # if it matches the first chunk, keep adding to match
        if fnmatch.fnmatch(new, new_pat[0]):
            match += 1
    if match:
        print("Count: %d\nPattern:%s" % (match, new_match))
    # returns the match
    return new_match
print_pattern(pattern_string, 6)
Regex is better, but this was more fun to write.
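One possible way to stay with a regex and still prefer the longest repeating unit is to make the first quantifier greedy instead of lazy, and to rely on `re.finditer` returning non-overlapping matches. The sketch below is a guess at what "longest" should mean here, not a definitive answer: the function name `longest_repeats` and its parameters are invented for illustration, and heavy backtracking may make it slow on large inputs.

```python
import re


def longest_repeats(data, min_len=6, min_repeats=3):
    """List (unit, repeat_count) pairs, preferring the longest unit at each position."""
    # Greedy "(.{6,})" tries the longest candidate unit first; "\1{2,}" then
    # requires at least two further copies, so three consecutive copies in total.
    pattern = r"(.{%d,})\1{%d,}" % (min_len, min_repeats - 1)
    return [(m.group(1), len(m.group(0)) // len(m.group(1)))
            for m in re.finditer(pattern, data)]


print(longest_repeats("a00b00" * 6))
# prints: [('a00b00a00b00', 3)]
```

Because each match consumes the text it covers, the shorter "a00b00" unit inside the reported span is not returned as a separate hit.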

Can't convert 'list' object to str implicitly Python

I am trying to import the alphabet and split it so that each character is a separate element of an array rather than part of one string. Splitting it works, but when I try to use it to find how many characters are in an inputted word I get the error 'TypeError: Can't convert 'list' object to str implicitly'. Does anyone know how I would go about solving this? Any help appreciated. The code is below.
import string
alphabet = string.ascii_letters
print (alphabet)
splitalphabet = list(alphabet)
print (splitalphabet)
x = 1
j = year3wordlist[x].find(splitalphabet)
k = year3studentwordlist[x].find(splitalphabet)
print (j)
EDIT: Sorry, my explanation is kinda bad, I was in a rush. What I am wanting to do is count each individual letter of a word because I am coding a spelling bee program. For example, if the correct word is 'because', and the user who is taking part in the spelling bee has entered 'becuase', I want the program to count the characters and location of the characters of the correct word AND the user's inputted word and compare them to give the student a mark - possibly by using some kind of point system. The problem I have is that I can't simply say if it is right or wrong, I have to award 1 mark if the word is close to being right, which is what I am trying to do. What I have tried to do in the code above is split the alphabet and then use this to try and find which characters have been used in the inputted word (the one in year3studentwordlist) versus the correct word (year3wordlist).
There is a much simpler solution if you use the in keyword. You don't even need to split the alphabet in order to check if a given character is in it:
import string
year3wordlist = ['asdf123', 'dsfgsdfg435']
total_sum = 0
for word in year3wordlist:
    word_sum = 0
    for char in word:
        if char in string.ascii_letters:
            word_sum += 1
    total_sum += word_sum
# Number of characters that are ASCII letters:
# total_sum == 12
# Length of all characters in all words:
# sum([len(w) for w in year3wordlist]) == 18
EDIT:
Since the OP comments he is trying to create a spelling bee contest, let me try to answer more specifically. The distance between a correctly spelled word and a similar string can be measured in many different ways. One of the most common ways is called 'edit distance' or 'Levenshtein distance'. This represents the number of insertions, deletions or substitutions that would be needed to rewrite the input string into the 'correct' one.
You can find that distance implemented in the Python-Levenshtein package. You can install it via pip:
$ sudo pip install python-Levenshtein
And then use it like this:
from __future__ import division
import Levenshtein
correct = 'because'
student = 'becuase'
distance = Levenshtein.distance(correct, student) # distance == 2
mark = ( 1 - distance / len(correct)) * 10 # mark == 7.14
The last line is just a suggestion on how you could derive a grade from the distance between the student's input and the correct answer.
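If installing a third-party package is not an option, the same distance can be computed with a short dynamic-programming function. The sketch below is a plain-Python illustration of the standard Levenshtein recurrence, not the package's implementation; the function name `levenshtein` is chosen here for illustration, and the marking formula is the one suggested above.

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


correct, student = "because", "becuase"
distance = levenshtein(correct, student)
mark = (1 - distance / len(correct)) * 10
print(distance, round(mark, 2))
# prints: 2 7.14
```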
I think what you need is join:
>>> "".join(splitalphabet)
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
join is a method of str, so you can do
''.join(splitalphabet)
or
str.join('', splitalphabet)
To convert the list splitalphabet to a string, so you can use it with the find() function you can use separator.join(iterable):
"".join(splitalphabet)
Using it in your code:
j = year3wordlist[x].find("".join(splitalphabet))
I don't know why half the answers are telling you how to put the split alphabet back together...
To count the number of characters in a word that appear in the splitalphabet, do it the functional way:
count = len([c for c in word if c in splitalphabet])
import string
# making letters a set makes "ch in letters" very fast
letters = set(string.ascii_letters)
def letters_in_word(word):
return sum(ch in letters for ch in word)
Edit: it sounds like you should look at Levenshtein edit distance:
from Levenshtein import distance
distance("because", "becuase") # => 2
While join creates the string from the split list, you do not have to do that, as you can call find with the original string (alphabet). However, I do not think that is what you are trying to do. Note that the find you are attempting looks for splitalphabet (in effect, the whole alphabet) within year3wordlist[x], which will always fail: passing the list raises the TypeError, and passing the full alphabet string returns -1 for any ordinary word.
If what you are trying to do is to get the indices of all the letters of the word list within the alphabet, then you would need to handle it as
for each letter in the word of the word list, determine the index within alphabet.
j = []
for c in word:
    j.append(alphabet.find(c))
print(j)
On the other hand if you are attempting to find the index of each character within the alphabet within the word, then you need to loop over splitalphabet to get an individual character to find within the word. That is
l = []
for c in splitalphabet:
    j = word.find(c)
    if j != -1:
        l.append((c, j))
print(l)
This gives a list of tuples showing the characters found and the index of each.
I just saw that you talk about counting the number of letters. I am not sure what you mean by this as len(word) gives the number of characters in each word while len(set(word)) gives the number of unique characters. On the other hand, are you saying that your word might have non-ascii characters in it and you want to count the number of ascii characters in that word? I think that you need to be more specific in what you want to determine.
If what you are doing is attempting to determine if the characters are all alphabetic, then all you need to do is use the isalpha() method on the word. You can either say word.isalpha() and get True or False or check each character of word to be isalpha()
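For the spelling-bee use case described in the question's edit, comparing the two words position by position may be closer to what is wanted than searching the alphabet. The sketch below is an assumption about that intent; `position_matches` is a hypothetical helper name, and `zip` stops at the end of the shorter word, so extra trailing letters are ignored.

```python
def position_matches(correct, attempt):
    """Count the positions at which both words have the same character."""
    return sum(1 for a, b in zip(correct, attempt) if a == b)


correct = "because"
attempt = "becuase"
print(position_matches(correct, attempt), "of", len(correct))
# prints: 5 of 7
```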

Python: how to count overlapping occurrences of a substring [duplicate]

This question already has answers here:
How can I find the number of overlapping sequences in a String with Python? [duplicate]
(4 answers)
Closed 9 years ago.
I wanted to count the number of times that a string like 'aa' appears in 'aaa' (or 'aaaa').
The most obvious code gives the wrong (or at least, not the intuitive) answer:
'aaa'.count('aa')
1 # should be 2
'aaaa'.count('aa')
2 # should be 3
Does anyone have a simple way to fix this?
From str.count() documentation:
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
So, no. You are getting the expected result.
If you want to count number of overlapping matches, use regex:
>>> import re
>>>
>>> len(re.findall(r'(a)(?=\1)', 'aaa'))
2
This finds all occurrences of a that are followed by another a. The second a is not consumed, because the look-ahead is a zero-width assertion.
haystack = "aaaa"
needle = "aa"
matches = sum(haystack[i:i+len(needle)] == needle
              for i in xrange(len(haystack)-len(needle)+1))
# for Python 3 use range instead of xrange
Your solution does not take overlap into consideration.
Try this:
big_string = "aaaa"
substring = "aaa"
count = 0
for char in range(len(big_string)):
    count += big_string[char: char + len(substring)] == substring
print(count)
You have to be careful, because you seem to be looking for overlapping substrings, which count() skips. To fix this I'd do:
len([s.start() for s in re.finditer('(?=aa)', 'aaa')])
And if you don't care about the position where the substring starts you can do:
len([s for s in re.finditer('(?=aa)', 'aaa')])
Although someone smarter than myself might be able to show that there are performance differences :)
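A regex-free alternative is to call `str.find` repeatedly and advance the search start by one character instead of by the length of the match. This is a minimal sketch; the function name `count_overlapping` is illustrative, and rejecting an empty needle is an assumption made to avoid an endless loop.

```python
def count_overlapping(haystack, needle):
    """Count occurrences of needle in haystack, allowing overlaps."""
    if not needle:
        raise ValueError("needle must be non-empty")
    count = 0
    start = haystack.find(needle)
    while start != -1:
        count += 1
        # Move one character forward so overlapping matches are still found.
        start = haystack.find(needle, start + 1)
    return count


print(count_overlapping("aaa", "aa"), count_overlapping("aaaa", "aa"))
# prints: 2 3
```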
