I'm writing a function that takes a word as a parameter and looks at each character. If there is a digit in the word, it should return the word.
This is my string that I will iterate through
'Let us look at pg11.'
and I want to look at each character in each word and if there is a digit in the word, I want to return the word just the way it is.
import string
def containsDigit(word):
for ch in word:
if ch == string.digits
return word
if any(ch.isdigit() for ch in word):
    print word, 'contains a digit'
To make your code work use the in keyword (which will check if an item is in a sequence), add a colon after your if statement, and indent your return statement.
import string
def containsDigit(word):
    for ch in word:
        if ch in string.digits:
            return word
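Applied to the sentence from the question, the same membership test can be run word by word. This is only a minimal sketch in Python 3: `words_with_digits` is a hypothetical helper name, and splitting on whitespace is an assumption about how the words are separated.

```python
import string

def contains_digit(word):
    """Return the word unchanged if any character is a digit, else None."""
    for ch in word:
        if ch in string.digits:
            return word
    return None

def words_with_digits(sentence):
    """Hypothetical helper: collect each whitespace-separated word that has a digit."""
    return [w for w in sentence.split() if contains_digit(w) is not None]

print(words_with_digits('Let us look at pg11.'))  # ['pg11.']
```

Note that the trailing period stays attached to the word, since nothing here strips punctuation.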
Why not use Regex?
>>> import re
>>> word = "super1"
>>> if re.search(r"\d", word):
...     print("y")
...
y
>>>
So, in your function, just do:
import re
def containsDigit(word):
    if re.search(r"\d", word):
        return word
print(containsDigit("super1"))
output:
super1
You are missing a colon:
for ch in word:
    if ch.isdigit(): #<-- you are missing this colon
        print "%s contains a digit" % word
        return word
Often when you want to know if "something" contains "something_else", sets may be useful.
digits = set('0123456789')
def containsDigit(word):
    if set(word) & digits:
        return word
print containsDigit('hello')
If you really want to use the string module, here is the code:
import string
def search(raw_string):
    for raw_array in string.digits:
        for listed_digits in raw_array:
            if listed_digits in raw_string:
                return True
    return False
If I run it in the shell I get the wanted results (True if the string contains a digit, False if not):
>>> search("Give me 2 eggs")
True
>>> search("Sorry, I don't have any eggs.")
False
Code Break Down
This is how the code works
string.digits is the string '0123456789'. Looping over a string yields its characters one at a time, so the outer loop produces '0', then '1', and so on. Each of those is itself a one-character string, so the inner loop simply yields that same character again. For each digit, the body checks whether it occurs in the given string and returns True as soon as one does. The loop variable changes on every pass, so the body runs once per digit: when the loop reaches 5, for example, the same check runs with 5. If the loop gets to the end of string.digits without a match, the function returns False.
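Since iterating over string.digits already yields one character at a time, the inner loop is not strictly needed. A shorter sketch of the same idea in Python 3 (the name `has_digit` is made up here for illustration):

```python
import string

def has_digit(raw_string):
    """Return True if any character of string.digits occurs in raw_string."""
    # string.digits is '0123456789'; iterating over it yields one digit at a time.
    return any(digit in raw_string for digit in string.digits)

print(has_digit("Give me 2 eggs"))                 # True
print(has_digit("Sorry, I don't have any eggs."))  # False
```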
Related
I'm running into an issue where my Python code is not correctly returning a function call designed to add an underscore character before each capital letter and I'm not sure where I'm going wrong. For an output, only the "courseID" word in the string is getting touched whereas the other two words are not.
I thought cycling thru the letters in a word, looking for capitalized letters would work, but it doesn't appear to be so. Could someone let me know where my code might be going wrong?
def parse_variables(string):
new_string=''
for letter in string:
if letter.isupper():
pos=string.index(letter)
parsed_string=string[:pos] + '_' + string[pos:]
new_string=''.join(parsed_string+letter)
else:
new_string=''.join(letter)
# new_string=''.join(letter)
return new_string.lower()
parse_variables("courseID pathID apiID")
Current output is a single letter lowercase d and the expected output should be course_id path_id api_id.
The issue with your revised code is that index only finds the first occurrence of the capital letter in the string. Since you have repeated instances of the same capital letters, the function never finds the subsequent instances. You could simplify your approach and avoid this issue by simply concatenating the letters with or without underscores depending on whether they are uppercase as you iterate.
For example:
def underscore_caps(s):
    result = ''
    for c in s:
        if c.isupper():
            result += f'_{c.lower()}'
        else:
            result += c
    return result
print(underscore_caps('courseID pathID apiID'))
# course_i_d path_i_d api_i_d
Or a bit more concisely using list comprehension and join:
def underscore_caps(s):
    return ''.join([f'_{c.lower()}' if c.isupper() else c for c in s])
print(underscore_caps('courseID pathID apiID'))
# course_i_d path_i_d api_i_d
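To see why index caused trouble in the original attempt, it may help to look at what str.index returns for a repeated letter. A small sketch using the string from the question (the variable names are illustrative only):

```python
s = "courseID pathID apiID"

# str.index always reports the first match, so every 'I' maps to position 6.
first_i = s.index('I')
positions_from_index = [s.index(c) for c in s if c == 'I']

# enumerate yields the real position of each character instead.
positions_from_enumerate = [pos for pos, c in enumerate(s) if c == 'I']

print(first_i)                   # 6
print(positions_from_index)      # [6, 6, 6]
print(positions_from_enumerate)  # [6, 13, 19]
```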
I think a regex solution would be easier to understand here. This takes words that end with capital letters, adds the underscore, and makes those letters lowercase:
import re
s = "courseID pathID apiID exampleABC DEF"
def underscore_lower(match):
    return "_" + match.group(1).lower()
pat = re.compile(r'(?<=[^A-Z\s])([A-Z]+)\b')
print(pat.sub(underscore_lower, s))
# course_id path_id api_id example_abc DEF
You might have to play with that regex to get it to do exactly what you want. At the moment, it takes capital letters at the end of words that are preceded by a character that is neither a capital letter nor a space. It then makes those letters lowercase and adds an underscore in front of them.
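One way to experiment with the pattern is to list what it matches before substituting anything. A sketch in Python 3; `trailing_caps` is a hypothetical helper, and 'pathProjects' is an extra sample input chosen to show a case the pattern skips:

```python
import re

pat = re.compile(r'(?<=[^A-Z\s])([A-Z]+)\b')

def trailing_caps(text):
    """Return the runs of capital letters that the pattern would rewrite."""
    return pat.findall(text)

print(trailing_caps("courseID pathID apiID"))  # ['ID', 'ID', 'ID']
print(trailing_caps("exampleABC DEF"))         # ['ABC']
print(trailing_caps("pathProjects"))           # []
```

The last case returns nothing because the capital P is followed by more word characters, so the \b at the end of the pattern cannot match there.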
You have a number of issues with your code:
string.index(letter) gives the index of the first occurrence of letter, so if a letter such as D appears more than once, pos always points to the position of the first one.
You could correct this by iterating over both position and letter using enumerate e.g. for pos, letter in enumerate(string):
You are putting underscores before each capital letter i.e. _i_d
You are overwriting previous edits by referring to string in parsed_string=string[:pos] + '_' + string[pos:]
Correcting all these issues you would have:
def parse_variables(string):
    new_string=''
    for pos, letter in enumerate(string):
        if letter.isupper() and pos+1 < len(string) and string[pos+1].isupper():
            new_string += f'_{letter}'
        else:
            new_string += letter
    return new_string.lower()
But a much simpler method is:
"courseID pathID apiID".replace('ID', '_id')
Update:
Given the variety of strings you want to capture, it seems regex is the tool you want to use:
import re
def parse_variables(string, pattern=r'(?<=[a-z])([A-Z]+)', prefix='_'):
    """Replace patterns in string with prefixed lowercase version.
    Default pattern is any substring of consecutive
    capital letters that occur after a lowercase letter."""
    foo = lambda pat: f'{prefix}{pat.group(1).lower()}'
    # substitute in the string argument, not in a global variable
    return re.sub(pattern, foo, string)
text = 'courseID pathProjects apiCode'
print(parse_variables(text))
# course_id path_projects api_code
I'm currently writing a program that iterates through a list of words and groups them into sentences, but whenever I run it I get [] and I'm not 100% sure why. Here is my code for reading in the file and creating the sentences, along with a snippet of the list.
def import_file(text_file):
wordcounts = []
with open(text_file, encoding = "utf-8") as f:
pride_text = f.read()
sentences = pride_text.split(" ")
return sentences
def create_sentance(sentance):
sentence_list=[]
my_sentence=""
for character in sentance:
if character=='.' or character=='?' or character=='!':
sentence_list.append(my_sentence)
my_sentence=""
else:
my_sentence=my_sentence + character
return sentence_list
Preview of List
Calling of my functions
pride=import_file("pride.txt")
pride=remove_abbreviations_and_punctuation(pride)
pride=create_sentance(pride)
print(pride)
Your return sentence_list is indented one level further than it should be. If the else branch runs on the first iteration of the for loop rather than the if branch, your function returns sentence_list, which was initialized to [] and is still empty. Either way, even if sentance were 20 characters long, your for loop would only run once given where your return statement is.
Make the following change:
def create_sentance(sentance):
    sentence_list=[]
    my_sentence=""
    for character in sentance:
        if character=='.' or character=='?' or character=='!':
            sentence_list.append(my_sentence)
            my_sentence=""
        else:
            # do you not want this in 'sentence_list'?
            my_sentence=my_sentence + character
    return sentence_list
The reason your function is returning an empty list is because your return is inside the for loop. In addition, "character" is actually a word, since each element of your list is a word. This program works:
def create_sentance(sentance):
    sentence_list=[]
    my_sentence=""
    for character in sentance:
        print character
        if '.' in character or '?' in character or '!' in character:
            sentence_list.append(my_sentence + ' ' + character)
            my_sentence=""
        else:
            my_sentence=my_sentence + ' ' + character
    return sentence_list
create_sentance(['I','will','go','to','the','park.','Ok?'])
You need to use "in" instead of == because each character is a word. Try printing "character" to see this. The above program works and returns the result
[' I will go to the park.', ' Ok?']
which is what you were intending.
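Because the version above puts a space in front of every word, each sentence it builds starts with a space. If that matters, one possible variation is to collect the words in a list and join them at the end of each sentence. This is a sketch in Python 3 under two assumptions: the name `create_sentences` is made up, and a word ends a sentence only when its last character is '.', '?' or '!'.

```python
def create_sentences(words):
    """Group a list of words into sentences ending in '.', '?' or '!'."""
    sentences = []
    current = []  # words of the sentence currently being built
    for word in words:
        current.append(word)
        if word.endswith(('.', '?', '!')):
            sentences.append(' '.join(current))
            current = []
    # Like the original, trailing words without a terminator are dropped.
    return sentences

print(create_sentences(['I', 'will', 'go', 'to', 'the', 'park.', 'Ok?']))
# ['I will go to the park.', 'Ok?']
```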
I have this code:
print('abcdefg')
input('Arrange word from following letters: ')
I want to return True if the input consists of letters from the printed string but it doesn't have to have all of printed letters.
That's a perfect use case for sets, especially for set.issubset:
print('abcdefg')
given_input = input('Arrange word from following letters: ')
if set(given_input).issubset('abcdefg'):
    print('True')
else:
    print('False')
or directly print (or return) the result of the issubset operation without if and else:
print(set(given_input).issubset('abcdefg'))
This sounds a little like homework...
Basically you would need to do this: Store both strings in variables. e.g. valid_chars and s.
Then loop through s one character at a time. For each character check if it is in valid_chars (using the in operator). If any character is not found in valid_chars then you should return False. If you get to the end of the loop, return True.
If the valid_chars string is very long it would be better to first put them into a set but for short strings this is not necessary.
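The loop described above could look roughly like this. It is a sketch in Python 3 rather than a finished solution: the function name `only_valid_chars` is an assumption, while valid_chars and s are the variable names suggested above.

```python
def only_valid_chars(s, valid_chars):
    """Return True if every character of s occurs in valid_chars."""
    for ch in s:
        if ch not in valid_chars:
            return False  # first character that is not allowed
    return True           # reached the end without finding a bad character

print(only_valid_chars('badge', 'abcdefg'))  # True
print(only_valid_chars('bath', 'abcdefg'))   # False
```

An empty input returns True here, because there is no character that fails the check; whether that is the desired behaviour depends on the exercise.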
This might be an easy one, but I can't spot where I am making the mistake.
I wrote a simple program to read words from a wordfile (don't have to be dictionary words), sum the characters and print them out from lowest to highest. (PART1)
Then, I wrote a small script after this program to filter and search for only those words which have only alphabetic characters in them. (PART2)
While the first part works correctly, the second part prints nothing. I think the error is at the line 'print ch' where a character of a list converted to string is not being printed. Please advise what could be the error
#!/usr/bin/python
# compares two words and checks if word1 has smaller sum of chars than word2
def cmp_words(word_with_sum1,word_with_sum2):
(word1_sum,__)=word_with_sum1
(word2_sum,__)=word_with_sum2
return word1_sum.__cmp__(word2_sum)
# PART1
word_data=[]
with open('smalllist.txt') as f:
for l in f:
word=l.strip()
word_sum=sum(map(ord,(list(word))))
word_data.append((word_sum,word))
word_data.sort(cmp_words)
for index,each_word_data in enumerate(word_data):
(word_sum,word)=each_word_data
#PART2
# we only display words that contain alphabetic characters and numebrs
valid_characters=[chr(ord('A')+x) for x in range(0,26)] + [x for x in range(0,10)]
# returns true if only alphabetic characters found
def only_alphabetic(word_with_sum):
(__,single_word)=word_with_sum
map(single_word.charAt,range(0,len(single_word)))
for ch in list(single_word):
print ch # problem might be in this loop -- can't see ch
if not ch in valid_characters:
return False
return True
valid_words=filter(only_alphabetic,word_data)
for w in valid_words:
print w
Thanks in advance,
John
The problem is that charAt does not exist in python.
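You can iterate directly with 'for ch in my_word'.
Notes:
you can use the builtin str.isalnum() for your test
valid_characters contains only the uppercase version of the alphabet, and its digits are integers rather than strings, so a character such as '5' never matches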
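A sketch of how the filter might look with str.isalnum(), written for Python 3. The sample words and the name `only_alphanumeric` are assumptions for illustration; the (sum, word) tuples follow the layout used in the question.

```python
def only_alphanumeric(word_with_sum):
    """Return True if the word in a (sum, word) tuple has only letters and digits."""
    _, single_word = word_with_sum
    return single_word.isalnum()

# Build (sum of character codes, word) pairs as in PART1 of the question.
word_data = [(sum(map(ord, w)), w) for w in ['abc', 'Pg11', 'no-go', '']]
valid_words = [pair for pair in word_data if only_alphanumeric(pair)]
print(valid_words)  # [(294, 'abc'), (281, 'Pg11')]
```

Keep in mind that isalnum() also accepts non-ASCII letters and digits, and returns False for an empty string.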
I was building a bit of code that would trim off any non-digit entries from the start and end of a string, I had a very confusing issue with the following bit of code:
def String_Trim(Raw_String):
if Raw_String[0].isdigit() == False:
New_String = Raw_String[1:]
String_Trim(New_String)
elif Raw_String[-1].isdigit() == False:
New_String = Raw_String[:-1]
String_Trim(New_String)
else:
print Raw_String
return Raw_String
print(String_Trim('ab19fsd'))
The initial printing of Raw_String works fine and displays the value that I want (19), but for some reason, the last line trying to print the return value of String_Trim returns a None. What exactly is python doing here and how can I fix it? Any other comments about improving my code would also be greatly appreciated.
Use regex for this. Recursion for trimming a string is really not a good idea:
import re
def trim_string(string):
    return re.sub(r'^([^0-9]+)(.*?)([^0-9]+)$', r'\2', string)
To break it down, the regex (r'^([^0-9]+)(.*?)([^0-9]+)$') is like so:
^ matches the start of a string.
([^0-9]+) matches a group of consecutive non-digit characters.
(.*?) matches a group of stuff (non-greedy).
([^0-9]+) matches another group of consecutive non-digit characters.
$ matches the end of the string.
The replacement string, r'\2', just says to replace the matched string with only the second group, which is the stuff between the two groups of non-digit characters.
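One thing to be aware of: the pattern above only matches when there is at least one non-digit character at both ends, so an input such as '19fsd' comes back unchanged. A possible variation, shown here only as a sketch with a hypothetical function name, strips each end independently:

```python
import re

def trim_non_digits(text):
    """Strip runs of non-digit characters from the start and end of text."""
    # ^\D+ removes a leading run, \D+$ removes a trailing run; either may be absent.
    return re.sub(r'^\D+|\D+$', '', text)

print(trim_non_digits('ab19fsd'))  # 19
print(trim_non_digits('19fsd'))    # 19
print(trim_non_digits('a1b2c'))    # 1b2
```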
But if you're really sure you want to use your existing solution, you need to understand how recursion actually works. When a function executes return foo, it hands foo back to its caller as its output. If it finishes without executing a return statement, it returns None automatically.
That being said, you need to return in every case of the recursion process, not just at the end:
def String_Trim(Raw_String):
    if Raw_String[0].isdigit() == False:
        New_String = Raw_String[1:]
        return String_Trim(New_String)
    elif Raw_String[-1].isdigit() == False:
        New_String = Raw_String[:-1]
        return String_Trim(New_String)
    else:
        print Raw_String
        return Raw_String
You return a value in only one case inside String_Trim. Add return in front of the recursive calls:
return String_Trim(New_String)
That should fix it.
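Note that the recursive version raises an IndexError when the string contains no digits at all, because it eventually indexes into an empty string. If recursion is not a requirement, the same trim can be sketched with two loops. This is Python 3, and `string_trim` is just an illustrative name:

```python
def string_trim(raw_string):
    """Trim non-digit characters from both ends without recursion."""
    start, end = 0, len(raw_string)
    # Move start right past leading non-digits.
    while start < end and not raw_string[start].isdigit():
        start += 1
    # Move end left past trailing non-digits.
    while end > start and not raw_string[end - 1].isdigit():
        end -= 1
    return raw_string[start:end]

print(string_trim('ab19fsd'))  # 19
```

For an input with no digits, such as 'abc', this returns an empty string instead of raising an error.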
If I understand your question correctly, you want to return only the digits from a string; because "trim of any non digits from the start and end" to me sounds like "return only numbers".
If that's correct, you can do this:
''.join(a for a in 'abc19def' if a.isdigit())