Python isspace function - python

I'm having difficulty with the isspace function. Any idea why my code is wrong and how to fix it?
Here is the problem:
Implement the get_num_of_non_WS_characters() function. get_num_of_non_WS_characters() has a string parameter and returns the number of characters in the string, excluding all whitespace.
Here is my code:
def get_num_of_non_WS_characters(s):
    count = 0
    for char in s:
        if char.isspace():
            count = count + 1
    return count

You want to count the non-whitespace characters, so you should negate the test with not:
def get_num_of_non_WS_characters(s):
    count = 0
    for char in s:
        if not char.isspace():
            count += 1
    return count
>>> get_num_of_non_WS_characters('hello')
5
>>> get_num_of_non_WS_characters('hello ')
5
For completeness, this could be done more succinctly using a generator expression:
def get_num_of_non_WS_characters(s):
    return sum(1 for char in s if not char.isspace())

A shorter version of @CoryKramer's answer:
def get_num_of_non_WS_characters(s):
    return sum(not c.isspace() for c in s)

As an alternative you could also simply do:
def get_num_of_non_WS_characters(s):
    return len(''.join(s.split()))
Then
s = 'i am a string'
get_num_of_non_WS_characters(s)
will return 10
This will also remove tabs and new line characters:
s = 'i am a string\nwith line break'
''.join(s.split())
will give
'iamastringwithlinebreak'

I would just use n = s.replace(" ", "") and then len(n). Note that this removes only space characters; tabs and newlines would still be counted.
Otherwise, I think you should increase the count after the if statement and put a continue inside it.
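To see how the approaches above differ once tabs and newlines are involved, here is a small comparison. The helper names count_with_isspace, count_with_split, and count_with_replace are made up for this sketch and are not from the answers.

```python
def count_with_isspace(s):
    # Counts every character for which str.isspace() is False.
    return sum(1 for char in s if not char.isspace())


def count_with_split(s):
    # str.split() with no arguments splits on any run of whitespace.
    return len(''.join(s.split()))


def count_with_replace(s):
    # Removes only the space character ' '; tabs and newlines remain.
    return len(s.replace(' ', ''))


sample = 'a b\tc\nd'
print(count_with_isspace(sample))  # 4
print(count_with_split(sample))    # 4
print(count_with_replace(sample))  # 6
```

The first two agree on this sample, while the replace-based count still includes the tab and the newline.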

Related

Problem with changing lowercase letters to uppercase and vice versa using str.replace

This is my code; I don't want to use the built-in swapcase() method. It does not work for the given string.
def myFunc(a):
    for chars in range(0,len(a)):
        if a[chars].islower():
            a = a.replace(a[chars], a[chars].upper())
        elif a[chars].isupper():
            a = a.replace(a[chars], a[chars].lower())
    return a
print(myFunc("AaAAaaaAAaAa"))
replace changes every occurrence of the letter, and you assign the result back to a, so you end up with all upper case.
def myFunc(a):
    # use a list to collect changed letters
    new_text = []
    for char in a:
        if char.islower():
            new_text.append(char.upper())
        else:
            new_text.append(char.lower())
    # join the letters back into a string
    return ''.join(new_text)
print(myFunc("AaAAaaaAAaAa")) # aAaaAAAaaAaA
or shorter:
def my2ndFunc(text):
    return ''.join(a.upper() if a.islower() else a.lower() for a in text)
using a generator expression and a ternary (conditional) expression to modify each letter (see Does Python have a ternary conditional operator?)
The problem was that you were replacing all occurrences of that character in the string. Here you have a working solution:
def myFunc(a):
    result = ''
    for chars in range(0,len(a)):
        if a[chars].islower():
            result += a[chars].upper()
        elif a[chars].isupper():
            result += a[chars].lower()
        else:
            # keep characters that have no case (digits, punctuation) unchanged
            result += a[chars]
    return result
print(myFunc("AaAAaaaAAaAa"))
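To make the failure of the original loop visible, the sketch below repeats the question's logic but records the string after every iteration. The name buggy_swap and the short input "AaA" are chosen only for this illustration.

```python
def buggy_swap(a):
    # Same logic as the question's loop, recording each intermediate string.
    steps = []
    for i in range(len(a)):
        if a[i].islower():
            a = a.replace(a[i], a[i].upper())  # replaces ALL matching letters
        elif a[i].isupper():
            a = a.replace(a[i], a[i].lower())  # replaces ALL matching letters
        steps.append(a)
    return steps


for step in buggy_swap("AaA"):
    print(step)
# aaa
# AAA
# aaa
```

Each call to str.replace rewrites the whole string, so the case of every letter flips together and the expected result aAa is never produced.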

python string substitute letter by its occurrence number

Suppose I have a string Hello, world, and I'd like to replace all instances of o by its occurrence number, i.e. get a string: Hell1, w2rld?
I have found how to reference a numbered group, \g<1>, but it does require a group number.
Is there a way to do what I want in Python?
Update: Sorry for not mentioning that I was indeed looking for a regexp solution, not just a string. I have marked the solution I liked the best, but thanks for all contributions, they were cool!
For a regular expression solution:
import re
class Replacer:
    def __init__(self):
        self.counter = 0
    def __call__(self, mo):
        self.counter += 1
        return str(self.counter)
s = 'Hello, World!'
print(re.sub('o', Replacer(), s))
Split the string on the letter "o" and reassemble it by adding the index of the part in front of each part (except the first one):
string = "Hello World"
result = "".join(f"{i or ''}"+part for i,part in enumerate(string.split("o")))
output:
print(result)
# Hell1 W2rld
Using itertools.count, for example:
import re
from itertools import count
c = count(1)
s = "Hello, world"
print(re.sub(r"o", lambda x: "{}".format(next(c)), s))
# or, equivalently (use one or the other: both calls draw from the same counter c)
print(re.sub(r"o", lambda x: f"{next(c)}", s))
# --> Hell1, w2rld
You don't need regex for that, a simple loop will suffice:
sample = "Hello, world"
pattern = 'o'
output = ''
count = 1
for char in sample:
    if char == pattern:
        output += str(count)
        count += 1
    else:
        output += char
print(output)
# Hell1, w2rld
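The regex answers above keep their counter outside the substitution call, so it carries over between calls. One possible way to avoid that, sketched here with a hypothetical helper named number_occurrences, is to create a fresh itertools.count inside a function each time:

```python
import re
from itertools import count


def number_occurrences(pattern, text, start=1):
    # A new counter per call, so numbering always begins at `start`.
    counter = count(start)
    # re.sub calls the lambda once per match, left to right.
    return re.sub(pattern, lambda match: str(next(counter)), text)


print(number_occurrences("o", "Hello, world"))  # Hell1, w2rld
print(number_occurrences("o", "foo"))           # f12
```

Because the counter is local to each call, the second call starts again at 1.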

Trying to learn how loops work, what is going wrong?

The question is :
Given a string, return a string where for every char in the original, there are two chars.
This is my attempt:
def double_char(str):
    n = 0
    for x in range(0, len(str)):
        return 2*str[n]
        n = n+1
When I run it, it only returns 2 versions of the first letter and doesn't loop properly. So for double_char('Hello') it just returns 'HH'.
What is going wrong? Thanks in advance for any help, sorry for the really beginner question.
The return is causing your function to return in the first iteration so it just returns 2 of the first letter.
What you may have intended to write was something like
def double_char(s):
    n = 0
    r = ''
    for x in range(0, len(s)):
        r += 2*s[n]
        n = n+1
    return r
This builds up a string incrementally, made of two copies of each character.
A neater refactor of that function (without duplicating the other answer by using a comprehension) is
def double_char(s):
    r = ''
    for c in s:
        r += 2*c
    return r
You also should not use str as a variable name. It is a built-in type, and you hide it by defining a variable called str.
return returns control to the caller once reached, thus exiting your for loop prematurely.
Here's a simpler way to do that with str.join:
def double_char(s):
    return ''.join(i*2 for i in s)
>>> s = 'Hello'
>>> double_char(s)
'HHeelllloo'
Do not use str as a name, to avoid shadowing the built-in str type.
Here is a different way to solve the question.
def double_char(s):
    new_str = ""
    for i in range(len(s)):
        new_str += s[i] * 2
    return new_str
>>> double_char('Hello')
'HHeelllloo'
def double_char(s):
    string = ''
    # the for loop advances i by itself, so no manual increment is needed
    for i in range(len(s)):
        string += s[i] * 2
    return string
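As a side-by-side illustration of the point made in the answers, the sketch below contrasts a return placed inside the loop with one placed after it. The function names first_only and all_chars are made up for this comparison.

```python
def first_only(s):
    for c in s:
        return 2 * c  # exits the function during the first iteration


def all_chars(s):
    r = ''
    for c in s:
        r += 2 * c
    return r  # reached only after the loop has finished


print(first_only('Hello'))  # HH
print(all_chars('Hello'))   # HHeelllloo
```

The only difference is the position of return relative to the loop body.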

How to create a function to calculate the frequency of a character in a piece of text in python?

I need to create a function that counts the frequency of an inputted letter in a piece of text...
e.g.
import urllib
pieceoftext = urllib.urlopen(blahblahblah).read()
def frequency(char):
    count = 0
    for character in pieceoftext:
        count += 1
    return count
...
Any help would be appreciated :)
def frequency(char, sentence):
    count = 0
    for k in list(sentence):
        if k == char:
            count += 1
    return count
This runs as:
frequency('t', 'the quick brown fox jumps over the lazy dog.')
2
list(sentence) gets each individual character in the sentence, which we loop over with a for loop. We then check if the character is the one specified, and if it is, we add one to our variable count. At the end, we return count.
As @Hamatti pointed out, you do not need to convert to a list first, so here is a simplified version of the code (it also makes the comparison case-insensitive):
def frequency(char, sentence):
    count = 0
    for k in sentence:
        if k.lower() == char.lower():
            count += 1
    return count
EDIT:
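If the sentence is read as a piece of text from a URL, use the following code (Python 3 shown; in Python 2 the call was urllib.urlopen(...).read()):
import urllib.request
# read() returns bytes in Python 3, so decode it to get a str
sentence = urllib.request.urlopen('myamazingurl.com').read().decode()
def frequency(char, sentence):
    count = 0
    for k in sentence:
        if k.lower() == char.lower():
            count += 1
    return count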
Then continue as needed. All this does is assign the text at the URL to the variable sentence. You can also assign sentence within the frequency function, if you will always be pulling from the same URL. For that, use the following code:
def frequency(char):
    count = 0
    # requires import urllib.request; decode() turns the bytes into a str
    sentence = urllib.request.urlopen('myamazingurl.com').read().decode()
    for k in sentence:
        if k.lower() == char.lower():
            count += 1
    return count
You can alter this if you don't want two parameters, but it's probably better to use two in this case.
def frequency(c, sentence):
    count = 0
    for character in sentence:
        if c.lower() == character.lower():
            count += 1
    return count
>>> frequency('c', 'ccc')
3
Note that I used the lower() method to make the comparison case-insensitive.
You can use Counter in the collections module.
import collections
def frequency(char, string):
    string = string.lower()
    count = collections.Counter(string)
    return count[char]
Ideally, string = string.lower() should be omitted and you should pass the string already lower-cased if you wish.
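A Counter is most useful when several letters are looked up in the same text, since the counts are built in a single pass. The following is a small sketch using the sample sentence from an earlier answer; the variable names are illustrative only.

```python
from collections import Counter

text = 'the quick brown fox jumps over the lazy dog.'
counts = Counter(text.lower())  # one pass over the whole text

print(counts['t'])  # 2
print(counts['o'])  # 4
print(counts['!'])  # 0 (missing keys count as zero instead of raising KeyError)
```

For a single lookup, text.lower().count('t') gives the same number without building the full table.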
pieceoftext.lower().count('t')
or
frequency = lambda c, s=pieceoftext: s.lower().count(c)
frequency('t') # returns 2
frequency('t', pieceoftext) # returns 2
frequency('t', 'tttt') # returns 4
def frequency(c, sentence):
    return sum([c == ch for ch in sentence])
is how I would do it, if I wasn't going to use count anyway.

How to replace the Nth appearance of a needle in a haystack? (Python)

I am trying to replace the Nth appearance of a needle in a haystack. I want to do this simply via re.sub(), but cannot seem to come up with an appropriate regex to solve this. I am trying to adapt: http://docstore.mik.ua/orelly/perl/cookbook/ch06_06.htm but am failing at spanning multilines, I suppose.
My current method is an iterative approach that finds the position of each occurrence from the beginning after each mutation. This is pretty inefficient and I would like to get some input. Thanks!
I think you mean re.sub. You could pass a function and keep track of how often it was called so far:
def replaceNthWith(n, replacement):
    def replace(match, c=[0]):
        c[0] += 1
        return replacement if c[0] == n else match.group(0)
    return replace
Usage:
re.sub(pattern, replaceNthWith(n, replacement), str)
But this approach feels a bit hacky, maybe there are more elegant ways.
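Something like this regex should help you, though I'm not sure how efficient it is:
# N = 3: skip the first N-1 = 2 occurrences, then replace the next one
re.sub(
    r'^((?:.*?mytexttoreplace){2}.*?)mytexttoreplace',
    r'\1yourreplacementtext.',
    'mystring',
    flags=re.DOTALL
)
The DOTALL flag is important: it lets . match newlines, so the pattern can span multiple lines. The replacement must be a raw string (or written as '\\1'); otherwise '\1' is the control character chr(1) rather than a reference to group 1.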
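The same pattern can be built for an arbitrary N. The sketch below is one possible wrapper, not part of the answer: the name replace_nth_regex is hypothetical, it assumes n >= 1, and it treats the needle as literal text by passing it through re.escape.

```python
import re


def replace_nth_regex(needle, replacement, haystack, n):
    # Assumes n >= 1 and a literal needle (hence re.escape).
    escaped = re.escape(needle)
    # Group 1 captures everything up to the n-th occurrence.
    pattern = r'^((?:.*?%s){%d}.*?)%s' % (escaped, n - 1, escaped)
    # A function replacement inserts `replacement` verbatim, with no
    # backslash processing.
    return re.sub(pattern, lambda m: m.group(1) + replacement,
                  haystack, count=1, flags=re.DOTALL)


print(replace_nth_regex("ab", "X", "ab ab ab", 2))  # ab X ab
print(replace_nth_regex("a", "X", "a\nb a\nc a", 3) == "a\nb a\nc X")  # True
```

If the haystack holds fewer than n occurrences, the pattern does not match and the string comes back unchanged.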
I've been struggling for a while with this, but I found a solution that I think is pretty pythonic:
>>> def nth_matcher(n, replacement):
...     def alternate(n):
...         i = 0
...         while True:
...             i += 1
...             yield i % n == 0
...     gen = alternate(n)
...     def match(m):
...         replace = next(gen)
...         if replace:
...             return replacement
...         else:
...             return m.group(0)
...     return match
...
>>> re.sub("([0-9])", nth_matcher(3, "X"), "1234567890")
'12X45X78X0'
Note that this replaces every nth match, not only the nth one.
EDIT: the matcher consists of two parts:
The alternate(n) function. This returns a generator that yields an infinite sequence of True/False values, where every nth value is True. For alternate(3) the sequence is False, False, True, False, False, True, False, ...
The match(m) function. This is the function that gets passed to re.sub: it gets the next value from alternate(n) with next(gen) (gen.next() in Python 2), and if it's True it replaces the matched value; otherwise, it keeps it unchanged (replaces it with itself).
I hope this is clear enough. If my explanation is hazy, please say so and I'll improve it.
Could you do it using re.finditer with MatchObject.start() and MatchObject.end()? (re.findall returns plain strings, so it gives no positions.)
Find all occurrences of the pattern in the string with re.finditer, get the indices of the Nth occurrence with .start()/.end(), then build a new string with the replacement value using those indices.
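A minimal sketch of that idea follows. The function name replace_nth and its signature are assumptions made for this example, and it assumes n >= 1 (itertools.islice rejects a negative start).

```python
import re
from itertools import islice


def replace_nth(pattern, repl, string, n, flags=0):
    """Replace only the n-th (1-based) match of pattern in string."""
    matches = re.finditer(pattern, string, flags)
    # Skip the first n-1 matches and take the next one, if there is one.
    match = next(islice(matches, n - 1, None), None)
    if match is None:
        return string  # fewer than n matches: nothing to replace
    # expand() resolves group references such as \1 in the template.
    return string[:match.start()] + match.expand(repl) + string[match.end():]


print(replace_nth(r"\d", "X", "1234567890", 3))      # 12X4567890
print(replace_nth(r"(\d)", r"<\1>", "a1b2c3", 2))    # a1b<2>c3
```

Since finditer is lazy, the scan stops as soon as the n-th match has been found.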
If the pattern ("needle") or replacement is a complex regular expression, you can't assume anything. The function "nth_occurrence_sub" is what I came up with as a more general solution:
import re
def nth_match_end(pattern, string, n, flags):
    # return the end index of the n-th match (1-based)
    for i, match_object in enumerate(re.finditer(pattern, string, flags)):
        if i + 1 == n:
            return match_object.end()
def nth_occurrence_sub(pattern, repl, string, n=0, flags=0):
    max_n = len(re.findall(pattern, string, flags))
    if abs(n) > max_n or n == 0:
        return string
    if n < 0:
        # negative n counts from the last occurrence
        n = max_n + n + 1
    sub_n_times = re.sub(pattern, repl, string, count=n, flags=flags)
    if n == 1:
        return sub_n_times
    nm1_end = nth_match_end(pattern, string, n - 1, flags)
    sub_nm1_times = re.sub(pattern, repl, string, count=n - 1, flags=flags)
    sub_nm1_change = sub_nm1_times[:-1 * len(string[nm1_end:])]
    components = [
        string[:nm1_end],
        sub_n_times[len(sub_nm1_change):]
    ]
    return ''.join(components)
I have a similar function I wrote to do this. I was trying to replicate SQL REGEXP_REPLACE() functionality. I ended up with:
def sql_regexp_replace(txt, pattern, replacement='', position=1, occurrence=0, regexp_modifier='c'):
    class ReplWrapper(object):
        def __init__(self, replacement, occurrence):
            self.count = 0
            self.replacement = replacement
            self.occurrence = occurrence
        def repl(self, match):
            self.count += 1
            if self.occurrence == 0 or self.occurrence == self.count:
                return match.expand(self.replacement)
            else:
                return match.group(0)
    occurrence = 0 if occurrence < 0 else occurrence
    # regexp_flags is the author's own helper and is not shown here
    flags = regexp_flags(regexp_modifier)
    rx = re.compile(pattern, flags)
    replw = ReplWrapper(replacement, occurrence)
    return txt[0:position-1] + rx.sub(replw.repl, txt[position-1:])
One important note that I haven't seen mentioned is that you need to return match.expand() otherwise it won't expand the \1 templates properly and will treat them as literals.
If you want this to work you'll need to handle the flags differently (or take it from my github, it's simple to implement and you can dummy it for a test by setting it to 0 and ignoring my call to regexp_flags()).
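The remark about match.expand() can be checked with a short experiment. This is only an illustration; the pattern and sample text are made up.

```python
import re

text = "2024-01"
pattern = r"(\d+)-(\d+)"

# A function replacement is inserted as-is: group references stay literal.
literal = re.sub(pattern, lambda m: r"\2/\1", text)

# match.expand() applies the template the way a string replacement would.
expanded = re.sub(pattern, lambda m: m.expand(r"\2/\1"), text)

print(literal)   # \2/\1
print(expanded)  # 01/2024
```

So a callback that should honour \1-style templates has to call expand() itself.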
