make a global condition break - python

allow me to preface this by saying that i am learning python on my own as part of my own curiosity, and i was recommended a free online computer science course that is publicly available, so i apologize if i am using terms incorrectly.
i have seen questions regarding this particular problem on here before - but i have a separate question from them and did not want to hijack those threads. the question:
"a substring is any consecutive sequence of characters inside another string. The same substring may occur several times inside the same string: for example "assesses" has the substring "sses" 2 times, and "trans-Panamanian banana" has the substring "an" 6 times. Write a program that takes two lines of input, we call the first needle and the second haystack. Print the number of times that needle occurs as a substring of haystack."
my solution (which works) is:
first = str(input())
second = str(input())
count = 0
location = 0
while location < len(second):
    if location == 0:
        location = str.find(second,first,0)
        if location < 0:
            break
        count = count + 1
    location = str.find(second,first,location +1)
    if location < 0:
        break
    count = count + 1
print(count)
if you notice, i have on two separate occasions made the if statement that if location is less than 0, to break. is there some way to make this a 'global' condition so i do not have repetitive code? i imagine efficiency becomes paramount with increasing program sophistication so i am trying to develop good practice now.
how would python gurus optimize this code or am i just being too nitpicky?

I think Matthew and darshan have the best solution. I will just post a variation which is based on your solution:
first = str(input())
second = str(input())
def count_needle(first, second):
    location = str.find(second,first)
    if location == -1:
        return 0 # none whatsoever
    else:
        count = 1
        while location < len(second):
            location = str.find(second,first,location +1)
            if location < 0:
                break
            count = count + 1
        return count
print(count_needle(first, second))
Idea:
use a function to structure the code when appropriate
initialising the variable location before entering the while loop saves you from checking location < 0 in more than one place

Check out regular expressions, python's re module (http://docs.python.org/library/re.html). For example,
import re
first = str(input())
second = str(input())
regex = first[:-1] + '(?=' + first[-1] + ')'
print(len(re.findall(regex, second)))
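One thing to watch with the pattern above: only the last character of the needle sits inside the lookahead, so it can undercount when occurrences overlap by more than one character (for example "aaa" in "aaaaa"), and regex metacharacters in the needle are treated as syntax. A minimal sketch of a variant, assuming the goal is to count every overlapping occurrence; the name count_overlapping is made up for this illustration:

```python
import re

def count_overlapping(needle, haystack):
    # Hypothetical helper, not from the original answer.
    # A zero-width lookahead matches at every position where the needle
    # starts without consuming it, so overlapping hits are all counted.
    # re.escape makes characters such as '.' or '*' match literally.
    if not needle:
        return 0  # assumption: an empty needle counts as zero occurrences
    pattern = '(?=' + re.escape(needle) + ')'
    return len(re.findall(pattern, haystack))

print(count_overlapping("sses", "assesses"))
print(count_overlapping("an", "trans-Panamanian banana"))
```

With the two examples from the exercise this prints 2 and 6.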

As Matthew Adams mentioned, the best way to do this is with Python's re module.
For your case the solution would look something like this:
import re
def find_needle_in_heystack(needle, heystack):
    return len(re.findall(needle, heystack))
Since you are learning Python, a good habit is to follow the 'DRY' [Don't Repeat Yourself] mantra. There are lots of Python utilities that you can use in many similar situations.
For a quick overview of few very important python modules you can go through this class:
Google Python Class
which should only take you a day.

Even your approach could, in my opinion, be simplified (this relies on the fact that find returns -1 when nothing is found, including when you ask it to search from an offset past the end of the string):
>>> x = 'xoxoxo'
>>> start = x.find('o')
>>> indexes = []
>>> while start > -1:
...     indexes.append(start)
...     start = x.find('o',start+1)
>>> indexes
[1, 3, 5]

needle = "ss"
haystack = "ssi lass 2 vecess estan ss."
print('needle occurs %d times in haystack.' % haystack.count(needle))
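One caveat worth checking before relying on str.count for this exercise: it counts non-overlapping occurrences, so it reports 1 for "sses" in "assesses" where the exercise expects 2. A small sketch of the difference, with count_all_starts as a made-up name for a version that tests every start position:

```python
def count_all_starts(needle, haystack):
    # Hypothetical helper: test every start index, so overlaps count too.
    return sum(haystack.startswith(needle, i) for i in range(len(haystack)))

non_overlapping = "assesses".count("sses")
overlapping = count_all_starts("sses", "assesses")
print(non_overlapping, overlapping)
```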

Here you go :
first = str(input())
second = str(input())
x=len(first)
counter=0
for i in range(0,len(second)):
    if first==second[i:(x+i)]:
        counter=counter+1
print(counter)

Answer
needle=input()
haystack=input()
counter=0
for i in range(0,len(haystack)):
    if(haystack[i:len(needle)+i]!=needle):
        continue
    counter=counter+1
print(counter)

Related

Extract words from random strings

Below I have some strings in a list:
some_list = ['a','l','p','p','l','l','i','i','r','i','r','a','a']
Now I want to take the word april from this list. The list only contains enough letters for two copies of april, so I want to take those two copies and append them to another list, extract.
So the extract list should look something like this:
extract = ['aprilapril']
or
extract = ['a','p','r','i','l','a','p','r','i','l']
I tried many times to get everything in extract in order, but I still can't seem to get it.
But I know I can just do this
a_count = some_list.count('a')
p_count = some_list.count('p')
r_count = some_list.count('r')
i_count = some_list.count('i')
l_count = some_list.count('l')
total_count = [a_count,p_count,r_count,i_count,l_count]
smallest_count = min(total_count)
extract = ['april' * smallest_count]
But I wouldn't be here if I could just use the code above,
because I made some rules for solving this problem:
Each of the characters (a, p, r, i and l) is a magical code element. These code elements can't be created out of thin air; each one is unique and has a unique identifier, like a secret number associated with it. You don't know how to create these magical code elements, so the only way to get them is to extract them to a list.
Each of the characters (a, p, r, i and l) must be in order. Imagine they are some kind of chain that only works when the links are together, meaning that p has to be placed right next to a, and l must come last.
These important code elements are top-secret stuff, so if you want to get them, the only way is to extract them to a list.
Below are some examples of an incorrect way to do this (breaking the rules):
import re
word = 'april'
some_list = ['aaaaaaappppppprrrrrriiiiiilll']
regex = "".join(f"({c}+)" for c in word)
match = re.match(regex, some_list[0])
if match:
    lowest_amount = min(len(g) for g in match.groups())
    print(word * lowest_amount)
else:
    print("no match")
from collections import Counter
def count_recurrence(kernel, string):
    # we need to count both strings
    kernel_counter = Counter(kernel)
    string_counter = Counter(string)
    effective_counter = {
        k: int(string_counter.get(k, 0)/v)
        for k, v in kernel_counter.items()
    }
    min_recurring_count = min(effective_counter.values())
    return kernel * min_recurring_count
This might sound really stupid, but this is actually a hard problem (well, for me). I originally designed this problem for myself to practice Python, but it turned out to be way harder than I thought. I just want to see how other people solve it.
If anyone out there knows how to solve this ridiculous problem, please help me out. I am just a fourteen-year-old trying to do Python. Thank you very much.
I'm not sure what you mean by "cannot copy nor delete the magical codes" - if you want to put them in your output list you will need to "copy" them somehow.
And btw your example code (a_count = some_list.count('a') etc) won't work since count will always return zero.
That said, a possible solution is
magics = 'april'  # assumed: the characters to extract, in order (not defined in the original snippet)
worklist = [c for c in some_list[0]]
extract = []
fail = False
while not fail:
    lastpos = -1
    tempextract = []
    for magic in magics:
        if magic in worklist:
            pos = worklist.index(magic, lastpos+1)
            tempextract.append(worklist.pop(pos))
            lastpos = pos-1
        else:
            fail = True
            break
    else:
        extract.append(tempextract)
Alternatively, if you don't want to pop the elements when you find them, you may compute the positions of all the occurrences of the first element (the "a"), and set lastpos to each of those positions at the beginning of each iteration.
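The pop-based idea can also be sketched as a function that first locates one complete, in-order copy of the word and only then moves those elements out of the working list. This is an illustration under assumptions rather than the answer's exact method: the name extract_in_order is made up, "in order" is read as "each letter is found after the previous one in the source list", and an incomplete copy is left untouched.

```python
def extract_in_order(items, word):
    # Hypothetical helper, a sketch rather than the answerer's exact method.
    # Elements are moved (popped) out of a working copy, never created.
    remaining = list(items)
    extracted = []
    if not word:
        return extracted, remaining
    while True:
        positions = []
        start = 0
        for ch in word:
            try:
                pos = remaining.index(ch, start)
            except ValueError:
                # No complete, in-order copy of the word is left.
                return extracted, remaining
            positions.append(pos)
            start = pos + 1
        # Pop from the highest index down so earlier indexes stay valid.
        taken = [remaining.pop(pos) for pos in reversed(positions)]
        extracted.extend(reversed(taken))

some_list = list('aaaaaaappppppprrrrrriiiiiilll')
extract, leftover = extract_in_order(some_list, 'april')
print(''.join(extract))
```

For the sorted example string this moves three copies of april into extract and leaves the unused letters in leftover.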
This may not be the most efficient way, but the code works and makes the program logic more explicit:
some_list = ['aaaaaaappppppprrrrrriiiiiilll']
word = 'april'
extract = []
remove = []
string = some_list[0]
for x in range(len(some_list[0])//len(word)): #maximum number of times `word` can appear in `some_list[0]`
    pointer = i = 0
    while i<len(word):
        j=0
        while j<(len(string)-pointer):
            if string[pointer:][j] == word[i]:
                extract.append(word[i])
                remove.append(pointer+j)
                i+=1
                pointer = j+1
                break
            j+=1
        if i==len(word):
            for r_i,r in enumerate(remove):
                string = string[:r-r_i] + string[r-r_i+1:]
            remove = []
        elif j==(len(string)-pointer):
            break
print(extract,string)

Extract numbers from a string in python without the use of isdigit or re. tools

Let's say I have a string of integers generated by user input, where each integer is separated by a space (Code below for example)...
How can I search through that string and store each integer separately for use later on in the program? (I.E. Assigning each integer to its own variable) I can't use isdigit and cant use re tools, and I can't store the ints into a list.
userEntry = input("Please enter a Fahrenheit temperature: ")
for i in range(4):
    userEntry += " " + input("Please enter another fahrenheit:")
Things I AM allowed to use: string methods, index find/search methods, for loops, if statements, while loops.
Something like this will parse the string into space-separated strings, using slices... (I notice the first answer came in while I was working on this, but this is slightly different, so...)
def extractor(mystr):
    start = 0
    for a in range(len(mystr)):
        # a token ends at a space or at the last character of the string
        if mystr[a] == ' ' or a == len(mystr) - 1:
            end = a if mystr[a] == ' ' else a + 1
            temp = mystr[start:end]
            print(temp)
            start = a + 1
This is more like a C approach, very un-Pythonic, but standard programming fare. If you will only ever have 5 user entries, this is perhaps manageable. If you can't use a list of those variables, or if you have an unknown number of user entries, or if you have to check to make sure the user actually entered a digit and not a letter, then more work is required, but that's the basic C-string parser. Useful to know if you ever want to dive into Python internals I suppose.
If you need to convert each extracted string to an int, and exceptions are allowed, place this inside the if statement to check for type correctness:
try:
    myvar1 = int(temp)
except ValueError:
    print("Not an int")
Note that if you absolutely cannot use lists, (*or exec as in the above answer) then the only likely option is to keep slicing off the end of the string, i.e you'd have to do something like the following at the end of each if statement, then write that for loop out 4 more times, changing the variable name each time manually.
mystr = mystr[start:len(mystr)]
break
This will of course not work if you have a variable number of user entries. And is incredibly tedious... I suspect the instructor may have intended something different. Note that the real-world process for all that is just:
result = [int(x) for x in mystr.split(' ') if x.isdigit()]
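If the values only need to be processed one at a time (summed or compared, say) rather than kept, str.find and slicing are enough and no list is needed. A sketch under that assumption; sum_and_max is a made-up name, and summing and taking the maximum are placeholders for whatever the assignment actually requires:

```python
def sum_and_max(entries):
    # Hypothetical helper: walk space-separated numbers with str.find and
    # slicing only, handling each value as soon as it has been cut out.
    total = 0.0
    largest = None
    start = 0
    while start < len(entries):
        end = entries.find(' ', start)
        if end == -1:
            end = len(entries)  # last token has no trailing space
        token = entries[start:end]
        if token:  # skip empty tokens caused by repeated spaces
            value = float(token)
            total += value
            if largest is None or value > largest:
                largest = value
        start = end + 1
    return total, largest

total, largest = sum_and_max("12.8 -15.8 125.9 0 -40.0")
print(total, largest)
```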
I am not sure what your use case is, and I cannot think of a way to assign the numbers to variables in a loop, which is what you have to do if you are not allowed to use a list. The only way I can think of is exec, and I suspect that is not allowed for your task. Regardless, I am posting the answer in case it is usable:
last_space_index = 0
characters_checked = 0
var_num = 1
userEntry = "12.8 -15.8 125.9 0 -40.0"
for character in userEntry:
    characters_checked += 1
    if character == ' ':
        number = float(userEntry[last_space_index:characters_checked])
        var_name = 'var'+str(var_num)
        var_num += 1
        expression = var_name + ' = number'
        # expression becomes 'var1 = number'
        exec(expression)
        last_space_index = characters_checked
last_number = float(userEntry[last_space_index:])
var_name = 'var'+str(var_num)
expression = var_name + ' = last_number'
exec(expression)
# if you know the number of variables you are going to get
print(var1, var2, var3, var4, var5)
# else:
#     for i in range(1,var_num+1):
#         var_name = 'var'+str(i)
#         command = 'print('+var_name+')'
#         exec(command)
Output:
>>> 12.8 -15.8 125.9 0.0 -40.0
You can replace print with whatever you actually want to do.
And this is completely pointless if you are allowed to use a dictionary, set or tuple.

Python - turn some of the words in list/str to dots. len(list)?

I've started learning Python last week on codecademy and Google etc. but got stuck and couldn't find the answer anywhere so signed up on stackoverflow.com looking for your support.
I'm trying to build a program that keeps only the first 5 letters of any name and shows the remaining letter(s) as dot(s), e.g.
Adrian: "Adria."
Michael: "Micha.."
Alexander: "Alexa...." etc.
I tried to "fix" it with the "b" variable but that just prints three dots "..." regardless of how long the name is.
This is what I've got so far:
def namecheck():
    name = raw_input("Name?")
    if len(name) <=5:
        print name
    else:
        if len(name) >5:
            name = name[0:5]
            b = ("...")
            print name + b
namecheck()
I'm a total newbie so I apologise for any wrong spacing here, thank you for your support and patience.
As an alternative to sequence multiplication (one which is somewhat more self-documenting, and hopefully less confusing to maintainers), just use str.ljust to do your padding:
def namecheck():
    name = raw_input("Name?")
    # Reduce to first five (or less) characters, then pad with .s to original length
    # with str.ljust
    print name[:5].ljust(len(name), '.')
print name[:5] + '.' * (len(name) - 5) works fine, it's just a bit arcane (and also involves more temporary values, though in practice, the lack of actual method calls makes it faster on CPython).
you can try to use the function replace().
name = 'abcdefg'
name.replace(name[5:], '.' * len(name[5:]))
output: 'abcde..'
name='randy12345'
name.replace(name[5:],'.' * len(name[5:]))
output: 'randy.....'
name[5:] takes every character from the sixth onward (index 5, because indexing starts at 0)
'.' * len(name[5:]) counts those characters and builds a string of that many dots
name.replace(name[5:], '.' * len(name[5:])) then uses replace to swap the excess characters for dots
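A caveat with the replace approach: str.replace substitutes every occurrence of the tail, so a name whose tail also appears earlier in the name gets dots in the wrong places. A short sketch of the problem next to plain slicing, written for Python 3; mask_name and its keep parameter are made-up names for illustration:

```python
def mask_name(name, keep=5):
    # Hypothetical helper: keep the first `keep` characters, dot out the rest.
    return name[:keep] + '.' * max(len(name) - keep, 0)

name = 'nananana'
# The tail 'ana' also occurs at index 1, so replace rewrites both copies.
with_replace = name.replace(name[5:], '.' * len(name[5:]))
print(with_replace)
print(mask_name(name))
```

Here replace gives 'n...n...', while slicing gives 'nanan...'.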
The most concise way I can think of:
def namecheck():
    name = raw_input("Name?")
    print(name[0:5] + '.' * (len(name) - 5))
namecheck()
Try something like this:
def namecheck():
    name = raw_input("Name?")
    if len(name) <= 5:
        print name
    else:
        print name[0:5] + '.' * (len(name)-5)
namecheck()

What's wrong with my python multiprocessing code?

I am an almost new programmer learning python for a few months. For the last 2 weeks, I had been coding to make a script to search permutations of numbers that make magic squares.
Finally I succeeded in finding all 880 4x4 magic square number sets within 30 seconds. After that I made a different program for perimeter magic squares. It finds more than 10,000,000 permutations, so I want to store them in files part by part. The problem is that my program doesn't use all my CPU cores: while it is writing partial data to a file, it stops searching for new number sets. I would like one process to keep searching while the others store the found data to files.
The following is of the similar structure to my magic square program.
while True:
    print('How many digits do you want? (more than 20): ', end='')
    ansr = input()
    if ansr.isdigit() and int(ansr) > 20:
        ansr = int(ansr)
        break
    else:
        continue
fileNum = 0
itemCount = 0
def fileMaker():
    global fileNum, itemCount
    tempStr = ''
    for i in permutationList:
        itemCount += 1
        tempStr += str(sum(i[:3])) + ' : ' + str(i) + ' : ' + str(itemCount) + '\n'
    fileNum += 1
    file = open('{0} Permutations {1:03}.txt'.format(ansr, fileNum), 'w')
    file.write(tempStr)
    file.close()
numList = [i for i in range(1, ansr+1)]
permutationList = []
itemCount = 0
def makePermutList(numList, ansr):
    global permutationList
    for i in numList:
        numList1 = numList[:]
        numList1.remove(i)
        for ii in numList1:
            numList2 = numList1[:]
            numList2.remove(ii)
            for iii in numList2:
                numList3 = numList2[:]
                numList3.remove(iii)
                for iiii in numList3:
                    numList4 = numList3[:]
                    numList4.remove(iiii)
                    for v in numList4:
                        permutationList.append([i, ii, iii, iiii, v])
                        if len(permutationList) == 200000:
                            print(permutationList[-1])
                            fileMaker()
                            permutationList = []
    fileMaker()
makePermutList(numList, ansr)
I added from multiprocessing import Pool at the top. And I replaced two 'fileMaker()' parts at the end with the following.
if __name__ == '__main__':
    workers = Pool(processes=2)
    workers.map(fileMaker, ())
The result? Oh no. It just works awkwardly. For now, multiprocessing looks too difficult for me.
Anybody, please, teach me something. How should my code be modified?
Well, addressing some things that are bugging me before getting to your asked question.
numList = [i for i in range(1, ansr+1)]
I know list comprehensions are cool, but please just do list(range(1, ansr+1)) if you need the iterable to be a list (which you probably don't need, but I digress).
def makePermutList(numList, ansr):
    ...
This is quite the hack. Is there a reason you can't use itertools.permutations(numList,n)? It's certainly going to be faster, and friendlier on memory.
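As a rough sketch of that suggestion, itertools.permutations can be consumed lazily in fixed-size chunks with itertools.islice, so only one chunk is held in memory at a time. The function name permutation_chunks and its parameters are made up for this illustration; in the question's code the chunk size would be 200000 and the length 5.

```python
import itertools

def permutation_chunks(numbers, length, chunk_size):
    # Hypothetical helper: yield lists of at most chunk_size permutations.
    # itertools.permutations is lazy, so nothing is generated until asked for.
    perms = itertools.permutations(numbers, length)
    while True:
        chunk = list(itertools.islice(perms, chunk_size))
        if not chunk:
            return
        yield chunk

# Small numbers here so the example finishes instantly.
chunks = list(permutation_chunks(range(1, 5), 2, 5))
print([len(chunk) for chunk in chunks])
```

Each yielded chunk could then be handed to whatever writes a file, in place of the global permutationList.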
Lastly, answering your question: if you are looking to improve i/o performance, the last thing you should do is make it multithreaded. I don't mean you shouldn't do it, I mean that it should literally be the last thing you do. Refactor/improve other things first.
You need to take all of that top-level code that uses globals, apply the backspace key to it, and rewrite functions that pass data around properly. Then you can think about using threads. I would personally use from threading import Thread and manually spawn Threads to do each unit of I/O rather than using multiprocessing.
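A minimal sketch of what "manually spawn Threads to do each unit of I/O" could look like once the data is passed as arguments instead of through globals. The names write_chunk and write_in_background are made up, the temporary directory is only there to make the example self-contained, and whether this actually overlaps the search with the file writing depends on the workload:

```python
import os
import tempfile
import threading

def write_chunk(path, chunk):
    # Same line layout as the question's fileMaker: sum of the first three
    # numbers, the permutation, and a running index within this chunk.
    with open(path, 'w') as handle:
        for index, item in enumerate(chunk, start=1):
            handle.write('{0} : {1} : {2}\n'.format(sum(item[:3]), list(item), index))

def write_in_background(path, chunk):
    # Hypothetical helper: start a thread for one unit of I/O and return it
    # so the caller can join() it before exiting.
    worker = threading.Thread(target=write_chunk, args=(path, chunk))
    worker.start()
    return worker

out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, 'Permutations 001.txt')
worker = write_in_background(out_path, [(1, 2, 3, 4, 5), (1, 2, 3, 5, 4)])
worker.join()
```

The search loop would keep each returned thread and join them all at the end, so no file is left half-written when the program exits.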

Making string series in Python

I have a problem in Python I simply can't wrap my head around, even though it's fairly simple (I think).
I'm trying to make "string series". I don't really know what it's called, but it goes like this:
I want a function that makes strings that run in series, so that every time the functions get called it "counts" up once.
I have a list with "a-z0-9._-" (a to z, 0 to 9, dot, underscore, dash). And the first string I should receive from my method is aaaa, next time I call it, it should return aaab, next time aaac etc. until I reach ----
Also the length of the string is fixed for the script, but should be fairly easy to change.
(Before you look at my code, I would like to apologize if my code doesn't adhere to conventions; I started coding Python some days ago so I'm still a noob).
What I've got:
Generating my list of available characters
chars = []
for i in range(26):
    chars.append(str(chr(i + 97)))
for i in range(10):
    chars.append(str(i))
chars.append('.')
chars.append('_')
chars.append('-')
Getting the next string in the sequence
iterationCount = 0
nameLen = 3
charCounter = 1
def getString():
    global charCounter, iterationCount
    name = ''
    for i in range(nameLen):
        name += chars[((charCounter + (iterationCount % (nameLen - i) )) % len(chars))]
        charCounter += 1
    iterationCount += 1
    return name
And it's the getString() function that needs to be fixed, specifically the way name gets built.
I have this feeling that it's possible by using the right "modulo hack" in the index, but I can't make it work as intended!
What you try to do can be done very easily using generators and itertools.product:
import itertools
def getString(length=4, characters='abcdefghijklmnopqrstuvwxyz0123456789._-'):
    for s in itertools.product(characters, repeat=length):
        yield ''.join(s)
for s in getString():
    print(s)
aaaa
aaab
aaac
aaad
aaae
aaaf
...
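The "modulo hack" the question was reaching for amounts to writing a counter in base len(characters). A sketch of that idea, assuming the same 39-character alphabet as the answer; nth_string and CHARS are made-up names, and the caller is assumed to keep the counter n and increase it by one per call:

```python
CHARS = 'abcdefghijklmnopqrstuvwxyz0123456789._-'

def nth_string(n, length=4, characters=CHARS):
    # Hypothetical helper: treat n as a number written in base
    # len(characters), padded with the first character to a fixed length.
    base = len(characters)
    if not 0 <= n < base ** length:
        raise ValueError('n is out of range for this length')
    digits = []
    for _ in range(length):
        # divmod peels off the least significant "digit" each time.
        n, remainder = divmod(n, base)
        digits.append(characters[remainder])
    return ''.join(reversed(digits))

print(nth_string(0), nth_string(1), nth_string(39))
```

This prints aaaa aaab aaba. The generator version is usually simpler when the strings are consumed in sequence; the arithmetic version helps when a position has to be resumed or jumped to directly.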
