New Python learner here. I am trying to make a program that searches for patterns of words in a string and extracts them into variables. I’m doing this by looping through lists to find particular substrings.
I have encountered a problem that has me a bit stuck and I was wondering if somebody here could help me:
I want to loop through a list of strings inside another loop of strings but I can't seem to work out where to loop through the monthcount variable. My code below:
months = ["Easter '","December"]
monthcount = 0
datecheck = [['dated ',' and inscribed '],['dated ?','verso'],['dated ','lower right'],["dated "+months[monthcount],'in']]
datedcount = 0
while datedcount <(len(datecheck)):
if (datecheck[datedcount][0]) in inscription:
dated = (after(inscription,(datecheck[datedcount][0])))
if dated.isdigit() == False:
dated = (before(dated,(datecheck[datedcount][1])))
dated = dated.strip()
if dated.isdigit() == True:
dated_list[lister] = dated
datedcount = datedcount + 1
Perhaps this is what you are looking for?
datecheck = [['dated ',' and inscribed '],['dated ?','verso'],['dated ','lower right']]
for month in ["Easter '","December"]:
datecheck.append(['dated {0}'.format(month), 'in'])
datedcount = 0
while datedcount < len(datecheck): ...
In other words, we initialize datecheck with the static members on your list, then append a couple of dynamically-generated ones. Then you can loop over the final list just like before.
There is no nesting of loops here, just two sequential loops, where the first loops over the expressions we want to add.
Related
Below I have some strings in a list:
some_list = ['a','l','p','p','l','l','i','i','r',i','r','a','a']
Now I want to take the word april from this list. There are only two april in this list. So I want to take that two april from this list and append them to another extract list.
So the extract list should look something like this:
extract = ['aprilapril']
or
extract = ['a','p','r','i','l','a','p','r','i','l']
I tried many times trying to get the everything in extract in order, but I still can't seems to get it.
But I know I can just do this
a_count = some_list.count('a')
p_count = some_list.count('p')
r_count = some_list.count('r')
i_count = some_list.count('i')
l_count = some_list.count('l')
total_count = [a_count,p_count,r_count,i_count,l_count]
smallest_count = min(total_count)
extract = ['april' * smallest_count]
Which I wouldn't be here If I just use the code above.
Because I made some rules for solving this problem
Each of the characters (a,p,r,i and l) are some magical code elements, these code elements can't be created out of thin air; they are some unique code elements, that has some uniquw identifier, like a secrete number that is associated with them. So you don't know how to create this magical code elements, the only way to get the code elements is to extract them to a list.
Each of the characters (a,p,r,i and l) must be in order. Imagine they are some kind of chains, they will only work if they are together. Meaning that we got to put p next to and in front of a, and l must come last.
These important code elements are some kind of top secrete stuff, so if you want to get it, the only way is to extract them to a list.
Below are some examples of a incorrect way to do this: (breaking the rules)
import re
word = 'april'
some_list = ['aaaaaaappppppprrrrrriiiiiilll']
regex = "".join(f"({c}+)" for c in word)
match = re.match(regex, text)
if match:
lowest_amount = min(len(g) for g in match.groups())
print(word * lowest_amount)
else:
print("no match")
from collections import Counter
def count_recurrence(kernel, string):
# we need to count both strings
kernel_counter = Counter(kernel)
string_counter = Counter(string)
effective_counter = {
k: int(string_counter.get(k, 0)/v)
for k, v in kernel_counter.items()
}
min_recurring_count = min(effective_counter.values())
return kernel * min_recurring_count
This might sounds really stupid, but this is actually a hard problem (well for me). I originally designed this problem for myself to practice python, but it turns out to be way harder than I thought. I just want to see how other people solve this problem.
If anyone out there know how to solve this ridiculous problem, please help me out, I am just a fourteen-year-old trying to do python. Thank you very much.
I'm not sure what do you mean by "cannot copy nor delete the magical codes" - if you want to put them in your output list you will need to "copy" them somehow.
And btw your example code (a_count = some_list.count('a') etc) won't work since count will always return zero.
That said, a possible solution is
worklist = [c for c in some_list[0]]
extract = []
fail = False
while not fail:
lastpos = -1
tempextract = []
for magic in magics:
if magic in worklist:
pos = worklist.index(magic, lastpos+1)
tempextract.append(worklist.pop(pos))
lastpos = pos-1
else:
fail = True
break
else:
extract.append(tempextract)
Alternatively, if you don't want to pop the elements when you find them, you may compute the positions of all the occurences of the first element (the "a"), and set lastpos to each of those positions at the beginning of each iteration
May not be the most efficient way, although code works and is more explicit to understand the program logic:
some_list = ['aaaaaaappppppprrrrrriiiiiilll']
word = 'april'
extract = []
remove = []
string = some_list[0]
for x in range(len(some_list[0])//len(word)): #maximum number of times `word` can appear in `some_list[0]`
pointer = i = 0
while i<len(word):
j=0
while j<(len(string)-pointer):
if string[pointer:][j] == word[i]:
extract.append(word[i])
remove.append(pointer+j)
i+=1
pointer = j+1
break
j+=1
if i==len(word):
for r_i,r in enumerate(remove):
string = string[:r-r_i] + string[r-r_i+1:]
remove = []
elif j==(len(string)-pointer):
break
print(extract,string)
Can anyone help me to understand how I would create a function with def whatever() instead of using a for loop. I'm trying to do thing more Pythonically but don't really understand how to apply a function well instead of a loop. For instance, I have a loop below that works well and gives the output I would like, is there a way to do this with a function?
seasons = leaguesFinal['season'].unique()
teams = teamsDF['team_long_name'].unique()
df = []
for i in seasons:
season = leaguesFinal['season'] == i
season = leaguesFinal[season]
for j in teams:
team_season_wins = season['win'] == j
team_season_win_record = team_season_wins[team_season_wins].count()
team_season_loss = season['loss'] == j
team_season_loss_record = team_season_loss[team_season_loss].count()
df.append((j, i, team_season_win_record, team_season_loss_record))
df = pd.DataFrame(df, columns=('Team', 'Seasons', 'Wins', 'Losses'))
The output looks as follows:
Team Seasons Wins Losses
0 KRC Genk 2008/2009 15 14
1 Beerschot AC 2008/2009 11 14
2 SV Zulte-Waregem 2008/2009 16 11
3 Sporting Lokeren 2008/2009 13 9
4 KSV Cercle Brugge 2008/2009 14 15
Solution
def some_loop(something, something_else):
for i in something:
season = leaguesFinal['sesaon'] == i
season = leaguesFinal[season]
for j in something_else:
team_season_wins = season['win'] == j
team_season_win_record = team_season_wins[team_season_wins].count()
team_season_loss = season['loss'] == j
team_season_loss_record = team_season_loss[team_season_loss].count()
df.append((j, i, team_season_win_record, team_season_loss_record))
some_loop(seasons, teams)
Comments
This is what you are mentioning, creating a function out of the for loop although you still have a for loop its in a function that you can use in different areas of your code without re-using the entire code for the loop.
All there is to to is define a function that accepts two variables for this particular loop that would be def some_loop(something, something_else), I used basic naming so you could see clearer whats taking place.
Then you would replace all the instanes of seasons and teams with those variables.
Now you call your function will replace all occurences of something and something_else with whatever inputs you send to it.
Also I am not completely sure of the statements that involve x = y = i and what this accomplishes or if its even a valid statement?
actually youre mixing stuff up - functions just aggregate lines of code and thus make them reproducable without writing everything again, whereas for-loops are for iteration purposes.
In your above mentioned example, a function would just contain the for-loop and return the resulting dataframe, which you could use then. but it will not change anything or make your code smarter.
I've been write this practice program for while now, the whole purpose of the code is to get user input and generate passwords, everything almost works, but the replace statements are driving me nuts. Maybe one of you smart programmers can help me, because I'm kinda new to this whole field of programming. The issue is that replace statement only seems to work with the first char in Strng, but not the others one. The other funcs blower the last run first and then the middle one runs.
def Manip(Strng):
#Strng = 'jayjay'
print (Strng.replace('j','h',1))
#Displays: 'hayjay'
print (Strng.replace('j','h',4))
#Displays: 'hayhay'
return
def Add_nums(Strng):
Size=len(str(Strng))
Total_per = str(Strng).count('%')
# Get The % Spots Position, So they only get replaced with numbers during permutation
currnt_Pos = 0
per = [] # % position per for percent
rGen = ''
for i in str(Strng):
if i == str('%'):
per.append(currnt_Pos)
currnt_Pos+=1
for num,pos in zip(str(self.ints),per):
rGen = Strng.replace(str(Strng[pos]),str(num),4);
return rGen
for pos in AlphaB: # DataBase Of The Positions Of Alphabets
for letter in self.alphas: #letters in The User Inputs
GenPass=(self.forms.replace(self.forms[pos],letter,int(pos)))
# Not Fully Formatted yet; you got something like Cat%%%, so you can use another function to change % to nums
# And use the permutations function to generate other passwrds and then
# continue to the rest of this for loop which will generate something like cat222 or cat333
Add_nums(GenPass) # The Function That will add numbers to the Cat%%%
print (rGen);exit()
I am an almost new programmer learning python for a few months. For the last 2 weeks, I had been coding to make a script to search permutations of numbers that make magic squares.
Finally I succeeded in searching the whole 880 4x4 magic square numbers sets within 30 seconds. After that I made some different Perimeter Magic Square program. It finds out more than 10,000,000 permutations so that I want to store them part by part to files. The problem is that my program doesn't use all my processes that while it is working to store some partial data to a file, it stops searching new number sets. I hope I could make one process of my CPU keep searching on and the others store the searched data to files.
The following is of the similar structure to my magic square program.
while True:
print('How many digits do you want? (more than 20): ', end='')
ansr = input()
if ansr.isdigit() and int(ansr) > 20:
ansr = int(ansr)
break
else:
continue
fileNum = 0
itemCount = 0
def fileMaker():
global fileNum, itemCount
tempStr = ''
for i in permutationList:
itemCount += 1
tempStr += str(sum(i[:3])) + ' : ' + str(i) + ' : ' + str(itemCount) + '\n'
fileNum += 1
file = open('{0} Permutations {1:03}.txt'.format(ansr, fileNum), 'w')
file.write(tempStr)
file.close()
numList = [i for i in range(1, ansr+1)]
permutationList = []
itemCount = 0
def makePermutList(numList, ansr):
global permutationList
for i in numList:
numList1 = numList[:]
numList1.remove(i)
for ii in numList1:
numList2 = numList1[:]
numList2.remove(ii)
for iii in numList2:
numList3 = numList2[:]
numList3.remove(iii)
for iiii in numList3:
numList4 = numList3[:]
numList4.remove(iiii)
for v in numList4:
permutationList.append([i, ii, iii, iiii, v])
if len(permutationList) == 200000:
print(permutationList[-1])
fileMaker()
permutationList = []
fileMaker()
makePermutList(numList, ansr)
I added from multiprocessing import Pool at the top. And I replaced two 'fileMaker()' parts at the end with the following.
if __name__ == '__main__':
workers = Pool(processes=2)
workers.map(fileMaker, ())
The result? Oh no. It just works awkwardly. For now, multiprocessing looks too difficult for me.
Anybody, please, teach me something. How should my code be modified?
Well, addressing some things that are bugging me before getting to your asked question.
numList = [i for i in range(1, ansr+1)]
I know list comprehensions are cool, but please just do list(range(1, ansr+1)) if you need the iterable to be a list (which you probably don't need, but I digress).
def makePermutList(numList, ansr):
...
This is quite the hack. Is there a reason you can't use itertools.permutations(numList,n)? It's certainly going to be faster, and friendlier on memory.
Lastly, answering your question: if you are looking to improve i/o performance, the last thing you should do is make it multithreaded. I don't mean you shouldn't do it, I mean that it should literally be the last thing you do. Refactor/improve other things first.
You need to take all of that top-level code that uses globals, apply the backspace key to it, and rewrite functions that pass data around properly. Then you can think about using threads. I would personally use from threading import Thread and manually spawn Threads to do each unit of I/O rather than using multiprocessing.
allow me to preface this by saying that i am learning python on my own as part of my own curiosity, and i was recommended a free online computer science course that is publicly available, so i apologize if i am using terms incorrectly.
i have seen questions regarding this particular problem on here before - but i have a separate question from them and did not want to hijack those threads. the question:
"a substring is any consecutive sequence of characters inside another string. The same substring may occur several times inside the same string: for example "assesses" has the substring "sses" 2 times, and "trans-Panamanian banana" has the substring "an" 6 times. Write a program that takes two lines of input, we call the first needle and the second haystack. Print the number of times that needle occurs as a substring of haystack."
my solution (which works) is:
first = str(input())
second = str(input())
count = 0
location = 0
while location < len(second):
if location == 0:
location = str.find(second,first,0)
if location < 0:
break
count = count + 1
location = str.find(second,first,location +1)
if location < 0:
break
count = count + 1
print(count)
if you notice, i have on two separate occasions made the if statement that if location is less than 0, to break. is there some way to make this a 'global' condition so i do not have repetitive code? i imagine efficiency becomes paramount with increasing program sophistication so i am trying to develop good practice now.
how would python gurus optimize this code or am i just being too nitpicky?
I think Matthew and darshan have the best solution. I will just post a variation which is based on your solution:
first = str(input())
second = str(input())
def count_needle(first, second):
location = str.find(second,first)
if location == -1:
return 0 # none whatsoever
else:
count = 1
while location < len(second):
location = str.find(second,first,location +1)
if location < 0:
break
count = count + 1
return count
print(count_needle(first, second))
Idea:
use function to structure the code when appropriate
initialise the variable location before entering the while loop save you from checking location < 0 multiple times
Check out regular expressions, python's re module (http://docs.python.org/library/re.html). For example,
import re
first = str(input())
second = str(input())
regex = first[:-1] + '(?=' + first[-1] + ')'
print(len(re.findall(regex, second)))
As mentioned by Matthew Adams the best way to do it is using python'd re module Python re module.
For your case the solution would look something like this:
import re
def find_needle_in_heystack(needle, heystack):
return len(re.findall(needle, heystack))
Since you are learning python, best way would be to use 'DRY' [Don't Repeat Yourself] mantra. There are lots of python utilities that you can use for many similar situation.
For a quick overview of few very important python modules you can go through this class:
Google Python Class
which should only take you a day.
even your aproach could be imo simplified (which uses the fact, that find returns -1, while you aks it to search from non existent offset):
>>> x = 'xoxoxo'
>>> start = x.find('o')
>>> indexes = []
>>> while start > -1:
... indexes.append(start)
... start = x.find('o',start+1)
>>> indexes
[1, 3, 5]
needle = "ss"
haystack = "ssi lass 2 vecess estan ss."
print 'needle occurs %d times in haystack.' % haystack.count(needle)
Here you go :
first = str(input())
second = str(input())
x=len(first)
counter=0
for i in range(0,len(second)):
if first==second[i:(x+i)]:
counter=counter+1
print(counter)
Answer
needle=input()
haystack=input()
counter=0
for i in range(0,len(haystack)):
if(haystack[i:len(needle)+i]!=needle):
continue
counter=counter+1
print(counter)