Task: output a chain of words in which each word begins with the last letter of the previous word.
Example input: "aac", "cas", "baa", "eeb"
Example output: "eeb", "baa", "aac", "cas"
With a large number of words (~980), the program seems to go into an endless loop.
I think the problem is here, but I can't solve it:
for j in range(0,N-U):
    if (s[j][0] == words[i][-1]):
        searchNextWord(s,j,U)
Code:
def searchNextWord(words,i,U):
    global Result
    s = [None]*N
    if (Result==True):
        return
    res[U] = words[i]
    U += 1
    if (U == N):
        Result = res[U - 1] == words[i]
    if (Result==True):
        return
    for j in range(0,N-U):
        if (j<i):
            s[j]=words[j]
        else:
            s[j]=words[j+1]
    for j in range(0,N-U):
        if (s[j][0] == words[i][-1]):
            searchNextWord(s,j,U)

words=[]
N=int(input())
if (N<1 or N>1000):
    exit()
for i in range(N):
    new_element = str(input())
    words.append(new_element)
res = [None]*N
for i in words:
    if (len(i)>10):
        exit()
Result = False
for i in range(0,N):
    if (Result==False):
        searchNextWord(words, i, 0)
if (Result==True):
    for i in range(0,N):
        print(res[i])
else:
    print("NO")
You have too many nested for loops. Your program does not loop forever, but it takes an insane amount of time to execute!
Use a linked list instead. It will take more memory per word, but it will make the search much easier. Since you are reordering the input, you will need to consider implementing a browser class that caches the pairs in a variant of a binary tree (where each node contains the start and end of the contained elements). This will reduce the number of passes you need to perform the search.
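As an alternative to the linked-list idea above (a different technique, not the one this answer describes), the task can be viewed as finding an Eulerian path: each word is an edge from its first letter to its last letter, and a valid chain uses every edge exactly once. The sketch below uses Hierholzer's algorithm and assumes every word is non-empty; the name chain_words is made up for illustration.

```python
from collections import defaultdict

def chain_words(words):
    """Return the words ordered into a chain, or None if no chain exists.

    Assumes every word is a non-empty string.
    """
    if not words:
        return []
    out_edges = defaultdict(list)  # first letter -> words starting with it
    balance = defaultdict(int)     # out-degree minus in-degree per letter
    for w in words:
        out_edges[w[0]].append(w)
        balance[w[0]] += 1
        balance[w[-1]] -= 1

    # An Eulerian path needs at most one letter with one extra outgoing
    # edge (the start) and at most one with one extra incoming edge (the end).
    if any(abs(b) > 1 for b in balance.values()):
        return None
    starts = [c for c, b in balance.items() if b == 1]
    if len(starts) > 1:
        return None
    start = starts[0] if starts else words[0][0]

    # Hierholzer's algorithm, iterative: follow unused edges until stuck,
    # then emit words while backtracking.
    stack = [(start, None)]
    chain = []
    while stack:
        letter, word = stack[-1]
        if out_edges[letter]:
            nxt = out_edges[letter].pop()
            stack.append((nxt[-1], nxt))
        else:
            stack.pop()
            if word is not None:
                chain.append(word)
    chain.reverse()
    # If some words were never reached, the graph is disconnected.
    return chain if len(chain) == len(words) else None

print(chain_words(["aac", "cas", "baa", "eeb"]))
```

Each word is handled a constant number of times, so the running time grows linearly with the number of words instead of exploring every ordering.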
I built an anagram generator. It works, but I don't understand how the for loop over the function call on line 8 works. Why does it only work with
for j in anagram(word[:i] + word[i+1:]):
and not with
for j in anagram(word):
Also, I want to know what
for j in anagram(...)
means and does.
What is j doing in this for loop?
This is my full code:
def anagram(word):
    n = len(word)
    anagrams = []
    if n <= 1:
        return word
    else:
        for i in range(n):
            for j in anagram(word[:i] + word[i+1:]):
                anagrams.append(word[i:i+1] + j)
        return anagrams

if __name__ == "__main__":
    print(anagram("abc"))
The reason you can't write for j in anagram(word) is that it creates an infinite loop.
For example, suppose I write the recursive factorial function:
def fact(n):
    if n <= 1:
        return 1
    return n * fact(n - 1)
This works and is not a circular definition because I am giving the computer two separate equations to compute the factorial:
n! = 1
n! = n (n-1)!
and I am telling it when to use each of these: the first one when n is 0 or 1, the second when n is larger than that. The key to its working is that eventually we stop using the second definition, and we instead use the first definition, which is called the “base case.” If I were instead to give another true definition, such as n! = n!, the computer would follow those instructions, but we would never reduce down to the base case and so we would enter an infinite recursive loop. This loop would probably exhaust a resource called the “stack” rapidly, leading to errors about “excessive recursion” or too many “stack frames” or just “stack overflow” (for which this site is named!). And if you gave it a mathematically invalid expression like n! = n · n!, it would loop infinitely, and it would also be wrong even if it did not loop infinitely.
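To see the missing base case in action, here is a minimal sketch. The names bad_fact and hits_recursion_limit are made up for illustration; RecursionError is what CPython 3 raises when the stack limit is reached.

```python
def fact(n):
    # Base case: stops the recursion.
    if n <= 1:
        return 1
    return n * fact(n - 1)

def bad_fact(n):
    # Mirrors the circular definition n! = n!: the argument never shrinks,
    # so the base case is never reached.
    return bad_fact(n)

def hits_recursion_limit(func, arg):
    """Return True if calling func(arg) exhausts the recursion limit."""
    try:
        func(arg)
    except RecursionError:
        return True
    return False

print(hits_recursion_limit(fact, 5))      # False
print(hits_recursion_limit(bad_fact, 5))  # True
```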
Factorials and anagrams are closely related; in fact, for a word f whose letters are all distinct, we can say mathematically that
len(anagrams(f)) == fact(len(f))
so solving one means solving the other. In this case we are saying that the anagram of a word which is empty or of length 1 is just [word], the list containing just that word. (Your algorithm messes this case up a little bit: it returns the bare string word instead of the list [word], so it's a bug.)
The anagram of any other word must have something to do with anagrams of words of length len(word) - 1. So what we do is pull each character out of the word in turn and put it at the front of every anagram of the remaining letters. So word[:i] + word[i+1:] is the word except it is missing the letter at index i, and word[i:i+1] is the space between these; in other words, it is the letter at index i.
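Putting the description above into code, a minimal sketch of the generator with the base case returning a list might look like this (same algorithm as in the question, only the base case and variable names differ):

```python
def anagram(word):
    # Base case: an empty or one-letter word has exactly one arrangement.
    if len(word) <= 1:
        return [word]
    anagrams = []
    for i in range(len(word)):
        # The word with the letter at index i removed: one letter shorter,
        # so the recursion eventually reaches the base case.
        rest = word[:i] + word[i+1:]
        for tail in anagram(rest):
            # Put the removed letter in front of each anagram of the rest.
            anagrams.append(word[i] + tail)
    return anagrams

print(anagram("abc"))  # ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
```

Here tail plays the role of j in the question: on each pass of the inner loop it is one complete anagram of the shorter word.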
This is NOT an answer but a guide to help you understand the logic by yourself.
Firstly, you should understand one thing: anagram(word[:i] + word[i+1:]) is not the same as anagram(word).
>>> a = 'abcd'
>>> a[:2] + a[(2+1):]
'abd'
You can clearly see the difference.
For a clearer understanding, I would recommend printing the word at every level of the recursion: put a print(word) statement before the loop starts.
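One way to follow that suggestion is sketched below. Instead of printing directly, it collects one indented line per call so the nesting is visible; the name anagram_traced and the depth and log parameters are made up for illustration, and the base case returns a list.

```python
def anagram_traced(word, depth=0, log=None):
    """Return (anagrams, log), where log has one indented entry per call."""
    if log is None:
        log = []
    # Record the argument of this call, indented by recursion depth.
    log.append("  " * depth + repr(word))
    if len(word) <= 1:
        return [word], log
    result = []
    for i in range(len(word)):
        rest = word[:i] + word[i+1:]
        sub, _ = anagram_traced(rest, depth + 1, log)
        for tail in sub:
            result.append(word[i] + tail)
    return result, log

anagrams, log = anagram_traced("abc")
print("\n".join(log))
print(anagrams)
```

Every call receives a word one letter shorter than its caller's, which is exactly what anagram(word) without the slicing would fail to do.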
#code for SieveOfEratosthenes here
SieveOfEratosthenes=SieveOfEratosthenes(999999)
t = int(input().strip())
for a0 in range(t):
    N= input()
    prime=set()
    for w in range(1,7):
        for i in range(0,len(N)):
            substring=int(N[i:i+w])
            if(N[i:i+w][-1]!=4 and N[i:i+w][-1]!=6 and N[i:i+w][-1]!=8 and N[i:i+w][-1]!=0):
                if(len(str(substring))==w and substring in SieveOfEratosthenes):
                    prime.add(substring)
    print(len(prime))
This code works correctly but times out for bigger inputs.
Q: How can I optimize it?
You do not give examples of your test cases, so we cannot know when it fails.
But here I present an optimized version of your code; at least I think I understood what you are trying to do.
First, I present an implementation of the sieve (not my own invention; the source is in the function's docstring):
def generate_primes():
    """
    Generate an infinite sequence of prime numbers.
    Sieve of Eratosthenes
    Code by David Eppstein, UC Irvine, 28 Feb 2002
    http://code.activestate.com/recipes/117119/
    https://stackoverflow.com/a/568618/9225671
    """
    # Maps composites to primes witnessing their compositeness.
    # This is memory efficient, as the sieve is not "run forward"
    # indefinitely, but only as long as required by the current
    # number being tested.
    D = {}
    # The running integer that's checked for primeness
    q = 2
    while True:
        if q not in D:
            # q is a new prime.
            # Yield it and mark its first multiple that isn't
            # already marked in previous iterations
            yield q
            D[q * q] = [q]
        else:
            # q is composite. D[q] is the list of primes that
            # divide it. Since we've reached q, we no longer
            # need it in the map, but we'll mark the next
            # multiples of its witnesses to prepare for larger
            # numbers
            for p in D[q]:
                D.setdefault(p + q, []).append(p)
            del D[q]
        q += 1
Python code usually runs faster if you do not use global variables, so I put all the code inside a function. I also:
generated a set (not a list, because a set provides faster membership checking) of prime numbers at the beginning.
removed the line
if(N[i:i+w][-1]!=4 and N[i:i+w][-1]!=6 and N[i:i+w][-1]!=8 and N[i:i+w][-1]!=0):
from your code, because it does nothing useful; N[i:i+w][-1] is the last character of the substring, so it has type str and will never be equal to an int.
My version looks like this:
def func():
    max_prime_number = 10**6
    primes_set = set()
    for n in generate_primes():
        if n < max_prime_number:
            primes_set.add(n)
        else:
            break
    print('len(primes_set):', len(primes_set))

    while True:
        print()
        input_str = input('Enter "input_str":').strip()
        if len(input_str) == 0:
            break
        print('Searching prime substring in', input_str)
        prime_substrings = set()
        for w in range(1, 7):
            for i in range(len(input_str)):
                n = int(input_str[i:i+w])
                sub_str = str(n) # may be shorter than 'w' if it had leading zeros
                if len(sub_str) == w:
                    if n in primes_set:
                        prime_substrings.add(sub_str)
        print('len(prime_substrings):', len(prime_substrings))
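For comparison, here is a non-interactive sketch of the same idea that can be tested directly. It swaps in a classic array-based sieve (instead of the generator above) and skips substrings with a leading zero up front rather than converting them first; the names sieve and count_prime_substrings are made up for illustration, and the input is assumed to contain only digits.

```python
def sieve(limit):
    """Return the set of all primes below `limit` (requires limit >= 2)."""
    is_prime = bytearray([1]) * limit
    is_prime[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Cross out p*p, p*p + p, ... in a single slice assignment.
            is_prime[p*p::p] = bytearray(len(range(p*p, limit, p)))
    return {n for n in range(limit) if is_prime[n]}

def count_prime_substrings(digits, primes, max_width=6):
    """Count distinct primes among substrings of up to max_width digits."""
    found = set()
    for i in range(len(digits)):
        if digits[i] == "0":
            continue  # a substring with a leading zero is never counted
        for w in range(1, max_width + 1):
            if i + w > len(digits):
                break
            n = int(digits[i:i+w])
            if n in primes:
                found.add(n)
    return len(found)

primes = sieve(10**6)
print(count_prime_substrings("2357", primes))  # 6: 2, 3, 5, 7, 23, 2357
```

Building the prime set once and reusing it for every test case keeps the per-query work proportional to the length of the input string.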
We've started doing lists in our class and I'm a bit confused, so I'm coming here since previous questions/answers have helped me in the past.
The first question was to sum up all negative numbers in a list. I think I got it right but just want to double-check.
import random

def sumNegative(lst):
    sum = 0
    for e in lst:
        if e < 0:
            sum = sum + e
    return sum

lst = []
for i in range(100):
    lst.append(random.randrange(-1000, 1000))
print(sumNegative(lst))
For the 2nd question, I'm a bit stuck on how to write it. The question was:
Count how many words occur in a list up to and including the first occurrence of the word “sap”. I'm assuming it's a random list, but I wasn't given much info, so I'm just going off that.
I know the ending would be similar, but I have no idea how the initial part would look since it deals with strings as opposed to numbers.
I wrote code for an in-class problem, which was to count how many odd numbers are in a list (it was a random list there, so I'm assuming it's random for this question as well), and got:
import random

def countOdd(lst):
    odd = 0
    for e in lst:
        if e % 2 = 0:
            odd = odd + 1
    return odd

lst = []
for i in range(100):
    lst.append(random.randint(0, 1000))
print(countOdd(lst))
How exactly would I change this to fit the criteria for the 2nd question? I'm just confused on that part. Thanks.
The code to sum negative numbers looks fine! I might suggest testing it on a list that you can manually check, such as:
print(sumNegative([1, -1, -2]))
The same logic would apply to your random list.
A note about your countOdd function: it appears that you are missing an = (== checks for equality, = is for assignment), and the code seems to count even numbers, not odd ones. The code should be:
def countOdd(lst):
    odd = 0
    for e in lst:
        if e%2 == 1: # Odd%2 == 1
            odd = odd + 1
    return odd
As for your second question, you can use a very similar function:
def countWordsBeforeSap(inputList):
    numWords = 0
    for word in inputList:
        if word.lower() != "sap":
            numWords = numWords + 1
        else:
            return numWords
    return numWords  # "sap" never occurred in the list

inputList = ["trees", "produce", "sap"]
print(countWordsBeforeSap(inputList))
To explain the above, the countWordsBeforeSap function:
Starts iterating through the words.
If the word is anything other than "sap", it increments the counter and continues.
If the word IS "sap", it returns early from the function. The count therefore excludes "sap" itself; add 1 if you need to count up to and including it.
The function could be more general by passing in the word that you wanted to check for:
def countWordsBefore(inputList, wordToCheckFor):
    numWords = 0
    for word in inputList:
        if word.lower() != wordToCheckFor:
            numWords = numWords + 1
        else:
            return numWords
    return numWords  # wordToCheckFor never occurred in the list

inputList = ["trees", "produce", "sap"]
print(countWordsBefore(inputList, "sap"))
If the words that you are checking come from a single string then you would initially need to split the string into individual words like so:
inputString = "Trees produce sap"
inputList = inputString.split(" ")
Which splits the initial string into words that are separated by spaces.
Hope this helps!
Tom
def count_words(lst, end="sap"):
    """Note that I added an extra input parameter.
    This input parameter has a default value of "sap" which is the actual question.
    However you can change this input parameter to any other word if you want to by
    just doing count_words(lst, "another_word").
    """
    words = []
    # First we need to loop through each item in the list.
    for item in lst:
        # We append the item to our "words" list first thing in this loop,
        # as this will make sure we will count up to and INCLUDING.
        words.append(item)
        # Now check if we have reached the 'end' word.
        if item == end:
            # Break out of the loop prematurely, as we have reached the end.
            break
    # Our 'words' list now has all the words up to and including the 'end' variable.
    # 'len' will return how many items there are in the list.
    return len(words)

lst = ["something", "another", "woo", "sap", "this_wont_be_counted"]
print(count_words(lst))
Hope this helps you understand lists better!
You can make effective use of list/generator comprehensions. The snippets below are fast and memory efficient.
1. Sum of negatives:
print(sum(i for i in lst if i < 0))
2. Count of words before sap: like your sample list, this assumes there are no numbers in the list. Add 1 to include sap itself.
print(lst.index('sap'))
If it's a random list, filter out the strings first, then find the index of sap:
l = ['a','b',1,2,'sap',3,'d']
l = [x for x in l if isinstance(x, str)]
print(l.index('sap'))
3. Count of odd numbers:
print(sum(i%2 != 0 for i in lst))
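The three ideas can be wrapped into small functions so they are easy to check on hand-made lists. This is only a sketch: the function names are made up for illustration, and returning the full length when the target word is absent is an assumption, since the exercise does not say what should happen in that case.

```python
def sum_negative(numbers):
    # Add up only the values below zero.
    return sum(n for n in numbers if n < 0)

def count_up_to(words, target="sap"):
    # Number of words up to and including the first occurrence of target.
    # Assumption: if target is absent, every word is counted.
    strings = [w for w in words if isinstance(w, str)]
    try:
        return strings.index(target) + 1
    except ValueError:
        return len(strings)

def count_odd(numbers):
    # True counts as 1 when summed, so this counts the odd values.
    return sum(n % 2 != 0 for n in numbers)

print(sum_negative([1, -1, -2]))               # -3
print(count_up_to(["trees", "produce", "sap"]))  # 3
print(count_odd([1, 2, 3, 4, 5]))              # 3
```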
I have a list of sublists, each of which consists of one or more strings. I am comparing each string in one sublist to every string in the other sublists. This requires two nested for loops. However, my data set has ~5000 sublists, which means my program keeps running forever unless I run the code in increments of 500 sublists. How do I change the flow of this program so that I can still look at all j values corresponding to each i, and yet be able to run the program for ~5000 sublists? (wn is the WordNet library.)
Here's part of my code:
for i in range(len(somelist)):
    if i == len(somelist)-1: #if the last sublist, do not compare
        break
    title_former = somelist[i]
    for word in title_former:
        singular = wn.morphy(word) #convert to singular
        if singular == None:
            pass
        elif singular != None:
            newWordSyn = getNewWordSyn(word,singular)
            if not newWordSyn:
                uncounted_words.append(word)
            else:
                for j in range(i+1,len(somelist)):
                    title_latter = somelist[j]
                    for word1 in title_latter:
                        singular1 = wn.morphy(word1)
                        if singular1 == None:
                            uncounted_words.append(word1)
                        elif singular1 != None:
                            newWordSyn1 = getNewWordSyn(word1,singular1)
                            tempSimilarity = newWordSyn.wup_similarity(newWordSyn1)
Example:
Input = [['space', 'invaders'], ['draw']]
Output= {('space','draw'):0.5,('invaders','draw'):0.2}
The output is a dictionary with corresponding string pair tuple and their similarity value. The above code snippet is not complete.
How about doing a bit of preprocessing instead of doing a bunch of operations over and over? I did not test this, but you get the idea; you need to take anything you can out of the loop.
# Preprocessing:
unencountered_words = []
preprocessed_somelist = []
for sublist in somelist:
    new_sublist = []
    preprocessed_somelist.append(new_sublist)
    for word in sublist:
        temp = wn.morphy(word)
        if temp:
            new_sublist.append(temp)
        else:
            unencountered_words.append(word)

# Nested loops:
for i in range(len(preprocessed_somelist) - 1): #equivalent to your logic
    for word in preprocessed_somelist[i]:
        for j in range(i+1, len(preprocessed_somelist)):
            for word1 in preprocessed_somelist[j]:
                # newWordSyn and newWordSyn1 still have to be obtained from
                # word and word1 (e.g. with your getNewWordSyn), ideally
                # during preprocessing as well
                tempSimilarity = newWordSyn.wup_similarity(newWordSyn1)
You could try something like this, but I doubt it will be faster (and you will probably need to change the distance function):
import itertools
def dist(s1,s2):
    return sum([i!=j for i,j in zip(s1,s2)]) + abs(len(s1)-len(s2))
dict([((k,v),dist(k,v)) for k,v in itertools.product(Input1,Input2)])
This is always going to have scaling issues, because you're doing n^2 string comparisons. Julius' optimization is certainly a good starting point.
The next thing you can do is store similarity results so you don't have to compare the same words repeatedly.
One other optimisation you can make is store comparisons of words and reuse them if the same words are encountered.
key = (newWordSyn, newWordSyn1)
if key in prevCompared:
    tempSimilarity = prevCompared[key]
else:
    tempSimilarity = newWordSyn.wup_similarity(newWordSyn1)
    prevCompared[key] = tempSimilarity
    prevCompared[(newWordSyn1, newWordSyn)] = tempSimilarity
This only helps if you see a lot of the same word combinations, but I think wup_similarity is quite expensive.
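The caching idea can be tried out without WordNet. In the sketch below, toy_similarity is a made-up stand-in for the expensive wup_similarity call, and make_cached is a hypothetical helper name. It assumes the similarity is symmetric, so both argument orders share one cache entry, and that the compared objects are hashable.

```python
def make_cached(similarity):
    """Wrap a symmetric two-argument function with a result cache."""
    cache = {}

    def cached(a, b):
        # frozenset ignores order, so (a, b) and (b, a) share one entry.
        key = frozenset((a, b))
        if key not in cache:
            cache[key] = similarity(a, b)
        return cache[key]

    return cached

calls = []

def toy_similarity(a, b):
    # Stand-in for an expensive comparison; records each real call.
    calls.append((a, b))
    return 1.0 if a == b else 0.5

sim = make_cached(toy_similarity)
sim("space", "draw")
sim("draw", "space")   # served from the cache
sim("space", "draw")   # served from the cache
print(len(calls))      # 1
```

How much this saves depends entirely on how often the same pair recurs across the sublists.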
I am using the simple program below to see how long an iterative process takes to terminate. However, on line 15, I cannot figure out why I am getting an index out of range error.
An example of what I am trying to count is the number of steps it takes for the following iteration: the user inputs 4 and then 1234. Then we have: [1,2,3,4] --> [1,1,1,1] --> [0,0,0,0] and then termination. 2 steps are required to get to [0,0,0,0]. I have proven that for the values of n that I am inserting, the system goes to [0,0,0,0] eventually.
import math
index = input("Enter length: ")
n = int(index)
game = input("Enter Coordinates of length n as a number: ")
s = list(game)
Game = []
for k in s:
    Game.append(int(k))
l = len(game)
while sum(Game) > 0:
    Iteration = []
    k = 0
    j = 0
    while j < l-1:
        Iteration.append(math.fabs(Game[j]-Game[j+1])) # line 15
        j = j+1
    k = k+1
    Game = Iteration
print(k)
Game = Iteration is probably why. When j = 1, Game will be a list with only one item because of that. Then, Game[1]-Game[2] will be out of bounds.
Your code is written in a very un-Pythonic style that suggests you're translating directly from C code. (Also, if you are on Python 2, you should basically never use input(); it's insecure because it evaluates arbitrary user-entered Python code! Use raw_input() instead. In Python 3, input() returns the raw string and raw_input() no longer exists.)
If you rewrite it in a more Pythonic style, it becomes clear what the problem is:
import math

# you don't do anything with this value, but okay
s = index = int(raw_input("Enter length: "))

# game/Game naming will lead to confusion in longer code
game = raw_input("Enter Coordinates of length n as a list of comma-separated numbers: ")
Game = [int(k) for k in game.split(',')]

l = len(Game)
while sum(Game) > 0:
    Game = [math.fabs(Game[j]-Game[j+1]) for j in range(l-1)] # problem here
    # no idea what k is for, but it's not used in the loop anywhere
The problem is that in every iteration through your inner while loop, or the line marked # problem here in my version, your Game list gets shorter by one element! So on the second time through the outer while loop, it reads an element past the end of Game.
I have no idea what this code is trying to do, so I can't really suggest a fix, but if you truly intend to shorten the list on every pass, then you of course need to account for its shorter length by putting l=len(Game) inside the while loop.
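A minimal Python 3 sketch of that last suggestion follows, assuming the list really is meant to get one element shorter on every pass. The function name count_steps is made up for illustration, and pairing neighbours with zip avoids tracking the length by hand.

```python
def count_steps(values):
    """Count passes of neighbour differences until every entry is zero."""
    steps = 0
    while sum(values) > 0:
        # zip pairs each element with its right-hand neighbour, so the
        # new list is always exactly one element shorter than the old one.
        values = [abs(a - b) for a, b in zip(values, values[1:])]
        steps += 1
    return steps

print(count_steps([1, 2, 3, 4]))  # 2: [1, 2, 3, 4] -> [1, 1, 1] -> [0, 0]
```

Because the list shrinks on every pass, the loop always terminates. If the process is instead meant to keep the list at a fixed length, the update rule would need to change accordingly.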