I have this code that generates all 2**40 possible binary vectors of length 40. From these vectors, I want to keep the ones that satisfy the conditions of my objective function:
1- each vector must contain exactly 20 ones (1).
2- the sum s, built up as s = s + (index of the one + 1) * (rank of the one) over every one in the vector, must equal 4970. The rank of a one is its position among the ones: 1 for the first one, 2 for the second, and so on.
I wrote this code, but it will take a very long time, maybe months, to give the results. I am looking for an alternative approach or an optimization of this code, if possible.
import time
from multiprocessing import Process
from multiprocessing import Pool
import numpy as np
import itertools
import numpy
CC = 20
#test if there is 20 numbers of 1
def test1numebers(v,x=1,x_l=CC):
    c = 0
    for i in range(len(v)):
        if(v[i]==x):
            c+=1
    if c == x_l:
        return True
    else:
        return False
#s = s+ the nth of 1 * (index+1)
def objectif_function(v,x=1):
    s = 0
    for i in range(len(v)):
        if(v[i]==x):
            s = s+((i+1)*nthi(v,i))
    return s
#calculate the nth of 1 in a vecteur
def nthi(v,i):
    c = 0
    for j in range(0,i+1):
        if(v[j] == 1):
            c+=1
    return c
#generate 2**40 of all possible binray numbers
def generateMatrix(N):
    l = itertools.product([0, 1], repeat=N)
    return l
#function that get the number of valide vector that match our objectif function
def main_algo(N=40,S=4970):
    #N = 40
    m = generateMatrix(N)
    #S = 4970
    c = 0
    ii = 0
    for i in m:
        ii+=1
        print("\n count:",ii)
        xx = i
        if(test1numebers(xx)):
            if(objectif_function(xx)==S):
                c+=1
                print('found one')
                print('\n',xx,'\n')
        if ii>=1000000:
            break
    t_end = time.time()
    print('time taken for 10**6 is: ',t_end-t_start)
    print(c)
#main_algo()
if __name__ == '__main__':
    '''p = Process(target=main_algo, args=(40,4970,))
    p.start()
    p.join()'''
    p = Pool(150)
    print(p.map(main_algo, [40,4970]))
You could make a lot of improvements in readability and make your code more Pythonic.
I recommend that you use numpy, which is a fast way of working with matrices in Python.
Avoid processing matrices element by element in a Python loop. With numpy you can apply a calculation to all the data at once, which is much faster.
numpy can also generate matrices very quickly. I think you could build a random [0,1] matrix in fewer lines of code and considerably faster.
I also recommend that you install OpenBLAS, ATLAS and LAPACK, which speed up linear algebra calculations.
I hope this helps you.
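A different route that avoids enumerating the 2**40 vectors altogether is dynamic programming over the positions. The sketch below is only one possible approach and not the code from the question: it assumes that the number of valid vectors is what is wanted, and the names count_vectors and brute_force are made up for illustration. It walks through positions 1..N and tracks how many prefixes reach each (number of ones so far, weighted sum so far) state; brute_force is a slow reference that is only meant for checking small cases.

```python
from collections import Counter
from itertools import combinations


def count_vectors(n=40, ones=20, target=4970):
    """Count 0/1 vectors of length n with `ones` ones whose weighted sum is `target`.

    The weighted sum adds (position * rank) for every one, where position is
    1-based and rank is 1 for the first one, 2 for the second, and so on.
    """
    # states[(k, s)] = number of prefixes with k ones and weighted sum s
    states = Counter({(0, 0): 1})
    for pos in range(1, n + 1):
        nxt = Counter()
        for (k, s), ways in states.items():
            nxt[(k, s)] += ways              # put a 0 at this position
            if k < ones:
                s2 = s + pos * (k + 1)       # put a 1: it becomes one number k+1
                if s2 <= target:             # sums only grow, so prune early
                    nxt[(k + 1, s2)] += ways
        states = nxt
    return states[(ones, target)]


def brute_force(n, ones, target):
    """Slow reference: try every choice of positions (small n only)."""
    return sum(
        1
        for combo in combinations(range(1, n + 1), ones)
        if sum(pos * rank for rank, pos in enumerate(combo, start=1)) == target
    )


if __name__ == "__main__":
    print(count_vectors(8, 3, 30), brute_force(8, 3, 30))
```

With the defaults, count_vectors() covers the N = 40, S = 4970 case. There are at most 21 * 4971 states per position, so the work is tiny compared with the full enumeration. If the vectors themselves are needed rather than their count, the same states could be used to guide a backtracking search.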
Beginner programmer here :)) I am working on a school project where the assignment is to find the roots of five functions in one file.
One of my functions has two roots, and my code can only find one. It seems like the second while-loop is ignored. I tried putting only this code in a separate file, and it worked, but together with the other files it won't work...
Just ask if there is something that's weird ;)
def b(x: float):
    return -x**2+x+6
def bgraf():
    xlim(a1, b1)
    ylim(-15, 25)
    x = linspace(-5, 10, 1000)
    y = b(x)
    plot(x, y, 'r')
    return
funksjoner = [0, 1, 2, 3, 4]
while response not in funksjoner:
    i = int(input("choose a function from 0 to 4"))
    response = i
    if response in funksjoner:
        print("you chose function ", int(funksjoner[i]))
a1 = float(input())
b1 = float(input())
z = a1
y = b1
m = a1
n = b1
NP = True
if int(funksjoner[i]) == funksjoner[1]:
    while abs(y-z) > 1e-10:
        null1 = (z+y)/2
        if b(z)*b(null1)>0 and b(y)*b(null1)>0:
            NP = False
            print('No roots in this interval')
            bgraf()
            break
        elif b(null1) == 0:
            break
        elif b(z)*b(null1)>0:
            z = null1
        else :
            y = null1
    while abs(n-m) > 1e-10:
        null1_2 = (m+n)/2
        if b(m)*b(null1_2)>0 and b(n)*b(null1_2)>0:
            NP = False
            print('No roots in this interval')
            bgraf()
            break
        elif b(null1_2) == 0:
            break
        elif b(m)*b(null1_2)>0:
            m = null1_2
        else :
            n = null1_2
    if NP :
        print('we have a root when x =', round(((z+y)/2), 1))
        if null1 != null1_2:
            print('and when x =', round(((m+n)/2), 1))
        bgraf()
        scatter(null1, 0)
        if null1 != null1_2:
            scatter(null1_2, 0)
It looks like Python is ignoring the second while-loop I placed under the if-statement. Is there another way I could do this?
Thanks for your attention!
Several things to think about:
What do you want to achieve with the following line of code?
if int(funksjoner[i]) == funksjoner[1]:
You could simply check
if i == 1:
I don't see any difference between the first and the second while loop. Both start from the same values:
z = m = a1
y = n = b1
Both loops therefore bisect the same interval in the same way and end at the same root. So what should the difference between the two be?
In general, the code is hard to read because of the variable names. Try to use names that tell you what the variables contain.
To get a better impression of what is going on in your code, either use a debugger or, if you are not familiar with debugging, add print statements to see what is stored in your variables at each point of the execution, and which statements are executed and which are skipped or never reached.
If you give us more detailed information about your code (for example, by adding comments that explain it) and ask more detailed questions, we can support you better.
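One way to make the two searches actually differ is to cut the interval into small pieces and bisect only the pieces where the function changes sign. The following is a minimal sketch under that assumption, not a fix of the original program: the helper names bisect and find_roots and the choice of 100 pieces are made up for illustration, and a root where the curve only touches the axis without crossing it would be missed.

```python
def b(x):
    return -x**2 + x + 6


def bisect(f, lo, hi, tol=1e-10):
    """Bisection on [lo, hi]; assumes f(lo) and f(hi) have opposite signs."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) == 0:
            return mid
        if f(lo) * f(mid) > 0:
            lo = mid      # the sign change is in the right half
        else:
            hi = mid      # the sign change is in the left half
    return (lo + hi) / 2


def find_roots(f, lo, hi, pieces=100):
    """Scan [lo, hi] in equal pieces and bisect every piece with a sign change."""
    roots = []
    step = (hi - lo) / pieces
    for k in range(pieces):
        left = lo + k * step
        right = lo + (k + 1) * step
        if f(left) == 0:
            roots.append(left)                    # a grid point is itself a root
        elif f(left) * f(right) < 0:
            roots.append(bisect(f, left, right))
    if f(hi) == 0:
        roots.append(hi)
    return roots


if __name__ == "__main__":
    for root in find_roots(b, -5, 10):
        print("we have a root when x =", round(root, 1))
```

For b(x) on the interval from -5 to 10 this reports one root near -2 and one near 3, because each root sits in a different piece of the interval.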
I am trying to implement Dijkstra's algorithm (on an undirected graph) to find the shortest paths, and this is my code.
Note: I am not using a heap/priority queue, only an adjacency list, a dictionary to store weights, and a bool list to avoid cycling forever in the loops/recursion. The algorithm works for most test cases but fails for this particular one: https://ideone.com/iBAT0q
Important: the graph can have multiple edges from v1 to v2 (or vice versa); you have to use the minimum weight.
import sys
sys.setrecursionlimit(10000)
def findMin(n):
    for i in x[n]:
        cost[n] = min(cost[n],cost[i]+w[(n,i)])
def dik(s):
    for i in x[s]:
        if done[i]:
            findMin(i)
            done[i] = False
            dik(i)
    return
q = int(input())
for _ in range(q):
    n,e = map(int,input().split())
    x = [[] for _ in range(n)]
    done = [True]*n
    w = {}
    cost = [1000000000000000000]*n
    for k in range(e):
        i,j,c = map(int,input().split())
        x[i-1].append(j-1)
        x[j-1].append(i-1)
        try: #Avoiding multiple edges
            w[(i-1,j-1)] = min(c,w[(i-1,j-1)])
            w[(j-1,i-1)] = w[(i-1,j-1)]
        except:
            try:
                w[(i-1,j-1)] = min(c,w[(j-1,i-1)])
                w[(j-1,i-1)] = w[(i-1,j-1)]
            except:
                w[(j-1,i-1)] = c
                w[(i-1,j-1)] = c
    src = int(input())-1
    #for i in sorted(w.keys()):
    #    print(i,w[i])
    done[src] = False
    cost[src] = 0
    dik(src) #First iteration assigns possible minimum to all nodes
    done = [True]*n
    dik(src) #Second iteration to ensure they are minimum
    for val in cost:
        if val == 1000000000000000000:
            print(-1,end=' ')
            continue
        if val!=0:
            print(val,end=' ')
    print()
The optimum isn't always found in the second pass. If you add a third pass to your example, you get closer to the expected result and after the fourth iteration, you're there.
You could iterate until no more changes are made to the cost array:
done[src] = False
cost[src] = 0
dik(src)
while True:
    ocost = list(cost) # copy for comparison
    done = [True]*n
    dik(src)
    if cost == ocost:
        break
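For comparison, here is a minimal sketch of the usual priority-queue formulation of Dijkstra's algorithm, which settles every node in a single run instead of repeating passes until nothing changes. It is a different technique from the recursive code above, not a patch of it. It assumes 0-based node numbers, non-negative weights, and edges given as (u, v, weight) tuples; the function name dijkstra is chosen for illustration.

```python
import heapq


def dijkstra(n, edges, src):
    """Shortest distances from src in an undirected graph with n nodes."""
    # adjacency as dicts so that parallel edges keep only the minimum weight
    adj = [{} for _ in range(n)]
    for u, v, c in edges:
        if c < adj[u].get(v, float("inf")):
            adj[u][v] = c
            adj[v][u] = c

    dist = [float("inf")] * n
    dist[src] = 0
    heap = [(0, src)]                      # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                       # stale entry, a shorter path was found
        for v, c in adj[u].items():
            nd = d + c
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


if __name__ == "__main__":
    edges = [(0, 1, 24), (0, 3, 20), (2, 0, 3), (3, 2, 12)]
    print(dijkstra(4, edges, 0))
```

Unreachable nodes keep the distance float("inf"), which can be mapped to -1 when printing.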
This code tries to factorise a quadratic. I have written it this way because I intend to change some features later to allow for more functionality. What I want to know is why my results for xneg and xpos are both 0.
import math
sqrt = math.sqrt
equation = input("Enter the equation in the form x^2 + 5x + 6 : ")
x2coe = 0
xcoe = 0
ecoe = 0
counter = -1
rint = ''
for each in range(len(equation)+1):
    if equation[each] == 'x':
        break
    x2coe = int(equation[each])
    counter = counter + 1
for each in range(len(equation)):
    if equation[each] == 'x':
        break
    xcoe = int(equation[counter + 5:counter + 6])
ecoe = int(equation[len(equation) - 1])
if x2coe == 0:
    x2coe = 1
if xcoe == 0:
    xcoe = 1
xpos = (-xcoe+sqrt((xcoe**2)-4*(x2coe*ecoe)))/(2*x2coe)
xneg = (-xcoe-sqrt((xcoe**2)-4*(x2coe*ecoe)))/(2*x2coe)
print("Possible Solutions")
print("-----------------------------------------------")
print("X = {0}".format(xpos))
print("X = {0}".format(xneg))
print("-----------------------------------------------")
It's because your x2coe and xcoe variables are both 0 when you reach the computations for xpos and xneg. You would have received a division by zero, except for what looks like another problem. The xpos & xneg expressions look like the quadratic formula, but you are dividing by 2 and then multiplying by x2coe at the end. Multiplication and division have equal precedence and group from left to right, so you need to use one of:
xpos = (-xcoe+sqrt((xcoe**2)-4*(x2coe*ecoe)))/(2*x2coe) # one way to fix
xneg = (-xcoe-sqrt((xcoe**2)-4*(x2coe*ecoe)))/2/x2coe # another, slower way
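The grouping rule can be checked with plain numbers. This small illustration uses made-up values (numerator and a are not from the original program) to show the three spellings side by side:

```python
numerator = 12.0
a = 3.0

wrong = numerator / 2 * a        # evaluated as (numerator / 2) * a
right = numerator / (2 * a)      # divides by the whole denominator
also_right = numerator / 2 / a   # two divisions, same value as `right`

print(wrong, right, also_right)  # 18.0 2.0 2.0
```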
I suggest that you get the "business" logic of your program debugged first, and just input the three coefficients as a tuple or list.
x2coe, xcoe, ecoe = eval(input("Enter coefficients of ax^2+bx+c as a,b,c: "))
When your factoring code gives the results you want, then go back and put in a fancy input handler.
Hint: import re. Regular expressions are a good tool for simple parsing like this. (You'll need something even fancier if you want to handle parentheses/brackets/braces some day.) Take a look at the how-to document at http://docs.python.org/3.3/howto/regex.html first, and also bookmark the re module documentation at http://docs.python.org/3.3/library/re.html
The problem is probably that you're hard-coding how long you think each coefficient should be: 1 digit. You should use another function that would make it more flexible. Any of the coefficients could be blank, in which case A or B should be assumed to be 1 and C should be assumed to be 0.
Hopefully this will help:
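import re
p = re.compile(r'\s*(\d*)\s*x\^2\s*\+\s*(\d*)\s*x\s*\+\s*(\d*)\s*')
A, B, C = p.match(equation).group(1, 2, 3)
# a blank A or B means 1, a blank C means 0
A, B, C = int(A or 1), int(B or 1), int(C or 0)
print(A, B, C)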
All of the instances of \s* are to allow for flexibility in input, so spaces don't kill you.
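Putting the parsing and the quadratic formula together could look like the sketch below. It is only an illustration under a few assumptions: the input uses non-negative integer coefficients joined by plus signs, exactly as in the pattern above, the leading coefficient is not 0, and the helper names parse_quadratic and solve_quadratic are invented here.

```python
import math
import re

# same idea as the pattern above, anchored at the end of the input
PATTERN = re.compile(r'\s*(\d*)\s*x\^2\s*\+\s*(\d*)\s*x\s*\+\s*(\d*)\s*$')


def parse_quadratic(equation):
    """Return (a, b, c) for input shaped like 'x^2 + 5x + 6'."""
    match = PATTERN.match(equation)
    if match is None:
        raise ValueError("expected something like 'x^2 + 5x + 6'")
    a, b, c = match.groups()
    # a blank a or b means 1, a blank c means 0
    return int(a or 1), int(b or 1), int(c or 0)


def solve_quadratic(a, b, c):
    """Real roots of a*x^2 + b*x + c (a != 0); empty tuple if there are none."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))


if __name__ == "__main__":
    a, b, c = parse_quadratic("x^2 + 5x + 6")
    print(a, b, c)                    # 1 5 6
    print(solve_quadratic(a, b, c))   # (-2.0, -3.0)
```

Returning an empty tuple for a negative discriminant avoids the math domain error that sqrt would raise otherwise.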
I'm programming a spellcheck program in Python. I have a list of valid words (the dictionary) and I need to output a list of words from this dictionary that have an edit distance of 2 from a given invalid word.
I know I need to start by generating a list of words with an edit distance of one from the invalid word (and then run that again on all the generated words). I have three methods, inserts(...), deletions(...) and changes(...), that should output a list of words with an edit distance of 1: inserts outputs all valid words with one more letter than the given word, deletions outputs all valid words with one less letter, and changes outputs all valid words with one different letter.
I've checked a bunch of places but I can't seem to find an algorithm that describes this process. All the ideas I've come up with involve looping through the dictionary list multiple times, which would be extremely time consuming. If anyone could offer some insight, I'd be extremely grateful.
What you are looking for is called an edit distance; Wikipedia has a nice explanation. There are many ways to define a distance between two words. The one you want is the Levenshtein distance, and here is a DP (dynamic programming) implementation in Python.
def levenshteinDistance(s1, s2):
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    distances = range(len(s1) + 1)
    for i2, c2 in enumerate(s2):
        distances_ = [i2+1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                distances_.append(distances[i1])
            else:
                distances_.append(1 + min((distances[i1], distances[i1 + 1], distances_[-1])))
        distances = distances_
    return distances[-1]
And a couple of more implementations are here.
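To connect this back to the spellchecker, a distance function can be used to filter the dictionary directly. The sketch below is one simple way to do that, assuming the dictionary fits in memory and that words within a distance of 2 (rather than exactly 2) are wanted; the names levenshtein and suggestions are illustrative. It makes a single pass over the dictionary and skips words whose length alone already rules them out.

```python
def levenshtein(s1, s2):
    """Levenshtein distance, keeping only the previous row of the DP table."""
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        cur = [i]
        for j, c2 in enumerate(s2, start=1):
            cur.append(min(prev[j] + 1,                # delete c1
                           cur[j - 1] + 1,             # insert c2
                           prev[j - 1] + (c1 != c2)))  # substitute or match
        prev = cur
    return prev[-1]


def suggestions(word, dictionary, max_distance=2):
    """Dictionary words within max_distance edits of word, in dictionary order."""
    return [
        candidate
        for candidate in dictionary
        # a length gap larger than max_distance cannot be closed in time
        if abs(len(candidate) - len(word)) <= max_distance
        and levenshtein(word, candidate) <= max_distance
    ]


if __name__ == "__main__":
    print(suggestions("appel", ["ape", "apple", "peach", "puppy"]))
    # ['ape', 'apple']
```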
difflib in the standard library has various utilities for sequence matching, including the get_close_matches method that you could use. It uses an algorithm adapted from Ratcliff and Obershelp.
From the docs
>>> from difflib import get_close_matches
>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
['apple', 'ape']
Here is my version for Levenshtein distance
def edit_distance(s1, s2):
    m=len(s1)+1
    n=len(s2)+1
    tbl = {}
    for i in range(m): tbl[i,0]=i
    for j in range(n): tbl[0,j]=j
    for i in range(1, m):
        for j in range(1, n):
            cost = 0 if s1[i-1] == s2[j-1] else 1
            tbl[i,j] = min(tbl[i, j-1]+1, tbl[i-1, j]+1, tbl[i-1, j-1]+cost)
    # index by the table size so that empty strings work too
    return tbl[m-1,n-1]
print(edit_distance("Helloworld", "HalloWorld"))
# this computes the Levenshtein edit distance with a full DP matrix
word1="rice"
word2="ice"
len_1=len(word1)
len_2=len(word2)
x =[[0]*(len_2+1) for _ in range(len_1+1)]#the matrix whose last element ->edit distance
for i in range(0,len_1+1): #initialization of base case values
    x[i][0]=i
for j in range(0,len_2+1):
    x[0][j]=j
for i in range (1,len_1+1):
    for j in range(1,len_2+1):
        if word1[i-1]==word2[j-1]:
            x[i][j] = x[i-1][j-1]
        else :
            x[i][j]= min(x[i][j-1],x[i-1][j],x[i-1][j-1])+1
print(x[len_1][len_2])
Using the SequenceMatcher from Python built-in difflib is another way of doing it, but (as correctly pointed out in the comments), the result does not match the definition of an edit distance exactly. Bonus: it supports ignoring "junk" parts (e.g. spaces or punctuation).
from difflib import SequenceMatcher
a = 'kitten'
b = 'sitting'
required_edits = [
    code
    for code in (
        SequenceMatcher(a=a, b=b, autojunk=False)
        .get_opcodes()
    )
    if code[0] != 'equal'
]
required_edits
# [
#    # (tag, i1, i2, j1, j2)
#    ('replace', 0, 1, 0, 1), # replace a[0:1]="k" with b[0:1]="s"
#    ('replace', 4, 5, 4, 5), # replace a[4:5]="e" with b[4:5]="i"
#    ('insert', 6, 6, 6, 7), # insert b[6:7]="g" after a[6:6]="n"
# ]
# the edit distance:
len(required_edits) # == 3
I would recommend not creating this kind of code on your own. There are libraries for that.
For instance the Levenshtein library.
In [2]: Levenshtein.distance("foo", "foobar")
Out[2]: 3
In [3]: Levenshtein.distance("barfoo", "foobar")
Out[3]: 6
In [4]: Levenshtein.distance("Buroucrazy", "Bureaucracy")
Out[4]: 3
In [5]: Levenshtein.distance("Misisipi", "Mississippi")
Out[5]: 3
In [6]: Levenshtein.distance("Misisipi", "Misty Mountains")
Out[6]: 11
In [7]: Levenshtein.distance("Buroucrazy", "Born Crazy")
Out[7]: 4
Similar to Santoshi's solution above, but I made three changes:
One-line initialization instead of five
No need to define cost separately (just use the boolean, which counts as 0 or 1)
Instead of a double for loop, use product (this last one is only cosmetic; the double loop seems unavoidable)
from itertools import product
def edit_distance(s1,s2):
    d={ **{(i,0):i for i in range(len(s1)+1)},**{(0,j):j for j in range(len(s2)+1)}}
    for i, j in product(range(1,len(s1)+1), range(1,len(s2)+1)):
        d[i,j]=min((s1[i-1]!=s2[j-1]) + d[i-1,j-1], d[i-1,j]+1, d[i,j-1]+1)
    # index by the string lengths so that empty strings work too
    return d[len(s1),len(s2)]
Instead of computing the Levenshtein distance against every word, use a BK-tree or a trie, as these structures need fewer comparisons than checking the edit distance to each dictionary word. Reading up on these topics will give you a detailed description.
This link will help you more with spell checking.
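To make the BK-tree idea concrete, here is a minimal sketch. It is not taken from any particular library: the class name BKTree, its methods, and the tuple-based node layout are all choices made for this illustration, and it assumes a non-empty word list. Each node stores a word plus its children keyed by their distance to that word. Because the Levenshtein distance obeys the triangle inequality, a search only has to follow children whose key lies within max_dist of the distance between the query and the current node.

```python
def levenshtein(a, b):
    """Levenshtein distance, keeping only the previous row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class BKTree:
    """Each node is a (word, children) pair; children maps distance -> node."""

    def __init__(self, words):
        words = iter(words)
        self.root = (next(words), {})   # assumes at least one word
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        while True:
            d = levenshtein(word, node[0])
            if d == 0:
                return                   # word is already in the tree
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, word, max_dist):
        """Sorted (distance, word) pairs within max_dist of word."""
        found = []
        stack = [self.root]
        while stack:
            node_word, children = stack.pop()
            d = levenshtein(word, node_word)
            if d <= max_dist:
                found.append((d, node_word))
            # triangle inequality: only these subtrees can contain matches
            for key, child in children.items():
                if d - max_dist <= key <= d + max_dist:
                    stack.append(child)
        return sorted(found)


if __name__ == "__main__":
    tree = BKTree(["apple", "ape", "peach", "puppy", "apply", "maple"])
    print(tree.search("appel", 2))
```

The tree is built once for the dictionary and then reused for every misspelled word, so the cost of building it is spread over all lookups.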
You need Minimum Edit Distance for this task.
Following is my version of MED, a.k.a. Levenshtein distance (with a substitution cost of 2).
def MED_character(str1,str2):
    cost=0
    len1=len(str1)
    len2=len(str2)
    #output the length of other string in case the length of any of the string is zero
    if len1==0:
        return len2
    if len2==0:
        return len1
    # zero matrix with one extra row and column for the empty prefix
    accumulator = [[0 for x in range(len2+1)] for y in range(len1+1)]
    # initializing the base cases
    for i in range(0,len1+1):
        accumulator[i][0] = i
    for j in range(0,len2+1):
        accumulator[0][j] = j
    # we take the accumulator and iterate through it row by row.
    for i in range(1,len1+1):
        char1=str1[i-1]
        for j in range(1,len2+1):
            char2=str2[j-1]
            cost1=0
            if char1!=char2:
                cost1=2 #cost for substitution
            accumulator[i][j]=min(accumulator[i-1][j]+1, accumulator[i][j-1]+1, accumulator[i-1][j-1] + cost1 )
    cost=accumulator[len1][len2]
    return cost
Fine-tuned code based on the version from #Santosh, which should address the issue brought up by #Artur Krajewski. The biggest difference is replacing the dictionary with an actual 2D matrix.
def edit_distance(s1, s2):
    # add a blank character for both strings
    m=len(s1)+1
    n=len(s2)+1
    # launch a matrix
    tbl = [[0] * n for i in range(m)]
    for i in range(m): tbl[i][0]=i
    for j in range(n): tbl[0][j]=j
    for i in range(1, m):
        for j in range(1, n):
            #if strings have same letters, set operation cost as 0 otherwise 1
            cost = 0 if s1[i-1] == s2[j-1] else 1
            #take the cheapest of insert, delete and substitute
            tbl[i][j] = min(tbl[i][j-1]+1, tbl[i-1][j]+1, tbl[i-1][j-1]+cost)
    # the bottom-right cell holds the distance between the full strings
    return tbl[m-1][n-1]
edit_distance("birthday", "Birthdayyy")
Following up on #krassowski's answer:
from difflib import SequenceMatcher
def sequence_matcher_edits(word_a, word_b):
    required_edits = [code for code in (
        SequenceMatcher(a=word_a, b=word_b, autojunk=False).get_opcodes()
        )
        if code[0] != 'equal'
    ]
    return len(required_edits)
print(f"sequence_matcher_edits {sequence_matcher_edits('kitten', 'sitting')}")
# -> sequence_matcher_edits 3
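A small check shows why counting opcodes is not the same as an edit distance: a single 'replace' opcode can cover several characters. The sketch below uses two illustrative helpers (opcode_count and opcode_span_cost are names invented here). Summing the span lengths gives the cost of one concrete edit script, so it is an upper bound on the Levenshtein distance but is not guaranteed to equal it.

```python
from difflib import SequenceMatcher


def opcode_count(a, b):
    """Number of non-equal opcodes, as in the snippet above."""
    ops = SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes()
    return sum(1 for tag, *_ in ops if tag != 'equal')


def opcode_span_cost(a, b):
    """Characters touched by the non-equal opcodes (an upper bound on Levenshtein)."""
    ops = SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes()
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in ops
        if tag != 'equal'
    )


if __name__ == "__main__":
    print(opcode_count('abc', 'xyz'))      # 1: one 'replace' covering all three letters
    print(opcode_span_cost('abc', 'xyz'))  # 3: matches the Levenshtein distance here
```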