Regular expression to match the linear system ax+by=c - python

I'm looking for the best regular expression to match a linear equation in two unknowns (ax+by=c) with Python's re module, where a, b and c are positive or negative integers. I need to separate the match into
3 groups, each containing one signed value: group 1 contains the value of a, group 2 the value of b and group 3 the value of c.
e.g.:
for -3x+y=-2, group1 will contain -3, group 2 will contain 1 and group 3 will contain -2
e.g.:
x+3y=-4
-2x+y=2
3x-y=2
...
What I used so far is :
r"(^[+-]?\d*)x([+-]?\d*)y=([+-]?\d*)"
It almost works fine, except when a or b is missing and carries a negative sign.
e.g.:
-x+2y=4
5x-y=3
I have to put 1 before x or y if they're negative to make it work:
-x+2y=4 => -1x+2y=4
5x-y=3 => 5x-1y=3
Python code:
import numpy as np
import re

def solve(eq1, eq2):
    match1 = re.match(r"(^[+-]?\d*)x([+-]?\d*)y=([+-]?\d*)", eq1)
    a1, b1, c1 = match1.groups()
    if a1 is None or a1 == '':
        a1 = 1
    elif a1 == '-':
        a1 = -1
    if b1 is None:
        b1 = 1
    elif b1 == '-':
        b1 = -1
    elif b1 == '+':
        b1 = 1
    a1, b1, c1 = float(a1), float(b1), float(c1)
    match2 = re.match(r"([+-]?\d*)x([+-]?\d*)y=([+-]?\d*)", eq2)
    a2, b2, c2 = match2.groups()
    if a2 is None or a2 == '':
        a2 = 1
    elif a2 == '-':
        a2 = -1
    if b2 is None:
        b2 = 1
    elif b2 == '-':
        b2 = -1
    elif b2 == '+':
        b2 = 1
    a2, b2, c2 = float(a2), float(b2), float(c2)
    A = np.array([[a1, b1], [a2, b2]])
    B = np.array([[c1], [c2]])
    # Matrix product of the inverse of A with B gives the solution vector
    print(np.linalg.inv(A) @ B)

solve("x-y=7","2x+3y=4")
Output:
[[ 5.][-2.]]

Split on the regular expression x|y=, then treat an empty string or a bare + or - as a coefficient of 1 or -1.
import re
ee = ['x+3y=-4', '-2x+y=2', '3x-y=2', '-x+2y=4', '5x-y=3']
for e in ee:
    print([int(m+'1' if m in ['', '+', '-'] else m)
           for m in re.split('x|y=', e)])
Output:
[1, 3, -4]
[-2, 1, 2]
[3, -1, 2]
[-1, 2, 4]
[5, -1, 3]
Update #1:
import numpy as np
import re

def solve(eq1, eq2):
    coeffs = []
    for e in [eq1, eq2]:
        for m in re.split('x|y=', e):
            # '', '+' and '-' become '1', '+1' and '-1' before conversion
            coeffs.append(float(m + '1' if m in '+-' else m))
    a1, b1, c1, a2, b2, c2 = coeffs
    A = np.array([[a1, b1], [a2, b2]])
    B = np.array([[c1], [c2]])
    return np.linalg.inv(A) @ B

print(solve("x-y=7", "2x+3y=4"))
Output:
[[ 5.]
[-2.]]
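Since the question asks for three regex groups, here is a minimal sketch that keeps the grouping and moves the "missing coefficient" handling into a small helper. The names EQUATION, to_int and parse_equation are made up for this illustration, and the pattern assumes the input always has the exact shape ax+by=c with integer coefficients.

```python
import re

# Group 1: a (optional sign, optional digits), group 2: b (mandatory sign),
# group 3: c (optional sign, at least one digit).
EQUATION = re.compile(r"^([+-]?\d*)x([+-]\d*)y=([+-]?\d+)$")

def to_int(token):
    """Turn '', '+' and '-' into 1, 1 and -1; otherwise parse the signed integer."""
    return int(token + "1") if token in ("", "+", "-") else int(token)

def parse_equation(eq):
    """Return (a, b, c) for a string of the form ax+by=c."""
    match = EQUATION.match(eq.replace(" ", ""))
    if match is None:
        raise ValueError(f"not of the form ax+by=c: {eq!r}")
    a, b, c = match.groups()
    return to_int(a), to_int(b), int(c)

for eq in ["x+3y=-4", "-2x+y=2", "3x-y=2", "-x+2y=4", "5x-y=3"]:
    print(eq, "->", parse_equation(eq))
```

Compared with the plain split, the anchored pattern rejects malformed input instead of silently producing the wrong number of coefficients.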

How to implement generator for this function?

For function a(0) = 3, a(n) = 2*a(n-1) -1, generator should be like:
def p():
    b = 3
    while True:
        yield b
        b = 2 * b - 1
So for function c(1) = 9, c(n) = 9*c(n-1) + 10**(n-1)- c(n-1),
how to write the generator for this function?
You are given c(1) = 9. Use the recurrence to work backwards to c(0), which turns out to be 1. Then write a generator that first yields c(0) and c(1); for each later term, apply the formula to the previous value, yield the result, and store it as the new previous value so the sequence can continue.
def generate_seq():
    b0 = 1
    b1 = 9
    n = 2
    yield b0
    yield b1
    while True:
        b1 = 9*b1 + 10**(n-1) - b1
        yield b1
        b0 = b1
        n += 1

seq = generate_seq()
for i in range(10):
    print(next(seq))
OUTPUT:
1
9
82
756
7048
66384
631072
6048576
58388608
567108864
Similar to your original, just the extra power of 10:
def p():
    c = 1
    pow10 = 1
    while True:
        yield c
        c = 8*c + pow10
        pow10 *= 10
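As a quick check that the simplified step c = 8*c + pow10 really matches the original recurrence, the sketch below compares the generator with a literal evaluation of c(n) = 9*c(n-1) + 10**(n-1) - c(n-1). The helper c_direct is a name invented for this comparison, and c(0) = 1 is taken from the derivation in the other answer.

```python
from itertools import islice

def p():
    c = 1        # c(0)
    pow10 = 1    # 10**(n-1) for the next term, starting at n = 1
    while True:
        yield c
        c = 8 * c + pow10
        pow10 *= 10

def c_direct(n):
    """Evaluate the recurrence exactly as written, for cross-checking."""
    value = 1
    for k in range(1, n + 1):
        value = 9 * value + 10 ** (k - 1) - value
    return value

# islice takes the first ten values from the infinite generator.
first_ten = list(islice(p(), 10))
print(first_ten)
assert first_ten == [c_direct(n) for n in range(10)]
```

itertools.islice is a convenient way to take a finite prefix of an infinite generator without writing a counter loop around next().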

Python triplet of numbers

I would like to write a program for Pythagorean triplets. Given numbers a, b, c, the program should return a Pythagorean triple of natural numbers a1, b1, c1 such that a1 >= a, b1 >= b, c1 >= c.
def Triplet(a, b, c):
    a1 = a
    b1 = b
    n = 5
    m = 0
    while True:
        m += 1
        while b1 <= (b + n * m):
            a1 = a
            while a1 <= b1:
                #while c1 > c:
                c1 = (a1 * a1 + b1 * b1) ** .5
                if c1 % 1 == 0:
                    return a1, b1, int(c1)
                a1 += 1
            b1 += 1

print(Triplet(3,4,6))
For input: (3, 4, 6), output should be: (6, 8, 10). Where is the error?
The issue is that you've commented out your incorrect check for c1 > c, but not replaced it with anything.
If you just add that condition back before the return, it works:
def Triplet(a,b,c):
    a1=a
    b1=b
    n=5
    m=0
    while True:
        m+=1
        while b1<=(b+n*m):
            a1=a
            while a1<=b1:
                c1=(a1*a1+b1*b1)**.5
                if c1>=c and c1%1==0:
                    return a1,b1,int(c1)
                a1+=1
            b1+=1

print(Triplet(3,4,6))
If you change the condition to if c1%1==0 and c1>=c: then the issue is fixed.
I ran it locally and got (6, 8, 10).
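The test c1 % 1 == 0 relies on floating-point square roots, which can become unreliable for large inputs. A possible variant, sketched here under the assumption that a, b and c are natural numbers and Python 3.8+ is available for math.isqrt, keeps the same search order but uses integer arithmetic only. The name triplet_exact is invented for this sketch.

```python
from math import isqrt

def triplet_exact(a, b, c):
    """Search b1 = b, b+1, ... and a1 = a..b1 for a Pythagorean triple with c1 >= c."""
    b1 = b
    while True:
        for a1 in range(a, b1 + 1):
            s = a1 * a1 + b1 * b1
            c1 = isqrt(s)              # integer square root, no floats involved
            if c1 * c1 == s and c1 >= c:
                return a1, b1, c1
        b1 += 1

print(triplet_exact(3, 4, 6))
```

For the input (3, 4, 6) this visits the same candidates as the corrected function above and stops at the same triple.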

Parsing specific columns of CSV in python

I have this CSV and I would like to do the following:
Original data:
Parsed Data:
So, to put it in words: if a cell contains commas, I want to create a new row for each single value and delete the row that has multiple values.
For example, N2 has I1, I3 and I4, so the new data gets 3 rows for N2, each containing one value only.
I want to make it dynamic in such a way that all the permutations are reflected. Like in the case of N3 that has 2 places and 2 items.
I am trying to use python's pandas to do this. Some help would be appreciated.
Here is another option:
df['Place'] = df['Place'].str.split(',')
df['Item'] = df['Item'].str.split(',')
exploded = pd.DataFrame([
    a + [p, t] for *a, P, T in df.values
    for p in P for t in T
], columns=df.columns)
And the output:
Name Place Item
0 N1 P1 I1
1 N2 P2 I1
2 N2 P2 I3
3 N2 P2 I4
4 N3 P2 I2
5 N3 P2 I5
6 N3 P3 I2
7 N3 P3 I5
You are effectively attempting to take the Cartesian product of each row, then binding the result back into a DataFrame. As such, you could use itertools and do something like
from itertools import chain, product
df_lists = df.applymap(lambda s: s.split(','))
pd.DataFrame(chain.from_iterable(df_lists.apply(lambda row: product(*row), axis=1)), columns=df.columns)
With your example input:
In [334]: df
Out[334]:
Name Place Item
0 N1 P1 I1
1 N2 P2 I1,I3,I4
2 N3 P2,P3 I2,I5
In [336]: df_lists = df.applymap(lambda s: s.split(','))
In [337]: pd.DataFrame(chain.from_iterable(df_lists.apply(lambda row: product(*row), axis=1)), columns=df.columns)
Out[337]:
Name Place Item
0 N1 P1 I1
1 N2 P2 I1
2 N2 P2 I3
3 N2 P2 I4
4 N3 P2 I2
5 N3 P2 I5
6 N3 P3 I2
7 N3 P3 I5
You can use iterrows() :
# Note: DataFrame.append was removed in pandas 2.0, so this needs an older pandas.
df = pd.DataFrame({'Name': ['N1', 'N2', 'N3'], 'Place':['P1', 'P2','P2,P3'], 'Item':['I1,', 'I1,I3,I4', 'I2,I5']})
result = pd.DataFrame()
new_result = pd.DataFrame()
# Drop stray leading/trailing commas such as the one in 'I1,'
df['Place'] = df['Place'].apply(lambda x: x.strip(','))
df['Item'] = df['Item'].apply(lambda x: x.strip(','))
# First pass: one row per place
for a,b in df.iterrows():
    curr_row = df.iloc[a]
    temp = ((curr_row['Place'].split(',')))
    for x in temp:
        curr_row['Place'] = x
        result = result.append(curr_row, ignore_index=True)
# Second pass: one row per item
for a,b in result.iterrows():
    curr_row = result.iloc[a]
    temp = ((curr_row['Item'].split(',')))
    for x in temp:
        curr_row['Item'] = x
        new_result = new_result.append(curr_row, ignore_index=True)
Output:
Name Place Item
0 N1 P1 I1
1 N2 P2 I1
2 N2 P2 I3
3 N2 P2 I4
4 N3 P2 I2
5 N3 P2 I5
6 N3 P3 I2
7 N3 P3 I5
This is the simplest way you can achieve your desired output.
You can avoid pandas altogether. If you want to stick with the standard csv module, you simply have to split each field on the comma (',') and then iterate over the resulting elements.
The code could look like this, assuming the input delimiter is a semicolon (;) (I cannot know what it is, except that it cannot be a comma):
import csv

with open('input.csv', newline='') as fd, open('output.csv', 'w', newline='') as fdout:
    rd = csv.DictReader(fd, delimiter=';')
    wr = csv.writer(fdout)
    _ = wr.writerow(rd.fieldnames)
    for row in rd:
        # One output row per (name, place, item) combination, skipping empty pieces
        for i in row['Item'].split(','):
            i = i.strip()
            if len(i) != 0:
                for p in row['Place'].split(','):
                    p = p.strip()
                    if len(p) != 0:
                        for n in row['Name'].split(','):
                            n = n.strip()
                            if len(n) != 0:
                                wr.writerow((n,p,i))
Output is:
Name,Place,Item
N1,P1,I1
N2,P2,I1
N2,P2,I3
N2,P2,I4
N3,P2,I2
N3,P3,I2
N3,P2,I5
N3,P3,I5
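The three nested loops can also be collapsed with itertools.product, still without pandas. The sketch below works on in-memory dictionaries instead of files so it can be run as-is; explode_row and the sample rows are illustrative names and data based on the example in the question. Note that product varies the last column fastest, so the rows come out in Name, Place, Item order rather than in the order shown above.

```python
from itertools import product

def explode_row(row, columns=("Name", "Place", "Item")):
    """Yield one tuple per combination of the comma-separated values in row."""
    value_lists = [
        [v.strip() for v in row[col].split(",") if v.strip()]
        for col in columns
    ]
    yield from product(*value_lists)

rows = [
    {"Name": "N1", "Place": "P1", "Item": "I1"},
    {"Name": "N2", "Place": "P2", "Item": "I1,I3,I4"},
    {"Name": "N3", "Place": "P2,P3", "Item": "I2,I5"},
]

exploded = [combo for row in rows for combo in explode_row(row)]
for combo in exploded:
    print(",".join(combo))
```

The same helper can be fed rows from csv.DictReader, since those are dictionaries keyed by column name as well.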

Why is an error shown saying a2 is not defined?

# find numbers and compare them with each other
# to see if they are bigger or smaller.
x = input("Your First Capacity? ")
y = input("Your Second Capacity? ")
z = input("Your Required Capacity? ")
x = int(x)
y = int(y)
z = int(z)
if x <= z:
    if y != z:
        if x != z:
            a1 = x
            b1 = y
if y <= z:
    if x != z:
        if y != z:
            a1 = y
            b1 = x
# if one number is the same the code doesn't carry on
if (x == z) or (y == z):
    print("Required Capasity Already Reached")
    a1 = 0
    b1 = 0
# statements for making a2 = the remaining of a1 for 0-10.
if a1 == 0:
    a2 = a1
# statements for making b2 = the remaining of b1 from 0-10.
if b1 == 0:
    b2 = b1
a1 = a1 - 1
b1 = b1 - 1
Why is print(a2) returning 0 instead of 1 if a1 = 3?
Thanks in advance.
You get the error because your if statements compare the input to a string, even though you have already converted the input to integers. You should instead use
if a1 == 1: # Notice no quotation marks around the number. Important!!
    a2 = a1 - 1
because you've used:
x = int(x)
y = int(y)
z = int(z)
You need to change this for all if statements.
Good luck!
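To see why this produces a "not defined" error, here is a small sketch. It assumes, as the answer above implies, that the original code compared a1 with a quoted number such as "1"; the function names and the default value in remaining_fixed are invented for illustration.

```python
def remaining(a1):
    # An int never equals a str in Python 3, so this branch is skipped
    # whenever a1 is an int, and a2 is never assigned.
    if a1 == "1":
        a2 = a1 - 1
    return a2  # raises UnboundLocalError (a subclass of NameError) if a2 was never set

def remaining_fixed(a1):
    a2 = a1          # assumed default so a2 has a value on every path
    if a1 == 1:      # compare int with int
        a2 = a1 - 1
    return a2

try:
    remaining(1)
except NameError as exc:
    print("error:", exc)

print(remaining_fixed(1))
print(remaining_fixed(3))
```

The general point is that a variable assigned only inside an if block does not exist when the condition is false, so either give it a default first or cover every case with else.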

Error upon converting a pandas dataframe to spark DataFrame

I created a pandas DataFrame out of some Stack Overflow posts, using lxml.etree to separate the code_blocks from the text_blocks. The code below shows the basic outline:
# Python 2 / PySpark code: the tuple-unpacking lambdas below are not valid in Python 3.
import lxml.etree
import nltk
import pandas as pd

a1 = tokensentRDD.map(lambda (a,b): (a,''.join(map(str,b))))
# Decode the escaped angle brackets back into real tags
a2 = a1.map(lambda (a,b): (a, b.replace("&lt;", "<")))
a3 = a2.map(lambda (a,b): (a, b.replace("&gt;", ">")))

def parsefunc (x):
    html = lxml.etree.HTML(x)
    code_block = html.xpath('//code/text()')
    text_block = html.xpath('// /text()')
    a4 = code_block
    a5 = len(code_block)
    a6 = text_block
    a7 = len(text_block)
    a8 = ''.join(map(str,text_block)).split(' ')
    a9 = len(a8)
    a10 = nltk.word_tokenize(''.join(map(str,text_block)))
    numOfI = 0
    numOfQue = 0
    numOfExclam = 0
    for x in a10:
        if x == 'I':
            numOfI +=1
        elif x == '?':
            numOfQue +=1
        elif x == '!':
            numOfExclam += 1
    return (a4,a5,a6,a7,a9,numOfI,numOfQue, numOfExclam)

a11 = a3.take(6)
a12 = map(lambda (a,b): (a, parsefunc(b)), a11)
columns = ['code_block', 'len_code', 'text_block', 'len_text', 'words#text_block', 'numOfI', 'numOfQ', 'numOfExclam']
index = map(lambda x:x[0], a12)
data = map(lambda x:x[1], a12)
df = pd.DataFrame(data = data, columns = columns, index = index)
df.index.name = 'Id'
df
code_block len_code text_block len_text words#text_block numOfI numOfQ numOfExclam
Id
4 [decimal 3 [I want to use a track-bar to change a form's ... 18 72 5 1 0
6 [div, ] 5 [I have an absolutely positioned , div, conta... 22 96 4 4 0
9 [DateTime] 1 [Given a , DateTime, representing a person's ... 4 21 2 2 0
11 [DateTime] 1 [Given a specific , DateTime, value, how do I... 12 24 2 1 0
I need to create a Spark DataFrame in order to apply machine learning algorithms to the output. I tried:
sqlContext.createDataFrame(df).show()
The error I receive is:
TypeError: not supported type: <class 'lxml.etree._ElementStringResult'>
Can someone tell me the proper way to convert a pandas DataFrame into a Spark DataFrame?
Your problem is not related to pandas. Both code_block (a4) and text_block (a6) contain lxml-specific objects, which cannot be encoded using Spark SQL types. Converting these to plain strings should be enough.
a4 = [str(x) for x in code_block]
a6 = [str(x) for x in text_block]
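The effect of the str(x) conversion can be illustrated without lxml or Spark. The class below is a hypothetical stand-in for the lxml result type named in the error message, under the assumption that it behaves like a subclass of str; it is not the real lxml class.

```python
class ElementStringResult(str):
    """Hypothetical stand-in for the string-like objects returned by xpath()."""

code_block = [ElementStringResult("decimal"), ElementStringResult("DateTime")]
print([type(x).__name__ for x in code_block])

# str(x) returns a plain built-in str, dropping the subclass type
plain = [str(x) for x in code_block]
print([type(x).__name__ for x in plain])

assert all(type(x) is str for x in plain)
assert plain == ["decimal", "DateTime"]
```

The values look identical when printed, which is why the problem is easy to miss: only the exact type differs, and that is what a strict type-to-schema mapping checks.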
