I have a g-code written in Fanuc g-code format including Macro-B (more info here), for example
#101 = 2.0 (first variable)
#102 = 0.1 (second variable)
#103 = [#101 + #102 * 3] (third variable using simple arithmetic)
G01 X#101 Y#103 F0.1
which should be converted to:
G01 X2.0 Y2.3 F0.1
more elaborate examples here and here.
things to be changed:
all instances of a variable slot should be replaced with its value:
(#\d+)\s*=\s*(-?\d*\.\d+|\d+\.\d*)
arithmetic +, -, * and / inside the [...] need to be calculated:
(#\d+)\s*=\s*\[(#\d+|(-?\d*\.\d+|\d+\.\d*))(\s*[+\-*/]\s*(#\d+|(-?\d*\.\d+|\d+\.\d*|\d+)))*\]
comments (...) could be ignored or removed.
I would appreciate it if you could help me work out how I can do this in Python and whether the regex I have above is correct. Thanks in advance for your support.
P.S.1. Unfortunately I can't find the syntax highlighting for fenced code blocks for g-code
P.S.2. When converting floats to strings, one should account for Python's floating-point representation issues. I wrote this function to handle that:
def f32str(inputFloat):
    """
    This function converts a Python float to a string with 3 decimals
    """
    return f"{inputFloat:.3f}"
OK, I found a piece of code which does the job. Assuming gcode is a multiline string read from a Fanuc G-code file:
import re
import os
def f32str(inputFloat):
    return f"{inputFloat:.3f}"
# Strip (...) comments
gcode = re.sub(r"\(.*?\)", "", gcode)
flag = len(re.findall(r"#\d+", gcode))
while 0 < flag:
    # Plain assignments such as "#101 = 2.0": drop the assignment, then substitute the value
    cases = re.findall(r"((#\d+)\s*=\s*([-+]?\s*(\d*\.\d+|\d+\.?\d*)))", gcode)
    for case in cases:
        gcode = gcode.replace(case[0], "")
        gcode = gcode.replace(case[1], case[2])
    # Bracketed arithmetic that now contains only numbers
    cases = re.findall(r"(\[(\s*[+-]?\s*(\d+(\.\d*)?|\d*\.\d+)(\s*[-+*\/]\s*[+-]?\s*(\d+(\.\d*)?|\d*\.\d+)\s*)*)\])", gcode)
    for case in cases:
        gcode = gcode.replace(case[0], f32str(eval(case[1])))
    flag = len(re.findall(r"#\d+", gcode))
# Drop the lines left empty by the removed assignments
gcode = os.linesep.join([s for s in gcode.splitlines() if s.strip()])
This is probably the worst way to do this, and there should be more efficient implementations. I will leave the rest to the Python experts.
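For comparison, here is a minimal sketch of a line-by-line approach that keeps variables in a dict and avoids eval(). It is not a Fanuc-complete implementation: expand_macros and _calc are hypothetical names, Python 3.8+ is assumed, only +, -, * and / are handled, every variable must be assigned before it is used, and (like f32str above) each substituted value is rounded to a fixed number of decimals.

```python
import ast
import operator
import re

# Binary operators allowed inside [...]; anything else is rejected.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_COMMENT = re.compile(r"\(.*?\)")
_VARIABLE = re.compile(r"#(\d+)")
_BRACKET = re.compile(r"\[([^\[\]]*)\]")  # an innermost [...] group
_ASSIGNMENT = re.compile(r"#(\d+)\s*=\s*(.+)")


def _calc(expr):
    """Evaluate +, -, * and / on plain numbers without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.UAdd, ast.USub)):
            value = walk(node.operand)
            return -value if isinstance(node.op, ast.USub) else value
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr.strip(), mode="eval").body)


def expand_macros(gcode, decimals=3):
    """Return gcode with #n variables and [...] arithmetic replaced by numbers."""
    variables = {}

    def number(value):
        return f"{value:.{decimals}f}"

    def resolve(text):
        # Substitute variables first (KeyError if one is undefined),
        # then collapse [...] groups from the innermost outwards.
        text = _VARIABLE.sub(lambda m: number(variables[m.group(1)]), text)
        while _BRACKET.search(text):
            text = _BRACKET.sub(lambda m: number(_calc(m.group(1))), text)
        return text

    output = []
    for raw in gcode.splitlines():
        line = _COMMENT.sub("", raw).strip()  # drop (...) comments
        if not line:
            continue
        assignment = _ASSIGNMENT.fullmatch(line)
        if assignment:
            # Store the value; assignment lines are not copied to the output.
            variables[assignment.group(1)] = _calc(resolve(assignment.group(2)))
        else:
            output.append(resolve(line))
    return "\n".join(output)
```

Called on the example from the question, expand_macros returns the single line G01 X2.000 Y2.300 F0.1. Processing the lines in order also means a variable that is reassigned later in the program only affects the lines after the reassignment, which the global string replacement above does not guarantee.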
Can someone let me know how to pull out certain values from a Python output?
I would like to retrieve the value 'ocweeklyreports' from the following output using either indexing or slicing:
'config': '{"hiveView":"ocweeklycur.ocweeklyreports"}
This should be relatively easy; however, I'm having a problem defining the slicing / indexing configuration.
The following will successfully give me 'ocweeklyreports':
myslice = config['hiveView'][12:30]
However, I need the indexing or slicing modified so that I get any value after 'ocweeklycur'.
I'm not sure what output you're dealing with and how robust you're wanting it but if it's just a string you can do something similar to this (for a quick and dirty solution).
text = "Your input"
index_start = text.index('.') + 1  # Index just past the '.', which is where the value you want starts
final_response = text[index_start:-2]  # Stop before the trailing '"}'
print(final_response)  # Prints ocweeklyreports
Again, not the most elegant solution but hopefully it helps or at least offers a starting point. Another more robust solution would be to use regex but I'm not that skilled in regex at the moment.
You could do almost all of it using regex.
See if this helps:
import re
def search_word(di):
    st = di["config"]["hiveView"]
    # Capture the word that follows the literal "ocweeklycur." prefix
    p = re.compile(r'^ocweeklycur\.(?P<word>\w+)')
    m = p.search(st)
    return m.group('word')
if __name__=="__main__":
    d = {'config': {"hiveView":"ocweeklycur.ocweeklyreports"}}
    print(search_word(d))
The following worked best for me:
# Extract the value of the "hiveView" key
hive_view = config['hiveView']
# Split the string on the '.' character
parts = hive_view.split('.')
# The value you want is the second part of the split string
desired_value = parts[1]
print(desired_value) # Output: "ocweeklyreports"
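If the 'config' value is really a JSON string, as the quoted output suggests (that is an assumption), a minimal sketch could decode it first and then keep everything after the first '.'. The helper name table_name is hypothetical:

```python
import json


def table_name(config):
    """Return the part of hiveView that follows the first '.'."""
    if isinstance(config, str):
        # Assumption: a string value holds JSON such as '{"hiveView": "db.table"}'
        config = json.loads(config)
    _, _, name = config["hiveView"].partition(".")
    return name


print(table_name('{"hiveView":"ocweeklycur.ocweeklyreports"}'))
```

Unlike a fixed slice such as [12:30], str.partition does not depend on the length of either name, and it returns an empty string instead of raising when there is no '.' at all.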
I'm relatively new to Sympy and had a lot of trouble with the information that I was able to scavenge on this site. My main goal is basically to take a string, representing some mathematical expression, and then save an image of that expression but in a cleaner form.
So for example, if this is the expression string:
"2**x+(3-(4*9))"
I want it to display like this
cleaner image.
This is currently the code that I have written in order to achieve this, based off of what I was able to read on StackExchange:
from matplotlib import pylab
from sympy.parsing.sympy_parser import parse_expr
from sympy.plotting import plot
from sympy.printing.preview import preview
class MathString:
    def __init__(self, expression_str: str):
        self.expression_str = expression_str
    @property
    def expression(self):
        return parse_expr(self.expression_str)
    def plot_expression(self):
        return plot(self.expression)
    def save_plot(self):
        self.plot_expression().saveimage("imagePath",
                                         format='png')
And then using a main function:
def main():
    test_expression = '2**x+(3-(4*9))'
    test = MathString(test_expression)
    test.save_plot()
main()
However, when I run main(), it just shows me an actual graphical plot of the equation I provided. I've tried multiple other solutions, but the errors ranged from my environment not supporting LaTeX to the fact that I am trying to pass the expression as a string.
Please help! I'm super stuck and do not understand what I am doing wrong! Given a certain address path where I can store my images, how can I save an image of the displayed expression using Sympy/MatPlotLib/whatever other libraries I may need?
The program in your question does not convert the expression from
string format to the sympy internal format. See below for examples.
Also, sympy has capabilities to detect what works best in your
environment. Running the following program in Spyder 5.0 with an
iPython 7.22 terminal, I get the output in Unicode format.
from sympy import *
my_names = 'x'
x = symbols(','.join(my_names))
ns1 = {'x': x}
my_symbols = tuple(ns1.values())
es1 = '2**x+(3-(4*9))'
e1 = sympify(es1, locals=ns1)
e2 = sympify(es1, locals=ns1, evaluate=False)
print(f"String {es1} with symbols {my_symbols}",
      f"\n\tmakes expression {e1}",
      f"\n\tor expression {e2}")
# print(simplify(e1))
init_printing()
pprint(e2)
Output (in Unicode):
# String 2**x+(3-(4*9)) with symbols (x,)
# makes expression 2**x - 33
# or expression 2**x - 4*9 + 3
#  x
# 2  - 4⋅9 + 3
Introduction to the problem
I have inputs in a .txt file and I want to 'extract' the values when a velocity is given.
Inputs have the form: velocity\tval1\tval2...\tvaln
[...]
16\t1\t0\n
1.0000\t9.3465\t8.9406\t35.9604\n
2.0000\t10.4654\t9.9456\t36.9107\n
3.0000\t11.1235\t10.9378\t37.1578\n
[...]
What have I done
I have written a piece of code to return values when a velocity is requested:
def values(input,velocity):
    return re.findall("\n"+str(velocity)+".*",input)[-1][1:]
It works "backwards" because I want to ignore the first row from the inputs (16\t1\t0\n), this way if I call:
>>> values('inputs.txt',16)
16.0000\t0.5646\t14.3658\t1.4782\n
But it has a big problem: if I call the function for 1, it returns the value for 19.0000
Since I thought all inputs would be in the same format I made a little fix:
def values(input,velocity):
    if velocity <= 5: #Because velocity goes to 50
        velocity = str(velocity)+'.0'
    # str() keeps the concatenation valid when velocity is still an int (velocity > 5)
    return re.findall("\n"+str(velocity)+".*",input)[-1][1:]
And it works pretty well. Maybe it is not the most beautiful (or efficient) way of doing it, but I'm a beginner.
The problem
But with this code I have a problem: sometimes the inputs have this form:
[...]
16\t1\t0\n
1\t9.3465\t8.9406\t35.9604\n
2\t10.4654\t9.9456\t36.9107\n
3\t11.1235\t10.9378\t37.1578\n
[...]
And, of course, my solution doesn't work.
So, is there any pattern that fits both kinds of inputs?
Thank you for your help.
P.S. I have a solution using the function split('\n') and indexes, but I would like to solve it with the re library:
def values(input,velocity):
    return input.split('\n')[velocity+1] #+1 to avoid first row
You could use a positive lookahead to check that your velocity is followed by either a period or a tab. That stops the pattern from matching longer numbers (such as 19 when you ask for 1) without hardcoding the .0, so velocity 1 matches both 1 and 1.xxxxx.
import re
from typing import List
def find_by_velocity(velocity: int, data: str) -> List[str]:
    return re.findall(r"\n" + str(velocity) + r"(?=\.|\t).*", data)
data = """16\t1\t0\n1\t9.3465\t8.9406\t35.9604\n2\t10.4654\t9.9456\t36.9107\n3\t11.1235\t10.9378\t37.1578\n16\t1\t0\n1.0000\t9.3465\t8.9406\t35.9604\n2.0000\t10.4654\t9.9456\t36.9107\n3.0000\t11.1235\t10.9378\t37.1578\n"""
print(find_by_velocity(1, data))
OUTPUT
['\n1\t9.3465\t8.9406\t35.9604', '\n1.0000\t9.3465\t8.9406\t35.9604']
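Building on the same idea, here is a minimal sketch that anchors on line starts with re.MULTILINE instead of a literal "\n", and returns the values of the matching row as floats. The name row_for_velocity is hypothetical, and the sketch assumes the velocity is an integer whose decimal part, if written at all, consists only of zeros (1 or 1.0000). As in the question, the last matching line wins, so the header row is ignored whenever a real row for that velocity exists.

```python
import re


def row_for_velocity(data, velocity):
    """Return the values of the last row whose first column equals velocity."""
    # ^ matches at every line start with re.MULTILINE;
    # (?:\.0*)? accepts "1" as well as "1.0000", and \t ends the first column.
    pattern = re.compile(
        r"^" + re.escape(str(velocity)) + r"(?:\.0*)?\t(.*)$", re.MULTILINE
    )
    matches = pattern.findall(data)
    if not matches:
        return None
    return [float(value) for value in matches[-1].split("\t")]


data = "16\t1\t0\n1.0000\t9.3465\t8.9406\t35.9604\n2\t10.4654\t9.9456\t36.9107\n"
print(row_for_velocity(data, 1))
print(row_for_velocity(data, 2))
```

Requiring the tab right after the optional ".0000" is what keeps velocity 1 from matching 16 or 19.0000.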
I have to write a function that takes a string and returns the string with added asterisks ("*" symbols) to signal multiplication.
As we know, 4(3) is another way to show multiplication, as well as 4*3 or (4)(3) or 4*(3) etc. Anyway, my code needs to fix that problem by adding an asterisk between the 4 and the 3 when multiplication is shown with parentheses but without the multiplication operator "*".
Some examples:
"4(3)" -> "4*(3)"
"(4)(3)" -> "(4)*(3)"
"4*2 + 9 -4(-3)" -> "4*2 + 9 -4*(-3)"
"(-9)(-2) (4)" -> "(-9)*(-2) *(4)"
"4^(3)" -> "4^(3)"
"(4-3)(4+2)" -> "(4-3)*(4+2)"
"(Aflkdsjalkb)(g)" -> "(Aflkdsjalkb)*(g)"
"g(d)(f)" -> "g*(d)*(f)"
"(4) (3)" -> "(4)*(3)"
I'm not exactly sure how to do this. I am thinking about finding each left parenthesis and simply adding a "*" at that location, but that wouldn't work: the start of my fourth example would output "*(-9)", which I don't want, and my fifth example would output "4^*(3)". Any ideas on how to solve this problem? Thank you.
Here's something I've tried, and obviously it doesn't work:
while index < len(stringtobeconverted):
    parenthesis = stringtobeconverted[index]
    if parenthesis == "(":
        stringtobeconverted[index-1] = "*"
In [14]: import re
In [15]: def add_multiplies(input_string):
    ...:     # Insert '*' before '(' unless an operator or another '(' precedes it
    ...:     return re.sub(r'([^-+*/^(])\(', r'\1*(', input_string)
    ...:
In [16]: for example in examples:
    ...:     print(f"{example} -> {add_multiplies(example)}")
    ...:
4(3) -> 4*(3)
(4)(3) -> (4)*(3)
4*2 + 9 -4(-3) -> 4*2 + 9 -4*(-3)
(-9)(-2) (4) -> (-9)*(-2) *(4)
4^(3) -> 4^(3)
(4-3)(4+2) -> (4-3)*(4+2)
(Aflkdsjalkb)(g) -> (Aflkdsjalkb)*(g)
g(d)(f) -> g*(d)*(f)
(g)-(d) -> (g)-(d)
tl;dr: Rather than thinking of this as a string transformation, you might:
Parse an input string into an abstract representation.
Generate a new output string from the abstract representation.
Parse input to create an abstract syntax tree, then emit the new string.
Generally you should:
Create a logical representation for the mathematical expressions. You'll want to build an abstract syntax tree (AST) to represent each expression. For example,
2(3(4)+5)
could form a tree like:
      *
     / \
    2   +
       / \
      *   5
     / \
    3   4
where each node in that tree (2, 3, 4, 5, both *'s, and the +) is an object that has references to its child objects.
Write the logic for parsing the input. Write logic that can parse "2(3(4)+5)" into an abstract syntax tree that represents what it means.
Write the logic to serialize the data. Now that you've got the data in conceptual form, you can write methods that convert it into a new, desired format.
Note: String transformations might be easier for quick scripting.
As other answers have shown, direct string transformations can be easier if all you need is a quick script, e.g. you have some text you just want to reformat real quick. For example, as @PaulWhipp's answer demonstrates, regular expressions can make such scripting really quick and easy.
That said, for professional projects, you'll generally want to parse data into an abstract representation before emitting a new representation. String-transform tricks don't generally scale well with complexity, and they can be both functionally limited and pretty error-prone outside of simple cases.
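To make the parse-then-emit idea concrete, here is a minimal sketch of a recursive-descent parser that treats adjacent operands as multiplication, plus an emitter that writes the tree back with explicit "*". All names (Leaf, Neg, BinOp, Parser, emit, add_explicit_multiplication) are hypothetical, identifiers are treated as variables rather than function names, and because the tree does not store the original parentheses, the output keeps only the parentheses that are needed.

```python
import re
from dataclasses import dataclass

TOKEN = re.compile(r"\s*(\d+\.?\d*|[A-Za-z_]\w*|[-+*/^()])")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}


@dataclass
class Leaf:
    text: str


@dataclass
class Neg:
    operand: object


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = BinOp(self.take(), node, self.term())
        return node

    def term(self):  # term := factor (('*' | '/' | nothing) factor)*
        node = self.factor()
        while True:
            token = self.peek()
            if token in ("*", "/"):
                node = BinOp(self.take(), node, self.factor())
            elif token is not None and (token == "(" or token[0].isalnum() or token[0] == "_"):
                node = BinOp("*", node, self.factor())  # implicit multiplication
            else:
                return node

    def factor(self):  # factor := '-' factor | atom ('^' factor)?
        if self.peek() == "-":
            self.take()
            return Neg(self.factor())
        base = self.atom()
        if self.peek() == "^":
            self.take()
            return BinOp("^", base, self.factor())
        return base

    def atom(self):  # atom := NUMBER | NAME | '(' expr ')'
        token = self.take()
        if token == "(":
            node = self.expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        if token is None or token in {"+", "-", "*", "/", "^", ")"}:
            raise ValueError(f"unexpected token {token!r}")
        return Leaf(token)


def emit(node):
    if isinstance(node, Leaf):
        return node.text
    if isinstance(node, Neg):
        inner = emit(node.operand)
        return "-" + (inner if isinstance(node.operand, Leaf) else f"({inner})")

    def side(child, is_right):
        text = emit(child)
        if isinstance(child, Neg):
            needs = is_right or node.op == "^"
        elif isinstance(child, BinOp):
            # On a precedence tie, parenthesize the side that would otherwise regroup.
            tie = (not is_right) if node.op == "^" else is_right
            lower = PRECEDENCE[child.op] < PRECEDENCE[node.op]
            needs = lower or (PRECEDENCE[child.op] == PRECEDENCE[node.op] and tie)
        else:
            needs = False
        return f"({text})" if needs else text

    return side(node.left, False) + node.op + side(node.right, True)


def add_explicit_multiplication(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return emit(tree)
```

With this sketch, add_explicit_multiplication("2(3(4)+5)") returns "2*(3*4+5)" and "(4-3)(4+2)" becomes "(4-3)*(4+2)". Whitespace and redundant parentheses from the input are not preserved, which is the usual trade-off of going through an abstract representation.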
I'll share mine.
def insertAsteriks(string):
    result = []
    for i, char in enumerate(string):
        prev = string[i - 1] if i > 0 else ''
        before_prev = string[i - 2] if i > 1 else ''
        # Building a new list avoids inserting into the list while iterating over it
        if char == '(' and (prev == ')' or prev.isdigit() or prev.isalpha() or (prev == ' ' and before_prev not in "*^-+/")):
            result.append('*')
        result.append(char)
    return ''.join(result)
Let's check against your inputs.
print(insertAsteriks("4(3)"))
print(insertAsteriks("(4)(3)"))
print(insertAsteriks("4*2 + 9 -4(-3)"))
print(insertAsteriks("(-9)(-2) (4)"))
print(insertAsteriks("(4)^(-3)"))
print(insertAsteriks("ABC(DEF)"))
print(insertAsteriks("g(d)(f)"))
print(insertAsteriks("(g)-(d)"))
The output is:
4*(3)
(4)*(3)
4*2 + 9 -4*(-3)
(-9)*(-2) *(4)
(4)^(-3)
ABC*(DEF)
g*(d)*(f)
(g)-(d)
One way would be to use a simple replacement. The cases to be replaced are:
)( -> )*(
N( -> N*(
)N -> )*N
Assuming you want to preserve whitespace as well, you need to find all patterns on the left side with an arbitrary number of spaces in between and replace that with the same number of spaces less one plus the asterisk at the end. You can use a regex for that.
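A minimal sketch of those three replacements with re.sub follows. The name add_asterisks is hypothetical, N is taken to mean any digit, letter or underscore, and whitespace between the two sides is kept as-is rather than reduced by one space, so this is one possible reading of the rule above.

```python
import re


def add_asterisks(expression):
    # ")(" and "N(": an operand or ")" followed (after optional spaces) by "("
    expression = re.sub(r"([\w)])(\s*)\(", r"\1\2*(", expression)
    # ")N": a closing parenthesis followed (after optional spaces) by an operand
    expression = re.sub(r"\)(\s*)(\w)", r")\1*\2", expression)
    return expression


for example in ["4(3)", "(-9)(-2) (4)", "4^(3)", "(4)3", "(g)-(d)"]:
    print(example, "->", add_asterisks(example))
```

Because names count as operands here, a function call such as sin(x) would also be rewritten to sin*(x). That matches the "g(d)(f)" example in the question but would be wrong for input that contains real function calls.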
A more fun way would be to use a kind of recursion with fake linked lists :) You have entities and operators. An entity can be a number by itself or anything enclosed in parentheses. Anything else is an operator. How about something like this:
For each string, find all entities and operators (keep them in a list for example)
Then for each entity see if there are more entities inside.
Keep doing that until there are no more entities left in any entities.
Then, starting from the very bottom (that is, the smallest of the entities), see if there is an operator between two adjacent entities; if there is not, insert an asterisk there. Do that all the way up to the top level. Then start from the bottom again and reassemble all the pieces.
Here is some code, tested on your examples:
i = 0
input_string = "(4-3)(4+2)"
output_string = ""
while i < len(input_string):
    if input_string[i] == "(" and i != 0:
        if input_string[i-1] in ")1234567890":
            output_string += "*("
        else:
            output_string += input_string[i]
    else:
        output_string += input_string[i]
    i += 1
print(output_string)
The key here is to understand the logic you want to achieve, which is in fact quite simple: you just want to add a "*" before an opening parenthesis, based on a few conditions.
Hope that helps !
I was given the following problem where I had to match the logdata and the expected_result. The code is as follows, edited with my solution and comments containing feedback I received:
import collections
log_data = """1.1.2014 12:01,111-222-333,454-333-222,COMPLETED
1.1.2014 13:01,111-222-333,111-333,FAILED
1.1.2014 13:04,111-222-333,454-333-222,FAILED
1.1.2014 13:05,111-222-333,454-333-222,COMPLETED
2.1.2014 13:01,111-333,111-222-333,FAILED
"""
expected_result = {
    "111-222-333": "40.00%",
    "454-333-222": "66.67%",
    "111-333" : "0.00%"
}
def compute_success_ratio(logdata):
    #! better option to use .splitlines()
    #! or even better recognize the CSV structure and use csv.reader
    entries = logdata.split('\n')
    #! interesting choice to collect the data first
    #! which could result in explosive growth of memory hunger, are there
    #! alternatives to this structure?
    complst = []
    faillst = []
    #! probably no need for attaching `lst` to the variable name, no?
    for entry in entries:
        #! variable naming could be clearer here
        #! a good way might involve destructuring the entry like:
        #! _, caller, callee, result
        #! which also avoids using magic indices further down (-1, 1, 2)
        ent = entry.split(',')
        if ent[-1] == 'COMPLETED':
            #! complst.extend(ent[1:3]) for even more brevity
            complst.append(ent[1])
            complst.append(ent[2])
        elif ent[-1] == 'FAILED':
            faillst.append(ent[1])
            faillst.append(ent[2])
    #! variable postfix `lst` could let us falsely assume that the result of set()
    #! is a list.
    numlst = set(complst + faillst)
    #! good use of collections.Counter,
    #! but: Counter() already is a dictionary, there is no need to convert it to one
    comps = dict(collections.Counter(complst))
    fails = dict(collections.Counter(faillst))
    #! variable naming overlaps with global, and doesn't make sense in this context
    expected_result = {}
    for e in numlst:
        #! good: dealt with possibility of a number not showing up in `comps` or `fails`
        #! bad: using a try/except block to deal with this when a simpler .get(e, 0)
        #! would've allowed dealing with this more elegantly
        try:
            #! variable naming not very expressive
            rat = float(comps[e]) / float(comps[e] + fails[e]) * 100
            perc = round(rat, 2)
            #! here we are rounding twice, and then don't use the formatting string
            #! to attach the % -- '{:.2f}%'.format(perc) would've been the right
            #! way if one doesn't know percentage formatting (see below)
            expected_result[e] = "{:.2f}".format(perc) + '%'
            #! a generally better way would be to either
            #! from __future__ import division
            #! or to compute the ratio as
            #! ratio = float(comps[e]) / (comps[e] + fails[e])
            #! and then use percentage formatting for the ratio
            #! "{:.2%}".format(ratio)
        except KeyError:
            expected_result[e] = '0.00%'
    return expected_result
if __name__ == "__main__":
    assert(compute_success_ratio(log_data) == expected_result)
#! overall
#! + correct
#! ~ implementation not optimal, relatively wasteful in terms of memory
#! - variable naming inconsistent, overly shortened, not expressive
#! - some redundant operations
#! + good use of standard library collections.Counter
#! ~ code could be a tad bit more idiomatic
I have understood some of the problems such as variable naming conventions and avoiding the try block section as much as possible.
However, I fail to understand how using csv.reader improves the code. Also, how am I supposed to understand the comment about collecting the data first? What could the alternatives be? Could anybody throw some light on these two issues?
When you do entries = logdata.split('\n') you will create a list with the split strings. Since log files can be quite large, this can consume a large amount of memory.
csv.reader does not open the file itself: it wraps an already open file (or any other iterable of lines) and pulls one line at a time from it. When it is given a file object, the data stays in the file and only about one row is in memory at any time.
Forgetting about the csv parsing for a minute, the issue is illustrated by the difference between these approaches:
In approach 1 we read the whole file into memory:
data = open('logfile').read().split('\n')
for line in data:
    pass  # do something with the line
In approach 2 we read one line at a time:
data = open('logfile')
for line in data:
    pass  # do something with the line
Approach 1 will consume more memory as the whole file needs to be read into memory. It also traverses the data twice - once when we read it and once to split into lines. The downside of approach 2 is that we can only do one loop through data.
For the particular case here, where we're not reading from a file but rather from a variable which is already in memory, the big difference will be that we will consume about twice as much memory by using the split approach.
split('\n') and splitlines will create a copy of your data where each line is a separate item in the list. Since you only need to pass over the data once instead of randomly accessing lines, this is wasteful compared to csv.reader, which can return one row at a time. The other benefit of using the reader is that you wouldn't have to split the data into lines and the lines into columns manually.
The comment about data collection refers to the fact that you add all the completed and failed items to two lists. Let's say that item 111-333 completes five times and fails twice. Your data would look something like this:
complst = ['111-333', '111-333', '111-333', '111-333', '111-333']
faillst = ['111-333', '111-333']
You don't need those repeating items so you could have used Counter directly without collecting items to the lists and save a lot of memory.
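A minimal sketch of that idea: two Counters are updated while the rows stream through csv.reader, so no intermediate lists are built. The function name success_ratios and the row check are assumptions made for illustration, and Python 3 is assumed for true division.

```python
import csv
from collections import Counter
from io import StringIO


def success_ratios(lines):
    """Return {number: 'xx.xx%'} from one pass over an iterable of CSV lines."""
    completed = Counter()
    total = Counter()
    for row in csv.reader(lines):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        _, caller, callee, result = row
        if result not in ("COMPLETED", "FAILED"):
            continue
        for number in (caller, callee):
            total[number] += 1
            if result == "COMPLETED":
                completed[number] += 1
    # A Counter returns 0 for missing keys, so no try/except is needed here.
    return {number: "{:.2%}".format(completed[number] / total[number])
            for number in total}


sample = "1.1.2014 12:01,111-222-333,454-333-222,COMPLETED\n1.1.2014 13:01,111-222-333,111-333,FAILED\n"
print(success_ratios(StringIO(sample)))
```

The same function works on an open file object, because csv.reader accepts any iterable that yields lines.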
Here's an alternative implementation that uses csv.reader and collects success & failure counts to a dict where item name is key and value is list [success count, failure count]:
from collections import defaultdict
import csv
from io import StringIO
log_data = """1.1.2014 12:01,111-222-333,454-333-222,COMPLETED
1.1.2014 13:01,111-222-333,111-333,FAILED
1.1.2014 13:04,111-222-333,454-333-222,FAILED
1.1.2014 13:05,111-222-333,454-333-222,COMPLETED
2.1.2014 13:01,111-333,111-222-333,FAILED
"""
RESULT_STRINGS = ['COMPLETED', 'FAILED']
counts = defaultdict(lambda: [0, 0])
for _, *params, result in csv.reader(StringIO(log_data)):
    try:
        index = RESULT_STRINGS.index(result)
        for param in params:
            counts[param][index] += 1
    except ValueError:
        pass # Skip line in case last column is not in RESULT_STRINGS
result = {k: '{0:.2f}%'.format(v[0] / sum(v) * 100) for k, v in counts.items()}
Note that the above works only on Python 3, since it relies on starred unpacking in the for loop and on true division.
Alternatively, Pandas looks like a good solution for this purpose if you are OK with using it.
import pandas as pd
log_data = pd.read_csv('data.csv',header=None)
log_data.columns = ['date', 'key1','key2','outcome']
meltedData = pd.melt(log_data, id_vars=['date','outcome'], value_vars=['key1','key2'],
                     value_name = 'key') # we transpose the keys here
meltedData['result'] = [int(x.lower() == 'completed') for x in meltedData['outcome']] # add summary variable
groupedData = meltedData.groupby(['key'])['result'].mean()
groupedDict = groupedData.to_dict()
print(groupedDict)
Result:
{'111-333': 0.0, '111-222-333': 0.40000000000000002, '454-333-222': 0.66666666666666663}