Using Logical And In CLIPS - python

I altered some CLIPS/CLIPSpy code to look for rows where the Variable column in a CSV is the words Oil Temp and the Duration in that row is 600 or above. The rule should fire twice according to the CSV I'm using, but instead I'm receiving an error.
Here is my code currently. I think it's failing on the variable check or the logical-and check.
import sys
from tempfile import mkstemp
import os

import clips

CLIPS_CONSTRUCTS = """
(defglobal ?*oil-too-hot-times* = 0)

(deftemplate oil-is-too-hot-too-long
  (slot Variable (type STRING))
  (slot Duration (type INTEGER)))

(defrule check-for-hot-oil-too-long-warning
  (oil-is-too-hot-too-long (Variable ?variable) (Duration ?duration))
  (test (?variable Oil Temp))
  (and (>= ?duration 600))
  =>
  (printout t "Warning! Check engine light on!" tab ?*oil-too-hot-times* crlf))
"""

def main():
    environment = clips.Environment()

    # use environment.load() to load constructs from a file
    constructs_file, constructs_file_name = mkstemp()
    file = open(constructs_file, 'wb')
    file.write(CLIPS_CONSTRUCTS.encode())
    file.close()
    environment.load(constructs_file_name)
    os.remove(constructs_file_name)

    # enable fact duplication as data has duplicates
    environment.eval("(set-fact-duplication TRUE)")

    # Template facts can be built from their deftemplate
    oil_too_hot_too_long_template = environment.find_template("oil-is-too-hot-too-long")

    for variable, duration in get_data_frames(sys.argv[1]):
        new_fact = oil_too_hot_too_long_template.new_fact()

        # Template facts are represented as dictionaries
        new_fact["Variable"] = variable
        new_fact["Duration"] = int(duration)

        # Add the fact into the environment Knowledge Base
        new_fact.assertit()

    # Fire all the rules which got activated
    environment.run()

def get_data_frames(file_path):
    """Parse a CSV file returning the dataframes."""
    with open(file_path) as data_file:
        return [l.strip().split(",") for i, l in enumerate(data_file) if i > 1]

if __name__ == "__main__":
    main()

CLIPS adopts Polish/Prefix notation. Therefore, your rule should be written as follows.
(defrule check-for-hot-oil-too-long-warning
  (oil-is-too-hot-too-long (Variable ?variable) (Duration ?duration))
  (test (and (eq ?variable "Oil Temp")
             (>= ?duration 600)))
  =>
  (printout t "Warning! Check engine light on!" tab ?*oil-too-hot-times* crlf))
Also notice how the type STRING requires double quotes (").
Still, I'd suggest you leverage the alpha network matching of the engine, which is more concise and efficient.
(defrule check-for-hot-oil-too-long-warning
  (oil-is-too-hot-too-long (Variable "Oil Temp") (Duration ?duration))
  (test (>= ?duration 600))
  =>
  (printout t "Warning! Check engine light on!" tab ?*oil-too-hot-times* crlf))
The engine can immediately see that your Variable slot is a constant and can optimize the matching logic accordingly. I am not sure it can make the same assumption within the combined test.
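If you want to sanity-check the second form without the CSV plumbing, here is a minimal clipspy sketch; build, assert_string and run are standard Environment methods, and the sample facts are invented:
import clips

env = clips.Environment()
env.build('(deftemplate oil-is-too-hot-too-long '
          '(slot Variable (type STRING)) '
          '(slot Duration (type INTEGER)))')
env.build('(defrule check-for-hot-oil-too-long-warning '
          '(oil-is-too-hot-too-long (Variable "Oil Temp") (Duration ?duration)) '
          '(test (>= ?duration 600)) '
          '=> (printout t "Warning! Check engine light on!" crlf))')

# One matching fact, one below the threshold, one for a different variable.
env.assert_string('(oil-is-too-hot-too-long (Variable "Oil Temp") (Duration 650))')
env.assert_string('(oil-is-too-hot-too-long (Variable "Oil Temp") (Duration 599))')
env.assert_string('(oil-is-too-hot-too-long (Variable "Coolant") (Duration 700))')

env.run()  # fires once, for the 650 reading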

Related

Python method for rapidly producing custom synthesized tonal sequences from a terminal window

I'm trying to design a method for generating audio signals rapidly. I need this for electrophysiological experiments in which I will play tone sequences for the purpose of examining neuronal responses in the brain's auditory system.
I need to be able to quickly construct a novel sequence in which I can specify features of each tone (e.g. frequency, duration, amplitude, etc.), silent pauses (i.e. rests), and the sequence of tones and pauses.
I want to do this from the terminal using a simple sequence of codes. For instance, entering tone(440,2) rest(2) tone(880,1) rest(1) tone(880,1) would generate a "song" that plays a 2-second sine wave tone at 440 Hz, then a 2-second rest, then a 1-second tone at 880 Hz, etc.
I have Python functions for producing tones and rests, but I don't know how to access and control them from the terminal for this purpose. After some reading, it seems like using textX or PyParsing might be good options, but I have no background in creating domain-specific languages or parsers, so I'm not sure. I've completed this textX tutorial and read this PyParsing description, but it's not yet clear how or whether I can use these methods for the rapid, terminal-based audio construction and playback that I need. Do you have any suggestions?
This would be an initial solution for textX:
from textx import metamodel_from_str

grammar = r'''
Commands: commands*=Command;
Command: Tone | Rest;
Tone: 'tone' '(' freq=INT ',' duration=INT ')';
Rest: 'rest' '(' duration=INT ')';
'''

mm = metamodel_from_str(grammar)

input = 'tone(440,2) rest(2) tone(880,1) rest(1) tone(880,1)'
model = mm.model_from_str(input)

for command in model.commands:
    if command.__class__.__name__ == 'Tone':
        # This command is a tone. Call your function for tone. For example:
        render_tone(command.freq, command.duration)
    else:
        # Call rest. For example:
        render_rest(command.duration)
You can also easily read your input recipe from an external file by changing mm.model_from_str above to mm.model_from_file, as in the sketch below.
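For example (the file name song.seq is hypothetical):
# Same grammar and metamodel as above; the recipe now lives in a file.
model = mm.model_from_file('song.seq')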
This annotated pyparsing example should get you started:
import pyparsing as pp
ppc = pp.pyparsing_common

# expressions for punctuation - useful during parsing, but
# should be suppressed from the parsed results
LPAR, RPAR, COMMA = map(pp.Suppress, "(),")

# expressions for your commands and numeric values
# the value expression could have used ppc.integer, but
# using number allows for floating point values (such as
# durations that are less than a second)
TONE = pp.Keyword("tone")
REST = pp.Keyword("rest")
value = ppc.number

# expressions for tone and rest commands
tone_expr = (TONE("cmd")
             + LPAR + value("freq") + COMMA + value("duration") + RPAR)
rest_expr = (REST("cmd")
             + LPAR + value("duration") + RPAR)

# a command is a tone or a rest expression
cmd_expr = tone_expr | rest_expr

# functions to call for each command - replace with your actual
# music functions
def play_tone(freq, dur):
    print("BEEP({}, {})".format(freq, dur))

def play_rest(dur):
    print("REST({})".format(dur))
How it works:
cmd_str = "tone(440,0.2) rest(2) tone(880, 1) rest(1) tone( 880, 1 )"

for music_code in cmd_expr.searchString(cmd_str):
    if music_code.cmd == "tone":
        play_tone(music_code.freq, music_code.duration)
    elif music_code.cmd == "rest":
        play_rest(music_code.duration)
    else:
        print("unexpected code", music_code.cmd)
Prints:
BEEP(440, 0.2)
REST(2)
BEEP(880, 1)
REST(1)
BEEP(880, 1)
More info at https://pyparsing-docs.readthedocs.io/en/pyparsing_2.4.7/HowToUsePyparsing.html and module reference at https://pyparsing-docs.readthedocs.io/en/pyparsing_2.4.7/pyparsing.html
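Note that searchString quietly skips any text that does not match. If you would rather reject malformed input outright, a small variant of the above (reusing the same cmd_expr) is to parse the whole string strictly:
# Strict variant: any text that is not a valid command raises a ParseException.
song_expr = pp.OneOrMore(pp.Group(cmd_expr))
try:
    song = song_expr.parseString(cmd_str, parseAll=True)
except pp.ParseException as err:
    print("invalid music code:", err)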

How to print each loop result to a single file?

I am running a model evaluation protocol for Modeller. It evaluates every model and writes its result to a separate file. However, I have to run it for every model, and I want all results written to a single file.
This is the original code:
from modeller import *
from modeller.scripts import complete_pdb

log.verbose()    # request verbose output
env = environ()
env.libs.topology.read(file='$(LIB)/top_heav.lib')  # read topology
env.libs.parameters.read(file='$(LIB)/par.lib')     # read parameters

# read model file
mdl = complete_pdb(env, 'TvLDH.B99990001.pdb')

# Assess all atoms with DOPE:
s = selection(mdl)
s.assess_dope(output='ENERGY_PROFILE NO_REPORT', file='TvLDH.profile',
              normalize_profile=True, smoothing_window=15)
I added a loop to evaluate every model in a single run; however, I am creating several files (one for each model), and what I want is to print all evaluations to a single file:
from modeller import *
from modeller.scripts import complete_pdb

log.verbose()    # request verbose output
env = environ()
env.libs.topology.read(file='$(LIB)/top_heav.lib')  # read topology
env.libs.parameters.read(file='$(LIB)/par.lib')     # read parameters

# My loop starts here
for i in range(1, 1001):
    number = str(i)
    if i < 10:
        name = '000' + number
    else:
        if i < 100:
            name = '00' + number
        else:
            if i < 1000:
                name = '0' + number
            else:
                name = '1000'
    # read model file
    mdl = complete_pdb(env, 'TcP5CDH.B9999' + name + '.pdb')
    # Assess all atoms with DOPE: this is the assessment that I want to print to the same file
    s = selection(mdl)
    savename = 'TcP5CDH.B9999' + name + '.profile'
    s.assess_dope(output='ENERGY_PROFILE NO_REPORT',
                  file=savename,
                  normalize_profile=True, smoothing_window=15)
As I am new to programming, any help will be very helpful!
Welcome :-) Looks like you're very close. Let's introduce you to using a Python function and the .format() method.
Your original has a comment line # read model file, which looks like it could be a function, so let's try that. It could look something like this:
from modeller import *
from modeller.scripts import complete_pdb

log.verbose()    # request verbose output

# I'm assuming this can be done just once
# and re-used for all your model files...
# (if not, the env stuff should go inside the
# read_model_file() function.)
env = environ()
env.libs.topology.read(file='$(LIB)/top_heav.lib')  # read topology
env.libs.parameters.read(file='$(LIB)/par.lib')     # read parameters

def read_model_file(file_name):
    print('--- read_model_file(file_name=' + file_name + ') ---')
    mdl = complete_pdb(env, file_name)
    # Assess all atoms with DOPE:
    s = selection(mdl)
    output_file = file_name + '.profile'
    s.assess_dope(
        output='ENERGY_PROFILE NO_REPORT',
        file=output_file,
        normalize_profile=True,
        smoothing_window=15)

for i in range(1, 1001):
    file_name = 'TcP5CDH.B9999{:04d}.pdb'.format(i)
    read_model_file(file_name)
Using .format(), we can get rid of the multiple if-statement checks for 10, 100, and 1000.
Basically, .format() replaces the {} curly braces with its argument(s).
It can get pretty complex, but you don't need to digest all of it.
Example:
'Hello {}!'.format('world') yields Hello world!. The {:04d} part uses a format specifier; basically it says "please make a 4-character-wide digit substring and zero-fill it", so you get '0001', ..., '0999', '1000'.
Just {:4d} (no leading zero) would give you space-padded results (e.g. '   1', ..., ' 999', '1000').
Here's a little more on the zero-fill: Display number with leading zeros
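The loop above still writes one .profile file per model, which was the original complaint. As a plain-Python sketch for the single-file goal (the combined file name is made up; the per-model profile names follow from the loop above), you could concatenate the profiles afterwards:
# Concatenate every per-model profile into one combined file.
with open('TcP5CDH.all_profiles.txt', 'w') as combined:
    for i in range(1, 1001):
        profile_name = 'TcP5CDH.B9999{:04d}.pdb.profile'.format(i)
        combined.write('### {} ###\n'.format(profile_name))
        with open(profile_name) as profile:
            combined.write(profile.read())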

Collecting data first in python to conduct operations

I was given the following problem, where I had to produce the expected_result from the log_data. The code is as follows, edited with my solution and with comments containing feedback I received:
import collections

log_data = """1.1.2014 12:01,111-222-333,454-333-222,COMPLETED
1.1.2014 13:01,111-222-333,111-333,FAILED
1.1.2014 13:04,111-222-333,454-333-222,FAILED
1.1.2014 13:05,111-222-333,454-333-222,COMPLETED
2.1.2014 13:01,111-333,111-222-333,FAILED
"""

expected_result = {
    "111-222-333": "40.00%",
    "454-333-222": "66.67%",
    "111-333": "0.00%"
}

def compute_success_ratio(logdata):
    #! better option to use .splitlines()
    #! or even better recognize the CSV structure and use csv.reader
    entries = logdata.split('\n')

    #! interesting choice to collect the data first
    #! which could result in explosive growth of memory hunger, are there
    #! alternatives to this structure?
    complst = []
    faillst = []

    #! probably no need for attaching `lst` to the variable name, no?
    for entry in entries:
        #! variable naming could be clearer here
        #! a good way might involve destructuring the entry like:
        #!     _, caller, callee, result
        #! which also avoids using magic indices further down (-1, 1, 2)
        ent = entry.split(',')
        if ent[-1] == 'COMPLETED':
            #! complst.extend(ent[1:3]) for even more brevity
            complst.append(ent[1])
            complst.append(ent[2])
        elif ent[-1] == 'FAILED':
            faillst.append(ent[1])
            faillst.append(ent[2])

    #! variable postfix `lst` could let us falsely assume that the result of set()
    #! is a list.
    numlst = set(complst + faillst)

    #! good use of collections.Counter,
    #! but: Counter() already is a dictionary, there is no need to convert it to one
    comps = dict(collections.Counter(complst))
    fails = dict(collections.Counter(faillst))

    #! variable naming overlaps with global, and doesn't make sense in this context
    expected_result = {}

    for e in numlst:
        #! good: dealt with possibility of a number not showing up in `comps` or `fails`
        #! bad: using a try/except block to deal with this when a simpler .get(e, 0)
        #! would've allowed dealing with this more elegantly
        try:
            #! variable naming not very expressive
            rat = float(comps[e]) / float(comps[e] + fails[e]) * 100
            perc = round(rat, 2)
            #! here we are rounding twice, and then don't use the formatting string
            #! to attach the % -- '{:.2f}%'.format(perc) would've been the right
            #! way if one doesn't know percentage formatting (see below)
            expected_result[e] = "{:.2f}".format(perc) + '%'
            #! a generally better way would be to either
            #!     from __future__ import division
            #! or to compute the ratio as
            #!     ratio = float(comps[e]) / (comps[e] + fails[e])
            #! and then use percentage formatting for the ratio
            #!     "{:.2%}".format(ratio)
        except KeyError:
            expected_result[e] = '0.00%'

    return expected_result

if __name__ == "__main__":
    assert(compute_success_ratio(log_data) == expected_result)

#! overall
#! + correct
#! ~ implementation not optimal, relatively wasteful in terms of memory
#! - variable naming inconsistent, overly shortened, not expressive
#! - some redundant operations
#! + good use of standard library collections.Counter
#! ~ code could be a tad bit more idiomatic
I have understood some of the problems, such as the variable naming conventions and avoiding try/except blocks where possible.
However, I fail to understand how using csv.reader improves the code. Also, how am I supposed to understand the comment about collecting the data first? What could the alternatives be? Could anybody throw some light on these two issues?
When you do entries = logdata.split('\n'), you create a list with the split strings. Since log files can be quite large, this can consume a large amount of memory.
csv.reader, by contrast, consumes an open file object (or any iterable of lines) lazily: it yields one parsed row at a time, so the data stays in the file and only one row is ever in memory.
Forgetting about the csv parsing for a minute, the issue is illustrated by the difference between these approaches:
In approach 1 we read the whole file into memory:
data = open('logfile').read().split('\n')
for line in data:
    # do something with the line
In approach 2 we read one line at a time:
data = open('logfile')
for line in data:
    # do something with the line
Approach 1 consumes more memory because the whole file must be read into memory. It also traverses the data twice: once when we read it, and once when we split it into lines. The downside of approach 2 is that we can only make one pass through the data.
For the particular case here, where we're reading from a variable that's already in memory rather than from a file, the big difference is that the split approach consumes roughly twice as much memory.
split('\n') and splitlines() will create a copy of your data where each line is a separate item in the list. Since you only need to pass over the data once, rather than randomly accessing lines, this is wasteful compared to a CSV reader, which can return one line at a time. The other benefit of using the reader is that you wouldn't have to split the data into lines, and the lines into columns, manually.
The comment about data collection refers to the fact that you add all the completed and failed items to two lists. Let's say that item 111-333 completes five times and fails twice. Your data would look something like this:
complst = ['111-333', '111-333', '111-333', '111-333', '111-333']
faillst = ['111-333', '111-333']
You don't need those repeated items, so you could have used Counter directly, without collecting the items into lists first, and save a lot of memory. See the sketch below.
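For instance, a minimal sketch that feeds Counter from a generator expression, so no intermediate lists are built (counting only the COMPLETED side here, for brevity):
from collections import Counter

# Count participants of COMPLETED calls without building a list first.
completed_counts = Counter(
    number
    for line in log_data.splitlines()
    if line.endswith('COMPLETED')
    for number in line.split(',')[1:3]
)
print(completed_counts)  # Counter({'111-222-333': 2, '454-333-222': 2})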
Here's an alternative implementation that uses csv.reader and collects success & failure counts into a dict, where the item name is the key and the value is a list [success count, failure count]:
from collections import defaultdict
import csv
from io import StringIO

log_data = """1.1.2014 12:01,111-222-333,454-333-222,COMPLETED
1.1.2014 13:01,111-222-333,111-333,FAILED
1.1.2014 13:04,111-222-333,454-333-222,FAILED
1.1.2014 13:05,111-222-333,454-333-222,COMPLETED
2.1.2014 13:01,111-333,111-222-333,FAILED
"""

RESULT_STRINGS = ['COMPLETED', 'FAILED']
counts = defaultdict(lambda: [0, 0])

for _, *params, result in csv.reader(StringIO(log_data)):
    try:
        index = RESULT_STRINGS.index(result)
        for param in params:
            counts[param][index] += 1
    except ValueError:
        pass  # Skip line in case last column is not in RESULT_STRINGS

result = {k: '{0:.2f}%'.format(v[0] / sum(v) * 100) for k, v in counts.items()}
Note that the above will work only on Python 3.
Alternatively, Pandas looks like a good solution for this purpose, if you are OK with using it.
import pandas as pd

log_data = pd.read_csv('data.csv', header=None)
log_data.columns = ['date', 'key1', 'key2', 'outcome']

meltedData = pd.melt(log_data, id_vars=['date', 'outcome'], value_vars=['key1', 'key2'],
                     value_name='key')  # we transpose the keys here
meltedData['result'] = [int(x.lower() == 'completed') for x in meltedData['outcome']]  # add summary variable
groupedData = meltedData.groupby(['key'])['result'].mean()
groupedDict = groupedData.to_dict()
print(groupedDict)
Result:
{'111-333': 0.0, '111-222-333': 0.40000000000000002, '454-333-222': 0.66666666666666663}

From Python Code to a Working Executable File (Downsizing Grid Files Program)

I posted a question earlier about a syntax error here: Invalid Syntax error in Python Code I copied from the Internet. Fortunately, my problem was fixed really fast, thanks to you. However, now that there is no syntax error, I find myself helpless, as I don't know what to do with this code. As I've said, I did some basic Python training three years ago, but the human brain seems to forget things fast.
So, in a few words: I need to reduce the grid resolution of some files to half, and I've been searching for a way to do it for weeks. Luckily, I found some Python code that seems to do exactly what I am looking for. The code is this:
#!/bin/env python

# -----------------------------------------------------------------------------
# Reduce grid data to a smaller size by averaging over cells of specified
# size and write the output as a netcdf file.  xyz_origin and xyz_step
# attributes are adjusted.
#
# Syntax: downsize.py <x-cell-size> <y-cell-size> <z-cell-size>
#                     <in-file> <netcdf-out-file>
#
import sys

import Numeric

from VolumeData import Grid_Data, Grid_Component

# -----------------------------------------------------------------------------
#
def downsize(mode, cell_size, inpath, outpath):
    from VolumeData import fileformats
    try:
        grid_data = fileformats.open_file(inpath)
    except fileformats.Uknown_File_Type as e:
        sys.stderr.write(str(e))
        sys.exit(1)

    reduced = Reduced_Grid(grid_data, mode, cell_size)

    from VolumeData.netcdf.netcdf_grid import write_grid_as_netcdf
    write_grid_as_netcdf(reduced, outpath)

# -----------------------------------------------------------------------------
# Average over cells to produce reduced size grid object.
#
# If the grid data sizes are not multiples of the cell size then the
# final data values along the dimension are not included in the reduced
# data (ie ragged blocks are not averaged).
#
class Reduced_Grid(Grid_Data):

    def __init__(self, grid_data, mode, cell_size):
        size = map(lambda s, cs: s / cs, grid_data.size, cell_size)
        xyz_origin = grid_data.xyz_origin
        xyz_step = map(lambda step, cs: step * cs, grid_data.xyz_step, cell_size)
        component_name = grid_data.component_name
        components = []
        for component in grid_data.components:
            components.append(Reduced_Component(component, mode, cell_size))

        Grid_Data.__init__(self, '', '', size, xyz_origin, xyz_step,
                           component_name, components)

# -----------------------------------------------------------------------------
# Average over cells to produce reduced size grid object.
#
class Reduced_Component(Grid_Component):

    def __init__(self, component, mode, cell_size):
        self.component = component
        self.mode = mode
        self.cell_size = cell_size
        Grid_Component.__init__(self, component.name, component.rgba)

    # ---------------------------------------------------------------------------
    #
    def submatrix(self, ijk_origin, ijk_size):
        ijk_full_origin = map(lambda i, cs: i * cs, ijk_origin, self.cell_size)
        ijk_full_size = map(lambda s, cs: s * cs, ijk_size, self.cell_size)
        values = self.component.submatrix(ijk_full_origin, ijk_full_size)
        if mode == 'ave':
            m = average_down(values, self.cell_size)
I have this saved as a .py file, and when I double-click it, the command prompt appears for a millisecond and then disappears. I managed to take a screenshot of that command prompt, which says "Unable to create process using 'bin/env python "C:\Users...........py"".
What I want to do is to be able to do this downsizing using the syntax that the code tells me to use:
# Syntax: downsize.py <x-cell-size> <y-cell-size> <z-cell-size>
# <in-file> <netcdf-out-file>
Can you help me ?
Don't run the file by double-clicking it. (The error you saw is Windows trying to honor the #!/bin/env python shebang line, which doesn't point at a valid interpreter on your machine.) Run the file by opening a new shell and typing in the path to the .py file (or just cd to the parent directory), followed by the arguments you want to pass. For example:
python downsize.py 1 2 3 foo bar
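The posted excerpt is truncated before any argument handling, so here is a hedged sketch of a command-line entry point (the downsize() signature comes from the code above; passing 'ave' as the mode is an assumption based on the submatrix method):
import sys

if __name__ == '__main__':
    # Syntax: downsize.py <x-cell-size> <y-cell-size> <z-cell-size>
    #                     <in-file> <netcdf-out-file>
    if len(sys.argv) != 6:
        sys.stderr.write('usage: downsize.py x-cell y-cell z-cell in-file out-file\n')
        sys.exit(1)
    cell_size = tuple(int(arg) for arg in sys.argv[1:4])
    in_file, out_file = sys.argv[4], sys.argv[5]
    downsize('ave', cell_size, in_file, out_file)  # 'ave' mode is an assumption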

calling a function from another file

I'm writing code in Python where I must import a function from another file. I write import filename and filename.functionname, and while I'm typing the first letter of the function name, a window pops up in PyCharm showing me the full name of the function, so I guess Python knows that the file has the function I need. When I try it in the console, it works. But when I run the same thing in my code, it gives an error: 'module' object has no attribute 'get_ecc'. What could the problem be? The only import part is the last function, make_qr_code.
""" Create QR error correction codes from binary data, according to the
standards laid out at http://www.swetake.com/qr/qr1_en.html. Assumes the
following when making the codes:
- alphanumeric text
- level Q error-checking
Size is determined by version, where Version 1 is 21x21, and each version up
to 40 is 4 more on each dimension than the previous version.
"""
import qrcode
class Polynomial(object):
""" Generator polynomials for error correction.
"""
# The following tables are constants, associated with the *class* itself
# instead of with any particular object-- so they are shared across all
# objects from this class.
# We break style guides slightly (no space following ':') to make the
# tables easier to read by organizing the items in lines of 8.
def get_ecc(binary_string, version, ec_mode):
""" Create the error-correction code for the binary string provided, for
the QR version specified (in the range 1-9). Assumes that the length of
binary_string is a multiple of 8, and that the ec_mode is one of 'L', 'M',
'Q' or 'H'.
"""
# Create the generator polynomial.
generator_coeffs = get_coefficients(SIZE_TABLE[version, ec_mode][1])
generator_exps = range(len(generator_coeffs) - 1, -1, -1)
generator_poly = Polynomial(generator_coeffs, generator_exps)
# Create the message polynomial.
message_coeffs = []
while binary_string:
message_coeffs.append(qrcode.convert_to_decimal(binary_string[:8]))
binary_string = binary_string[8:]
message_max = len(message_coeffs) - 1 + len(generator_coeffs) - 1
message_exps = range(message_max, message_max - len(message_coeffs), -1)
message_poly = Polynomial(message_coeffs, message_exps)
# Keep dividing the message polynomial as much as possible, leaving the
# remainder in the resulting polynomial.
while message_poly.exps[-1] > 0:
message_poly.divide_by(generator_poly)
# Turn the error-correcting code back into binary.
ecc_string = ""
for item in message_poly.coeffs:
ecc_string += qrcode.convert_to_binary(item, 8)
return ecc_string
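A common cause of 'module' object has no attribute errors is that the function is not actually at module level, for example because it was accidentally indented inside a class, or because a different module with the same name shadows yours on the import path. A minimal sketch of the working layout (both file names here are hypothetical):
# qr_ecc.py (hypothetical module name)
def get_ecc(binary_string, version, ec_mode):
    return binary_string  # placeholder body

# main.py
import qr_ecc
print(qr_ecc.get_ecc('01010101', 1, 'Q'))  # works when get_ecc is at module level
# Had get_ecc been indented inside a class, qr_ecc.get_ecc would raise
# AttributeError: 'module' object has no attribute 'get_ecc'.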
