Loading previously saved JModelica result-file - python

I got the following question:
I am loading a JModelica model and simulate it easily by doing:
from pymodelica import compile_fmu
from pyfmi import load_fmu
model = load_fmu(SOME_FMU)
res = model.simulate()
Everything works fine and it even saves the result to a .txt file. The problem is that I have not found any functionality so far within the JModelica Python packages to load such a .txt result file later on into a result object (like the one returned by simulate()) so that I can easily extract the previously saved data.
Implementing that by hand is of course possible, but I find it quite nasty and just wanted to ask if anyone knows of a method that loads that JModelica-format result file into a result object for me.
Thanks!

The functionality that you need is located in the io module:
from pyfmi.common.io import ResultDymolaTextual

res = ResultDymolaTextual("MyResult.txt")
var = res.get_variable_data("MyVariable")
var.x  # Trajectory
var.t  # Corresponding time vector
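For example, once a variable has been loaded this way, it can be plotted directly from the returned trajectory; a minimal sketch, assuming matplotlib is installed and that the result file contains a variable named "MyVariable":
import matplotlib.pyplot as plt
from pyfmi.common.io import ResultDymolaTextual

res = ResultDymolaTextual("MyResult.txt")   # previously saved result file
var = res.get_variable_data("MyVariable")   # variable name is an assumption
plt.plot(var.t, var.x)                      # time vector vs. trajectory
plt.xlabel("time")
plt.ylabel("MyVariable")
plt.show()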

Related

While working with the gspread-pandas module, I want to change the module's default_dir

import json
from os import path, makedirs

_default_dir = path.expanduser('~/.config/gspread_pandas')
_default_file = 'google_secret.json'

def ensure_path(pth):
    if not path.exists(pth):
        makedirs(pth)
Hi, I'm currently working on data collection via Selenium, using pandas to parse and edit the data and send it to a Google spreadsheet.
However, the gspread-pandas module requires the google_secret.json file to be placed in '~/.config/gspread_pandas', which is a fixed location as described in the link below:
https://pypi.python.org/pypi/gspread-pandas/0.15.1
I want to write a function to set a custom location so that the app environment is self-contained.
For example, I want to locate the file here:
default_folder = os.getcwd()
The default_folder will be where my project is located (the same folder).
What can I do about it?
If you look at the source https://github.com/aiguofer/gspread-pandas/blob/master/gspread_pandas/conf.py you will notice that you can create your own config and pass it to the Spread object constructor.
But yes, this part is really badly documented.
So, this code works well for me:
from gspread_pandas import Spread, conf
c = conf.get_config('[Your Path]', '[Your filename]')
spread = Spread('username', 'spreadname', config=c)
Thank you for this. It really should be documented better. I was getting so frustrated trying to get this to work with Heroku, but it worked perfectly. I had to change it to the following:
c = gspread_pandas.conf.get_config('/app/', 'google_secret.json')
spread = gspread_pandas.Spread('google_sheet_key_here_that_is_a_long_string', config=c)
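Putting this together for the original question (keeping the secret file next to the project instead of in ~/.config/gspread_pandas), a minimal sketch; the file name google_secret.json and the spreadsheet names are assumptions:
import os
from gspread_pandas import Spread, conf

# Look for google_secret.json in the project's working directory
# rather than in the default ~/.config/gspread_pandas location
c = conf.get_config(os.getcwd(), 'google_secret.json')
spread = Spread('username', 'spreadname', config=c)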

Import MTX files into R

I have a large HDF file from which I get one of the data frames and convert it into a sparse matrix in Python (sparse.csr_matrix). Now I save this as an .mtx file and try to load it in R. I went through some documentation and links for loading MTX files into R using externalFormats {Matrix} and ran the following:
TestDataMatrix = readMM(system.file("./Downloads/TestDataMatrix.mtx",
                                    package = "Matrix"))
Unfortunately, this fails; the only output I see starts with "1:" and I have no clue what it means.
Could someone let me know if there is an easy way to convert Python objects to R objects (like RDS)?
I found a way to read HDF files in R and got the data frame converted to a sparse matrix. But there is a problem writing to HDF files, so I posted a new question here.
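For reference, the Python side of this workflow can be done with scipy.io; a minimal sketch with assumed file and variable names. Note that on the R side, system.file() only resolves files shipped inside an installed package, so readMM() should normally be given the plain file path instead.
import numpy as np
from scipy import sparse, io

# Stand-in for the data frame extracted from the HDF file
dense = np.random.rand(5, 4)
TestDataMatrix = sparse.csr_matrix(dense)

# Write in MatrixMarket format; R can read it back with Matrix::readMM("TestDataMatrix.mtx")
io.mmwrite("TestDataMatrix.mtx", TestDataMatrix)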

Python, Alembic.io, Cask: Properties of object do not save when using write_to_file()

I often write scripts for various 3D packages (3ds Max, Maya, etc.), and that is why I am interested in Alembic, a file format that has been getting a lot of attention lately.
Quick explanation for anyone who does not know this project: Alembic - www.alembic.io - is a file format created for storing 3D meshes and the data connected with them. It uses a tree-like structure, as you may see below, with one root node, its children, the children's children, and so on. The objects of these nodes can have properties.
I am trying to learn how to use Alembic with Python.
There are some tutorials on the docs page of this project, and I'm having some problems with this one:
http://docs.alembic.io/python/cask.html
It's about using the cask module - a wrapper that should make manipulating the contents of files easier.
This part:
a = cask.Archive("animatedcube.abc")
r = cask.Xform()
x = a.top.children["cube1"]
a.top.children["root"] = r
r.children["cube1"] = x
a.write_to_file("/var/tmp/cask_insert_node.abc")
works well. After that there's a new file "cask_insert_node.abc" and it contains the objects as expected.
But when I add some properties to the objects, like this:
a = cask.Archive("animatedcube.abc")
r = cask.Xform()
x = a.top.children["cube1"]
x.properties['new_property'] = cask.Property()
a.top.children["root"] = r
r.children["cube1"] = x
a.write_to_file("/var/tmp/cask_insert_node.abc")
the "cube1" object in a resulting file do not contain property "new_property".
The saving process is a problem, i know that the property has been added to "cube1" before saving, I've checked it another way, with a function that I wrote which creates graph of objects in archive.
The code for this module is there:
source
Does anyone know what I am doing wrong? How to save parameters? Some other way?
Sadly, cask doesn't support this. One cannot modify an archive and have the result saved (somehow related to how Alembic streams the data off of disk). What you'll want to do is create an output archive:
oArchive = alembic.Abc.CreateArchiveWithInfo(...)
then copy all desired data from your input archive over to your output archive, including the time samplings:
for i in range(iArchive.getNumTimeSamplings()):
    oArchive.addTimeSampling(iArchive.getTimeSampling(i))
and the objects, recursing through iArchive.getTop() and oArchive.getTop() and defining output properties (alembic.Abc.OArrayProperty, or OScalarProperty) as you encounter them in the iArchive. When these are defined, you can interject your new values as samples to the property at that time.
It's a real beast, and something that cask really ought to support. In fact, someone in the Alembic community should just do everyone a favor and write a cask2 (casket?) which wraps all of this into simple calls like you instinctively tried to do.

How to avoid loading a large file into a python script repeatedly?

I've written a Python script to take a large file (a matrix of ~50k rows x ~500 cols) and use it as a dataset to train a random forest model.
My script has two functions: one to load the dataset and the other to train the random forest model using said data. These both work fine, but loading the file takes ~45 seconds, and it's a pain to do this every time I want to train a subtly different model (testing many models on the same dataset). Here is the file-loading code:
import io
import numpy as np

def load_train_data(train_file):
    # Read in training file
    train_f = io.open(train_file)
    train_id_list = []
    train_val_list = []
    for line in train_f:
        list_line = line.strip().split("\t")
        if list_line[0] != "Domain":
            train_identifier = list_line[9]
            train_values = list_line[12:]
            train_id_list.append(train_identifier)
            train_val_float = [float(x) for x in train_values]
            train_val_list.append(train_val_float)
    train_f.close()
    train_val_array = np.asarray(train_val_list)
    return (train_id_list, train_val_array)
This returns a list of identifiers (column 9, the labels) and a numpy array of columns 12 onward (the features) to train the random forest.
I am going to train many different forms of my model with the same data, so I just want to load the file one time and have it available to feed into my random forest function. I want the file to be an object, I think (I am fairly new to Python).
If I understand you correctly, the data set does not change but the model parameters do, and you are changing the parameters after each run.
I would put the file-loading code in one file and run it in the Python interpreter. The file will then be loaded and kept in memory in whatever variable you use.
Then you can import another file with your model code and run that with the training data as an argument.
If all your model changes can be expressed as parameters in a function call, all you need is to import your model and then call the training function with different parameter settings; see the sketch below.
If you need to change the model code between runs, save it under a new filename, import that one, run again and pass it the source data.
If you don't want to save each model modification under a new filename, you might be able to use the reload functionality depending on the Python version, but it is not recommended (see Proper way to reload a python module from the console).
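A minimal sketch of that workflow; the module name train_model.py and its train() function are hypothetical, and scikit-learn is assumed for the actual model:
# train_model.py (hypothetical module holding the model code)
from sklearn.ensemble import RandomForestClassifier

def train(ids, values, n_estimators=100, max_depth=None):
    # ids are the labels (column 9), values the feature array (columns 12 onward)
    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
    model.fit(values, ids)
    return model

# In the interpreter: load once, then train as many variants as needed
# >>> ids, values = load_train_data("train.txt")
# >>> import train_model
# >>> m1 = train_model.train(ids, values, n_estimators=100)
# >>> m2 = train_model.train(ids, values, n_estimators=500, max_depth=10)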
The simplest way would be to cache the results, like so:
_train_data_cache = {}

def load_cached_train_data(train_file):
    # Parse the file only on the first call; later calls return the cached result
    if train_file not in _train_data_cache:
        _train_data_cache[train_file] = load_train_data(train_file)
    return _train_data_cache[train_file]
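Usage then looks like this (the file name is an assumption):
ids, values = load_cached_train_data("train.txt")   # slow the first time (~45 s)
ids, values = load_cached_train_data("train.txt")   # instant: served from the in-memory cache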
Try learning about Python data serialization. You would basically store the parsed data as a Python-specific serialized binary object, for example with the marshal module, which would drastically speed up the I/O. See these benchmarks for the performance differences. However, if these random forest models are all trained at the same time, then you could just train them against the dataset you already have in memory and release the training data after completion.
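A minimal sketch of that disk-caching idea, using pickle instead of marshal (marshal only handles core built-in types, while pickle also handles NumPy arrays); the cache file name is an assumption, and load_train_data is the parser from the question:
import os
import pickle

def load_train_data_cached(train_file, cache_file="train_data.pkl"):
    # Reuse the parsed data if a cache exists; otherwise parse once and cache it
    if os.path.exists(cache_file):
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    data = load_train_data(train_file)
    with open(cache_file, "wb") as f:
        pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
    return data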
Load your data in IPython:
my_data = open("data.txt")
Write your code in a Python script, say example.py, which uses this data. At the top of example.py add these lines:
import sys

args = sys.argv
data = args[1]
...
Now run the script from IPython:
%run example.py $my_data
Now, when running your Python script, you don't need to load the data multiple times.

Unable to load a previously dumped pickle file in Python

The algorithm I use is quite heavy and has three parts. Thus, I used pickle to dump everything between the various stages in order to test each stage separately.
Although the first dump always works fine, the second one behaves as if it is size dependent: it works for a smaller dataset but not for a somewhat larger one. (The same actually happens with a heatmap I try to create, but that's a different question.) The dumped file is about 10 MB, so it's nothing really large.
The dump which creates the problem contains a whole class, which in turn contains methods, dictionaries, lists and variables.
I actually tried dumping both from inside and outside the class, but both failed.
The code I'm using looks like this:
data = pickle.load(open("./data/tmp/data.pck", 'rb')) #Reads from the previous stage dump and works fine.
dataEvol = data.evol_detect(prevTimeslots, xLablNum) #Export the class to dataEvol
dataEvolPck = open("./data/tmp/dataEvol.pck", "wb") #open works fine
pickle.dump(dataEvol, dataEvolPck, protocol = 2) #dump works fine
dataEvolPck.close()
and even tried this:
dataPck = open("./data/tmp/dataFull.pck", "wb")
pickle.dump(self, dataPck, protocol=2) #self here is the dataEvol in the previous part of code
dataPck.close()
The problem appears when I try to load the class using this part:
dataEvol = pickle.load(open("./data/tmp/dataEvol.pck", 'rb'))
The error in hand is:
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
dataEvol = pickle.load(open("./data/tmp/dataEvol.pck", 'rb'))
ValueError: itemsize cannot be zero
Any ideas?
I'm using Python 3.3 on a 64-bit Windows 7 computer. Please forgive me if I'm missing anything essential, as this is my first question.
Answer:
The problem was an empty numpy string in one of the dictionaries. Thanks Janne!!!
It is a NumPy bug that has been fixed recently in this pull request. To reproduce it (under Python 2, where the cPickle module is available), try:
import cPickle
import numpy as np

cPickle.loads(cPickle.dumps(np.string_('')))
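Until the fix is available, a possible workaround on the dumping side is to replace empty NumPy strings with plain Python strings before pickling; a minimal sketch, assuming the offending values sit in a flat dictionary:
import numpy as np

def sanitize(d):
    # Empty NumPy strings trigger the "itemsize cannot be zero" error,
    # so convert them to plain Python str/bytes before pickling
    clean = {}
    for key, value in d.items():
        if isinstance(value, np.bytes_) and len(value) == 0:
            value = b""
        elif isinstance(value, np.str_) and len(value) == 0:
            value = ""
        clean[key] = value
    return clean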
