I am trying to run a rolling horizon optimisation where I have multiple optimisation scripts, each generating their own results. Instead of printing results to screen at every interval, I want to write each of the results using model.write("results.sol") - and then read them back into a results processing script (separate python script).
I have tried read("results.sol") in Python, but the file format is not recognised. Is there any way to read/process the .sol file format that Gurobi outputs? It would seem bizarre if you could not read the .sol file at some later point and generate plots etc.
Maybe I have missed something blindingly obvious.
Hard to answer without seeing your code as we have to guess what you are doing.
But well...
When you use
model.write("out.sol")
Gurobi will use its own format to write it (and what is written is inferred from the file suffix).
This can easily be read by:
model.read("out.sol")
If you used
x = read("out.sol")
you are using Python's basic IO tools, and of course Python won't interpret that file with respect to the format. Furthermore, reading like that is text mode (and maybe binary is required; not sure).
General rule: if you wrote the solution using a class-method of class model, then read using a class-method of class model too.
The usage above is normally used to reinstate some state of your model (e.g. a MIP start). If you want to plot it, you will have to do further work. In this case, using Python's IO tools might be a good idea, and you should respect the format described here. This could be read as CSV or manually (and, contrary to my remark earlier: it is text mode, not binary).
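For instance, a minimal sketch of that reinstate-state round trip, assuming gurobipy and a model stored in a hypothetical model.lp:

import gurobipy as gp

model = gp.read("model.lp")   # load the model itself (hypothetical file name)
model.read("out.sol")         # feed the stored solution back in as a MIP start
model.optimize()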
So assuming the example from the link is in file gur.sol:
import csv

with open('gur.sol', newline='\n') as csvfile:
    # collapse the two-space delimiter to one space so csv can handle it
    reader = csv.reader((line.replace('  ', ' ') for line in csvfile), delimiter=' ')
    next(reader)  # skip header
    sol = {}
    for var, value in reader:
        sol[var] = float(value)
print(sol)
Output:
{'z': 0.2, 'x': 1.0, 'y': 0.5}
Remarks:
The code is ugly because Python's csv module has some limitations
The delimiter is two spaces in this format, and we need to hack the code to read it (as only a one-character delimiter is allowed in this function)
The code might be tailored to Python 3 (which is what I'm using; the next() call would probably differ in Python 2)
pandas would be much, much better for this purpose (a huge tool with a very good CSV reader); see the sketch below
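For example, a minimal pandas sketch, assuming the same two-column layout with a #-comment header line (gur.sol as above):

import pandas as pd

# sep=r'\s+' absorbs the two-space delimiter; comment='#' skips the header line
df = pd.read_csv('gur.sol', sep=r'\s+', comment='#', names=['var', 'value'])
sol = dict(zip(df['var'], df['value']))
print(sol)   # {'z': 0.2, 'x': 1.0, 'y': 0.5}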
As the title indicates, I have an issue with retrieving information from dump_stats properly. Without further ado, here is my simple code.
Code
import cProfile
import pstats

def fun_to_profile():
    ...  # code to be profiled

profiler = cProfile.Profile()
profiler.runcall(fun_to_profile)
stats = pstats.Stats(profiler)   # collect the stats from the profiler
stats.sort_stats('cumulative')
stats.print_stats()
stats.dump_stats("output.txt")
This is the simplest code I could find, and I have read the documentation multiple times.
Problem
My problem is that when I open the file "output.txt", it is either empty or full of incomprehensible characters. So do I need to specify a particular file extension, or is the issue maybe with my interpreter?
Thanks in advance.
Apparently working with cProfile is quite easy and straightforward. I figured out the solution to the problem.
First of all, we need to know that the more adequate file extension is "file.dat". Then we need to read it back and write it out in the desired format, such as a text file like text.txt.
For that we need the following piece of code :
import cProfile
import pstats

cProfile.run("fun_to_profile()", "Out_put_profile.dat")  # run the function and save the binary profile

with open("Profile_time.txt", "w") as f:
    p = pstats.Stats("Out_put_profile.dat", stream=f)
    p.sort_stats("time").print_stats()  # here we sort our analysis by the time spent
And just like this we will have more material for analyzing the code, in a human-readable format. Thanks to IDG TECHtalk for sharing the solution.
Link to the youtube video: https://youtu.be/dmnA3axZ3FY.
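As an aside, a minimal sketch that skips the intermediate file entirely, assuming all you want is the text report (pstats.Stats accepts a Profile instance directly):

import cProfile
import io
import pstats

def fun_to_profile():
    return sum(range(1000000))   # stand-in for the real work

profiler = cProfile.Profile()
profiler.runcall(fun_to_profile)

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("time").print_stats()   # same sort key as above
print(stream.getvalue())                 # the human-readable report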
The documentation for the python library Murmur is a bit sparse.
I have been trying to adapt the code from this answer:
import hashlib
from functools import partial

def md5sum(filename):
    with open(filename, mode='rb') as f:
        d = hashlib.md5()
        for buf in iter(partial(f.read, 128), b''):
            d.update(buf)
    return d.hexdigest()

print(md5sum('utils.py'))
From what I read in the answer, md5 can't operate on the whole file at once, so it needs this looping. I'm not sure exactly what happens on the line d.update(buf), however.
The public methods in hashlib.md5() are:
'block_size',
'copy',
'digest',
'digest_size',
'hexdigest',
'name',
'update'
whereas mmh3 has
'hash',
'hash64',
'hash_bytes'
No update or hexdigest methods.
Does anyone know how to achieve a similar result?
The motivation is testing for uniqueness as fast as possible; the results here suggest murmur is a good candidate.
Update -
Following the comment from @Bakuriu, I had a look at mmh3, which seems to be better documented.
The public methods inside it are:
import mmh3
print([x for x in dir(mmh3) if x[0]!='_'])
>>> ['hash', 'hash128', 'hash64', 'hash_bytes', 'hash_from_buffer']
..so no "update" method. I had a look at the source code for mmh3.hash_from_buffer but it does not look like it contains a loop and it is also not in Python, can't really follow it. Here is a link to the line
So for now I will just use CRC-32, which is supposed to be almost as good for this purpose, and it is well documented how to do it; a chunked sketch follows. If anyone posts a solution, I will test it out.
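For reference, CRC-32 in the same chunked style as the md5 example above (zlib.crc32 takes the running value as its second argument):

import zlib
from functools import partial

def crc32sum(filename):
    crc = 0
    with open(filename, mode='rb') as f:
        for buf in iter(partial(f.read, 128), b''):
            crc = zlib.crc32(buf, crc)   # feed the running value back in
    return crc & 0xFFFFFFFF             # force an unsigned 32-bit result

print(crc32sum('utils.py'))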
To hash a file using murmur, one has to load it completely into memory and hash it in one go.
import mmh3

with open('main.py', 'rb') as file:   # open in binary mode: we want to hash bytes
    data = file.read()
hash = mmh3.hash_bytes(data, 0xBEFFE)
print(hash.hex())
If your file is too large to fit into memory, you could use incremental/progressive hashing: add your data in multiple chunks and hash them on the fly (like your example above).
Is there a Python library for progressive hashing with murmur?
I tried to find one, but it seems there is none.
Is progressive hashing even possible with murmur?
There is a working implementation in C:
https://github.com/rurban/smhasher/blob/master/PMurHash.h
https://github.com/rurban/smhasher/blob/master/PMurHash.c
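A possible middle ground, sketched under the assumption that mmh3.hash_from_buffer accepts any buffer-protocol object (such as an mmap): memory-map the file so the OS pages it in lazily instead of read() copying it all into a Python object. This is still a single-shot hash, not progressive hashing, but it avoids holding a second in-memory copy.

import mmap
import mmh3

with open('big.bin', 'rb') as f:   # hypothetical large input file
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        digest = mmh3.hash_from_buffer(mm, 0xBEFFE)   # 128-bit hash of the whole mapping
print(digest)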
I have been working on this file I/O and have made some progress reading through the site, and I am wondering what other ways this can be optimized. I am parsing a test infile of 10GB/30MM lines and writing the selected fields to an outfile, which results in an approximately 1.4GB clean file. Initially it took 40m to run this process, and I have reduced it to around 30m. Does anyone have any other ideas to reduce this in Python? Long term, I will be looking to write this in C++; I just have to learn the language first. Thanks in advance.
fdir = "./"                 # data directory (assumed; set to your path)
wbun = []                   # output line buffer
wsize = 5*1000*1000         # flush roughly every 5MM lines
fsplit = 12*1000*1000       # split output roughly every 4GB of input (assumed)

with open(fdir+"input.txt",'rb',50*(1024*1024)) as r:
    w=open(fdir+"output0.txt",'wb',50*(1024*1024))
    for i,l in enumerate(r):
        if l[42:44]=='25':
            # takes fixed width line into csv line of only a few cols
            wbun.append(','.join([
                l[7:15],
                l[26:35],
                l[44:52],
                l[53:57],
                format(int(l[76:89])/100.0,'.02f'),
                l[89:90],
                format(int(l[90:103])/100.0,'.02f'),
                l[193:201],
                l[271:278]+'\n'
            ]))
            # write about every 5MM lines
            if len(wbun)==wsize:
                w.writelines(wbun)
                wbun=[]
                print "i_count:",i
        # splits about every 4GB
        if (i+1)%fsplit==0:
            w.close()
            w=open(fdir+"output%d.txt"%(i/fsplit+1),'wb',50*(1024*1024))
    w.writelines(wbun)
    w.close()
Try running it in PyPy (https://pypy.org); it will run without changes to your code, and probably faster.
Also, C++ might be overkill, especially if you don't know it yet. Consider learning Go or D instead.
So I'm thinking this is one of those problems where I can't see the forest for the trees. Here is the assignment:
Using the file object input, write code that read an integer from a file called
rawdata into a variable datum (make sure you assign an integer value to datum).
Open the file at the beginning of your code, and close it at the end.
Okay, so first thing: I thought the input function was for assigning data to an object such as a variable, not for reading data from an object. Wouldn't that be read.file_name?
But I gave it a shot:
infile = open('rawdata','r')
datum = int(input.infile())
infile.close()
Now the first problem... MyProgrammingLab doesn't want to grade it. By that I mean I type in the code, click 'submit', and I get the "Checking" screen. And that's it. At the time of writing this, my latest attempt to submit has been 'checking' for 11 minutes. It's not giving me an error; it's just stuck 'checking', I guess.
At the moment I can't use Python to try the program myself: the submission has been loading for a while, and I'm on a school computer that is write-locked, so even if I have the code right (I doubt it), the program will fail to run because it can neither find the file rawdata nor create it.
So... what's the deal? Am I reading the instructions wrong, or is it telling me to use input in some other way than I'm trying to use it? Or am I supposed to be using a different method?
You are so close. You're just using the file object slightly incorrectly. Once it's open, you can just .read() it, and get the value.
It would probably look something like this
infile = open('rawdata','r')
datum = int(infile.read())
infile.close()
I feel like your confusion is based purely on the wording of the question - the term "file object input" can certainly be confusing if you haven't worked with Python I/O before. In this case, the "file object" is infile and the "input" would be the rawdata file, I suppose.
Currently taking this class and figured this out. This is my contribution to all of us college peeps just trying to make it through, lol. MPL accepts this answer.
input = open('rawdata','r')   # the "file object input" from the prompt (this shadows the built-in input)
datum = int(input.readline())
input.close()
I want to pass an input FASTA file stored in a variable, say inp_a, from Python to bowtie and write the output into another, out_a. I want to use
os.system ('bowtie [options] inp_a out_a')
Can you help me out?
Your question asks for two things, as far as I can tell: writing data to disk, and calling an external program from within Python. Without more detailed requirements, here's what I would write:
import subprocess

data_for_bowtie = b"some genome data, lol"   # bytes, to match the binary-mode open below

with open("input.fasta", "wb") as input_file:
    input_file.write(data_for_bowtie)

subprocess.call(["bowtie", "input.fasta", "output.something"])
There are some fine details here which I have assumed. I'm assuming that you mean bowtie, the read aligner. I'm assuming that your file is a binary, non-human-readable one (which is why there's that b in the second argument to open) and I'm making baseless assumptions about how to call bowtie on the command line because I'm not motivated enough to spend the time learning it.
Hopefully, that provides a starting point. Good luck!
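And if the file names live in variables, as in the question, the list form makes that trivial (the actual bowtie arguments here are assumptions, not gospel):

import subprocess

inp_a = "input.fasta"    # path produced earlier in your pipeline
out_a = "output.map"     # hypothetical output name

# Passing arguments as a list avoids shell-quoting headaches entirely.
subprocess.call(["bowtie", inp_a, out_a])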