PuLP error: "Error while executing " + self.path (Python)

I am using python to read data from a .xlsm excel file. I have two files that are nearly identical and are saved in the same directory. When I give the python program one excel sheet, it correctly reads the data and solves the problem. However, with the other excel sheet I get the following error.
(I blocked out my name with ####)
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
solve("updated_excel.xlsm")
File "C:\Documents and Settings\#####\My Documents\GlockNew.py", line 111, in solve
prob.solve()
File "C:\Python27\lib\site-packages\pulp-1.5.4-py2.7.egg\pulp\pulp.py", line 1614, in solve
status = solver.actualSolve(self, **kwargs)
File "C:\Python27\lib\site-packages\pulp-1.5.4-py2.7.egg\pulp\solvers.py", line 1276, in actualSolve
return self.solve_CBC(lp, **kwargs)
File "C:\Python27\lib\site-packages\pulp-1.5.4-py2.7.egg\pulp\solvers.py", line 1343, in solve_CBC
raise PulpSolverError, "Pulp: Error while executing "+self.path
PulpSolverError: Pulp: Error while executing C:\Python27\lib\site-packages\pulp-1.5.4-py2.7.egg\pulp\solverdir\cbc.exe
I don't know what "Pulp: Error while executing " + self.path means, but both files are stored in the same directory, and the problem only appears once I try to solve the problem. Does anyone have an idea as to what could possibly trigger such an error?
EDIT
After further debugging, I have found that the error lies in the solve_CBC method in the COIN_CMD class. The error occurs here:
if not os.path.exists(tmpSol):
    raise PulpSolverError, "Pulp: Error while executing "+self.path
When I run the solver for both excel sheets, they have the same value for tmpSol: 4528-pulp.sol
However, when I run it for one excel sheet os.path.exists(tmpSol) returns True, and for the other it returns False. How can that be, when tmpSol has the same value both times?

The name is created using the process id; if you have some sort of batch job that launches both solver applications from one process, then they will have the same name.

I experienced the same issue when launching multiple instances of the LPSolver class. The issue is caused by the following lines of code within the solvers.py file of pulp:
pid = os.getpid()
tmpLp = os.path.join(self.tmpDir, "%d-pulp.lp" % pid)
tmpMps = os.path.join(self.tmpDir, "%d-pulp.mps" % pid)
tmpSol = os.path.join(self.tmpDir, "%d-pulp.sol" % pid)
which appears in every solver. The problem is that these paths are deleted later on, but may coincide for different instances of the LPSolver class, because pid is shared by every instance created in the same process.
The solution is to get a unique path for each instance of LPSolver, using, for example, the current time. Replacing the above lines by the following four will do the trick.
currentTime = time()
tmpLp = os.path.join(self.tmpDir, "%f-pulp.lp" % currentTime)
tmpMps = os.path.join(self.tmpDir, "%f-pulp.mps" % currentTime)
tmpSol = os.path.join(self.tmpDir, "%f-pulp.sol" % currentTime)
Don't forget to
from time import time
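For extra safety, a uuid-based token sidesteps both pid reuse and same-microsecond timestamp collisions. A sketch of the idea (the unique_tmp_paths helper is hypothetical, not part of PuLP):

```python
import os
import uuid

def unique_tmp_paths(tmp_dir):
    """Build solver temp-file paths that cannot collide across
    LPSolver instances, even inside a single process."""
    token = uuid.uuid4().hex  # unique per call, unlike os.getpid()
    tmp_lp = os.path.join(tmp_dir, "%s-pulp.lp" % token)
    tmp_mps = os.path.join(tmp_dir, "%s-pulp.mps" % token)
    tmp_sol = os.path.join(tmp_dir, "%s-pulp.sol" % token)
    return tmp_lp, tmp_mps, tmp_sol
```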
Cheers,
Tim

Related

VS Code Python terminal keeps printing "Found" when it should ask the user for an input

I am taking the CS50 class, currently on Week 7.
Prior to this, Python was working perfectly fine.
Now I am using SQL commands within a Python file in VS Code.
The cs50 module is working fine through venv.
When I execute the Python file, I should be asked "Title: " so that I can type any title and see the outcome.
I should get an output of the counter, which tracks the number of occurrences of the title from user input.
import csv
from cs50 import SQL

db = SQL("C:\\Users\\wf user\\Desktop\\CODING\\CS50\\shows.db")

title = input("Title: ").strip()

# use a SQL query to count occurrences of the title the user typed (? is a placeholder for title)
rows = db.execute("SELECT COUNT(*) AS counter FROM shows WHERE title LIKE ?", title)

# db.execute always returns a list of rows, even if it's just one row
row = rows[0]

# the key "counter" holds the count value in that row
print(row["counter"])
I have shows.db in the path.
But the output is printing "Found". It doesn't even ask for a title to input.
PS C:\Users\wf user\Desktop\CODING\CS50> python favoritesS.py
Found
I am expecting the program to ask me "Title: ", but instead it prints "Found".
In CS50, the professor encountered the same problem while coding phonebook.py; he solved it by putting the Python file into a separate folder called "tmp".
I tried the same approach, but then I was given a long error message:
PS C:\Users\wf user\Desktop\CODING\CS50> cd tmp
PS C:\Users\wf user\Desktop\CODING\CS50\tmp> python favoritesS.py
Traceback (most recent call last):
File "C:\Users\wf user\Desktop\CODING\CS50\tmp\favoritesS.py", line 5, in <module>
db = SQL("C:\\Users\\wf user\\Desktop\\CODING\\CS50\\shows.db")
File "C:\Users\wf user\AppData\Local\Programs\Python\Python311\Lib\site-packages\cs50\sql.py", line 74, in __init__
self._engine = sqlalchemy.create_engine(url, **kwargs).execution_options(autocommit=False, isolation_level="AUTOCOMMIT")
File "<string>", line 2, in create_engine
File "C:\Users\wf user\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\util\deprecations.py", line 309, in warned
return fn(*args, **kwargs)
File "C:\Users\wf user\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\create.py", line 518, in create_engine
u = _url.make_url(url)
File "C:\Users\wf user\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\url.py", line 732, in make_url
return _parse_url(name_or_url)
File "C:\Users\wf user\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\url.py", line 793, in _parse_url
raise exc.ArgumentError(
sqlalchemy.exc.ArgumentError: Could not parse SQLAlchemy URL from string 'C:\Users\wf user\Desktop\CODING\CS50\shows.db'
When I use Start Debugging under the Run menu in VS Code, it works! But not when I run it without debugging.
Is this the library you are using? https://cs50.readthedocs.io/
It may be that one of your intermediate results is not doing what you think it is. I would recommend you put print() statements at every step of the way to see the values of the intermediate variables.
If you have learned how to use a debugger, that is even better.
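On the second traceback: cs50's SQL wrapper hands the string to sqlalchemy.create_engine, which expects a database URL such as sqlite:///path, not a bare Windows path. A sketch of building one (the sqlite_url helper is mine, not part of the cs50 library):

```python
import pathlib

def sqlite_url(path):
    # SQLAlchemy URLs need a sqlite:/// scheme and forward slashes;
    # a raw Windows path with backslashes cannot be parsed as a URL
    return "sqlite:///" + pathlib.PureWindowsPath(path).as_posix()

url = sqlite_url(r"C:\Users\wf user\Desktop\CODING\CS50\shows.db")
# url == "sqlite:///C:/Users/wf user/Desktop/CODING/CS50/shows.db"
# then: db = SQL(url)
```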

Python multiprocessing apply_async "assert left > 0" AssertionError

I am trying to load numpy files asynchronously in a Pool:
self.pool = Pool(2, maxtasksperchild=1)
...
nextPackage = self.pool.apply_async(loadPackages, (...))
for fi in np.arange(len(files)):
    packages = nextPackage.get(timeout=30)
    # preload the next package asynchronously; it will be available
    # by the time it is required
    nextPackage = self.pool.apply_async(loadPackages, (...))
The method "loadPackages":
def loadPackages(... (2 strings & 2 ints) ...):
    print("This isn't printed!")
    packages = {
        "TRUE": np.load(gzip.GzipFile(path1, "r")),
        "FALSE": np.load(gzip.GzipFile(path2, "r"))
    }
    return packages
Before even the first "package" is loaded, the following error occurs:
Exception in thread Thread-8:
Traceback (most recent call last):
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\threading.py", line 914, in _bootstrap_inner
    self.run()
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\pool.py", line 463, in _handle_results
    task = get()
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\connection.py", line 318, in _recv_bytes
    return self._get_more_data(ov, maxsize)
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\connection.py", line 337, in _get_more_data
    assert left > 0
AssertionError
I monitor the resources closely: Memory is not an issue, I still have plenty left when the error occurs.
The unzipped files are just plain multidimensional numpy arrays.
Individually, using a Pool with a simpler method works, and loading the files like that works; only in combination does it fail.
(All this happens in a custom keras generator. I doubt this helps but who knows.) Python 3.5.
What could the cause of this issue be? How can this error be interpreted?
Thank you for your help!
There is a bug in the Python C core code that prevents data responses bigger than 2 GB from returning correctly to the main thread.
You need to either split the data into smaller chunks as suggested in the previous answer or not use multiprocessing for this function.
I reported this bug to the Python bugs list (https://bugs.python.org/issue34563) and created a PR (https://github.com/python/cpython/pull/9027) to fix it, but it will probably take a while to get released (UPDATE: the fix is present in Python 3.8.0+).
If you are interested, you can find more details on what causes the bug in the bug description at the link I posted.
I think I've found a workaround by retrieving data in small chunks. In my case it was a list of lists.
I had:
for i in range(0, NUMBER_OF_THREADS):
    print('MAIN: Getting data from process ' + str(i) + ' proxy...')
    X_train.extend(ListasX[i]._getvalue())
    Y_train.extend(ListasY[i]._getvalue())
    ListasX[i] = None
    ListasY[i] = None
    gc.collect()
Changed to:
CHUNK_SIZE = 1024
for i in range(0, NUMBER_OF_THREADS):
    print('MAIN: Getting data from process ' + str(i) + ' proxy...')
    for k in range(0, len(ListasX[i]), CHUNK_SIZE):
        X_train.extend(ListasX[i][k:k+CHUNK_SIZE])
        Y_train.extend(ListasY[i][k:k+CHUNK_SIZE])
    ListasX[i] = None
    ListasY[i] = None
    gc.collect()
And now it seems to work, possibly by serializing less data at a time.
So maybe if you can segment your data into smaller portions you can overcome the issue. Good luck!
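A self-contained sketch of the same chunked-retrieval pattern, using a Manager list proxy (names and sizes are illustrative, not from the original code). Each slice read is a separate round-trip, so no single transfer approaches the 2 GB pickle limit:

```python
import multiprocessing as mp

CHUNK_SIZE = 1024

def fill(shared, n):
    # child process fills a managed list with rows
    shared.extend([[i, i * 2] for i in range(n)])

def collect_chunked(n):
    """Fetch a managed list back to the parent in CHUNK_SIZE slices;
    each slice is its own proxy round-trip, keeping every single
    transfer small."""
    with mp.Manager() as manager:
        shared = manager.list()
        p = mp.Process(target=fill, args=(shared, n))
        p.start()
        p.join()
        out = []
        for k in range(0, len(shared), CHUNK_SIZE):
            out.extend(shared[k:k + CHUNK_SIZE])
    return out

if __name__ == "__main__":
    rows = collect_chunked(3000)
    print(len(rows))  # 3000
```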

Don't understand this ConfigParser.InterpolationSyntaxError

So I have tried to write a small config file for my script, which should specify an IP address, a port and a URL which should be created via interpolation using the former two variables. My config.ini looks like this:
[Client]
recv_url : http://%(recv_host):%(recv_port)/rpm_list/api/
recv_host = 172.28.128.5
recv_port = 5000
column_list = Name,Version,Build_Date,Host,Release,Architecture,Install_Date,Group,Size,License,Signature,Source_RPM,Build_Host,Relocations,Packager,Vendor,URL,Summary
In my script I parse this config file as follows:
config = SafeConfigParser()
config.read('config.ini')
column_list = config.get('Client', 'column_list').split(',')
URL = config.get('Client', 'recv_url')
If I run my script, this results in:
Traceback (most recent call last):
File "server_side_agent.py", line 56, in <module>
URL = config.get('Client', 'recv_url')
File "/usr/lib64/python2.7/ConfigParser.py", line 623, in get
return self._interpolate(section, option, value, d)
File "/usr/lib64/python2.7/ConfigParser.py", line 691, in _interpolate
self._interpolate_some(option, L, rawval, section, vars, 1)
File "/usr/lib64/python2.7/ConfigParser.py", line 716, in _interpolate_some
"bad interpolation variable reference %r" % rest)
ConfigParser.InterpolationSyntaxError: bad interpolation variable reference '%(recv_host):%(recv_port)/rpm_list/api/'
I have tried debugging, which resulted in giving me one more line of error code:
...
ConfigParser.InterpolationSyntaxError: bad interpolation variable reference '%(recv_host):%(recv_port)/rpm_list/api/'
Exception AttributeError: "'NoneType' object has no attribute 'path'" in <function _remove at 0x7fc4d32c46e0> ignored
Here I am stuck. I don't know where this _remove function is supposed to be... I tried searching for what the message is supposed to tell me, but quite frankly I have no idea. So...
Is there something wrong with my code?
What does '< function _remove at ... >' mean?
There was indeed a mistake in my config.ini file. I had not treated the s at the end of %(...)s as a necessary syntax element, but it is: the trailing letter is a printf-style conversion specifier, and s does stand for "string".
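For reference, here is the corrected config in a minimal round-trip. This uses Python 3's configparser; the original question used Python 2's SafeConfigParser, whose interpolation syntax is the same:

```python
import configparser

config = configparser.ConfigParser()
# every interpolation reference needs the trailing "s": %(name)s
config.read_string("""
[Client]
recv_host = 172.28.128.5
recv_port = 5000
recv_url = http://%(recv_host)s:%(recv_port)s/rpm_list/api/
""")
print(config.get("Client", "recv_url"))
# http://172.28.128.5:5000/rpm_list/api/
```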
My .ini file for starting the Python Pyramid server had a similar problem.
To use a variable from the .env file, I needed to write it as: %%(VARIABLE_FOR_EXAMPLE)s
But I got other problems, and I solved them with this: How can I use a system environment variable inside a pyramid ini file?

luigi target for non-existent table

I'm trying to set up a simple table-existence test for a luigi task using luigi.hive.HiveTableTarget.
I create a simple table in hive just to make sure it is there:
create table test_table (a int);
Next I set up the target with luigi:
from luigi.hive import HiveTableTarget
target = HiveTableTarget(table='test_table')
>>> target.exists()
True
Great, next I try it with a table I know doesn't exist to make sure it returns false.
target = HiveTableTarget(table='test_table_not_here')
>>> target.exists()
And it raises an exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 344, in exists
return self.client.table_exists(self.table, self.database)
File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 117, in table_exists
stdout = run_hive_cmd('use {0}; describe {1}'.format(database, table))
File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 62, in run_hive_cmd
return run_hive(['-e', hivecmd], check_return_code)
File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 56, in run_hive
stdout, stderr)
luigi.hive.HiveCommandError: ('Hive command: hive -e use default; describe test_table_not_here
failed with error code: 17', '',
'\nLogging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/hive-common-0.13.1-cdh5.2.0.jar!/hive-log4j.properties\nOK\nTime taken: 0.822 seconds\nFAILED: SemanticException [Error 10001]: Table not found test_table_not_here\n')
I don't understand that last line of the exception. Of course the table is not found, that is the whole point of an existence check. Is this the expected behavior or do I have some configuration issue I need to work out?
Okay, so it looks like this may have been a bug in the latest tagged release (1.0.19), but it is fixed on the master branch. The code responsible is:
stdout = run_hive_cmd('use {0}; describe {1}'.format(database, table))
return not "does not exist" in stdout
which is changed in the master to be:
stdout = run_hive_cmd('use {0}; show tables like "{1}";'.format(database, table))
return stdout and table in stdout
The latter works fine whereas the former throws a HiveCommandError.
If you want a solution without having to update to the master branch, you could create your own target class with minimal effort:
from luigi.hive import HiveTableTarget, run_hive_cmd
class MyHiveTarget(HiveTableTarget):
    def exists(self):
        stdout = run_hive_cmd('use {0}; show tables like "{1}";'.format(self.database, self.table))
        return self.table in stdout
This will produce the desired output.

Pymongo failing but won't give exception

Here is the query in Pymongo
import mong  # just my library for initializing

collection_1 = mong.init(collect="col_1")
collection_2 = mong.init(collect="col_2")

for name in collection_2.find({"field1": {"$exists": 0}}):
    try:
        to_query = name['something']
        actual_id = collection_1.find_one({"something": to_query})['_id']
        crap_id = name['_id']
        collection_2.update({"_id": crap_id}, {"$set": {"new_name": actual_id}}, upsert=True)
    except:
        open('couldn_find_id.txt', 'a').write(name)
All this is doing is taking a field from one collection, finding the id of that field and updating the id of another collection. It works for about 1000-5000 iterations, but periodically fails with this and then I have to restart the script.
> Traceback (most recent call last):
File "my_query.py", line 6, in <module>
for name in collection_2.find({"field1":{"$exists":0}}):
File "/home/user/python_mods/pymongo/pymongo/cursor.py", line 814, in next
if len(self.__data) or self._refresh():
File "/home/user/python_mods/pymongo/pymongo/cursor.py", line 776, in _refresh
limit, self.__id))
File "/home/user/python_mods/pymongo/pymongo/cursor.py", line 720, in __send_message
self.__uuid_subtype)
File "/home/user/python_mods/pymongo/pymongo/helpers.py", line 98, in _unpack_response
cursor_id)
pymongo.errors.OperationFailure: cursor id '7578200897189065658' not valid at server
^C
bye
Does anyone have any idea what this failure is, and how I can turn it into an exception to continue my script even at this failure?
Thanks
The reason for the problem is described in pymongo's FAQ:
Cursors in MongoDB can timeout on the server if they’ve been open for
a long time without any operations being performed on them. This can
lead to an OperationFailure exception being raised when attempting to
iterate the cursor.
This is because of the timeout argument of collection.find():
timeout (optional): if True (the default), any returned cursor is
closed by the server after 10 minutes of inactivity. If set to False,
the returned cursor will never time out on the server. Care should be
taken to ensure that cursors with timeout turned off are properly
closed.
Passing timeout=False to the find should fix the problem:
for name in collection_2.find({"field1":{"$exists":0}}, timeout=False):
But, be sure you are closing the cursor properly.
Also see:
mongodb cursor id not valid error
