I've seen this error posted a lot, and often it was due to the file not being closed properly after opening. But since I'm using the built-in torch.load() function, I'm not sure what I could do differently.
First, the saving part:
torch.save({
    'model_state_dict': agent.dqn.state_dict(),
    ...
    'loss_history': agent.losshistory
}, modelpath)
and here is the loading part, where I also get the error message:
if os.path.exists(modelpath):
    checkpoint = torch.load(modelpath)
    agent.dqn.load_state_dict(checkpoint['model_state_dict'])
    ...
    agent.losshistory = checkpoint['loss_history']
and here is the error:
Traceback (most recent call last):
File "c:/Users/levin/Desktop/programming/main.py", line 33, in <module>
checkpoint = torch.load(modelpath)
File "C:\Users\levin\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 529, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "C:\Users\levin\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 702, in _legacy_load
result = unpickler.load()
EOFError: Ran out of input
One more thing I want to mention is that I used this exact code several times without a problem. I can't remember changing anything that could have caused the error.
According to this thread, the unpickler raises EOFError when it reads an empty file, so check the size of the file before loading it; an empty or truncated checkpoint usually means an earlier save was interrupted before it finished writing. Post a response if that doesn't solve it.
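For example, here is a minimal sketch (reusing modelpath and agent from the question) that guards against a missing or empty checkpoint file before calling torch.load():

import os
import torch

# Only attempt to unpickle the checkpoint if the file exists and is non-empty.
if os.path.exists(modelpath) and os.path.getsize(modelpath) > 0:
    checkpoint = torch.load(modelpath)
    agent.dqn.load_state_dict(checkpoint['model_state_dict'])
else:
    print(f"Checkpoint {modelpath!r} is missing or empty; starting from scratch.")

Writing the checkpoint to a temporary file first and renaming it afterwards (e.g. with os.replace) also prevents a crash mid-save from leaving a truncated file behind.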
I have a data-validation interface for user-submitted data on a scientific site in Django. I want users to be able to submit files of scientific data so the interface can help them resolve simple problems with their data before they're allowed to make a formal submission (to reduce the workload on the curators who actually load the data into our database).
The validation interface re-uses the loading code, which is good for code re-use. It has a "validate mode" that doesn't change the database: everything runs in an atomic transaction block, and in validate mode it is always rolled back.
I'm in the middle of a refactor to alleviate a problem: the user has to submit the files multiple times, each time getting only the next error. So I've been refining the code to "buffer" the exceptions in an array and only really stop if an error makes further processing impossible. So far, it's working great.
Since unexpected errors are expected in this interface (because the data is complex and lab users are continually finding new ways to screw up the data), I am catching and buffering any exception and intend to write custom exception classes for each case as I encounter them.
The problem is that when I'm adding new features and encounter a new error, the tracebacks in the buffered exceptions aren't being fully preserved, which makes it annoying to debug - even when I change the code to raise and immediately catch the exception so I can add it to the buffer with the traceback. For example, in my debugging, I may get an exception from a large block of code, and I can't tell what line it is coming from.
I have worked around this problem by saving the traceback as a string inside the buffered exception object, which just feels wrong. I had to play around in the shell to get it to work. Here is my simple test case to demonstrate what's happening. It's reproducible for me, but apparently not for others who try this toy example - and I don't know why:
import traceback

class teste(Exception):
    """This is an exception class I'm going to raise to represent some unanticipated exception - for which I will want a traceback."""
    pass

def buf(exc, args):
    """This represents the method I call to buffer an exception, but for this example, I just return the exception and keep it in main in a variable. The actual method in my code appends to a data member array in the loader object."""
    try:
        raise exc(*args)
    except Exception as e:
        # This is a sanity check that prints the trace that I will want to get from the buffered exception object later
        print("STACK:")
        traceback.print_stack()
        # This is my workaround where I save the trace as a string in the exception object
        e.past_tb = "".join(traceback.format_stack())
        return e
The above example raises the exception inside buf. (My original code supports both raising an exception for the first time and buffering an already raised and caught exception. In both cases, I wasn't getting a saved full traceback, so I'm only providing the one example case, where I raise it inside the buf method.)
And here's what I see when I use the above code in the shell. This first call shows my sanity check - the whole stack, which is what I want to be able to access later:
In [5]: es = buf(teste, ["This is a test"])
STACK:
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/manage.py", line 22, in <module>
main()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
utility.execute()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 413, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/base.py", line 354, in run_from_argv
self.execute(*args, **cmd_options)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/base.py", line 398, in execute
output = self.handle(*args, **options)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/commands/shell.py", line 100, in handle
return getattr(self, shell)(options)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/commands/shell.py", line 36, in ipython
start_ipython(argv=[])
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/__init__.py", line 126, in start_ipython
return launch_new_instance(argv=argv, **kwargs)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/traitlets/config/application.py", line 846, in launch_instance
app.start()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/terminal/ipapp.py", line 356, in start
self.shell.mainloop()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/terminal/interactiveshell.py", line 566, in mainloop
self.interact()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/terminal/interactiveshell.py", line 557, in interact
self.run_cell(code, store_history=True)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2914, in run_cell
result = self._run_cell(
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2960, in _run_cell
return runner(coro)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 78, in _pseudo_sync_runner
coro.send(None)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3185, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3377, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3457, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-5-92f4a0db918d>", line 1, in <module>
es = buf(teste, ["This is a test"])
File "<ipython-input-2-86e515dc1ec1>", line 6, in buf
traceback.print_stack()
But this is what I see when I want to see the original traceback from the es object (i.e. the buffered exception) later. It only has the last item from the traceback. This is exactly what I see in the original source code - a single item for the line of code inside the buffer method:
In [8]: traceback.print_exception(type(es), es, es.__traceback__)
Traceback (most recent call last):
File "<ipython-input-2-86e515dc1ec1>", line 3, in buf
raise exc(*args)
teste: This is a test
My workaround suffices for now, but I'd like to have a proper traceback object.
I debugged the issue by re-cloning our repo in a second directory to make sure I hadn't messed up my sandbox. I guess I should try this on another computer too - my office Mac. But can anyone point me in the right direction to debug this issue? What could be causing the loss of the full traceback?
Python has a really weird way of building exception tracebacks. You might expect it to build the traceback when the exception is created, or when it's raised, but that's not how it works.
Python builds a traceback as an exception propagates. Every time the exception propagates up to a new stack frame, a traceback entry for that stack frame is added to the exception's traceback.
This means that an exception's traceback only goes as far as the exception itself propagates. If you catch it (and don't reraise it), the traceback only goes up to the point where it got caught.
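A tiny illustration of this (a sketch; the function names are made up): a caught exception's __traceback__ only contains the frames it actually propagated through, not anything above the point where it was caught.

import traceback

def inner():
    raise ValueError("boom")

def outer():
    inner()

try:
    outer()
except ValueError as e:
    # Shows the frames the exception propagated through (this try block,
    # outer, inner), but nothing above the frame where it was caught.
    print("".join(traceback.format_tb(e.__traceback__)))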
Unfortunately, your workaround is about as good as it gets. You're not really losing the full traceback, because a full traceback was never created. If you want full stack info, you need to record it yourself, with something like the traceback.format_stack() function you're currently using.
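If you want something closer to a conventional traceback instead of just a stack string, one option (a sketch using only the stdlib traceback module; buffer_exception is a hypothetical name, not the asker's actual method) is to stitch the stack above the catch point together with the frames the exception actually propagated through:

import traceback

def buffer_exception(e, buffer):
    # Frames above the catch point (dropping this function's own frame) ...
    outer = traceback.format_stack()[:-1]
    # ... plus the frames the exception propagated through before being caught.
    # Note the catch-point frame may appear in both lists.
    inner = traceback.format_tb(e.__traceback__)
    e.past_tb = "".join(outer + inner)
    buffer.append(e)

This is still string-based, but it records roughly the same frames a normal uncaught traceback would show.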
With the following code that I have for my bot, it's not booting up, and I don't understand why.
Here is my code (I have put it in a mystb.in link because my question had "too much code").
The full traceback:
Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000000004005D30>
Traceback (most recent call last):
File "C:\Users\Bi\AppData\Local\Programs\Python\Python38\lib\asyncio\proactor_events.py", line 116, in __del__
self.close()
File "C:\Users\Bi\AppData\Local\Programs\Python\Python38\lib\asyncio\proactor_events.py", line 108, in close
self._loop.call_soon(self._call_connection_lost, None)
File "C:\Users\Bi\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 711, in call_soon
self._check_closed()
File "C:\Users\Bi\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 504, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
And here is an image of how the error shows up in VSCode.
One thing I don't understand very well: when I put in a different token, it works perfectly fine.
It seems to me that the problem lies in reading from the config file in main at lines 6-13. Make sure that config.json contains all the variables you're trying to get and that they're read correctly.
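For example, a minimal sketch of a defensive read (the key name "token" is an assumption, since the actual config.json isn't shown) that fails fast instead of passing an empty token to the bot:

import json

with open("config.json") as f:
    config = json.load(f)

# Fail fast if the token is missing or empty, instead of letting the
# client crash later with a less obvious error.
token = config.get("token")
if not token:
    raise SystemExit("config.json has no 'token' entry")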
In fact, I went on replit and the error showed up clearly: Invalid token.
My bot just got deleted.
When I save a pipeline that has an ExecutionContext associated with it and try to load it again, I get the error shown below.
from neuraxle.base import ExecutionContext, Identity
from neuraxle.pipeline import Pipeline
PIPELINE_NAME = 'saved_pipeline_name'
cache_folder = 'cache_folder'
pipeline = Pipeline([
    Identity()
]).with_context(ExecutionContext(cache_folder))
pipeline.set_name(PIPELINE_NAME).save(ExecutionContext(cache_folder), full_dump=True)
loaded_pipeline = ExecutionContext(cache_folder).load(PIPELINE_NAME)
Error message:
Traceback (most recent call last):
File "save_example.py", line 12, in <module>
loaded_pipeline = ExecutionContext(cache_folder).load(PIPELINE_NAME)
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 555, in load
).load(context_for_loading, True)
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 3621, in load
return loaded_self.load(context, full_dump)
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 1708, in load
return self._load_step(context, savers)
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 1717, in _load_step
loaded_self = saver.load_step(loaded_self, context)
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 3644, in load_step
step.apply('_assert_has_services', context=context)
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 2316, in apply
results: RecursiveDict = self._apply_childrens(results=results, method=method, ra=ra)
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 2327, in _apply_childrens
for children in self.get_children():
File ".env/lib/python3.7/site-packages/neuraxle/base.py", line 2530, in get_children
return [self.wrapped]
AttributeError: 'StepWithContext' object has no attribute 'wrapped'
Without the with_context(ExecutionContext(cache_folder)), the loading works fine. Is this expected behaviour, or is it a bug? What would be the best practice for saving pipelines when working with execution contexts?
There was an erroneous call to a function in StepWithContext's saver. A hotfix will be pushed to Neuraxle's main repository in the next day or so. If you can wait until then, your code should execute with no problems.
If not, I'd suggest bypassing StepWithContext by calling save() directly on its wrapped step (i.e. your pipeline instance):
pipeline.wrapped.set_name(PIPELINE_NAME).save(ExecutionContext(cache_folder), full_dump=True)
loaded_pipeline = ExecutionContext(cache_folder).load(PIPELINE_NAME)
You'll then have to re-wrap the loaded_pipeline instance with a StepWithContext using the .with_context() call.
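A sketch of that re-wrap, reusing cache_folder from the code above:

# Re-attach an execution context to the freshly loaded pipeline.
loaded_pipeline = loaded_pipeline.with_context(ExecutionContext(cache_folder))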
When the hotfix is available, keep in mind that ExecutionContext instances are not saved at all and that, on loading, StepWithContext's context attribute is replaced with whatever context is used for the loading.
Feel free to ask me any other questions! I'll be glad to answer them.
Cheers
I am getting a dill/pickle memory error when loading a serialized object file. I am not quite sure what is happening, and I am unsure how to fix it.
When I call:
stat_bundle = train_batch_iterator(clf, TOTAL_TRAINED_EVENTS)
The code traces to the train_batch_iterator function, which loads a serialized object and trains the classifier with the data within the object. This is the code:
def train_batch_iterator(clf, tte):
    plot_data = []  # initialize plot data array
    for file in glob.glob('./SerializedData/Batch8172015_19999/*'):
        with open(file, 'rb') as stream:
            minibatch_train = dill.load(stream)
            clf.partial_fit(minibatch_train.data[1], minibatch_train.target,
                            classes=np.array([11, 111]))
            tte += len(minibatch_train.target)
            plot_data.append((test_batch_iterator(clf), tte))
    return plot_data
Here is the error:
Traceback (most recent call last):
File "LArSoftSGD-version2.0.py", line 154, in <module>
stat_bundle = train_batch_iterator(clf, TOTAL_TRAINED_EVENTS)
File "LArSoftSGD-version2.0.py", line 118, in train_batch_iterator
minibatch_train = dill.load(stream)
File "/home/jdoe/.local/lib/python3.4/site-packages/dill/dill.py", line 199, in load
obj = pik.load()
File "/home/jdoe/.local/lib/python3.4/pickle.py", line 1038, in load
dispatch[key[0]](self)
File "/home/jdoe/.local/lib/python3.4/pickle.py", line 1184, in load_binbytes
self.append(self.read(len))
File "/home/jdoe/.local/lib/python3.4/pickle.py", line 237, in read
return self.file_read(n)
MemoryError
I have no idea what could be going wrong. The error seems to come from the line minibatch_train = dill.load(stream), and the only thing I can think of is that the serialized data file is too large. However, the file is exactly 1161 MB, which doesn't seem big enough to cause a memory error.
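One way to sanity-check that hypothesis is to compare the file's size against the process's memory use just before the load (a diagnostic sketch; the file name is hypothetical, and resource.getrusage is Unix-specific, with ru_maxrss reported in KB on Linux):

import os
import resource

path = './SerializedData/Batch8172015_19999/batch_0'  # hypothetical file name
print('file size (MB):', os.path.getsize(path) / 2**20)
print('peak RSS (MB):', resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024)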
Does anybody know what might be going wrong?
So I have a pickled file that I would like to read and display the data from. I've never worked with pickled files before, but from a little research I found simple commands that should open it properly. Unfortunately, I receive some errors, which I show below:
import pickle
f = open("1965.pkl")
here = pickle.load(f)
Traceback (most recent call last):
File "<ipython-input-7-43273f8d751b>", line 1, in <module>
here = pickle.load(f)
File "D:\Anaconda\lib\pickle.py", line 1378, in load
return Unpickler(file).load()
File "D:\Anaconda\lib\pickle.py", line 858, in load
dispatch[key](self)
File "D:\Anaconda\lib\pickle.py", line 880, in load_eof
raise EOFError
EOFError
Not really sure what the issue is, since the EOFError doesn't come with its usual description.
Any help is much appreciated!
Try this:
here = pickle.load(open("1965.pkl", 'rb'))
[ Edit ]:
Or you wrote the pickle with the wrong flag.
For writing you should use 'wb'; for reading, 'rb'.
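A minimal round-trip sketch (the payload is made up for illustration) showing the matching flags:

import pickle

data = {'year': 1965}  # hypothetical payload

# Write in binary mode ('wb') ...
with open('1965.pkl', 'wb') as f:
    pickle.dump(data, f)

# ... and read it back in binary mode ('rb').
with open('1965.pkl', 'rb') as f:
    here = pickle.load(f)

print(here)  # {'year': 1965}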