I'm trying to run boto3 to loop through snapshots older than 14 days.
It can find all the snapshots older than 14 days fine, and I've verified that all that works okay. The problem is when it runs through the dictionary trying to delete, it looks like the function isn't correctly evaluating the variable (See below).
It seems to just include it as a string.
The loop runs through the dict using a "for snapshot in ..." if'ing the tags to find the snapshots ready for deletion. Here's the 'if' part:
if snap_start_time < expiry: # check if it's more than a <expiry> old
    print "Deleting Snapshot: " + snapshot['SnapshotId']
    response = ec2client.delete_snapshot(
        SnapshotId=snapshot['SnapshotId']
    )
It errors here:
Deleting Snapshot: snap-f4f0079d
Traceback (most recent call last):
File "./aws-snap.py", line 27, in <module>
SnapshotId=snapshot['SnapshotId']
File "/usr/lib/python2.6/site-packages/botocore/client.py", line 159, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/lib/python2.6/site-packages/botocore/client.py", line 494, in _make_api_call
raise ClientError(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidSnapshot.NotFound) when calling the DeleteSnapshot operation: None
Any clues? \o/
I suspect the SnapshotId is not being passed as a string.
Change the SnapshotId to a string format and pass it for deletion.
str(snapshot['SnapshotId'])
As it turns out, referencing straight from the dictionary is a bad idea. It needs to be wrapped in str() and provided with the DryRun=False option too.
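For reference, the age check described in the question can be sketched as a small standalone function. This is only an illustration written for Python 3: the helper name expired_snapshot_ids and the sample data are made up, and it assumes each snapshot dict carries a timezone-aware StartTime, as in the 'Snapshots' list returned by describe_snapshots().

```python
from datetime import datetime, timedelta, timezone

def expired_snapshot_ids(snapshots, now, max_age_days=14):
    """Return the SnapshotId (as str) of every snapshot started more than max_age_days before now."""
    expiry = now - timedelta(days=max_age_days)
    return [str(s["SnapshotId"]) for s in snapshots if s["StartTime"] < expiry]

# Made-up sample data standing in for a describe_snapshots() response.
now = datetime(2016, 1, 31, tzinfo=timezone.utc)
sample = [
    {"SnapshotId": "snap-old", "StartTime": datetime(2016, 1, 1, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-new", "StartTime": datetime(2016, 1, 30, tzinfo=timezone.utc)},
]
ids = expired_snapshot_ids(sample, now)
print(ids)  # ['snap-old']

# Each id could then be passed on (not run here):
# ec2client.delete_snapshot(SnapshotId=snapshot_id, DryRun=False)
```

Keeping the filter separate from the delete call makes it easy to print and inspect the ids before anything is removed.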
I have a user submitted data validation interface for a scientific site in django, and I want the user to be able to submit files of scientific data that will aid them in resolving simple problems with their data before they're allowed to make a formal submission (to reduce workload on the curators who actually load the data into our database).
The validation interface re-uses the loading code, which is good for code re-use. It has a "validate mode" that doesn't change the database. Everything is in an atomic transaction block and it gets rolled back in any case when it runs in validate mode.
I'm in the middle of a refactor to alleviate a problem. The problem is that the user has to submit the files multiple times, each time, getting the next error. So I've been refining the code to be able to "buffer" the exceptions in an array and only really stop if any error makes further processing impossible. So far, it's working great.
Since unexpected errors are expected in this interface (because the data is complex and lab users are continually finding new ways to screw up the data), I am catching and buffering any exception and intend to write custom exception classes for each case as I encounter them.
The problem is that when I'm adding new features and encounter a new error, the tracebacks in the buffered exceptions aren't being fully preserved, which makes it annoying to debug - even when I change the code to raise and immediately catch the exception so I can add it to the buffer with the traceback. For example, in my debugging, I may get an exception from a large block of code, and I can't tell what line it is coming from.
I have worked around this problem by saving the traceback as a string inside the buffered exception object, which just feels wrong. I had to play around in the shell to get it to work. Here is my simple test case to demonstrate what's happening. It's reproducible for me, but apparently not for others who try this toy example - and I don't know why:
import traceback
class teste(Exception):
    """This is an exception class I'm going to raise to represent some unanticipated exception - for which I will want a traceback."""
    pass
def buf(exc, args):
    """This represents my method I call to buffer an exception, but for this example, I just return the exception and keep it in main in a variable. The actual method in my code appends to a data member array in the loader object."""
    try:
        raise exc(*args)
    except Exception as e:
        # This is a sanity check that prints the trace that I will want to get from the buffered exception object later
        print("STACK:")
        traceback.print_stack()
        # This is my workaround where I save the trace as a string in the exception object
        e.past_tb = "".join(traceback.format_stack())
        return e
The above example raises the exception inside buf. (My original code supports both raising the exception for the first time and buffering an already raised and caught exception. In both cases, I wasn't getting a saved full traceback, so I'm only providing the one example case, where I raise it inside the buf method.)
And here's what I see when I use the above code in the shell. This first call shows my sanity check - the whole stack, which is what I want to be able to access later:
In [5]: es = buf(teste, ["This is a test"])
STACK:
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/manage.py", line 22, in <module>
main()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
utility.execute()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 413, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/base.py", line 354, in run_from_argv
self.execute(*args, **cmd_options)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/base.py", line 398, in execute
output = self.handle(*args, **options)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/commands/shell.py", line 100, in handle
return getattr(self, shell)(options)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/core/management/commands/shell.py", line 36, in ipython
start_ipython(argv=[])
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/__init__.py", line 126, in start_ipython
return launch_new_instance(argv=argv, **kwargs)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/traitlets/config/application.py", line 846, in launch_instance
app.start()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/terminal/ipapp.py", line 356, in start
self.shell.mainloop()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/terminal/interactiveshell.py", line 566, in mainloop
self.interact()
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/terminal/interactiveshell.py", line 557, in interact
self.run_cell(code, store_history=True)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2914, in run_cell
result = self._run_cell(
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2960, in _run_cell
return runner(coro)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 78, in _pseudo_sync_runner
coro.send(None)
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3185, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3377, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3457, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-5-92f4a0db918d>", line 1, in <module>
es = buf(teste, ["This is a test"])
File "<ipython-input-2-86e515dc1ec1>", line 6, in buf
traceback.print_stack()
But this is what I see when I want to see the original traceback from the es object (i.e. the buffered exception) later. It only has the last item from the traceback. This is exactly what I see in the original source code - a single item for the line of code inside the buffer method:
In [8]: traceback.print_exception(type(es), es, es.__traceback__)
Traceback (most recent call last):
File "<ipython-input-2-86e515dc1ec1>", line 3, in buf
raise exc(*args)
teste: This is a test
My workaround suffices for now, but I'd like to have a proper traceback object.
I debugged the issue by re-cloning our repo in a second directory to make sure I hadn't messed up my sandbox. I guess I should try this on another computer too - my office mac. But can anyone point me in the right direction to debug this issue? What could be the cause for losing the full traceback?
Python has a really weird way of building exception tracebacks. You might expect it to build the traceback when the exception is created, or when it's raised, but that's not how it works.
Python builds a traceback as an exception propagates. Every time the exception propagates up to a new stack frame, a traceback entry for that stack frame is added to the exception's traceback.
This means that an exception's traceback only goes as far as the exception itself propagates. If you catch it (and don't reraise it), the traceback only goes up to the point where it got caught.
Unfortunately, your workaround is about as good as it gets. You're not really losing the full traceback, because a full traceback was never created. If you want full stack info, you need to record it yourself, with something like the traceback.format_stack() function you're currently using.
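A small sketch of both points, using made-up function names (inner, middle, catch_here, outer) and a made-up attribute name (outer_frames): the traceback attached to a caught exception stops at the frame that caught it, and the frames above that point can be recorded at catch time with traceback.extract_stack().

```python
import traceback

def inner():
    raise ValueError("boom")

def middle():
    inner()

def catch_here():
    try:
        middle()
    except ValueError as e:
        # Frames above this one are not in e.__traceback__, so record them now.
        # The last entry of extract_stack() is this frame, which the traceback
        # already contains, so it is dropped to avoid a duplicate.
        e.outer_frames = traceback.extract_stack()[:-1]
        return e

def outer():
    return catch_here()

exc = outer()

# Only the frames from the catch point down to the raise are in the traceback.
tb_frames = traceback.extract_tb(exc.__traceback__)
tb_names = [frame.name for frame in tb_frames]
print(tb_names)  # ['catch_here', 'middle', 'inner']

# Stitching the recorded outer frames in front gives the full picture.
full = list(exc.outer_frames) + list(tb_frames)
print("".join(traceback.format_list(full)))
```

This keeps structured frame entries rather than a pre-formatted string, but it is the same idea as the format_stack() workaround: the outer stack has to be captured by hand at the point of the catch.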
I get the following error (edited out some parts, but same structure):
File "./project/calcs.py", line 43, in getVar
var = await olf.getOrg(*args, **kwargs)
File "./project/subView.py", line 177, in getOrg
undo = force.getLength()
File "./project/stack/refine.py", line 89, in getLength
for eTL in getLength():
blablabla.blablabla.issue: Can't redirect user
Am I supposed to look at the very FIRST error line to address the issue? Is it a chronology?
So in this case, I would be obliged to address calcs.py # line 43 to fix that issue? I wouldn't be able to fix it by, let's say, addressing the 2nd error or the last one, correct?
Is that also the way to read any Python error traceback?
You are probably looking at a stack trace which is ordered chronologically (oldest call first). In your case this means var = await olf.getOrg(*args, **kwargs) ran first which then ended up calling undo = force.getLength() and then for eTL in getLength(): which failed.
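The ordering can be checked with a toy example (all function names here are invented): the first entry of the traceback is the outermost call, and the last entry is the line that actually raised.

```python
import traceback

def get_length():
    raise RuntimeError("Can't redirect user")

def get_org():
    return get_length()

def get_var():
    return get_org()

def main():
    try:
        get_var()
    except RuntimeError as e:
        # extract_tb() returns entries in the same order they are printed.
        return [frame.name for frame in traceback.extract_tb(e.__traceback__)]

call_order = main()
print(call_order)  # ['main', 'get_var', 'get_org', 'get_length']
```

The exception was raised in get_length, the last entry; the earlier entries only show how execution got there.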
I'm trying to make a simple query with pymongo and loop over the results.
This is the code I'm using:
data = []
tam = db.my_collection.find({'timestamp': {'$gte': start, '$lte': end}}).count()
for i, d in enumerate(table.find({'timestamp': {'$gte': start, '$lte': end}})):
    print('%s of %s' % (i, tam))
    data.append(d)
The start and end variables are Python datetime objects. Everything runs fine until I get the following output:
2987 of 12848
2988 of 12848
2989 of 12848
2990 of 12848
2991 of 12848
2992 of 12848
Traceback (most recent call last):
File "db_extraction\extract_data.py", line 68, in <module>
data = extract_data(yesterday,days = 1)
File "db_extraction\extract_data.py", line 24, in extract_data
for i,d in enumerate(table.find({'timestamp': {'$gte': start, '$lte':end}}).limit(100000)):
File "\venv\lib\site-packages\pymongo\cursor.py", line 1169, in next
if len(self.__data) or self._refresh():
File "\venv\lib\site-packages\pymongo\cursor.py", line 1106, in _refresh
self.__send_message(g)
File "\venv\lib\site-packages\pymongo\cursor.py", line 971, in __send_message
codec_options=self.__codec_options)
File "\venv\lib\site-packages\pymongo\cursor.py", line 1055, in _unpack_response
return response.unpack_response(cursor_id, codec_options)
File "\venv\lib\site-packages\pymongo\message.py", line 945, in unpack_response
return bson.decode_all(self.documents, codec_options)
bson.errors.InvalidBSON
The first thing I tried was changing the range of the query to check whether it is data related, and it's not. Another range stops at 1615 of 6360 with the same error.
I've also tried list(table.find({'timestamp': {'$gte': start, '$lte':end}})) and got the same error.
Another possibly relevant detail is that the first queries are really fast. It freezes on the last number for a while before returning the error.
So I need some help. Am I hitting limits here? Or any clue on what's going on?
This might be related to this 2013 question, but the author says that he gets no error output.
Thanks!
EDIT:
First, thank you all for your time and suggestions. Unfortunately, I've tested all the suggestions and I get the same error at the same spot. I've printed the problematic file using the mongo shell and it is pretty much the same as all the others.
I changed the range of the query and tried picking up other days. Same problem in all days, until I found one random run that gave me a MEMORY ERROR.
1737 of 8011
1738 of 8011
1739 of 8011
1740 of 8011
1741 of 8011
Traceback (most recent call last):
File "db_extraction\pymongo_test.py", line 14, in <module>
for post in all_posts:
File "\python_modules\venv\lib\site-packages\pymongo\cursor.py", line 1189, in next
if len(self.__data) or self._refresh():
File "\python_modules\venv\lib\site-packages\pymongo\cursor.py", line 1126, in _refresh
self.__send_message(g)
File "\python_modules\venv\lib\site-packages\pymongo\cursor.py", line 931, in __send_message
operation, exhaust=self.__exhaust, address=self.__address)
File "\python_modules\venv\lib\site-packages\pymongo\mongo_client.py", line 1145, in _send_message_with_response
exhaust)
File "\python_modules\venv\lib\site-packages\pymongo\mongo_client.py", line 1156, in _reset_on_error
return func(*args, **kwargs)
File "\python_modules\venv\lib\site-packages\pymongo\server.py", line 106, in send_message_with_response
reply = sock_info.receive_message(request_id)
File "\python_modules\venv\lib\site-packages\pymongo\pool.py", line 612, in receive_message
self._raise_connection_failure(error)
File "\python_modules\venv\lib\site-packages\pymongo\pool.py", line 745, in _raise_connection_failure
raise error
File "\python_modules\venv\lib\site-packages\pymongo\pool.py", line 610, in receive_message
self.max_message_size)
File "\python_modules\venv\lib\site-packages\pymongo\network.py", line 191, in receive_message
data = _receive_data_on_socket(sock, length - 16)
File "\python_modules\venv\lib\site-packages\pymongo\network.py", line 227, in _receive_data_on_socket
buf = bytearray(length)
MemoryError
This is intermittent. I ran it again without changing anything and got the old InvalidBSON error, then ran it again and got the MemoryError.
I started the task manager and ran it again, and the memory indeed grows fast up to 95% usage and hangs there. The query should retrieve something like 1GB of data on a machine with 8GB of RAM, so I don't know if this is supposed to happen. Anyway, a code suggestion that retrieves the data from MongoDB with pymongo and writes it to a file without putting everything into memory would probably do the job. The bonus would be if someone could explain why I'm getting InvalidBSON instead of MemoryError (for the vast majority of runs) in my case.
Thanks
Your code runs fine on my computer. Since it works for your first 2992 records, I think the documents may have some inconsistency. Does every document in your collection follow the same schema and format? And is your pymongo up to date?
Here is my suggestion if you want to loop through every record:
data = []
all_posts = db.my_collection.find({'timestamp': {'$gte': start, '$lte':end}})
tam = all_posts.count()
i = 0
for post in all_posts:
    i += 1
    print('%s of %s' % (i,tam))
    data.append(post)
Regards,
I ran into this exact problem myself. It ended up having nothing to do with the documents themselves, but with the amount of memory the program was using during large queries.
In our specific case, when running the broken query that was giving us this exact bug by itself in a separate script, the bug didn't occur. Eventually we found that we were using a uwsgi config setting:
limit-as = 512
This would immediately kill our process when address space reached 512M, resulting in either an InvalidBSON error OR a MemoryError interchangeably, seemingly at random.
We fixed this by changing the limit-as setting to reload-on-as instead:
reload-on-as = 512
Ultimately we ended up deciding to break large queries like this into smaller pieces and perform them sequentially instead of all at once anyway, but we at least determined it was an external cause instead of an issue with the pymongo driver itself.
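One way to split a time-range query into smaller pieces is sketched below. The helper name time_windows and the six-hour step are assumptions for illustration, not what we actually ran. Note that the windows are half-open, so the per-window filter would use $lt on the upper bound rather than the $lte from the question.

```python
from datetime import datetime, timedelta

def time_windows(start, end, step):
    """Yield consecutive (lo, hi) pairs covering [start, end) in chunks of at most step."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi

start = datetime(2018, 1, 1)
end = datetime(2018, 1, 2)
windows = list(time_windows(start, end, timedelta(hours=6)))
print(len(windows))  # 4

# Hypothetical use with the query from the question (not run here):
# for lo, hi in time_windows(start, end, timedelta(hours=6)):
#     for doc in table.find({'timestamp': {'$gte': lo, '$lt': hi}}):
#         handle(doc)  # e.g. write to a file instead of appending to a list
```

Each window is fetched and processed before the next one starts, so only one piece of the result needs to be held at a time as long as the documents are not accumulated in a list.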
Could it be related to specific documents in the DB? Have you checked the document that might cause the error (e.g., the 2992nd result of your above query, starting with 0)?
You could also execute some queries against the DB directly (e.g., via the mongo shell) without using pymongo to see whether expected results are returned. For example, you could try db.my_collection.find({...}).skip(2992) to see the result. You could also use cursor.forEach() to print all the retrieved documents.
I'm working with a Lambda function where I use boto3 to put_item() into my DynamoDB table, and in the code I'm adding the ttl (time to live) parameter.
ttl = str(int(time.time() + 2629746))
This line gives me a one-month TTL, but for some reason I'm getting a lot of these errors:
An error occurred (ValidationException) when calling the PutItem operation: The parameter cannot be converted to a numeric value: undefined: ClientError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 52, in handler
response = c_put_item(d)
File "/var/task/lambda_function.py", line 40, in c_put_item
'ttl':{'N':ttl}
File "/var/runtime/botocore/client.py", line 317, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/var/runtime/botocore/client.py", line 615, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the PutItem operation: The parameter cannot be converted to a numeric value: undefined
Any idea why?
PS: I'm using Python 3.
-- EDIT:
I'm adding a little more of the code.
For some reason this isn't working.
ttl = str(int(time.time() + 2629746))
response = client.put_item(TableName='MYTABLENAME',Item={
    'item':{'S':item},
    'title':{'S':title},
    'link':{'S':link},
    'price':{'N':price},
    'category':{'S':category},
    'avaliable':{'S':avaliable},
    'image':{'S':image},
    'ttl':{'N':ttl}
})
-- EDIT2:
The AWS docs specify that you should use put_item() as I did, except I'm forced to use str() because I was getting an error otherwise.
@kichik was actually right in his comment. This error doesn't necessarily say ttl is None. It says one of the fields of the entire call is None.
So once I detected the conflicting field, I added an exception for it and the problem stopped.
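A minimal sketch of how the conflicting field might be located before calling put_item(). The helper name bad_number_attrs and the sample values are made up; it assumes the low-level client item format shown above, where every 'N' value is expected to be a string holding a number.

```python
from decimal import Decimal, InvalidOperation

def bad_number_attrs(item):
    """Return names of attributes whose 'N' value is not a string holding a finite number."""
    bad = []
    for name, typed_value in item.items():
        if "N" not in typed_value:
            continue
        value = typed_value["N"]
        try:
            ok = isinstance(value, str) and Decimal(value).is_finite()
        except InvalidOperation:
            ok = False
        if not ok:
            bad.append(name)
    return bad

item = {
    'title': {'S': 'example'},
    'price': {'N': 'undefined'},  # made-up bad value
    'ttl': {'N': '1700000000'},
}
print(bad_number_attrs(item))  # ['price']
```

Running a check like this just before the call shows which attribute to skip or default, instead of guessing from the ValidationException message.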
Consider making ttl an integer: ttl = int(time.time() + 2629746)
Boto3 docs, Creating a new item:
For all of the valid types that can be used for an item, refer to Valid DynamoDB Types:
These are the valid item types to use with Boto3 Table Resource (dynamodb.Table) and DynamoDB:
• integer – Number (N)
• decimal.Decimal – Number (N)
Hello, I am running Python 2.5 on Windows, and whenever my application gets an exception, rather than seeing the debug information I get an error inside the traceback.py file itself. Does anyone know a fix for this, maybe a patch or a replacement file?
Traceback (most recent call last):
File "C:\Python25\lib\logging\__init__.py", line 744, in emit
msg = self.format(record)
File "C:\Python25\lib\logging\__init__.py", line 630, in format
return fmt.format(record)
File "C:\Python25\lib\logging\__init__.py", line 426, in format
record.exc_text = self.formatException(record.exc_info)
File "C:\Python25\lib\logging\__init__.py", line 398, in formatException
traceback.print_exception(ei[0], ei[1], ei[2], None, sio)
File "C:\Python25\lib\traceback.py", line 126, in print_exception
lines = format_exception_only(etype, value)
File "C:\Python25\lib\traceback.py", line 176, in format_exception_only
stype = etype.__name__
AttributeError: 'NoneType' object has no attribute '__name__'
===EDIT===
I found the same error on the mailing list here, but the answer seems outdated:
http://mail.python.org/pipermail/python-dev/2006-September/068975.html
Possible causes:
Calling logging.exception() when there is no active exception
Calling a logging function with exc_info=1, when there is no active exception.
Calling a logging function with exc_info=(None, None, None) (e.g. if doing the exception logging manually).
You should not use logging.exception outside of an except block.
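A short sketch of the intended usage, written for Python 3 rather than 2.5. The logger name "example" and the risky() function are made up, and the log output is captured in a StringIO only so that it can be inspected.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("example")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False  # keep the example output in our own handler only

def risky():
    return 1 / 0

try:
    risky()
except ZeroDivisionError:
    # Inside the except block there is an active exception, so the
    # traceback can be formatted and appended to the log message.
    logger.exception("risky() failed")

output = stream.getvalue()
print(output)
```

The same call placed outside the except block has no active exception to format, which is the situation described in the list of causes above.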
The exception is caused by a None exception type passed to traceback.print_exception, meaning that there is no active exception to process.
Meanwhile, the newsgroup posting you linked to indicates that it was a regression in the standard library that resulted in that particular traceback. You may want to try upgrading your Python to 2.5.1, which fixed this particular problem.