A user asked (in "KeyError while using pandas in Python 2.7") why he was getting a KeyError while looking up keys in a dictionary, and how he could avoid the exception.
As an answer, I suggested that he check for the keys in the dictionary first. So, if he needed all the keys ['key_a', 'key_b', 'key_c'] in the dictionary, he could test for them with:
if not all([x in dictionary for x in ['key_a', 'key_b', 'key_c']]):
    continue
This way he could ignore dictionaries that didn't have the expected keys (the list of dictionaries is created from JSON-formatted lines loaded from a file). Refer to the original question for more details, if they are relevant here.
A user more experienced in Python and SO, whom I would consider an authority on the matter given his career and gold badges, told me I was using all incorrectly. I was wondering whether this is really the case (as far as I can tell, it works as expected) and why, or whether there is a better way to check that several keys are all in a dictionary.
Yes, that will work fine, but you don't even need the list comprehension:
if not all(x in dictionary for x in ['key_a', 'key_b', 'key_c']):
    continue
If you have the surrounding [], it will evaluate all the elements before calling all. If you remove them, the inner expression is a generator, and will short-circuit upon the first False.
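A small sketch of that difference, written so it runs on Python 2.7 or 3. The `has_key` helper and the `checked` list are made up for illustration; they only record which keys were actually tested.

```python
# Hypothetical helper: records every key that gets tested.
checked = []

def has_key(dictionary, key):
    checked.append(key)
    return key in dictionary

dictionary = {'key_b': 1, 'key_c': 2}   # 'key_a' is missing
required = ['key_a', 'key_b', 'key_c']

# List comprehension: all three tests run before all() sees anything.
list_result = all([has_key(dictionary, k) for k in required])
checked_with_list = list(checked)

del checked[:]  # reset the record

# Generator expression: all() stops at the first False.
gen_result = all(has_key(dictionary, k) for k in required)
checked_with_generator = list(checked)

# Another common spelling: iterating a dict yields its keys.
subset_result = set(required).issubset(dictionary)
```

Both forms give the same answer; the generator version simply does less work when an early key is missing.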
The context of what I'm doing: I'm translating if/then/else statements between two languages via a Python script (2.x for now, but I may eventually upgrade to 3.x). I have a function that takes the if/then/else statement from the original language and breaks it into a list of [if_clause,then_clause,else_clause]. The thing is, there may be (and often are) nested if statements in the then and/or else clauses. For example, I would pass a string like...
if (sim_time<=1242) then (new_tmaxF0740) else if (sim_time<=2338) then (new_tmaxF4170) else (new_tmaxF7100)
...to my function, and it would return the list...
['(sim_time<=1242)','(new_tmaxF0740)','if (sim_time<=2338) then (new_tmaxF4170) else (new_tmaxF7100)']
So, as you can see, in this case the else clause needs to be broken up further by running it through the same function I used to generate the list, this time passing only the last list element to that function. I am going about this by testing the original string to see whether it contains more than one if statement (I already have the regex for this), and my thought is to use a loop to create nested lists within the original list, which might then look like...
[if_clause,then_clause,[if_clause, then_clause, else_clause]]
These can be nested any number of times/to any dimension. My plan so far is to write a loop that looks for the next nested if statement (using a regex), and reassigns the list index where the if statement is found to the resultant list from applying my if_extract() function to break up the statement.
I feel like a list comprehension may not do this, because to find the indices, it seems like the list comprehension itself might have to change dynamically. Maybe this is better suited to map, but I'm not sure how to apply it. I ultimately want to iterate through the loop to return the index of the next (however deeply nested) if statement so I can continue breaking them apart with my function.
If I understand correctly, you could call your function recursively.
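def split_if_then_else(statement):
    # check_if_if_in_string_function and split_str_core_function stand in
    # for your own regex test and your if_extract()-style splitter.
    if check_if_if_in_string_function(statement):
        if_clause, then_clause, else_clause = split_str_core_function(statement)
        # Recurse into both branches, since either may hold a nested if.
        then_clause = split_if_then_else(then_clause)
        else_clause = split_if_then_else(else_clause)
        return [if_clause, then_clause, else_clause]
    else:
        return statement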
I didn't test it since I don't know exactly what functions you are using, but I think something like this should work.
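For a concrete version, here is a minimal sketch that assumes a much simpler grammar than the real one probably has: every statement looks like "if (...) then (...) else rest", the if and then clauses are plain parenthesised expressions with no nested parentheses, and only the else clause can hold a nested statement. The regex `IF_PATTERN` is my own guess, not the asker's.

```python
import re

# Assumed grammar: "if (...) then (...) else <rest>", where <rest> is either
# a parenthesised value or another if/then/else statement.
IF_PATTERN = re.compile(
    r'^\s*if\s*(\(.*?\))\s*then\s*(\(.*?\))\s*else\s*(.*)$')

def split_if_then_else(statement):
    match = IF_PATTERN.match(statement)
    if match is None:
        # Not an if statement: return the clause unchanged.
        return statement
    if_clause, then_clause, else_clause = match.groups()
    # Recurse so that nested statements become nested lists.
    return [if_clause,
            split_if_then_else(then_clause),
            split_if_then_else(else_clause)]

source = ('if (sim_time<=1242) then (new_tmaxF0740) else '
          'if (sim_time<=2338) then (new_tmaxF4170) else (new_tmaxF7100)')
result = split_if_then_else(source)
# result == ['(sim_time<=1242)', '(new_tmaxF0740)',
#            ['(sim_time<=2338)', '(new_tmaxF4170)', '(new_tmaxF7100)']]
```

Because the function calls itself on each clause, there is no need to track the index of the next nested statement; the recursion depth follows the nesting depth. A then clause that itself contains an if would need a real parser rather than this regex.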
I'm parsing an XML file ... so there's a field called case: sometimes it's a single OrderedDict, other times it's a list of OrderedDicts. For example:
OrderedDict([(u'duration', u'2.111'), (u'className', u'foo'), (u'testName', u'runTest'), (u'skipped', u'false'), (u'failedSince', u'0')])
[OrderedDict([(u'duration', u'0.062'), (u'className', u'foo'), (u'testName', u'runTest'), (u'skipped', u'false'), (u'failedSince', u'0')]), OrderedDict([(u'duration', u'0.461'), (u'className', u'bar'), (u'testName', u'runTest'), (u'skipped', u'false'), (u'failedSince', u'0')])]
I always want that value to be a list, so that a single for loop can handle both cases. I thought about doing something like:
[case]
But in the latter case I would end up with [[case]]. I don't think list joins or concatenations would help me. A trivial solution would be to check whether case is of type list or OrderedDict; however, I was looking for a simpler, one-line, Pythonic solution like the one I described above. How can I accomplish that?
Since list and OrderedDict are both kinds of containers, checking the type sounds like it might be the simplest solution, if you're sure that the XML parser will always use the list type.
There's no reason you can't do this in a one-liner:
case = [case] if not isinstance(case, list) else case
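The same check can be wrapped in a small helper so the for loop never has to care which shape it received. This is only a sketch; `as_list` is a name I made up, and the sample data is cut down from the question.

```python
from collections import OrderedDict

def as_list(case):
    """Return case unchanged if it is already a list, else wrap it in one."""
    return case if isinstance(case, list) else [case]

single = OrderedDict([('className', 'foo'), ('duration', '2.111')])
several = [OrderedDict([('className', 'foo')]),
           OrderedDict([('className', 'bar')])]

# The same loop now handles both shapes.
names_single = [c['className'] for c in as_list(single)]
names_several = [c['className'] for c in as_list(several)]
```

Note that an existing list is returned as the same object rather than copied.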
self.PARSE_TABLE={"$_ERROR":self.WEEK_ERRORS,"$_INFORM":self.WEEK_INFORM,"$_REDIR":self.WEEK_REDIRECTS,"$_SERVER_ERROR":self.WEEK_SERVER_ERROR,"$_BYTES":self.WEEK_BYTES,"$_HITS":self.WEEK_HITS}
for j in self.PARSE_TABLE:
    print j
    break
When I run this with my Python, the first element I get is $_REDIR. Can someone explain why?
Dictionaries don't maintain order. The order you get from iterating over them may not be the order in which you inserted the elements. This is the price you pay for near-instant lookup of values by key. In short, the behavior you are seeing is correct and expected, and may even vary from run to run of the Python interpreter.
This is normal behaviour. Dictionaries and sets are implemented with hash tables, so their iteration order is arbitrary. If you want the keys in sorted order, use sorted(self.PARSE_TABLE.keys()). You can also use OrderedDict from the collections module.
By default, a dictionary stores its keys in its own convenient order rather than in the order we gave.
If the order of the keys should be maintained, you can use OrderedDict, which is available in the collections module from Python 2.7 (and 3.1) onward.
P.S. I don't think sorting keys would do any help in preserving the order given.
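To make the two suggestions concrete, here is a minimal sketch. The integer values are placeholders I invented, since the real `self.WEEK_*` objects are not shown in the question.

```python
from collections import OrderedDict

# OrderedDict remembers the order in which keys were inserted.
parse_table = OrderedDict()
parse_table["$_ERROR"] = 1          # placeholder values, not the real data
parse_table["$_INFORM"] = 2
parse_table["$_REDIR"] = 3
parse_table["$_SERVER_ERROR"] = 4
parse_table["$_BYTES"] = 5
parse_table["$_HITS"] = 6

first_inserted = next(iter(parse_table))   # first key in insertion order

# Sorting gives alphabetical order, which is not the insertion order.
sorted_keys = sorted(parse_table)
first_sorted = sorted_keys[0]
```

Here `first_inserted` is "$_ERROR" while `first_sorted` is "$_BYTES", which is the point of the P.S. above: sorting gives a stable order, but not the order the keys were written in. On Python 3.7 and later, plain dicts also preserve insertion order.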
I have admittedly not done a huge amount of research on this topic, but I am trying to get something done quickly. I have a dictionary with integers as keys and lists as values. Previously, I was checking for a list being in the dictionary with a simple if statement:
if someList in someDictionary.values():
    someCode() #failure
However, I realized it is incorrect for what I was doing, and I only want to check for the inclusion of the first value of the list in the dictionary's values, e.g.
if someList[0] == someValueInDictionary[0]:
    someCode() #failure
I first tried
if someList[0] in someDictionary.values()[0]:
    someCode() #failure
But that clearly doesn't work, as someDictionary.values() is itself a list. I realize I could iterate through all of the values to check, e.g.
for value in someDictionary.values():
    if someList[0] == value[0]:
        someCode() #failure
actualCode() #success
But this really messes up the flow of my program. I am a new Python programmer, most experienced in Java, and I am trying to get the conciseness and convenience of Python into my bones, so I thought there might be a better solution for what I am testing for. If there is not, I can make the iteration approach work, but if there is, I would greatly appreciate it!
Thanks in advance!
Use the any() function with a generator expression to find if there is any dictionary value that contains your item:
if any(someList[0] in v for v in someDictionary.itervalues()):
    # item found
Use someDictionary.values() on Python 3. The generator expression loops over the dictionary values (without producing a list of all values first) and tests against each value, one by one as the generator expression is iterated over.
any() only tests elements from the generator expression until one is True, and then stops looping, making this relatively efficient.
If you need to have the key of the matching value, use next():
next((k for k, v in someDictionary.iteritems() if someList[0] in v), None)
which returns either the matching key, or None if no match is found.
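A runnable sketch of both calls, using `.values()` and `.items()` so that it works on Python 2.7 and 3 alike (on 2.7 these build lists; `itervalues()` and `iteritems()` avoid that). The sample data is invented.

```python
some_dictionary = {1: ['a', 'b'], 2: ['c', 'd'], 3: ['e']}
some_list = ['c', 'x']

# True as soon as one value list contains some_list[0].
found = any(some_list[0] in v for v in some_dictionary.values())

# First key whose value list contains the item, or None.
matching_key = next(
    (k for k, v in some_dictionary.items() if some_list[0] in v), None)

# Same lookup for an item that is in none of the lists.
missing_key = next(
    (k for k, v in some_dictionary.items() if 'z' in v), None)
```

The second argument to `next()` is the default returned when the generator is exhausted; without it, a failed lookup would raise StopIteration.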
Try this, assuming that you want to check the first element of someList against the first element in all of someDictionary's list values (the code in the question seems to indicate that's what you want):
if someList[0] in (x[0] for x in someDictionary.itervalues()):
    someCode()
But if what you want is to check if the first element of someList is in any of the lists of values, then try this:
import itertools as it
if someList[0] in it.chain(*someDictionary.itervalues()):
    someCode()
So I'm a longtime perl scripter who's been getting used to python since I changed jobs a few months back. Often in perl, if I had a list of values that I needed to check a variable against (simply to see if there is a match in the list), I found it easier to generate hashes to check against, instead of putting the values into an array, like so:
$checklist{'val1'} = undef;
$checklist{'val2'} = undef;
...
if (exists $checklist{$value_to_check}) { ... }
Obviously this wastes some memory because of the need for a useless right-hand value, but IMO it is more efficient and easier to code than looping through an array.
Now in python, the code for this is exactly the same no matter if you're searching a list or a dictionary:
if value_to_check in checklist_which_can_be_list_or_dict:
    <code>
So my real question here is: in perl, the hash method was preferred for speed of processing vs. iterating through an array, but is this true in python? Given the code is the same, I'm wondering if python does list iteration better? Should I still use the dictionary method for larger lists?
Dictionaries are hashes. An in test on a list has to walk through the elements one by one until it finds a match (every element, if there is none), while an in test on a dictionary uses hashing to see if the key exists. Python just doesn't make you write the loop over the list explicitly.
Python also has a set datatype. It's basically a hash/dictionary without the right-hand values. If what you want is to be able to build up a collection of things, then test whether something is already in that collection, and you don't care about the order of the things or whether a thing is in the collection multiple times, then a set is exactly what you want!
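As a rough Python counterpart to the Perl checklist above, a set might be used like this. The names simply mirror the question's; nothing here is specific to any library.

```python
# Build the collection; no dummy right-hand values are needed.
checklist = set(['val1', 'val2'])
checklist.add('val3')
checklist.add('val1')   # already present, so the set is unchanged

value_to_check = 'val2'

# Membership is a hash lookup, just as with dictionary keys.
is_present = value_to_check in checklist
is_absent = 'val9' in checklist
size = len(checklist)
```

If the values arrive as a list, `set(the_list)` converts them once up front, and every later `in` test is then a hash lookup rather than a scan.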