Python equiv. of PHP foreach []? - python

I am fetching rows from the database and wish to populate a multi-dimensional dictionary.
The php version would be roughly this:
foreach($query as $rows):
    $values[$rows->id][] = $rows->name;
endforeach;
return $values;
I can't figure out the following:
What is the Python way to add keys to a dictionary with automatic numbering, e.g. $values[]?
How do I populate a Python dictionary using variables? Using, for example, values[id] = name does not add keys but overwrites existing ones.
I have no idea how to achieve this, as I am a Python beginner (and new to programming in general, actually).

import collections
values = collections.defaultdict(list)
for rows in query:
    values[rows.id].append(rows.name)
return values

Just a general note:
Python's dictionaries are mappings without a guaranteed key order (before Python 3.7; since then they preserve insertion order). While numerical keys would allow "sequential" access, iteration is not guaranteed to follow the natural order of the keys.
It's better not to translate from PHP to Python (or any other language), but rather to write code idiomatic to that particular language. Have a look at the many open-source projects that do the same or similar things; you might even find a useful module (library).

all_rows = []
for row in query:
    all_rows.append(row['col'])
print(all_rows)

You can do:
from collections import defaultdict
values = defaultdict(list)
for row in query:
    values[row.id].append(row.name)
return values
Edit: forgot to return the values.
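As a self-contained sketch of the approach above, assuming each row exposes id and name attributes: the Row namedtuple, the group_names_by_id helper and the sample data are hypothetical stand-ins for the real query result, not part of any database API.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a database row; a real driver may return a different type.
Row = namedtuple("Row", ["id", "name"])


def group_names_by_id(query):
    """Collect names into one list per id, like PHP's $values[$id][] = $name."""
    values = defaultdict(list)
    for row in query:
        # Missing keys start out as an empty list, so append() always works.
        values[row.id].append(row.name)
    # Convert to a plain dict so later lookups of missing keys raise KeyError.
    return dict(values)


sample_query = [Row(1, "alice"), Row(2, "bob"), Row(1, "carol")]
grouped = group_names_by_id(sample_query)
print(grouped)
```

The list under each id plays the role of PHP's automatically numbered [] array: append() adds the next element without you tracking an index.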

Related

Python Dictionary with Variable Array keys

self.PARSE_TABLE={"$_ERROR":self.WEEK_ERRORS,"$_INFORM":self.WEEK_INFORM,"$_REDIR":self.WEEK_REDIRECTS,"$_SERVER_ERROR":self.WEEK_SERVER_ERROR,"$_BYTES":self.WEEK_BYTES,"$_HITS":self.WEEK_HITS}
for j in self.PARSE_TABLE:
    print(j)
    break
When I run this on my Python, the first element I get is $_REDIR. Can someone explain why?
Dictionaries don't maintain order. The order you get from iterating over them may not be the order in which you inserted the elements. This is the price you pay for near-instant lookup of values by key. In short, the behavior you are seeing is correct and expected, and may even vary from run to run of the Python interpreter.
This is normal behaviour. Dictionaries and sets are implemented with hash tables, so keys are stored by hash code rather than in insertion order. If you want sorted keys, use sorted(self.PARSE_TABLE.keys()). You can also use OrderedDict from the collections module.
By default a dictionary stores its keys in whatever order is convenient for it rather than the order we gave.
If the order of the keys should be maintained, you can use OrderedDict, which was added to the standard library in Python 2.7 and 3.1.
P.S. I don't think sorting the keys would help in preserving the insertion order.
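The two options mentioned in these answers can be sketched as follows. The integer values are placeholders for the self.WEEK_* attributes in the question, which are not shown there.

```python
from collections import OrderedDict

# Placeholder values; the question's dict maps these keys to self.WEEK_* attributes.
table = {"$_ERROR": 1, "$_INFORM": 2, "$_REDIR": 3}

# Option 1: iterate in sorted key order, whatever order the dict itself uses.
sorted_keys = sorted(table)
for key in sorted_keys:
    print(key, table[key])

# Option 2: build an OrderedDict from (key, value) pairs so that insertion order
# is kept even on Python versions where plain dicts are unordered.
ordered = OrderedDict([("$_ERROR", 1), ("$_INFORM", 2), ("$_REDIR", 3)])
first_key = next(iter(ordered))
print(first_key)  # $_ERROR
```

Note that sorted order and insertion order only coincide here because the sample keys happen to be inserted alphabetically.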

Look up python dict value by expression

I have a dict that has unix epoch timestamps for keys, like so:
lookup_dict = {
    1357899: {},  # some dict of data
    1357910: {},  # some other dict of data
}
Except, you know, millions and millions and millions of entries. I'd like to subset this dict, over and over again. Ideally, I'd love to be able to write something like I can in R, like:
lookup_value = 1357900
dict_subset = lookup_dict[key >= lookup_value]
# dict_subset now contains {1357910: {}}
But I confess, I can't find any actual proof that this is something Python can do without having, one way or the other, to iterate over every row. If I understand Python correctly (and I might not), key lookup of the form key in dict uses binary search, and is thus very fast; is there any way to do a binary search on dict keys?
To do this without iterating, you're going to need the keys in sorted order. Then you just need to do a binary search for the first one >= lookup_value, instead of checking each one for >= lookup_value.
If you're willing to use a third-party library, there are plenty out there. The first two that spring to mind are bintrees (which uses a red-black tree, like C++, Java, etc.) and blist (which uses a B+Tree). For example, with bintrees, it's as simple as this:
dict_subset = lookup_dict[lookup_value:]
And this will be as efficient as you'd hope—basically, it adds a single O(log N) search on top of whatever the cost of using that subset. (Of course usually what you want to do with that subset is iterate the whole thing, which ends up being O(N) anyway… but maybe you're doing something different, or maybe the subset is only 10 keys out of 1000000.)
Of course there is a tradeoff. Random access to a tree-based mapping is O(log N) instead of "usually O(1)". Also, your keys obviously need to be fully ordered, instead of hashable (and that's a lot harder to detect automatically and raise nice error messages on).
If you want to build this yourself, you can. You don't even necessarily need a tree; just a sorted list of keys alongside a dict. You can maintain the list with the bisect module in the stdlib, as JonClements suggested. You may want to wrap up bisect to make a sorted list object—or, better, get one of the recipes on ActiveState or PyPI to do it for you. You can then wrap the sorted list and the dict together into a single object, so you don't accidentally update one without updating the other. And then you can extend the interface to be as nice as bintrees, if you want.
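A minimal sketch of that do-it-yourself approach, assuming insert-only usage: the SortedKeyDict class and its items_from method are hypothetical names, and deletion is left out for brevity. Only bisect.insort and bisect.bisect_left come from the standard library.

```python
import bisect


class SortedKeyDict:
    """Hypothetical wrapper: a dict plus a sorted list of its keys."""

    def __init__(self):
        self._data = {}
        self._keys = []  # always kept in sorted order

    def __setitem__(self, key, value):
        if key not in self._data:
            # Binary search for the position, then an O(N) list insert.
            bisect.insort(self._keys, key)
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def items_from(self, lower):
        """Return {key: value} for every key >= lower."""
        start = bisect.bisect_left(self._keys, lower)  # O(log N) binary search
        return {k: self._data[k] for k in self._keys[start:]}


lookup = SortedKeyDict()
lookup[1357910] = {"b": 2}
lookup[1357899] = {"a": 1}
dict_subset = lookup.items_from(1357900)
print(dict_subset)  # {1357910: {'b': 2}}
```

Finding the start of the subset is a binary search, but building the result still costs time proportional to the size of the subset, as the answer points out.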
The following code will work:
some_time_to_filter_for = 1357900  # example Unix time; substitute your own
# Create a new sub-dictionary
sub_dict = {key: val for key, val in lookup_dict.items()
            if key >= some_time_to_filter_for}
Basically, we iterate through all the keys in your dictionary and, given a time to filter on, take all the keys that are greater than or equal to that value and place them into a new dictionary.

Can't update a dictionary in Python

I'm trying to add some records into a dictionary.
Initially I was doing it this way
licenses = [dict(licenseid=row[0], client=row[1], macaddress=row[2], void=row[18]) for row in db]
But I've since realized I need to do some processing to filter records from db, so I tried changing the code to:
for rec in db:
    if rec['deleted'] == False:
        licenses.update(dict(licenseid=row[0], client=row[1], macaddress=row[2], void=row[18]))
That code runs without exceptions, but I only end up with the last db record in licenses, which is confusing me.
I think licenses is a list:
licenses = []
...
and you should append to it new dictionaries:
licenses.append(dict(...))
If I understand correctly, you want to add multiple records to a single dictionary, right? Instead of making a list of dictionaries, why not make a dictionary of lists instead?
Start by building a list of the keys you'll need (so that you always access them in the same order).
keys = ["licenseid", "client", "macaddress", "void"]
Construct a dictionary with an empty list for each key:
licenses = dict((k, []) for k in keys)
Then, for each row, append each field to the matching list:
for k, item in zip(keys, (row[0], row[1], row[2], row[18])):
    licenses[k].append(item)
Of course, it might be easier to build a list of all your records first, and then construct a dictionary at the very end.
Quoth the dict.update() documentation:
update([other]) Update the dictionary with the key/value pairs from
other, overwriting existing keys. Return None.
Which explains why the last update "wins". licences cannot be a list as there is no update method for lists.
If the code in your post is your genuine code, then consider replacing row with rec in the last line (the one with the update), because you may be updating your dictionary with the same values every time!
Edit: There's obviously something very wrong in this code. From the other answer I see that I overlooked the fact that licenses was declared as a list, so the only explanations for not getting an exception are that the snippets you show are not the genuine ones, or that rec['deleted'] is True for all your records (so that the update method is never called).
After responses, I've amended my code:
licenses = []
for row in db:
    if row.deleted == False:
        licenses.append(dict(licenseid=row[0], client=row[1], macaddress=row[2], void=row[18]))
Which now works perfectly. Thanks for spotting my stupidity! ;)
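Since the original version was a list comprehension, the filter can also be written as a condition inside the comprehension. This is only a sketch: the Row namedtuple and the sample records are hypothetical, and the real rows have more columns (the question reads void from index 18).

```python
from collections import namedtuple

# Hypothetical row type with only the fields used here.
Row = namedtuple("Row", ["licenseid", "client", "macaddress", "void", "deleted"])

db = [
    Row(1, "acme", "00:11:22:33:44:55", False, False),
    Row(2, "globex", "66:77:88:99:aa:bb", False, True),
]

licenses = [
    dict(licenseid=row.licenseid, client=row.client,
         macaddress=row.macaddress, void=row.void)
    for row in db
    if not row.deleted  # skip deleted records
]
print(len(licenses))  # 1
```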

A more idiomatic way to fill a dictionary in Python

Currently, I'm trying to fill a dictionary in Python but I think what I'm doing is a little bit redundant. Is there a more pythonic way to do the following:
if not pattern_file_map.get(which_match):
    pattern_file_map[which_match] = [line]
else:
    pattern_file_map[which_match].append(line)
where pattern_file_map is a dictionary.
I know that there is a certain idiom to use when checking whether a key is in a dictionary, like in this question, but I just want to fill this dictionary with lists.
You could use
pattern_file_map.setdefault(which_match, []).append(line)
instead.
Other people might suggest using a collections.defaultdict(list) instead, which is another option, but be warned that this might hide errors, since it will silently create all keys you access.
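The warning about hidden errors can be illustrated with a small sketch (the variable names and the "typo" key are made up for the example):

```python
from collections import defaultdict

plain = {}
plain.setdefault("a", []).append(1)

# A plain dict still raises KeyError when reading a key that was never set.
try:
    plain["typo"]
    plain_raised = False
except KeyError:
    plain_raised = True

auto = defaultdict(list)
auto["a"].append(1)
auto["typo"]  # merely reading the key creates an empty list for it

print(plain_raised)   # True
print(sorted(auto))   # ['a', 'typo']
```

With setdefault the key is only created at the call site where you ask for it, so a mistyped key elsewhere still fails loudly.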
You could try using a collections.defaultdict.
Since you're adding to a maybe existing value:
pattern_file_map.setdefault(which_match, []).append(line)
If using dict.get():
li = pattern_file_map.get(which_match, [])
li.append(line)
pattern_file_map[which_match] = li
Of course, both cases are restricted to the case where your dict's values are lists.

How do I know what data type to use in Python?

I'm working through some tutorials on Python and am at a position where I am trying to decide what data type/structure to use in a certain situation.
I'm not clear on the differences between arrays, lists, dictionaries and tuples.
How do you decide which one is appropriate - my current understanding doesn't let me distinguish between them at all - they seem to be the same thing.
What are the benefits/typical use cases for each one?
How do you decide which data type to use? Easy:
You look at which are available and choose the one that does what you want. And if there isn't one, you make one.
In this case a dict is a pretty obvious solution.
Tuples first. These are list-like things that cannot be modified. Because the contents of a tuple cannot change, you can use a tuple as a key in a dictionary. That's the most useful place for them in my opinion. For instance if you have a list like item = ["Ford pickup", 1993, 9995] and you want to make a little in-memory database with the prices you might try something like:
db = {}
ikey = (item[0], item[1])  # tuple(a, b) raises TypeError; use a tuple literal
idata = item[2]
db[ikey] = idata
Lists are like arrays or vectors in other programming languages and are usually used for the same kinds of things in Python. However, they are more flexible in that you can put different types of things into the same list. Generally, they are the most flexible data structure, since you can put a whole list into a single element of another list, but for real data crunching they may not be efficient enough.
a = [1,"fred",7.3]
b = []
b.append(1)
b[0] = "fred"
b.append(a) # now the second element of b is the whole list a
Dictionaries are often used a lot like lists, but now you can use any immutable thing as the index to the dictionary. However, unlike lists, dictionaries don't have a natural order and can't be sorted in place. Of course you can create your own class that incorporates a sorted list and a dictionary in order to make a dict behave like an Ordered Dictionary. There are examples on the Python Cookbook site.
c = {}
d = ("ford pickup",1993)
c[d] = 9995
Arrays are getting closer to the bit level for when you are doing heavy duty data crunching and you don't want the frills of lists or dictionaries. They are not often used outside of scientific applications. Leave these until you know for sure that you need them.
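As a rough illustration, assuming "arrays" here means the standard library's array module (the answer does not say which kind it has in mind), an array stores values of a single C type and rejects anything else:

```python
from array import array

# 'd' is the type code for C doubles; every element must be a real number.
prices = array("d", [9995.0, 12500.0])
prices.append(7300.5)

try:
    prices.append("fred")  # unlike a list, mixed types are rejected
    rejected = False
except TypeError:
    rejected = True

print(list(prices))  # [9995.0, 12500.0, 7300.5]
print(rejected)      # True
```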
Lists and Dicts are the real workhorses of Python data storage.
Best type for counting elements like this is usually defaultdict
from collections import defaultdict
s = 'asdhbaklfbdkabhvsdybvailybvdaklybdfklabhdvhba'
d = defaultdict(int)
for c in s:
    d[c] += 1
print(d['a'])  # prints 7
Do you really require speed/efficiency? Then go with a pure and simple dict.
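For the counting loop above, the standard library also offers collections.Counter, which does the same tally in one call. A short sketch using the same string:

```python
from collections import Counter

s = 'asdhbaklfbdkabhvsdybvailybvdaklybdfklabhdvhba'
counts = Counter(s)

print(counts['a'])  # 7
print(counts['z'])  # 0
```

Unlike defaultdict(int), looking up a missing key in a Counter returns 0 without adding that key.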
Personal:
I mostly work with lists and dictionaries.
It seems that this satisfies most cases.
Sometimes:
Tuples can be helpful if you want to pair or match elements. Besides that, I don't really use them.
However:
I write high-level scripts that don't need to drill down into the core "efficiency" where every byte and every memory/nanosecond matters. I don't believe most people need to drill this deep.
