Can't update a dictionary in Python - python

I'm trying to add some records into a dictionary.
Initially I was doing it this way
licenses = [dict(licenseid=row[0], client=row[1], macaddress=row[2], void=row[18]) for row in db]
But I've since realized I need to do some processing to filter records from db, so I tried changing the code to:
for rec in db:
    if rec['deleted'] == False:
        licenses.update(dict(licenseid=row[0], client=row[1], macaddress=row[2], void=row[18]))
That code runs without exceptions, but I only end up with the last db record in licenses, which is confusing me.

I think licenses is a list:
licenses = []
...
and you should append to it new dictionaries:
licenses.append(dict(...))

If I understand correctly, you want to add multiple records to a single dictionary, right? Instead of making a list of dictionaries, why not make a dictionary of lists?
Start by building a list of the keys you'll need (so that you always access them in the same order).
keys = ["licenseid", "client", "macaddress", "void"]
Construct a dictionary that maps each key to an empty list:
licenses = dict((k, []) for k in keys)
Then, for each row, append its values to the matching lists:
for row in db:
    for k, item in zip(keys, (row[0], row[1], row[2], row[18])):
        licenses[k].append(item)
Of course, it might be easier to build a list of all your records first, and then construct a dictionary at the very end.

Quoth the dict.update() documentation:
update([other]) Update the dictionary with the key/value pairs from
other, overwriting existing keys. Return None.
Which explains why the last update "wins": every record uses the same keys, so each call overwrites the values from the previous one. licenses cannot be a list, as lists have no update method.
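A minimal sketch of that overwriting behaviour. The sample rows and the name rows are made up for illustration and are not from the question:

```python
# Each update() call merges into the same dict, so repeated keys are overwritten.
licenses = {}
rows = [(1, "acme"), (2, "globex")]  # hypothetical sample rows

for row in rows:
    licenses.update(dict(licenseid=row[0], client=row[1]))

print(licenses)  # {'licenseid': 2, 'client': 'globex'}
```

Only the values from the last row survive, which matches what the question describes.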

If the code in your post is your actual code, consider replacing row with rec in the last line (the one with the update); otherwise you may be updating your dictionary with the same values every time!
Edit: There's obviously something very wrong in this code. From the other answer I see that I overlooked the fact that licenses was declared as a list, so the only explanations for not getting an exception are that the snippets you show are not the actual code, or that rec['deleted'] is True for all your records (so the update method is never called).

After responses, I've amended my code:
licenses = []
for row in db:
    if row.deleted == False:
        licenses.append(dict(licenseid=row[0], client=row[1], macaddress=row[2], void=row[18]))
Which now works perfectly. Thanks for spotting my stupidity! ;)
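Since the original version was a list comprehension, the same filter can probably be kept in that form too. This is only a sketch: Row is a hypothetical namedtuple standing in for the real database rows (which have more columns), and the sample data is invented.

```python
from collections import namedtuple

# Hypothetical stand-in for a database row; the real rows have more columns.
Row = namedtuple("Row", ["licenseid", "client", "macaddress", "void", "deleted"])

db = [
    Row(1, "acme", "00:11:22:33:44:55", False, False),
    Row(2, "globex", "66:77:88:99:aa:bb", False, True),
    Row(3, "initech", "cc:dd:ee:ff:00:11", True, False),
]

# Same filter as the loop above, written as a list comprehension.
licenses = [
    dict(licenseid=row.licenseid, client=row.client,
         macaddress=row.macaddress, void=row.void)
    for row in db
    if not row.deleted
]

print([lic["licenseid"] for lic in licenses])  # [1, 3]
```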

Related

Handling data pulled from MySQL database in Python

I am trying to figure out a good way to handle blacklists for words via a MySQL database. I have hit a roadblock when it comes to handling the data returned from the database.
cursor.execute('SELECT word FROM blacklist')
blacklist1 = []
for word in cursor.fetchall():
    if word in blacklist1:
        return
    else:
        blacklist1.append(word)
The above code is what I am using to pull the info which I know works. However, I need some help with converting this:
[('word1',), ('word2',), ('word3',), ('word4',), ('word5',)]
into this:
['word1', 'word2', 'word3', 'word4', 'word5']
My biggest issue is that I need it to scale, so that it can check each word in the blacklist whether there are no words or several thousand. I know a for loop would work for checking them against the message, but I won't be able to check the words until it is a normal list. Any help would be appreciated.
In each iteration of for word in cursor.fetchall(), the variable word is a tuple, or a collection of values. This is documented here.
These correspond to each column returned, i.e. if you had a second column in your select statement ('SELECT word, replacement FROM blacklist') you would get tuples of two elements.
Use a set, and add the one and only element of the tuple, instead of the tuple itself:
blacklist1 = set()
for word_tuple in cursor.fetchall():
    blacklist1.add(word_tuple[0])
Looking at the code more closely, if word in blacklist1: return may be a logical error - as soon as you see a duplicate, you'll stop reading rows from the database. You were likely looking to just skip that duplicate - you don't actually need that logic anymore because sets automatically remove duplicates.
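To see the shape of the rows without a MySQL server, here is a small sketch that uses the standard-library sqlite3 module as a stand-in (an assumption for illustration; the question uses MySQL, but both follow the same DB-API pattern of returning one tuple per row). The table contents are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE blacklist (word TEXT)")
cursor.executemany(
    "INSERT INTO blacklist (word) VALUES (?)",
    [("word1",), ("word2",), ("word1",)],
)

cursor.execute("SELECT word FROM blacklist")
rows = cursor.fetchall()  # a list of 1-tuples, one per row

# Take the first (and only) column of each row; the set drops the duplicate.
blacklist1 = {row[0] for row in rows}

print(sorted(blacklist1))  # ['word1', 'word2']
conn.close()
```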
Your list currently contains one element tuples. If you want to extract the strings you could try this:
blacklist1 = []
for word_tuple in cursor.fetchall():
    if word_tuple[0] in blacklist1:
        return
    else:
        blacklist1.append(word_tuple[0])
For your use case you might also benefit from having blacklist1 be a set, that way you can check for membership in O(1) time:
blacklist1 = set()
for word_tuple in cursor.fetchall():
    if word_tuple[0] in blacklist1:
        return
    else:
        blacklist1.add(word_tuple[0])
First, your actual problem is that the cursor is a wrapper of an iterator over rows returned from MySQL, so it can be operated on similarly to a list of tuples. That being said, my advice would be to split your "business" logic from your data access logic. This might seem trivial but it will make debugging much easier. The overall approach will look like this:
def get_from_database():
    cursor.execute('SELECT word FROM blacklist')
    return [row[0] for row in cursor.fetchall()]

def get_blacklist():
    words = get_from_database()
    return list(set(words))
In this approach, get_from_database retrieves all the words from MySQL and returns them in the format your program needs. get_blacklist encapsulates this logic and also makes the returned list unique. So now, if there's a bug, you can verify each independently.

Unexpected output when looping through subdictionaries in Python

I have a nested dictionary in which tickers identify certain assets, and for each of these assets I would like to store characteristics in a subdictionary, creating them in a simple loop like the one below:
ticker = ["a","bb","ccc"]
ticker_dict = dict.fromkeys(ticker, {"Var":[]})
for key in ticker_dict:
    ticker_dict[key]["Var"] = len(key)
From the above code I would expect that, for each ticker/asset, it saves the "Var" variable as the length of its name, meaning the following:
{"a":{"Var":1},
"bb":{"Var":2},
"ccc":{"Var":3}}
But, in my view rather weirdly, the result is this
{"a":{"Var":3},
"bb":{"Var":3},
"ccc":{"Var":3}}
To provide further context, the real process is that I have four assets, for which I would like to store dataframes in their subdictionaries, as this makes it easy for me to access them later in loops etc. Somehow, though, the data from the last asset is simply copied over all assets, even though I explicitly loop through different keys.
What's going on?
PS: I'm not sure how to explain the problem without the sample code, so I might have missed a similar entry on this site. If so, any hints to it would be appreciated as well of course.
In your code, {"Var":[]} is only evaluated once, causing there to be only 1 inner dictionary shared by all keys. Instead, you can use a dictionary comprehension:
ticker_dict = {key:{"Var":[]} for key in ticker}
and it will work as expected.
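A short sketch that shows the difference, reusing the names from the question (shared is a name introduced here for illustration only):

```python
ticker = ["a", "bb", "ccc"]

# fromkeys stores the *same* dict object under every key.
shared = dict.fromkeys(ticker, {"Var": []})
print(shared["a"] is shared["bb"])  # True

# The comprehension evaluates {"Var": []} once per key, giving separate dicts.
ticker_dict = {key: {"Var": []} for key in ticker}
for key in ticker_dict:
    ticker_dict[key]["Var"] = len(key)

print(ticker_dict)  # {'a': {'Var': 1}, 'bb': {'Var': 2}, 'ccc': {'Var': 3}}
```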

How to remove extraneous square brackets from a nested list inside a dictionary?

I have been working on a problem which involves sorting a large data set of shop orders, extracting shop and user information based on some parameters. Mostly this has involved creating dictionaries by iterating through a data set with a for loop and appending a new list, like this:
sshop = defaultdict(list)
for i in range(df_subset.shape[0]):
    orderid, sid, userid, time = df.iloc[i]
    sshop[sid].append(userid)
sData = dict(sshop)
#CREATES DICTIONARY OF UNIQUE SHOPS WITH USER DATA AS THE VALUE
shops = df_subset['shopid'].unique()
shops_dict = defaultdict(list)
for shop in shops:
    shops_dict[shop].append(sData[shop])
shops_dict = dict(shops_dict)
shops_dict looks like this at this point:
{10009: [[196962305]], 10051: [[2854032, 48600461]], 10061: [[168750452, 194819216, 130633421,
62464559]]}
To get to the final stages I have had to repeat lines of code similar to these a couple of times. What seems to happen every time I do this is that the values in the dictionaries gain a set of square brackets.
This is one of my final dictionaries:
{10159: [[[1577562540.0, 1577736960.0, 1577737080.0]], [[1577651880.0, 1577652000.0, 1577652960.0]]],
10208: [[[1577651040.0, 1577651580.0, 1577797080.0]]]}
I don't entirely understand why this is happening, aside from believing it has something to do with using defaultdict(list) and then converting that into a dictionary with dict().
These extra brackets, aside from being a little confusing, appear to be causing some problems when accessing the data with certain functions. I understand that there need to be two sets of square brackets in total: one that encloses all the values for a dictionary key, and another inside that for each specific set of values within that key.
My first question would be, is it possible to remove a specific set of square brackets from a dictionary like that?
My second question would be, if not - is there a better way of creating new dictionaries out the data from an older one without using defaultdict(list) and having all those extra square brackets?
Any help much appreciated!
Thanks :)!
In the second loop, use extend instead of append:
for shop in shops:
    shops_dict[shop].extend(sData[shop])
shops_dict = dict(shops_dict)
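The extra brackets come from append adding the whole list as a single element, whereas extend adds its items one by one. A small sketch using two of the sample values from the question (appended and extended are names introduced here for illustration):

```python
from collections import defaultdict

# Sample values taken from the question's output.
sData = {10009: [196962305], 10051: [2854032, 48600461]}

appended = defaultdict(list)
extended = defaultdict(list)
for shop, users in sData.items():
    appended[shop].append(users)  # adds the whole list as one element
    extended[shop].extend(users)  # adds each user id individually

print(dict(appended))  # {10009: [[196962305]], 10051: [[2854032, 48600461]]}
print(dict(extended))  # {10009: [196962305], 10051: [2854032, 48600461]}
```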

Python list.remove items present in second list

I've searched around and most of the errors I see are when people are trying to iterate over a list and modify it at the same time. In my case, I am trying to take one list, and remove items from that list that are present in a second list.
import pymysql
schemaOnly = ["table1", "table2", "table6", "table9"]
db = pymysql.connect(my connection stuff)
tables = db.cursor()
tables.execute("SHOW TABLES")
tablesTuple = tables.fetchall()
tablesList = []
# I do this because there is no way to remove items from a tuple
# which is what I get back from tables.fetchall
for item in tablesTuple:
    tablesList.append(item)
for schemaTable in schemaOnly:
    tablesList.remove(schemaTable)
When I put various print statements in the code, everything looks proper, as if it is going to work. But when it gets to the actual tablesList.remove(schemaTable) I get the dreaded ValueError: list.remove(x): x not in list.
If there is a better way to do this I am open to ideas. It just seemed logical to me to iterate through the list and remove items.
Thanks in advance!
** Edit **
Everyone in the comments and the first answer is correct. The reason this is failing is because the conversion from a Tuple to a list is creating a very badly formatted list. Hence there is nothing that matches when trying to remove items in the next loop. The solution to this issue was to take the first item from each Tuple and put those into a list like so: tablesList = [x[0] for x in tablesTuple] . Once I did this the second loop worked and the table names were correctly removed.
Thanks for pointing me in the right direction!
I assume that fetchall returns tuples, one for each database row matched.
Now the problem is that the elements in tablesList are tuples, whereas schemaTable contains strings. Python does not consider these to be equal.
Thus when you attempt to call remove on tablesList with a string from schemaTable, Python cannot find any such value.
You need to inspect the values in tablesList and find a way to convert them to strings. I suspect it would be by simply taking the first element out of each tuple, but I do not have a MySQL database at hand so I cannot test that.
Regarding your question, if there is a better way to do this: Yes.
Instead of adding items to the list, and then removing them, you can append only the items that you want. For example:
for item in tablesTuple:
    if item[0] not in schemaOnly:
        tablesList.append(item[0])
Also, schemaOnly can be written as a set, to improve search complexity from O(n) to O(1):
schemaOnly = {"table1", "table2", "table6", "table9"}
This will only be meaningful with big lists, but in my experience it's useful semantically.
And finally, you can write the whole thing in one list comprehension:
tablesList = [item[0] for item in tablesTuple if item[0] not in schemaOnly]
And if you don't need to keep repetitions (or if there aren't any in the first place), you can also do this:
tablesSet = {item[0] for item in tablesTuple} - schemaOnly
Which also has the best big-O complexity of all these variations.
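A self-contained sketch of the tuple-versus-string mismatch and the filtering step. The contents of tablesTuple are a hypothetical fetchall() result, not real output:

```python
# Hypothetical result of tables.fetchall(): one 1-tuple per table.
tablesTuple = (("table1",), ("table2",), ("table3",), ("table6",))
schemaOnly = {"table1", "table2", "table6", "table9"}

print(("table1",) == "table1")  # False: a tuple never equals a string

# Unpack the single column first, then filter against the set.
tablesList = [name for (name,) in tablesTuple if name not in schemaOnly]
print(tablesList)  # ['table3']
```

Note that "table9" is in schemaOnly but not in the query result; filtering this way simply ignores it, whereas list.remove would raise ValueError.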

Properly looping over a dictionary / using dictionaries as databases

This looks like a CS 101 style homework but it actually isn't. I am trying to learn more python so I took up this personal project to write a small app that keeps my grade-book for me.
I have a class semester which holds a dictionary of section objects.
A section is a class that I am teaching in which ever semester object I am manipulating (I didn't want to call them classes for obvious reasons). I originally had sections as a list not a dictionary, and when I wanted to add a roster of students to that semester I could do this.
for sec in working_semester.sections:
    sec.addRosterFromFile(filename)
Now I have changed the sections member of semester to a dictionary so I can look up a specific one to work with, but I am having trouble when I want to loop over all of them to do something like when I first set up a new semester I want to add all the sections, then loop over them and add students to each. If I try the same code to loop over the dictionary it gives me the key, but I was hoping to get the value.
I have also tried to iterate over the dictionary like this, which I got from an older Stack Overflow question:
for sec in iter(sorted(working_semester.sections.iteritems())):
    sec.addRosterFromFile(filename)
But iter(sorted(...)) yields a (key, value) tuple, not the item, so the line inside the loop gives me an error saying that a tuple has no attribute addRosterFromFile.
Currently I have this fix in place where I loop through the keys and then use the key to access the value like this:
for key in working_semester.sections:
    working_semester.sections[key].addRosterFromFile(filename)
There has to be a way to loop over dictionary values, or is this not desirable? My understanding of dictionaries is that they are like lists but rather than grabbing an element by its position it has a specific key, which makes it easier to grab the one you want no matter what order they are in. Am I missing how dictionaries should be used?
Using iteritems is a good approach, you just need to unpack the key and value:
for key, value in iter(sorted(working_semester.sections.iteritems())):
    value.addRosterFromFile(filename)
If you really only need the value, you could use the aptly named itervalues:
for sec in sorted(working_semester.sections.itervalues()):
    sec.addRosterFromFile(filename)
(It's not clear from your example whether you really need sorted there. If you don't need to iterate over the sections in sorted order just leave sorted out.)
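The snippets above are Python 2. In Python 3, iteritems() and itervalues() no longer exist, and items() and values() play the same role. A rough sketch of the Python 3 form follows; the Section class here is a hypothetical stand-in for the asker's class, and its addRosterFromFile only records the filename rather than reading a file:

```python
class Section:
    """Hypothetical stand-in for the asker's section class."""

    def __init__(self, name):
        self.name = name
        self.roster_file = None

    def addRosterFromFile(self, filename):
        # The real method would read students from the file;
        # this sketch only records which file was requested.
        self.roster_file = filename


sections = {"B": Section("B"), "A": Section("A")}

# Python 3: values() iterates over the values directly.
for sec in sections.values():
    sec.addRosterFromFile("roster.txt")

# Sorted by key, unpacking each (key, value) pair.
visited = [key for key, sec in sorted(sections.items())]
print(visited)  # ['A', 'B']
```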
