Iterate through Python Dictionary without knowing the specific keys - python

I can't seem to figure out how to write this piece of Python Code.
I need to read something in from a file, which is this:
Jake FJ49FJH
Bob FJ49GKH
I've imported the file into a dictionary. I then check if the number plate (following the names) contains the sequence of two letters, two numbers, then three letters. Here is that part of the code:
d = {}  # Blank Dictionary
with open("plates.txt") as f:
    d = dict(x.rstrip().split(None, 1) for x in f)  # Put file into dictionary
names = d.keys()
print(names)
#carReg = ??
a,b,c = carReg[:2],carReg[2:4],carReg[4:]
if all((a.isalpha(),b.isdigit(),c.isalpha(),len(c)== 3)):
    print("Valid Reg")
else:
    print("Invalid Reg")
    # Now get the name of the person with the invalid carReg
As you can see, I don't know what to put for carReg. I need to loop through the dictionary, like in a list when you can use an integer to get part of the list. If the program returns Invalid Reg, then I need to get the key that the invalid reg belongs to - which will be a name.
Any help appreciated.

Iterate over dict.items(); it'll give you the (key, value) pairs from the dictionary:
for name, car_registration in d.items():
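To make the loop above concrete, here is a minimal sketch of how it could be combined with the check from the question. The sample dictionary (including the extra "Ann" entry) and the helper name is_valid_reg are made up for illustration; in the real program d would come from plates.txt.

```python
# Hypothetical sample data standing in for the dictionary built from plates.txt.
d = {"Jake": "FJ49FJH", "Bob": "FJ49GKH", "Ann": "F149GKH"}


def is_valid_reg(car_reg):
    """Return True for two letters, two digits, then exactly three letters."""
    a, b, c = car_reg[:2], car_reg[2:4], car_reg[4:]
    return all((len(a) == 2, a.isalpha(),
                len(b) == 2, b.isdigit(),
                len(c) == 3, c.isalpha()))


invalid_names = []  # names (keys) whose registration failed the check
for name, car_registration in d.items():
    if is_valid_reg(car_registration):
        print(name, "Valid Reg")
    else:
        print(name, "Invalid Reg")
        invalid_names.append(name)
```

Because the loop variable name is the key, there is no need to look the key up again when a registration turns out to be invalid.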

Related

How to extract all occurrences of a JSON object that share a duplicate key:value pair?

I am writing a python script that reads a large JSON file containing data from an API, and iterates through all the objects. I want to extract all objects that have a specific matching/duplicate "key:value", and save it to a separate JSON file.
Currently, I have it almost doing this, however the one flaw in my code that I cannot fix is that it skips the first occurrence of the duplicate object, and does not add it to my dupObjects list. I have an OrderedDict keeping track of unique objects, and a regular list for duplicate objects. I know this means that when I add the second occurrence, I must add the first (unique) object, but how would I create a conditional statement that only does this once per unique object?
This is my code at the moment:
from collections import OrderedDict
import json
with open('input.json') as data:
    data = json.load(data)
uniqueObjects = OrderedDict()
dupObjects = list()
for d in data:
    value = d["key"]
    if value in uniqueObjects:
        # dupObjects.append(uniqueObjects[value])
        dupObjects.append(d)
    if value not in uniqueObjects:
        uniqueObjects[value] = d
with open('duplicates.json', 'w') as g:
    json.dump(dupObjects, g, indent=4)
The commented line is where I tried to add the object from the OrderedDict to my list, but that adds it as many times as there are duplicates. I only want it added once.
Edit:
There are several unique objects that have duplicates. I'm looking for some conditional statement that can add the first occurrence of an object that has duplicates, once per unique object.
You could group by key.
Using itertools:
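import itertools
def by_key(element):
    return element["key"]
# groupby only groups consecutive elements, so sort by the same key first
grouped_by_key = itertools.groupby(sorted(data, key=by_key), key=by_key)
Then it is just a matter of finding the groups that have more than one element.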
For details check: https://docs.python.org/3/howto/functional.html#grouping-elements
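As a rough sketch of that last step, assuming each object has a "key" field as in the question (the sample records and the name dup_objects are invented here):

```python
import itertools

# Hypothetical records standing in for the parsed JSON file.
data = [
    {"key": "a", "n": 1},
    {"key": "b", "n": 2},
    {"key": "a", "n": 3},
    {"key": "c", "n": 4},
    {"key": "a", "n": 5},
    {"key": "b", "n": 6},
]


def by_key(element):
    return element["key"]


dup_objects = []
# sorted() is stable, so objects with the same key keep their file order.
for _, group in itertools.groupby(sorted(data, key=by_key), key=by_key):
    members = list(group)
    if len(members) > 1:  # keep only keys that occur more than once
        dup_objects.extend(members)
```

Note that this returns the duplicates grouped by key rather than in the original file order, which may or may not matter for the output file.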
In this line you forgot .keys(), so you skip the values you need
if value in uniqueObjects.keys():
And this line
if value not in uniqueObjects.keys():
Edit #1
My mistake :)
You need to add the first occurrence from uniqueObjects inside the first if:
if value in uniqueObjects:
    if uniqueObjects[value] != -1:
        dupObjects.append(uniqueObjects[value])
        uniqueObjects[value] = -1
    dupObjects.append(d)
Edit #2
Try this option; it writes only the first occurrence to the duplicates list:
if value in uniqueObjects:
    if uniqueObjects[value] != -1:
        dupObjects.append(uniqueObjects[value])
        uniqueObjects[value] = -1
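For reference, here is one way the whole loop could look if the "already copied" state is tracked in a separate set instead of overwriting the stored object with -1. This is only a sketch: the sample records and the names already_copied, unique_objects and dup_objects are assumptions, not part of the original code.

```python
from collections import OrderedDict

# Hypothetical records standing in for the parsed JSON file.
data = [
    {"key": "a", "n": 1},
    {"key": "b", "n": 2},
    {"key": "a", "n": 3},
    {"key": "c", "n": 4},
    {"key": "a", "n": 5},
    {"key": "b", "n": 6},
]

unique_objects = OrderedDict()  # first occurrence for each key value
already_copied = set()          # key values whose first occurrence was copied
dup_objects = []

for d in data:
    value = d["key"]
    if value in unique_objects:
        if value not in already_copied:
            # Copy the first occurrence exactly once per key value.
            dup_objects.append(unique_objects[value])
            already_copied.add(value)
        dup_objects.append(d)
    else:
        unique_objects[value] = d
```

Keeping the flag in its own set means unique_objects still holds the real first occurrences after the loop, in case they are needed later.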

How to fix the errors in my code for making a dictionary from a file

This is what I am supposed to do in my assignment:
This function is used to create a bank dictionary. The given argument
is the filename to load. Every line in the file will look like key:
value. Key is a user's name and value is an amount to update the user's
bank account with. The value should be a number, however, it is
possible that there is no value or that the value is an invalid
number.
What you will do:
Try to make a dictionary from the contents of the file.
If the key doesn't exist, create a new key:value pair.
If the key does exist, increment its value with the amount.
You should also handle cases when the value is invalid. If so, ignore that line and don't update the dictionary.
Finally, return the dictionary.
Note: All of the users in the bank file are in the user account file.
Example of the contents of 'filename' file:
Brandon: 115.5
James: 128.87
Sarah: 827.43
Patrick:'18.9
This is my code:
bank = {}
with open(filename) as f:
    for line in f:
        line1 = line
        list1 = line1.split(": ")
        if (len(list1) == 2):
            key = list1[0]
            value = list1[1]
            is_valid = value.isnumeric()
            if is_valid == True:
                value1 = float(value)
                bank[(key)] = value1
return bank
My code returns a NoneType object which causes an error but I don't know where the code is wrong. Also, there are many other errors. How can I improve/fix the code?
Try this code; I'll explain each part of it, since how much of it looks familiar depends on your understanding of Python data structures:
Code Syntax
adict = {}
with open("text_data.txt") as data:
    """
    adict (dict): a dictionary that stores the data produced while
    separating each line of the file into a key and a value.
    We do that by iterating over the lines returned by `readlines()`.
    Each `line` is sliced at the position of the ':' character: the part
    before it becomes the key and the part after it becomes the value.
    One problem is the '\n' that appears at the end of each line;
    you can use `strip` to remove it from the end of each line.
    """
    adict = {line[:line.index(':')]: line[line.index(':')+1: ].strip('\n') for line in data.readlines()}
print(adict)
Output
{' Brandon': '115.5', ' James': '128.87', ' Sarah': '827.43', ' Patrick': "'18.9"}
As for validating the value, a quick search shows that you can check whether a value is a number or not.
According to "Detect whether a Python string is a number or a letter":
a = 5
def is_number(a):
    try:
        float(a)
    except ValueError:
        return False
    else:
        return True
By Calling the function
print(is_number(a))
print(is_number(1.4))
print(is_number('hello'))
OUTPUT
True
True
False
Now, let's go back to our code and edit it.
All you need to do is add a condition to the dict comprehension:
adict = {line[:line.index(':')]: line[line.index(':')+1: ].strip(' \n') for line in data.readlines() if is_number(line[line.index(':')+1: ].strip('\n')) == True}
OUTPUT
{'Brandon': '115.5', 'James': '128.87', 'Sarah': '827.43'}
You can check the value of the dict by passing it to the function that we created
Code Syntax
print(is_number(adict['Brandon']))
OUTPUT
True
You can add more extensions to the is_number() function if you want.
You're likely hitting the return in the else statement, which doesn't return anything (hence None). So as soon as there is one line in your file that does not contain 2 white-space separated values, you're returning nothing.
Also note that your code only assigns a value to a key in the dictionary. It does not add the amount to the existing value when the key already exists, as the assignment requires.
This should effectively do the job:
bank = {}
with open(filename) as file:
    for line in file:
        key, val = line.rsplit(": ", 1)  # Split on the last ': ' to avoid ambiguity from colons in the middle
        # Use try/except to attempt the conversion to float
        try:
            bank[key] = float(val)
        except ValueError as e:
            print(e)
return bank
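Putting the requirements from the assignment together (create the key if it is new, increment it if it exists, skip lines with a missing or invalid value), a complete function could look roughly like the sketch below. The function name load_bank and the sample file contents are assumptions made for the demo, and float() is used as the validity test, so it also accepts strings such as "nan" or "inf".

```python
import os
import tempfile


def load_bank(filename):
    """Build {name: total} from lines shaped like 'name: amount'."""
    bank = {}
    with open(filename) as f:
        for line in f:
            name, sep, amount = line.strip().partition(":")
            if not sep:
                continue  # no ':' on this line, nothing to parse
            try:
                amount = float(amount)
            except ValueError:
                continue  # missing or invalid number: ignore the line
            name = name.strip()
            # Create the key on first sight, otherwise add to the running total.
            bank[name] = bank.get(name, 0.0) + amount
    return bank


# Hypothetical sample file based on the example in the question,
# plus a repeated user and a line with no value.
sample = ("Brandon: 115.5\nJames: 128.87\nSarah: 827.43\n"
          "Patrick:'18.9\nBrandon: 4.5\nEmpty:\n")
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
result = load_bank(tmp.name)
os.remove(tmp.name)
print(result)
```

Since the return statement sits at the end of the function, outside the loop, the function always returns a dictionary, even when some lines are skipped.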

Python Max function - Finding highest value in a dictionary

My question is about finding highest value in a dictionary using max function.
I have a created dictionary that looks like this:
cc_GDP = {'af': 1243738953, 'as': 343435646, etc}
I would like to be able to simply find and print the highest GDP value for each country.
My best attempt having read through similar questions is as follows (I'm currently working through the Python crash course book at which the base of this code has been taken, note the get_country_code function is simply providing 2 letter abbreviations for the countries in the GDP_data json file):
import json
#Load the data into a list
filename = 'gdp_data.json'
with open(filename) as f:
    gdp_data = json.load(f)
cc_GDP = {}
for gdp_dict in gdp_data:
    if gdp_dict['Year'] == 2016:
        country_name = gdp_dict['Country Name']
        GDP_total = int(gdp_dict['Value'])
        code = get_country_code(country_name)
        if code:
            cc_GDP[code] = int(GDP_total)
print(max(cc_GDP, key=lambda key: cc_GDP[key][1]))
This provides the following error 'TypeError: 'int' object is not subscriptable'
Note if leaving out the [1] in the print function, this does provide the highest key which relates to the highest value, but does not return the highest value itself which is what I wish to achieve.
Any help would be appreciated.
So you currently extract the key of the country that has the highest value with this line:
country_w_highest_val = max(cc_GDP, key=lambda key: cc_GDP[key])
You can of course just look that up in the dictionary again:
highest_val = cc_GDP[country_w_highest_val]
But simpler, disregard the keys completely, and just find the highest value of all values in the dictionary:
highest_val = max(cc_GDP.values())
How about something like this:
print(max(cc_GDP.values()))
That will give you the highest value but not the key.
The error is caused because cc_GDP[key] is already an int, so indexing it with [1] fails. Remove the [1] and then use the following line:
print(cc_GDP[max(cc_GDP, key=lambda key: cc_GDP[key])])
Your code currently just returns the dictionary key. You need to plug this key back into the dictionary to get the GDP.
You could use the .items() method of dict to get (key, value) tuples and process them in the following way:
cc_GDP = {'af': 1243738953, 'as': 343435646}
m = max(list(cc_GDP.items()), key=lambda x:x[1])
print(m) #prints ('af', 1243738953)
The output m in this case is a 2-tuple; you can access the key 'af' via m[0] and the value 1243738953 via m[1].
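The approaches from the answers above can be compared side by side. This is just a sketch using the two sample entries from the question; the variable names are chosen for the demo.

```python
from operator import itemgetter

cc_GDP = {'af': 1243738953, 'as': 343435646}  # sample entries from the question

# Highest value only.
highest_val = max(cc_GDP.values())

# Key with the highest value; cc_GDP.get maps each key to its value.
top_code = max(cc_GDP, key=cc_GDP.get)

# Key and value together, comparing the (key, value) pairs by their value.
top_pair_code, top_pair_val = max(cc_GDP.items(), key=itemgetter(1))

print(highest_val, top_code, top_pair_code, top_pair_val)
```

If only the number is needed, max(cc_GDP.values()) is the shortest option; the items() variant is useful when the country code should be printed along with it.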

How to create a dictionary whose values are sets?

I'm working on an exercise that requires me to build two dictionaries, one whose keys are country names, and the values are the GDP. This part works fine.
The second dictionary is where I'm lost, as the keys are supposed to be the letters A‐Z and the values are sets of country names. I tried using a for loop, which I've commented on below, where the issue lies.
If the user enters a string with only one letter (like A), the program should print all the countries that begin with that letter. When you run the program, however, it only prints out one country for each letter.
The text file contains 228 lines. ie:
1:Qatar:98900
2:Liechtenstein:89400
3:Luxembourg:80600
4:Bermuda:69900
5:Singapore:59700
6:Jersey:57000
etc.
And here's my code.
initials = []
countries=[]
incomes=[]
dictionary={}
dictionary_2={}
keywordFile = open("raw.txt", "r")
for line in keywordFile:
    line = line.upper()
    line = line.strip("\n")
    line = line.split(":")
    initials.append(line[1][0]) # first letter of second element
    countries.append(line[1])
    incomes.append(line[2])
for i in range(0,len(countries)):
    dictionary[countries[i]] = incomes[i]
# this for loop should spit out 248 values (one for each country), where the key is the initial and the value is the country name. However, it only spits out 26 values (one country for each letter in the alphabet)
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = countries[i]
print(dictionary_2)
while True:
    inputS = str(input('Enter an initial or a country name.'))
    if inputS in dictionary:
        value = dictionary.get(inputS, "")
        print("The per capita income of {} is {}.".format((inputS.title()), value ))
    elif inputS in dictionary_2:
        value = dictionary_2.get(inputS)
        print("The countries that begin with the letter {} are: {}.".format(inputS, (value.title())))
    elif inputS.lower() in "quit":
        break
    else:
        print("Does not exist.")
print("End of session.")
I'd appreciate any input leading me in the right direction.
Use defaultdict to make sure each value of your initials dict is a set, and then use the add method. If you just use =, you'll overwrite the value stored for that initial each time. defaultdict is an easier way of writing an expression like:
if initial in d:
    d[initial].add(country)
else:
    d[initial] = {country}
See the full working example below, and also note that I'm using enumerate instead of range(0,len(countries)), which I'd also recommend:
#!/usr/bin/env python3
from collections import defaultdict
initials, countries, incomes = [],[],[]
dict1 = {}
dict2 = defaultdict(set)
keywordFile = """
1:Qatar:98900
2:Liechtenstein:89400
3:Luxembourg:80600
4:Bermuda:69900
5:Singapore:59700
6:Jersey:57000
""".strip().split("\n")
for line in keywordFile:
    line = line.upper().strip("\n").split(":")
    initials.append(line[1][0])
    countries.append(line[1])
    incomes.append(line[2])
for i,country in enumerate(countries):
    dict1[country] = incomes[i]
    dict2[initials[i]].add(country)
print(dict2["L"])
Result:
{'LUXEMBOURG', 'LIECHTENSTEIN'}
see: https://docs.python.org/3/library/collections.html#collections.defaultdict
The values for dictionary2 should be such that they can contain a list of countries. One option is to use a list as the values in your dictionary. In your code, you are overwriting the values for each key whenever a new country with the same initial is to be added as the value.
Moreover, you can use the setdefault method of the dictionary type. This code:
dictionary2 = {}
for country in countries:
    dictionary2.setdefault(country[0], []).append(country)
should be enough to create the second dictionary elegantly.
setdefault either returns the value for the key (in this case the key is the first letter of the country name) if it already exists, or inserts a new key (again, the first letter of the country) into the dictionary with a value that is an empty list [] and returns that value.
edit
If you want your values to be sets (for faster lookup/membership tests), you can use the following lines:
dictionary2 = {}
for country in countries:
    dictionary2.setdefault(country[0], set()).add(country)
Here's a link to a live functioning version of the OP's code online.
The keys in Python dict objects are unique. There can only ever be one 'L' key in a single dict. What happens in your code is that first the key/value pair 'L':'Liechtenstein' is inserted into dictionary_2. However, in a subsequent iteration of the for loop, 'L':'Liechtenstein' is overwritten by 'L':'Luxembourg'. This kind of overwriting is sometimes referred to as "clobbering".
Fix
One way to get the result that you seem to be after would be to rewrite that for loop:
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = dictionary_2.get(initials[i], set()) | {countries[i]}
print(dictionary_2)
Also, you have to rewrite the related elif statement beneath that:
elif inputS in dictionary_2:
    titles = ', '.join([v.title() for v in dictionary_2[inputS]])
    print("The countries that begin with the letter {} are: {}.".format(inputS, titles))
Explanation
Here's a complete explanation of the dictionary_2[initials[i]] = dictionary_2.get(initials[i], set()) | {countries[i]} line above:
dictionary_2.get(initials[i], set())
If initials[i] is a key in dictionary_2, this will return the associated value. If initials[i] is not in the dictionary, it will return the empty set set() instead.
{countries[i]}
This creates a new set with a single member in it, countries[i].
dictionary_2.get(initials[i], set()) | {countries[i]}
The | operator computes the union of the two sets and returns it as a new set.
dictionary_2[initials[i]] = ...
The right hand side of the line either creates a new set, or adds to an existing one. This bit of code assigns that newly created/expanded set back to dictionary_2.
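The steps explained above can be tried on a small list. The country names below are a made-up sample and by_initial is a name chosen for this sketch; in the original code the same roles are played by countries, initials and dictionary_2.

```python
# Hypothetical sample; the real list is built from raw.txt.
countries = ["QATAR", "LIECHTENSTEIN", "LUXEMBOURG", "BERMUDA"]

by_initial = {}
for country in countries:
    initial = country[0]
    # get() returns the existing set, or an empty set for a new initial;
    # the union with {country} produces a new set that is stored back.
    by_initial[initial] = by_initial.get(initial, set()) | {country}

print(by_initial["L"])
```

Each iteration builds a new set object, which is fine for a few hundred countries; the setdefault and defaultdict versions shown in the other answers update the stored set in place instead.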
Notes
The above code sets the values of dictionary_2 as sets. If you want to use list values, use this version of the for loop instead:
for i in range(0,len(countries)):
    dictionary_2[initials[i]] = dictionary_2.get(initials[i], []) + [countries[i]]
print(dictionary_2)
You're very close to what you're looking for. You could populate your dictionaries while looping over the contents of the file raw.txt that you're reading. You can also read the contents of the file first and then perform the necessary operations to populate the dictionaries. You could achieve this with nice one-liners in Python using dict comprehensions and groupby. Here's an example:
country_per_capita_dict = {}
letter_countries_dict = {}
keywordFile = [line.strip() for line in open('raw.txt' ,'r').readlines()]
You now have a list of all lines in the keywordFile as follows:
['1:Qatar:98900', '2:Liechtenstein:89400', '3:Luxembourg:80600', '4:Bermuda:69900', '5:Singapore:59700', '6:Jersey:57000', '7:Libya:1000', '8:Sri Lanka:5000']
As you loop over the items, you can split(':') and use the [1] and [2] index values as required.
You could use dictionary comprehension as follows:
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
Which results in:
{'Qatar': '98900', 'Liechtenstein': '89400', 'Luxembourg': '80600', 'Bermuda': '69900', 'Singapore': '59700', 'Jersey': '57000', 'Libya': '1000', 'Sri Lanka': '5000'}
Similarly using groupby from itertools you can obtain:
from itertools import groupby
country_list = sorted(country_per_capita_dict.keys())
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
Which results in the required dictionary of initial : [list of countries]
{'B': ['Bermuda'], 'J': ['Jersey'], 'L': ['Libya', 'Liechtenstein', 'Luxembourg'], 'Q': ['Qatar'], 'S': ['Singapore', 'Sri Lanka']}
A complete example is as follows:
from itertools import groupby
country_per_capita_dict = {}
letter_countries_dict = {}
keywordFile = [line.strip() for line in open('raw.txt' ,'r').readlines()]
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
country_list = sorted(country_per_capita_dict.keys())
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
print (country_per_capita_dict)
print (letter_countries_dict)
Explanation:
The line:
country_per_capita_dict = {entry.split(':')[1] : entry.split(':')[2] for entry in keywordFile}
loops over the following list
['1:Qatar:98900', '2:Liechtenstein:89400', '3:Luxembourg:80600', '4:Bermuda:69900', '5:Singapore:59700', '6:Jersey:57000', '7:Libya:1000', '8:Sri Lanka:5000'] and splits each entry in the list by :
It then takes the value at index [1] and [2] which are the country names and the per capita value and makes them into a dictionary.
country_list = sorted(country_per_capita_dict.keys())
This line extracts the names of all the countries from the dictionary created before into a list and sorts them alphabetically, which groupby needs in order to work correctly.
letter_countries_dict = {k: list(g) for k,g in groupby(country_list, key=lambda x:x[0]) }
This dict comprehension takes the sorted list of countries; groupby groups consecutive names that share the same first letter x[0], and each group g is turned into list(g).

Nested dictionary behavior

I am trying to learn how to manipulate data in python.
I have the following data in a txt file
{"summonerId":000000,"games":[{"gameId":111111,"invalid":false,"gameMode":"CLASSIC","gameType":"MATCHED_GAME","subType":"NORMAL","mapId":11,"teamId":200,"championId":89,"spell1":3,"spell2":4,"level":30,"ipEarned":237,"createDate":1443314494341,"fellowPlayers":[{"summonerId":46350758,"teamId":100,"championId":157}],"stats":{"level":15,"goldEarned":10173,"numDeaths":5,"minionsKilled":48,"championsKilled":1,"goldSpent":9205,"totalDamageDealt":48752,"totalDamageTaken":23464,"team":200,"win":true,"largestMultiKill":1,"physicalDamageDealtPlayer":9064,"magicDamageDealtPlayer":35714,"physicalDamageTaken":18944,"magicDamageTaken":4005,"timePlayed":1831,"totalHeal":4129,"totalUnitsHealed":5,"assists":24,"item0":3401,"item1":2049,"item2":3117,"item3":3068,"item4":3075,"item5":1028,"item6":3340,"magicDamageDealtToChampions":9062,"physicalDamageDealtToChampions":3348,"totalDamageDealtToChampions":12411,"trueDamageDealtPlayer":3974,"trueDamageTaken":514,"wardKilled":1,"wardPlaced":16,"totalTimeCrowdControlDealt":104,"playerRole":2,"playerPosition":4}]}
My end goal is to be able to display a specific piece of information from the "stats" dictionary.
When I run the following code
import json
matches = open('testdata.txt', 'r')
output = matches.read()
data=json.loads(output)
display = data["games"]
print("Info: " + str(display))
The output is everything that corresponds to the "games" key as I would expect.
When I try
import json
matches = open('testdata.txt', 'r')
output = matches.read()
data=json.loads(output)
display = data["games"]["stats"]
print("Info: " + str(display))
I receive: TypeError: list indices must be integers, not str
I'm not really sure how to proceed given that the key is clearly a string and not an integer...
Your data["games"] value is a list; each element in that list is a dictionary, and it is those dictionaries in the list that (may) have the 'stats' key. A list can contain 0 or more elements; in this specific case there is just 1 but there could be more or none.
Loop over the list of dictionaries, or pick a specific dictionary from the list with indexing. Since there is only one in your specific example, you could just index that 1 element with the 0 index:
display = data["games"][0]["stats"]
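If there can be more than one game, or a game without a "stats" entry, looping over the list is the safer option. Here is a small sketch with a trimmed-down, made-up version of the data (the second game and the choice of the "assists" field are only for illustration):

```python
import json

# Hypothetical, shortened stand-in for the contents of testdata.txt.
raw = ('{"summonerId": 1, "games": ['
       '{"gameId": 111111, "stats": {"win": true, "assists": 24}}, '
       '{"gameId": 222222}]}')
data = json.loads(raw)

# Index into the list first, then use the string key on the dictionary.
first_stats = data["games"][0]["stats"]

# Or loop over every game; get() avoids a KeyError when "stats" is missing.
assists = [game.get("stats", {}).get("assists") for game in data["games"]]

print(first_stats["assists"], assists)
```

The loop yields None for games that have no "stats" dictionary, so those entries can be filtered out or reported as needed.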
