Searching and manipulating lists in lists in Python

I have lists of names and cities and am trying to compile a list of the number of users in each city.
I want to have a list that looks like:
citylist = (['New York', 53], ['San Francisco', 23], ['Los Angeles', 54])
etc.
First problem I have is that when I read a new line from the file I need to check whether that city already exists. If it doesn't then I need to add it and give it the number 1. So I have tried:
if city not in citylist:
    citylist.append([city, 1])
The problem with that is that the search doesn't work even if the city is already in the list, as I guess it is trying to match the city to the entire element, not just the first item of the element. Can someone tell me how to get round that please?
The second part: let's assume that city is found somewhere in citylist. How can I then increment the number next to the city name by 1?
Thanks for any guidance.

Use a dictionary or collections.Counter here. List is not an appropriate data-structure for this task.
Normal dictionary example:
citydict = {'New York': 53,
            'San Francisco': 23,
            'Los Angeles': 54}
Now simply update the dictionary like this:
for line in file_obj:
    city = line.strip()  # assumes one city per line; adjust the parsing to your file format
    citydict[city] = citydict.get(city, 0) + 1
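The collections.Counter option mentioned above might look like the sketch below. The sample lines are made up for illustration, and it assumes each line holds nothing but a city name.

```python
from collections import Counter

# Hypothetical sample input; in practice these would be lines read from the file.
lines = ["New York\n", "Boston\n", "New York\n", "Los Angeles\n"]

# Counter tallies each distinct city after stripping the trailing newline.
city_counts = Counter(line.strip() for line in lines)

print(city_counts["New York"])  # 2
print(city_counts["Chicago"])   # 0, a Counter reports zero for keys it has not seen
```

A Counter is a dict subclass, so the usual dictionary operations still work on it.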

A Python dict is the proper data structure for what you want to achieve. Using defaultdict(int), you can also increment the count for a given city (a key of the dict) directly, even if it is not yet present in the dict.
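A minimal sketch of that idea, using a made-up list of cities in place of the file contents:

```python
from collections import defaultdict

# Hypothetical sample data standing in for cities read from a file.
cities = ["New York", "San Francisco", "New York"]

citydict = defaultdict(int)  # missing keys start at int() == 0
for city in cities:
    citydict[city] += 1      # no membership check needed

print(dict(citydict))  # {'New York': 2, 'San Francisco': 1}
```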

Use a dictionary to maintain the counters.
Here is some sample code:
citydict = {}
with open("cities.txt", "r") as f:
    for line in f:
        city = line.strip()  # drop the trailing newline
        if city in citydict:
            citydict[city] += 1
        else:
            citydict[city] = 1
print(list(citydict.items()))

As everyone else said, a dictionary is exactly the data structure for this type of problem. But if you really want it as a list (e.g. to understand how lists work), you can do it as follows:
def add_to_citylist(citylist, city):
    """modifies citylist according to spec"""
    city_already_in_citylist = False
    # iterate through citylist and get each city sub-list as c:
    for c in citylist:
        if c[0] == city:
            # city found, so update count
            c[1] += 1
            # take a note that city was in list:
            city_already_in_citylist = True
    if not city_already_in_citylist:
        # we did not find city in citylist --> add it
        citylist.append([city, 1])

# your citylist should be a list (not a tuple (...)) since a tuple is immutable
citylist = [['New York', 53], ['San Francisco', 23], ['Los Angeles', 54]]
add_to_citylist(citylist, "Boston")
add_to_citylist(citylist, "New York")
print(citylist)
After understanding the idea, you can improve the code by using "return" in the loop, which has a similar effect but is more efficient since it exits the function as soon as the element is found:
def add_to_citylist(citylist, city):
    """modifies citylist according to spec"""
    # iterate through citylist and get each city sub-list as c:
    for c in citylist:
        if c[0] == city:
            # city found, so update the count and stop
            c[1] += 1
            return
    # we did not find city in citylist --> add it
    citylist.append([city, 1])

Related

Access nested list within a dictionary

So I'm working on a lab studying multilayer dictionaries, the goal is to receive an input of a string including a country and three cities located in this country, i.e.
string = "Spain Madrid Barcelona Valencia", then ask for an input (city = "Madrid"). If the city has been previously input, the output should be Madrid is located in Spain, otherwise the output should be No data on input city.
I've come up with the following:
country = "Spain Madrid Barcelona Valencia".split()
#Initialize a dictionary:
d = {}
#Create another list that only includes cities:
cities_list = country[1:]
#Create a nested list within a dictionary:
d[country[0]] = cities_list
Which would provide a nested dictionary such as {'Spain': ['Madrid', 'Barcelona', 'Valencia']}
And that's where I get really confused. It's clear that I need to access the nested list, but using d.values() only gives the following output
dict_values([['Madrid', 'Barcelona', 'Valencia']])
I'm clearly missing some fundamental info on the topic, but I've looked up here and in Eric Matthes's "Crash Course" but still couldn't find a solid solution.
Probably my initial approach is completely wrong? There are a couple of similar topics here too, but it seems none of them involves not just accessing a list (which I kinda understand: d["Spain"][0]) but also comparing an input to one of the list's values.
Any advice would be great anyways.

You are correct. To check whether Madrid exists in your dictionary, you can use a for loop:
city_to_be_searched = 'Madrid'
result = None
for k, v in d.items():
    if city_to_be_searched in v:
        result = k
if result:
    print(f'{city_to_be_searched} located in {result}')
else:
    print('No data found')
Output:
Madrid located in Spain

After taking an input:
city_name = input()
You could do something like:
result = None
for key in d:
    if city_name in d[key]:
        result = f'{city_name} is located in {key}'
if result:
    print(result)
else:
    print('No data on input city')

You can iterate over the key, value pairs of a dictionary using items() method as:
city = "Madrid"
found = False
for key, value in d.items():
    if city in value:  # check if Madrid is in the values' list
        print(f"{city} is located in {key}")
        found = True
        break
if not found:
    print("No data on input city")
Output:
Madrid is located in Spain

Finding highest value in a dictionary

I'm new to programming and currently taking a CSC 110 class. Our assignment is to create a bunch of functions that do all sorts of things with some data that is given. I have taken all that data and put it into a dictionary, but I'm having some trouble getting the data I want out of it.
Here is my problem:
I have a dictionary that stores a bunch of countries followed by a list that includes their population and GDP. Formatted something like this
{'country': [population, GDP], ...}
My task is to loop through this and find the country with the highest population or GDP then print:
print('The country with the highest population is ' + highCountry +
      ' with a population of ' + format(highPop, ',.0f') + '.')
In order to do this I wrote this function (this one is specifically for highest population but they all look about the same).
def highestPop(worldInfo):
    highPop = worldInfo[next(iter(worldInfo))][0]  # Grabs first country's population
    highCountry = next(iter(worldInfo))  # Grabs first country in worldInfo
    for k, v in worldInfo.items():
        if v[0] > highPop:
            highPop = v[0]
            highCountry = k
    return highPop, highCountry
While this is working for me I gotta think there is an easier way to do this. Also I'm not 100% sure how [next(iter(worldInfo))] works. Does this just grab the first value it sees?
Thanks for your help in advance!
Edit: Sorry I guess I wasn't clear. I need to pass the countries population but also the countries name. So I can print both of them in my main function.

I think you're looking for this:
max(worldInfo.items(), key=lambda x: x[1][0])
This will return both the country name and its info. For instance:
('france', [100, 22])
The max() function can work on python "iterables" which is a fancy word for anything that can be cycled or looped through. Thus it cycles or loops through the thing you put into it and spits out the item that's the highest.
But how does it judge which tuple is highest? Which is higher: France or Germany? You have to specify a key (some specification for how to judge each item). The key=lambda etc specifies a function that given an item (x), judge that item based on x[1][0]. In this instance if the item is ('france', [100, 22]) then x[1][0] is 100. So the x[1][0] of each item is compared and the item with the highest one is returned.
The next() and iter() functions are for python iterators. For example:
mytuple = ("apple", "banana", "cherry")
myit = iter(mytuple)
print(next(myit)) #=> apple
print(next(myit)) #=> banana
print(next(myit)) #=> cherry
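Applied to a dictionary, iter() walks over the keys, so next(iter(worldInfo)) returns the first key. Here is a small sketch with made-up numbers; on Python 3.7+ dictionaries keep insertion order, so "first" means first inserted:

```python
# Made-up figures, purely for illustration: {country: [population, GDP]}
worldInfo = {"france": [100, 22], "germany": [120, 30]}

first_country = next(iter(worldInfo))    # iterating a dict yields its keys
first_pop = worldInfo[first_country][0]  # population of that first country

print(first_country, first_pop)  # france 100
```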

Use the max() function, like so:
max(item[0] for item in country_dict.values())  # use item[1] for GDP!
Also try storing the values not in a list ([a, b]) but in a tuple ((a, b)).
Edit: Like iamanigeeit said in the comments, this works to give you the country name as well:
max((data[0], country) for country, data in country_dict.items())

An efficient solution to get the key with the highest value: you can use the max function this way:
highCountry = max(worldInfo, key=lambda k: worldInfo[k][0])
The key argument is a function that specifies what values you want to use to determine the max.
And obviously :
highPop = worldInfo[highCountry][0]

Build a List of Tuples from a Dict

I have a list y of keys from a dictionary that is derived from a call to the Google Places API.
I would like to build a list of tuples for each point of interest:
lst = []
for i in range(len(y)):
    lst.append((y[i]['name'], y[i]['formatted_address'], y[i]['opening_hours']['open_now'], y[i]['rating']))
This works if the field is in the list and I receive a list of results that look like the one below, which is exactly what I want:
("Friedman's", '1187 Amsterdam Ave, New York, NY 10027, USA', True, 4.2)
However, the script throws an error if a desired field is not in the list y. How can I build a list of tuples that checks whether the desired field is in y before building the tuple?
Here's what I've tried:
for i in range(len(y)):
    t = ()
    if y[i]['name']:
        t = t + lst.append(y[i]['name'])
    if y[i]['formatted_address']:
        t = t + lst.append(y[i]['formatted_address'])
    if y[i]['opening_hours']['open_now']:
        t = t + lst.append(y[i]['opening_hours']['open_now'])
    if y[i]['rating']:
        t = t + lst.append(y[i]['rating'])
    lst.append(t)
However, this doesn't work and seems very inelegant. Any suggestions?

This list comprehension uses default values when one of the keys is not present (using dict.get()). I added variables so you can set the desired default values.
default_name = ''
default_address = ''
default_open_now = False
default_rating = 0.0
new_list = [
    (
        e.get('name', default_name),
        e.get('formatted_address', default_address),
        e.get('opening_hours', {}).get('open_now', default_open_now),
        e.get('rating', default_rating),
    )
    for e in y
]

For a start, you should almost never loop over range(len(something)). Always iterate over the thing directly. That goes a long way to making your code less inelegant.
For the actual issue, you could loop over the keys and only add the item if it is in the dict. That gets a bit more complicated with your one element that is a nested lookup, but if you take it out then your code just becomes:
for item in y:
    lst.append(tuple(item[key] for key in ('name', 'formatted_address', 'opening_hours', 'rating') if key in item))

You can use the get feature from dict.
y[i].get('name')
If y[i] has the key 'name', this returns its value; otherwise it returns None. For nested dicts, pass an empty dict as the default to the outer get:
y[i].get('opening_hours', {}).get('open_now')
As for the data structure, I recommend keeping each item as a dict and adding the dicts to a list.
lst = []
lst.append({'name': "Friedman's", "address": '1187 Amsterdam Ave, New York, NY 10027, USA'})

Try this:
for i in y:
    lst.append(tuple(v for k, v in i.items()))

You can use the keys method to find the keys in a dict. In your case:
lst = []
# 'open_now' is nested under 'opening_hours', so only the top-level keys are listed here
fields = ('name', 'formatted_address', 'opening_hours', 'rating')
for i in range(len(y)):
    data = []
    for f in fields:
        if f in y[i].keys():
            data.append(y[i][f])
        else:
            data.append(None)
    lst.append(tuple(data))
Note that you can also get all the key, value pairs in a dict using the items() method, which simplifies the code a bit. To make it even better, iterate over y directly rather than over range(len(y)). This variant skips missing fields instead of inserting None, and the values follow the order of each dict:
lst = []
fields = ('name', 'formatted_address', 'opening_hours', 'rating')
for i in y:
    data = []
    for key, value in i.items():
        if key in fields:
            data.append(value)
    lst.append(tuple(data))

Using a dict as a key when not all values exist [duplicate]

I have a situation where some city names need to be renamed, so I am using a dict where the keys are the old city names and the values are the new ones. However, only some cities need to be renamed so not all possible cities are in the dict.
The only way I know how to do it is to catch the KeyError raised when the city doesn't need to be renamed, which works, but I'm not sure if this is bad practice, or if there are any downsides to this. Is there something I am missing?
# Set Venue
venue_name = unidecode(cell[2].get_text())
try:
    # Correct venue names i.e. Cairns, QLD = Cairns
    venue_name = VENUE_NAMES_DICT[venue_name]
except KeyError:
    pass

As @jarmod suggests, you can use the .get() method of the standard dictionary to provide a default value in case the key is missing. What isn't described is that this approach lets you turn your problem into a one-liner by passing the venue_name value to .get() as the default value.
# Set Venue
venue_name = unidecode(cell[2].get_text())
# Correct venue names i.e. Cairns, QLD = Cairns
venue_name = VENUE_NAMES_DICT.get(venue_name, venue_name)
If venue_name is present as a key in the dictionary, .get() will return the desired new value. If it isn't present, .get() will return the original value of venue_name unchanged. This eliminates the need for any conditional logic.

What you can do is use defaultdict:
from collections import defaultdict
d = defaultdict(list)  # a missing key gets an empty list on first access, so you can append to it right away
d[venue_name].append(a)
Example:
>>> from collections import defaultdict
>>> city_list = [('TX','Austin'), ('TX','Houston'), ('NY','Albany'), ('NY', 'Syracuse'), ('NY', 'Buffalo'), ('NY', 'Rochester'), ('TX', 'Dallas'), ('CA','Sacramento'), ('CA', 'Palo Alto'), ('GA', 'Atlanta')]
>>>
>>> cities_by_state = defaultdict(list)
>>> for state, city in city_list:
...     cities_by_state[state].append(city)
...
>>> for state, cities in cities_by_state.items():
...     print(state, ', '.join(cities))
...
TX Austin, Houston, Dallas
NY Albany, Syracuse, Buffalo, Rochester
CA Sacramento, Palo Alto
GA Atlanta

You can use dict.get(key, default_value) and supply a default value.
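For the renaming case, one way to apply this is to pass the original name as the default. The sketch below uses a made-up rename table and a hypothetical helper called rename:

```python
# Hypothetical rename table, for illustration only.
VENUE_NAMES_DICT = {"Cairns, QLD": "Cairns"}

def rename(venue_name):
    # Fall back to the original name when no rename is registered.
    return VENUE_NAMES_DICT.get(venue_name, venue_name)

print(rename("Cairns, QLD"))  # Cairns
print(rename("Sydney"))       # Sydney
```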

You can use "in", like this:
data = ['Chicago', 'NYC', 'Boston', 'SD']
dictionary = {'NYC': 'New York', 'SD': 'San Diego'}
new_list = []
for x in data:
    if x in dictionary:
        new_list.append(dictionary[x])
    else:
        new_list.append(x)
print(new_list)
Output:
['Chicago', 'New York', 'Boston', 'San Diego']
Using a list comprehension:
data = ['Chicago', 'NYC', 'Boston', 'SD']
dictionary = {'NYC': 'New York', 'SD': 'San Diego'}
new_list = [dictionary[x] if x in dictionary else x for x in data]
print(new_list)
Output:
['Chicago', 'New York', 'Boston', 'San Diego']

In general, throwing exceptions for non-exceptional situations is poor design. You want a defaultdict.
from collections import defaultdict
renames = defaultdict(lambda: None)
# Add the elements to renames here ...
Now, renames is a dictionary, except that if the key doesn't exist, it returns None rather than throwing, so you can just check if the value is None to see if it needs to be renamed.
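A rough sketch of how that check might be used, with a made-up entry and a hypothetical helper named corrected. One thing to be aware of: looking up a missing key in a defaultdict also stores that key with the default value.

```python
from collections import defaultdict

renames = defaultdict(lambda: None)
renames["Cairns, QLD"] = "Cairns"  # hypothetical entry

def corrected(name):
    new_name = renames[name]  # None when no rename is registered
    return name if new_name is None else new_name

print(corrected("Cairns, QLD"))  # Cairns
print(corrected("Sydney"))       # Sydney
print("Sydney" in renames)       # True: the lookup above inserted the key
```

If growing the dictionary on every miss is a concern, renames.get(name) performs the same check without inserting anything.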

How to access an entry in a list of lists and print the other entries in that list

For simplicity sake let's say I'm using a list as follows:
[['Bob', 'Pizza', 'Male'], ['Sally', 'Tacos', 'Female']]
I want to ask the user which person's stats they would like to view such that it would print out Bob, Pizza, and Male when it was called. I tried to use the index method but the list of lists I'm working with has well over 150 entries.
I tried to use something like:
personName = input("Enter the person whose stats you would like to see: ")
personIndex = personList.index(personName)
personStats = personList[personName][1:3] # first index is the name, index 1 and 2 is favorite food and gender
print(personStats)
But it doesn't work.

If you really want to use index, you can do it like below:
lst = [['Bob', 'Pizza', 'Male'], ['Sally', 'Tacos', 'Female']]
personName = input("Enter the person whose stats you would like to see: ")
ind = [i[0] for i in lst].index(personName)
food, gender = lst[ind][1:]
print("{0} is a {1}, a {2} lover".format(personName, gender, food))

Ahsanul's way is not very efficient because it gets the first item of each list even if the first one matches. Mine is short-circuiting:
index = next(i for i, v in enumerate(personList) if v[0] == personName)
If there is a possibility that it doesn't exist, you can have a default like this:
index = next((i for i, v in enumerate(personList) if v[0] == personName), my_default)
If you want the index only to get the value, change the first i to a v to get the value in the first place so that you don't need to worry about the extra processing time of finding the value at that index.
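Fetching the value directly might look like the sketch below. The find_person helper is a hypothetical name, and None is used as the default here as an assumption; any sentinel would do.

```python
personList = [['Bob', 'Pizza', 'Male'], ['Sally', 'Tacos', 'Female']]

def find_person(name):
    # Stops at the first matching sub-list; returns None if nobody matches.
    return next((v for v in personList if v[0] == name), None)

print(find_person('Sally'))  # ['Sally', 'Tacos', 'Female']
print(find_person('Alice'))  # None
```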
