I am seeing strange behaviour: a dictionary is only populated once, and I cannot add more key-value pairs to it.
My code reads in a multi-line string and extracts substrings via split(), to be added to a dictionary. I make use of conditional statements. Strangely only the key:value pairs under the first conditional statement are added.
Therefore I can not complete the dictionary.
How can I solve this issue?
Minimal code:
#I hope the '\n' is sufficient or use '\r\n'
example = "Name: Bugs Bunny\nDOB: 01/04/1900\nAddress: 111 Jokes Drive, Hollywood Hills, CA 11111, United States"
def format(data):
    dic = {}
    for line in data.splitlines():
        #print('Line:', line)
        if ':' in line:
            info = line.split(': ', 1)[1].rstrip() #does not work with files
            #print('Info: ', info)
            if ' Name:' in info: #middle name problems! /maiden name
                dic['F_NAME'] = info.split(' ', 1)[0].rstrip()
                dic['L_NAME'] = info.split(' ', 1)[1].rstrip()
            elif 'DOB' in info: #overhang
                dic['DD'] = info.split('/', 2)[0].rstrip()
                dic['MM'] = info.split('/', 2)[1].rstrip()
                dic['YY'] = info.split('/', 2)[2].rstrip()
            elif 'Address' in info:
                dic['STREET'] = info.split(', ', 2)[0].rstrip()
                dic['CITY'] = info.split(', ', 2)[1].rstrip()
                dic['ZIP'] = info.split(', ', 2)[2].rstrip()
    return dic
if __name__ == '__main__':
    x = format(example)
    for v, k in x.iteritems():
        print v, k
Your code doesn't work, at all. You split off the name before the colon and discard it, looking only at the value after the colon, stored in info. That value never contains the names you are looking for; Name, DOB and Address all are part of the line before the :.
Python lets you assign to multiple names at once; make use of this when splitting:
def format(data):
    dic = {}
    for line in data.splitlines():
        if ':' not in line:
            continue
        name, _, value = line.partition(':')
        name = name.strip()
        if name == 'Name':
            dic['F_NAME'], dic['L_NAME'] = value.split(None, 1) # strips whitespace for us
        elif name == 'DOB':
            dic['DD'], dic['MM'], dic['YY'] = (v.strip() for v in value.split('/', 2))
        elif name == 'Address':
            dic['STREET'], dic['CITY'], dic['ZIP'] = (v.strip() for v in value.split(', ', 2))
    return dic
I used str.partition() here rather than limiting str.split() to just one split; it is slightly faster that way.
For your sample input this produces:
>>> format(example)
{'CITY': 'Hollywood Hills', 'ZIP': 'CA 11111, United States', 'L_NAME': 'Bunny', 'F_NAME': 'Bugs', 'YY': '1900', 'MM': '04', 'STREET': '111 Jokes Drive', 'DD': '01'}
>>> from pprint import pprint
>>> pprint(format(example))
{'CITY': 'Hollywood Hills',
'DD': '01',
'F_NAME': 'Bugs',
'L_NAME': 'Bunny',
'MM': '04',
'STREET': '111 Jokes Drive',
'YY': '1900',
'ZIP': 'CA 11111, United States'}
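As a side note on the str.partition() call used in the function above, here is a minimal sketch of how it behaves compared with a one-split str.split(). The sample lines are made up for illustration.

```python
# str.partition() always returns a 3-tuple: (head, separator, tail).
line = 'DOB: 01/04/1900'
name, sep, value = line.partition(':')
print(repr(name), repr(sep), repr(value))

# When the separator is missing, partition() still returns three items,
# with the separator and tail left empty, so the unpacking never fails.
missing = 'no colon here'.partition(':')
print(missing)

# A single-split str.split() returns a list whose length depends on the input.
with_colon = line.split(':', 1)
without_colon = 'no colon here'.split(':', 1)
print(with_colon, without_colon)
```

Because the tuple always has three items, the `name, _, value = ...` unpacking is safe even for lines that contain no colon.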
I need to be able to print all instances of a name within the list of dictionaries, but I can't seem to print them in the desired format. The lookup also fails when the name is entered in lowercase and the stored name is capitalized.
def findContactsByName(name):
    return [element for element in contacts if element['name'] == name]
def displayContactsByName(name):
    print(findContactsByName(name))
if inp == 3:
    print("Item 3 was selected: Find contact")
    name = input("Enter name of contact to find: ")
    displayContactsByName(name)
When the name 'Joe' is entered, the output is:
[{'name': 'Joe', 'surname': ' Miceli', 'DOB': ' 25/06/2002', 'mobileNo': ' 79444425', 'locality': ' Zabbar'}, {'name': 'Joe', 'surname': 'Bruh', 'DOB': '12/12/2131', 'mobileNo': '77777777', 'locality': 'gozo'}]
When the name 'joe':
[]
Expected output:
name : Joe
surname : Miceli
DOB : 25/06/2002
mobileNo : 79444425
locality : Zabbar
name : Joe
surname : Bruh
DOB : 12/12/2131
mobileNo : 77777777
locality : gozo
Change the first function to:
def findContactsByName(name):
    return [element for element in contacts if element['name'].lower() == name.lower()]
To account for differences between uppercase and lowercase, I convert both the name in the dictionary and the entered name to lowercase, for the comparison only.
To print the result in the format you specified, you could write a function such as:
def printResult(result):
    for d in result:
        print(f"name: {d['name']}")
        print(f"surname: {d['surname']}")
        print(f"DOB: {d['DOB']}")
        print(f"mobileNo: {d['mobileNo']}")
        print(f"locality: {d['locality']}")
        print()
result=findContactsByName("joe")
printResult(result)
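If the set of keys may change, a variant that loops over each dictionary's items avoids hard-coding the field names. This is only a sketch: format_contacts and find_contacts_by_name are hypothetical helper names, the sample contacts are copied from the question, and strip() is applied because some of the stored values carry a leading space.

```python
contacts = [
    {'name': 'Joe', 'surname': ' Miceli', 'DOB': ' 25/06/2002',
     'mobileNo': ' 79444425', 'locality': ' Zabbar'},
    {'name': 'Joe', 'surname': 'Bruh', 'DOB': '12/12/2131',
     'mobileNo': '77777777', 'locality': 'gozo'},
]

def find_contacts_by_name(name):
    # Case-insensitive match on the 'name' field.
    return [c for c in contacts if c['name'].lower() == name.lower()]

def format_contacts(result):
    # Build one "key : value" line per field, with a blank line between contacts.
    blocks = []
    for contact in result:
        lines = ['{} : {}'.format(key, str(value).strip())
                 for key, value in contact.items()]
        blocks.append('\n'.join(lines))
    return '\n\n'.join(blocks)

print(format_contacts(find_contacts_by_name('joe')))
```

Returning the formatted text instead of printing inside the helper keeps the formatting easy to check separately from the lookup.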
I modified your program. Now you don't have to worry about the case and the output formatting.
contacts = [{'name': 'Joe', 'surname': ' Miceli', 'DOB': ' 25/06/2002', 'mobileNo': ' 79444425', 'locality': ' Zabbar'},
            {'name': 'Joe', 'surname': 'Bruh', 'DOB': '12/12/2131', 'mobileNo': '77777777', 'locality': 'gozo'}]
def findContactsByName(name):
    return [element for element in contacts if element['name'].lower() == name.lower()]
def displayContactsByName(name):
    # iterate over the matching contacts, not over the first entries of contacts
    for contact in findContactsByName(name):
        for key in contact:
            print('{}: {}'.format(key, contact[key]))
        print('\n')
displayContactsByName('Joe')
The case issue can be solved by converting both sides of the comparison to uppercase or lowercase:
    return [element for element in contacts if element['name'].upper() == name.upper()]
For the format of the printed output you could use the json module:
import json
print(json.dumps(findContactsByName(name), sort_keys=True, indent=4))
I have an old string and a modified one, plus a dictionary holding values taken from the old string. I am trying to check whether each value in the dictionary is still present, unchanged, in the new string. If it is, nothing happens. If the value has changed, the value in the dictionary is replaced by the modified value. If the value in the dictionary is not present in the new string at all, the dictionary value is set to None.
Code
import re
db_tag_old = {"art":"art", "organizer":"james", "month":"December", "season":"summer"}
old = 'The art is performed by james. _______ Season is summer _____ time. It is December.'
new = 'The art is performed by ______ Mathew. Season is ______ autmn time. __ __ _________'
db_tag_new = {}
final_db_tag = {}
symbol = '_'
needle = f'{re.escape(symbol)}+'
position = [(match.start(),match.end()) for match in re.finditer(needle, old)]
for key,value in db_tag_old.items():
    position_old = [(match.start(),match.end()) for match in re.finditer(value.lower(), old)]
    position_new = [(match.start(),match.end()) for match in re.finditer(value.lower(), new)]
    if position_old == position_new and [] not in (position_old, position_new):
        db_tag_new.update({key:value})
        continue
    else:
        new_value = new[position[0][0]:position[0][1]]
        db_tag_new.update({key:new_value})
final_db_tag.update({"old":db_tag_old,"new":db_tag_new})
print(final_db_tag)
Output Obtained
{'old': {'art': 'art', 'organizer': 'james', 'month': 'December', 'season': 'summer'}, 'new': {'art': 'art', 'organizer': 'Mathew.', 'month': 'Mathew.', 'season': 'Mathew.'}}
Here, under the dictionary key "new", month and season have the wrong values.
Expected Output
{'old': {'art': 'art', 'organizer': 'james', 'month': 'December', 'season': 'summer'}, 'new': {'art': 'art', 'organizer': 'Mathew.', 'month': 'None', 'season': 'autmn'}}
How can this be corrected?
It's not really clear to me what the rule is for replacing old text with new text. The following code produces the wanted result, but I'm not sure whether this approach is as universal as needed:
import re
db_tag_old = {"art":"art", "organizer":"james.", "month":"December", "season":"summer"}
old = 'The art is performed by james. _______ Season is summer _____ time. It is December.'
new = 'The art is performed by ______ Mathew. Season is ______ autmn time. __ __ _________'
db_tag_new = {}
# pre-definition for dict-entries we won't find:
for key, val in db_tag_old.items():
    db_tag_new[key] = "None"
owords = old.split()
nwords = new.split()
for (i, nw) in enumerate(nwords):
    # the "art"-case:
    for key, ow in db_tag_old.items():
        if nw == ow:
            db_tag_new[key] = ow
    # "organizer" / "season" cases:
    if re.match(r'^_+$', nw):
        for key, ow in db_tag_old.items():
            if ow == owords[i] and re.match(r'^_+$', owords[i+1]):
                db_tag_new[key] = nwords[i+1]
print("old: ", db_tag_old)
print("new: ", db_tag_new)
Is there a smart way to shorten very long if-elif-elif-elif... statements?
Let's say I have a function like this:
def very_long_func():
    something = 'Audi'
    car = ['VW', 'Audi', 'BMW']
    drinks = ['Cola', 'Fanta', 'Pepsi']
    countries = ['France', 'Germany', 'Italy']
    if something in car:
        return {'type':'car brand'}
    elif something in drinks:
        return {'type':'lemonade brand'}
    elif something in countries:
        return {'type':'country'}
    else:
        return {'type':'nothing found'}
very_long_func()
>>>> {'type': 'car brand'}
The actual function is much longer than the example. What would be the best way to write this function (in terms of readability, not speed)?
I was reading this, but I have trouble applying it to my problem.
Lists are not hashable, so you can't use them as dictionary keys. Go the other way round instead: create a mapping of type -> list, and initialize your output with the default type. This lets you keep adding new types to the mapping without changing any other code.
def very_long_func():
    something = 'Audi'
    car = ['VW', 'Audi', 'BMW']
    drinks = ['Cola', 'Fanta', 'Pepsi']
    countries = ['France', 'Germany', 'Italy']
    out = {'type': 'nothing found'} # If nothing matches
    mapping = {
        'car brand': car,
        'lemonade brand': drinks,
        'country': countries
    }
    for k,v in mapping.items() :
        if something in v:
            out['type'] = k # update if match found
            break
    return out # returns matched or default value
You can create a dictionary like this and then look items up in map_dict:
from functools import reduce
car = ['VW', 'Audi', 'BMW']
drinks = ['Cola', 'Fanta', 'Pepsi']
countries = ['France', 'Germany', 'Italy']
li = [car, drinks, countries]
types = ['car brand', 'lemonade brand', 'country', 'nothing found']
dl = [dict(zip(l, [types[idx]]*len(l))) for idx, l in enumerate(li)]
map_dict = reduce(lambda a, b: dict(a, **b), dl)
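The snippet above builds map_dict but stops short of the lookup itself. A minimal sketch of how it might be used follows. The dict comprehension is a swapped-in alternative to the reduce() call that should produce the same item -> type mapping, and classify is a hypothetical helper name.

```python
car = ['VW', 'Audi', 'BMW']
drinks = ['Cola', 'Fanta', 'Pepsi']
countries = ['France', 'Germany', 'Italy']

categories = {'car brand': car, 'lemonade brand': drinks, 'country': countries}

# Flatten to item -> type, e.g. {'VW': 'car brand', 'Audi': 'car brand', ...}.
map_dict = {item: type_name
            for type_name, items in categories.items()
            for item in items}

def classify(something):
    # dict.get() supplies the fallback when the item is unknown.
    return {'type': map_dict.get(something, 'nothing found')}

print(classify('Audi'))
print(classify('Tea'))
```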
Try this:
def create_dct(lst, flag):
    return {k:flag for k in lst}
car = ['VW', 'Audi', 'BMW']
drinks = ['Cola', 'Fanta', 'Pepsi']
countries = ['France', 'Germany', 'Italy']
merge_dcts = {}
merge_dcts.update(create_dct(car, 'car brand'))
merge_dcts.update(create_dct(drinks, 'lemonade brand'))
merge_dcts.update(create_dct(countries, 'country'))
something = 'Audi'
try:
    print("type: ", merge_dcts[something])
except KeyError:  # only a missing key means "nothing found"
    print("type: nothing found")
You can simulate a switch statement with a helper function like this:
def switch(v): yield lambda *c: v in c
Then your code could be written like this:
something = 'Audi'
for case in switch(something):
    if case('VW', 'Audi', 'BMW'): name = 'car brand' ; break
    if case('Cola', 'Fanta', 'Pepsi'): name = 'lemonade brand' ; break
    if case('France', 'Germany', 'Italy'): name = 'country' ; break
else: name = 'nothing found'
return {'type':name}
If you don't have specific code to do for each value, then a simple mapping dictionary would probably suffice. For ease of maintenance, you can start with a category-list:type-name mapping and expand it before use:
mapping = { ('VW', 'Audi', 'BMW'):'car brand',
            ('Cola', 'Fanta', 'Pepsi'):'lemonade brand',
            ('France', 'Germany', 'Italy'):'country' }
mapping = { categ:name for categs,name in mapping.items() for categ in categs }
Then your code will look like this:
something = 'Audi'
return {'type':mapping.get(something,'nothing found')}
Using a defaultdict would make this even simpler by providing the 'nothing found' value automatically, so you could write: return {'type':mapping[something]}
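A minimal sketch of the defaultdict variant, under the assumption that the category tuples are the ones shown above; categories and classify are hypothetical names introduced for illustration:

```python
from collections import defaultdict

categories = {
    ('VW', 'Audi', 'BMW'): 'car brand',
    ('Cola', 'Fanta', 'Pepsi'): 'lemonade brand',
    ('France', 'Germany', 'Italy'): 'country',
}

# The factory is called for every missing key and supplies the fallback value.
mapping = defaultdict(lambda: 'nothing found')
for members, type_name in categories.items():
    for member in members:
        mapping[member] = type_name

def classify(something):
    return {'type': mapping[something]}

print(classify('Audi'))
print(classify('Tea'))
```

One design point to keep in mind: indexing a defaultdict with an unknown key stores that key with the default value, so the mapping grows with every miss. Using mapping.get(something, 'nothing found') on a plain dict avoids that.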
I am looking to create a function that turns a dictionary of address values into a string with a specific order. I also need to account for missing values (some addresses won't have a second or third address line). I want my output to look like the layout below, so that I can copy the text block, with one entry per line, into a database field.
name
contact
addr1
addr2 (if not empty)
addr3 (if not empty)
city, state zip
phone
I have the following to create the dictionary, but I am stuck on creating the string object that ignores the empty values and puts everything in the correct order.
def setShippingAddr(name, contact, addr1, addr2, addr3, city, state, zipCode, phone):
    addDict = {'name': name, 'contact': contact, 'addr1': addr1,
               'city': city, 'state': state, 'zip': zipCode, 'phone': phone}
    if addr2 is True: # append dict if addr2/addr 3 are True
        addDict['addr2'] = addr2
    if addr3 is True:
        addDict['addr3'] = addr3
    shAddr = # This is where i need to create the string object
    return shAddr
I would rewrite the function to return only the string; the dictionary is not necessary:
def setShippingAddr(name, contact, addr1, city, state, zipCode, phone, addr2=None, addr3=None):
    shAddr = f'{name}\n{contact}\n{addr1}'
    shAddr = f'{shAddr}\n{addr2}' if addr2 else shAddr
    shAddr = f'{shAddr}\n{addr3}' if addr3 else shAddr
    shAddr = f'{shAddr}\n{city}, {state} {zipCode}\n{phone}'
    return shAddr
Considering that you may want to add new entries to the dictionary:
def setShippingAddr(name, contact, addr1, addr2, addr3, city, state, zipCode, phone):
    addDict = {'name': name, 'contact': contact, 'addr1': addr1}
    if addr2: # add addr2/addr3 only when they are non-empty ("is True" never matches a string)
        addDict['addr2'] = addr2
    if addr3:
        addDict['addr3'] = addr3
    # added last so the optional lines stay right after addr1
    addDict.update({'city': city, 'state': state, 'zip': zipCode, 'phone': phone})
    shAddr = ''
    for key in addDict: # dicts keep insertion order in Python 3.7+
        shAddr += str(addDict[key]) + '\n'
    return shAddr
It looks like an f-string would work here (assuming you're using Python 3):
shAddr = f"{addDict['name']} {addDict['contact']} etc..."
You can add logic within the {}, so something like
{addDict['addr2'] if addDict.get('addr2') else ''}
should work, depending on the specific output you are looking for.
I'm not sure I understand the part with the dictionary. You could just leave it out, right?
Then
def setShippingAddr(*args):
    return "\n".join([str(arg) for arg in args if arg])
s = setShippingAddr("Delenges", "Me", "Streetstreet", "Borrough", False,
                    "Town of City", "Landcountry", 12353, "+1 555 4545454")
print(s)
prints
Delenges
Me
Streetstreet
Borrough
Town of City
Landcountry
12353
+1 555 4545454
Here's a more pythonic solution:
def dict_to_string(dic):
    s = ''
    for k, v in dic.items():
        s += "{} : {}\n".format(k, v)
    return s
addDict = {'name': 'name', 'contact': 'contact', 'addr1': 'addr1', 'addr2': '',
           'city': 'city', 'state': 'state', 'zip': 'zipCode', 'phone': 'phone'}
print(dict_to_string(addDict))
In this case, I've included addr2 with a blank value. If you want addr2 to be omitted completely, check the value while iterating:
def dict_to_string(dic):
    s = ''
    for k, v in dic.items():
        if v: # skip entries whose value is empty
            s += "{} : {}\n".format(k, v)
    return s
addDict = {'name': 'name', 'contact': 'contact', 'addr1': 'addr1', 'addr2': '',
           'city': 'city', 'state': 'state', 'zip': 'zipCode', 'phone': 'phone'}
print(dict_to_string(addDict))
Finally, if the natural iteration order is not what you want, you can use an OrderedDict.
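Since the desired layout is fixed, another option is to list the keys explicitly rather than rely on the dictionary's iteration order. A minimal sketch, assuming the key names from the question; format_address is a hypothetical helper:

```python
def format_address(addr):
    # Single-field lines, in the order the question asks for.
    lines = [addr.get(key) for key in ('name', 'contact', 'addr1', 'addr2', 'addr3')]
    # City, state and zip share one line.
    lines.append('{}, {} {}'.format(addr['city'], addr['state'], addr['zip']))
    lines.append(addr.get('phone'))
    # Drop missing or empty entries (None or '') before joining.
    return '\n'.join(str(line) for line in lines if line)

addDict = {'name': 'name', 'contact': 'contact', 'addr1': 'addr1', 'addr2': '',
           'city': 'city', 'state': 'state', 'zip': 'zipCode', 'phone': 'phone'}
print(format_address(addDict))
```

Here addr2 is empty and addr3 is absent, so both lines are skipped and the city line follows addr1 directly.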
I am running a search on a list of ads (adscrape). Each ad is a dict within adscrape (e.g. ad below). It searches through a list of IDs (database_ids) which could be between 200,000 - 1,000,000 items long. I want to find any ads in adscrape that don't have an ID already in database_ids.
My current code is below. It takes a very long time: scanning through database_ids takes multiple seconds for each ad. Is there a more efficient/faster way of running this (finding which items in one big list are in another big list)?
database_ids = ['id1','id2','id3'...]
ad = {'body': u'\xa0SUV', 'loc': u'SA', 'last scan': '06/02/16', 'eng': u'\xa06cyl 2.7L ', 'make': u'Hyundai', 'year': u'2006', 'id': u'OAG-AD-12371713', 'first scan': '06/02/16', 'odo': u'168911', 'active': 'Y', 'adtype': u'Dealer: Used Car', 'model': u'Tucson Auto 4x4 ', 'trans': u'\xa0Automatic', 'price': u'9990'}
for ad in adscrape:
    ad['last scan'] = date
    ad['active'] = 'Y'
    adscrape_ids.append(ad['id'])
    if ad['id'] not in database_ids:
        ad['first scan'] = date
        print 'new ad:',ad
        newads.append(ad)
You can use a list comprehension for this, as shown below. It uses the existing database_ids list and the adscrape list of dicts from the question.
Code:
new_adds_ids = [ad for ad in adscrape if ad['id'] not in database_ids]
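Note that the `not in database_ids` test above still scans the whole list for every ad. Converting the IDs to a set first turns each membership test into a hash lookup, which takes constant time on average. A minimal sketch, with find_new_ads as a hypothetical helper and made-up sample data:

```python
def find_new_ads(adscrape, database_ids):
    # Build the set once; each "in" check below is then a hash lookup
    # instead of a scan over the whole list.
    known_ids = set(database_ids)
    return [ad for ad in adscrape if ad['id'] not in known_ids]

database_ids = ['id1', 'id2', 'id3']
adscrape = [{'id': 'id2', 'make': 'Hyundai'}, {'id': 'id9', 'make': 'Audi'}]
new_ads = find_new_ads(adscrape, database_ids)
print(new_ads)
```

The set costs one pass over database_ids up front, which is usually small next to the per-ad list scans it replaces.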
You can build ids_map as a dict and check whether an id is in the list by looking up the key in ids_map, as in the snippet below:
database_ids = ['id1','id2','id3']
ad = {'id': u'OAG-AD-12371713', 'body': u'\xa0SUV', 'loc': u'SA', 'last scan': '06/02/16', 'eng': u'\xa06cyl 2.7L ', 'make': u'Hyundai', 'year': u'2006', 'first scan': '06/02/16', 'odo': u'168911', 'active': 'Y', 'adtype': u'Dealer: Used Car', 'model': u'Tucson Auto 4x4 ', 'trans': u'\xa0Automatic', 'price': u'9990'}
#build ids map
ids_map = dict((k, v) for v, k in enumerate(database_ids))
for ad in adscrape:
    # some logic before checking whether id in database_ids
    try:
        ids_map[ad['id']]
    except KeyError:
        pass
    else:
        # no error thrown: perform logic for existing ids
        print('id %s in list' % ad['id'])