I've been struggling with this assignment for a few days and can't figure out how to write proper pythonic code to replace the values in the lists when there are pipes in the list strings.
We have 2 variables: fr and d. fr is a list of strings and d is a dictionary with email addresses as keys and numbers as values (numbers in string format).
Write code to replace the email address in each of the strings in the fr list with the associated value of that email looked up from the dictionary d.
If the dictionary does not contain the email found in the list, add a new entry in the dictionary for the email found in the fr list. The value for this new email key will be the next highest value number in the dictionary in string format.
Once the dictionary is populated with this new email key and a new number value, replace that email's occurrence in the fr list with the number value.
Don't manually change fr and d.
Sample input:
fr = [
'7#comp1.COM|4|11|GDSPV',
'7#comp1.COM|16|82|GDSPV',
'13#comp1.COM|12|82|GDSPV',
'26#comp1.COM|19|82|GDSPV'
]
d = {
'7#comp1.COM': '199',
'8#comp4.COM': '200',
'13#comp1.COM': '205'
}
The assignment gives what the output should look like, but I'm struggling to get there because of the pipes:
Value of fr:
['199|4|11|GDSPV', '199|16|82|GDSPV', '205|12|82|GDSPV', '206|19|82|GDSPV']
Value of d:
{'7#comp1.COM': '199', '8#comp4.COM': '200', '13#comp1.COM': '205', '26#comp1.COM': '206'}
This is what the assignment gives you to start off:
line_list = []
for line in fr:
And this is what I have so far:
line_list = []
for line in fr:
    pipes = line.split('|')
    if pipes[0] == '7#comp1.COM':
        pipes[0] = d['7#comp1.COM']
    elif pipes[0] == '13#comp1.COM':
        pipes[0] = d['13#comp1.COM']
    elif pipes[0] == '26#comp1.COM':
        pipes[0] = d['26#comp1.COM']
    print(pipes)
    if len(d) < 4:
        d['26#comp1.COM'] = '206'
print("Value of fr: ")
print(fr)
print("Value of d:")
print(d)
Which outputs:
['199', '4', '11', 'GDSPV']
['199', '16', '82', 'GDSPV']
['205', '12', '82', 'GDSPV']
['206', '19', '82', 'GDSPV']
Value of fr:
['7#comp1.COM|4|11|GDSPV', '7#comp1.COM|16|82|GDSPV', '13#comp1.COM|12|82|GDSPV', '26#comp1.COM|19|82|GDSPV']
Value of d:
{'7#comp1.COM': '199', '8#comp4.COM': '200', '13#comp1.COM': '205', '26#comp1.COM': '206'}
Here's a complete solution:
fr = [
'7#comp1.COM|4|11|GDSPV',
'7#comp1.COM|16|82|GDSPV',
'13#comp1.COM|12|82|GDSPV',
'26#comp1.COM|19|82|GDSPV'
]
d = {
'7#comp1.COM': '199',
'8#comp4.COM': '200',
'13#comp1.COM': '205'
}
# Figure out the highest value in the `d` dictionary and set `next_id` to be one greater than that
next_id = -1
for id in d.values():
    if int(id) > next_id:
        next_id = int(id)
next_id += 1
# Create the start of the list we're going to build up
r = []
# For each input in `fr`...
for line in fr:
    # Split the input into elements
    elements = line.split('|')
    # Extract the email address
    email = elements[0]
    # Is this address in `d`?
    if email not in d:
        # No, so add it with the next id as its value
        d[email] = str(next_id)
        next_id += 1
    # Replace the email element with the value for that email from `d`
    elements[0] = d[email]
    # Concatenate the elements back together and put the resulting string in our results list `r`
    r.append('|'.join(elements))
# Print our three structures
print(f"Value of fr: {fr}")
print(f"Value of d: {d}")
print(f"Value of r: {r}")
Result:
Value of fr: ['7#comp1.COM|4|11|GDSPV', '7#comp1.COM|16|82|GDSPV', '13#comp1.COM|12|82|GDSPV', '26#comp1.COM|19|82|GDSPV']
Value of d: {'7#comp1.COM': '199', '8#comp4.COM': '200', '13#comp1.COM': '205', '26#comp1.COM': '206'}
Value of r: ['199|4|11|GDSPV', '199|16|82|GDSPV', '205|12|82|GDSPV', '206|19|82|GDSPV']
Notice that we don't have to know what any of the email addresses are.
We just process whatever we find. I wasn't sure what "the next highest value number in the dictionary " meant, so maybe what I did to come up with the next value to use in the dictionary needs to be changed if my interpretation of that is incorrect.
I think you're forgetting to join the elements of the list. "|".join(pipes) will give you the final string. From there, all you have to do is append it to line_list and print that out after the loop. That isn't the way I'd do it, though. I would abstract it into a function. In particular:
def substitute(string):
    email, *rest = string.split('|')
    if email not in d:
        # Unknown email: give it the next number after the current maximum
        d[email] = str(max(map(int, d.values())) + 1)
    number = d[email]
    return '|'.join([number] + rest)
line_list = []
for line in fr:
    line_list.append(substitute(line))
fr = line_list
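The smaller fix mentioned at the start of this answer (joining the split pieces back together inside the original loop) might look like the sketch below. It uses a shortened copy of the sample data and assumes every email is already a key in d, so adding missing emails is deliberately left out.

```python
# Shortened sample data; every email here is assumed to be present in d.
fr = [
    '7#comp1.COM|4|11|GDSPV',
    '13#comp1.COM|12|82|GDSPV',
]
d = {'7#comp1.COM': '199', '13#comp1.COM': '205'}

line_list = []
for line in fr:
    pipes = line.split('|')            # ['7#comp1.COM', '4', '11', 'GDSPV']
    pipes[0] = d[pipes[0]]             # swap the email for its number
    line_list.append('|'.join(pipes))  # glue the pieces back together

print(line_list)  # ['199|4|11|GDSPV', '205|12|82|GDSPV']
```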
I think str.partition is a bit cuter than str.split in this case. There's no need to split on all pipes. Doing it this way also avoids having to explicitly join with pipes again afterwards, though you still have to join. replaced_fr in this case will be the list containing the replaced, desired output.
replaced_fr = []
for line in fr:
    email, *partitions = line.partition("|")
    value = d.get(email, None)
    if value is None:
        value = str(max(map(int, d.values())) + 1)
        d[email] = value
    replaced_line = "".join([value] + partitions)
    replaced_fr.append(replaced_line)
In Python, we can use str.format to construct a string like this:
string_format + value_of_keys = formatted_string
Eg:
FMT = '{name:} {age:} {gender}' # string_format
VoK = {'name':'Alice', 'age':10, 'gender':'F'} # value_of_keys
FoS = FMT.format(**VoK) # formatted_string
In this case, formatted_string = 'Alice 10 F'
I'm just wondering whether there is a way to get the value_of_keys from formatted_string and string_format. It should be a function Fun with
VoK = Fun('{name:} {age:} {gender}', 'Alice 10 F')
# the value of Vok is expected as {'name':'Alice', 'age':10, 'gender':'F'}
Is there any way to get this function Fun?
ADDED :
I would like to say that '{name:} {age:} {gender}' and 'Alice 10 F' are just the simplest example. A realistic situation could be more difficult: the space delimiter may not exist.
And mathematically speaking, most of the cases are not reversible, such as:
FMT = '{key1:}{key2:}'
FoS = 'HelloWorld'
The VoK could be any of the following:
{'key1':'Hello','key2':'World'}
{'key1':'Hell','key2':'oWorld'}
....
So to make this question well defined, I would like to add two conditions:
1. There is always a delimiter between two keys
2. No delimiter appears inside any of the values in value_of_keys.
In this case, this question is solvable (Mathematically speaking) :)
Another example shown with input and expected output:
In '{k1:}+{k2:}={k:3}', '1+1=2' Out {'k1':1,'k2':2, 'k3':3}
In 'Hi, {k1:}, this is {k2:}', 'Hi, Alice, this is Bob' Out {'k1':'Alice', 'k2':'Bob'}
You can indeed do this, but with a slightly different format string, called regular expressions.
Here is how you do it:
import re
# this is how you write your "format"
regex = r"(?P<name>\w+) (?P<age>\d+) (?P<gender>[MF])"
test_str = "Alice 10 F"
groups = re.match(regex, test_str)
Now you can use groups to access all the components of the string:
>>> groups.group('name')
'Alice'
>>> groups.group('age')
'10'
>>> groups.group('gender')
'F'
Regex is a very cool thing. I suggest you learn more about it online.
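If you would rather keep the original format string, the named-group pattern can probably be generated from it. The sketch below is only an illustration: the helper name unformat is made up here, it assumes every field is named with a valid identifier, that literal text separates the fields, and that no value contains that literal text. All values come back as strings.

```python
import re
from string import Formatter

def unformat(fmt, text):
    """Hypothetical helper: recover the field values of `fmt` from `text`."""
    pattern = ''
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
    for literal, field, _spec, _conv in Formatter().parse(fmt):
        pattern += re.escape(literal)              # delimiters match literally
        if field is not None:
            pattern += '(?P<{}>.+?)'.format(field)  # one named group per field
    match = re.fullmatch(pattern, text)
    return match.groupdict() if match else None

print(unformat('{name:} {age:} {gender}', 'Alice 10 F'))
# {'name': 'Alice', 'age': '10', 'gender': 'F'}
print(unformat('Hi, {k1:}, this is {k2:}', 'Hi, Alice, this is Bob'))
# {'k1': 'Alice', 'k2': 'Bob'}
```

The format specs are ignored, so converting '10' back to an int is left to the caller.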
I wrote a function and it seems to work:
import re
def Fun(fmt,res):
    reg_keys = '{([^{}:]+)[^{}]*}'
    reg_fmts = '{[^{}:]+[^{}]*}'
    pat_keys = re.compile(reg_keys)
    pat_fmts = re.compile(reg_fmts)
    keys = pat_keys.findall(fmt)
    lmts = pat_fmts.split(fmt)
    temp = res
    values = []
    for lmt in lmts:
        if not len(lmt)==0:
            value,temp = temp.split(lmt,1)
            if len(value)>0:
                values.append(value)
    if len(temp)>0:
        values.append(temp)
    return dict(zip(keys,values))
Usage:
eg1:
fmt = '{k1:}+{k2:}={k:3}'
res = '1+1=2'
print Fun(fmt,res)
>>>{'k2': '1', 'k1': '1', 'k': '2'}
eg2:
fmt = '{name:} {age:} {gender}'
res = 'Alice 10 F'
print Fun(fmt,res)
>>>
eg3:
fmt = 'Hi, {k1:}, this is {k2:}'
res = 'Hi, Alice, this is Bob'
print Fun(fmt,res)
>>>{'k2': 'Bob', 'k1': 'Alice'}
There is no way for Python to determine how a formatted string was created once all you have is the resulting string.
For example, once you format "{something} {otherthing}" with values that contain spaces, you cannot tell whether a given word belonged to {something} or to {otherthing}.
However, you can use some hacks if you know the format of the resulting string and the result is consistent.
For example, in your given example, if you are sure that you'll have a word followed by a space, then a number, then another space and another word, you can use the regex below to extract the values:
>>> import re
>>> my_str = 'Alice 10 F'
>>> re.findall(r'(\w+)\s(\d+)\s(\w+)', my_str)
[('Alice', '10', 'F')]
In order to get the desired dict from this, you may update the logic as:
>>> my_keys = ['name', 'age', 'gender']
>>> dict(zip(my_keys, re.findall(r'(\w+)\s(\d+)\s(\w+)', my_str)[0]))
{'gender': 'F', 'age': '10', 'name': 'Alice'}
I suggest another approach to this problem using **kwargs, such as...
def fun(**kwargs):
    result = '{'
    for key, value in kwargs.iteritems():
        result += '{}:{} '.format(key, value)
    # stripping the last space
    result = result[:-1]
    result += '}'
    return result
print fun(name='Alice', age='10', gender='F')
# outputs : {gender:F age:10 name:Alice}
NOTE : kwargs only preserves the order of the keyword arguments from Python 3.6 onward; in earlier versions (including the Python 2 used here) it is a plain dict with no guaranteed order. If order is something you wish to keep in those versions, it is easy to build a workaround.
This code produces strings for all the values, but it does split the string into its constituent components. It depends on the delimiter being a space, and none of the values containing a space. If any of the values contains a space this becomes a much harder problem.
>>> delimiters = ' '
>>> d = {k: v for k,v in zip(('name', 'age', 'gender'), 'Alice 10 F'.split(delimiters))}
>>> d
{'name': 'Alice', 'age': '10', 'gender': 'F'}
For your requirement, I have a solution.
The idea is:
1. change all delimiters to the same delimiter
2. split the input strings by that delimiter
3. get the keys
4. get the values
5. zip the keys and values into a dict
import re
from collections import OrderedDict
def Func(data, delimiters, delimiter):
    # change all delimiters to delimiter
    for d in delimiters:
        data[0] = data[0].replace(d, delimiter)
        data[1] = data[1].replace(d, delimiter)
    # get keys with '{}'
    keys = data[0].split(delimiter)
    # if string starts with delimiter remove first empty element
    if keys[0] == '':
        keys = keys[1:]
    # get keys without '{}'
    p = re.compile(r'{([\w\d_]+):*.*}')
    keys = [p.match(x).group(1) for x in keys]
    # get values
    vals = data[1].split(delimiter)
    # if string starts with delimiter remove first empty element
    if vals[0] == '':
        vals = vals[1:]
    # pack to a dict
    result_1 = dict(zip(keys, vals))
    # if you need Ordered Dict
    result_2 = OrderedDict(zip(keys, vals))
    return result_1, result_2
The usage:
In_1 = ['{k1}+{k2:}={k3:}', '1+2=3']
delimiters_1 = ['+', '=']
result = Func(In_1, delimiters_1, delimiters_1[0])
# Out_1 = {'k1':1,'k2':2, 'k3':3}
print(result)
In_2 = ['Hi, {k1:}, this is {k2:}', 'Hi, Alice, this is Bob']
delimiters_2 = ['Hi, ', ', this is ']
result = Func(In_2, delimiters_2, delimiters_2[0])
# Out_2 = {'k1':'Alice', 'k2':'Bob'}
print(result)
The output:
({'k3': '3', 'k2': '2', 'k1': '1'},
OrderedDict([('k1', '1'), ('k2', '2'), ('k3', '3')]))
({'k2': 'Bob', 'k1': 'Alice'},
OrderedDict([('k1', 'Alice'), ('k2', 'Bob')]))
Try this:
import re
def fun():
    k = 'Alice 10 F'
    c = '{name:} {age:} {gender}'
    l = re.sub('[:}{]', '', c)
    d={}
    for i,j in zip(k.split(), l.split()):
        d[j]=i
    print(d)
You can turn the two hard-coded strings into parameters of fun if you wish. It takes the same strings you want to pass in and prints a dict like this:
{'name': 'Alice', 'age': '10', 'gender': 'F'}
I think the only right answer is that what you are searching for isn't really possible in general. You just don't have enough information. A good example is:
#python 3
a="12"
b="34"
c="56"
string=f"{a}{b}{c}"
dic = fun("{a}{b}{c}",string)
Now dic might be {"a":"12","b":"34","c":"56"} but it might just as well be {"a":"1","b":"2","c":"3456"}. So any universal reverse-format function would ultimately fail because of this ambiguity. You could obviously force a delimiter between each variable, but that would defeat the purpose of the function.
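To make the ambiguity concrete, here is a small sketch that lists every way the string "123456" could be cut into three non-empty values. The helper name all_splits is made up for this illustration.

```python
from itertools import combinations

def all_splits(text, parts):
    """Yield every way to cut `text` into `parts` non-empty pieces."""
    # Choose the positions of the cuts between characters.
    for cuts in combinations(range(1, len(text)), parts - 1):
        bounds = (0,) + cuts + (len(text),)
        yield [text[i:j] for i, j in zip(bounds, bounds[1:])]

candidates = list(all_splits("123456", 3))
print(len(candidates))                   # 10
print(candidates[0])                     # ['1', '2', '3456']
print(['12', '34', '56'] in candidates)  # True
```

Every one of those ten candidates is a valid answer for "{a}{b}{c}", which is why no function can pick the right one without extra information.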
I know this was already stated in the comments, but it should also be added as an answer for future visitors.
I have this:
query='id=10&q=7&fly=none'
and I want to split it to create a dictionary like this:
d = { 'id':'10', 'q':'7', 'fly':'none'}
How can I do it with little code?
By splitting twice, once on '&' and then on '=' for every element resulting from the first split:
query='id=10&q=7&fly=none'
d = dict(i.split('=') for i in query.split('&'))
Now, d looks like:
{'fly': 'none', 'id': '10', 'q': '7'}
In your case, a more convenient way would be to use the urllib.parse module:
import urllib.parse as urlparse
query = 'id=10&q=7&fly=none'
d = {k:v[0] for k,v in urlparse.parse_qs(query).items()}
print(d)
The output:
{'id': '10', 'q': '7', 'fly': 'none'}
Note that urlparse.parse_qs() is more useful if the same key appears several times in a query string, because it collects all the values for each key in a list. Here is an example:
query = 'id=10&q=7&fly=none&q=some_identifier&fly=flying_away'
d = urlparse.parse_qs(query)
print(d)
The output:
{'q': ['7', 'some_identifier'], 'id': ['10'], 'fly': ['none', 'flying_away']}
https://docs.python.org/3/library/urllib.parse.html#urllib.parse.parse_qs
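If each key is expected to appear only once, urllib.parse.parse_qsl may be a slightly shorter route, since it returns (key, value) pairs that dict() accepts directly. A minimal sketch, assuming Python 3; the second query string is invented just to show percent-decoding:

```python
from urllib.parse import parse_qsl

query = 'id=10&q=7&fly=none'
d = dict(parse_qsl(query))
print(d)  # {'id': '10', 'q': '7', 'fly': 'none'}

# Percent-encoded characters are decoded, including an encoded '=' in a value.
decoded = dict(parse_qsl('name=John%20Doe&q=a%3Db'))
print(decoded)  # {'name': 'John Doe', 'q': 'a=b'}
```

With repeated keys, dict() keeps only the last value, so parse_qs remains the better fit for that case.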
This is what I came up with:
dict_query = {}
query='id=10&q=7&fly=none'
query_list = query.split("&")
for i in query_list:
query_item = i.split("=")
dict_query.update({query_item[0]: query_item[1]})
print(dict_query)
dict_query returns what you want. This code works by splitting the query up into the different parts, and then for each of the new parts, it splits it by the =. It then updates the dict_query with each new value. Hope this helps!
I have a long list of dictionaries that for the most part do not overlap. However, some of the dictionaries have the same 'Name' field and I'd only like unique names in the list of dictionaries. I'd like the first occurrence of the name to be the one that stays and any thereafter be deleted from the list.
I've put a short list below to illustrate the scenario:
myList = [
{'Name':'John', 'Age':'50', 'Height':'70'},
{'Name':'Kathy', 'Age':'43', 'Height':'65'},
{'Name':'John','Age':'46','Height':'68'},
{'Name':'John','Age':'50','Height':'72'}
]
I'd like this list to return the first 'John' and Kathy, but not the second or third Johns and their related information.
An acceptable, but not optimal solution would also be not having dictionaries with the same name next to each other.
You could run over the list and keep a set of unique names. Every time you encounter a new name (i.e., a name that isn't in the set), you add it to the set and the respective dict to the result:
def uniqueNames(dicts):
    names = set()
    result = []
    for d in dicts:
        if not d['Name'] in names:
            names.add(d['Name'])
            result.append(d)
    return result
You can easily write a for-loop for this.
def getName(name):
    '''Gets first occurrence of name in list of dicts.'''
    for i in myList:
        if i['Name'] == name:
            return i
Initial list:
my_list = [
{'Name':'John', 'Age':'50', 'Height':'70'},
{'Name':'Kathy', 'Age':'43', 'Height':'65'},
{'Name':'John','Age':'46','Height':'68'},
{'Name':'John','Age':'50','Height':'72'}
]
The logical (potentially newbie-friendlier) way:
names = set()
new_list = []
for d in my_list:
    name = d['Name']
    if name not in names:
        new_list.append(d)
        names.add(d['Name'])
print new_list # [{'Age': '50', 'Name': 'John', 'Height': '70'}, {'Age': '43', 'Name': 'Kathy', 'Height': '65'}]
A one-liner way:
new_list = {d['Name']: d for d in reversed(my_list)}.values()
print new_list # [{'Age': '43', 'Name': 'Kathy', 'Height': '65'}, {'Age': '50', 'Name': 'John', 'Height': '70'}]
Note: The one-liner will contain the first occurrence of each name, but it will return an arbitrarily ordered list.
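If the original order matters, a short alternative is to let a dict remember the first entry seen for each name. This is only a sketch and it assumes Python 3.7 or later, where dicts are guaranteed to keep insertion order:

```python
my_list = [
    {'Name': 'John', 'Age': '50', 'Height': '70'},
    {'Name': 'Kathy', 'Age': '43', 'Height': '65'},
    {'Name': 'John', 'Age': '46', 'Height': '68'},
    {'Name': 'John', 'Age': '50', 'Height': '72'},
]

unique = {}
for d in my_list:
    # setdefault stores d only if this name has not been seen before
    unique.setdefault(d['Name'], d)

new_list = list(unique.values())
print(new_list)
# [{'Name': 'John', 'Age': '50', 'Height': '70'}, {'Name': 'Kathy', 'Age': '43', 'Height': '65'}]
```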
I have a dictionary of key-value pairs. My values contain strings. How can I check whether a specific string exists in the dictionary's values and return the key whose value contains it?
Let's say I want to search if the string 'Mary' exists in the dictionary value and get the key that contains it. This is what I tried but obviously it doesn't work that way.
#Just an example how the dictionary may look like
myDict = {'age': ['12'], 'address': ['34 Main Street, 212 First Avenue'],
'firstName': ['Alan', 'Mary-Ann'], 'lastName': ['Stone', 'Lee']}
#Checking if string 'Mary' exists in dictionary value
print 'Mary' in myDict.values()
Is there a better way to do this since I may want to look for a substring of the value stored ('Mary' is a substring of the value 'Mary-Ann').
You can do it like this:
#Just an example how the dictionary may look like
myDict = {'age': ['12'], 'address': ['34 Main Street, 212 First Avenue'],
'firstName': ['Alan', 'Mary-Ann'], 'lastName': ['Stone', 'Lee']}
def search(values, searchFor):
    for k in values:
        for v in values[k]:
            if searchFor in v:
                return k
    return None
#Checking if string 'Mary' exists in dictionary value
print search(myDict, 'Mary') #prints firstName
I am a bit late, but another way is to use list comprehension and the any function, that takes an iterable and returns True whenever one element is True :
# Checking if string 'Mary' exists in the lists of the dictionary values
print any(any('Mary' in s for s in subList) for subList in myDict.values())
If you wanna count the number of element that have "Mary" in them, you can use sum():
# Number of sublists containing 'Mary'
print sum(any('Mary' in s for s in subList) for subList in myDict.values())
# Number of strings containing 'Mary'
print sum(sum('Mary' in s for s in subList) for subList in myDict.values())
From these methods, we can easily make functions to check which are the keys or values matching.
To get the keys containing 'Mary':
def matchingKeys(dictionary, searchString):
    return [key for key,val in dictionary.items() if any(searchString in s for s in val)]
To get the sublists:
def matchingValues(dictionary, searchString):
    return [val for val in dictionary.values() if any(searchString in s for s in val)]
To get all the strings of the matching sublists:
def matchingValues(dictionary, searchString):
    return [s for val in dictionary.values() if any(searchString in t for t in val) for s in val]
To get both:
def matchingElements(dictionary, searchString):
    return {key:val for key,val in dictionary.items() if any(searchString in s for s in val)}
And if you want to get only the strings containing "Mary", you can do a double list comprehension :
def matchingStrings(dictionary, searchString):
    return [s for val in dictionary.values() for s in val if searchString in s]
Klaus solution has less overhead, on the other hand this one may be more readable
myDict = {'age': ['12'], 'address': ['34 Main Street, 212 First Avenue'],
'firstName': ['Alan', 'Mary-Ann'], 'lastName': ['Stone', 'Lee']}
def search(myDict, lookup):
    for key, value in myDict.items():
        for v in value:
            if lookup in v:
                return key
search(myDict, 'Mary')
import re
for i in range(len(myDict.values())):
    for j in range(len(myDict.values()[i])):
        match=re.search(r'Mary', myDict.values()[i][j])
        if match:
            print match.group() #Mary
            print myDict.keys()[i] #firstName
            print myDict.values()[i][j] #Mary-Ann
>>> myDict
{'lastName': ['Stone', 'Lee'], 'age': ['12'], 'firstName': ['Alan', 'Mary-Ann'],
'address': ['34 Main Street, 212 First Avenue']}
>>> Set = set()
>>> not ['' for Key, Values in myDict.items() for Value in Values if 'Mary' in Value and Set.add(Key)] and list(Set)
['firstName']
For me, this also worked:
def search(myDict, search1):
    search.a=[]
    for key, value in myDict.items():
        # value is a list of strings, so test each string for the substring
        if any(search1 in v for v in value):
            search.a.append(key)
search(myDict, 'anyName')
print(search.a)
Storing the list as the function attribute search.a makes it available globally.
If the substring is found in any of the strings of a value, the key of that value is appended to search.a.
Following is one liner for accepted answer ... (for one line lovers ..)
def search_dict(my_dict,searchFor):
    s_val = [[ k if searchFor in v else None for v in my_dict[k]] for k in my_dict]
    return s_val
To provide a more general solution for others using this post to do similar or more complex python dictionary searches: you can use dictpy
import dictpy
myDict = {'age': ['12'], 'address': ['34 Main Street, 212 First Avenue'],
'firstName': ['Alan', 'Mary-Ann'], 'lastName': ['Stone', 'Lee']}
search = dictpy.DictSearch(data=myDict, target='Mary-Ann')
print(search.result) # prints -> [firstName.1, 'Mary-Ann']
The first entry in the list is the target location: dictionary key "firstName" and position 1 in the list. The second entry is the search return object.
The benefit of dictpy is it can find multiple 'Mary-Ann' and not just the first one. It tells you the location in which it found it, and you can search more complex dictionaries (more levels of nesting) and change what the return object is.
def search(myDict, lookup):
    a=[]
    for key, value in myDict.items():
        for v in value:
            if lookup in v:
                a.append(key)
    a=list(set(a))
    return a
If the search can match several keys, collect all the matching keys in a list, as this function does.
import json
'match' in json.dumps(myDict)
This evaluates to True if the string is found.
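A caveat with the json.dumps check, sketched below: it searches the whole serialized text, so key names match too, and with the default ensure_ascii=True any non-ASCII characters are escaped and will not be found. The second dictionary is invented purely to show the escaping.

```python
import json

myDict = {'age': ['12'], 'address': ['34 Main Street, 212 First Avenue'],
          'firstName': ['Alan', 'Mary-Ann'], 'lastName': ['Stone', 'Lee']}

serialized = json.dumps(myDict)
print('Mary' in serialized)       # True: found inside the value 'Mary-Ann'
print('firstName' in serialized)  # True as well, although it is only a key

# Non-ASCII text is escaped by default, so the plain substring test misses it.
other = {'name': ['Zoë']}
print('Zoë' in json.dumps(other))                      # False
print('Zoë' in json.dumps(other, ensure_ascii=False))  # True
```

It also tells you only whether the string occurs somewhere, not which key holds it.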