I have a Python script which takes a name, re-formats it, and then compares it to a list of other names to see how many times it matches. The issue is that the names it is being compared to have middle initials (which I don't want to have entered in the script).
list_of_names = ['Doe JM', 'Cruz CR', 'Smith JR', 'Doe JM', 'Maltese FL', 'Doe J']
Now I have a simple function that reformats the name.
f_name = name_format('John','Doe')
print(f_name)
> 'Doe J'
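The body of name_format is not shown in the question. A minimal sketch of what it might look like, assuming it simply returns the last name followed by the first initial (this implementation is a guess, not the asker's code):

```python
def name_format(first_name, last_name):
    # Hypothetical reconstruction: last name, a space, then the first initial.
    return "{} {}".format(last_name, first_name[0].upper())


f_name = name_format('John', 'Doe')
print(f_name)  # Doe J
```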
Now I want to do comparisons where every time "Doe J" or "Doe JM" appears, the value is true. The function below does not work as intended.
def matches(name, list):
    count = 0
    for i in list:
        if i == name:
            count = count + 1
        else:
            pass
    return(count)
print (matches(f_name, list_of_names))
> 1
My goal is to make the return value equal to 3. To do this, I want to ignore the middle initial, which in this case would be 'M' in 'Doe JM'.
What I want to do is something along the lines of formatting the name to 'Doe J?' where '?' is a wild card. I tried importing fnmatch and re to use some of their tools but was unsuccessful.
Use two nested for loops and yield. The function can yield duplicate values, so wrap the result in a set to remove them:
list_of_names = ['Doe JM', 'Cruz CR', 'Smith JR', 'Doe JM', 'Maltese FL', 'Doe J']
# List of names
def check_names(part_names, full_name_list):
    for full_name in full_name_list:
        for part_name in part_names:
            if part_name in full_name:
                yield full_name
result = set(check_names(['Doe J', 'Cruz'], list_of_names))
# One name
def check_names(name, full_name_list):
    for full_name in full_name_list:
        if name in full_name:
            yield full_name
# Materialize the generator once: it has no len() and is exhausted after one pass
result = list(check_names('Doe J', list_of_names))
print(result) # List of result
print(len(result)) # Count of names
You were on the right track with the re module. I believe the solution to your problem would be:
import re
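def matches(name, name_list):
    # re.escape guards against special characters in the name;
    # \w? allows at most one additional word character after it
    regex = re.escape(name) + r'\w?'
    # fullmatch anchors both ends, so a longer name such as 'Doe JMX' is not counted
    result = [re.fullmatch(regex, test_name) is not None for test_name in name_list]
    return result.count(True)
print(matches(f_name, list_of_names))
# 3
This solution allows at most one word character (a letter, digit or underscore) after the name, so both 'Doe J' and 'Doe JM' match.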
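The question also mentions fnmatch and a 'Doe J?' wildcard. In fnmatch patterns, '?' matches exactly one character, so 'Doe J?' on its own would miss the plain 'Doe J'. A minimal sketch that tests both patterns, where count_matches is a hypothetical helper name and the name is assumed to contain no fnmatch special characters such as '*' or '[':

```python
from fnmatch import fnmatchcase


def count_matches(name, names):
    # '?' matches exactly one character, so the bare name is tested as well.
    patterns = (name, name + '?')
    return sum(
        1 for candidate in names
        if any(fnmatchcase(candidate, pattern) for pattern in patterns)
    )


list_of_names = ['Doe JM', 'Cruz CR', 'Smith JR', 'Doe JM', 'Maltese FL', 'Doe J']
print(count_matches('Doe J', list_of_names))  # 3
```

fnmatchcase is used rather than fnmatch so that the comparison stays case-sensitive on every platform.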
Related
I have an assignment where I have a list with two names and I have to print the first name and then I have to print the last name.
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']
Each entry holds two names. I need to find the whitespace between them, then store the first name and the last name in variables and print them.
I have tried to slice it but I don't understand it enough to use it. Here is an example:
substr = x[0:2]
This just returns both names instead of a substring of one of them.
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']
for i in range(0, len(names_list)):
    nf = names_list[i].split(' ')
    name = nf[0]
    family = nf[1]
    print("Name is: {}, Family is: {}".format(name, family))
Output:
Name is: Oluwaferanmi, Family is: Fakolujo
Name is: Ajibola, Family is: Fakolujo
This was written for Python 3.x.
You can use the split() method and indexing for these kinds of problems.
Iterate through the list
split() the string
Store the values in a variable, so that we can index them
Display the values
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']
for i in names_list:
    string = i.split(" ")
    first_name = string[0]
    last_name = string[1]
    print(f"First Name: {first_name} Last Name: {last_name}")
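Indexing the result of split() raises an IndexError when an entry has no space in it. A minimal sketch of a more forgiving variant using str.partition, which splits at the first space only; split_name is a hypothetical helper name, not part of either answer:

```python
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']


def split_name(full_name):
    # partition splits at the first space and never raises,
    # so a one-word name simply yields an empty last name.
    first, _, last = full_name.partition(' ')
    return first, last


for full_name in names_list:
    first_name, last_name = split_name(full_name)
    print(f"First Name: {first_name} Last Name: {last_name}")
```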
Using this code I was able to cycle through several instances of attributes and extract the first and last name if they matched the criteria. The results are a list of dict. How would I make all of the results that match the criteria come back as full names, each on its own line, as text?
my_snapshot = cfm.child('teamMap').get()
for players in my_snapshot:
    if players['age'] != 27:
        print({players['firstName'], players['lastName']})
Results of Print Statement
{'Chandon', 'Sullivan'}
{'Urban', 'Brent'}
Are you looking for this:
print(players['firstName'], players['lastName'])
This would output:
Chandon Sullivan
Urban Brent
Your original trial just put the items to a set {}, and then printed the set, for no apparent reason.
Edit:
You can also, for example, join firstName and lastName into one string and append each combined name to a list. Then you can do whatever you need with the list:
names = []
my_snapshot = cfm.child('teamMap').get()
for players in my_snapshot:
    if players['age'] != 27:
        names.append(f"{players['firstName']} {players['lastName']}")
If you're using a version of Python lower than 3.6 and can't use f-strings, you can write the last line like this, for example:
names.append("{} {}".format(players['firstName'], players['lastName']))
Or if you prefer:
names.append(players['firstName'] + ' ' + players['lastName'])
OK, I figured it out by joining the first and last name and appending each result to a list for the matching records. I then converted the list to a string to display it on the device.
full_list = []
my_snapshot = cfm.child('teamMap').get()
for players in my_snapshot:
    if players['age'] != 27:
        full_list.append((players['firstName'] + " " + players['lastName']))
send_message('\n'.join(str(i) for i in full_list))
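The same idea can be tried without the database. In this minimal sketch the sample records (including the ages and the third player) are made up to stand in for cfm.child('teamMap').get(), and full_names is a hypothetical helper name:

```python
# Hypothetical stand-in for the list of dicts returned by the snapshot call.
my_snapshot = [
    {'firstName': 'Chandon', 'lastName': 'Sullivan', 'age': 25},
    {'firstName': 'Urban', 'lastName': 'Brent', 'age': 30},
    {'firstName': 'Sam', 'lastName': 'Example', 'age': 27},
]


def full_names(snapshot, excluded_age=27):
    # Build one "First Last" string per record that passes the age filter.
    return [
        f"{player['firstName']} {player['lastName']}"
        for player in snapshot
        if player['age'] != excluded_age
    ]


message = '\n'.join(full_names(my_snapshot))
print(message)
```

The joined string holds one full name per line, ready to pass to whatever function displays the text.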
As the title says, I am trying to get an exact match against any list of strings in a list of lists. I'm finding it hard to explain, so I'll show code now.
List = [['BOB','27','male'],['SUE','32','female'],['TOM','28','unsure']]
This is an example of the list's layout. I then want to pass in information from a web scrape and see whether it matches item[0], item[1] and item[2] of any sub-list. The problem I am having is that the web scrape uses a for loop:
HTML = requests.get(url).content
match = re.compile('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"').findall(HTML)
for name,age,sex in match:
Then my next part also uses a for loop:
for item in List:
    if item[0] == name and item[1] == age and item[2] == sex:
        pass
    else:
        print 'Name = '+name
        print 'Age = '+age
        print 'Sex = '+sex
But obviously, if the result matches one of the sub-lists, it cannot match the other two, so those comparisons fall through to the else branch and print. Is there a way to check whether name, age and sex exactly match item[0], item[1] and item[2] of any sub-list in the list? I have also tried:
if all(item[0] == name and item[1] == age and item[2] == sex for item in List):
    pass
This does not work; I'm assuming it's because it's not a direct match in every sub-list. If I change all to any, I get results that are skipped if any of the strings match, i.e. the age is 27, 32 or 28. I know my regex is poor form and not the ideal way to parse HTML, but it's all I can use confidently at the moment, sorry. Full code below for easier reading.
List = [['BOB','27','male'],['SUE','32','female'],['TOM','28','unsure']]
HTML = requests.get(url).content
match = re.compile('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"').findall(HTML)
for name,age,sex in match:
    for item in List:
        if item[0] == name and item[1] == age and item[2] == sex:
            pass
        else:
            print 'Name = '+name
            print 'Age = '+age
            print 'Sex = '+sex
Any help would be greatly appreciated. I am still a beginner and have not used forums much, so I apologise in advance if this is not grammatically correct or if I have asked in the wrong way.
re.findall returns a list of tuples when the pattern contains more than one group, so you can simplify the comparison if the items in your list match the return type:
import re
# Changed sub-lists to tuples.
items = [('BOB','27','male'),('SUE','32','female'),('TOM','28','unsure')]
html = '''\
Name"BOB" Age"27" Sex"male"
Name"PAT" Age"19" Sex"unsure"
Name"SUE" Age"31" Sex"female"
Name"TOM" Age"28" Sex"unsure"
'''
for item in re.findall('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"', html):
    if item in items:
        name,age,sex = item
        print 'Name =', name
        print 'Age =', age
        print 'Sex =', sex
        print
Output:
Name = BOB
Age = 27
Sex = male
Name = TOM
Age = 28
Sex = unsure
You can also use item not in items if you want the ones that don't match.
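For the original goal of reporting only the scraped records that are not already known, a minimal Python 3 sketch of the not in variant follows. It reuses the sample HTML from this answer; storing the known records in a set (the name known is an illustrative choice) makes each membership test a hash lookup:

```python
import re

# Known records as a set of tuples, matching the tuples findall returns.
known = {('BOB', '27', 'male'), ('SUE', '32', 'female'), ('TOM', '28', 'unsure')}

html = '''\
Name"BOB" Age"27" Sex"male"
Name"PAT" Age"19" Sex"unsure"
Name"SUE" Age"31" Sex"female"
Name"TOM" Age"28" Sex"unsure"
'''

pattern = re.compile(r'Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"')

# Keep only the records with no exact (name, age, sex) match.
unmatched = [record for record in pattern.findall(html) if record not in known]

for name, age, sex in unmatched:
    print('Name =', name)
    print('Age =', age)
    print('Sex =', sex)
    print()
```

Here PAT is new and SUE differs in age, so those two records are the ones printed.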
First, change the name of the list. List is not a reserved keyword, but it's not good practice to use such generic names, and it is easily confused with the built-in list type. My suggestion is to make the data a list. If I understood your question right, it's a matter of finding the entries where everything differs. So:
for sublist in my_list:
    # weblist is assumed to hold the scraped [name, age, sex] values
    if (sublist[0] != weblist[0]) and (sublist[1] != weblist[1]) and (sublist[2] != weblist[2]):
        print("List is different")
I have a file at /location/all-list-info.txt that contains items in the following format:
aaa:xxx:abc.com:1857:xxx1:rel5t2:y
ifa:yyy:xyz.com:1858:yyy1:rel5t2:y
I process these items with the Python code below:
def pITEMName():
    global itemList
    itemList = str(raw_input('Enter pipe separated list of ITEMS : ')).upper().strip()
    items = itemList.split("|")
    count = len(items)
    print 'Total Distint Item Count : ', count
    pipelst = itemList.split('|')
    filepath = '/location/all-item-info.txt '
    f = open(filepath, 'r')
    for lns in f:
        split_pipe = lns.split(':', 1)
        if split_pipe[0] in pipelst:
            index = pipelst.index(split_pipe[0])
            del pipelst[index]
    for lns in pipelst:
        print lns,' is wrong item Name'
    f.close()
    if podList:
After running the code above, it gives a prompt:
Enter pipe separated list of ITEMS:
Then I pass the items:
Enter pipe separated list of ITEMS: aaa|ifa-mc|ggg-mc
After I press Enter, the code continues as follows:
Enter pipe separated list of ITEMS : aaa|ifa-mc|ggg-mc
Total Distint Item Count : 3
IFA-MC is wrong Item Name
GGG-MC is wrong Item Name
ITEMs Belonging to other Centers :
Item Count From Other Center = 0
ITEMs Belonging to Current Centers :
Active Items in US1 :
^IFA$
Test Active Items in US1 :
^AAA$
Ignored Item Count From Current center = 0
You Have Entered ItemList belonging to this center as: ^IFA$|^AAA$
Active Pod Count : 2
My question: if I add the '-mc' suffix to an item in the input, it is reported as a wrong item name even when the item is present in /location/all-item-info.txt, exactly as if it were not present in the file. Please look at the output again:
IFA-MC is wrong Item Name
GGG-MC is wrong Item Name
In the example above, 'ifa' is present in /location/all-items-info.txt, whereas 'ggg' is not.
Please help me change the code above so that an item entered with the -mc suffix is not counted as a wrong item name when it is present in /location/all-items-info.txt. Only items that are not present in that file should be counted as wrong.
Please help.
Thanks,
Ritesh.
If you want to avoid checking for -mc as well, then you can modify this part of your script -
pipelst = itemList.split('|')
To -
pipelst = [i.split('-')[0] for i in itemList.split('|')]
It's a bit unclear exactly what you are asking, but basically to ignore any '-mc' from user input, you can explicitly preprocess the user input to strip it out:
pipelst = itemList.split('|')
pipelst = [item.rsplit('-mc',1)[0] for item in pipelst]
If instead you want to allow for the possibility of -mc-suffixed words in the file as well, simply add the stripped version to the list instead of replacing the original:
pipelst = itemList.split('|')
for item in list(pipelst):  # iterate over a copy, since pipelst grows inside the loop
    if item.endswith('-mc'):
        pipelst.append(item.rsplit('-mc',1)[0])
Another issue: based on the example lines you gave from /location/all-list-info.txt, it looks like all the items are lowercase. However, the script explicitly converts the user input to uppercase. String equality and the in operator are case-sensitive, so for instance
>>> print 'ifa-mc' in ['IFA-MC']
False
You probably want:
itemList = str(raw_input('Enter pipe separated list of ITEMS : ')).lower().strip()
and you could use .upper() only when printing, or wherever else it is needed.
Finally, there are a few other things that could be tweaked with the code just to make things a bit faster and cleaner. The main one that comes to mind is it seems like pipelst should be a python set and not a list as checking inclusion and removal would then be much faster for large lists, and the code to remove an item from a set is much cleaner:
>>> desserts = set(['ice cream', 'cookies', 'cake'])
>>> if 'cake' in desserts:
...     desserts.remove('cake')
>>> print desserts
set(['cookies', 'ice cream'])
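Putting these suggestions together, here is a minimal Python 3 sketch of the check. It assumes the item name is the first colon-separated field of each line and that comparison should be case-insensitive; find_wrong_items and its suffix parameter are hypothetical names, and the file is replaced by a list of lines so the example runs on its own:

```python
def find_wrong_items(user_input, file_lines, suffix='-mc'):
    # Item names are the first colon-separated field of each non-empty line.
    known = {
        line.split(':', 1)[0].strip().lower()
        for line in file_lines
        if line.strip()
    }
    wrong = []
    for raw in user_input.split('|'):
        item = raw.strip().lower()
        # Compare without the suffix, but report the item as it was entered.
        base = item[:-len(suffix)] if item.endswith(suffix) else item
        if base not in known:
            wrong.append(item)
    return wrong


lines = [
    'aaa:xxx:abc.com:1857:xxx1:rel5t2:y',
    'ifa:yyy:xyz.com:1858:yyy1:rel5t2:y',
]
print(find_wrong_items('aaa|ifa-mc|ggg-mc', lines))  # ['ggg-mc']
```

With the real file, the list could be replaced by the open file object, since iterating over a file also yields one line at a time.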
I have a string variable that is a person's first and middle names, which have been accidentally concatenated. Let's call it firstMiddle="johnadam"
I need to identify what's the first name and what isn't, and then split them into different variables. So I have this big text file full of first names, and the idea is that you check the full firstMiddle string to see if it's in the list, and if it isn't, then you decrement by one character and retry. (if you increment you fail, e.g. "max" from "maxinea")
I have tried writing this a hundred different ways, and my problem seems to be that I can't get it to x in y a whole word (this \b regex stuff only works on actual strings and not string variables?). The best outcome I had decremented "johnadam" down to "johna" because there is a name "johnathan" in the list. Now I can't even remember how I did that and my current code decrements just once and then quits even though nameToMatch in nameList == False.
I'm a total noob. I know I'm doing something very obviously stupid. Please help. Here's some code:
firstMiddle = "johnadam"
nameToCheck = firstMiddle
for match in nameList:
    if nameToCheck not in nameList:
        nameToCheck = nameToCheck[:-1]
        break
firstName = nameToCheck
middleName = firstMiddle.partition(nameToCheck)[2]
firstMiddle = "johnadam"
nameToCheck = firstMiddle
nameList = ['johnathan', 'john', 'kate', 'sam']
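# Stop once the string is empty so the loop cannot run forever when nothing matches
while nameToCheck and nameToCheck not in nameList:
    nameToCheck = nameToCheck[:-1]
firstName = nameToCheck
middleName = firstMiddle[ len(firstName): ]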
This is a simple change from what Gabriel has done. The concept is basically the same. This one just looks at the longest match rather than the first match. It's difficult to put the entire code in the comment section, so I'm answering separately.
firstMiddle = "johnadam"
nameToCheck = firstMiddle
nameList = ['johnathan', 'john', 'kate', 'sam']
firstNames = filter(lambda m: firstMiddle.startswith(m), nameList)
middleName = ''
if firstNames: # if the list isn't empty
    firstName = sorted( firstNames, key=len )[-1]
    middleName = firstMiddle.partition(firstName)[2]
else:
    firstName = firstMiddle
print firstName
See if this works ...
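The decrement-and-retry idea from the question can also be wrapped in a small function. This is a minimal Python 3 sketch, where split_first_middle is a hypothetical name and the fallback of returning the whole string with an empty middle name is an assumption about the desired behaviour:

```python
def split_first_middle(first_middle, known_names):
    # Try the longest prefix first, shrinking by one character each time.
    names = set(known_names)
    for end in range(len(first_middle), 0, -1):
        if first_middle[:end] in names:
            return first_middle[:end], first_middle[end:]
    # No known first name found: keep everything as the first name.
    return first_middle, ''


print(split_first_middle('johnadam', ['johnathan', 'john', 'kate', 'sam']))
# ('john', 'adam')
```

Because the whole prefix is looked up in a set, 'johna' is never mistaken for a match just because it is a substring of 'johnathan'.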
You could do this in a brute force way.
Iterate over the mixedname and slice it every time a bit more.
So you get
['johnadam']
['johnada','m']
['johnad','am']
['johna','dam']
['john','adam'] # your names in the list
If one of the slices matches, you put it aside and keep going until every name has been matched.
If you have names that start the same way, like 'john' and 'johnathan', or that sit in the middle of other names, like 'nathan' and 'johnathan', you should not stop when you find a match but keep slicing, so you also get ['john','athan'] and ['joh','nathan']
mixnames = ['johnadam','johnathan','johnamax']
names = ['john','adam', 'johnathan','nathan','max']
foundnames = []
for name in mixnames:
    for i in xrange(len(name)):
        name1 = name[0:i+1]
        name2 = name[i:]
        if name1 in names and name1 not in foundnames:
            foundnames.append(name1)
        if name2 in names and name2 not in foundnames:
            foundnames.append(name2)
print foundnames
output:
['john', 'adam', 'johnathan', 'nathan', 'max']
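A variant of the same brute-force slicing, written as a minimal Python 3 sketch, keeps only the split points where both halves are known names. candidate_splits is a hypothetical helper name, and requiring both halves to be in the list is a stricter rule than the loop above applies:

```python
def candidate_splits(mixed, known_names):
    # Cut the string at every interior position and keep the cuts
    # where both the left and the right part are known names.
    names = set(known_names)
    return [
        (mixed[:i], mixed[i:])
        for i in range(1, len(mixed))
        if mixed[:i] in names and mixed[i:] in names
    ]


names = ['john', 'adam', 'johnathan', 'nathan', 'max']
print(candidate_splits('johnadam', names))   # [('john', 'adam')]
print(candidate_splits('johnathan', names))  # []
```

An empty result means the string is either a single name or contains a part that is not in the list.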