How to distinguish first names from last names contained in array - python

I have an array called datos with names and surnames in random order
datos = ['Lucas Martinez', 'Gonzalez Carmen', 'Garcia Sofia', 'Cristian Ines Perez', 'Jorge Rodriguez']
As you can see, an entry can contain two first names with one surname.
I also have an array with only names:
nombres = ['Sofia', 'Lucas', 'Cristian', 'Jorge', 'Ines', 'Carmen']
I want to find the first names and print each one followed by its surname, using the scheme:
"firstname lastname"
"firstname lastname"
like:
Lucas Martinez
Carmen Gonzalez
and, when an entry has two first names, split it into two separate entries:
Cristian Perez
Ines Perez
I can find the name using this:
any(i.split()[0] in nombres for i in datos)
def verificacion(a, b):
    res = [i.split()[0] for i in a if i.split()[0] in b]
    return res
print(verificacion(datos, nombres))
but this only works when the first name comes first in the entry.

Since names can be three "names" long, I would propose adding a preprocessing function. Afterwards, we can match all names to the nombres list and append them in the correct order as such:
datos = ['Lucas Martinez', 'Gonzalez Carmen', 'Garcia Sofia', 'Cristian Ines Perez', 'Jorge Rodriguez']
nombres = ['Sofia', 'Lucas', 'Cristian', 'Jorge', 'Ines', 'Carmen']
def preprocess(datos):
    new_datos = []
    for person in datos:
        if len(person.split(' ')) > 2:  # get names which are 3 names long and split them
            s_name = person.split(' ')
            new_datos.append(f'{s_name[0]} {s_name[2]}')
            new_datos.append(f'{s_name[1]} {s_name[2]}')
        else:  # default case
            new_datos.append(person)
    return new_datos
def verificacion(datos, nombres):
    ver_list = []
    datos = preprocess(datos)
    for person in datos:
        split_name = person.split(' ')
        if split_name[0] in nombres:  # add the name normally if it's in the correct order
            ver_list.append(person)
        elif split_name[1] in nombres:  # otherwise reverse the order
            ver_list.append(f'{split_name[1]} {split_name[0]}')
    return ver_list
print(verificacion(datos, nombres))
Probably a more succinct way of doing this but I'll leave that up to you. This should work if I understood the requirements correctly.
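One possible more compact variant is sketched below. Instead of relying on word position, it classifies each word by whether it appears in nombres, so an entry such as 'Perez Cristian Ines' would also be handled. It assumes every first name is listed in nombres and that any remaining words together form the surname; the function name separar is hypothetical.

```python
datos = ['Lucas Martinez', 'Gonzalez Carmen', 'Garcia Sofia', 'Cristian Ines Perez', 'Jorge Rodriguez']
nombres = ['Sofia', 'Lucas', 'Cristian', 'Jorge', 'Ines', 'Carmen']

def separar(datos, nombres):
    """Return 'firstname lastname' strings, one per first name found."""
    conocidos = set(nombres)  # set lookups are fast for membership tests
    resultado = []
    for persona in datos:
        palabras = persona.split()
        primeros = [p for p in palabras if p in conocidos]
        apellidos = [p for p in palabras if p not in conocidos]
        if not primeros or not apellidos:
            continue  # assumption: skip entries that cannot be classified
        apellido = ' '.join(apellidos)
        resultado.extend(f'{primero} {apellido}' for primero in primeros)
    return resultado

print(*separar(datos, nombres), sep='\n')
```

Because the surname is whatever is left over, a first name missing from nombres would be treated as part of the surname, so the quality of the result depends on how complete nombres is.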

Related

How to Iterate through an array and find more than one elements

The Problem -
If I input more than one player's name, only the first name comes up and then the program stops.
What can I do to print all the names and the W/R percentage for the players the user enters?
The code -
import pandas as pd
def print_player_data():
    nba_data = pd.read_csv("csv_data.csv", sep=",")
    dataList = []
    player_names = input("Enter a list of player names: ")
    player_names = player_names.split(",")
    print(player_names)
    for player in player_names:
        for index, row in nba_data.iterrows():
            if row["PLAYER_NAME"] == player:
                dataList.append(row["W/R_percentage"])
    print(dataList)
print_player_data()
The Data -
PLAYER_NAME,TEAM_ABBREVIATION,Player Impact Rating,GP,Wins,Losses,W/R_percentage
Alex Len,ATL,0.1,77,28,49,36.36
Alex Poythress,ATL,0.069,21,7,14,33.33
Daniel Hamilton,ATL,0.07,19,7,12,36.84
DeAndre Bembry,ATL,0.081,82,29,53,35.37
It looks as though you type a space after each comma when entering the names, so every name after the first starts with a space and never equals a PLAYER_NAME value.
Add this line before iterating through your list to remove the leading whitespace:
player_names = [name.lstrip() for name in player_names]
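To illustrate the effect, here is a minimal sketch that avoids pandas and reads a few rows copied from the question with the standard csv module. The function name lookup_percentages and the inlined CSV_TEXT constant are assumptions made for this example; strip() is used instead of lstrip() so that trailing spaces are removed as well.

```python
import csv
import io

# A few rows copied from the question, inlined so the sketch runs without a file.
CSV_TEXT = """PLAYER_NAME,TEAM_ABBREVIATION,Player Impact Rating,GP,Wins,Losses,W/R_percentage
Alex Len,ATL,0.1,77,28,49,36.36
Alex Poythress,ATL,0.069,21,7,14,33.33
Daniel Hamilton,ATL,0.07,19,7,12,36.84
DeAndre Bembry,ATL,0.081,82,29,53,35.37
"""

def lookup_percentages(raw_input, csv_text):
    """Map each requested player name to its W/R_percentage string."""
    # strip() removes whitespace on both sides of every name
    wanted = [name.strip() for name in raw_input.split(",") if name.strip()]
    rows = {row["PLAYER_NAME"]: row["W/R_percentage"]
            for row in csv.DictReader(io.StringIO(csv_text))}
    return {name: rows[name] for name in wanted if name in rows}

result = lookup_percentages("Alex Len, Daniel Hamilton ,DeAndre Bembry", CSV_TEXT)
print(result)
```

Without the strip() call, " Daniel Hamilton " would not equal any PLAYER_NAME value, which matches the symptom described in the question.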

How to sub-string a list in Python

I have an assignment where I have a list with two names and I have to print the first name and then I have to print the last name.
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']
For each of the two names, I have to find the whitespace between the first and last name, store each part in a variable, and print it.
I have tried to slice it but I don't understand it enough to use it. Here is an example:
substr = x[0:2]
This just returns both names instead of a substring of one name.
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']
for i in range(0, len(names_list)):
    nf = names_list[i].split(' ')
    name = nf[0]
    family = nf[1]
    print("Name is: {}, Family is: {}".format(name, family))
Output:
Name is: Oluwaferanmi, Family is: Fakolujo
Name is: Ajibola, Family is: Fakolujo
This was written for Python 3.x.
You can use the split() method and indexing for these kinds of problems.
Iterate through the list
split() the string
Store the values in a variable, so that we can index them
Display the values
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']
for i in names_list:
    string = i.split(" ")
    first_name = string[0]
    last_name = string[1]
    print(f"First Name: {first_name} Last Name: {last_name}")
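The steps above index into the result of split(), which raises an IndexError for a single-word name and drops any words after the second. A small sketch of an alternative uses str.partition, which splits on the first space only. The helper name split_name is hypothetical.

```python
names_list = ['Oluwaferanmi Fakolujo', 'Ajibola Fakolujo']

def split_name(full_name):
    """Split on the first space only, so extra words stay in the last name."""
    # partition returns (before, separator, after); 'after' is '' if no space exists
    first_name, _, last_name = full_name.partition(' ')
    return first_name, last_name

for full_name in names_list:
    first_name, last_name = split_name(full_name)
    print(f"First Name: {first_name} Last Name: {last_name}")
```

With this helper, a name such as 'Ana Maria Lopez' yields 'Ana' and 'Maria Lopez', and a single word yields an empty last name rather than an error.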

How to make a text file (name1:hobby1 name2:hobby2) into this (name1:hobby1, hobby2 name2:hobby1, hobby2)?

I'm new to programming and I need some help. I have a text file with lots of names and hobbies that looks something like this:
Jack:crafting
Peter:hiking
Wendy:gaming
Monica:tennis
Chris:origami
Sophie:sport
Monica:design
Some of the names and hobbies are repeated. I'm trying to make the program display something like this:
Jack: crafting, movies, yoga
Wendy: gaming, hiking, sport
This is my program so far, but the last four lines are incorrect.
def create_dictionary(file):
    newlist = []
    dict = {}
    file = open("hobbies_database.txt", "r")
    hobbies = file.readlines()
    for rows in hobbies:
        rows1 = rows.split(":")
        k = rows1[0]  # name
        v = (rows1[1]).rstrip("\n")  # hobby
        dict = {k: v}
        for k, v in dict.items():
            if v in dict[k]:
                # incomplete: the asker's code stops here
In this case I would use defaultdict.
import sys
from collections import defaultdict
def create_dictionary(inputfile):
    d = defaultdict(list)
    for line in inputfile:
        name, hobby = line.split(':', 1)
        d[name].append(hobby.strip())
    return d
with open(sys.argv[1]) as fp:
    for name, hobbies in create_dictionary(fp).items():
        print(name, ': ', sep='', end='')
        print(*hobbies, sep=', ')
Your example gives me this result:
Sophie: sport
Chris: origami
Peter: hiking
Jack: crafting
Wendy: gaming
Monica: tennis, design
You may try this one (updated here to Python 3 syntax):
data = map(lambda x: x.strip(), open('hobbies_database.txt'))
tmp = {}
for i in data:
    k, v = i.strip().split(':')
    if not tmp.get(k, []):
        tmp[k] = []
    tmp[k].append(v)
for k, v in tmp.items():
    print(k, ':', ','.join(v))
output:
Monica : tennis,design
Jack : crafting
Wendy : gaming
Chris : origami
Sophie : sport
Peter : hiking
You could try something like this. I've deliberately rewritten this as I'm trying to show you how you would go about this in a more "Pythonic way". At least making use of the language a bit more.
For example, you can create arrays within dictionaries to represent the data more intuitively. It will then be easier to print the information out in the way you want.
def create_dictionary(file):
    names = {}  # create the dictionary to store your data
    # using a with statement ensures the file is closed properly
    # even if there is an error thrown
    with open("hobbies_database.txt", "r") as file:
        # This reads the file one line at a time,
        # whereas readlines() loads the whole file into memory in one go.
        # Reading line by line is far better for large data files that won't fit into memory
        for row in file:
            # strip() removes end of line characters and trailing white space
            # split returns an array [] which can be unpacked direct to single variables
            name, hobby = row.strip().split(":")
            # this checks to see if 'name' has been seen before
            # is there already an entry in the dictionary
            if name not in names:
                # if not, assign an empty array to the dictionary key 'name'
                names[name] = []
            # this adds the hobby seen in this line to the array
            names[name].append(hobby)
    # This iterates through all the keys in the dictionary
    for name in names:
        # using the string format function you can build up
        # the output string and print it to the screen
        # ", ".join(array) will join all the elements of the array
        # into a single string and place a comma between each
        # set(array) creates a "list/array" of unique objects
        # this means that if a hobby is added twice you will only see it once in the set
        # names[name] is the list [] of hobby strings for that 'name'
        print("{0}: {1}\n".format(name, ", ".join(set(names[name]))))
Hope this helps, and perhaps points you in the direction of a few more Python concepts. If you haven't been through the introductory tutorial yet, I'd definitely recommend it.
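One caveat with set() is that it does not keep the hobbies in the order they appeared in the file. As a rough sketch of an alternative, the grouping can use a dict per name as an ordered set, which drops repeats while preserving first-seen order. The function name group_hobbies and the sample lines are assumptions for illustration only.

```python
from collections import defaultdict

def group_hobbies(lines):
    """Collect hobbies per name, dropping repeats but keeping first-seen order."""
    grouped = defaultdict(dict)  # inner dict acts as an ordered set of hobbies
    for line in lines:
        line = line.strip()
        if ':' not in line:
            continue  # skip blank or malformed lines
        name, hobby = line.split(':', 1)
        grouped[name.strip()][hobby.strip()] = None
    return {name: list(hobbies) for name, hobbies in grouped.items()}

sample = ["Jack:crafting", "Monica:tennis", "Jack:crafting", "Monica:design", ""]
for name, hobbies in group_hobbies(sample).items():
    print(f"{name}: {', '.join(hobbies)}")
```

Any iterable of lines works as input, so an open file object can be passed in place of the sample list.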

How can I organize case-insensitive text and the material following it?

I'm very new to Python, so I'd really appreciate it if this could be explained in as much depth as possible.
If I have some text like this in a text file:
matthew : 60 kg
MaTtHew : 5 feet
mAttheW : 20 years old
maTThEw : student
MaTTHEW : dog owner
How can I make a piece of code that can write something like...
Matthew : 60 kg , 5 feet , 20 years old , student , dog owner
...by only gathering information from the text file?
from functools import reduce  # reduce is not a built-in in Python 3
def test_data():
    # This is obviously the source data as a multi-line string constant.
    source = \
"""
matthew : 60 kg
MaTtHew : 5 feet
mAttheW : 20 years old
maTThEw : student
MaTTHEW : dog owner
bob : 70 kg
BoB : 6 ft
"""
    # Split on newline. This will return a list of lines like ["matthew : 60 kg", "MaTtHew : 5 feet", etc]
    return source.split("\n")
def append_pair(d, p):
    k, v = p
    if k in d:
        d[k] = d[k] + [v]
    else:
        d[k] = [v]
    return d
if __name__ == "__main__":
    # Do a list comprehension. For every line in the test data, split on the first ":" only,
    # strip off leading/trailing whitespace, and convert to lowercase. This will yield lists of lists.
    # This is mostly a list of key/value size-2-lists
    pairs = [[x.strip().lower() for x in line.split(":", 1)] for line in test_data()]
    # Filter the lists in the main list that do not have a size of 2. This will yield a list of key/value pairs like:
    # [["matthew", "60 kg"], ["matthew", "5 feet"], etc]
    cleaned_pairs = [p for p in pairs if len(p) == 2]
    # This will iterate the list of key/value pairs and send each to append_pair, which will either append to
    # an existing key, or create a new key.
    d = reduce(append_pair, cleaned_pairs, {})
    # Now, just print out the resulting dictionary.
    for k, v in d.items():
        print("{}: {}".format(k, ", ".join(v)))
import sys
# There are a number of assumptions I have to make based on your description.
# I'll try to point those out.
# Should be self-explanatory. Something like: "C:\Users\yourname\yourfile"
path_to_file = "put_your_path_here"
# open a file for reading. The 'r' indicates read-only
infile = open(path_to_file, 'r')
# reads in the file line by line and strips the "invisible" endline character
readLines = [line.strip() for line in infile]
# make sure we close the file
infile.close()
# An associative array. Does not use normal numerical indexing.
# Instead, in our case, we'll use a string (the name) to index into it.
# At a given name index (AKA key) we'll save the attributes about that person.
names = dict()
# iterate through each line we read in from the file
# each line in this loop will be stored in the variable
# item for that iteration.
for item in readLines:
    # assuming that your file has a strict format:
    # name : attribute
    index = item.find(':')
    # if there was a ':' found then continue (find() returns -1 when nothing is found)
    if index != -1:
        # grab only the name of the person, trim the space before the ':',
        # and convert the string to all lowercase
        name = item[0:index].strip().lower()
        # see if our associative array already has that person
        if name in names:
            # if that person has already been indexed add the new attribute
            # this assumes there are no duplicates so I don't check for them.
            names[name].append(item[index+1:len(item)])
        else:
            # if that person was not in the array then add them.
            # we're adding a list at that index to store their attributes.
            names[name] = list()
            # append the attribute to the list.
            # the len() function tells us how long the string 'item' is
            # offsetting the index by 1 so we don't capture the ':'
            names[name].append(item[index+1:len(item)])
    else:
        # there was no ':' found in the line so skip it
        pass
# iterate through keys (names) we found.
for name in names:
    # write it to stdout. I am using this because the "print" built-in to python
    # ends with a new line by default. This way I can print the name and then
    # iterate through the attributes associated with them
    sys.stdout.write(name + " : ")
    # iterate through attributes
    for attribute in names[name]:
        sys.stdout.write(attribute + ", ")
    # end each person with a new line.
    sys.stdout.write('\n')
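For comparison, a shorter sketch of the same idea is given below. It produces the exact output format asked for in the question, with the name capitalized and the values joined by " , ". It assumes each line has the form "name : value"; the function name merge_records and the sample list are hypothetical and stand in for the lines of the text file.

```python
def merge_records(lines):
    """Group 'name : value' lines by name, ignoring the case of the name."""
    merged = {}
    for line in lines:
        name, sep, value = line.partition(':')
        if not sep:
            continue  # no ':' on this line, so skip it
        key = name.strip().casefold()  # case-insensitive key
        merged.setdefault(key, []).append(value.strip())
    return merged

sample = [
    "matthew : 60 kg",
    "MaTtHew : 5 feet",
    "mAttheW : 20 years old",
    "maTThEw : student",
    "MaTTHEW : dog owner",
]
for key, values in merge_records(sample).items():
    print(f"{key.capitalize()} : {' , '.join(values)}")
```

To read from a real file, an open file object can be passed instead of sample, since iterating over a file yields its lines.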

Integrating Wildcard In Comparison of Strings

I have a python script which takes a name, re-formats it, and then compares it to a list of other names to see how many times it matches. The issue is the names it is being compared to have middle initials (which I don't want to have entered in the script).
list_of_names = ['Doe JM', 'Cruz CR', 'Smith JR', 'Doe JM', 'Maltese FL', 'Doe J']
Now I have a simple function that reformats the name.
f_name = name_format('John','Doe')
print(f_name)
> 'Doe J'
Now I want to do comparisons where every time "Doe J" or "Doe JM" appears, the value is true. The function below does not work as intended.
def matches(name, list):
    count = 0
    for i in list:
        if i == name:
            count = count + 1
        else:
            pass
    return(count)
print (matches(f_name, list_of_names))
> 1
My goal is to make the return value equal to 3. To do this, I want to ignore the middle initial, which in this case would be 'M' in 'Doe JM'.
What I want to do is something along the lines of formatting the name to 'Doe J?' where '?' is a wild card. I tried importing fnmatch and re to use some of their tools but was unsuccessful.
Use two for loops and yield. The function can yield the same name more than once, so wrap the result in a set to remove duplicates:
list_of_names = ['Doe JM', 'Cruz CR', 'Smith JR', 'Doe JM', 'Maltese FL', 'Doe J']
# List of names
def check_names(part_names, full_name_list):
    for full_name in full_name_list:
        for part_name in part_names:
            if part_name in full_name:
                yield full_name
result = set(check_names(['Doe J', 'Cruz'], list_of_names))
# One name
def check_names(name, full_name_list):
    for full_name in full_name_list:
        if name in full_name:
            yield full_name
# A generator has no len(), so build a list first
result = list(check_names('Doe J', list_of_names))
print(result)  # List of results
print(len(result))  # Count of names
You were on the right track with the re module. I believe the solution to your problem would be:
import re
def matches(name, name_list):
    # Allows at most one additional word character after the name.
    # re.escape protects against special characters in the name, and
    # re.fullmatch requires the whole string to match, not just its start.
    regex = re.escape(name) + r'\w?'
    result = [re.fullmatch(regex, test_name) is not None for test_name in name_list]
    return result.count(True)
print(matches(f_name, list_of_names))
# 3
This solution allows at most one additional word character (a letter, digit, or underscore) after the name.
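Since the question also mentions fnmatch, here is a minimal sketch of how the 'Doe J?' idea could be expressed with it. In fnmatch patterns, '?' matches exactly one character, so the bare name has to be tested separately to cover entries without a middle initial. The function name count_matches is hypothetical, and the sketch assumes the name itself contains no wildcard characters such as '*', '?' or '['.

```python
from fnmatch import fnmatchcase

list_of_names = ['Doe JM', 'Cruz CR', 'Smith JR', 'Doe JM', 'Maltese FL', 'Doe J']

def count_matches(name, names):
    """Count entries equal to name or to name plus exactly one more character."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch
    return sum(1 for candidate in names
               if candidate == name or fnmatchcase(candidate, name + '?'))

print(count_matches('Doe J', list_of_names))
```

Unlike the regular expression version, '?' here accepts any single character, not only letters, digits, and underscores.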
