Splitting a string of names - python

I'm trying to write a program that will ask a user to input several names, separated by a semi-colon. The names would be entered as lastname,firstname. The program would then print each name in a firstname lastname format on separate lines. So far, my program is:
def main():
    names = input("Please enter your list of names: ")
    person = names.split(";")
    xname = person.split(",")
This is as far as I got, because there's an error when I try to split on the comma. What am I doing wrong? The output should look like this:
Please enter your list of names: Falcon, Claudio; Ford, Eric; Owen, Megan; Rogers, Josh; St. John, Katherine
You entered:
Claudio Falcon
Eric Ford
Megan Owen
Josh Rogers
Katherine St. John

.split is a string method that returns a list of strings. So it works fine on splitting the original string of names, but you can't call it on the resulting list (list doesn't have a .split method, and that really wouldn't make sense). So you need to call .split on each of the strings in the list. And to be neat, you should clean up any leading or trailing spaces on the names. Like this:
names = "Falcon, Claudio; Ford, Eric; Owen, Megan; Rogers, Josh; St. John, Katherine"
for name in names.split(';'):
    last, first = name.split(',')
    print(first.strip(), last.strip())
output
Claudio Falcon
Eric Ford
Megan Owen
Josh Rogers
Katherine St. John
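To make the same idea reusable, the loop above can be wrapped in a function. This is a minimal sketch, not part of the original answer: the name `format_names` is hypothetical, and it assumes that an entry without a comma should be passed through unchanged rather than raise an error.

```python
def format_names(raw):
    """Turn 'Last, First; Last, First' into a list of 'First Last' strings."""
    result = []
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # skip empty pieces, e.g. from a trailing semicolon
        # partition splits on the first comma only and never raises
        last, sep, first = entry.partition(",")
        if sep:
            result.append(first.strip() + " " + last.strip())
        else:
            result.append(entry)  # assumption: keep malformed entries as-is
    return result


print("\n".join(format_names("Falcon, Claudio; Ford, Eric; St. John, Katherine")))
```

Using `partition` instead of `split` avoids the unpacking error that `last, first = name.split(',')` raises when an entry has no comma or more than one.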

.split returns a list, so you are attempting
["Falcon, Claudio", "Ford, Eric" ...].split(',')
Which obviously doesn't work, as split is a string method. Try this:
full_names = []
for name in names.split("; "):
    last, first = name.split(', ')
    full_names.append(first + " " + last)
To give you
['Claudio Falcon', 'Eric Ford', 'Megan Owen', 'Josh Rogers', 'Katherine St. John']

You are splitting the whole list instead of each string. Change it to this:
def main():
    names = input("Please enter your list of names: ")
    person = names.split("; ")
    xname = [x.split(", ") for x in person]
To print it out, do this:
print("\n".join([" ".join(x[::-1]) for x in xname]))

You can use the following code:
names = input("Please enter your list of names: ")
data = names.split(";")
data is now a list, so process each item in it to get the first name and last name:
f_names = []
for i in data:
    l_name, f_name = i.split(",")
    f_names.append(f_name.strip() + " " + l_name.strip())
print("You entered:\n" + "\n".join(f_names))
This prints the names in the desired format.

Related

Can I get Python to compare a list of nicknames with a list of full names?

So first off I have a character data frame that has a column called name and contains the full name for 100+ people.
Eg, Name: Johnathan Jay Smith, Harold Robert Doe, Katie Holt.
Then I have a list of unique nicknames eg, [Mr. Doe, Aunt Katie, John]
It's important to note that they are not in the same order, and that not everyone with a nickname is in the full name list, and not everyone in the full name list is in the nickname list. I will be removing rows that don't have matching values at the end.
My Question: is there a way I can get python to read through these 2 lists item by item and match John with Johnathan Jay Smith for everyone that has a match? Basically if the nickname appears as a part of the whole name, can I add a nickname column to my existing character data frame without doing this manually for over 100 people?
Thank you in advance, I don't even know where to start with this one!
This is very straightforward and does not take spelling variants into account:
from itertools import product
names = ['Johnathan Jay Smith', 'Harold Robert Doe', 'Katie Holt']
nicknames = ["Mr. Doe", "Aunt Katie", "John"]
def match_nicknames(names, nicknames):
    splitted_names = [n.split(' ') for n in names]
    splitted_nn = [n.split(' ') for n in nicknames]
    matches = []
    for name in splitted_names:
        # pair every part of the name with every (split) nickname
        name_pairs = product(name, splitted_nn)
        # keep the pairs where some word of the nickname occurs in the name part
        matched = list(filter(lambda x: any([nn in x[0] for nn in x[1]]), name_pairs))
        if matched:
            matches += [(" ".join(name), " ".join(nn)) for name_part, nn in matched]
    return matches
match_nicknames(names, nicknames)
>> [('Johnathan Jay Smith', 'John'),
('Harold Robert Doe', 'Mr. Doe'),
('Katie Holt', 'Aunt Katie')]
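Since the goal in the question is a new column, a dictionary from full name to nickname may be more convenient than a list of pairs. The sketch below is a simplified variant of the answer above, under stated assumptions: `nickname_map` is a hypothetical helper, matching is a case-insensitive substring test on individual words, and only the first matching nickname is kept for each name.

```python
def nickname_map(names, nicknames):
    """Map each full name to the first nickname sharing a word fragment with it."""
    result = {}
    for full in names:
        parts = full.lower().split()
        for nick in nicknames:
            tokens = nick.lower().split()
            # a nickname matches if any of its words occurs inside any name part
            if any(tok in part for tok in tokens for part in parts):
                result[full] = nick
                break  # assumption: one nickname per person is enough
    return result


names = ['Johnathan Jay Smith', 'Harold Robert Doe', 'Katie Holt']
nicknames = ["Mr. Doe", "Aunt Katie", "John"]
print(nickname_map(names, nicknames))
```

Names without a match are simply absent from the dictionary. Very short nickname words (for example "Al") can match unrelated names, so the result is worth checking by eye. With pandas, a dictionary like this could be passed to `Series.map` on the name column; that step is not shown here.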

How to replace certain parts of a string using a list?

namelist = ['John', 'Maria']
e_text = 'John is hunting, Maria is cooking'
I need to replace 'John' and 'Maria'. How can I do this?
I tried:
for name in namelist:
    if name in e_text:
        e_text.replace(name, 'replaced')
But it only works with 'John'. The output is: 'replaced is hunting, Maria is cooking'. How can I replace the two names?
Thanks.
Strings are immutable in python, so replacements don't modify the string, only return a modified string. You should reassign the string:
for name in namelist:
    e_text = e_text.replace(name, "replaced")
You don't need the if name in e_text check since replace already does nothing if it's not found.
You could form a regex alternation of the names and then call re.sub on that:
import re
namelist = ['John', 'Maria']
pattern = r'\b(?:' + '|'.join(namelist) + r')\b'
e_text = 'John is hunting, Maria is cooking'
output = re.sub(pattern, 'replaced', e_text)
print(e_text + '\n' + output)
This prints:
John is hunting, Maria is cooking
replaced is hunting, replaced is cooking
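If the names might contain characters that are special in a regex, the alternation can be built with `re.escape`. The following is a sketch rather than part of the answer above: `replace_names` is a hypothetical helper, and it assumes every name starts and ends with a letter or digit so that `\b` behaves as expected.

```python
import re


def replace_names(text, names, replacement="replaced"):
    """Replace whole-word occurrences of any name in `names`."""
    if not names:
        return text  # an empty alternation would match at every word boundary
    # longest names first, so a longer name wins over a shorter prefix of it
    ordered = sorted(names, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(n) for n in ordered) + r")\b"
    return re.sub(pattern, replacement, text)


print(replace_names('John is hunting, Maria is cooking', ['John', 'Maria']))
```

Unlike the plain `str.replace` loop, the word boundaries keep 'John' from being replaced inside 'Johnny'.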

Matching similar items in a list from user input with Python?

I'm very new to Python, and before anything else, I'd like to apologize if the title is not specific enough, but I don't know another way to word it. Also, apologies if this question is a duplicate. As I said, I really don't know what to look for. Anyways, here's what I want to do:
I have a list like this: names = ['william shakespeare', 'shakira', 'tom ford', 'tim ford']
I want the user to be able to input a name, so I'll use name = input('Enter name:'), and here's where I get stuck. I want the user to be able to enter a string like shak and have the program display 1. William Shakespeare 2. Shakira or maybe if the user put ford have the program show 1. Tom Ford 2. Tim Ford, and if the user gets more specific, like: shakespeare have the program show only 1. William Shakespeare.
I suppose this is a Regex question, but I find Regex very confusing. I've tried watching several videos to no avail. Any type of help is appreciated. Thanks in advance!
Regex would be overkill in my opinion. You can use a for loop to iterate through the list. Then using the in operator you can check if an entered string is within the string (called a substring). I'll give a simple example.
names = ['william shakespeare', 'shakira', 'tom ford', 'tim ford']
substring = input('Enter a substring: ')
for name in names:
    if substring in name:
        print(name)
Since you say you are new to python I refrained from using list comprehensions or anything complicated.
Does this work for you? If the output isn't quite the way you want it, I expect that you can handle that part later. The logic finds each applicable name.
names = ['william shakespeare', 'shakira', 'tom ford', 'tim ford']
while True:
    name = input('Enter name: ')
    print([entry for entry in names if name in entry])
Regex is very useful, but this particular example doesn't require a regex. Calling str.lower() on both strings makes the comparison case-insensitive.
i = 0
for item in names:
    if name.lower() in item.lower():
        i += 1
        print(str(i) + '. ' + item)
names = ['william shakespeare', 'shakira', 'tom ford', 'tim ford']
search = str(input("Please enter search term "))
ref = list(search.lower())  # this is your search text
out = []
count = 1
for i in names:  # loops over your list
    # print(i)  # uncomment to see what i is
    ind = 0  # restart the match position for every name
    for j in list(i.lower()):  # list converts your variable into a list, i.e. shakira = ['s', 'h', ...]
        if j == ref[ind]:
            ind += 1
        else:
            ind = 0
        if ind == len(ref):  # ind walks over each index in ref; once it equals the length
            # of your search text, you have found a match
            out.append(i)
            ind = 0
            break
if len(out) == 0:
    out.append(" Nothing found ")
    count = 0
for i in out:
    print("{}. {}".format(count, i), end="\t\t")
    count += 1
Okay! There are a lot of ways to do this; I tried to make it as simple as possible.
Flaws: If two names are similar, e.g. Will and William, and you search for will, you will get both names. If you search for william, it will give you William only. The design is also not that efficient: with two nested for loops, the running time is linear in the total number of letters in the list. See how it goes.
You could try something like this.
names = ['william shakespeare', 'shakira', 'tom ford', 'tim ford']
def get_matching_term(term, names):
    matches = []
    for word in names:
        if term.lower() in word.lower():
            matches.append(word)
    return matches
thing_to_match = input("Enter name to find: ")
matches = get_matching_term(thing_to_match, names)
for idx, match in enumerate(matches):
    print("Match #" + str(idx + 1) + ": " + match)
Then you can invoke it like this:
python test.py
Enter name to find: shak
Match #1: william shakespeare
Match #2: shakira
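The output format asked for in the question (1. William Shakespeare 2. Shakira) can be produced by combining the case-insensitive check with `enumerate`. This is a small sketch with hypothetical helper names `find_matches` and `format_matches`; it assumes that title-casing the stored names is an acceptable way to display them.

```python
def find_matches(names, term):
    """Return every name containing `term`, ignoring case."""
    term = term.lower()
    return [name for name in names if term in name.lower()]


def format_matches(matches):
    """Number the matches from 1 and capitalise each word for display."""
    return ["{}. {}".format(i, name.title())
            for i, name in enumerate(matches, start=1)]


names = ['william shakespeare', 'shakira', 'tom ford', 'tim ford']
for line in format_matches(find_matches(names, "shak")):
    print(line)
```

Keeping the search and the formatting in separate functions makes it easy to swap in `input()` for the hard-coded search term.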
The same substring check also works on a list of dictionaries:
dict_list = [{"First_Name": "Sam", "Last_Name": "John", "Age": 24},
             {"First_Name": "Martin", "Last_Name": "Lukas", "Age": 34},
             {"First_Name": "Jeff", "Last_Name": "Mark", "Age": 44},
             {"First_Name": "Jones", "Last_Name": "alfred", "Age": 54}]
getdetails = input("Name: ")
for person in dict_list:
    if getdetails in person["First_Name"]:
        print(f'Details are {person}')
Output:
Name: Jeff
Details are {'First_Name': 'Jeff', 'Last_Name': 'Mark', 'Age': 44}

Function to convert a string to a tuple

Here is an example of what data I will have:
472747372 42 Lawyer John Legend Bishop
I want to be able to take a string like that and using a function convert it into a tuple so that it will be split like so:
"472747372" "42" "Lawyer" "John Legend" "Bishop"
The fields are: NI number, age, job, surname, and other names.
What about:
>>> string = "472747372 42 Lawyer John Legend Bishop"
>>> string.split()[:3] + [' '.join(string.split()[3:5])] + [string.split()[-1]]
['472747372', '42', 'Lawyer', 'John Legend', 'Bishop']
Or:
>>> string.split(maxsplit=3)[:-1] + string.split(maxsplit=3)[-1].rsplit(maxsplit=1)
['472747372', '42', 'Lawyer', 'John Legend', 'Bishop']
In Python, str has a built-in method called split which will split the string into a list, splitting on whatever separator you pass it. Its default is to split on whitespace, so you can simply do:
my_string = '472747372 42 Lawyer Hermin Shoop Tator'
tuple(my_string.split())
EDIT: After OP changed the post.
Assuming there will always be an NI Number, Age, Job, and surname, you would have to do:
elems = my_string.split()
tuple(elems[:3] + [' '.join(elems[3:5])] + elems[5:])
This will allow you to support an arbitrary number of "other" names after the surname
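The slicing approach above can be wrapped in a function that returns exactly five fields. This is a sketch under the same assumption the answers make, namely that the surname is always two words; `parse_record` is a hypothetical name, and any further words are joined into a single "other names" field.

```python
def parse_record(line):
    """Split 'NI age job surname(2 words) other names...' into a 5-tuple."""
    # the first three fields are single words; the rest is collected in a list
    ni_number, age, job, *rest = line.split()
    surname = " ".join(rest[:2])      # assumption: surname is two words
    other_names = " ".join(rest[2:])  # empty string if there are none
    return (ni_number, age, job, surname, other_names)


print(parse_record("472747372 42 Lawyer John Legend Bishop"))
```

A line with fewer than three words raises a `ValueError` from the unpacking, which is usually preferable to returning a silently wrong tuple.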

disassemble and reassemble strings based on list

I have four speakers like this:
Team_A = ['Fred', 'Bob']
Team_B = ['John', 'Jake']
They are having a conversation and it is all represented by a string, ie. convo=
Fred
hello
John
hi
Bob
how is it going?
Jake
we are doing fine
How do I disassemble and reassemble the string so I can split it into 2 strings: 1 string of what Team_A said, and 1 string of what Team_B said?
output: team_A_said="hello how is it going?", team_B_said="hi we are doing fine"
The lines don't matter.
I have this awful find... then slice code that is not scalable. Can someone suggest something else? Any libraries to help with this?
I didn't find anything in nltk library
This code assumes that contents of convo strictly conforms to the
name\nstuff they said\n\n
pattern. The only tricky code it uses is zip(*[iter(lines)]*3), which creates a list of triplets of strings from the lines list. For a discussion on this technique and alternate techniques, please see How do you split a list into evenly sized chunks in Python?.
#!/usr/bin/env python3
team_ids = ('A', 'B')
team_names = (
    ('Fred', 'Bob'),
    ('John', 'Jake'),
)
# Build a dict to get the team id from a person's name
teams = {}
for team_id, names in zip(team_ids, team_names):
    for name in names:
        teams[name] = team_id
# Each block in convo MUST consist of <name>\n<one line of text>\n\n
# Do NOT omit the final blank line at the end
convo = '''Fred
hello

John
hi

Bob
how is it going?

Jake
we are doing fine

'''
lines = convo.splitlines()
# Group lines into <name><text><empty> chunks
# and append the text to the appropriate list in `said`
said = {'A': [], 'B': []}
for name, text, _ in zip(*[iter(lines)]*3):
    team_id = teams[name]
    said[team_id].append(text)
for team_id in team_ids:
    print('Team %s said: %r' % (team_id, ' '.join(said[team_id])))
output
Team A said: 'hello how is it going?'
Team B said: 'hi we are doing fine'
You could use a regular expression to split up each entry. The built-in filter can then be used to extract the required entries for each team.
import re
def get_team_conversation(entries, team):
    # keep the entries whose first line is a member of the team
    return list(filter(lambda x: x.split('\n')[0] in team, entries))
Team_A = ['Fred', 'Bob']
Team_B = ['John', 'Jake']
convo = """
Fred
hello
John
hi
Bob
how is it going?
Jake
we are doing fine"""
find_teams = '^(' + '|'.join(Team_A + Team_B) + r')$'
entries = [e[0].strip() for e in re.findall('(' + find_teams + '.*?)' + '(?=' + find_teams + r'|\Z)', convo, re.S | re.M)]
print('Team-A', get_team_conversation(entries, Team_A))
print('Team-B', get_team_conversation(entries, Team_B))
Giving the following output:
Team-A ['Fred\nhello', 'Bob\nhow is it going?']
Team-B ['John\nhi', 'Jake\nwe are doing fine']
It is a problem of language parsing.
Answer is a Work in progress
Finite state machine
A conversation transcript can be understood by imagining it being parsed by an automaton with the following states:
[start] ---> [Name] ----> [Text] -+-----> [end]
               ^                  |
               |                  | (whitespace)
               +------------------+
You can parse your conversation by making it follow that state machine. If your parsing succeeds (i.e. follows the states to the end of the text), you can browse your "conversation tree" to derive meaning.
Tokenizing your conversation (lexer)
You need functions to recognize the name state. This is straightforward
name = (Team_A | Team_B) + '\n'
Conversation alternation
In this answer, I did not assume that the speakers strictly alternate. A transcript like the following, where the same author speaks twice in a row, would break such an assumption:
Fred # author 1
hello
John # author 2
hi
Bob # author 3
how is it going?
Bob # ERROR: author 3 again!
are we still on for Saturday, Fred?
This might be problematic if your transcript concatenates answers from the same author.
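The state machine described in this answer can be sketched in a few lines. This is one possible reading of it, not the author's implementation: `split_by_team` is a hypothetical name, a line is treated as the [Name] state whenever it exactly equals a known speaker, blank lines are skipped, and every other line is [Text] belonging to the most recent speaker.

```python
def split_by_team(convo, teams):
    """Return {team_label: 'everything that team said'} for a transcript."""
    # reverse lookup: speaker name -> team label
    speaker_team = {name: label
                    for label, members in teams.items()
                    for name in members}
    said = {label: [] for label in teams}
    current = None  # team of the speaker whose text is being read
    for raw in convo.splitlines():
        line = raw.strip()
        if not line:
            continue  # whitespace between blocks
        if line in speaker_team:
            current = speaker_team[line]  # [Name] state
        elif current is not None:
            said[current].append(line)    # [Text] state
    return {label: " ".join(parts) for label, parts in said.items()}


teams = {"A": ["Fred", "Bob"], "B": ["John", "Jake"]}
convo = "Fred\nhello\n\nJohn\nhi\n\nBob\nhow is it going?\n\nJake\nwe are doing fine\n"
print(split_by_team(convo, teams))
```

This handles multi-line messages and the same speaker appearing twice in a row, but a spoken line that consists only of a speaker's name would be misread as a new [Name] state.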
