Divide list into two parts based on condition python - python

I have a list which contains a chat conversation between agent and customer.
chat = ['agent',
        'Hi',
        'how may I help you?',
        'customer',
        'I am facing issue with internet',
        'agent',
        'Can i know your name and mobile no.?',
        'customer',
        'john doe',
        '111111',..... ]
This is a sample of chat list.
I am looking to divide the list into two parts, agent_chat and customer_chat, where agent_chat contains all the lines said by the agent and customer_chat contains all the lines said by the customer.
Something like this (final output):
agent_chat = ['Hi','how may I help you?','Can i know your name and mobile no.?'...]
customer_chat = ['I am facing issue with internet','john doe','111111',...]
I'm having trouble solving this. I tried using the list.index() method to split the chat list based on indexes, but I'm getting the same index for every occurrence.
For example, the following snippet:
[chat.index(l) for l in chat if l=='agent']
displays [0, 0], since it only gives me the first occurrence.
Is there a better way to achieve the desired output?

index() returns only the first index of the element, so you would need to collect the indexes of all occurrences by iterating over the list.
I would suggest solving this with a simple for loop:
agent_chat = []
customer_chat = []
chat_type = 'agent'
for message in chat:
    # A marker line switches the current speaker and is not stored
    if message in ['agent', 'customer']:
        chat_type = message
        continue
    if chat_type == 'agent':
        agent_chat.append(message)
    else:
        customer_chat.append(message)
Other approaches like list comprehension will require two iterations of the list.
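To make the point about index() concrete, here is a minimal sketch that collects every position of a marker with enumerate(). The helper name all_indexes is made up for this illustration, and the sample list is the one from the question without the elided tail.

```python
def all_indexes(items, target):
    """Return the positions of every occurrence of target in items."""
    return [i for i, item in enumerate(items) if item == target]

chat = ['agent', 'Hi', 'how may I help you?',
        'customer', 'I am facing issue with internet',
        'agent', 'Can i know your name and mobile no.?',
        'customer', 'john doe', '111111']

print(all_indexes(chat, 'agent'))     # [0, 5]
print(all_indexes(chat, 'customer'))  # [3, 7]
```

Unlike chat.index('agent'), which always restarts the search from the beginning of the list, enumerate() pairs each element with its own position.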

This would be my solution to your problem.
chat = ['agent',
        'Hi',
        'how may I help you?',
        'customer',
        'I am facing issue with internet',
        'agent',
        'Can i know your name and mobile no.?',
        'customer',
        'john doe',
        '111111']
agent_list = []
customer_list = []
agent = False
customer = False
for message in chat:
    if message == 'agent':
        agent = True
        customer = False
    elif message == 'customer':
        agent = False
        customer = True
    elif agent:
        agent_list.append(message)
    elif customer:
        customer_list.append(message)
    else:
        pass

Here is my solution. I don't know if this is the best one, but I hope it helps.
def chat_lists(chat):
    agent_chat = []
    customer_chat = []
    user_flag = ""
    for message in chat:
        if message == 'agent':
            user_flag = 'agent'
        elif message == 'customer':
            user_flag = 'customer'
        else:
            if user_flag == 'agent':
                agent_chat.append(message)
            else:
                customer_chat.append(message)
    return customer_chat, agent_chat

You can do something like this.
chat = ['agent',
        'Hi',
        'how may I help you?',
        'customer',
        'I am facing issue with internet',
        'agent',
        'Can i know your name and mobile no.?',
        'customer',
        'john doe',
        '111111']
agent = []
customer = []
for j in chat:
    if j == 'agent':
        curr = 'agent'
        continue
    if j == 'customer':
        curr = 'customer'
        continue
    if curr == 'agent':
        agent.append(j)
    else:
        customer.append(j)
print(agent)
print(customer)

You can set up a loop to go through the messages, and use a variable as a 'switch' that records whether the agent or the customer is talking.
# Get the current speaker (first speaker)
current_speaker = chat[0]
# Make the chat logs
agent_chat = []
customer_chat = []
# Iterate over the list
for message in chat:
    # If the current speaker is updated
    if message in ['agent', 'customer']:
        # Then update the speaker
        current_speaker = message
        # Skip to the next iteration
        continue
    # Add the message to the current speaker's log
    if current_speaker == 'agent':
        agent_chat.append(message)
    else:
        customer_chat.append(message)
Keep in mind that this whole approach will misattribute messages if a customer or agent decides, for whatever reason, to type exactly the word 'agent' or 'customer' as a message.
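The 'switch' idea can also be written once for any set of speakers. The following is only a sketch: the function name split_by_speaker and the speakers parameter are made up for this example, and it assumes, like the answers above, that a line equal to a speaker name is always a marker.

```python
def split_by_speaker(chat, speakers=('agent', 'customer')):
    """Group messages under the most recent speaker marker in one pass."""
    grouped = {speaker: [] for speaker in speakers}
    current = None
    for line in chat:
        if line in grouped:
            current = line              # a marker line switches the speaker
        elif current is not None:
            grouped[current].append(line)
        # lines that come before the first marker are ignored
    return grouped

chat = ['agent', 'Hi', 'how may I help you?',
        'customer', 'I am facing issue with internet',
        'agent', 'Can i know your name and mobile no.?',
        'customer', 'john doe', '111111']

result = split_by_speaker(chat)
print(result['agent'])
print(result['customer'])
```

The caveat about marker words still applies: with ['agent', 'Hi', 'customer', 'agent', 'yes'] the customer's message 'agent' is read as a marker, so 'yes' ends up in the agent's list.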

Related

Pull variable data from all outlook emails that have a particular subject line then take date from the body

I get an email every day with fruit quantities sold on the day. Though I now have come up with some code to log the relevant data going forward, I have been unable to do it going backwards.
The data is stored in the body of the email like so:
Date of report:,01-Jan-2020
Apples,8
Pears,5
Lemons,7
Oranges,9
Tomatoes,6
Melons,3
Bananas,0
Grapes,4
Grapefruit,8
Cucumber,2
Satsuma,1
What I would like for the code to do is first search through my emails and find the emails that match a particular subject, iterate line by line through and find the variables I'm searching for, and then log them in a dataframe with the "Date of Report" logged in a date column and converted into the format: "%m-%d-%Y".
I think I can achieve this by doing some amendments to the code I've written to deal with keeping track of it going forward:
# change for the fruit you're looking for
Fruit_1 = "Apples"
Fruit_2 = "Pears"
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
# find data email
for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        if Fruit_1 and Fruit_2 in message.body:
            data = str(message.body)
            break
        else:
            print('No data for', Fruit_1, 'or', Fruit_2, 'was found')
            break
fruitd = open("fruitd.txt", "w") # copy the contents of the latest email into a .txt file
fruitd.write(data)
fruitd.close()
def get_vals(filename: str, searches: list) -> dict:
    #Searches file for search terms and returns the values
    dct = {}
    with open(filename) as file:
        for line in file:
            term, *value = line.strip().split(',')
            if term in searches:
                dct[term] = float(value[0]) # Unpack value
    # if terms are not found update the dictionary w remaining and set value to None
    if len(dct.keys()) != len(searches):
        dct.update({x: None for x in search_terms if x not in dct})
    return dct
searchf = [
    Fruit_1,
    Fruit_2
] # the list of search terms the function searches for
result = get_vals("fruitd.txt", searchf) # search for terms
print(result)
# create new dataframe with the values from the dictionary
d = {**{'date':today}, **result}
fruit_vals = pd.DataFrame([d]).rename(columns=lambda z: z.upper())
fruit_vals['DATE'] = pd.to_datetime(fruit_vals['DATE'], format='%d-%m-%Y')
print(fruit_vals)
I'm creating a .txt file titled 'fruitd' because I was unsure how else I could iterate through an email message body. Unfortunately, I don't think creating a .txt file for each of the past emails is really feasible, and I was wondering whether there is a better way of doing it.
Any advice or pointers would be most welcome.
EDIT: Ideally I would like to get all the variables in the search list, so Fruit_1 and Fruit_2, with room to expand to Fruit_3, Fruit_4, etc. if necessary.
import pandas as pd
import win32com.client

#PREP THE STUFF
Fruit_1 = "Apples"
Fruit_2 = "Pears"
SEARCHF = [
    Fruit_1,
    Fruit_2
]
#DEF THE STUFF
# modified to take a list of list of strs as `report` arg
# apparently IDK how to type-hint; type-hinting removed
def get_report_vals(report, searches):
    dct = {}
    for line in report:
        term, *value = line
        # `str.casefold` is similar to `str.lower`, arguably better form
        # if there might ever be a possibility of dealing with non-Latin chars
        if term.casefold().startswith('date'):
            #FIXED (now takes `date` str out of list)
            dct['date'] = pd.to_datetime(value[0])
        elif term in searches:
            dct[term] = float(value[0])
    if len(dct.keys()) != len(searches):
        # corrected (?) `search_terms` to `searches`
        dct.update({x: None for x in searches if x not in dct})
    return dct
#DO THE STUFF
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
results = []
for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        # are you looking for:
        # Fruit_1 /and/ Fruit_2
        # or:
        # Fruit_1 /or/ Fruit_2
        if Fruit_1 in message.body and Fruit_2 in message.body:
            # FIXED
            data = [line.strip().split(",") for line in message.body.split('\n')]
            results.append(get_report_vals(data, SEARCHF))
        else:
            pass
fruit_vals = pd.DataFrame(results)
fruit_vals.columns = map(str.upper, fruit_vals.columns)
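The parsing step can be tried without Outlook or a temporary .txt file by working directly on the body string. The sketch below uses only the standard library; the function name parse_report and the sample body are assumptions for illustration, and the date layout '%d-%b-%Y' is inferred from the "01-Jan-2020" example in the question (note that %b depends on an English locale).

```python
from datetime import datetime

def parse_report(body, searches):
    """Extract the report date and the requested quantities from one email body."""
    row = {name: None for name in searches}   # terms that are not found stay None
    for line in body.splitlines():
        term, _, value = line.strip().partition(',')
        if term.casefold().startswith('date'):
            # '01-Jan-2020' is reformatted as month-day-year
            parsed = datetime.strptime(value.strip(), '%d-%b-%Y')
            row['date'] = parsed.strftime('%m-%d-%Y')
        elif term in searches:
            row[term] = float(value)
    return row

# Hypothetical body in the layout shown in the question
sample_body = "Date of report:,01-Jan-2020\r\nApples,8\r\nPears,5\r\nLemons,7\r\n"
row = parse_report(sample_body, ['Apples', 'Pears', 'Kiwis'])
print(row)
```

One such dictionary per matching email can be appended to a list and passed to pd.DataFrame, as in the loop above.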

Filter employees by project in odoo 12

I'm customizing the Odoo project module. In the module, we have projects whose tasks are assigned to employees. I need a dropdown of employees based on the selected project, but because there isn't a direct relation, I have to search all the tasks related to the project and then search for the employees.
This is my model so far:
class myModel(models.TransientModel):
    _name = "mymodule.mymodel"
    project_id = fields.Many2one('project.project', string="Project")
    task_id = fields.Many2one('project.task', string="Task", domain="[('project_id', '=', project_id)]")
    employee_id = fields.Many2one('hr.employee', string="Assign To")
    @api.onchange('project_id')
    def _projecy_onchange(self):
        if not self.project_id.id:
            return {'domain': {'employee_id': []}}
        tasks = self.env['project.task'].search([('project_id','=',self.project_id.id)])
        user_ids = []
        for t in tasks:
            if t.user_id:
                user_ids.append(t.user_id.id)
        if len(user_ids)>0:
            employees = self.env['hr.employee'].search(['user_id','in', user_ids])
            return {'domain': {'employee_id': employees}}
        else:
            return {'domain': {'employee_id': []}}
I'm having an issue when I search for the employees:
employees = self.env['hr.employee'].search(['user_id','in', user_ids])
I get the following error:
elif token[1] == 'in' and not token[2]: IndexError: tuple index out of range
When I print user_ids, it is a plain list of ids, something like [9] (a single element here, but it could obviously be more).
I understand search can work as
employees = self.env['hr.employee'].search(['user_id','in', [9]])
Any guidance will be appreciated.
Your syntax for Odoo's search method is wrong. Do it like this:
employees = self.env['hr.employee'].search([('user_id','in', user_ids)])
What was missing from your syntax: parentheses around the condition. A domain is a list of conditions, and each condition is a (field, operator, value) triple.
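The shape of the domain can be checked in plain Python, outside Odoo. This is a sketch only: employee_domain is a hypothetical helper, not part of the Odoo API, and it just builds the list that search() expects.

```python
def employee_domain(user_ids):
    """Build a search domain for hr.employee from a list of user ids."""
    if not user_ids:
        return []  # empty domain: no restriction
    # One condition is one (field, operator, value) triple inside the list
    return [('user_id', 'in', user_ids)]

print(employee_domain([9]))  # [('user_id', 'in', [9])]
```

As far as I know, the 'domain' key returned from an onchange method also expects a list like this rather than a recordset, so returning {'domain': {'employee_id': employee_domain(user_ids)}} may be closer to what the method in the question needs. Treat that as an assumption to verify against your Odoo version.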

Python RegEx code to detect specific features in a sentence

I created a simple word feature detector. So far I have been able to find particular features (jumbled within the string), but the algorithm gets confused by certain sequences of words. Let me illustrate:
import re
from nltk.tokenize import word_tokenize
negative_descriptors = ['no', 'unlikely', 'no evidence of']
negative_descriptors = '|'.join(negative_descriptors)
negative_trailers = ['not present', 'not evident']
negative_trailers = '|'.join(negative_descriptors)
keywords = ['disc prolapse', 'vertebral osteomyelitis', 'collection']
def feature_match(message, keywords, negative_descriptors):
    if re.search(r"("+negative_descriptors+")" + r".*?" + r"("+keywords+")", message): return True
    if re.search(r"("+keywords+")" + r".*?" + r"("+negative_trailers+")", message): return True
The above returns True for the following messages:
message = 'There is no evidence of a collection.'
message = 'A collection is not present.'
That is correct as it implies that the keyword/condition I am looking for is NOT present. However, it returns None for the following messages:
message = 'There is no evidence of disc prolapse, collection or vertebral osteomyelitis.'
message = 'There is no evidence of disc prolapse/vertebral osteomyelitis/ collection.'
It seems to be matching 'or vertebral osteomyelitis' in the first message and '/ collection' in the second message as negative matches, but this is wrong and implies that the message reads 'the condition that I am looking for IS present'. It should really be returning True instead.
How do I prevent this?
There are several problems with the code you posted:
negative_trailers = '|'.join(negative_descriptors) should be negative_trailers = '|'.join(negative_trailers).
You should also convert your keywords list to a string, as you did for the other lists, so that it can be passed to a regex.
There is no need to write the r prefix three times in your regex.
After these corrections your code should look like this:
negative_descriptors = ['no', 'unlikely', 'no evidence of']
negative_descriptors = '|'.join(negative_descriptors)
negative_trailers = ['not present', 'not evident']
negative_trailers = '|'.join(negative_trailers)
keywords = ['disc prolapse', 'vertebral osteomyelitis', 'collection']
keywords = '|'.join(keywords)
if re.search(r"("+negative_descriptors+").*("+keywords+")", message): neg_desc_present = True
if re.search(r"("+keywords+").*("+negative_trailers+")", message): neg_desc_present = True
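For experimenting with the corrected patterns, the two searches can be wrapped in a function that always returns a bool. This is a sketch under a few assumptions of mine, not the original poster's design: the name is_negated is made up, re.escape is added so that phrases are matched literally, and \b word boundaries are added so that 'no' does not match inside words such as 'not'.

```python
import re

def is_negated(message, keywords, descriptors, trailers):
    """Return True if a keyword follows a negative descriptor
    or precedes a negative trailer in the message."""
    # re.escape guards against regex metacharacters inside the phrases
    kw = '|'.join(map(re.escape, keywords))
    desc = '|'.join(map(re.escape, descriptors))
    trail = '|'.join(map(re.escape, trailers))
    before = re.search(rf"\b(?:{desc})\b.*?\b(?:{kw})\b", message)
    after = re.search(rf"\b(?:{kw})\b.*?\b(?:{trail})\b", message)
    return bool(before or after)

keywords = ['disc prolapse', 'vertebral osteomyelitis', 'collection']
descriptors = ['no', 'unlikely', 'no evidence of']
trailers = ['not present', 'not evident']

for msg in ['There is no evidence of a collection.',
            'A collection is not present.',
            'There is no evidence of disc prolapse, collection or vertebral osteomyelitis.',
            'There is a collection.']:
    print(is_negated(msg, keywords, descriptors, trailers), msg)
```

This remains a coarse heuristic: any keyword appearing anywhere after a negative word counts as negated, so a sentence that negates one finding and then asserts another would still return True.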

Creating dictionaries inside dictionaries based on user input

New to programming, and teaching myself by making a to-do list app that stores things to do in a dictionary called ThingsToDo, based on user input. I'm using the dict.update function, which is working, but I want to add a feature so that in the dictionary ThingsToDo, every time a user inputs a new thing to do, it stores the new item as a dictionary inside ThingsToDo, with things like Due Date, and Status inside that sub-dictionary. How can I do this?
Here is the code so far (just started):
ThingsToDo = {}
while True:
    item = input("What do you need to do? ")
    DueDate = input("When do you need to do it by? ")
    status = "Not done."
    ThingsToDo.update({
        "Item": item,
        "Due Date": DueDate,
        "Status": status,
    })
    print(ThingsToDo)
First, I suggest you think about what the key(s) of your dictionary(ies) will be. Here is a simple solution that uses "what you need to do" as the key. It works well:
ThingsToDo = {}
while True:
    item = input("What do you need to do? ")
    DueDate = input("When do you need to do it by? ")
    status = "Not done."
    ThingsToDo.update({item: {"Due Date": DueDate, "Status": status}})
    print(ThingsToDo)
Output example:
What do you need to do? Wash the car
When do you need to do it by? 2018/09/01
{'Wash the car': {'Due Date': '2018/09/01', 'Status': 'Not done.'}}
What do you need to do? Sort papers
When do you need to do it by? 2018/08/23
{'Wash the car': {'Due Date': '2018/09/01', 'Status': 'Not done.'},
'Sort papers': {'Due Date': '2018/08/23', 'Status': 'Not done.'}}
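Once each task is a sub-dictionary, its fields can be read and updated through two levels of keys. A small sketch without input() so it can be run as is; the helper names add_task and mark_done are made up for this example.

```python
def add_task(tasks, item, due_date):
    """Store a new task as a sub-dictionary keyed by its description."""
    tasks[item] = {"Due Date": due_date, "Status": "Not done."}

def mark_done(tasks, item):
    """Update the status inside the sub-dictionary of an existing task."""
    tasks[item]["Status"] = "Done."

things_to_do = {}
add_task(things_to_do, "Wash the car", "2018/09/01")
add_task(things_to_do, "Sort papers", "2018/08/23")
mark_done(things_to_do, "Sort papers")
print(things_to_do["Sort papers"])
# {'Due Date': '2018/08/23', 'Status': 'Done.'}
```

One consequence of using the description as the key: entering the same description twice overwrites the earlier task.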
You have "many" things to do, so you can put that dict in a list or in another dict.
You cannot put it in a set, because set elements have to be hashable and a dict is not.
If you choose a dict, you have to choose a "key" for each element, right?
You can read about "data structures" here .
Let's try a list:
thingsToDo = []
newThingToDo = {
    'Item': 'I need a haircut',
    'DueDate': 'Right now',
    'Status': 'Done'
}
thingsToDo.append(newThingToDo)
Using a dict would allow you to order your tasks, but I'm not sure that is your question. Let's try ordering your tasks by adding a number.
thingsToDo = {}
newThingToDo = {
    'Item': 'I need a haircut',
    'DueDate': 'Right now',
    'Status': 'Done'
}
thingsToDo[5] = newThingToDo
So your haircut is task number 5.
If you choose the dict way, your code could look like this:
# -*- coding: utf-8 -*-
thingsToDo = {}
while True:
    n = int(input("How many tasks do you want to add?"))
    for i in range(n):
        item = input("What do you need to do? ")
        duedate = input("When do you need to do it by? ")
        status = "Not done."
        newThingToDo = {
            'Item': item,
            'DueDate': duedate,
            'Status': status
        }
        # Number tasks by how many are stored, so a later batch
        # does not overwrite tasks 0..n-1 from an earlier one
        thingsToDo[len(thingsToDo)] = newThingToDo
    print(thingsToDo)

Python - Delete Conditional Lines of Chat Log File

I am trying to delete my conversation from a chat log file and only analyse the other person's data. When I load the file into Python like this:
with open(chatFile) as f:
    chatLog = f.read().splitlines()
The data is loaded like this (much longer than the example):
'My Name',
'08:39 Chat data....!',
"Other person's name",
'08:39 Chat Data....',
'08:40 Chat data...',
'08:40 Chat data...?',
I would like it to look like this:
"Other person's name",
'08:39 Chat Data....',
'08:40 Chat data...',
'08:40 Chat data...?',
I was thinking of using an if statement with regular expressions:
name = 'My Name'
for x in chatLog:
    if x == name:
        "delete all data below until you reach the other person's name"
I could not get this code to work properly. Any ideas?
I think you misunderstand what "regular expressions" means. It doesn't mean you can write English-language instructions and the Python interpreter will understand them. Either that, or you were using pseudocode, which makes it impossible to debug.
If you don't have the other person's name, we can probably assume it doesn't begin with a number. Assuming all of the non-name lines do begin with a number, as in your example:
name = 'My Name'
skipLines = False
results = []
for x in chatLog:
    if x == name:
        skipLines = True
    elif not x[0].isdigit():
        skipLines = False
    if not skipLines:
        results.append(x)
others = []
on = True
for line in chatLog:
    if not line[0].isdigit():
        on = line != name
    if on:
        others.append(line)
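The loops above index line[0], which raises IndexError on an empty line. A hedged variant wrapped in a function (the name drop_own_messages is made up here) that treats empty lines as part of the current speaker's block, under the same assumption that message lines start with a digit and names do not:

```python
def drop_own_messages(chat_log, own_name):
    """Keep only lines that belong to speakers other than own_name."""
    kept = []
    keep = True
    for line in chat_log:
        # A non-empty line that does not start with a digit is taken to be a name
        if line and not line[0].isdigit():
            keep = line != own_name
        if keep:
            kept.append(line)
    return kept

chat_log = ['My Name',
            '08:39 Chat data....!',
            "Other person's name",
            '08:39 Chat Data....',
            '',
            '08:40 Chat data...?']
print(drop_own_messages(chat_log, 'My Name'))
```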
You can delete all of your messages using re.sub with an empty string as the second argument which is your replacement string.
Assuming each chat message starts on a new line beginning with a time stamp, and that nobody's name can begin with a digit, the regular expression pattern re.escape(yourname) + r',\n(?:\d.*?\n)*' should match all of your messages, and then those matches can be replaced with the empty string.
import re
with open(chatfile) as f:
    chatlog = f.read()
yourname = 'My Name'
pattern = re.escape(yourname) + r',\n(?:\d.*?\n)*'
others_messages = re.sub(pattern, '', chatlog)
print(others_messages)
This will work to delete the messages of any user from any chat log where an arbitrary number of users are chatting.
