Chatbot Conversation Objects, your approach? - python

I'm relatively new to programming and a recent project I've started to work on is a chatbot in python for an irc channel I frequent.
One of my goals is to have the bot able to very basically keep track of conversations it is having with users.
I presently an using conversation objects. When a user addresses the bot, it creates a new convo object and stores a log of the conversation, the current topic, etc. in that object.
When the user speaks, if their message matches the topic of the conversation, it chooses a response based on what they've said and the new topic.
For example, if the bot joins, and a user says: "Hello, bot." the conversation will be created and the topic set to "greeting".
the bot will say hello back and if the user asks: "What's up?", the bot will change the topic to "currentevents", and reply with "not much" or similar.
topic have related topic, and if the bot notices a sudden change to a topic not marked as related (questions are exceptions), it will act slightly confused and taken aback.
My question is though: I feel like my method is a bit overly complicated and unnecessary. I'm sure objects aren't the best thing to use. What would be another approach to keeping track of a conversation and it's topic? Be it a better way or worse, I'm just looking for ideas and a little brainstorming.
Before you say this isn't the right place, I've tried asking on programmers.stackexchange.com, but I didn't receive a relevant response, just someone who misunderstood me. I'm hoping I can get some more feedback on a more active site. In a way this is code help :)
Here's the code for my current approach. There are still a few bugs and I'm sure the code is far from efficient. Any tips or help on the code is welcomed.
def __init__(slef):
self.dicti_topics = {"None":["welcomed", "ask", "badbot", "leave"],
"welcomed":["welcomed", "howare", "badbot", "ask", "leave"],
"howare":["greetfinished", "badbot", "leave"]}
self.dicti_lines = {"hello":"welcomed", "howareyou":"howare", "goaway":"leave", "you'rebad":"badbot", "question":"asked"}
self.dicti_responce = dicti["Arriving dicti_responce"]
def do_actions(self):
if len(noi.recv) > 0:
line = False
##set vars
item = noi.recv.pop(0)
#update and trim lastrecv list
noi.lastrecv.append(item)
if len(noi.lastrecv) > 10: noi.lastrecv = noi.lastrecv[1:10]
args = item.split()
channel, user = args[0], args[1].split("!")[0]
message = " ".join(w for w in args[2:])
print "channel:", channel
print "User:", user
print "Message:", message
if re.match("noi", message):
if not user in noi.convos.keys():
noi.convos[user] = []
if not noi.convos[user]:
noi.convos[user] = Conversation(user)
noi.convos[user].channel = channel
line = "What?"
send(channel, line)
if re.match("hello|yo|hey|ohai|ello|howdy|hi", message) and (noi.jointime - time.time() < 20):
print "hello convo created"
if not user in noi.convos.keys():
noi.convos[user] = []
if not noi.convos[user]:
noi.convos[user] = Conversation(user, "welcomed")
noi.convos[user].channel = channel
#if user has an active convo
if user in noi.convos.keys():
##setvars
line = None
convo = noi.convos[user]
topic = convo.topic
#remove punctuation, "noi", and make lowercase
rmsg = message.lower()
for c in [".", ",", "?", "!", ";"]:
rmsg = rmsg.replace(c, "")
#print rmsg
rlist = rmsg.split("noi")
for rmsg in rlist:
rmsg.strip(" ")
#categorize message
if rmsg in ["hello", "yo", "hey", "ohai", "ello", "howdy", "hi"]: rmsg = "hello"
if rmsg in ["how do you do", "how are you", "sup", "what's up"]: rmsg = "howareyou"
if rmsg in ["gtfo", "go away", "shooo", "please leave", "leave"]: rmsg = "goaway"
if rmsg in ["you're bad", "bad bot", "stfu", "stupid bot"]: rmsg = "you'rebad"
#if rmsg in []: rmsg =
#if rmsg in []: rmsg =
#Question handling
r = r'(when|what|who|where|how) (are|is) (.*)'
m = re.match(r, rmsg)
if m:
rmsg = "question"
responce = "I don't know %s %s %s." % (m.group(1), m.group(3), m.group(2))
#dicti_lines -> {message: new_topic}
#if msg has an entry, get the new associated topic
if rmsg in self.dicti_lines.keys():
new_topic = self.dicti_lines[rmsg]
#dicti_topics
relatedtopics = self.dicti_topics[topic]
#if the topic is related, change topic
if new_topic in relatedtopics:
convo.change_topic(new_topic)
noi.convos[user] = convo
#and respond
if new_topic == "leave": line = random.choice(dicti["Confirm"])
if rmsg == "question": line = responce
else: line = random.choice(self.dicti_responce[new_topic])
#otherwise it's confused
else:
line = "Huh?"
if line:
line = line+", %s." % user
send(channel, line)
This is the do_action of a state in a state machine.

having clear goals is important in programming even before you set out deciding what objects and how. Unfortunately from what I've read above this isn't really clear.
So first forget the how of the program. forget about the objects, the code and what they do.
Now Imagine someone else was going to write for you the program. someone who is such a good programmer, they don't need you do tell them how to code. here are some questions they might ask you.
What is the purpose of your program in one sentence?
explain to me as simply as possible the main terms, IRC, Conversation.
what must it be able do? short bullet points.
explain in steps how you would use the program eg:
i type in
it then says
depending on weather... it gives me a list of this...
Having done this then forget about your convo object or whatever and think in terms of 1, 2 and 4. On Pen and paper think about the main elements of your problems i,e Conversations.Don't Just create Objects... you'll find them.
Now think about the relationships of those elements in terms of how they interact. i.e.
"Bot adds message to Topic, User adds message to Topic, messages from Topic are sent to Log."
this will help you find what the objects are, what they must do and what information they'll need to store.
Having said all that, I would say your biggest problem is your taking on more than you can chew. To begin with, for a computer to recognise words and put them into topics is quite complicated and involves linguistics and/or statistics. As a new programmer I'd tend to avoid these areas because they will simply let you down and in the process kill your motivation. start small... then go BIG. try messing with GUI programming, then make a simple calculator and stuff...

Related

KeyError with Riot API Matchv5 When Trying To Pull Data

I'm trying to pull a list of team and player stats from match IDs. Everything looks fine to me but when I run my "for loops" to call the functions for pulling the stats I want, it just prints the error from my try/except block. I'm still pretty new to python and this is my first project so I've tried everything I can think of in the past few days but no luck. I believe the problem is with my actual pull request but I'm not sure as I'm also using a GitHub library I found to help me with the Riot API while I change and update it to get the info I want.
def get_match_json(matchid):
url_pull_match = "https://{}.api.riotgames.com/lol/match/v5/matches/{}/timeline?api_key={}".format(region, matchid, api_key)
match_data_all = requests.get(url_pull_match).json()
# Check to make sure match is long enough
try:
length_match = match_data_all['frames'][15]
return match_data_all
except IndexError:
return ['Match is too short. Skipping.']
And then this is a shortened version of the stat function:
def get_player_stats(match_data, player):
# Get player information at the fifteenth minute of the game.
player_query = match_data['frames'][15]['participantFrames'][player]
player_team = player_query['teamId']
player_total_gold = player_query['totalGold']
player_level = player_query['level']
And there are some other functions in the code as well but I'm not sure they are faulty as well or if they are needed to figure out the error. But here is the "for loop" to call the request and defines the variable 'matchid'
for matchid_batch in all_batches:
match_data = []
for match_id in matchid_batch:
time.sleep(1.5)
if match_id == 'MatchId':
pass
else:
try:
match_entry = get_match_row(match_id)
if match_entry[0] == 'Match is too short. Skipping.':
print('Match', match_id, "is too short.")
else:
match_entry = get_match_row(match_id).reshape(1, -1)
match_data.append(np.array(match_entry))
except KeyError:
print('KeyError.')
match_data = np.array(match_data)
match_data.shape = -1, 17
df = pd.DataFrame(match_data, columns=column_titles)
df.to_csv('Match_data_Diamond.csv', mode='a')
print('Done Batch!')
Since this is my first project any help would be appreciated since I can't find any info on this particular subject so I really don't know where to look to learn why it's not working on my own.
I guess your issue was that the 'frame' array is subordinate to the array 'info'.
def get_match_json(matchid):
url_pull_match = "https://{}.api.riotgames.com/lol/match/v5/matches/{}/timeline?api_key={}".format(region, matchid, api_key)
match_data_all = requests.get(url_pull_match).json()
try:
length_match = match_data_all['info']['frames'][15]
return match_data_all
except IndexError:
return ['Match is too short. Skipping.']
def get_player_stats(match_data, player): # player has to be an int (1-10)
# Get player information at the fifteenth minute of the game.
player_query = match_data['info']['frames'][15]['participantFrames'][str(player)]
#player_team = player_query['teamId'] - It is not possibly with the endpoint to get the teamId
player_total_gold = player_query['totalGold']
player_level = player_query['level']
return player_query
This example worked for me. Unfortunately it is not possible to gain the teamId only through your API-endpoint. Usually the players 1-5 are in team 100 (blue side) and 6-10 in team 200 (red side).

How to make a Python program automatically prints what matched after iterating through lists

I have this Python code:
with open('save.data') as fp:
save_data = dict([line.split(' = ') for line in fp.read().splitlines()])
with open('brute.txt') as fp:
brute = fp.read().splitlines()
for username, password in save_data.items():
if username in brute:
break
else:
print("didn't find the username")
Here is a quick explanation; the save.data is a file that contains variables of Batch-file game (such as username, hp etc...) and brute.txt is a file that contains "random" strings (like what seen in wordlists used for brute-force).
save.data:
username1 = PlayerName
password1 = PlayerPass
hp = 100
As i said before, it's a Batch-file game so, no need to quote strings
brute.txt:
username
usrnm
username1
password
password1
health
hp
So, let's assume that the Python file is a "game hacker" that "brute" a Batch-file's game save file in hope of finding matches and when it does find, it retrieves them and display them to the user.
## We did all the previous code
...
>>> print(save_data["username1"])
PlayerName
Success! we retrieved the variables! But I want to make the program capable of displaying the variables it self (because I knew that "username1" was the match, that's why I chose to print it). What I mean is, I want to make the program print the variables that matched. E.g: If instead of "username1" in save.data there was "usrnm", it will surely get recognized after the "bruting" process because it's already in brute.txt. So, how to make the program print what matched? because I don't know if it's "username" or "username1" etc... The program does :p (of course without opening save.data) And of course that doesn't mean the program will search only for the username, it's a game and there should be other variables like gold/coins, hp etc... If you didn't understand something, kindly comment it and I will clear it up, and thanks for your time!
Use a dict such as this:
with open('brute.txt', 'r') as f:
# First get all the brute file stuff
lookup_dic = {word.strip(): None for word in f.readlines()}
with open('save.data', 'r') as f:
# Update that dict with the stuff from the save.data
lines = (line.strip().split(' = ') for line in f.readlines())
for lookup, val in lines:
if lookup in lookup_dic:
print(f"{lookup} matched and its value is {val}")
lookup_dic[lookup] = val
# Now you have a complete lookup table.
print(lookup_dic)
print(lookup_dic['hp'])
Output:
username1 matched and its value is PlayerName
password1 matched and its value is PlayerPass
hp matched and its value is 100
{'username': None, 'usrnm': None, 'username1': 'PlayerName', 'password': None, 'password1': 'PlayerPass','health': None, 'hp': '100'}
100

How to send scraped data through reddit bot

So I've got this bot that I want to use to reply with the box score of the mets game anytime someone says "mets score" on a specific subreddit. This is my first python project and I plan on using it on a dummy subreddit I created as a learning tool. I'm having trouble sending the scores from the website I scraped through the bot so it can appear in the reply to the "mets score" comments. Any suggestions?
import praw
import time
from lxml import html
import requests
from bs4 import BeautifulSoup
r = praw.Reddit(user_agent = 'my_first_bot')
r.login('user_name', 'password')
def scores():
soup = BeautifulSoup(requests.get("http://scores.nbcsports.com/mlb/scoreboard.asp?day=20160621&meta=true").content, "lxml")
table = soup.find("a",class_="teamName", text="NY Mets").find_previous("table")
a, b = [a.text for a in table.find_all("a",class_="teamName")]
inn, a_score, b_score = ([td.text for td in row.select("td.shsTotD")] for row in table.find_all("tr"))
print (" ".join(inn))
print ("{}: {}".format(a, " ".join(a_score)))
print ("{}: {}".format(b, " ".join(b_score)))
words_to_match = ['mets score']
cache = []
def run_bot():
print("Grabbing subreddit...")
subreddit = r.get_subreddit("random_subreddit")
print("Grabbing comments...")
comments = subreddit.get_comments(limit=40)
for comment in comments:
print(comment.id)
comment_text = comment.body.lower()
isMatch = any(string in comment_text for string in words_to_match)
if comment.id not in cache and isMatch:
print("match found!" + comment.id)
comment.reply('heres the score to last nights mets game...' scores())
print("reply successful")
cache.append(comment.id)
print("loop finished, goodnight")
while True:
run_bot()
time.sleep(120)
I think I'll just put you out of your misery ;). There are multiple issues with your code snippet:
comment.reply('heres the score to last nights mets game...' scores())
The .reply() method requires a string or an object that can have a good enough representation as a string. Assuming the method scores() returns a string, you should concatenate the two arguments, like this:
comment.reply('heres the score to last nights mets game...'+ scores())
It looks like your knowledge of basic python syntax and constructs is dusty. For a quick refresher see this.
Your method scores() doesn't return anything. It just prints out a bunch of lines (which I assume are for debugging purposes).
def scores():
soup = BeautifulSoup(requests.get("http://scores.nbcsports.com/mlb/scoreboard.asp?day=20160621&meta=true").content, "lxml")
.......
print (" ".join(inn))
print ("{}: {}".format(a, " ".join(a_score)))
print ("{}: {}".format(b, " ".join(b_score)))
Funnily enough you could use those exact strings for your return value (or maybe something else entirely, as suit your needs) like this:
def scores():
.......
inn_string = " ".join(inn)
a_string = "{}: {}".format(a, " ".join(a_score))
b_string = "{}: {}".format(b, " ".join(b_score))
return "\n".join([inn_string, a_string, b_string])
These should get you up and running.
More advice: Have you read the Reddit PRAW docs? You should. You should also probably use praw.helpers.comment_stream(). It's simple and easy to use and will handle retrieving new comments for you. Currently you try and fetch a maximum of 40 comments every 120 seconds. What happens when there are more than that many relevant comments in that 120 second span. You'll end up missing some of the comments you should've replied to. .comment_stream() will take care of rate limiting for you so that your bot can reply to each new comment which needs its attention at its own pace. Read more about this here.

Python - Delete Conditional Lines of Chat Log File

I am trying to delete my conversation from a chat log file and only analyse the other persons data. When I load the file into Python like this:
with open(chatFile) as f:
chatLog = f.read().splitlines()
The data is loaded like this (much longer than the example):
'My Name',
'08:39 Chat data....!',
'Other person's name',
'08:39 Chat Data....',
'08:40 Chat data...,
'08:40 Chat data...?',
I would like it to look like this:
'Other person's name',
'08:39 Chat Data....',
'08:40 Chat data...,
'08:40 Chat data...?',
I was thinking of using an if statement with regular expressions:
name = 'My Name'
for x in chatLog:
if x == name:
"delete all data below until you get to reach the other
person's name"
I could not get this code to work properly, any ideas?
I think you misunderstand what "regular expressions" means... It doesn't mean you can just write English language instructions and the python interpreter will understand them. Either that or you were using pseudocode, which makes it impossible to debug.
If you don't have the other person's name, we can probably assume it doesn't begin with a number. Assuming all of the non-name lines do begin with a number, as in your example:
name = 'My Name'
skipLines = False
results = []
for x in chatLog:
if x == name:
skipLines = True
elif not x[0].isdigit():
skipLines = False
if not skipLines:
results.append(x)
others = []
on = True
for line in chatLog:
if not line[0].isdigit():
on = line != name
if on:
others.append(line)
You can delete all of your messages using re.sub with an empty string as the second argument which is your replacement string.
Assuming each chat message starts on a new line beginning with a time stamp, and that nobody's name can begin with a digit, the regular expression pattern re.escape(yourname) + r',\n(?:\d.*?\n)*' should match all of your messages, and then those matches can be replaced with the empty string.
import re
with open(chatfile) as f:
chatlog = f.read()
yourname = 'My Name'
pattern = re.escape(yourname) + r',\n(?:\d.*?\n)*'
others_messages = re.sub(pattern, '', chatlog)
print(others_messages)
This will work to delete the messages of any user from any chat log where an arbitrary number of users are chatting.

Do I need to use transactions in google appengine

update 0
My def post() code has changed dramatically because originally it was base on a digital form which included both checkboxes and text entry fields, not just text entry fields, which is the current design to be more paper-like. However, as a result I have other problems which may be solved by one of the proposed solutions, but I cannot exactly follow that proposed solution, so let me try to explain new design and the problems.
The smaller problem is the inefficiency of my implementation because in the def post() I create a distinct name for each input timeslot which is a long string <courtname><timeslotstarthour><timeslotstartminute>. In my code this name is read in a nested for loop with the following snippet [very inefficient, I imagine].
tempreservation=courtname+str(time[0])+str(time[1])
name = self.request.get('tempreservation',None)
The more serious immediate problem is that my def post() code is never read and I cannot figure out why (and maybe it wasn't being read before, either, but I had not tested that far). I wonder if the problem is that for now I want both the post and the get to "finish" the same way. The first line below is for the post() and the second is for the get().
return webapp2.redirect("/read/%s" % location_id)
self.render_template('read.html', {'courts': courts,'location': location, ... etc ...}
My new post() is as follows. Notice I have left in the code the logging.info to see if I ever get there.
class MainPageCourt(BaseHandler):
def post(self, location_id):
logging.info("in MainPageCourt post ")
startTime = self.request.get('startTime')
endTime = self.request.get('endTime')
day = self.request.get('day')
weekday = self.request.get('weekday')
nowweekday = self.request.get('nowweekday')
year = self.request.get('year')
month = self.request.get('month')
nowmonth = self.request.get('nowmonth')
courtnames = self.request.get_all('court')
for c in courtnames:
logging.info("courtname: %s " % c)
times=intervals(startTime,endTime)
for courtname in courtnames:
for time in times:
tempreservation=courtname+str(time[0])+str(time[1])
name = self.request.get('tempreservation',None)
if name:
iden = courtname
court = db.Key.from_path('Locations',location_id,'Courts', iden)
reservation = Reservations(parent=court)
reservation.name = name
reservation.starttime = time
reservation.year = year
reservation.nowmonth = int(nowmonth)
reservation.day = int(day)
reservation.nowweekday = int(nowweekday)
reservation.put()
return webapp2.redirect("/read/%s" % location_id)
Eventually I want to add checking/validating to the above get() code by comparing the existing Reservations data in the datastore with the implied new reservations, and kick out to an alert which tells the user of any potential problems which she can address.
I would also appreciate any comments on these two problems.
end of update 0
My app is for a community tennis court. I want to replace the paper sign up sheet with an online digital sheet that mimics a paper sheet. As unlikely as it seems there may be "transactional" conflicts where two tennis appointments collide. So how do I give the second appointment maker a heads up to the conflict but also give the successful party the opportunity to alter her appointment like she would on paper (with an eraser).
Each half hour is a time slot on the form. People normally sign up for multiple half hours at one time before "submitting".
So in my code within a loop I do a get_all. If any get succeeds I want to give the user control over whether to accept the put() or not. I am still thinking the put() would be an all or nothing, not selective.
So my question is, do I need to make part of the code use an explicit "transaction"?
class MainPageCourt(BaseHandler):
def post(self, location_id):
reservations = self.request.get_all('reservations')
day = self.request.get('day')
weekday = self.request.get('weekday')
nowweekday = self.request.get('nowweekday')
year = self.request.get('year')
month = self.request.get('month')
nowmonth = self.request.get('nowmonth')
if not reservations:
for r in reservations:
r=r.split()
iden = r[0]
temp = iden+' '+r[1]+' '+r[2]
court = db.Key.from_path('Locations',location_id,'Courts', iden)
reservation = Reservations(parent=court)
reservation.starttime = [int(r[1]),int(r[2])]
reservation.year = int(r[3])
reservation.nowmonth = int(r[4])
reservation.day = int(r[5])
reservation.nowweekday = int(nowweekday)
reservation.name = self.request.get(temp)
reservation.put()
return webapp2.redirect("/read/%s" % location_id)
else:
... this important code is not written, pending ...
return webapp2.redirect("/adjust/%s" % location_id)
Have a look at optimistic concurrency control:
http://en.wikipedia.org/wiki/Optimistic_concurrency_control
You can check for the availability of the time slots in a given Court, and write the corresponding Reservations child entities only if their stat_time don't conflict.
Here is how you would do it for 1 single reservation using a ancestor Query:
#ndb.transactional
def make_reservation(court_id, start_time):
court = Court(id=court_id)
existing = Reservation.query(Reservation.start_time == start_time,
ancestor=court.key).fetch(2, keys_only=True)
if len(existing):
return False, existing[0]
return True, Reservation(start_time=start_time, parent=court.key).put()
Alternativly, if you make the slot part of the Reservation id, you can remove the query and construct the Reservation entity keys to check if they already exists:
#ndb.transactional
def make_reservations(court_id, slots):
court = Court(id=court_id)
rs = [Reservation(id=s, parent=court.key) for s in slots]
existing = ndb.get_multi(r.key for r in rs)
if any(existing):
return False, existing
return True, ndb.put_multi(rs)
I think you should always use transactions, but I don't think your concerns are best addressed by transactions.
I think you should implement a two-stage reservation system - which is what you see on most shopping bags and ticketing companies.
Posting the form creates a "reservation request" , which blocks out the time(s) as "in someone else's shopping bag" for 5-15 minutes
Users must submit again on an approval screen to confirm the times. You can give them the ability to update the conflicts on that screen too, and reset the 'reservation lock' on the timeslots as long as possible.
A cronjob - or a faked one that is triggered by a request coming in at a certain window - clears out expired reservation locks and returns the times back to the pool of available slots.

Categories