Python--adding list into dict (beginner)

I'm very new to programming (taking my first class in it now), so bear with me for format issues and misunderstandings, or missing easy fixes.
I have a dict with tweet data: 'user' as keys and then 'text' as their values. My goal here is to find the tweets where they are replying to another user, signified by starting with the # symbol, and then make a new dict that contains the author's user and the users of everyone he replied to. That's the fairly simple if statement I have below. I was also able to use the split function to isolate the username of the person they are replying to (the function takes all the text between the # symbol and the next space after it).
st = '#'
en = ' '
task1dict = {}
for t in a, b, c, d, e, f, g, h, i, j, k, l, m, n:
    if t['text'][0] == '#':
        user = t['user']
        repliedto = t['text'].split(st)[-1].split(en)[0]
        task1dict[user] = [repliedto]
Username1 replied to username2. Username2 replied to both username3 and username5.
I am trying to create a dict (called tweets1) that reads something like:
'user':'repliedto'
username1:[username2]
username2:[username3, username5]
etc.
Is there a better way to isolate the usernames, and then put them into a new dict? Here's a 2 entry sample of the tweet data:
{"user":"datageek88","text":"#sundevil1992 good question! #joeclarknet Is this on the exam?"},
{"user":"joeclarkphd","text":"Exam questions will be answered in due time #sundevil1992"}
I am now able to add them to a dict, but it would only save one 'repliedto' for each 'user', so instead of showing username2 have replied to both 3 and 5, it just shows the latest one, 5:
{'username1': ['username2'],
'username2': ['username5']}
Again, if I'm making a serious no-no anywhere in here, I apologize, and please show me what I'm doing wrong!

Replace the last line with
task1dict.setdefault(user, [])
task1dict[user].append(repliedto)
You were overwriting the user's replied-to list each time you assigned to it. The setdefault method sets the key to an empty list if the key doesn't already exist. Then just append to the list.
EDIT: same code using a set for uniqueness.
task1dict.setdefault(user, set())
task1dict[user].add(repliedto)
For a set you add an element with add, whereas for a list you append an element with append.
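As a minimal sketch of the setdefault approach in a complete loop: the tweets list below is hypothetical sample data (not the asker's real variables a to n), and the extraction line is the one from the question.

```python
# Hypothetical sample data: username2 replies in two separate tweets.
tweets = [
    {"user": "username1", "text": "#username2 hello"},
    {"user": "username2", "text": "#username3 hi there"},
    {"user": "username2", "text": "#username5 thanks"},
    {"user": "username4", "text": "not a reply"},
]

task1dict = {}
for t in tweets:
    if t["text"].startswith("#"):
        user = t["user"]
        # Same extraction as in the question: text between the last '#'
        # and the next space after it.
        repliedto = t["text"].split("#")[-1].split(" ")[0]
        # Create an empty list the first time a user is seen, then append.
        task1dict.setdefault(user, []).append(repliedto)

print(task1dict)
```

Because setdefault returns the stored list, the create-if-missing step and the append can be chained on one line.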

I might do it like this. Use the following regular expression to identify all usernames.
r"#([^\s]*)"
It means: look for the # symbol, then capture all of the following characters that aren't whitespace. A defaultdict is simply a dictionary that returns a default value if the key isn't found. In this case, I specify an empty set as the default in the event that we are adding a new key.
import re
from collections import defaultdict
tweets = [{"user":"datageek88","text":"#sundevil1992 good question! #joeclarknet Is this on the exam?"},
          {"user":"joeclarkphd","text":"Exam questions will be answered in due time #sundevil1992"}]
from_to = defaultdict(set)
for tweet in tweets:
    if "#" in tweet['text']:
        user = tweet['user']
        for replied_to in re.findall(r"#([^\s]*)", tweet['text']):
            from_to[user].add(replied_to)
print(from_to)
Output (the order of the elements within each set may vary)
defaultdict(<class 'set'>, {'datageek88': {'sundevil1992', 'joeclarknet'},
'joeclarkphd': {'sundevil1992'}})

Related

Import and insert word in sequence in Python

I want to import and insert words in sequence and NOT RANDOMLY: each registration attempt uses a single username and stops when the registration is completed. Then log out and begin a new registration with the next username in the list if the REGISTRATION FAILED, and skip if the REGISTRATION SUCCEEDED.
I'm really confused because I have no clue. I've tried this code, but it chooses randomly, and I have no idea how to use a for loop.
import random
Copy = driver.find_element_by_xpath('XPATH')
Copy.click()
names = [
"Noah" ,"Liam" ,"William" ,"Anthony"
]
idx = random.randint(0, len(names) - 1)
print(f"Picked name: {names[idx]}")
Copy.send_keys(names[idx])
How can I make it choose the next word in sequence and NOT RANDOMLY
Any Help Please

I am going to assume that you are happy with what the code does, with the exception that the names it picks are random. This narrows everything down to one line, namely the one that picks names randomly:
idx = random.randint(0, len(names) - 1)
Simple enough, you want "the next word in sequence and NOT RANDOMLY":
https://docs.python.org/3/tutorial/datastructures.html#more-on-lists
If you take a look at the link I've provided, you can see that lists have a pop() method, returning and removing some element from the list. We want the first one so we will provide 0 as the argument for the pop method.
We modify the line to look something like this
name = names.pop(0)
Now you still want to have the for-loop that will loop over all of the actions including name picking so you encapsulate all of the code in a for-loop:
names = [
    "Noah", "Liam", "William", "Anthony"
]
for i in range(len(names)):
    # ...
    Copy = driver.find_element_by_xpath('XPATH')
    Copy.click()
    name = names.pop(0)
    print(f"Picked name: {name}")
    Copy.send_keys(name)
    # ...
You might notice that the names list is not inside the for-loop. That is because we don't want to reassign the list every time we try to use a new name.
If you're completely unsure how for-loops work or how to implement one yourself, you should probably start by reading about how they work.
https://docs.python.org/3/tutorial/controlflow.html?highlight=loop#for-statements
Last but not least, you can see some # ... comments in my example indicating where the logic will probably go for the other part of your question: "Then log out and begin a new registration with the next username in the list if the REGISTRATION FAILED, and skip if the REGISTRATION SUCCEEDED." I don't think I can help you with that, since there is simply not enough context or examples in your question.
Refer to this guide explaining how to ask a well formulated question so we can help you more next time.
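For what it's worth, here is a rough sketch of one possible reading of that requirement: try the names strictly in list order, move on to the next name after a failure, and stop after the first success. Both register_in_order and fake_try_register are hypothetical names; fake_try_register only stands in for the real Selenium steps, which the question does not show.

```python
def register_in_order(names, try_register):
    """Try each name in list order; return the first one that registers, else None."""
    for name in names:  # a plain for loop walks the list in sequence, never randomly
        print(f"Trying name: {name}")
        if try_register(name):
            return name  # registration succeeded, so stop here
        # registration failed: fall through to the next name in the list
    return None


def fake_try_register(name):
    # Hypothetical stand-in for the browser automation; pretends only "William" works.
    return name == "William"


names = ["Noah", "Liam", "William", "Anthony"]
result = register_in_order(names, fake_try_register)
print(f"Registered with: {result}")
```

Iterating directly over the list also leaves names intact, unlike pop(0), which empties it as it goes.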

Python - Searching a dictionary for strings

Basically, I have a troubleshooting program in which I want the user to enter their input. I then take this input and split it into separate words. After that, I want to create a dictionary from the contents of a .CSV file, with recognisable keywords as the keys and the second column, the solutions, as the values. Finally, I want to check whether any of the words from the user's split input is a key in the dictionary and, if so, print the solution.
The problem is that, although I can do all of the above, the loop reports every unrecognised word. If my input was 'My phone is wet', and 'wet' was a recognisable keyword, it would say 'Not recognised', 'Not recognised', 'Not recognised', and only then print the solution. It says 'Not recognised' three times because the strings 'My', 'phone' and 'is' are not recognised.
So how do I test whether a user's split input is in my dictionary without it outputting 'Not recognised' for each word?
Sorry if this was unclear, I'm quite confused by the whole matter.
Code:
import csv, easygui as eg
KeywordsCSV = dict(csv.reader(open('Keywords and Solutions.csv')))
Problem = eg.enterbox('Please enter your problem: ', 'Troubleshooting').lower().split()
for Problems, Solutions in (KeywordsCSV.items()):
    pass
Note, I have the pass there, because this is the part I need help on.
My CSV file consists of:
problemKeyword | solution
For example;
wet Put the phone in a bowl of rice.

Your code reads like some ugly code golf. Let's clean it up before we look at how to solve the problem
import easygui as eg
import csv
# KeywordsCSV = dict(csv.reader(open('Keywords and Solutions.csv')))
# why are you nesting THREE function calls? That's awful. Don't do that.
# KeywordsCSV should be named something different, too. `problems` is probably fine.
with open("Keywords and Solutions.csv") as f:
    reader = csv.reader(f)
    problems = dict(reader)
problem = eg.enterbox('Please enter your problem: ', 'Troubleshooting').lower().split()
# this one's not bad, but I lowercased your `Problem` because capital-case
# words are idiomatically class names. Chaining this many functions together isn't
# ideal, but for this one-shot case it's not awful.
Let's break a second here and notice that I changed something on literally every line of your code. Take time to familiarize yourself with PEP8 when you can! It will drastically improve any code you write in Python.
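Anyway, once you've got a problems dict, and a word that should be a KEY in that dict, you can look it up. Note that problem is a list of words after .split(), and a list can't be used as a dict key, so check each word rather than the list itself:
for word in problem:
    if word in problems:
        solution = problems[word]
or even using the default return of dict.get:
solution = problems.get(word)
# solution is None if word is not a key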
If you wanted to loop this, you could do something like:
while True:
    problem = eg.enterbox(...)  # as above
    solutions = [problems[word] for word in problem if word in problems]
    if not solutions:
        # invalid problem, warn the user
        continue
    # display the solutions? Do whatever it is you're doing with them and...
    break

Just have a boolean and an if after the loop that only runs if none of the words in the sentence were recognized.
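A minimal sketch of that flag idea, assuming the keywords are already loaded into a dict. The problems dict here is a hypothetical stand-in for the CSV contents, and find_solutions is an illustrative name.

```python
def find_solutions(words, problems):
    """Return the solutions for every recognised keyword in words."""
    found = False
    solutions = []
    for word in words:
        if word in problems:
            found = True
            solutions.append(problems[word])
    if not found:
        # Reported once, after the loop, instead of once per unrecognised word.
        print("Not recognised")
    return solutions


# Hypothetical keyword table standing in for the CSV file.
problems = {"wet": "Put the phone in a bowl of rice."}
words = "My phone is wet".lower().split()
for solution in find_solutions(words, problems):
    print(solution)
```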

I think you might be able to use something like:
for word in Problem:
    if word in KeywordsCSV:
        print(KeywordsCSV[word])
or the list comprehension:
[KeywordsCSV[word] for word in Problem if word in KeywordsCSV]

Parsing JSON in Python (Reverse dictionary search)

I'm using Python and "requests" to practice using APIs. I've had success with basic requests and parsing, but I'm having difficulty with list comprehensions for a more complex project.
I requested from a server and got a dictionary. From there, I used:
participant_search = (match1_request['participantIdentities'])
to extract the value of the participantIdentities key, which gave me the following data:
[{'player':
{'summonerName': 'Crescent Bladex',
'matchHistoryUri': '/v1/stats/player_history/NA1/226413119',
'summonerId': 63523774,
'profileIcon': 870},
'participantId': 1},
My goal here is to combine the summonerId and participantId into one list. That would normally be easy, but the order of participantIdentities is randomized, so the player I want information on will sometimes be first in the list and other times third.
So I can't use var = list[0] like I normally would.
I have access to the summonerId, so I'm thinking I can search the list for the summonerId, then somehow collect all the information around it. For instance, if I knew 63523774, then I could find the key for it. From there, is it possible to find the parent list of the key?
Any guidance would be appreciated.
Edit (Clarification):
Here's the data I'm working with: http://pastebin.com/spHk8VP0
Line 1691 is where the nested 'participantIdentities' list begins. It contains 10 dictionaries, each with two keys, "player" and "participantId".
My goal is to search these 10 dictionaries for the one dictionary that has the summonerId. The summonerId is something I already know before I make this request to the server.
So I'm looking for some sort of "search" method, that goes beyond "true/false". A search method that, if a value is found within an object, the entire dictionary (key:value) is given.

Not sure if I properly understood you, but would this work?
for i in range(len(match1_request['participantIdentities'])):
    if match1_request['participantIdentities'][i]['player']['summonerId'] == 63523774:
        # do whatever you want with it.
        break
i becomes the index you're searching for.

ds = match1_request['participantIdentities']
result_ = [d for d in ds if d["player"]["summonerId"] == 12345]
result = result_[0] if result_ else {}
See if it works for you.
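The same search can also be sketched with next() and a generator expression, which stops at the first match and hands back the whole dictionary. The function name find_participant and the second sample entry are made up for illustration; only the first entry's shape comes from the question.

```python
def find_participant(identities, summoner_id):
    """Return the first identity dict whose player has summoner_id, or None."""
    return next(
        (ident for ident in identities
         if ident["player"]["summonerId"] == summoner_id),
        None,  # default when no entry matches
    )


# Trimmed sample shaped like the question's data; the second entry is hypothetical.
identities = [
    {"player": {"summonerName": "Crescent Bladex", "summonerId": 63523774},
     "participantId": 1},
    {"player": {"summonerName": "Example Player", "summonerId": 44610089},
     "participantId": 2},
]

match = find_participant(identities, 44610089)
print(match["participantId"])
print(find_participant(identities, 12345))
```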

You can use a dict comprehension to build a dict which uses summonerIds as keys:
players_list = response['participantIdentities']
{p['player']['summonerId']: p['participantId'] for p in players_list}

I think what you are asking for is: "How do I get the stats for a given summoner?"
You'll need a mapping of participantId to summonerId.
For example, would it be helpful to know this?
summoner[1] = 63523774
summoner[2] = 44610089
...
If so, then:
# This is probably what you are asking for:
summoner = {ident['participantId']: ident['player']['summonerId']
            for ident in match1_request['participantIdentities']}
# Then you can do this:
summoner_stats = {summoner[p['participantId']]: p['stats']
                  for p in match1_request['participants']}
# And to look up a particular summoner's stats:
print(summoner_stats[44610089])
(ref: raw data you pasted)

match hex string with list indice

I'm building a de-identify tool. It replaces all names by other names.
We got a report that <name>Peter</name> met <name>Jane</name> yesterday. <name>Peter</name> is suspicious.
Output:
We got a report that <name>Billy</name> met <name>Elsa</name> yesterday. <name>Billy</name> is suspicious.
It can be done on multiple documents, and one name is always replaced by the same counterpart, so you can still understand who the text is talking about. BUT all documents have an ID referring to the person the file is about (I'm working with files in a public service), and only documents with the same person ID will be de-identified the same way, with the same names (the goal is to follow people's history and how it evolves). This is a security measure, so that when I hand over the tool to a third party, I don't hand over the key to my own documents with it.
So the same input, with a different ID, produces:
We got a report that <name>Henry</name> met <name>Alicia</name> yesterday. <name>Henry</name> is suspicious.
Right now, I'm hashing each name with the document ID as a salt, converting the hash to an integer, then subtracting the length of the name list until I can use that integer as an index into the list. But I feel like there should be a quicker, more straightforward approach?
It's really more of an algorithmic question, but if it's of any relevance, I'm working with Python 2.7. Please ask for more explanation if needed. Thank you!
I hope it's clearer this way ô_o Sorry when you are neck-deep in your code you forget others need a bigger picture to understand how you got there.
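The repeated subtraction described above gives the same result as a single modulo operation. A minimal sketch, assuming SHA-256 as the hash and a small made-up replacement list (the question names neither); pick_replacement is an illustrative name:

```python
import hashlib


def pick_replacement(name, doc_id, replacement_names):
    """Deterministically map (doc_id, name) to one entry of replacement_names."""
    # The document ID acts as the salt, as described in the question.
    salted = (str(doc_id) + ":" + name).encode("utf-8")
    digest = hashlib.sha256(salted).hexdigest()
    # int(digest, 16) turns the hex string into an integer;
    # % len(...) wraps it into a valid list index in one step.
    return replacement_names[int(digest, 16) % len(replacement_names)]


# Hypothetical replacement list, for illustration only.
replacement_names = ["Billy", "Elsa", "Henry", "Alicia"]
first = pick_replacement("Peter", 42, replacement_names)
again = pick_replacement("Peter", 42, replacement_names)
print(first == again)
```

One caveat with any hash-to-index scheme: two different real names can land on the same replacement, so a short replacement list makes collisions likely.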

As @LutzHorn pointed out, you could just use a dict to map real names to false ones.
You could also just do something like:
existing_names = []
for nameoccurrence in original_text:
    if nameoccurrence.name not in existing_names:
        nameoccurrence.id = len(existing_names)
        existing_names.append(nameoccurrence.name)
    else:
        nameoccurrence.id = existing_names.index(nameoccurrence.name)
for idx, _ in enumerate(existing_names):
    existing_names[idx] = gimme_random_name()

Try using a dictionary of names.
import re
s = "We got a report that <name>Peter</name> met <name>Jane</name> yesterday. <name>Peter</name> is suspicious."
names = {"Peter": "Billy", "Jane": "Elsa"}
for name in re.findall("<name>([a-zA-Z]+)</name>", s):
    s = re.sub("<name>" + name + "</name>", "<name>" + names[name] + "</name>", s)
print(s)
Output:
We got a report that <name>Billy</name> met <name>Elsa</name> yesterday. <name>Billy</name> is suspicious.
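A possible variation on the dictionary idea is to do the replacement in a single pass by giving re.sub a function instead of a replacement string. This is only a sketch: deidentify and NAME_TAG are illustrative names, and leaving unknown names untouched is an assumption, not something the question specifies.

```python
import re

NAME_TAG = re.compile(r"<name>([^<]+)</name>")


def deidentify(text, names):
    """Replace every tagged name in one pass; unknown names are left as they are."""
    def swap(match):
        original = match.group(1)
        return "<name>" + names.get(original, original) + "</name>"
    return NAME_TAG.sub(swap, text)


names = {"Peter": "Billy", "Jane": "Elsa"}
text = ("We got a report that <name>Peter</name> met <name>Jane</name> "
        "yesterday. <name>Peter</name> is suspicious.")
print(deidentify(text, names))
```

Because each tag is visited exactly once, a replacement that happens to be someone else's real name (say Peter mapped to Jane) is not replaced a second time.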

Using Strings to Name Hash Keys?

I'm working through a book called "Head First Programming," and there's a particular part where I'm confused as to why they're doing this.
There doesn't appear to be any reasoning for it, nor any explanation anywhere in the text.
The issue in question is in using multiple-assignment to assign split data from a string into a hash (which doesn't make sense as to why they're using a hash, if you ask me, but that's a separate issue). Here's the example code:
line = "101;Johnny 'wave-boy' Jones;USA;8.32;Fish;21"
s = {}
(s['id'], s['name'], s['country'], s['average'], s['board'], s['age']) = line.split(";")
I understand that this will take the string line and split it up into each named part, but I don't understand why what I think are keys are being named by using a string, when just a few pages prior, they were named like any other variable, without single quotes.
The purpose of the individual parts is to be searched based on an individual element and then printed on screen. For example, being able to search by ID number and then return the entire thing.
The language in question is Python, if that makes any difference. This is rather confusing for me, since I'm trying to learn this stuff on my own.
My personal best guess is that it doesn't make any difference and that it was personal preference on part of the authors, but it bewilders me that they would suddenly change form like that without it having any meaning, and further bothers me that they don't explain it.
EDIT: So I tried printing the id key both with and without single quotes around the name, and it worked perfectly fine, either way. Therefore, I'd have to assume it's a matter of personal preference, but I still would like some info from someone who actually knows what they're doing as to whether it actually makes a difference, in the long run.
EDIT 2: Apparently, it doesn't make any sense as to how my Python interpreter is actually working with what I've given it, so I made a screen capture of it working https://www.youtube.com/watch?v=52GQJEeSwUA

I don't understand why what I think are keys are being named by using a string, when just a few pages prior, they were named like any other variable, without single quotes
The answer is right there. If there's no quote, mydict[s], then s is a variable, and you look up the key in the dict based on what the value of s is.
If it's a string, then you look up literally that key.
So, in your example s[name] won't work as that would try to access the variable name, which is probably not set.
EDIT: So I tried printing the id key both with and without single
quotes around the name, and it worked perfectly fine, either way.
That's just pure luck... There's a built-in function called id:
>>> id
<built-in function id>
Try another name, and you'll see that it won't work.
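A short sketch of the difference, using the question's dictionary name s. The unquoted id works only because it refers to the built-in function, which is hashable and so is accepted as a (different) key; country is just an example of a name that has not been defined.

```python
s = {}
s['id'] = "101"   # the key is the string 'id'
s[id] = "oops"    # the key is the built-in function object id

print(len(s))     # the two assignments created two separate entries
print(s['id'])
print(s[id])

try:
    s[country]    # country was never defined as a variable
except NameError as error:
    print("NameError:", error)
```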

Actually, as it turns out, for dictionaries (Python's term for hashes) there is a semantic difference between having the quotes there and not.
For example:
s = {}
s['test'] = 1
s['othertest'] = 2
defines a dictionary called s with two keys, 'test' and 'othertest.' However, if I tried to do this instead:
s = {}
s[test] = 1
I'd get a NameError exception, because this would be looking for an undefined variable called test whose value would be used as the key.
If, then, I were to type this into the Python interpreter:
>>> s = {}
>>> s['test'] = 1
>>> s['othertest'] = 2
>>> test = 'othertest'
>>> print s[test]
2
>>> print s['test']
1
you'll see that using test as a key with no quotes uses the value of that variable to look up the associated entry in the dictionary s.
Edit: Now, the REALLY interesting question is why using s[id] gave you what you expected. The name "id" is not a keyword; it is a built-in function in Python that gives you a unique id for an object passed as its argument. What in the world the Python interpreter is doing with the expression s[id] is a total mystery to me.
Edit 2: Watching the OP's Youtube video, it's clear that he's staying consistent when assigning and reading the hash about using id or 'id', so there's no issue with the function id as a hash key somehow magically lining up with 'id' as a hash key. That had me kind of worried for a while.
