I need to write a function that takes an open file as its only parameter and returns a dictionary that maps a string to a list of strings and integers.
Each line in the text file has a username, first name, last name, age, gender and an e-mail address. The function should insert each person's information into a dictionary with their username as the key and a list of [last name, first name, e-mail, age, gender] as the value.
Basically, what I'm trying to do is open a text file that contains this:
ajones Alice Jones 44 F alice#alicejones.net
and return something like this:
{'ajones': ['Jones', 'Alice', 'alice#alicejones.net', 44, 'F']}
So far I have done this, but is there an easier way?
def create_dict(file_name):
    '''(io.TextIOWrapper) -> dict of {str: [str, str, str, int, str]}
    '''
    newdict = {}
    list2 = []
    for line in file_name:
        while line:
            list1 = line.split() #for a key, create a list of values
            if list2(0):
                value += list1(1)
            if list2(1):
                value += list1(2)
            if list2(2):
                value += list1(3)
            if list2(3):
                value += list1(4)
            newdict[list1(0)] = list2
    for next_line in file_name:
        list1 = line.split()
        newdict[list1(0)] = list1
    return newdict
def helper_func(fieldname):
    '''(str) -> int
    Returns the index of the field in the value list in the dictionary
    >>> helper_func('age')
    3
    '''
    # Compare strings with ==; "is" tests object identity, not equality.
    if fieldname == "lastname":
        return 0
    elif fieldname == "firstname":
        return 1
    elif fieldname == "email":
        return 2
    elif fieldname == "age":
        return 3
    elif fieldname == "gender":
        return 4
If you have Python 2.7+, you can use a dictionary comprehension:
{l[0]: l[1:] for l in (line.rstrip().split(' ') for line in f)}
my_dict = {}
for line in file_name:
    lineData = line.split()  # [username, first name, last name, age, gender, e-mail]
    my_dict[lineData[0]] = lineData[1:]  # username -> remaining fields
is a little easier, I think, although I'm not sure that's doing what you want.
I agree with the first answer; here is a slightly different version that matches the spec:
with open('filename', 'r') as f:
    result = {username: [lname, fname, email, int(age), sex] for username, fname, lname, age, sex, email in (line.split() for line in f)}
There are certainly easier ways to build your dictionary:
d={}
st='ajones Alice Jones 44 F alice#alicejones.net'
li=st.split()
d[li[0]]=li[1:]
print d
# {'ajones': ['Alice', 'Jones', '44', 'F', 'alice#alicejones.net']}
If you want to change the order of the fields, do so as you are storing them:
d={}
st='ajones Alice Jones 44 F alice#alicejones.net'
li=st.split()
li2=li[1:]
d[li[0]]=[li2[i] for i in (1,0,4,2,3)]
print d
# {'ajones': ['Jones', 'Alice', 'alice#alicejones.net', '44', 'F']}
Or, just use named tuples or a dictionary rather than a list for the data fields.
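Once you have that part right, you can use it with your file:
# untested...
def create_dict(file_name):
    newdict = {}
    with open(file_name) as fin:
        for line in fin:
            li = line.split()
            li2 = li[1:]          # [first, last, age, gender, email]
            li2[2] = int(li2[2])  # the age is at index 2 of li2, not of li
            # reorder to [last, first, email, age, gender]
            newdict[li[0]] = [li2[i] for i in (1, 0, 4, 2, 3)]
    return newdict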
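The named-tuple suggestion above could look roughly like the following. This is a minimal sketch in Python 3, not the original answer's code: the names Person and create_dict_from_file are made up for illustration, and it assumes every non-blank line has exactly six whitespace-separated fields in the order shown in the question. It takes an already open file object, as the question asks.

```python
import io
from collections import namedtuple

# Hypothetical record type; the field order follows the spec in the question.
Person = namedtuple("Person", ["last_name", "first_name", "email", "age", "gender"])


def create_dict_from_file(open_file):
    """Map each username to a Person built from one line of the file."""
    people = {}
    for line in open_file:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumed input order: username first last age gender email
        username, first, last, age, gender, email = fields
        people[username] = Person(last, first, email, int(age), gender)
    return people


# io.StringIO stands in for a real open file here.
sample = io.StringIO("ajones Alice Jones 44 F alice#alicejones.net\n")
people = create_dict_from_file(sample)
print(people["ajones"].age)    # 44
print(list(people["ajones"]))  # ['Jones', 'Alice', 'alice#alicejones.net', 44, 'F']
```

Fields can then be read by name, for example people["ajones"].email, and list(...) recovers the list form requested in the question.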
I am looking for a different way to access every key in a dictionary within a for loop. Below is example code in which I iterate over the fields of each line and pick the dictionary key with the help of a counter and an if statement. Is there another way to access the keys, without a counter or an if statement?
def string_to_dict(csv):
    dict = []
    tmp = csv.splitlines()
    for i in tmp:
        tmp_dict = {"vorname" : "none", "nachname" : "none", "email" : "none"};
        tmp_i= i.split(",")
        counter = 0;
        for si in tmp_i:
            if counter ==0:
                tmp_dict["vorname"] = si
                counter =counter + 1
            elif counter == 1:
                tmp_dict["nachname"] = si
                counter = counter + 1
            else:
                tmp_dict["email"] = si
        dict.append(tmp_dict)
csv = """Donald,Duck,d.duck#entenhausen.com
Wiley,Coyote,whiley#canyon.org
Road,Runner,roadrunner#canyon.org"""
There is no need for the loop if you already expect name, surname and email.
def string_to_dict(csv):
    dict = []
    tmp = csv.splitlines()
    for i in tmp:
        tmp_dict = {"vorname" : "none", "nachname" : "none", "email" : "none"}
        tmp_i = i.split(",")
        tmp_dict["vorname"] = tmp_i[0]
        tmp_dict["nachname"] = tmp_i[1]
        tmp_dict["email"] = tmp_i[2]
        dict.append(tmp_dict)
    return dict
We can keep iterating to improve the solution:
def string_to_dict(csv):
    dict = []
    tmp = csv.splitlines()
    for i in tmp:
        # None is a clearer "no value yet" marker than the string "none"
        tmp_dict = {"vorname" : None, "nachname" : None, "email" : None}
        tmp_i = i.split(",")
        tmp_dict["vorname"] = tmp_i[0]
        tmp_dict["nachname"] = tmp_i[1]
        tmp_dict["email"] = tmp_i[2]
        dict.append(tmp_dict)
    return dict
And even further (if you want to use a name that would shadow a built-in like dict, the naming convention is to append an underscore to it):
def string_to_dict(csv):
    dict_ = []
    for line in csv.splitlines():
        vor_name, nach_name, email = line.split(",")
        dict_.append({"vorname" : vor_name, "nachname" : nach_name, "email" : email})
    return dict_
And with list comprehensions:
def string_to_dict(csv):
    def _parse_item(vor_name, nach_name, email):
        return {"vorname" : vor_name, "nachname" : nach_name, "email" : email}
    return [_parse_item(*line.split(",")) for line in csv.splitlines()]
If you want minimal changes to what you have done so far, you can just get the list of keys and use the index value (the counter variable in your case), something like this:
for i in tmp:
    tmp_dict = {"vorname" : "none", "nachname" : "none", "email" : "none"}
    tmp_i = i.split(",")
    counter = 0
    keys = [*tmp_dict.keys()]  # List of keys
    for si in tmp_i:
        tmp_dict[keys[counter]] = si  # Key at index counter
        counter += 1
    dict.append(tmp_dict)
Sample Run:
>>> string_to_dict(csv)
[{'vorname': 'Donald', 'nachname': 'Duck', 'email': 'd.duck#entenhausen.com'}, {'vorname': 'Wiley', 'nachname': 'Coyote', 'email': 'whiley#canyon.org'}, {'vorname': 'Road', 'nachname': 'Runner', 'email': 'roadrunner#canyon.org'}]
Another note: you're naming the variable dict. You should avoid that, since it shadows Python's built-in dict type.
Let's start with the fact that you are not trying to iterate over a dictionary but to create a list of dictionaries from a CSV-formatted string.
Secondly, there are a lot of syntactic mistakes and other errors in your code.
Refrain from using built-in names such as "dict" as variable names.
You can use this code snippet as a start if it helps, but I recommend brushing up on Python syntax and best practices.
result = []
for line in csv.splitlines():
    vorname, nachname, email = line.split(",")
    result.append(
        {"vorname": vorname.strip(), "nachname": nachname.strip(), "email": email.strip()})
This can also be done using a list comprehension, but it is much less readable.
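If the goal is mainly to avoid the counter, pairing a fixed tuple of field names with the split values via zip is another option. A small sketch, assuming each line has exactly three comma-separated fields; the names FIELDS and string_to_dicts are illustrative and not from the question:

```python
FIELDS = ("vorname", "nachname", "email")  # assumed column order


def string_to_dicts(csv_text):
    """Return one dict per line, pairing FIELDS with the stripped values."""
    rows = []
    for line in csv_text.splitlines():
        values = [part.strip() for part in line.split(",")]
        rows.append(dict(zip(FIELDS, values)))
    return rows


csv_text = """Donald,Duck,d.duck#entenhausen.com
Wiley,Coyote,whiley#canyon.org
Road,Runner,roadrunner#canyon.org"""

rows = string_to_dicts(csv_text)
print(rows[0])  # {'vorname': 'Donald', 'nachname': 'Duck', 'email': 'd.duck#entenhausen.com'}
```

Note that zip stops at the shorter of its inputs, so a line with missing fields yields a dict with fewer keys rather than raising an error.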
I'm trying to create a dict that contains a list of users and their ssh-keys.
The list of users and the ssh-keys are stored in different YAML files, which I need to grab the info from. The files are "admins" and "users" and they look like:
Admins file:
admins:
  global:
    - bob
    - john
    - jimmy
    - hubert
SSH key file:
users:
  bob:
    fullname: Bob McBob
    ssh-keys:
      ssh-rsa "thisismysshkey"
  john:
    fullname: John McJohn
    ssh-keys:
      ssh-rsa "thisismysshkey"
So far I have this code:
import yaml
# open the admins file as "f"
f = open("./admins.sls", 'r')
# parse the YAML (safe_load avoids constructing arbitrary Python objects)
admins = yaml.safe_load(f)
# grab only the needed names and make a list
admins = admins['admins']['global']
# convert back to a dict with dummy values of 0
admin_dict = dict.fromkeys(admins, 0)
So at this point I have this dict:
print(admin_dict)
{'bob': 0, 'john': 0}
Now I want to loop through the list of names in "admins" and update each value (currently set to 0) with that user's ssh-key from the other file.
So I do:
f = open("./users.sls", 'r')
ssh_keys = yaml.safe_load(f)
for i in admins:
    admin_dict[k] = ssh_keys['users'][i]['ssh-keys']
but when running that for loop, only one value gets updated.
I'm kind of stuck here and way out of my Python depth... am I on the right track?
Edit:
I changed that last loop to be:
for i in admins:
    for key, value in admin_dict.items():
        admin_dict[key] = ssh_keys['users'][i]['ssh-keys']
and things look better. Is this valid?
With an admin.yaml file like:
admins:
  global:
    - bob
    - john
    - jimmy
    - hubert
And an ssh_keys.yaml like so:
users:
  bob:
    fullname: Bob McBob
    ssh-keys:
      ssh-rsa: "bob-rsa-key"
  john:
    fullname: John McJohn
    ssh-keys:
      ssh-rsa: "john-rsa-key"
  jimmy:
    fullname: Jimmy McGill
    ssh-keys:
      ssh-rsa: "jimmy-rsa-key"
      ssh-ecdsa: "jimmy-ecdsa-key"
You could do something like this, assuming you want to know which type of ssh key each user has (if not, index one level deeper in the dictionary comprehension, using the name of the specific key type):
import yaml
import pprint
def main():
    with open('admin.yaml', 'r') as f:
        admins_dict = yaml.load(f, yaml.SafeLoader)
    admins_list = admins_dict['admins']['global']
    with open('ssh_keys.yaml', 'r') as f:
        ssh_dict = yaml.load(f, yaml.SafeLoader)
    users_dict = ssh_dict['users']
    admins_with_keys_dict = {
        admin: users_dict[admin]['ssh-keys'] if admin in users_dict else None
        for admin in admins_list
    }
    pp = pprint.PrettyPrinter(indent=2)
    pp.pprint(admins_with_keys_dict)
if __name__ == '__main__':
    main()
Output:
{ 'bob': {'ssh-rsa': 'bob-rsa-key'},
'hubert': None,
'jimmy': {'ssh-ecdsa': 'jimmy-ecdsa-key', 'ssh-rsa': 'jimmy-rsa-key'},
'john': {'ssh-rsa': 'john-rsa-key'}}
Alternative Output if you only want the rsa keys:
{ 'bob': 'bob-rsa-key',
'hubert': None,
'jimmy': 'jimmy-rsa-key',
'john': 'john-rsa-key'}
Above output achieved making the following change to the dictionary comprehension:
admin: users_dict[admin]['ssh-keys']['ssh-rsa'] if admin in users_dict else None
                                    ^^^^^^^^^^^
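For the loop in the question itself, a single loop that uses the same variable for the dictionary key and for the lookup is probably enough; the nested version from the question's edit ends up assigning the last admin's keys to everyone. The sketch below illustrates this with plain dictionaries in place of the parsed YAML. The data is made up to mirror the files above, and giving admins without a user entry the value None is an assumption about the desired behaviour.

```python
admins = ["bob", "john", "jimmy", "hubert"]
ssh_keys = {
    "users": {
        "bob": {"fullname": "Bob McBob", "ssh-keys": {"ssh-rsa": "bob-rsa-key"}},
        "john": {"fullname": "John McJohn", "ssh-keys": {"ssh-rsa": "john-rsa-key"}},
    }
}

# Single loop: the name is both the key being set and the key being looked up.
admin_dict = {}
for name in admins:
    user = ssh_keys["users"].get(name)  # None if the admin has no user entry
    admin_dict[name] = user["ssh-keys"] if user else None

# Nested loop as in the question's edit, restricted to admins that have an entry:
# every pass of the outer loop overwrites all values, so only the last one survives.
nested = dict.fromkeys(["bob", "john"], 0)
for name in ["bob", "john"]:
    for key in nested:
        nested[key] = ssh_keys["users"][name]["ssh-keys"]

print(admin_dict["bob"])  # {'ssh-rsa': 'bob-rsa-key'}
print(nested["bob"])      # {'ssh-rsa': 'john-rsa-key'}
```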
I have an input file that I am trying to build a database from.
Each line looks like this:
Amy Shchumer, Trainwreck, I Feel Pretty, Snatched, Inside Amy Shchumer
Bill Hader,Inside Out, Trainwreck, Tropic Thunder
And so on.
The first string is an actor or actress, followed by the movies they played in.
The data isn't sorted and there is some stray whitespace around the entries.
I would like to create a dictionary that would look like this:
{'Trainwreck': {'Amy Shchumer', 'Bill Hader'}}
The key would be the movie and the value would be the actors in it, collected in a set.
def create_db():
    my_dict = {}
    raw_data = open('database.txt','r+')
    for line in raw_data:
        lst1 = line.split(",")  # to split by the commas
        len_row = len(lst1)
        lst2 = list(lst1)
        for j in range(1,len_row):
            my_dict[lst2[j]] = set([lst2[0]])
    print(my_dict)
It doesn't work: when a key already exists, the new actor should be added to the set that already holds the previous actor, but my code replaces the set instead.
I end up with:
'Trainwreck': {'Amy Shchumer'}, 'Inside Out': {'Bill Hader'}
def create_db():
    db = {}
    with open("database.txt") as data:
        for line in data.readlines():
            person, *movies = line.split(",")
            for m in movies:
                m = m.strip()
                db[m] = db.get(m, []) + [person]
    return db
Output:
{'Trainwreck': ['Amy Shchumer', 'Bill Hader'],
'I Feel Pretty': ['Amy Shchumer'],
'Snatched': ['Amy Shchumer'],
'Inside Amy Shchumer': ['Amy Shchumer'],
'Inside Out': ['Bill Hader'],
'Tropic Thunder': ['Bill Hader']}
This loops through the data, assigning the first value of each line to person and the rest to movies (the * in the assignment collects the remaining items into a list). Then, for each movie, it uses .get to check whether it is in the database yet, returning the existing list if it is and an empty list if it isn't, and adds the new actor to that list.
Another way to do this would be to use a defaultdict:
from collections import defaultdict
def create_db():
    db = defaultdict(list)  # missing keys start out as an empty list
    with open("database.txt") as data:
        for line in data.readlines():
            person, *movies = line.split(",")
            for m in movies:
                db[m.strip()].append(person)
    return db
which automatically assigns [] if the key does not exist.
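Since the question asks for the actors to be collected in a set, the same idea works with defaultdict(set). This is a sketch under the assumption that the data is passed in as an open file object or any other iterable of lines; the function name build_movie_db is illustrative. The actor name is stripped as well, so stray whitespace does not create near-duplicate entries.

```python
import io
from collections import defaultdict


def build_movie_db(lines):
    """Map each movie title to the set of actors listed for it."""
    db = defaultdict(set)
    for line in lines:
        # First field is the actor, the rest are movie titles.
        person, *movies = (part.strip() for part in line.split(","))
        for movie in movies:
            if movie:  # ignore empty fields, e.g. from a trailing comma
                db[movie].add(person)
    return dict(db)


# io.StringIO stands in for database.txt here.
sample = io.StringIO(
    "Amy Shchumer, Trainwreck, I Feel Pretty, Snatched, Inside Amy Shchumer\n"
    "Bill Hader,Inside Out, Trainwreck, Tropic Thunder\n"
)
db = build_movie_db(sample)
print(sorted(db["Trainwreck"]))  # ['Amy Shchumer', 'Bill Hader']
```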
I have a CSV file:
shack_imei.csv:
shack, imei
F10, "5555"
code:
reader = csv.reader(open("shack_imei.csv", "rb"))
my_dict = dict(reader)
shack = raw_input('Enter Shack:')
print shack
def get_imei_from_entered_shack(shack):
    for key, value in my_dict.iteritems():
        if key == shack:
            return value
list = str(get_imei_from_entered_shack(shack))
print list
which gives me "5555"
But I need this value in a list structure like this:
["5555"]
I've tried a lot of different methods, and they all end up with extra ' or " characters.
EDIT 1:
new simpler code:
reader = csv.reader(open("shack_imei.csv", "rb"))
my_dict = dict(reader)
shack = raw_input('Enter Shack:')
imei = my_dict[shack]
print imei
"5555"
list(imei) gives me ['"5555"'], I need it to be ["5555"]
You can change your "return" statement:
shack = raw_input('Enter Shack:')
print shack
def get_imei_from_entered_shack(shack):
    for key, value in my_dict.iteritems():
        if key == shack:
            return [str(value)]
list = get_imei_from_entered_shack(shack)
print list
As far as I understand, you want to create a list containing the returned string, which you do with [ ]
list = [str(get_imei_from_entered_shack(shack))]
There are a few problems with this code that are too long to tackle in comments.
my_dict
my_dict = dict(reader) only works well if this CSV is a collection of key/value pairs. If there are duplicate keys, later rows silently overwrite earlier ones.
get_imei_from_entered_shack
Why this special function instead of just asking my_dict for the value? Even if you don't want it to raise an exception when you ask for a shack that doesn't exist, you can use the dict.get(<key>, <default>) method.
my_dict.get(shack, None)
does the same as your 4-line function.
list
Don't name variables the same as built-ins.
list2
If you want a list containing one value, use [<value>]. Note that list(<value>) iterates over its argument, so for a string it returns a list of its characters (and it fails altogether if you have rebound the name list to your own variable).
reader = csv.reader(open("shack_imei.csv", "rb"))
my_dict = dict(reader)
shack = raw_input('Enter Shack:')
imei = my_dict[shack]
imei = imei.replace('"',"")
IMEI_LIST =[]
IMEI_LIST.append(imei)
print IMEI_LIST
['5555']
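The stray quotes come from the space after the comma in the file: the csv module only treats a quote as a quoting character when it is the first character of a field. Passing skipinitialspace=True to csv.reader is one way to handle that at parse time instead of calling replace afterwards. A sketch in Python 3, with io.StringIO standing in for the real file:

```python
import csv
import io

# Stand-in for shack_imei.csv; note the space after each comma.
data = io.StringIO('shack, imei\nF10, "5555"\n')

# skipinitialspace=True drops whitespace right after the delimiter,
# so "5555" is parsed as a quoted field and the quotes are removed.
reader = csv.reader(data, skipinitialspace=True)
my_dict = dict(reader)

imei_list = [my_dict["F10"]]
print(imei_list)  # ['5555']
```

The header row ends up in the dictionary too (as 'shack': 'imei'); calling next(reader) before dict(reader) would skip it.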
After the end of my code, I have a dictionary like so:
{'"WS1"': 1475.9778073075058, '"BRO"': 1554.1437268304624, '"CHA"': 1552.228925324831}
What I want to do is to find each of the keys in a separate file, teams.txt, which is formatted like this:
1901,'BRO','LAD'
1901,'CHA','CHW'
1901,'WS1','MIN'
Using the year, which is 1901, and the team, which is the key of each item in the dictionary, I want to create a new dictionary where the key is the third column in teams.txt if the year and team both match, and the value is the value of the team in the first dictionary.
I figured this would be easiest if I created a function to "lookup" the year and the team, and return "franch", and then apply that function to each key in the dictionary. This is what I have so far, but it gives me a KeyError
def franch(year, team_str):
    team_str = str(team_str)
    with open('teams.txt') as imp_file:
        teams = imp_file.readlines()
    for team in teams:
        (yearID, teamID, franchID) = team.split(',')
        yearID = int(yearID)
        if yearID == year:
            if teamID == team_str:
                break
    franchID = franchID[1:4]
    return franchID
And in the other function with the dictionary that I want to apply this function to:
franch_teams={}
for team in teams:
    team = team.replace('"', "'")
    franch_teams[franch(year, team)] = teams[team]
The ideal output of what I am trying to accomplish would look like:
{'"MIN"': 1475.9778073075058, '"LAD"': 1554.1437268304624, '"CHW"': 1552.228925324831}
Thanks!
Does this code suit your needs?
I am doing an extra check for equality because different quote characters are used in different parts of your code.
def almost_equals(one, two):
    one = one.replace('"', '').replace("'", "")
    two = two.replace('"', '').replace("'", "")
    return one == two
def create_data(year, data, text_content):
    """ This function returns new dictionary. """
    content = [line.split(',') for line in text_content.split('\n')]
    res = {}
    for key in data.keys():
        for one_list in content:
            if year == one_list[0] and almost_equals(key, one_list[1]):
                res[one_list[2]] = data[key]
    return res
teams_txt = """1901,'BRO','LAD'
1901,'CHA','CHW'
1901,'WS1','MIN'"""
year = '1901'
data = { '"WS1"': 1475.9778073075058, '"BRO"': 1554.1437268304624, '"CHA"': 1552.228925324831 }
result = create_data(year, data, teams_txt)
And the output:
{"'CHW'": 1552.228925324831, "'LAD'": 1554.1437268304624, "'MIN'": 1475.9778073075058}
Update:
To read from text file use this function:
def read_text_file(filename):
    with open(filename) as file_object:
        result = file_object.read()
    return result
teams_txt = read_text_file('teams.txt')
You may try something like:
#!/usr/bin/env python
def clean(_str):
    return _str.strip('"').strip("'")
first = {'"WS1"': 1475.9778073075058, '"BRO"': 1554.1437268304624, '"CHA"': 1552.228925324831}
clean_first = dict()
second = dict()
for k,v in first.items():
    clean_first[clean(k)] = v
with open("teams.txt", "r") as _file:
    lines = _file.readlines()
for line in lines:
    _,old,new = line.split(",")
    second[new.strip()] = clean_first[clean(old)]
print(second)
Which gives the expected:
{"'CHW'": 1552.228925324831, "'LAD'": 1554.1437268304624, "'MIN'": 1475.9778073075058}
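Another option is to build a (year, team) to franchise lookup table once and then remap the ratings through it. This is a sketch rather than a drop-in replacement: the function names load_franchises and remap are invented, quotes are stripped from all identifiers (so the result keys are unquoted, unlike the outputs above), and teams without a match for the given year are skipped, which is an assumption about the desired behaviour.

```python
def load_franchises(lines):
    """Build {(year, team_id): franchise_id} from lines like 1901,'BRO','LAD'."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        year, team_id, franch_id = line.split(",")
        table[(int(year), team_id.strip("'\""))] = franch_id.strip("'\"")
    return table


def remap(ratings, table, year):
    """Return ratings keyed by franchise instead of team for the given year."""
    result = {}
    for team, value in ratings.items():
        franch_id = table.get((year, team.strip("'\"")))
        if franch_id is not None:  # skip teams with no entry for this year
            result[franch_id] = value
    return result


teams_txt = """1901,'BRO','LAD'
1901,'CHA','CHW'
1901,'WS1','MIN'"""
ratings = {'"WS1"': 1475.9778073075058, '"BRO"': 1554.1437268304624, '"CHA"': 1552.228925324831}

result = remap(ratings, load_franchises(teams_txt.splitlines()), 1901)
print(result["MIN"])  # 1475.9778073075058
```

Reading the file once into a dictionary avoids reopening teams.txt for every team, and dict.get sidesteps the KeyError for unmatched teams.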