I am a beginner and I've tried searching online everywhere, but I'm not sure I'm searching the right terms.
My CSV file looks like this:
https://drive.google.com/file/d/0B74bmJNIxxW-dWl0Y0dsV1E4bjA/view?usp=sharing
I want to know how to use the CSV file to do something like this,
Email
driver.find_element_by_name('emailAddress').send_keys("johndoe#example.com")
print "Successfully Entered Email..."
There are lots of ways that you could do this. One would be to use the csv module.
import csv

with open("foo.csv", "r") as fh:
    lines = csv.reader(fh)
    for line in lines:
        address = line[0]  # first column holds the email address
        driver.find_element_by_name('emailAddress').send_keys(address)
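If the file starts with a header row (an assumption, since the linked file is not shown here), the loop above would send the header text as the first address. A minimal sketch of skipping it, using in-memory sample data and a hypothetical helper named read_addresses:

```python
import csv
import io

# Invented sample data standing in for the real file.
sample = "Email,Password\nfoo1#example.com,frobinate\nfoo2#example.com,frobinate\n"

def read_addresses(fh):
    """Return the first column of every data row, skipping the header."""
    reader = csv.reader(fh)
    next(reader, None)  # skip the header; the None default tolerates an empty file
    return [row[0] for row in reader if row]  # 'if row' ignores blank lines

addresses = read_addresses(io.StringIO(sample))
print(addresses)
```

With a real file, the StringIO object would be replaced by the handle returned from open("foo.csv", newline="").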
It really helps to post the data here so that we can see what the format really is and run the code ourselves. So, I invented some sample data:
emails.csv
Email,Password,First Name,Last Name,City
foo1#example.com,frobinate,John,Doe,District Heights
foo2#example.com,frobinate,John,Doe,District Heights
foo3#example.com,frobinate,John,Doe,District Heights
foo4#example.com,frobinate,John,Doe,District Heights
I can use the csv module to read that. csv.DictReader reads each row into its own dict that lets me reference cells by the name given in the header. Since I'll be looking up records by email address later, I'll read the rows into another dict that acts as an index into the records. If the same user appears multiple times, only the last one is remembered.
With the index in place, I can grab the row by email address.
>>> import csv
>>> email_index = {}
>>> with open('emails.csv', newline='') as fp:
...     reader = csv.DictReader(fp)  # auto-reads header
...     for row in reader:
...         email_index[row['Email']] = row
...
>>> for item in email_index.items():
...     print(item)
...
('foo3#example.com', {'Email': 'foo3#example.com', 'City': 'District Heights', 'First Name': 'John', 'Password': 'frobinate', 'Last Name': 'Doe'})
('foo2#example.com', {'Email': 'foo2#example.com', 'City': 'District Heights', 'First Name': 'John', 'Password': 'frobinate', 'Last Name': 'Doe'})
('foo4#example.com', {'Email': 'foo4#example.com', 'City': 'District Heights', 'First Name': 'John', 'Password': 'frobinate', 'Last Name': 'Doe'})
('foo1#example.com', {'Email': 'foo1#example.com', 'City': 'District Heights', 'First Name': 'John', 'Password': 'frobinate', 'Last Name': 'Doe'})
>>>
>>> user = 'foo1#example.com'
>>> record = email_index[user]
>>> print("{Email} is {First Name} {Last Name} and lives in {City}".format(**record))
foo1#example.com is John Doe and lives in District Heights
>>>
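Because the index keeps only the last row per address, duplicates are silently lost. If every row matters, one option is to store a list of rows under each address. This is only a sketch: the duplicated foo1 row and the name rows_by_email are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Invented sample with one address appearing twice.
sample = """Email,Password,First Name,Last Name,City
foo1#example.com,frobinate,John,Doe,District Heights
foo1#example.com,other,John,Doe,Springfield
foo2#example.com,frobinate,John,Doe,District Heights
"""

rows_by_email = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample)):
    # append instead of assign, so earlier rows are not overwritten
    rows_by_email[row['Email']].append(row)

cities = [r['City'] for r in rows_by_email['foo1#example.com']]
print(cities)
```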
Related
I have two files, and I want to compare the values of firstname, lastname and account_id between them in Python.
The contents of the two files are shown below.
users.csv contains the following, where the columns are first_name, last_name and id respectively:
Test1,Last1,1101011
Test2,Last2,1231231
user1.txt contains
[{'firstname': 'Test1',
'time_utc': 1672889600.0,
'lastname': 'Last1',
'name': 'Test Last1',
'account_id': 1101011},
{'firstname': 'Test2',
'time_utc': None,
'lastname': 'Last2',
'name': 'Test2 Last2',
'account_id': 1231231}]
I tried the code below, but it gives the wrong result: it prints all the users even though they are present in both files.
import csv
with open("users.csv", 'r') as read_csv:
    f = csv.reader(read_csv)
    with open("user1.txt", 'r') as file:
        x = file.readlines()
        print(x)
        for row in f:
            print(row)
            if row not in x:
                print(row)
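In the code above, row is a list of strings while x is a list of raw text lines, so row not in x is always true. One possible approach, sketched under the assumption that user1.txt holds a Python literal (single quotes, None) rather than JSON, is to parse both sides into comparable tuples. The helper names and the extra Test3 row are hypothetical, and the file contents are inlined as strings so the example runs on its own.

```python
import ast
import csv
import io

def load_csv_users(fh):
    """Return a set of (first_name, last_name, id) tuples from a header-less CSV."""
    return {(first, last, int(acc_id)) for first, last, acc_id in csv.reader(fh)}

def load_txt_users(text):
    """Parse the Python-literal list of dicts into the same kind of tuples."""
    records = ast.literal_eval(text)
    return {(r['firstname'], r['lastname'], r['account_id']) for r in records}

# Test3 is an invented row that exists only in the CSV.
csv_text = "Test1,Last1,1101011\nTest2,Last2,1231231\nTest3,Last3,999\n"
txt_text = """[{'firstname': 'Test1', 'time_utc': 1672889600.0, 'lastname': 'Last1',
  'name': 'Test Last1', 'account_id': 1101011},
 {'firstname': 'Test2', 'time_utc': None, 'lastname': 'Last2',
  'name': 'Test2 Last2', 'account_id': 1231231}]"""

csv_users = load_csv_users(io.StringIO(csv_text))
txt_users = load_txt_users(txt_text)

# Set difference: users in the CSV that have no match in the text file.
missing = csv_users - txt_users
print(missing)
```

Converting the id to int matters here, because csv.reader returns every cell as a string while account_id is a number in the text file.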
I'm trying to get two attributes at a time from my JSON data and add them as items to my Python list. However, when I try to access both with ['emailTypeDesc']['createdDate'], it throws an error. Could someone help with this? Thanks in advance!
json:
{
'readOnly': False,
'senderDetails': {'firstName': 'John', 'lastName': 'Doe', 'emailAddress': 'johndoe#gmail.com', 'emailAddressId': 123456, 'personalId': 123, 'companyName': 'ACME'},
'clientDetails': {'firstName': 'Jane', 'lastName': 'Doe', 'emailAddress': 'janedoe#gmail.com', 'emailAddressId': 654321, 'personalId': 456, 'companyName': 'Lorem Ipsum'},
'notesSection': {},
'emailList': [{'requestId': 12345667, 'emailId': 9876543211, 'emailType': 3, 'emailTypeDesc': 'Email-In', 'emailTitle': 'SampleTitle 1', 'createdDate': '15-May-2020 11:15:52', 'fromMailList': [{'firstName': 'Jane', 'lastName': 'Doe', 'emailAddress': 'janedoe#gmail.com',}]},
{'requestId': 12345667, 'emailId': 14567775, 'emailType': 3, 'emailTypeDesc': 'Email-Out', 'emailTitle': 'SampleTitle 2', 'createdDate': '16-May-2020 16:15:52', 'fromMailList': [{'firstName': 'Jane', 'lastName': 'Doe', 'emailAddress': 'janedoe#gmail.com',}]},
{'requestId': 12345667, 'emailId': 12345, 'emailType': 3, 'emailTypeDesc': 'Email-In', 'emailTitle': 'SampleTitle 3', 'createdDate': '17-May-2020 20:15:52', 'fromMailList': [{'firstName': 'Jane', 'lastName': 'Doe', 'emailAddress': 'janedoe#gmail.com',}]}]
}
python:
final_list = []
data = json.loads(r.text)
myId = [(data['emailList'][0]['requestId'])]
for each_req in myId:
    final_list.append(each_req)
myEmailList = [mails['emailTypeDesc']['createdDate'] for mails in data['emailList']]
for each_requ in myEmailList:
    final_list.append(each_requ)
return final_list
This error comes up when I run the above code:
TypeError: string indices must be integers
Desired output for final_list:
[12345667, 'Email-In', '15-May-2020 11:15:52', 'Email-Out', '16-May-2020 16:15:52', 'Email-In', '17-May-2020 20:15:52']
My problem is definitely in this line:
myEmailList = [mails['emailTypeDesc']['createdDate'] for mails in data['emailList']]
because when I run this without the second attribute ['createdDate'] it would work, but I need both attributes on my final_list:
myEmailList = [mails['emailTypeDesc'] for mails in data['emailList']]
I think you're misunderstanding the syntax. mails['emailTypeDesc']['createdDate'] is looking for the key 'createdDate' inside the object mails['emailTypeDesc'], but in fact they are two items at the same level.
Since mails['emailTypeDesc'] is a string, not a dictionary, you get the error you have quoted. It seems that you want to add the two items mails['emailTypeDesc'] and mails['createdDate'] to your list. I'm not sure if you'd rather join these together into a single string or create a sub-list or something else. Here's a sublist option.
myEmailList = [[mails['emailTypeDesc'], mails['createdDate']] for mails in data['emailList']]
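If the flat list from the question's desired output is what you're after, a variation is to extend the result with both values on each pass instead of building sublists. This is a sketch that uses a trimmed copy of the question's data with only the keys needed here.

```python
# Trimmed copy of the question's data (only the keys used below).
data = {
    'emailList': [
        {'requestId': 12345667, 'emailTypeDesc': 'Email-In', 'createdDate': '15-May-2020 11:15:52'},
        {'requestId': 12345667, 'emailTypeDesc': 'Email-Out', 'createdDate': '16-May-2020 16:15:52'},
        {'requestId': 12345667, 'emailTypeDesc': 'Email-In', 'createdDate': '17-May-2020 20:15:52'},
    ]
}

final_list = [data['emailList'][0]['requestId']]
for mails in data['emailList']:
    # extend() adds both values as separate items, keeping the list flat
    final_list.extend([mails['emailTypeDesc'], mails['createdDate']])

print(final_list)
```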
Strings in JSON must be in double quotes, not single.
Edit: The same goes for names (keys).
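To illustrate the quoting point, here is a small sketch with two invented strings: one written like a Python dict repr, as in the question, and one written as valid JSON. Only the second is accepted by json.loads.

```python
import json

# Looks like the data in the question: single quotes and Python's False.
python_repr = "{'readOnly': False, 'emailTypeDesc': 'Email-In'}"
# Valid JSON: double quotes and lowercase false.
valid_json = '{"readOnly": false, "emailTypeDesc": "Email-In"}'

try:
    json.loads(python_repr)
    parsed_repr = True
except json.JSONDecodeError:
    parsed_repr = False  # single-quoted text is rejected

data = json.loads(valid_json)
print(parsed_repr, data)
```

What the question shows is most likely the printed form of an already parsed dict, which would explain why json.loads(r.text) worked for the asker.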
I have a CSV file and I'm trying to create a nested dictionary that looks like this:
contacts = {"Tom": {"name": "Tom Techie",
"phone": "123 123546",
"email": "tom#tom.fi",
"skype": "skypenick"},
"Mike": {"name": "Mike Mechanic",
"phone": "000 123546",
"email": "mike#mike.fi",
"skype": "-Mike-M-"}}
etc
And this is what I have written:
file = open("csv","r")
d = {}
for i in file:
    f = i.strip()
    x = f.split(";")
    if x[4] != "":
        d.update({x[0] : {"name":x[1],
                          "phone":x[2],
                          "email":x[3],
                          "skype":x[4]}})
    else:
        d.update ({x[0] : {"name": x[1],
                           "phone": x[2],
                           "email": x[3]}})
However, it prints the dict as a normal dictionary with the updates as keys, when it should look like the one shown above.
EDIT:
First lines of the csv:
key;name;phone;email;skype
Tom;Tom Techie;123 123546;tom#tom.fi;skypenick
Mike;Mike Mechanic;000 123456;mike#mike.fi;-Mike-M-
Archie;Archie Architect;050 987654;archie#archie
You can use pd.read_csv() and to_dict():
import pandas as pd
contacts = pd.read_csv('test.csv', sep=';').set_index('key').to_dict(orient='index')
Yields:
{'Tom': {'name': 'Tom Techie', 'phone': '123 123546', 'email': 'tom#tom.fi', 'skype': 'skypenick'}, 'Mike': {'name': 'Mike Mechanic', 'phone': '000 123456', 'email': 'mike#mike.fi', 'skype': '-Mike-M-'}, 'Archie': {'name': 'Archie Architect', 'phone': '050 987654', 'email': 'archie#archie', 'skype': nan}}
I like the pandas answer, but if you don't want a 3rd party library, use the built-in csv module:
import csv
from pprint import pprint
D = {}
with open('csv',newline='') as f:
    r = csv.DictReader(f,delimiter=';')
    for line in r:
        name = line['key']
        del line['key']
        D[name] = dict(line)
pprint(D)
Output:
{'Archie': {'email': 'archie#archie',
'name': 'Archie Architect',
'phone': '050 987654',
'skype': None},
'Mike': {'email': 'mike#mike.fi',
'name': 'Mike Mechanic',
'phone': '000 123456',
'skype': '-Mike-M-'},
'Tom': {'email': 'tom#tom.fi',
'name': 'Tom Techie',
'phone': '123 123546',
'skype': 'skypenick'}}
You can use zip() to achieve your goal:
file = """key;name;phone;email;skype
Tom;Tom Techie;123 123546;tom#tom.fi;skypenick
Mike;Mike Mechanic;000 123456;mike#mike.fi;-Mike-M-
Archie;Archie Architect;050 987654;archie#archie""".splitlines()
d = {}
h = None
for i in file: # works the same for your csv-file
    # first row == header, store in h
    if h is None:
        h = i.strip().split(";")[1:]
        continue # done for first row
    x = i.strip().split(";")
    # zip pairs the read in line with the header line to get tuples
    # which are fed into the dict constructor that creates the inner dict
    d[x[0]] = dict(zip(h,x[1:])) # no default for skype
    # use this instead if you want the skype key always present with empty default
    # d[x[0]] = dict(zip(h,x[1:]+[""]))
print(d)
zip() stops at the end of the shorter list and discards the extra elements of the longer one, so you won't need any checks for that.
Output:
{'Tom': {'name': 'Tom Techie', 'phone': '123 123546',
'email': 'tom#tom.fi', 'skype': 'skypenick'},
'Mike': {'name': 'Mike Mechanic', 'phone': '000 123456',
'email': 'mike#mike.fi', 'skype': '-Mike-M-'},
'Archie': {'name': 'Archie Architect', 'phone': '050 987654',
'email': 'archie#archie'}}
If you use the commented line, the data will get a default value of '' for skype. This works only because skype is the last element of the split line.
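The truncation behaviour can be seen in isolation in the sketch below. It also shows itertools.zip_longest as an alternative way to get the empty default, which is a swapped-in technique and not part of the answer above; the variable names are made up for the example.

```python
from itertools import zip_longest

header = ["name", "phone", "email", "skype"]
short_row = ["Archie Architect", "050 987654", "archie#archie"]  # no skype column

# zip() stops at the shorter input, so the 'skype' key is simply absent.
truncated = dict(zip(header, short_row))

# zip_longest() pads the shorter input, here with an empty string.
padded = dict(zip_longest(header, short_row, fillvalue=""))

print(truncated)
print(padded)
```

Note that zip_longest pads whichever side is shorter, so a row with more cells than the header would produce a key equal to the fill value.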
You can use a dict comprehension! Assuming the data is something like
with open("df.csv", "r") as file:
    d = {x.split(";")[0]:{
        "name": x.split(";")[2],
        "phone": x.split(";")[3],
        "email": x.split(";")[1],
        "skype": x.split(";")[4][:-1] # Slice off trailing newline
    } for x in file}
    d.pop("")
We want to open files using with whenever possible to benefit from Python's context management. See https://www.python.org/dev/peps/pep-0343/ for fundamental understanding of the with statement.
Since the key "" only appears once at the head of the csv, we can pop it at the end and avoid performing a comparison at every iteration. A dict comprehension accomplishes the same thing you wanted to achieve with d.update.
More about comprehensions:
https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
Edit: refactoring to remove the repetitive calls to .split can look something like this:
def line_to_dict(x, d):
    x = x.split(";")
    d[x[0]] = {
        "name": x[2],
        "phone": x[3],
        "email": x[1],
        "skype": x[4][:-1] # Slice off trailing newline
    }

with open("df.csv", "r") as file:
    d = {}
    for x in file:
        line_to_dict(x, d)
    d.pop("")
I currently have a list of data containing these properties:
properties = {
'address':address,
'city':city,
'state':state,
'postal_code':postal_code,
'price':price,
'facts and features':info,
'real estate provider':broker,
'url':property_url,
'title':title
}
These are populated with about 25 rows.
I am attempting to write them to a csv file using this:
with open("ogtest-%s-%s.csv" % (listValue, zipCodes),'w') as csvfile:
    fieldnames = ['title','address','city','state','postal_code','price','facts and features','real estate provider','url']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in scraped_data:
        writer.writerow(row)
My resulting CSV is a list with the data in one row. Each field, such as address, has its title and below it ALL the values.
['/profile/Colin-Welman/', '/profile/andreagressinger/', '/profile/Regina-Vannicola/', '/profile/Kathryn-Perkins/', etc
The scraped_data appears like this:
{'city': '(844) 292-5128(310) 505-7493(310) 562-8483(310) 422-9001(310) 439-5303(323) 736-4891(310) 383-8111(310) 482-2033(646) 872-4990(310) 963-1648', 'state': None, 'postal_code': None, 'facts and features': u'', 'address': ['/profile/Jonathan-Pearson/', '/profile/margret2/', '/profile/user89580694/', '/profile/Rinde-Philippe/', '/profile/RogerPerryLA/', '/profile/tamaramattox/', '/profile/JFKdogtownrealty/', '/profile/The-Cunningham-Group/', '/profile/TheHeatherGroup/', '/profile/EitanSeanConstine1/'], 'url': None, 'title': None, 'price': None, 'real estate provider': 'Jonathan PearsonMargret EricksonSusan & Rachael RosalesRinde PhilippeRoger PerryTamara Mattoxjeff koneckeThe Cunningham GroupHeather Shawver & Heather RogersEitan Sean Constine'}
My goal is for each item to be on its own row.
I've tried this:
adding 'newline' in the with open (it gives an error)
adding delimiter to csv.DictWriter (this added individual columns for each record, not rows)
Any help would be much appreciated.
Your scraped_data should be a list of dictionaries for csv.DictWriter to work. For instance:
import csv
# mock data
scraped_data = list()
row = {
    'title': 'My Sweet Home',
    'address': '1000, Abc St.',
    'city': 'San Diego',
    'facts and features': 'Miniramp in the backyard',
    'postal_code': '000000',
    'price': 1000000,
    'real estate provider': 'Abc Real Estate',
    'state': 'CA',
    'url': 'www.mysweethome.com'
}
scraped_data.append(row)
fieldnames = [
    'title', 'address', 'city', 'state', 'postal_code', 'price',
    'facts and features', 'real estate provider', 'url'
]
# writing CSV file
with open('ogtest.csv', 'w') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in scraped_data:
        writer.writerow(row)
Hope it helps.
Cheers
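If the scraper ends up with one dict whose values are parallel lists, rather than a list of dicts, the data can be reshaped before writing. The sketch below assumes every list has the same length, which is not true of the scraped_data shown in the question, so the scraping step may need fixing first. The columns dict and its values are invented, and io.StringIO stands in for a real file so the example is self-contained.

```python
import csv
import io

# Hypothetical scraper output: one dict whose values are parallel lists.
columns = {
    'title': ['Home A', 'Home B'],
    'city': ['San Diego', 'Los Angeles'],
    'price': [1000000, 750000],
}

fieldnames = list(columns)
# zip(*columns.values()) walks the lists in parallel, one tuple per property.
rows = [dict(zip(fieldnames, values)) for values in zip(*columns.values())]

buffer = io.StringIO()  # stands in for a real output file
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```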
"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
"Topic","Tags","First Term","F4321",18,,
"Subtopic","Review of representation of HTML",,,,,
All of the above are values from an Excel sheet that was converted to CSV; the result is what is shown above.
As you can see, the header contains seven columns, while the data below it varies.
I have this Python script to process them; the script is below:
from django.db import transaction
import sys
import csv
import StringIO
file = sys.argv[1]
no_cols_flag=0
flag=0
header_arr=[]
print file
f = open(file, 'r')
while (f.readline() != ""):
    for i in [line.split(',') for line in open(file)]: # split on the separator
        print "==========================================================="
        row_flag=0
        row_d=""
        for j in i: # for each token in the split string
            row_flag=1
            print j
            if j:
                no_cols_flag=no_cols_flag+1
                data=j.strip()
                print j
    break
How can I modify the above script so that it says which column header each piece of data belongs to?
Thanks.
You're importing the csv module but never use it. Why?
If you do
import csv
reader = csv.reader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.reader(open(file, newline=""), dialect="excel")
you get a reader object that will contain all you need; the first row will contain the headers, and the subsequent rows will contain the data in the corresponding places.
Even better might be (if I understand you correctly):
import csv
reader = csv.DictReader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.DictReader(open(file, newline=""), dialect="excel")
This DictReader can be iterated over, returning a sequence of dicts that use the column header as keys and the following data as values, so
for row in reader:
    print(row)
will output
{'Name': 'Nick', 'Designation': 'F4321', 'Type': 'Subject', 'Total': '29', 'First-term assessment': '10', 'Second-term assessment': '19', 'Description': 'D1234'}
{'Name': 'HTML', 'Designation': 'F4321', 'Type': 'Unit', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'D1234-1'}
{'Name': 'Tags', 'Designation': 'F4321', 'Type': 'Topic', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'First Term'}
{'Name': 'Review of representation of HTML', 'Designation': '', 'Type': 'Subtopic', 'Total': '', 'First-term assessment': '', 'Second-term assessment': '', 'Description': ''}
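Since every value arrives as a string and empty cells arrive as '', the rows usually need a little conversion before use. Here is a minimal Python 3 sketch that reads the first rows of the question's data from an in-memory string and looks up cells by column header; the to_int helper is hypothetical.

```python
import csv
import io

sample = '''"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
'''

def to_int(value):
    """Hypothetical helper: empty cells become None, others become int."""
    return int(value) if value else None

totals = {}
for row in csv.DictReader(io.StringIO(sample)):
    # each cell is addressed by its column header
    totals[row['Name']] = to_int(row['Total'])

print(totals)
```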