I am very new to Python.
I have a list of stock names in a CSV file. I extract each name and append it to a website domain to create URLs. I am trying to write the URLs I created into another CSV file, but it only writes the last one in the list. I want it to write all of the URLs into the CSV.
with open('names.csv', 'r') as datafile:
    for line in datafile:
        domain = f'https://ceo.ca/{line}'
        urls_link = (domain.strip())
        print(urls_link)

y = open("url.csv","w")
y.writelines(urls_link)
y.close()
names.csv: https://i.stack.imgur.com/WrrLw.png
url.csv: https://i.stack.imgur.com/BYEgN.png
I would like url.csv to look like this:
https://i.stack.imgur.com/y4xre.png
I apologise if I worded some things horribly.
You can use the csv module in Python.
Try using this code:
from csv import writer, reader

in_FILE = "names.csv"
out_FILE = 'url.csv'
urls = list()

# Read every name and build its URL
with open(in_FILE, 'r', newline='') as infile:
    read = reader(infile, delimiter=",")
    for domain_row in read:
        for domain in domain_row:
            url = f'https://ceo.ca/{domain.strip()}'
            urls.append(url)

# Write one URL per row; newline='' avoids blank lines between rows on Windows
with open(out_FILE, 'w', newline='') as outfile:
    write = writer(outfile)
    for url in urls:
        write.writerow([url])
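For what it's worth, the original code only wrote the last URL because the file was opened and written after the loop had finished, when urls_link held only the final value. A minimal sketch of the alternative, writing each row inside the loop, could look like this. The function name write_urls and the sample names are made up for the demo, and it assumes one name per line:

```python
import csv
import os
import tempfile


def write_urls(names_path, urls_path, base="https://ceo.ca/"):
    """Read one name per line and write one URL per CSV row."""
    count = 0
    with open(names_path, newline="") as infile, \
         open(urls_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for line in infile:
            name = line.strip()
            if not name:  # skip blank lines
                continue
            # Written inside the loop, so every name produces a row
            writer.writerow([base + name])
            count += 1
    return count


# Self-contained demo in a temporary directory (sample data is hypothetical)
workdir = tempfile.mkdtemp()
names_file = os.path.join(workdir, "names.csv")
urls_file = os.path.join(workdir, "url.csv")
with open(names_file, "w") as f:
    f.write("aapl\nmsft\ntsla\n")

written = write_urls(names_file, urls_file)
with open(urls_file) as f:
    result = f.read().splitlines()
```

This also avoids keeping the whole list in memory, although for a short list of names either approach should be fine.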
So I've never really used import csv before, but I've managed to scrape a bunch of information from websites and now want to put them in a csv file. The issue I'm having is that all my list values are being separated by commas (i.e. Jane Doe = J,a,n,e, ,D,o,e).
Also, I have three lists (one with names, one with emails, and one with titles) and I would like to add them each as its own column in the CSV file (so col1 = Name, col2 = title, col3= email)
Any thoughts on how to execute this? Thanks.
from bs4 import BeautifulSoup
import requests
import csv

urls = ''
with open('websites.txt', 'r') as f:
    for line in f.read():
        urls += line

urls = list(urls.split())
name_lst = []
position_lst = []
email_lst = []

for url in urls:
    print(f'CURRENTLY PARSING: {url}')
    print()
    res = requests.get(url)
    soup = BeautifulSoup(res.text, 'html.parser')
    try:
        for information in soup.find_all('tr', class_='sidearm-staff-member'):
            names = information.find("th", attrs={'headers': "col-fullname"}).text.strip()
            positions = information.find("td", attrs={'headers': "col-staff_title"}).text.strip()
            emails = information.find("td", attrs={'headers': "col-staff_email"}).script
            target = emails.text.split('var firstHalf = "')[1]
            fh = target.split('";')[0]
            lh = target.split('var secondHalf = "')[1].split('";')[0]
            emails = fh + '#' + lh
            name_lst.append(names)
            position_lst.append(positions)
            email_lst.append(emails)
    except Exception as e:
        pass

with open('test.csv', 'w') as csv_file:
    csv_writer = csv.writer(csv_file)
    for line in name_lst:
        csv_writer.writerow(line)
    for line in position_lst:
        csv_writer.writerow(line)
    for line in email_lst:
        csv_writer.writerow(line)
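Writing your data column-by-column is easy. All you have to do is write rows where each row contains the elements of the 3 lists at the same index. Passing a list to writerow also fixes the comma problem: writerow treats a bare string as a sequence of characters, which is why Jane Doe became J,a,n,e. Here is the code:
with open('test.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    # zip pairs up the items that share the same index in each list
    for name, position, email in zip(name_lst, position_lst, email_lst):
        csv_writer.writerow([name, position, email])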
Assuming that name_lst, position_lst and email_lst are all correct and are of the same size, your problem is in the last part of your code, where you write to the CSV file.
Here is a way to do this:
fieldnames = ['Name', 'Position', 'Email']
with open('Data_to_Csv.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for i in range(len(name_lst)):
        writer.writerow({'Name': name_lst[i], 'Position': position_lst[i], 'Email': email_lst[i]})
This would of course fail if the lengths of the lists are unequal. You need to add dummy values for entries that are not available, so that all 3 lists have the same number of values.
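If padding the lists by hand is awkward, one possible alternative is itertools.zip_longest, which fills the shorter lists with a placeholder. Below is a small sketch with made-up sample data, writing to an in-memory buffer instead of a real file. Note that it pads at the end of each list, so it only keeps rows aligned when the missing values are the trailing ones:

```python
import csv
import io
from itertools import zip_longest

# Hypothetical sample data with lists of unequal length
name_lst = ["Jane Doe", "John Roe", "Ann Poe"]
position_lst = ["Coach", "Trainer"]
email_lst = ["jane@example.com"]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Name", "Position", "Email"])

# zip_longest keeps going until the longest list is exhausted,
# substituting fillvalue for the missing entries
for row in zip_longest(name_lst, position_lst, email_lst, fillvalue=""):
    writer.writerow(row)

csv_text = buffer.getvalue()
print(csv_text)
```

A more robust fix is probably to append a dummy value inside the scraping loop whenever a field is missing, so that each index always refers to the same person.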
My program takes a csv file as input and writes it as an output file in json format. On the final line, I use the print command to output the contents of the json format file to the screen. However, it does not print out the json file contents and I don't understand why.
Here is my code that I have so far:
import csv
import json

def jsonformat(infile,outfile):
    contents = {}
    csvfile = open(infile, 'r')
    reader = csvfile.read()

    for m in reader:
        key = m['No']
        contents[key] = m

    jsonfile = open(outfile, 'w')
    jsonfile.write(json.dumps(contents))

    csvfile.close()
    jsonfile.close()
    return jsonfile

infile = 'orders.csv'
outfile = 'orders.json'

output = jsonformat(infile,outfile)
print(output)
Your function returns the jsonfile variable, which is a file object, not the file's contents.
Try replacing the end of the function with this:
    jsonfile.close()
    with open(outfile, 'r') as file:
        return file.read()
Your function returns a file handle to the file jsonfile that you then print. Instead, return the contents that you wrote to that file. Since you opened the file in w mode, any previous contents are removed before writing the new contents, so the contents of your file are going to be whatever you just wrote to it.
In your function, do:
def jsonformat(infile,outfile):
    ...
    # Instead of this:
    # jsonfile.write(json.dumps(contents))
    # do this:
    json_contents = json.dumps(contents, indent=4)  # indent=4 to pretty-print
    jsonfile.write(json_contents)
    ...
    return json_contents
Aside from that, you aren't reading the CSV file the correct way. If your file has a header, you can use csv.DictReader to read each row as a dictionary. Then, you'll be able to use for m in reader: key = m['No']. Change reader = csvfile.read() to reader = csv.DictReader(csvfile)
As of now, reader is a string that contains all the contents of your file. for m in reader makes m each character in this string, and you cannot access the "No" key on a character.
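Putting the two points together, a minimal sketch might look like the following. It reads from an in-memory sample rather than orders.csv, and the Item and Qty columns are invented for the demo. Only the "No" column comes from the question:

```python
import csv
import io
import json


def csv_to_json(csv_file, key_column="No"):
    """Return a JSON string mapping each row's key column to the whole row."""
    reader = csv.DictReader(csv_file)  # each row becomes a dict keyed by header
    contents = {row[key_column]: row for row in reader}
    return json.dumps(contents, indent=4)


# Hypothetical sample standing in for orders.csv
sample = io.StringIO("No,Item,Qty\n1,Widget,3\n2,Gadget,5\n")
json_text = csv_to_json(sample)
parsed = json.loads(json_text)
print(json_text)
```

To use it with real files, you would open orders.csv with open(infile, newline='') and pass the file object in, then write the returned string to orders.json.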
import json

a_file = open("sample.json", "r")
a_json = json.load(a_file)
pretty_json = json.dumps(a_json, indent=4)
a_file.close()
print(pretty_json)
Use this sample to print the contents of your JSON file. Have a good day.
There is an algorithm at the end of this post. It reads lines from the file SP500.txt. The file contains strings and looks like this:
AAA
BBB
CCC
It substitutes these strings into the GET request and saves each full URL to the file url_requests.txt. For example:
https://apidate.com/api/api/AAA.US?api_token=XXXXXXXX&period=d
https://apidate.com/api/api/BBB.US?api_token=XXXXXXXX&period=d
https://apidate.com/api/api/CCC.US?api_token=XXXXXXXX&period=d
It then sends each request to the API and appends all of the responses to responses.txt.
I don't know how to save the response to each request from url_requests.txt into a separate CSV file instead of responses.txt (right now they are all written to this one file, not separately). It is important that each file is named after the corresponding line from SP500.txt. For example:
AAA.csv `(which contains data from the request response https://apidate.com/api/api/AAA.US?api_token=XXXXXXXX&period=d)`
BBB.csv `(which contains data from the request response https://apidate.com/api/api/BBB.US?api_token=XXXXXXXX&period=d)`
CCC.csv `(which contains data from the request response https://apidate.com/api/api/CCC.US?api_token=XXXXXXXX&period=d)`
So, the algorithm is:
import requests

# to use strip to remove spaces in textfiles.
import sys

# two variables to squeeze a string between these two so it will become a full uri
part1 = 'https://apidate.com/api/api/'
part2 = '.US?api_token=XXXXXXXX&period=d'

# open the outputfile before the for loop
text_file = open("url_requests.txt", "w")

# open the file which contains the strings
with open('SP500.txt', 'r') as f:
    for i in f:
        uri = part1 + i.strip(' \n\t') + part2
        print(uri)
        text_file.write(uri)
        text_file.write("\n")

text_file.close()

# open a new file textfile for saving the responses from the api
text_file = open("responses.txt", "w")

# send every uri to the api and write the responses to a textfile
with open('url_requests.txt', 'r') as f2:
    for i in f2:
        uri = i.strip(' \n\t')
        batch = requests.get(i)
        data = batch.text
        print(data)
        text_file.write(data)
        text_file.write('\n')

text_file.close()
I also know how to save a single response as CSV. It looks like this:
import csv
import requests

url = "https://apidate.com/api/api/AAA.US?api_token=XXXXXXXX&period=d"
response = requests.get(url)

with open('out.csv', 'w') as f:
    writer = csv.writer(f)
    for line in response.iter_lines():
        writer.writerow(line.decode('utf-8').split(','))
To save under different names, you have to use open() and write() inside the for-loop where you read the data.
It would be good to read all the names into a list, then generate the URLs and keep them in a list too, so you would not have to read them back from a file.
Judging by the code you use to save the CSV, the server already returns CSV, so you could save the whole response at once using open() and write(), without the csv module.
I see it this way:
import requests
#import csv

# --- read names ---

all_names = []  # to keep all names in memory

with open('SP500.txt', 'r') as text_file:
    for line in text_file:
        line = line.strip()
        print('name:', line)
        all_names.append(line)

# ---- generate urls ---

url_template = 'https://apidate.com/api/api/{}.US?api_token=XXXXXXXX&period=d'

all_urls = []  # to keep all urls in memory

with open("url_requests.txt", "w") as text_file:
    for name in all_names:
        url = url_template.format(name)
        print('url:', url)
        all_urls.append(url)
        text_file.write(url + "\n")

# --- read data ---

for name, url in zip(all_names, all_urls):
    #print('name:', name)
    #print('url:', url)

    response = requests.get(url)

    # one output file per name, e.g. AAA.csv
    with open(name + '.csv', 'w') as text_file:
        text_file.write(response.text)
        #writer = csv.writer(text_file)
        #for line in response.iter_lines():
        #    writer.writerow(line.decode('utf-8').split(','))
You could calculate a filename for every string i, and open (create) a file each time.
Something like this:
import sys
import requests

# two variables to squeeze a string between these two so it will become a full uri
part1 = 'https://apidate.com/api/api/'
part2 = '.US?api_token=XXXXXXXX&period=d'

# open the outputfile before the for loop
text_file = open("url_requests.txt", "w")

# map each symbol to its uri
uri_dict = {}

with open('SP500.txt', 'r') as f:
    for i in f:
        symbol = i.strip(' \n\t')  # strip the newline so it does not end up in the filename
        uri = part1 + symbol + part2
        print(uri)
        text_file.write(uri)
        text_file.write("\n")
        uri_dict[symbol] = uri

text_file.close()

for symbol, uri in uri_dict.items():
    batch = requests.get(uri)
    data = batch.text
    print(data)

    # create the filename
    filename = symbol + ".csv"

    # open (create) the file and save the data
    with open(filename, "w") as f:
        f.write(data)
        f.write('\n')
You could also get rid of url_requests.txt, which becomes useless (unless you have other uses for it).
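As a rough sketch of that last idea, the whole thing can be done in one pass without the intermediate file. To keep the example runnable offline, the download is passed in as a function. The names save_responses, fetch and fake_fetch are hypothetical, and the fake response body is invented:

```python
import os
import tempfile

URL_TEMPLATE = "https://apidate.com/api/api/{}.US?api_token=XXXXXXXX&period=d"


def save_responses(symbols, out_dir, fetch):
    """Save fetch(url) for each symbol into <out_dir>/<symbol>.csv."""
    saved = {}
    for raw in symbols:
        symbol = raw.strip()
        if not symbol:  # ignore empty lines
            continue
        url = URL_TEMPLATE.format(symbol)
        path = os.path.join(out_dir, symbol + ".csv")
        with open(path, "w") as f:
            f.write(fetch(url))
        saved[symbol] = path
    return saved


def fake_fetch(url):
    # Stand-in for the network call; returns made-up CSV text
    return "Date,Close\n2020-01-02,1.0\n"


out_dir = tempfile.mkdtemp()
saved = save_responses(["AAA\n", "BBB\n", "\n"], out_dir, fake_fetch)
print(saved)
```

With the real API you would pass something like lambda url: requests.get(url).text as fetch, and the lines of SP500.txt as symbols.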
I have a JSON file like this: [{"ID": "12345", "Name":"John"}, {"ID":"45321", "Name":"Max"}...] called myclass.json. I used json.load to read the "ID" and "Name" values.
I have another .txt file with the content below. File name is list.txt:
Student,12345,Age 14
Student,45321,Age 15
.
.
.
I'm trying to create a script in Python that compares the two files line by line and replaces each student ID in list.txt with the student's name, so the new file would be:
Student,John,Age 14
Student,Max,Age 15
.
.
Any ideas?
My code so far:
import json

with open('/myclass.json') as f:
    data = json.load(f)

for key in data:
    x = key['Name']
    z = key['ID']
    with open('/myclass.json', 'r') as file1:
        with open('/list.txt', 'r+') as file2:
            for line in file2:
                x = z
Try this:
import json
import csv

with open('myclass.json') as f:
    data = json.load(f)

with open('list.txt', 'r', newline='') as f:
    reader = csv.reader(f)
    rows = list(reader)

def get_name(id_):
    # return the name for this ID, or None if the ID is unknown
    for item in data:
        if item['ID'] == id_:
            return item["Name"]

with open('list.txt', 'w', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        name = get_name(id_=row[1])
        if name:
            row[1] = name
    writer.writerows(rows)
Keep in mind that this script technically does not replace the items in list.txt one by one. Instead, it reads the entire file in, then overwrites list.txt and rebuilds it from scratch. I suggest making a backup of list.txt, or giving the new file a different name, in case the program crashes on some unexpected input.
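If the class list is large, one possible refinement is to build a dictionary from ID to name once, instead of scanning the JSON list for every row. Here is a small sketch using in-memory data in place of the two files. The third student (99999) is invented to show what happens to an unknown ID:

```python
import csv
import io
import json

# Stand-ins for myclass.json and list.txt
students = json.loads(
    '[{"ID": "12345", "Name": "John"}, {"ID": "45321", "Name": "Max"}]'
)
source = io.StringIO(
    "Student,12345,Age 14\nStudent,45321,Age 15\nStudent,99999,Age 16\n"
)

# One-time lookup table: ID -> Name
name_by_id = {s["ID"]: s["Name"] for s in students}

target = io.StringIO()
writer = csv.writer(target, lineterminator="\n")
for row in csv.reader(source):
    # Unknown IDs are left unchanged
    row[1] = name_by_id.get(row[1], row[1])
    writer.writerow(row)

new_text = target.getvalue()
print(new_text)
```

Writing to a separate output file, as done here with a second buffer, also avoids losing list.txt if something goes wrong halfway through.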
Another option is to open each file separately in the mode it needs, collecting the lines whose ID matches between the two files in a list:
import json

with open('myclass.json','r') as f_in:
    data = json.load(f_in)

j=0
lis=[]

# assumes both files list the students in the same order
with open('list.txt', 'r') as f_in:
    for line in f_in:
        if data[j]['ID']==line.split(',')[1]:
            s = line.replace(line.split(',')[1],data[j]['Name'])
            lis.append(s)
        j+=1

with open('list.txt', 'w') as f_out:
    for i in lis:
        f_out.write(i)
I am trying to save my data to a file. My problem is that each line of the saved file starts and ends with a double quote. I have tried many ways to solve it, from str.replace() and strip to csv, json and pickle, but the problem persists and I am stuck. Please help me. I will describe my problem in detail below.
First, I have a file called angles.txt that looks like this:
{'left_w0': -2.6978887076110842, 'left_w1': -1.3257428944152834, 'left_w2': -1.7533400385498048, 'left_e0': 0.03566505327758789, 'left_e1': 0.6948932961181641, 'left_s0': -1.1665923878540039, 'left_s1': -0.6726505747192383}
{'left_w0': -2.6967382220214846, 'left_w1': -0.8440729275695802, 'left_w2': -1.7541070289428713, 'left_e0': 0.036048548474121096, 'left_e1': 0.16682041049194338, 'left_s0': -0.7731263162109375, 'left_s1': -0.7056311616210938}
I read the text file line by line into a dict variable called data. Here is the code that reads the file:
def read_data_from_file(file_name):
    data = dict()
    f = open(file_name, 'r')
    for index_line in range(1, number_lines +1):
        data[index_line] = eval(f.readline())
    f.close()
    return data
Then I changed something in the data, for example data[index_line]['left_w0'] = data[index_line]['left_w0'] + 0.0006. After that, I wrote my data to another text file. Here is the code:
def write_data_to_file(data, file_name):
    f = open(file_name, 'wb')
    data_convert = dict()
    for index_line in range(1, number_lines):
        data_convert[index_line] = repr(data[index_line])
        data_convert[index_line] = data_convert[index_line].replace('"','') # I also used strip
        json.dump(data_convert[index_line], f)
        f.write('\n')
    f.close()
The result I received in the new file is:
"{'left_w0': -2.6978887076110842, 'left_w1': -1.3257428944152834, 'left_w2': -1.7533400385498048, 'left_e0': 0.03566505327758789, 'left_e1': 0.6948932961181641, 'left_s0': -1.1665923878540039, 'left_s1': -0.6726505747192383}"
"{'left_w0': -2.6967382220214846, 'left_w1': -0.8440729275695802, 'left_w2': -1.7541070289428713, 'left_e0': 0.036048548474121096, 'left_e1': 0.16682041049194338, 'left_s0': -0.7731263162109375, 'left_s1': -0.7056311616210938}"
I cannot remove the surrounding double quotes.
You could simplify your code by removing unnecessary transformations:
import json

def write_data_to_file(data, filename):
    with open(filename, 'w') as file:
        json.dump(data, file)

def read_data_from_file(filename):
    with open(filename) as file:
        return json.load(file)
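Some background on the quotes: json.dump was being given the string produced by repr(), and JSON encodes any string by wrapping it in double quotes. Dumping the dict itself avoids that. Note also that the existing angles.txt uses single quotes and has one dict per line, so json.load would not parse it as is. A minimal sketch that reads such lines with ast.literal_eval (a safer replacement for eval) and writes one JSON object per line might look like this. The function names and the shortened sample values are made up for the demo:

```python
import ast
import io
import json


def read_angles(lines):
    """Parse one Python dict literal per non-empty line."""
    return [ast.literal_eval(line) for line in lines if line.strip()]


def write_angles(records, out_file):
    """Write one JSON object per line, dumping the dict, not its repr."""
    for record in records:
        out_file.write(json.dumps(record) + "\n")


# Shortened sample in the same format as angles.txt (values are invented)
source = io.StringIO(
    "{'left_w0': -2.5, 'left_w1': -1.25}\n"
    "{'left_w0': -2.0, 'left_w1': -0.75}\n"
)

records = read_angles(source)
records[0]["left_w0"] += 0.5  # modify the data in between

out = io.StringIO()
write_angles(records, out)
output_lines = out.getvalue().splitlines()
print(output_lines)
```

The output lines use JSON's double-quoted keys and have no quotes around the whole line. If the file has to keep the exact single-quoted format of angles.txt, writing repr(record) + "\n" with a plain file.write() instead of json.dump should do it.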