Why can't I create a csv file with the csv module? - python

I am trying to write a list to a CSV file.
The following code runs and returns no error, but it doesn't actually populate the CSV file with the contents of the list. I am probably doing it wrong because I don't understand something.
import newspaper
import os
from newspaper import article

libya_newspaperlist = []
libya_newspaper = newspaper.build('https://www.cnn.com', memoize_article=False)
for article in libya_newspaper.articles:
    libya_newspaperlist.append(article.url)

import csv
os.chdir("/users/patrickharned/")
libya_newspaper.csv = "/users/patrickharned/libya_newspaper.csv"

def write_list_to_file(libya_newspaperlist):
    """Write the list to csv file."""
    with open("/users/patrickharned/libya_newspaper.csv") as outfile:
        outfile.write(libya_newspaperlist)
So I changed the code to this.
import newspaper
import os
from newspaper import article

libya_newspaperlist = []
libya_newspaper = newspaper.build('https://www.cnn.com', memoize_article=False)
for article in libya_newspaper.articles:
    libya_newspaperlist.append(article.url)

import csv
os.chdir("/users/patrickharned/")
libya_newspaper.csv = "/users/patrickharned/libya_newspaper.csv"

with open("/users/patrickharned/libya_newspaper.csv", "w") as outfile:
    outfile.write(str(libya_newspaperlist))
Now it does output to the CSV file, but it only outputs the first entry and won't do the rest. Any suggestions?

You have to open the file in write mode:
with open("/users/patrickharned/libya_newspaper.csv", "w") as outfile:
    outfile.write("\n".join(libya_newspaperlist))  # write() needs a string, not a list
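Since the question title asks about the csv module, here is a minimal sketch that writes one URL per row with csv.writer instead of writing the raw list; the newline='' argument follows the csv docs' recommendation when opening a file for csv.writer:

import csv

with open("/users/patrickharned/libya_newspaper.csv", "w", newline="") as outfile:
    writer = csv.writer(outfile)
    for url in libya_newspaperlist:
        writer.writerow([url])  # one column per row, one row per URL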

Related

Changing a value in an ASCII file and saving it under a different name - Python

I have one ASCII file with a .dat extension. The file has data as shown below:
MPOL3_VPROFILE
{
ID="mpvp_1" Cycle="(720)[deg]" Lift="(9)[mm]" Period="(240)[deg]"
Phase="(0)[deg]" TimingHeight="(1.0)[mm]" RampTypeO="Const Velo"
RampHO="(0.3)[mm]" RampVO="(0.00625)[mm/deg]" RampTypeC="auto"
RampHC="(auto)[mm]" RampVC="(auto)[mm/deg]" bO="0.7" cO="0.6" dO="1.0"
eO="1.5" bC="auto" cC="auto" dC="auto" eC="auto" th1O="(14)[deg]"
Now I would like to read this file in Python, change the value RampHO="(0.3)[mm]" to, let's say, RampHO="(0.2)[mm]", and save it as a new .dat file. How can I do this?
Currently I am able to read the file and find the line successfully using the code below:
import sys
import re
import shutil
import os
import glob
import argparse
import copy
import fileinput

rampOpen = 'RampHO='

file = open('flatFollower_GenCam.dat', 'r')
#data = file.readlines()
#print (data)

for line in file:
    line.strip().split('\n')
    if rampOpen in line:
        print (line[4:22])
But I am now stuck on how to change the float value and save the file under a different name.
First up, you should post your code inside your text and not in separate images. Just indent each line with four spaces to format it as code.
You can simply read in a file line by line, change the lines you want to change and then write the output.
with open(infile, 'r') as f_in, open(outfile, 'w') as f_out:
    for line in f_in:
        output_line = edit_line(line)
        f_out.write(output_line)
Then you just have to write a function that does the string replacement.
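For example, edit_line could use a regular expression to swap in the new value. A minimal sketch; the pattern and the 0.2 replacement value are assumptions based on the sample data above:

import re

def edit_line(line, new_value='0.2'):
    # Replace the number inside RampHO="(...)[mm]" with new_value,
    # keeping the surrounding RampHO="(...)[mm]" text intact.
    return re.sub(r'(RampHO="\()[^)]*(\)\[mm\]")',
                  r'\g<1>' + new_value + r'\g<2>',
                  line)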

Saving Salesforce Report Data to CSV

I am currently using the code below to fetch a Salesforce report and write it to a CSV file. When I take the length of items it is 2000, but when I execute this code it produces a CSV file that only contains 55 rows in total. My guess is something is off in the write function, but I am unsure.
Any suggestions would be appreciated.
import csv
from salesforce_reporting import Connection
import salesforce_reporting

sf = Connection(username='user', password='pw', security_token='token')
report = sf.get_report('report_id', details=True)
parser = salesforce_reporting.ReportParser(report)
items = parser.records()

with open("output.csv", "w") as f:
    writer = csv.writer(f)
    writer.writerows(items)
I was able to figure out that the issue was indeed in the writing aspect of my code. The code below will export your report without headers.
import csv
from salesforce_reporting import Connection
import salesforce_reporting

sf = Connection(username='user', password='pw', security_token='token')
report = sf.get_report('reportid', details=True)
parser = salesforce_reporting.ReportParser(report)
items = parser.records()

f = csv.writer(open('test_output.csv', 'w'))
f.writerows(items)
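A small caveat on the self-answer: opening the file without a context manager means it is only flushed and closed when the interpreter exits, and the csv docs recommend newline='' to avoid blank rows on Windows. A minimal variant of the same write:

import csv

with open('test_output.csv', 'w', newline='') as out:
    writer = csv.writer(out)
    writer.writerows(items)  # file is flushed and closed when the block exits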

Combining multiple CSV files into 1

I have a Python script that takes multiple text files as input and generates output in a separate CSV file for each, so if my text files are ABC.txt and XYX.txt, my code generates output in two CSV files, ABC.csv and XYX.csv. My ultimate goal is to get one single CSV file with all the output. Since I am more comfortable with SQL, I was thinking about uploading all the files to a database and then combining them with SQL, but I was wondering if I can modify my Python code below to generate one single CSV file containing all the output. Here is my code:
import json
from watson_developer_cloud import ToneAnalyzerV3Beta
import urllib.request
import codecs
import csv
import os
import re
import sys
import collections
import glob
import xlwt
from bs4 import BeautifulSoup

ipath = 'C:/TEMP/'  # input folder
opath = 'C:/TEMP/'  # output folder

reader = codecs.getreader("utf-8")

tone_analyzer = ToneAnalyzerV3Beta(
    url='https://gateway.watsonplatform.net/tone-analyzer/api',
    username='1f2fd51b-d0fb-45d8-aba2-08e22777b77d',
    password='DykYfXjV4UXP',
    version='2016-02-11')

path = 'C:/TEMP/*.html'
file = glob.glob(path)

writer = csv.writer(open('C:/TEMP/test', mode='w'))

# iterate over the list getting each file
for fle in file:
    # open the file and then call .read() to get the text
    with open(fle) as f:
        ...
        # output tone name and score to file
        for i in tonename:
            writer.writerows((tone['tone_name'], tone['score']) for tone in cat['tones'])
Modifying your existing code as little as possible ... you simply need to open the csv file before entering your loop that reads the text files:
...
path = 'C:/TEMP/*.html'
file = glob.glob(path)

# !! open our output csv
writer = csv.writer(open('our-merged-data', mode='w'))

# iterate over the list getting each file
for fle in file:
    # open the file and then call .read() to get the text
    with open(fle) as f:
        ...
        # output tone name and score to file
        for i in tonename:
            writer.writerows((tone['tone_name'], tone['score'], Date, Title) for tone in cat['tones'])
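If the per-file CSVs have already been generated, another option is to skip the database idea and concatenate them directly. A minimal sketch with glob and the csv module; the C:/TEMP paths mirror the ones above, and combined.csv is an assumed output name:

import csv
import glob
import os

out_path = 'C:/TEMP/combined.csv'  # assumed output name
with open(out_path, 'w', newline='') as out:
    writer = csv.writer(out)
    for name in glob.glob('C:/TEMP/*.csv'):
        if os.path.abspath(name) == os.path.abspath(out_path):
            continue  # don't read the file we are writing to
        with open(name, newline='') as f:
            writer.writerows(csv.reader(f))  # append this file's rows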

Running a data parser on multiple files in a folder? Python

Long-time lurker, but never posted here. Sorry if this isn't a good post... I made a program that uses regex to pull the names and emails out of resumes. I can get it to open a specific file in my resume folder, but getting the program to iterate over all of the files in the folder has me stumped. Here's the pseudo-code for what I'm doing:
open resume folder
read file1.txt
execute nameFinder
execute emailFinder
create new dictionary candidateData
Export to Excel
read file2.txt
...
Here's the code:
import re
import os
import pprint

with open('John Doe -Resume.txt', 'r') as f:
    #This pulls the first line of the resume,
    #which is generally the name.
    first_line_name = f.readline().strip()
    #This pulls the email from the resume.
    bulkemails = f.read()
    r = re.compile(r'(\b[\w.]+@+[\w.]+.+[\w.]\b)')
    candidateEmail = r.findall(bulkemails)
    emails = ""
    for x in candidateEmail:
        emails += str(x) + "\n"

#This creates the dictionary data
candidateData = {'candidateEmail': str(candidateEmail),
                 'candidateName': str(first_line_name)}
pprint.pprint(candidateData)
Then I get this as output:
{'candidateEmail': "['JohnDoe@gmail.com']",
 'candidateName': 'John Doe'}
All ready to be exported into Excel.
SO HERE"S MY QUESTION FOR YOU! How do I get it to do this for ALL of the .txt files in my resume folder, and not just the file I specify? Also, any cod critique would be greatly appreciated, Thanks guys! :D
You can use glob to iterate over all .txt files in your directory and then run the function on each file. Add this at the start:
import re
import os
import glob
import pprint

os.chdir("resumes")
for file in glob.glob("*.txt"):
    with open(file, 'r') as f:
        #Rest of your execution code here
EDIT: In answer to your question in the comments:
import re
import os
import glob
import pprint

candidateDataList = []

for file in glob.glob("*.txt"):
    with open(file, 'r') as f:
        #This pulls the first line of the resume,
        #which is generally the name.
        first_line_name = f.readline().strip()
        #This pulls the email from the resume.
        bulkemails = f.read()
        r = re.compile(r'(\b[\w.]+@+[\w.]+.+[\w.]\b)')
        candidateDataList.append({'name': str(first_line_name),
                                  'email': r.findall(bulkemails)})

pprint.pprint(candidateDataList)
@Jakob's answer is spot on. I only wanted to mention a nice alternative which I usually prefer myself, the pathlib module:
import re
import pprint
from pathlib import Path

resumes_dir = Path("resumes")

for path in resumes_dir.glob("*.txt"):
    with path.open() as f:
        #Rest of your execution code here
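To cover the "Export to Excel" step from the pseudo-code, one option is to write the candidateDataList built above to a CSV file that Excel can open directly. A minimal sketch with csv.DictWriter; the candidates.csv name and the comma-joining of multiple emails are assumptions:

import csv

with open('candidates.csv', 'w', newline='') as out:
    writer = csv.DictWriter(out, fieldnames=['name', 'email'])
    writer.writeheader()
    for candidate in candidateDataList:
        writer.writerow({'name': candidate['name'],
                         # findall() returns a list; join it into one cell
                         'email': ', '.join(candidate['email'])})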

How to clean a JSON file and store it in another file in Python

I am trying to read a JSON file with Python. This file is described by the authors as not strict JSON. In order to convert it to strict JSON, they suggest this approach:
import gzip
import json

def parse(path):
    g = gzip.open(path, 'r')
    for l in g:
        yield json.dumps(eval(l))
However, not being familiar with Python, I am able to execute the script but not able to produce any output file with the new clean JSON. How should I modify the script in order to produce a new JSON file? I have tried this:
import json

class Amazon():
    def parse(self, inpath, outpath):
        g = open(inpath, 'r')
        out = open(outpath, 'w')
        for l in g:
            yield json.dumps(eval(l), out)

amazon = Amazon()
amazon.parse("original.json", "cleaned.json")
but the output is an empty file. Any help is more than welcome.
import json

class Amazon():
    def parse(self, inpath, outpath):
        g = open(inpath, 'r')
        with open(outpath, 'w') as fout:
            for l in g:
                fout.write(json.dumps(eval(l)) + '\n')  # one JSON object per line

amazon = Amazon()
amazon.parse("original.json", "cleaned.json")
Another, shorter way of doing this:
import json

class Amazon():
    def parse(self, readpath, writepath):
        with open(readpath) as g, open(writepath, 'w') as fout:
            for l in g:
                json.dump(eval(l), fout)
                fout.write('\n')  # keep one object per line

amazon = Amazon()
amazon.parse("original.json", "cleaned.json")
When handling JSON data it is better to use the json module's json.dump(obj, output_file) for dumping JSON to a file and json.load(file_object) for loading it back. That way the JSON structure is maintained when saving and reading the data.
For very large amounts of data (say, 1k+ records), consider the pandas module.
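One caveat worth adding to the snippets above: eval executes arbitrary code from the input file. If the "not strict JSON" lines are plain Python literals (dicts, lists, strings, numbers), ast.literal_eval is a safer drop-in replacement. A minimal sketch under that assumption:

import ast
import json

def clean(inpath, outpath):
    with open(inpath) as g, open(outpath, 'w') as fout:
        for l in g:
            # literal_eval parses Python literals without executing code
            json.dump(ast.literal_eval(l), fout)
            fout.write('\n')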
