Convert a tweets JSON file to CSV - Python

I have a complete dataset of tweets that I collected through Tweepy and saved as a JSON file. Now I want to convert that data to a CSV file containing only the fields I need, such as text, username, created at, and four or five more columns.
How can I do this? Can anyone please provide Python code for it? Another problem is that when I convert the data to CSV, my tweet text is split wherever a comma occurs.
Please help. I am new to this field.
Thanks in advance.

You would need to read your file in and convert each non-empty line from json format. You could then use itemgetter() to extract the required keys from the resulting dictionary and write the results to your output.csv file:
from operator import itemgetter
import csv
import json
header = ['text', 'username', 'created_at']
required_cols = itemgetter(*header)
with open('python1.json') as f_input, open('output.csv', 'wb') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(header)
    for row in f_input:
        if row.strip():
            csv_output.writerow(required_cols(json.loads(row)))
If you are using Python 3.x, open the files with the following line instead:
with open('python1.json') as f_input, open('output.csv', 'w', newline='') as f_output:
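On the comma problem from the question: csv.writer quotes any field that contains the delimiter, so the tweet text stays in one column as long as the file is read back with a CSV parser rather than split on commas by hand. Below is a minimal sketch of that behaviour. It assumes the common tweet layout in which the screen name is nested under a "user" object; the sample tweet and the tweet_to_row helper are made up for illustration.

```python
import csv
import io
import json

# Hypothetical sample line, shaped like one tweet per line of JSON.
sample_line = json.dumps({
    "text": "Hello, world, again",
    "created_at": "Mon Jan 01 00:00:00 +0000 2018",
    "user": {"screen_name": "example_user"},
})

def tweet_to_row(tweet):
    """Pick the wanted fields, reaching into the nested 'user' object."""
    return [tweet["text"], tweet["user"]["screen_name"], tweet["created_at"]]

buffer = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
writer = csv.writer(buffer)
writer.writerow(["text", "username", "created_at"])
writer.writerow(tweet_to_row(json.loads(sample_line)))

# Reading the result back with csv.reader keeps the text in a single cell.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows[1][0])
```

If a spreadsheet still shows the text split across columns, the import settings (delimiter and quote character) are the first thing to check.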

Related

Read and write CSV file in Python

I'm trying to read sentences from a CSV file, convert them to lowercase, and save them in another CSV file.
import csv
import pprint
with open('dataset_elec_4000.csv') as f:
    with open('output.csv', 'w') as ff:
        data = f.read()
        data = data.lower
        writer = csv.writer(ff)
        writer.writerow(data)
But I got the error "_csv.Error: sequence expected". What should I do?
*I'm a beginner. Please be nice to me :)
You need to read over your input CSV row-by-row, and for each row, transform it, then write it out:
import csv
with open('output.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)
    with open('dataset_elec_4000.csv', newline='') as f_in:
        reader = csv.reader(f_in)
        # comment out these two lines if the input has no header
        header = next(reader)
        writer.writerow(header)
        for row in reader:
            # row is a sequence/list of cells, so...
            # select the cell with your sentence; I'm presuming it's the first cell (row[0])
            data = row[0]
            data = data.lower()
            # need to put data back into a "row"
            out_row = [data]
            writer.writerow(out_row)
Python contains a module called csv for handling CSV files. Its reader() function is used to read data from a CSV file. First, the file is opened with open() in 'r' (read) mode, which returns a file object. That object is then passed to csv.reader(), which returns a reader object that iterates over the lines of the CSV document.
import csv
# opening the CSV file
with open('Giants.csv', mode='r') as file:
    # reading the CSV file
    csvFile = csv.reader(file)
    # displaying the contents of the CSV file
    for lines in csvFile:
        print(lines)
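The original error comes from data.lower without parentheses, which hands writerow a method object instead of a row. As a variation on the row-by-row approach, here is a small sketch that lowercases every cell of every row rather than only the first one. The in-memory input is made up and merely stands in for dataset_elec_4000.csv.

```python
import csv
import io

# Made-up input standing in for dataset_elec_4000.csv.
source = io.StringIO('Review,Label\r\nGreat PHONE,POS\r\n"Bad, SLOW charger",NEG\r\n')
target = io.StringIO()  # stands in for open('output.csv', 'w', newline='')

reader = csv.reader(source)
writer = csv.writer(target)
writer.writerow(next(reader))  # copy the header row unchanged
for row in reader:
    # row is a list of cells; lowercase each one and write the list back
    writer.writerow([cell.lower() for cell in row])

result = target.getvalue()
print(result)
```

Because both reading and writing go through the csv module, the cell that contains a comma stays quoted in the output.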

csv.Error: iterable expected when trying to save a CSV file from a for loop

everything good?
I need some help saving the output of this script to a CSV file. The script reads a CSV and transforms the data through a library. I've been racking my brain for hours and I can't figure out why I can't save the CSV file.
Can anybody help me? I am a beginner in Python and I am learning the tool to use it in ETL processes.
import csv
from user_agents import parse
with open('UserAgent.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    idUser = 0
    space = ' / '
    for line in csv_reader:
        user_agent = parse(line[0])
        idUser = idUser + 1
        with open('data.csv', 'w') as f:
            writer = csv.writer(f)
            writer.writerow(user_agent)
writer.writerow expects an iterable. Your user_agent must not be an iterable.
Try
writer.writerow( [user_agent] )
instead of
writer.writerow(user_agent)
Check if that's what you want.
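Wrapping the object in a list writes its string form into a single cell. If separate columns are wanted, one option is to build the row from individual attributes and to create the writer once, before the loop, so that each iteration does not overwrite the file. The sketch below uses a made-up ParsedAgent class and fake_parse function as stand-ins; their field names are illustrative and are not the user_agents library's API.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical stand-in for the object returned by user_agents.parse();
# like that object, it is not iterable, so writerow(agent) would fail.
@dataclass
class ParsedAgent:
    browser: str
    os: str
    device: str

def fake_parse(raw):
    """Toy parser: split 'browser/os/device' into three fields."""
    return ParsedAgent(*raw.split("/"))

raw_lines = ["Firefox/Windows/PC", "Safari/iOS/iPhone"]

out = io.StringIO()       # stands in for open('data.csv', 'w', newline='')
writer = csv.writer(out)  # created once, before the loop
writer.writerow(["id", "browser", "os", "device"])
for id_user, raw in enumerate(raw_lines, start=1):
    agent = fake_parse(raw)
    # Build an explicit list so each attribute becomes its own cell.
    writer.writerow([id_user, agent.browser, agent.os, agent.device])

print(out.getvalue())
```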

Python Web Api to CSV

I am looking for some assistance with writing API results to a .CSV file using Python.
My source is a CSV file that contains the URLs below in one column, one per row.
https://webapi.nhtsa.gov/api/SafetyRatings/modelyear/2013/make/Acura/model/rdx?format=csv
https://webapi.nhtsa.gov/api/SafetyRatings/modelyear/2017/make/Chevrolet/model/Corvette?format=csv
I can call the Web API and get the printed results (see the attached 'Web API results' snapshot).
When I try to export these results to a CSV file, I get the output shown in the attached 'API results csv'. Not all the records are transferred; right now, only the last record is written to the CSV.
My final output should look like the attached 'My final output should be' for all the given inputs.
Here is the Python code I used. I appreciate your help on this.
import csv, requests
with open('C:/Desktop/iva.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        urls = row[0]
        print(urls)
        r = requests.get(urls)
        print(r.text)
        with open('C:/Desktop/ivan.csv', 'w') as csvfile:
            csvfile.write(r.text)
You'll have to create a writer object for the output CSV file and use its writerow() method to write to it.
import csv, requests
with open('C:/Desktop/iva.csv', newline='') as f, open('C:/Desktop/ivan.csv', 'w', newline='') as csvfile:
    reader = csv.reader(f)
    # the writer wraps the output file, not the response text
    writerobj = csv.writer(csvfile)
    for row in reader:
        urls = row[0]
        print(urls)
        r = requests.get(urls)
        print(r.text)
        # parse the CSV text of the response and write it row by row
        for line in csv.reader(r.text.splitlines()):
            writerobj.writerow(line)
One problem in your code is that every time you open a file using open and mode w, any existing content in that file will be lost. You could prevent that by using append mode open(filename, 'a') instead.
Even better, just open the output file once, outside the for loop.
import csv, requests
with open('iva.csv') as infile, open('ivan.csv', 'w') as outfile:
    reader = csv.reader(infile)
    for row in reader:
        # row[0] holds the URL from the first column
        r = requests.get(row[0])
        outfile.write(r.text)
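One thing to watch when concatenating several CSV responses is that each one may start with its own header row. The sketch below keeps only the first header. It assumes every response begins with the same header line, which the question does not confirm; the merge_csv_texts helper and the sample response bodies are made up for illustration.

```python
import csv
import io

def merge_csv_texts(texts, out_file):
    """Write the rows of several CSV strings to out_file, keeping one header."""
    writer = csv.writer(out_file)
    header_written = False
    for text in texts:
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue  # skip empty responses
        if not header_written:
            writer.writerow(rows[0])
            header_written = True
        writer.writerows(rows[1:])

# Made-up response bodies standing in for r.text from each request.
responses = [
    "Year,Make,Model\r\n2013,Acura,RDX\r\n",
    "Year,Make,Model\r\n2017,Chevrolet,Corvette\r\n",
]
merged = io.StringIO()  # stands in for open('ivan.csv', 'w', newline='')
merge_csv_texts(responses, merged)
print(merged.getvalue())
```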

Merge txt files in a folder and replacing characters in python

I'm unsure how to continue my code. I need to take all the files in a folder and merge them into one file with a different text format.
Example:
The input files contain text lines like this:
"{'nr': '3173391045', 'data': '27/12/2017'}"
"{'nr': '2173391295', 'data': '05/01/2017'}"
"{'nr': '5173351035', 'data': '07/03/2017'}"
The output file must contain lines like this:
"3173391045","27/12/2017"
"2173391295","05/01/2017"
"5173351035","07/03/2017"
This is my working code; it works for merging the files and taking out the blank lines:
import glob2
import datetime
filenames = glob2.glob("*.txt")
with open(datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S-%f") + ".SAI", 'w') as file:
    for filename in filenames:
        with open(filename, "r") as f:
            file.write(f.read())
I'm trying something with .replace, but it is not working; I get syntax errors or blank files:
filedata = filedata.replace("{", "") for line in filedata
If your input files had contained valid JSON strings, the correct way would have been to parse the lines as JSON and write them back in csv. As strings are enclosed in single quotes (') they are rejected by the json module of the Python library, and my advice is to use a regex to parse them. Code could become:
import glob2
import datetime
import csv
import re
# the regex to parse the line
rx = re.compile(r".*'nr'\s*:\s*'(\d+)'.*'data'\s*:\s*'([/\d]+)'")
filenames = glob2.glob("*.txt")
with open(datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S-%f") + ".SAI", 'w', newline='') as file:
    wr = csv.writer(file, quoting=csv.QUOTE_ALL)
    for filename in filenames:
        with open(filename, "r") as f:
            for line in f:  # process line by line
                m = rx.match(line)
                if m:  # skip blank or non-matching lines
                    wr.writerow(m.groups())
With a few tweaks, the input data can be coerced into a form suitable for JSON parsing:
from datetime import datetime
import json
import glob2
import csv
with open(datetime.now().strftime("%Y-%m-%d-%H-%M-%S-%f") + ".SAI", 'w', newline='') as f_output:
    csv_output = csv.writer(f_output, quoting=csv.QUOTE_ALL)
    for filename in glob2.glob('*.txt'):
        with open(filename) as f_input:
            for row in f_input:
                row_dict = json.loads(row.strip('"\n').replace("'", '"'))
                csv_output.writerow([row_dict['nr'], row_dict['data']])
Giving you:
"3173391045","27/12/2017"
"2173391295","05/01/2017"
"5173351035","07/03/2017"
Note, in Python 3.x the output file should be opened with newline=''. Without this, extra blank lines can appear in the output file.
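Using regexes or replaces to parse those strings is dangerous: you could always stumble on data containing the delimiter, a comma, etc.
In this case, even though json cannot read those lines, ast.literal_eval can read the dict literal as it is (only the surrounding double quotes, if they really are in the file, need stripping):
import ast
import csv
import glob2
filenames = glob2.glob("*.txt")
with open("output.csv", "w", newline="") as fw:
    cw = csv.writer(fw)
    for filename in filenames:
        with open(filename) as f:
            for line in f:
                if line.strip():  # ignore blank lines
                    # drop the newline and any surrounding double quotes first
                    d = ast.literal_eval(line.strip().strip('"'))
                    cw.writerow([d['nr'], d['data']])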
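To check the ast.literal_eval idea against the sample lines from the question, here is a small in-memory sketch. It assumes the surrounding double quotes may or may not be present in the files, so the hypothetical parse_line helper handles both cases: a quoted line first evaluates to a string, which is then evaluated once more to get the dict.

```python
import ast
import csv
import io

# The three sample lines from the question, outer double quotes included.
lines = [
    "\"{'nr': '3173391045', 'data': '27/12/2017'}\"\n",
    "\"{'nr': '2173391295', 'data': '05/01/2017'}\"\n",
    "\"{'nr': '5173351035', 'data': '07/03/2017'}\"\n",
]

def parse_line(line):
    """Return the dict stored on one line, with or without outer quotes."""
    value = ast.literal_eval(line.strip())
    # A line wrapped in double quotes evaluates to a str; unwrap it once more.
    if isinstance(value, str):
        value = ast.literal_eval(value)
    return value

out = io.StringIO()  # stands in for the real output file
writer = csv.writer(out, quoting=csv.QUOTE_ALL)
for line in lines:
    if line.strip():  # ignore blank lines
        d = parse_line(line)
        writer.writerow([d["nr"], d["data"]])

print(out.getvalue())
```

Unlike eval, ast.literal_eval only accepts Python literals, so a malformed or unexpected line raises an exception instead of executing code.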

Import CSV NBA Stats in Excel

So after struggling for a long time, I've found a way to get the data from nba.com as comma-separated values.
This is the result http://stats.nba.com/stats/leaguedashplayerstats?DateFrom=&DateTo=&GameScope=&GameSegment=&LastNGames=15&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2015-16&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&VsConference=&VsDivision=
How do I get that into a nice CSV or excel file?
Or even better if possible, how can I automatically query this data like web querying a table through excel web query?
The following should get you started:
import requests
import csv
url = "http://stats.nba.com/stats/leaguedashplayerstats?DateFrom=&DateTo=&GameScope=&GameSegment=&LastNGames=15&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2015-16&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&VsConference=&VsDivision="
data = requests.get(url)
entries = data.json()
# Python 3: text mode with newline=''; on Python 2 use open('output.csv', 'wb') instead
with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(entries['resultSets'][0]['headers'])
    csv_output.writerows(entries['resultSets'][0]['rowSet'])
This would produce an output.csv file starting as follows:
PLAYER_ID,PLAYER_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,GP,W,L,W_PCT,MIN,OFF_RATING,DEF_RATING,NET_RATING,AST_PCT,AST_TO,AST_RATIO,OREB_PCT,DREB_PCT,REB_PCT,TM_TOV_PCT,EFG_PCT,TS_PCT,USG_PCT,PACE,PIE,FGM,FGA,FGM_PG,FGA_PG,FG_PCT,CFID,CFPARAMS
201166,Aaron Brooks,1610612741,CHI,31.0,13,6,7,0.462,17.5,105.8,106.8,-0.9,0.243,2.4,25.9,0.015,0.077,0.046,10.8,0.5,0.511,0.198,95.84,0.065,36,85,2.8,6.5,0.424,5,"201166,1610612741"
203932,Aaron Gordon,1610612753,ORL,20.0,15,3,12,0.2,23.0,98.9,106.4,-7.5,0.1,1.91,15.7,0.089,0.228,0.158,8.2,0.575,0.608,0.151,94.16,0.124,46,87,3.1,5.8,0.529,5,"203932,1610612753"
1626151,Aaron Harrison,1610612766,CHA,21.0,7,3,4,0.429,4.2,103.3,95.4,7.9,0.0,0.0,0.0,0.08,0.08,0.08,16.7,0.0,0.0,0.095,100.22,-0.032,0,5,0.0,0.7,0.0,5,"1626151,1610612766"
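The conversion step can be tried without a network call. The sketch below uses a trimmed payload in the same shape as the response (a headers list plus a rowSet list of lists), with a few values copied from the output above. The result_set_to_csv helper is a made-up name for illustration.

```python
import csv
import io

# Trimmed sample in the same shape as data.json() in the answer.
entries = {
    "resultSets": [{
        "headers": ["PLAYER_ID", "PLAYER_NAME", "TEAM_ID", "CFPARAMS"],
        "rowSet": [
            [201166, "Aaron Brooks", 1610612741, "201166,1610612741"],
            [203932, "Aaron Gordon", 1610612753, "203932,1610612753"],
        ],
    }]
}

def result_set_to_csv(result_set, out_file):
    """Write one result set as CSV: the header row first, then every data row."""
    writer = csv.writer(out_file)
    writer.writerow(result_set["headers"])
    writer.writerows(result_set["rowSet"])

out = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
result_set_to_csv(entries["resultSets"][0], out)
csv_text = out.getvalue()
print(csv_text)
```

The CFPARAMS value contains a comma, so csv.writer wraps it in double quotes, which is why Excel keeps it in a single cell when importing the file.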
