Write StringIO object to csv file with newline character - python

I am trying to write some Chinese characters to a CSV file using a StringIO object.
Here is my code:
import csv
import io
csvRow=['emp_name','Erica Meyers','中国日报网','IT']
data_temp = io.StringIO()
writer1 = csv.writer(data_temp, delimiter=',')
writer1.writerow(csvRow)
Then I attach the data_temp object to a Jira issue:
thisJira.add_attachment(issue=new_issue, attachment=data_temp, filename='CSVResult.csv')
In the CSV file I get these garbled characters:
(screenshot: Csv Result)
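One common cause of garbled non-ASCII text in a CSV is that the file has no byte-order mark, so a spreadsheet tool guesses a legacy encoding. The following is only a sketch under that assumption: it encodes the StringIO contents as UTF-8 with a BOM and wraps them in a binary buffer. The name csv_bytes is hypothetical, and whether the Jira client accepts a binary file-like object as the attachment is an assumption that is not verified here.

```python
import csv
import io

csv_row = ['emp_name', 'Erica Meyers', '中国日报网', 'IT']

# newline='' lets the csv module control line endings itself
text_buffer = io.StringIO(newline='')
writer = csv.writer(text_buffer, delimiter=',')
writer.writerow(csv_row)

# 'utf-8-sig' prepends a BOM so that spreadsheet tools can detect UTF-8
csv_bytes = io.BytesIO(text_buffer.getvalue().encode('utf-8-sig'))
```

The binary buffer csv_bytes would then be passed where data_temp is passed in the question.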

Related

Pandas read_csv throws ValueError while reading gzip file

I am trying to read a gzip file using pandas.read_csv like so:
import pandas as pd
df = pd.read_csv("data.ZIP.gz", usecols=[*range(0, 39)], encoding="latin1", skipinitialspace=True)
But it throws this error:
ValueError: Passed header names mismatches usecols
However, if I manually extract the zip file from the gz file, then read_csv is able to read the data without errors:
df = pd.read_csv("data.ZIP", usecols=[*range(0, 39)], encoding="latin1", skipinitialspace=True)
Since I have to read a lot of these files I don't want to manually extract them. So, how can I fix this error?
You have two levels of compression, gzip and zip, but pandas knows how to handle only one level of compression at a time.
You can use the gzip and zipfile modules with io.BytesIO to extract the data to a file-like object in memory.
Here is minimal working code. It is useful when the zip contains several files and you want to choose which one to extract:
import pandas as pd
import gzip
import zipfile
import io
with gzip.open('data.csv.zip.gz') as f1:
    data = f1.read()
file_like_object_1 = io.BytesIO(data)
with zipfile.ZipFile(file_like_object_1) as f2:
    #print([x.filename for x in f2.filelist]) # list all filenames
    #data = f2.read('data.csv') # extract selected filename
    #data = f2.read(f2.filelist[0]) # extract first file
    data = f2.read(f2.filelist[0].filename) # extract first file
file_like_object_2 = io.BytesIO(data)
df = pd.read_csv(file_like_object_2)
print(df)
But if the zip contains only one file, read_csv can extract it directly. You need to pass compression='zip' because a file-like object has no filename, so read_csv cannot use the filename's extension to recognize the compressed file.
import pandas as pd
import gzip
import io
with gzip.open('data.csv.zip.gz') as f1:
    data = f1.read()
file_like_object_1 = io.BytesIO(data)
df = pd.read_csv(file_like_object_1, compression='zip')
print(df)
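The two-layer unpacking can be tried without any files on disk. The sketch below builds a small zip-inside-gzip payload in memory (the member name data.csv and its contents are made up for illustration), then peels off the gzip layer and reads the first zip member, using only the standard library.

```python
import gzip
import io
import zipfile

# Build a small .zip.gz payload in memory (stand-in for a real file)
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, 'w') as zf:
    zf.writestr('data.csv', 'a,b\n1,2\n')
payload = gzip.compress(zip_buffer.getvalue())

# Layer 1: strip gzip; layer 2: read the first member of the zip
with gzip.open(io.BytesIO(payload)) as f1:
    inner_zip = io.BytesIO(f1.read())
with zipfile.ZipFile(inner_zip) as f2:
    first_name = f2.namelist()[0]
    csv_bytes = f2.read(first_name)
```

Wrapping csv_bytes in io.BytesIO gives the file-like object that read_csv receives in the answer above.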
Use the gzip module to decompress all your files, something like this:
for file in list_file_names:
    file_name=file.replace(".gz","")
    with gzip.open(file, 'rb') as f:
        file_content = f.read()
    with open(file_name,"wb") as r:
        r.write(file_content)
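For large archives, reading the whole file into memory with f.read() may be undesirable. A possible variant, sketched here with a hypothetical helper named gunzip_file, copies the decompressed stream in chunks with shutil.copyfileobj and derives the output name by dropping the final .gz suffix.

```python
import gzip
import shutil
from pathlib import Path

def gunzip_file(gz_path):
    """Decompress gz_path next to itself, dropping the .gz suffix."""
    gz_path = Path(gz_path)
    out_path = gz_path.with_suffix('')  # 'data.ZIP.gz' -> 'data.ZIP'
    with gzip.open(gz_path, 'rb') as src, open(out_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)  # copies in chunks, not all at once
    return out_path
```

Calling gunzip_file on each entry of list_file_names would replace the body of the loop above.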
You can use the zipfile module, for example:
import zipfile
with zipfile.ZipFile(path_to_zip_file, 'r') as zip_ref:
    zip_ref.extractall(directory_to_extract_to)

Write to encrypted Excel file in pandas

I have an encrypted Excel file that I need to work with. I know how to read data from it using this method:
import io
import pandas as pd
import msoffcrypto
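password = 'something'
# The original snippet used a bare return, so it is wrapped in a function here;
# the name decrypt_excel is illustrative
def decrypt_excel(path_to_excel, password):
    decrypted_file = io.BytesIO()
    with open(path_to_excel, "rb") as file:
        excel_file = msoffcrypto.OfficeFile(file)
        excel_file.load_key(password)
        excel_file.decrypt(decrypted_file)
    return decrypted_file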
How to read the data: From password-protected Excel file to pandas DataFrame
Now my question is: how do I write back to such files?

Convert json output to CSV for AWS API output "client.describe_trusted_advisor_check_result"

I want to convert the JSON output of the AWS API call client.describe_trusted_advisor_check_result to CSV with a Python script. The JSON output I get is nested.
import json
import csv
with open('idcheck.json', 'r') as f:
    json_dict = json.load(f)
# Open new CSV file (newline='' avoids blank rows on Windows)
with open("output5.csv", "w", newline='') as csv_file:
    writer = csv.writer(csv_file)
    # Write CSV headers
    writer.writerow(['status', 'region', 'resourceId', 'isSuppressed'])
    # Write data to CSV for each image
    images_data = json_dict['result']['flaggedResources']
    for image in images_data:
        # Index the loop variable, not the list itself
        writer.writerow([image['status'],
                         image['region'],
                         image['resourceId'],
                         image['isSuppressed']])
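One way to handle the nested structure is csv.DictWriter, which picks the wanted keys from each record. The sketch below runs on a small made-up sample shaped like the result/flaggedResources nesting used in the question; the sample values are hypothetical and the real response may contain additional keys, which extrasaction='ignore' skips.

```python
import csv
import io
import json

# Hypothetical sample shaped like the nested response in the question
sample = json.loads('''
{"result": {"flaggedResources": [
  {"status": "warning", "region": "us-east-1", "resourceId": "abc", "isSuppressed": false},
  {"status": "ok", "region": "eu-west-1", "resourceId": "def", "isSuppressed": true}
]}}
''')

fields = ['status', 'region', 'resourceId', 'isSuppressed']
out = io.StringIO(newline='')
writer = csv.DictWriter(out, fieldnames=fields, extrasaction='ignore')
writer.writeheader()
for resource in sample['result']['flaggedResources']:
    writer.writerow(resource)  # keys missing from a row are written as ''
csv_text = out.getvalue()
```

Replacing the io.StringIO buffer with a file opened with newline='' writes the same rows to disk.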

Open file from zip without extracting it in Python?

I am working on a script that fetches a zip file from a URL using the requests library. The zip file contains a CSV file, and I'm trying to read that CSV file without saving it to disk. But parsing fails with this error: _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
import csv
import requests
from io import BytesIO, StringIO
from zipfile import ZipFile
response = requests.get(url)
zip_file = ZipFile(BytesIO(response.content))
files = zip_file.namelist()
with zip_file.open(files[0]) as csvfile:
    csvreader = csv.reader(csvfile)
    # _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
    for row in csvreader:
        print(row)
Try this:
import pandas as pd
import requests
from io import BytesIO, StringIO
from zipfile import ZipFile
response = requests.get(url)
zip_file = ZipFile(BytesIO(response.content))
files = zip_file.namelist()
with zip_file.open(files[0]) as csvfile:
    print(pd.read_csv(csvfile, encoding='utf8', sep=","))
As @Aran-Fey alluded to:
import zipfile
import csv
import io
with open('/path/to/archive.zip', 'rb') as f:  # the archive itself must be opened in binary mode
    with zipfile.ZipFile(f) as zf:
        csv_filename = zf.namelist()[0] # see namelist() for the list of files in the archive
        with zf.open(csv_filename) as csv_f:
            csv_f_as_text = io.TextIOWrapper(csv_f)
            reader = csv.reader(csv_f_as_text)
csv.reader (and csv.DictReader) require a file-like object opened in text mode. Normally this is not a problem when you simply open(...) a file in 'r' mode because, as the Python 3 docs say, text mode is the default: "The default mode is 'r' (open for reading text, synonym of 'rt')". But ZipFile.open() returns a binary stream, and if you try 'rt' with open on a ZipFile, you'll see an error saying that open() requires mode "r" or "w":
with zf.open(csv_filename, 'rt') as csv_f:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
...
ValueError: open() requires mode "r" or "w"
That's what io.TextIOWrapper is for -- for wrapping byte streams to be readable as text, decoding them on the fly.
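The wrapping step can be tried end to end with an in-memory archive. In this sketch the archive contents are made up for illustration, and the encoding is assumed to be UTF-8; newline='' is passed so that the csv module handles line endings itself.

```python
import csv
import io
import zipfile

# In-memory archive standing in for a downloaded zip (hypothetical data)
archive = io.BytesIO()
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('rows.csv', 'name,dept\nErica,IT\n')

with zipfile.ZipFile(archive) as zf:
    with zf.open(zf.namelist()[0]) as raw:  # yields bytes
        text = io.TextIOWrapper(raw, encoding='utf-8', newline='')
        rows = list(csv.reader(text))
```

With a downloaded file, io.BytesIO(response.content) from the question takes the place of archive.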

How to read json file which has multiple json objects separated by new line?

I want to read a JSON file in which each line contains a new JSON object.
The file looks like this:
{'P':'a1','D':'b1','T':'c1'}
{'P':'a2','D':'b2','T':'c2'}
{'P':'a3','D':'b3','T':'c3'}
{'P':'a4','D':'b4','T':'c4'}
I'm trying to read this file like this:
print(pd.read_json("sample.json", lines=True))
I get the following exception:
ValueError: Expected object or value
The actual sample.json file is about 240 MB and has exactly this format: each line contains one new JSON object. I want to read this file using Python pandas.
As others have said in the comments, it's not really JSON, because JSON requires double-quoted strings. You can use ast.literal_eval():
import pandas as pd
import ast
with open('sample.json') as f:
    content = f.readlines()
pd.DataFrame([ast.literal_eval(line) for line in content])
Or replace the single quotes with doubles:
import pandas as pd
import json
with open('sample.json') as f:
    content = f.readlines()
pd.DataFrame([json.loads(line.replace("'", '"')) for line in content])
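Two caveats may matter for a file of this size. readlines() holds every line in memory at once, and the quote replacement breaks on any value that itself contains an apostrophe. A possible sketch that avoids both, using a hypothetical generator named iter_records and made-up sample lines:

```python
import ast
import io

def iter_records(lines):
    """Yield one dict per non-empty line of Python-literal text."""
    for line in lines:
        line = line.strip()
        if line:
            yield ast.literal_eval(line)

# Stand-in for open('sample.json'); a file object is also iterated line by line
sample_text = """{'P':'a1','D':'b1','T':'c1'}
{'P':'a2','D':"it's",'T':'c2'}
"""
records = list(iter_records(io.StringIO(sample_text)))
```

The resulting list (or the generator itself) can be passed to pd.DataFrame as in the answers above.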
