aws lambda python append to file from S3 object - python

I am trying to write the contents read from an S3 object to a file, but I am getting an error while doing so.
object = s3.get_object(Bucket=bucket_name, Key="toollib/{0}/{1}/stages/{0}.groovy".format(tool, platform))
print(object)
jenkinsfile = object['Body'].read()
print(jenkinsfile)
basepath = '/mnt/efs/{0}/{1}/{2}/'.format(orderid, platform, technology)
filename = basepath + fileName
print(filename)
#file1=open(filename, "a")
with open(filename, 'a') as file:
    file.write(jenkinsfile)
Error : "errorMessage": "write() argument must be str, not bytes"

The Body read from S3 is a bytes object, while a file opened in text mode ('a') only accepts str. Opening the file in binary mode should do the trick:
with open(filename, 'ab') as file:
    file.write(jenkinsfile)
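To illustrate the two ways around this error, here is a minimal sketch that appends either bytes (binary mode) or decoded text (text mode). The helper name append_payload, the sample bytes and the temporary path are made up for the example; sample_body only stands in for what object['Body'].read() returns, and UTF-8 is an assumption about the file's encoding.

```python
import os
import tempfile


def append_payload(path, payload, encoding="utf-8"):
    """Append payload to path, picking the file mode from the payload type."""
    if isinstance(payload, bytes):
        # bytes must go to a handle opened in binary mode
        with open(path, "ab") as fh:
            fh.write(payload)
    else:
        # str goes to a text-mode handle; newline="" avoids newline translation
        with open(path, "a", encoding=encoding, newline="") as fh:
            fh.write(payload)


# Hypothetical stand-in for object['Body'].read()
sample_body = b"pipeline { agent any }\n"
demo_path = os.path.join(tempfile.mkdtemp(), "demo.groovy")

append_payload(demo_path, sample_body)                  # binary append
append_payload(demo_path, sample_body.decode("utf-8"))  # text append after decoding
```

Either variant works; decoding first is only useful if you need to process the content as text before writing it.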

Related

Convert csv from UTF16 to UTF8 inside a Google Cloud function - Python

I am trying to convert a CSV file from UTF-16-LE to UTF-8 inside a Google Cloud Function in Python 3.9.
The UTF-16 file is in a bucket, but my code can't find it.
from google.cloud import storage
import codecs
import shutil
def utf_convert(blob_name):
    bucket_name = "test_bucket"
    blob_name = "Testutf16.csv"
    new_file = "Testutf8.csv"
    storage_client = storage.Client()
    source_bucket = storage_client.bucket(bucket_name)
    source_blob = source_bucket.blob(blob_name)
    with codecs.open(source_blob, encoding="utf-16-le") as input_file:
        with codecs.open(
                new_file, "w", encoding="utf-8") as output_file:
            shutil.copyfileobj(input_file, output_file)
I receive the following error:
TypeError: expected str, bytes or os.PathLike object, not Blob
If I try to pass the name of the file or the URI directly to codecs.open,
with codecs.open("gs://test_bucket/Testutf16.csv", encoding="utf-16-le")
I receive the following error:
FileNotFoundError: [Errno 2] No such file or directory
How can my file be found?
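Both errors come from the same cause: codecs.open only accepts a local file path, not a Blob object or a gs:// URI. One possible approach, sketched below under assumptions, is to fetch the bytes from the blob, convert them in memory and upload the result. The function names utf16le_to_utf8 and convert_blob are hypothetical; source_blob and target_blob are assumed to be google.cloud.storage Blob objects, of which only download_as_bytes and upload_from_string are used.

```python
def utf16le_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16-LE bytes as UTF-8, dropping a leading BOM if present."""
    text = data.decode("utf-16-le")
    if text.startswith("\ufeff"):
        text = text[1:]
    return text.encode("utf-8")


def convert_blob(source_blob, target_blob):
    """Read source_blob, convert its content and store it in target_blob.

    Both arguments are assumed to be google.cloud.storage Blob objects,
    for example storage_client.bucket("test_bucket").blob("Testutf16.csv").
    """
    raw = source_blob.download_as_bytes()
    target_blob.upload_from_string(utf16le_to_utf8(raw), content_type="text/csv")
```

This keeps the whole file in memory, which is fine for small CSV files; for large files, downloading to /tmp and converting in chunks would be the safer choice.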

How to load CSV file from GCS in read only mode

file_name = "r1.csv"
client = storage.Client()
bucket = client.get_bucket('upload-testing')
blob = bucket.get_blob(file_name)
blob.download_to_filename("csv_file")
I want to open the r1.csv file in read-only mode, but I am getting this error:
with open(filename, 'wb') as file_obj:
Error: [Errno 30] Read-only file system: 'csv_file'
So the function download_to_filename opens the file in wb mode. Is there any way through which I can open r1.csv in read-only mode?
As mentioned in the other answer, you need to use the r mode; however, you don't need to specify it, since it is the default mode.
In order to read the file itself, you first need to download it, then read its content and process the data as you want. The following example downloads the GCS file to a temporary folder, opens the downloaded file and reads all its data:
storage_client = storage.Client()
bucket = storage_client.get_bucket("<BUCKET_NAME>")
blob = bucket.blob("<CSV_NAME>")
blob.download_to_filename("/tmp/test.csv")
with open("/tmp/test.csv") as file:
    data = file.read()
    <TREAT_DATA_AS_YOU_WISH>
This example is intended to run inside GAE (Google App Engine), where /tmp is the writable location.
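The same idea can be wrapped in a small helper that builds the download path from the system temporary directory instead of hard-coding /tmp. This is only a sketch: read_blob_text and the default file name are made up for illustration, and blob is assumed to be a google.cloud.storage Blob, of which only download_to_filename is used.

```python
import os
import tempfile


def read_blob_text(blob, local_name="download.csv", encoding="utf-8"):
    """Download blob into the temp directory and return its content as text."""
    # The temp directory is writable, unlike the rest of the file system here
    local_path = os.path.join(tempfile.gettempdir(), local_name)
    blob.download_to_filename(local_path)  # the library opens this path for writing
    # Reopen the downloaded copy read-only ('r' is also the default mode)
    with open(local_path, "r", encoding=encoding) as fh:
        return fh.read()
```

The write happens only inside the temporary directory; your own code never opens the file in a writable mode.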
If you want to open a file read-only, you should use 'r' mode; 'wb' means write binary:
with open(filename, 'r') as file_obj:

Why does args.input shows me none type error

I get an error like this when I run it in my terminal:
filename = open(input(), 'rb')
input_file = filename
open(args.input_file, "rb").read()
This is the output:
File "<stdin>", line 1, in <module>
TypeError: expected str, bytes or os.PathLike object, not NoneType
This is also in the script:
parser = argparse.ArgumentParser()
parser.add_argument("-i", dest="input_file", help="no file with this name")
args = parser.parse_args()
The path I enter is /storage/emulated/0/filrname.txt
filename = open(input(), 'rb')
This opens the file whose name you entered in the terminal and returns a file object; the variable filename is therefore a file object, not the name of the file.
open(args.input_file, "rb").read()
args is not defined in the script you provided. Even if it were, args.input_file is probably not set: when the -i option is not passed on the command line, argparse leaves it as None, and passing None to open causes the TypeError.
I assume you're trying to open the file entered by the user. You can do it like this:
filename = input()
with open(filename, 'rb') as f:
    data = f.read()
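If you would rather keep the -i option than call input(), one way to avoid the None value is to mark the option as required, so argparse exits with a usage message instead of handing None to open. A minimal sketch follows; build_parser and read_input are hypothetical names, and the help text is changed for the example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # required=True makes argparse stop with a usage error when -i is missing
    parser.add_argument("-i", dest="input_file", required=True,
                        help="path of the file to read")
    return parser


def read_input(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and return the file content as bytes."""
    args = build_parser().parse_args(argv)
    with open(args.input_file, "rb") as f:
        return f.read()
```

From the terminal this would be run as python script.py -i /storage/emulated/0/filrname.txt, with read_input() called without arguments inside the script.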

PyPDF2.PdfFileWriter addAttachment not working

Based on https://programtalk.com/python-examples/PyPDF2.PdfFileWriter/, example 2, I am trying to add an attachment to a PDF file.
Here is the code I am trying to run:
import os
from django.conf import settings
from PyPDF2 import PdfFileReader, PdfFileWriter
...
doc = os.path.join(settings.BASE_DIR, "../media/SC/myPDF.pdf")
reader = PdfFileReader(doc, "rb")
writer = PdfFileWriter()
writer.appendPagesFromReader(reader)
writer.addAttachment("The filename to display", "The data in the file")
with open(doc, "wb") as fp:
    writer.write(fp)
When I run this code, I get: "TypeError: a bytes-like object is required, not 'str'".
If I replace
with open(doc, 'wb') as fp:
    writer.write(fp)
by:
with open(doc, 'wb') as fp:
    writer.write(b'fp')
I get this error: "'bytes' object has no attribute 'write'".
And if I try:
with open(doc, 'w') as fp:
    writer.write(fp)
I get this error: "write() argument must be str, not bytes"
Can anyone help me?
The second argument of addAttachment has to be a bytes-like object; the file handling in your original code is fine. You can get bytes by encoding the string:
writer.addAttachment("The filename to display", "The data in the file".encode())
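If the attachment data sometimes arrives as str and sometimes as bytes, a small helper can normalise it before the call. This is only a sketch: as_attachment_bytes is a made-up name, UTF-8 is an assumed encoding, and PyPDF2 itself is not imported here.

```python
def as_attachment_bytes(data, encoding="utf-8"):
    """Return data as bytes, encoding it first when it is a str."""
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    if isinstance(data, str):
        return data.encode(encoding)
    raise TypeError("attachment data must be str or bytes, not %s"
                    % type(data).__name__)


payload = as_attachment_bytes("The data in the file")
# payload can then be passed as the second argument:
# writer.addAttachment("The filename to display", payload)
```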

How to write to a csv file on the local file system using PySpark

I have this code that writes to a CSV file on the local filesystem, but I get this error: IOError: [Errno 2] No such file or directory: 'file:///folder1/folder2/output.csv'
columns = [0,1,2,3,4,5,6,7,8,9]
data1 = rdd1.map(lambda row: [row[i].encode("utf-8") for i in columns])
data1_tuple = data1.map(tuple)
with open("file:///folder1/folder2/output.csv", "w") as fw:
    writer = csv.writer(fw, delimiter = ';')
    for (r1, r2) in izip(data1_tuple.toLocalIterator(), labelsAndPredictions.toLocalIterator()):
        writer.writerow(r1 + r2[1:2])
On my local filesystem the following directory exists: /folder1/folder2/. Why is it throwing this error, and how can I write to a CSV file on the local filesystem in a specific directory?
According to the documentation, the path argument for open is
"either a string or bytes object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped",
not a URI. This means your code should look as follows:
with open("/folder1/folder2/output.csv", "w") as fw:
    writer = csv.writer(fw, delimiter = ';')
    ...
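If the location reaches your code as a file:// URI (for example because the same setting is also passed to Spark), the scheme can be stripped before calling open. A minimal sketch for POSIX-style paths, written for Python 3; local_path_from_uri is a hypothetical helper name.

```python
from urllib.parse import unquote, urlparse


def local_path_from_uri(uri):
    """Turn a file:// URI (or a plain path) into a path usable with open()."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("", "file"):
        raise ValueError("not a local file URI: %r" % uri)
    # unquote turns percent-escapes such as %20 back into characters
    return unquote(parsed.path)


path = local_path_from_uri("file:///folder1/folder2/output.csv")
```

Here path is "/folder1/folder2/output.csv", which is the form open expects.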
