I'm pulling in a file from FTP that I eventually want to put in a pandas DataFrame. I am stuck on decoding the output into a string that can be read by pd.read_csv.
def fetch_data():
    ftp = FTP('hostname')
    ftp.login('username','password')
    files = ftp.nlst()
    output = []
    for file in files:
        filedata = open("C:/Users/USER/" + file, 'w+b')
        ftp.retrbinary("RETR " + file, filedata.write)
        ftp.quit()
        decoded_data = bytes.decode(filedata)
        output_frame = pd.read_csv(decoded_data)
        output.append(output_frame)
Here's the traceback:
Traceback (most recent call last):
  File "dataframe.py", line 70, in <module>
    fetch_data()
  File "dataframe.py", line 32, in fetch_data
    decoded_data = bytes.decode(filedata)
TypeError: descriptor 'decode' requires a 'bytes' object but received a '_io.BufferedRandom'
I think I am misunderstanding the binary information coming from ftp.retrbinary.
What's the best way to decode this information so that it can be read by pd.read_csv?
filedata is a file object, not bytes, which is why bytes.decode rejects it. Since the file is already saved to disk, you can skip the decoding step and simply pass pd.read_csv the path to your downloaded file:
output_frame = pd.read_csv("C:/Users/USER/" + file)
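If the file does not need to stay on disk, one alternative is to collect the downloaded bytes in an in-memory buffer, since pd.read_csv accepts file-like objects as well as paths. The following is a minimal sketch, assuming ftp is an already logged-in ftplib.FTP connection; the helper name download_to_buffer is hypothetical and not part of ftplib or pandas.

```python
import io


def download_to_buffer(ftp, name):
    """Fetch one remote file into an in-memory bytes buffer.

    `ftp` is assumed to be a connected, logged-in ftplib.FTP object.
    """
    buffer = io.BytesIO()
    # retrbinary calls buffer.write once for every chunk of bytes received
    ftp.retrbinary("RETR " + name, buffer.write)
    buffer.seek(0)  # rewind so the next reader starts at the beginning
    return buffer
```

With pandas imported as pd, output_frame = pd.read_csv(download_to_buffer(ftp, file)) should then parse the data without a temporary file. If you keep writing to disk instead, close filedata (or open it in a with block) before reading the path back, so that all data has been flushed.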
I want to convert multiple resumes in a particular directory into base64 strings and save each one to a text file at the same time.
What I have tried so far:
import base64
import sys
with open("filename.pdf", "rb") as pdf_file, open("filename.pdf","w") as output:
    encoded_string = base64.b64encode(pdf_file.read(),output.write())
I get this error when I run the code:
Traceback (most recent call last):
  File "encode.py", line 5, in <module>
    encoded_string = base64.b64encode(pdf_file.read(),output.write())
TypeError: write() takes exactly one argument (0 given)
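output.write() needs the encoded data as its argument, so it has to wrap the b64encode call rather than be passed to it. It should be: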
output.write(base64.b64encode(pdf_file.read()))
or:
encoded_string = base64.b64encode(pdf_file.read())
output.write(encoded_string)
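To cover the "multiple resumes in a directory" part of the question, the same call can be applied in a loop. This is a rough sketch for Python 3, where b64encode returns bytes and the output is therefore written in binary mode. The function name encode_pdfs and the choice to name each output after the PDF with a .txt extension are assumptions made for illustration.

```python
import base64
from pathlib import Path


def encode_pdfs(directory):
    """Write a base64 .txt file next to every PDF found in `directory`."""
    written = []
    for pdf_path in sorted(Path(directory).glob("*.pdf")):
        encoded = base64.b64encode(pdf_path.read_bytes())
        # Use a different name for the output so the source PDF is not overwritten
        out_path = pdf_path.with_name(pdf_path.stem + ".txt")
        out_path.write_bytes(encoded)  # b64encode returns bytes, so write bytes
        written.append(out_path)
    return written
```

Note that the original attempt opened "filename.pdf" for both reading and writing, which would truncate the PDF; writing to a separate file avoids that.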
I've just rebuilt my Raspberry Pi and hence installed the latest version of the Dropbox API and now my program doesn't work. I think this is due to point 1 in these breaking changes: https://github.com/dropbox/dropbox-sdk-python/releases/tag/v7.1.0. I'm sure this question from SO (Dropbox API v2 - trying to upload file with files_upload() - throws TypeError) solves my problem... but as a newbie, I can't figure out how to actually implement it - and anyway, I'm already using f.read()... can anyone help?
This is my code:
def DropboxUpload(file):
    sourcefile = "/home/pi/Documents/iot_pm2/dropbox_transfer/" + filename
    targetfile = "/" + filename
    dbx = dropbox.Dropbox(cfg.dropboxtoken)
    f = open(sourcefile, "r")
    filecontents = f.read()
    try:
        dbx.files_upload(filecontents, targetfile, mode=dropbox.files.WriteMode.overwrite)
    except dropbox.exceptions.ApiError as err:
        print(err)
    f.close()
And this is the error:
Traceback (most recent call last):
  File "/home/pi/Documents/iot_pm2/dropbox_uploader.py", line 20, in <module>
    DropboxUpload(filename)
  File "/home/pi/Documents/iot_pm2/dropbox_uploader.py", line 12, in DropboxUpload
    dbx.files_upload(filecontents, targetfile, mode=dropbox.files.WriteMode.overwrite)
  File "/usr/local/lib/python3.5/dist-packages/dropbox/base.py", line 2125, in files_upload
    f,
  File "/usr/local/lib/python3.5/dist-packages/dropbox/dropbox.py", line 272, in request
    timeout=timeout)
  File "/usr/local/lib/python3.5/dist-packages/dropbox/dropbox.py", line 363, in request_json_string_with_retry
    timeout=timeout)
  File "/usr/local/lib/python3.5/dist-packages/dropbox/dropbox.py", line 407, in request_json_string
    type(request_binary))
TypeError: expected request_binary as binary type, got <class 'str'>
Thanks in advance.
You need to supply bytes, but you're supplying str.
You can get bytes by changing the file mode to binary. I.e., instead of:
f = open(sourcefile, "r")
do:
f = open(sourcefile, "rb")
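As a small illustration of the difference, reading in "rb" mode returns bytes, whereas "r" returns str. The helper below is only a sketch (the name read_file_bytes is hypothetical); it also uses a with block so the file is closed even if a later step raises.

```python
def read_file_bytes(path):
    """Return the file's contents as bytes, closing the file afterwards."""
    with open(path, "rb") as f:  # "rb" yields bytes; "r" would yield str
        return f.read()
```

In the question's function, filecontents = read_file_bytes(sourcefile) could then be passed to dbx.files_upload unchanged.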
I am trying to get the code below to read the file raw.txt, split it by lines and save every individual line as a .txt file. I then want to append every text file to splits.zip, and delete them after appending so that the only thing remaining when the process is done is the splits.zip, which can then be moved elsewhere to be unzipped. With the current code, I get the following error:
Traceback (most recent call last):
  File "/Users/Simon/PycharmProjects/text-tools/file-splitter-txt.py", line 13, in <module>
at stonehenge summoning the all father.
    z.write(new_file)
  File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 1123, in write
    st = os.stat(filename)
TypeError: coercing to Unicode: need string or buffer, file found
My code:
import zipfile
import os
z = zipfile.ZipFile("splits.zip", "w")
count = 0
with open('raw.txt','r') as infile:
    for line in infile:
        print line
        count +=1
        with open(str(count) + '.txt','w') as new_file:
            new_file.write(str(line))
        z.write(new_file)
        os.remove(new_file)
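You could simply use writestr to write a string directly into the ZipFile, which avoids creating and deleting the temporary files altogether. For example (zf here is the open ZipFile object, z in your code):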
zf.writestr(str(count) + '.txt', str(line), compress_type=...)
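Put together, the writestr approach might look like the sketch below. It is written for Python 3 (the question uses Python 2's print statement), and the function name split_lines_into_zip and the choice of ZIP_DEFLATED compression are assumptions for illustration.

```python
import zipfile


def split_lines_into_zip(source_path, zip_path):
    """Store each line of source_path as its own numbered .txt archive entry."""
    count = 0
    with open(source_path, "r") as infile, \
            zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for count, line in enumerate(infile, start=1):
            # writestr takes the archive member name and its contents,
            # so no temporary file is created on disk
            zf.writestr(str(count) + ".txt", line)
    return count  # number of entries written
```

Because nothing but the archive is ever written to disk, there is nothing left to clean up afterwards.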
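Alternatively, use the file name as shown below. ZipFile.write expects a file name and os.remove expects a path, but you passed the open file object (new_file) to both. Make these calls after the with block has closed the file, so its contents are flushed before it is added to the archive: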
z.write(str(count) + '.txt')
os.remove(str(count) + '.txt')
I need to read from a file line by line. I also need to make sure the encoding is handled correctly.
I wrote the following code:
#!/bin/bash
import codecs
filename = "something.x10"
f = open(filename, 'r')
fEncoded = codecs.getreader("ISO-8859-15")(f)
totalLength = 0
for line in fEncoded:
    totalLength+=len(line)
print("Total Length is "+totalLength)
This code does not work on all files; on some files I get:
Traceback (most recent call last):
  File "test.py", line 11, in <module>
    for line in fEncoded:
  File "/usr/lib/python3.2/codecs.py", line 623, in __next__
    line = self.readline()
  File "/usr/lib/python3.2/codecs.py", line 536, in readline
    data = self.read(readsize, firstline=True)
  File "/usr/lib/python3.2/codecs.py", line 480, in read
    data = self.bytebuffer + newdata
TypeError: can't concat bytes to str
I'm using Python 3.3 and the script must work with this Python version.
What am I doing wrong? I was not able to find out which files work and which do not; even some plain ASCII files fail.
You are opening the file in text (non-binary) mode. When you read from it, you get a str that has already been decoded according to your default encoding (http://docs.python.org/3/library/functions.html?highlight=open%20builtin#open).
The codecs StreamReader needs a byte stream (http://docs.python.org/3/library/codecs#codecs.StreamReader).
So this should work:
import codecs
filename = "something.x10"
f = open(filename, 'rb')  # binary mode: the StreamReader does the decoding
f_decoded = codecs.getreader("ISO-8859-15")(f)
total_length = 0
for line in f_decoded:
    total_length += len(line)
print("Total Length is " + str(total_length))
or you can use the encoding parameter on open:
f_decoded = open(filename, mode='r', encoding='ISO-8859-15')
The reader returns decoded data, so I fixed your variable name (fEncoded became f_decoded). Also, consider PEP 8 as a guide for formatting and coding style.
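The second variant, using the encoding parameter of open, can be sketched as a complete function. The function name total_length is hypothetical; the with block is added so the file is closed once the loop finishes.

```python
def total_length(path, encoding="ISO-8859-15"):
    """Sum the number of decoded characters over all lines of a file."""
    total = 0
    # open() decodes the bytes itself when given an explicit encoding
    with open(path, mode="r", encoding=encoding) as f:
        for line in f:
            total += len(line)
    return total
```

For example, print("Total Length is", total_length("something.x10")) avoids concatenating a str with an int, which the original print call would have attempted.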
I am trying to get an image from a website and I don't know what I am doing wrong.
Here is my code:
import httplib2
h = httplib2.Http('.cache')
response, content = h.request('http://1.bp.blogspot.com/-KSBzFF0bIyU/TtFyj-KgC0I/AAAAAAAABEo/o43IoY_9Bec/s1600/praia-de-ponta-verde-maceio.jpg')
print(response.status)
with open('maceio.jpg', 'wb') as f:
    print(content, file = f)
--------------------------------------------------------------------------------
200
Traceback (most recent call last):
  File "/home/matheus/workspace/Get Link/new_method_v2.py", line 12, in <module>
    print(content, file = f)
TypeError: 'str' does not support the buffer interface
The error is caused by the following line:
print(content, file = f)
print implicitly converts the bytes object named content to a string (str object), which cannot be written to a file opened in binary mode: a binary file accepts only bytes, and Python will not guess a character encoding to turn the str back into bytes.
Why take the print detour at all? Just write the contents to the file using the file.write() method:
with open('maceio.jpg', 'wb') as f:
    f.write(content)
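To see why print is the wrong tool here, it may help to look at what the str conversion actually produces. In the sketch below, content is a stand-in value (the first four bytes of a typical JPEG file) rather than a real HTTP response, and save_bytes is a hypothetical helper name.

```python
# Stand-in for the response body; a real download would be much longer
content = b"\xff\xd8\xff\xe0"

# This is the text print() would try to write: the repr of the bytes
# object, not the image data itself
as_text = str(content)


def save_bytes(path, data):
    """Write raw bytes to path and return how many bytes were written."""
    with open(path, "wb") as f:
        return f.write(data)
```

So even with a text-mode file, print would have stored the characters b'\xff\xd8...' rather than a valid image, while write stores the bytes unchanged.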