I am parsing images from a webpage into a specific folder. Everything goes very well and a huge part of the images are parsed into the desired folder, but then, just before the process ends, it gives this error:
IOError: [Errno 2] No such file or directory: u'C:\\Users\\pro\\Downloads\\AAA\\photos\\'
The code is something like this:
import os

save_path = raw_input("give save path. like '/home/user/dalbums'")
album = raw_input("the name of album: ")
completeName = os.path.join(save_path, album)

class X:
    def saver(self, info):
        path_name = os.path.join(completeName, 'photos')
        if not os.path.exists(path_name):
            os.makedirs(path_name)
        with open(os.path.join(path_name, info), 'a') as f:
            for i in lo:
                f.write(i)
If I keep only this part, the error goes away, but then the images go to the wrong place:
with open(info, 'a') as f:
    for i in lo:
        f.write(i)
When I try to use the URL https://www.google.com, I get this error for the same code:
InvalidSchema: No connection adapters were found for 'javascript:void(0)'
You show different code in your question than the code that actually causes the error.
This is the relevant code:
with open(os.path.join(imgs_folder, My_imgs.strip()), 'wb') as f:
Since My_imgs.strip() returns an empty string, your file name is an empty string. Therefore, after joining the empty string to a directory name, you end up trying to write to the directory itself.
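You can see this directly: joining an empty file name onto a directory just returns the directory path with a trailing separator, which is exactly the path in the error message:

import os

print(os.path.join('C:\\Users\\pro\\Downloads\\AAA\\photos', ''))
# -> C:\Users\pro\Downloads\AAA\photos\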
Here is where you create My_imgs:
My_imgs = data_fetched.split('/')[-1].split("?")[0]
For debugging you could do:
if not My_imgs.strip():
    print('data_fetched:', data_fetched)
to see what data_fetched actually is.
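More defensively, you could skip both failure modes before requesting anything. A minimal sketch, where image_urls is a hypothetical list of the scraped src/href values (imgs_folder is the target directory from your code):

import os
import requests

for data_fetched in image_urls:  # image_urls is a hypothetical name
    # skip non-HTTP links such as 'javascript:void(0)', which would
    # otherwise raise InvalidSchema inside requests
    if not data_fetched.startswith(('http://', 'https://')):
        continue
    My_imgs = data_fetched.split('/')[-1].split("?")[0]
    # skip URLs that do not end in a usable file name
    if not My_imgs.strip():
        continue
    response = requests.get(data_fetched)
    with open(os.path.join(imgs_folder, My_imgs.strip()), 'wb') as f:
        f.write(response.content)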
I have a list of URLs that point to filings from the SEC (e.g., https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm).
My goal is to write a for loop that opens each URL, requests the document, and saves it to a folder.
However, I need to be able to identify the documents later. That's why I wanted to use the filing-specific number in the URL "https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm" as the document name.
directory = r"\Desktop\10ks"

for url in url_list:
    response = requests.get(url).content
    path = directory + str(url)[40:-5] + ".txt"
    with open(path, "w") as f:
        f.write(response)
        f.close()
But every time, I get the following error message: FileNotFoundError: [Errno 2] No such file or directory
I really hope you can help me out!!
Thanks
import requests
import os

url_list = ["https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm"]

# Create the path Desktop/10ks/
directory = os.path.expanduser("~/Desktop") + "\\10ks"
# Make sure the 10ks folder actually exists before writing into it
os.makedirs(directory, exist_ok=True)

for url in url_list:
    # Get the content as a string instead of bytes
    response = requests.get(url).text
    # Replace slashes in the file name with underscores
    filename = str(url)[40:-5].replace("/", "_")
    # Print the file name to check that it is correct
    print(filename)
    path = directory + "\\" + filename + ".txt"
    with open(path, "w") as f:
        f.write(response)
See comments.
I guess backslashes in filenames are not allowed, since
filename = str(url)[40:-5].replace("/", "\\")
gives me
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\user/Desktop\\10ks\\18651\\000119312509042636\\d10.txt'
See also:
https://docs.python.org/3/library/os.path.html#os.path.expanduser
Get request python as a string
https://docs.python.org/3/library/stdtypes.html#str.replace
This works:
for url in url_list:
    response = requests.get(url).content.decode('utf-8')
    path = (directory + str(url)[40:-5] + ".txt").replace('/', '\\')
    with open(path, "w+") as f:
        f.write(response)
The path you were building was something like \Desktop\10ks18651/000119312509042636/d10.txt. I suppose you are working on Windows, given those backslashes; in any case, you just need to replace the forward slashes coming from the URL with backslashes.
Another thing: write() expects a string, so you need to decode your response, which comes back as bytes, into a string.
I hope this helps you!
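As a side note, building the path with os.path.join avoids juggling slashes by hand entirely. A minimal sketch of the same download loop written that way (the 10ks folder name and the [40:-5] slice are taken from the answers above):

import os
import requests

url_list = ["https://www.sec.gov/Archives/edgar/data/18651/000119312509042636/d10k.htm"]
directory = os.path.join(os.path.expanduser("~"), "Desktop", "10ks")
os.makedirs(directory, exist_ok=True)  # create Desktop/10ks if it is missing

for url in url_list:
    response = requests.get(url).text
    # keep the filing-specific part of the URL, flattening slashes to underscores
    filename = url[40:-5].replace("/", "_") + ".txt"
    with open(os.path.join(directory, filename), "w") as f:
        f.write(response)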
I'm using this to connect to an Azure File Share and upload a file. I would like to choose what extension the file will have, but I can't. I get the error shown below. If I remove .txt, everything works fine. Is there a way to specify the file extension while uploading it?
Error:
Exception: ResourceNotFoundError: The specified parent path does not exist.
Code:
def main(blobin: func.InputStream):
    file_client = ShareFileClient.from_connection_string(conn_str="<con_string>",
                                                         share_name="data-storage",
                                                         file_path="outgoing/file.txt")
    f = open('/home/temp.txt', 'w+')
    f.write(blobin.read().decode('utf-8'))
    f.close()

    # Operation on file here

    f = open('/home/temp.txt', 'rb')
    string_to_upload = f.read()
    f.close()

    file_client.upload_file(string_to_upload)
I believe the reason you're getting this error is that the outgoing folder doesn't exist in your file share. I took your code and ran it with and without the extension, and in both situations I got the same error.
Then I created the folder first and tried to upload the file, and I was able to do so successfully.
Here's the final code I used:
from azure.storage.fileshare import ShareFileClient, ShareDirectoryClient

conn_string = "DefaultEndpointsProtocol=https;AccountName=myaccountname;AccountKey=myaccountkey;EndpointSuffix=core.windows.net"

share_directory_client = ShareDirectoryClient.from_connection_string(conn_str=conn_string,
                                                                     share_name="data-storage",
                                                                     directory_path="outgoing")

file_client = ShareFileClient.from_connection_string(conn_str=conn_string,
                                                     share_name="data-storage",
                                                     file_path="outgoing/file.txt")

# Create the folder first.
# This operation will fail if the directory already exists.
print("creating directory...")
share_directory_client.create_directory()
print("directory created successfully...")

# Operation on file here
f = open('D:\\temp\\test.txt', 'rb')
string_to_upload = f.read()
f.close()

# Upload the file
print("uploading file...")
file_client.upload_file(string_to_upload)
print("file uploaded successfully...")
Is there a way for Python to close a file that is already open?
Or, at the very least, display a popup saying the file is open, or a custom-written error-message popup for the permission error, so as to avoid:
PermissionError: [Errno 13] Permission denied: 'C:\\zf.csv'
I've seen a lot of solutions that open a file and then close it through Python. But in my case, let's say I left my CSV open and then tried to run the job.
How can I make it close the currently opened CSV?
I've tried the variations below, but none seem to work, as they expect that I have already opened the CSV at an earlier point through Python. I suspect I'm overcomplicating this.
f = 'C:\\zf.csv'
f.close()
AttributeError: 'str' object has no attribute 'close'
This gives an error, as f is just a string, not a reference to an opened file.
Or even:
theFile = open(f)
file_content = theFile.read()
# do whatever you need to do
theFile.close()
As well as:
fileobj = open('C:\\zf.csv', "wb+")
if not fileobj.closed:
    print("file is already opened")
How do I close an already open CSV?
The only workaround I can think of would be to add a messagebox, though I can't seem to get it to detect the file.
filename = "C:\\zf.csv"
if not os.access(filename, os.W_OK):
print("Write access not permitted on %s" % filename)
messagebox.showinfo("Title", "Close your CSV")
Try using a with context, which will manage the close (__exit__) operation smoothly at the end of the context:
with open(...) as theFile:
    file_content = theFile.read()
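For the CSV in question, that would look like this; once the with block exits, the file is guaranteed to be closed even if the read raises:

with open('C:\\zf.csv', 'r') as theFile:
    file_content = theFile.read()
# theFile is closed here, whether or not an exception occurred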
You can also try to copy the file to a temporary file, and open/close/remove it at will. It requires that you have read access to the original, though.
In this example I have a file "test.txt" that is read-only (chmod 444), and it throws a "Permission denied" error if I try writing to it directly. I copy it to a temporary file that has 777 permissions so that I can do what I want with it:
import tempfile, shutil, os

def create_temporary_copy(path):
    temp_dir = tempfile.gettempdir()
    temp_path = os.path.join(temp_dir, 'temp_file_name')
    shutil.copy2(path, temp_path)  # copy the original into the temp one
    os.chmod(temp_path, 0o777)     # replace the copied permissions with full access
    return temp_path

path = "./test.txt"                      # original file
copy_path = create_temporary_copy(path)  # temp copy
with open(copy_path, "w") as g:          # can do what I want with it
    g.write("TEST\n")
f = open("C:/Users/amol/Downloads/result.csv", "r")
print(f.readlines()) #just to check file is open
f.close()
# here you can add above print statement to check if file is closed or not. I am using python 3.5
ftp.cwd("TXNnIGZvbGRlcg==/")
file = open('msg.txt', 'rb')
file.storbinary('STOR msg.txt', file)
file.close
So I just need a quick answer... the code above shows msg.txt being saved in the base64-named folder on the FTP server; however, earlier in the code, I've already done:
if name != "":
ftp.mkd(name)
ftp.cwd(name)
So it has already navigated somewhere, but I need help finding the command to go back up a directory; something like:
ftp.goback()
or something along those lines.
I believe you can try something like this.
ftp.cwd("TXNnIGZvbGRlcg==/")
file = open('msg.txt', 'rb')
ftp.storbinary('STOR msg.txt', file)
file.close()
ftp.cwd("../")
I am new to Python, and with some really great assistance from StackOverflow, I've written a program that:
1) Looks in a given directory, and for each file in that directory:
2) Runs an HTML-cleaning program, which:
Opens each file with BeautifulSoup
Removes blacklisted tags & content
Prettifies the remaining content
Runs Bleach to remove all non-whitelisted tags & attributes
Saves out as a new file
It works very well, except when it hits a certain kind of file content that throws up a bunch of BeautifulSoup errors and aborts the whole thing. I want it to be robust against that, as I won't have control over what sort of content winds up in this directory.
So, my question is: How can I re-structure the program so that when it errors on one file within the directory, it reports that it was unable to process that file, and then continues to run through the remaining files?
Here is my code so far (with extraneous detail removed):
def clean_dir(directory):
    os.chdir(directory)
    for filename in os.listdir(directory):
        clean_file(filename)

def clean_file(filename):
    tag_black_list = ['iframe', 'script']
    tag_white_list = ['p', 'div']
    attr_white_list = {'*': ['title']}
    with open(filename, 'r') as fhandle:
        text = BeautifulSoup(fhandle)
        text.encode("utf-8")
        print "Opened " + filename
        # Step one, with BeautifulSoup: Remove tags in tag_black_list, destroy contents.
        [s.decompose() for s in text(tag_black_list)]
        pretty = text.prettify()
        print "Prettified"
        # Step two, with Bleach: Remove tags and attributes not in whitelists, leave tag contents.
        cleaned = bleach.clean(pretty, strip="TRUE", attributes=attr_white_list, tags=tag_white_list)
        fout = open("../posts-cleaned/" + filename, "w")
        fout.write(cleaned.encode("utf-8"))
        fout.close()
    print "Saved " + filename + " in /posts-cleaned"

print "Done"

clean_dir("../posts/")
I'm looking for any guidance on how to write this so that it will keep running after hitting a parsing/encoding/content/attribute/etc. error within the clean_file function.
You can handle the errors using try/except/finally.
You can do the error handling inside clean_file or in the for loop:
for filename in os.listdir(directory):
    try:
        clean_file(filename)
    except:
        print "Error processing file %s" % filename
If you know what exception gets raised you can use a more specific catch.
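For example, a sketch that catches only the kinds of errors this pipeline is likely to raise (the exception types listed here are illustrative, and the print uses Python 2 syntax to match the question):

for filename in os.listdir(directory):
    try:
        clean_file(filename)
    except (IOError, UnicodeDecodeError, UnicodeEncodeError) as e:
        # report the failing file and keep going with the rest
        print "Error processing file %s: %s" % (filename, e)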