Python converting url into directory - python

I am trying to convert a url like "www.example.com/images/dog.png" into directories from the current directory.
So I get a folder named "www.example.com", inside that "images" and finally inside that the file saved as "dog.png"
I've tried using urllib.url2pathname(path) but it keeps appending P:\ to the start of it.

You can use os.makedirs() to create the directory tree, but it will fail if the final directory already exists. So you can test whether it exists before attempting to create the directory tree, or use try: ... except OSError:. In Python 3 you can supply the exist_ok parameter to override this behaviour; see the Python docs for os.makedirs for further info.
#!/usr/bin/env python
import os
cwd = os.getcwd()
url = "www.example.com/images/dog.png"
fullname = os.path.join(cwd, url)
path, basename = os.path.split(fullname)
if not os.path.exists(path):
    os.makedirs(path)
with open(fullname, 'w') as f:
    f.write('test\n')
If your system doesn't support directory names containing periods you can translate them to another character, eg _, like this:
fullname = fullname.replace('.', '_')
(just insert this after the fullname = os.path.join(cwd, url) line).
And as jwilner mentions in the comments, it's more efficient to use
path = os.path.dirname(fullname)
than path, basename = os.path.split(fullname) if you don't need the base component of the file name (in this example "dog.png").
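As a variation on the approach above, here is a minimal sketch using pathlib. The helper names url_to_local_path and save_under_url_path are hypothetical (they are not from the answer), and the sketch assumes that the URL may or may not carry a scheme and that "." and ".." segments should be dropped.

```python
from pathlib import Path
from urllib.parse import urlsplit


def url_to_local_path(url, base_dir="."):
    """Map a URL to base_dir/<host>/<path segments> (hypothetical helper)."""
    # urlsplit() only recognises the host when the URL contains "//",
    # so prefix bare URLs such as "www.example.com/images/dog.png".
    parts = urlsplit(url if "//" in url else "//" + url)
    # Drop empty, "." and ".." segments so the result stays under base_dir.
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    return Path(base_dir).joinpath(parts.netloc, *segments)


def save_under_url_path(url, data, base_dir="."):
    """Create the directory tree for url and write data (bytes) into the file."""
    target = url_to_local_path(url, base_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

Calling save_under_url_path("www.example.com/images/dog.png", b"test\n") from the current directory should produce www.example.com/images/dog.png beneath it.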

Related

Create directory if does NOT exist: Python [duplicate]

I want to check if a folder by the name "Output Folder" exists at the path
D:\LaptopData\ISIS project\test\d0_63_b4_01_18_ba\00_17_41_41_00_0e
if the folder by the name "Output Folder" does not exist then create that folder there.
can anyone please help with providing a solution for this?
The best way would be to use os.makedirs like,
os.makedirs(name, mode=0o777, exist_ok=False)
Recursive directory creation function. Like mkdir(), but makes all intermediate-level directories needed to contain the leaf directory.
The mode parameter is passed to mkdir() for creating the leaf directory; see the mkdir() description for how it is interpreted. To set the file permission bits of any newly-created parent directories you can set the umask before invoking makedirs(). The file permission bits of existing parent directories are not changed.
>>> import os
>>> os.makedirs(path, exist_ok=True)
# which will not raise an error if the `path` already exists and it
# will recursively create the paths, if the preceding path doesn't exist
or if you are on python3, using pathlib like,
Path.mkdir(mode=0o777, parents=False, exist_ok=False)
Create a new directory at this given path. If mode is given, it is combined with the process’ umask value to determine the file mode and access flags. If the path already exists, FileExistsError is raised.
If parents is true, any missing parents of this path are created as needed; they are created with the default permissions without taking mode into account (mimicking the POSIX mkdir -p command).
If parents is false (the default), a missing parent raises FileNotFoundError.
If exist_ok is false (the default), FileExistsError is raised if the target directory already exists.
If exist_ok is true, FileExistsError exceptions will be ignored (same behavior as the POSIX mkdir -p command), but only if the last path component is not an existing non-directory file.
Changed in version 3.5: The exist_ok parameter was added.
>>> import pathlib
>>> path = pathlib.Path(somepath)
>>> path.mkdir(parents=True, exist_ok=True)
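The difference between the two exist_ok settings quoted above can be seen in a small sketch. The names ensure_dir and demo are hypothetical, and the sketch works in a temporary directory rather than the path from the question.

```python
import tempfile
from pathlib import Path


def ensure_dir(path):
    """Create path and any missing parents; succeed quietly if it exists."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    return path


def demo():
    """Return True if a plain mkdir() raises once the directory exists."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "test" / "Output Folder"
        ensure_dir(target)   # creates "test" and "Output Folder"
        ensure_dir(target)   # second call changes nothing and raises nothing
        try:
            target.mkdir()   # exist_ok defaults to False
        except FileExistsError:
            return True
        return False
```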
import os
import os.path
folder = "abc"
os.chdir(".")
print("current dir is: %s" % (os.getcwd()))
if os.path.isdir(folder):
    print("Exists")
else:
    print("Doesn't exist")
    os.mkdir(folder)
I hope this helps
A pathlib example where csv files need to be created inside a csv folder next to an xlsx file, given the xlsx file's full path (e.g., copied with Path Copy Copy).
If exist_ok is true, FileExistsError is ignored when the directory already exists.
from pathlib import Path
wrkfl = 'C:/xlsx/my.xlsx' # path get from Path Copy Copy context menu
xls_file = Path(wrkfl)
(xls_file.parent / 'csv').mkdir(parents=True, exist_ok=True)
Search for folder whether it exists or not, it will return true or false: os.path.exists('<folder-path>')
Create a new folder: os.mkdir('<folder-path>')
Note: import os will be required to import the module.
Hope you can write the logic using above two functions as per your requirement.
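One way the two functions above could be combined is sketched below. The function name create_if_missing is hypothetical, and the sketch assumes the parent directory already exists, since os.mkdir() does not create intermediate directories.

```python
import os


def create_if_missing(folder_path):
    """Create folder_path with os.mkdir() unless it already exists.

    Returns True if the folder was created, False if it was already there.
    """
    if os.path.exists(folder_path):
        return False
    os.mkdir(folder_path)  # the parent directory must already exist
    return True
```

Note that another process could create the folder between the check and the os.mkdir() call, in which case os.mkdir() would still raise FileExistsError.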
import os
def folder_creat(name, directory):
    os.chdir(directory)
    fileli = os.listdir()
    if name in fileli:
        print(f'Folder "{name}" exists!')
    else:
        os.mkdir(name)
        print(f'Folder "{name}" successfully created!')
    return
folder_creat('Output Folder', r'D:\LaptopData\ISIS project\test\d0_63_b4_01_18_ba\00_17_41_41_00_0e')
This piece of code does exactly what you wanted. It first gets the current working directory, then joins the wanted folder name onto that path, and finally creates the folder if it does not exist.
import os
# Gets current working directory
path = os.getcwd()
# Joins the folder that we wanted to create
folder_name = 'output'
path = os.path.join(path, folder_name)
# Creates the folder, and checks if it is created or not.
os.makedirs(path, exist_ok=True)
Getting help from the answers above, I reached this solution
if not os.path.exists(os.getcwd() + '/' + folderName):
    os.makedirs(os.getcwd() + '/' + folderName, exist_ok=True)

Is there a simpler function or one liner to check if folder exists if not create it and paste a specific file into it?

I am aiming to create a function that does the following:
Declare a path with a file, not just a folder. e.g. 'C:/Users/Lampard/Desktop/Folder1/File.py'
Create a folder in same folder as the declared file path - Calling it 'Archive'
Cut the file and paste it into the new folder just created.
If the folder 'Archive' already exists - then simply cut and paste the file into there
I have spent approx. 15-20min going through these:
https://www.programiz.com/python-programming/directory
Join all except last x in list
https://docs.python.org/3/library/pathlib.html#operators
And here is what I got to:
import os
from pathlib import Path, PurePath
from shutil import copy
#This path will change every time - just trying to get function right first
path = 'C:/Users/Lampard/Desktop/Folder1/File.py'
#Used to allow suffix function
p = PurePath(path)
#Check if directory is a file not a folder
if not p.suffix:
    print("Not an extension")
#If it is a file
else:
    #Create new folder before last file
    #Change working directory
    split = path.split('/')
    new_directory = '/'.join(split[:-1])
    apply_new_directory = os.chdir(new_directory)
    #If folder does not exist create it
    try:
        os.mkdir('Archive')#Create new folder
    #If not, continue process to copy file and paste it into Archive
    except FileExistsError:
        copy(path, new_directory + '/Archive/' + split[-1])
Is this code okay? - does anyone know a simpler method?
Locate folder/file in path
print([name for name in os.listdir(".") if os.path.isdir(name)])
Create path
import os
# define the name of the directory to be created
path = "/tmp/year"
try:
    os.mkdir(path)
except OSError:
    print("Creation of the directory %s failed" % path)
else:
    print("Successfully created the directory %s " % path)
To move and cut files you can use this library
As you're already using pathlib, there's no need to use shutil:
from pathlib import Path
path = 'C:/Users/Lampard/Desktop/Folder1/File.py' # or whatever
p = Path(path)
target = Path(p.with_name('Archive')) # replace the filename with 'Archive'
target.mkdir() # create target directory
p.rename(target.joinpath(p.name)) # move the file to the target directory
Feel free to add appropriate try…except statements to handle any errors.
Update: you might find this version more readable:
target = p.parent / 'Archive'
target.mkdir()
p.rename(target / p.name)
This is an example of overloading the / operator.
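The question also asks that an existing 'Archive' folder simply be reused. A minimal sketch of that case, assuming exist_ok=True is acceptable instead of a try…except, is shown here; the function name archive_file is hypothetical.

```python
from pathlib import Path


def archive_file(file_path, folder_name="Archive"):
    """Move file_path into a sibling folder, creating the folder if needed."""
    p = Path(file_path)
    target = p.parent / folder_name
    target.mkdir(exist_ok=True)   # no error if the folder already exists
    dest = target / p.name
    p.rename(dest)                # moves the file within the same filesystem
    return dest
```

Path.rename() behaves like os.rename(), so it is only expected to work when source and destination are on the same filesystem, and on Windows it raises an error if the destination file already exists.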

FileNotFoundError when trying to use os.rename

I've tried to write some code which will rename some files in a folder - essentially, they're listed as xxx_(a).bmp whereas they need to be xxx_a.bmp, where a runs from 1 to 2000.
I've used the inbuilt os.rename function to essentially swap them inside of a loop to get the right numbers, but this gives me FileNotFoundError [WinError2] the system cannot find the file specified 'Z:/AAA/BBB/xxx_(1).bmp' -> 'Z:/AAA/BBB/xxx_1.bmp'.
I've included the code I've written below if anyone could point me in the right direction. I've checked that I'm working in the right directory and it gives me the directory I'm expecting so I'm not sure why it can't find the files.
import os
n = 2000
folder = r"Z:/AAA/BBB/"
os.chdir(folder)
saved_path = os.getcwd()
print("CWD is" + saved_path)
for i in range(1,n):
    old_file = os.path.join(folder, "xxx_(" + str(i) + ").bmp")
    new_file = os.path.join(folder, "xxx_" +str(i)+ ".bmp")
    os.rename(old_file, new_file)
print('renamed files')
The problem is os.rename doesn't create a new directory if the new name is a filename in a directory that does not currently exist.
In order to create the directory first, you can do the following in Python3:
os.makedirs(dirname, exist_ok=True)
In this case dirname can contain created or not-yet-created subdirectories.
As an alternative, one may use os.renames, which handles new and intermediate directories.
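A minimal sketch of the os.renames alternative, run in a temporary directory so that no real files are touched. The function name demo_renames and the target subdirectories "renamed" and "bmp" are made up for illustration.

```python
import os
import tempfile


def demo_renames():
    """Rename a file into directories that do not exist yet; return True on success."""
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "xxx_(1).bmp")
        new = os.path.join(tmp, "renamed", "bmp", "xxx_1.bmp")
        open(old, "w").close()   # create an empty file to rename
        # os.renames() creates "renamed/bmp" first, then performs the rename.
        os.renames(old, new)
        return os.path.isfile(new) and not os.path.exists(old)
```

Be aware that os.renames() also tries to remove directories of the old path that are left empty after the move.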
Try iterating files inside the directory and processing the files that meet your criteria.
from pathlib import Path
import re
folder = Path("Z:/AAA/BBB/")
for f in folder.iterdir():
    if '(' in f.name:
        new_name = f.stem.replace('(', '').replace(')', '')
        # using regex
        # new_name = re.sub(r'\(([^)]+)\)', r'\1', f.stem)
        extension = f.suffix
        new_path = f.with_name(new_name + extension)
        f.rename(new_path)
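The name transformation itself can be checked separately from the file system. This is a small sketch, assuming the bracketed part is always a number as in the question; strip_parens is a hypothetical helper name.

```python
import re


def strip_parens(filename):
    """Turn 'xxx_(12).bmp' into 'xxx_12.bmp'; other names are returned unchanged."""
    return re.sub(r"\((\d+)\)", r"\1", filename)
```

Testing the helper on a few names before renaming anything makes it easier to see whether a FileNotFoundError comes from the generated names or from the files on disk.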

Move pairs of files (.txt & .xml) into their corresponding folder using Python

I have been working this challenge for about a day or so. I've looked at multiple questions and answers asked on SO and tried to 'MacGyver' the code used for my purpose, but still having issues.
I have a directory (lets call it "src\") with hundreds of files (.txt and .xml). Each .txt file has an associated .xml file (let's call it a pair). Example:
src\text-001.txt
src\text-001.xml
src\text-002.txt
src\text-002.xml
src\text-003.txt
src\text-003.xml
Here's an example of how I would like it to turn out so each pair of files are placed into a single unique folder:
src\text-001\text-001.txt
src\text-001\text-001.xml
src\text-002\text-002.txt
src\text-002\text-002.xml
src\text-003\text-003.txt
src\text-003\text-003.xml
What I'd like to do is create an associated folder for each pair and then move each pair of files into its respective folder using Python. I've already tried working from code I found (thanks to a post from Nov '12 by Sethdd), but am having trouble figuring out how to use the move function to grab pairs of files. Here's where I'm at:
import os
import shutil
srcpath = "PATH_TO_SOURCE"
srcfiles = os.listdir(srcpath)
destpath = "PATH_TO_DEST"
# grabs the name of the file before extension and uses as the dest folder name
destdirs = list(set([filename[0:9] for filename in srcfiles]))
def create(dirname, destpath):
    full_path = os.path.join(destpath, dirname)
    os.mkdir(full_path)
    return full_path
def move(filename, dirpath):
    shutil.move(os.path.join(srcpath, filename), dirpath)
# create destination directories and store their names along with full paths
targets = [
    (folder, create(folder, destpath)) for folder in destdirs
]
for dirname, full_path in targets:
    for filename in srcfiles:
        if dirname == filename[0:9]:
            move(filename, full_path)
I feel like it should be easy, but Python isn't something I work with everyday and it's been a while since my scripting days... Any help would be greatly appreciated!
Thanks,
WK2EcoD
Use the glob module to iterate over all of the 'txt' files. From that you can parse and create the folders and copy the files.
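A minimal sketch of that idea, using Path.glob (the pathlib counterpart of the glob module) and moving the files rather than copying them, as the question asks. The function name move_pairs is hypothetical, and the sketch assumes each pair shares the same name apart from the extension.

```python
import shutil
from pathlib import Path


def move_pairs(src):
    """Move each name.txt / name.xml pair in src into src/name/."""
    src = Path(src)
    moved = []
    # sorted() materialises the listing so moving files does not disturb iteration.
    for txt in sorted(src.glob("*.txt")):
        xml = txt.with_suffix(".xml")
        if not xml.is_file():
            continue                      # skip .txt files without a partner
        folder = src / txt.stem           # e.g. src/text-001
        folder.mkdir(exist_ok=True)
        shutil.move(str(txt), str(folder / txt.name))
        shutil.move(str(xml), str(folder / xml.name))
        moved.append(txt.stem)
    return moved
```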
The process should be as simple as it appears to you as a human.
for file_name in os.listdir(srcpath):
    dir = file_name[:9]
    # if dir doesn't exist, create it
    # move file_name to dir
You're doing a lot of intermediate work that seems to be confusing you.
Also, insert some simple print statements to track data flow and execution flow. It appears that you have no tracing output so far.
You can do it with os module. For every file in directory check if associated folder exists, create if needed and then move the file. See the code below:
import os
SRC = 'path-to-src'
for fname in os.listdir(SRC):
    filename, file_extension = os.path.splitext(fname)
    # splitext() keeps the leading dot in the extension
    if file_extension not in ['.xml', '.txt']:
        continue
    folder_path = os.path.join(SRC, filename)
    if not os.path.exists(folder_path):
        os.mkdir(folder_path)
    os.rename(
        os.path.join(SRC, fname),
        os.path.join(folder_path, fname)
    )
My approach would be:
Find the pairs that I want to move (do nothing with files without a pair)
Create a directory for every pair
Move the pair to the directory
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import os, shutil
import re
def getPairs(files):
    pairs = []
    file_re = re.compile(r'^(.*)\.(.*)$')
    for f in files:
        match = file_re.match(f)
        if match:
            (name, ext) = match.groups()
            if ext == 'txt' and name + '.xml' in files:
                pairs.append(name)
    return pairs
def movePairsToDir(pairs):
    for name in pairs:
        os.mkdir(name)
        shutil.move(name+'.txt', name)
        shutil.move(name+'.xml', name)
files = os.listdir()
pairs = getPairs(files)
movePairsToDir(pairs)
NOTE: This script works when called inside the directory with the pairs.

Open file in a relative location in Python

Suppose my Python code is executed in a directory called main and the application needs to access main/2091/data.txt.
How should I use open(location)? What should the parameter location be?
I found that below simple code will work.. does it have any disadvantages?
file = "\2091\sample.txt"
path = os.getcwd()+file
fp = open(path, 'r+');
With this type of thing you need to be careful what your actual working directory is. For example, you may not run the script from the directory the file is in. In this case, you can't just use a relative path by itself.
If you are sure the file you want is in a subdirectory beneath where the script is actually located, you can use __file__ to help you out here. __file__ is the full path to where the script you are running is located.
So you can fiddle with something like this:
import os
script_dir = os.path.dirname(__file__) #<-- absolute dir the script is in
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)
This code works fine:
import os
def read_file(file_name):
    file_handle = open(file_name)
    print(file_handle.read())
    file_handle.close()
file_dir = os.path.dirname(os.path.realpath(__file__))
print(file_dir)
#For accessing the file in the same folder
file_name = "same.txt"
read_file(file_name)
#For accessing the file in a folder contained in the current folder
file_name = os.path.join(file_dir, 'Folder1.1/same.txt')
read_file(file_name)
#For accessing the file in the parent folder of the current folder
file_name = os.path.join(file_dir, '../same.txt')
read_file(file_name)
#For accessing the file inside a sibling folder.
file_name = os.path.join(file_dir, '../Folder2/same.txt')
file_name = os.path.abspath(os.path.realpath(file_name))
print(file_name)
read_file(file_name)
I created an account just so I could clarify a discrepancy I think I found in Russ's original response.
For reference, his original answer was:
import os
script_dir = os.path.dirname(__file__)
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)
This is a great answer because it tries to dynamically create an absolute system path to the desired file.
Cory Mawhorter noticed that __file__ is a relative path (it is as well on my system) and suggested using os.path.abspath(__file__). os.path.abspath, however, returns the absolute path of your current script (i.e. /path/to/dir/foobar.py)
To use this method (and how I eventually got it working) you have to remove the script name from the end of the path:
import os
script_path = os.path.abspath(__file__) # i.e. /path/to/dir/foobar.py
script_dir = os.path.split(script_path)[0] #i.e. /path/to/dir/
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)
The resulting abs_file_path (in this example) becomes: /path/to/dir/2091/data.txt
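The same steps can be sketched with pathlib. The helper name path_next_to is hypothetical, and the script path is passed in as an argument so that the function can be tried without relying on __file__.

```python
from pathlib import Path


def path_next_to(script_file, relative):
    """Return the absolute path of relative, anchored at script_file's folder."""
    # resolve() makes the path absolute even when script_file is relative.
    return Path(script_file).resolve().parent / relative
```

Inside a script this would typically be called as path_next_to(__file__, "2091/data.txt").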
It depends on what operating system you're using. If you want a solution that is compatible with both Windows and *nix something like:
from os import path
file_path = path.relpath("2091/data.txt")
with open(file_path) as f:
    <do stuff>
should work fine.
The path module is able to format a path for whatever operating system it's running on. Also, python handles relative paths just fine, so long as you have correct permissions.
Edit:
As mentioned by kindall in the comments, python can convert between unix-style and windows-style paths anyway, so even simpler code will work:
with open("2091/data.txt") as f:
    <do stuff>
That being said, the path module still has some useful functions.
I spent a lot of time trying to discover why my code could not find my file when running Python 3 on Windows. So I added . before / and everything worked fine:
import os
script_dir = os.path.dirname(__file__)
file_path = os.path.join(script_dir, './output03.txt')
print(file_path)
fptr = open(file_path, 'w')
Try this:
from pathlib import Path
data_folder = Path("/relative/path")
file_to_open = data_folder / "file.pdf"
f = open(file_to_open)
print(f.read())
Python 3.4 introduced a new standard library for dealing with files and paths called pathlib. It works for me!
Code:
import os
script_path = os.path.abspath(__file__)
path_list = script_path.split(os.sep)
script_directory = path_list[0:len(path_list)-1]
rel_path = "main/2091/data.txt"
path = "/".join(script_directory) + "/" + rel_path
Explanation:
Import library:
import os
Use __file__ to attain the current script's path:
script_path = os.path.abspath(__file__)
Separates the script path into multiple items:
path_list = script_path.split(os.sep)
Remove the last item in the list (the actual script file):
script_directory = path_list[0:len(path_list)-1]
Add the relative file's path:
rel_path = "main/2091/data.txt"
Join the list items, and append the relative file's path:
path = "/".join(script_directory) + "/" + rel_path
Now you are set to do whatever you want with the file, such as, for example:
file = open(path)
import os
def file_path(relative_path):
    dir = os.path.dirname(os.path.abspath(__file__))
    split_path = relative_path.split("/")
    new_path = os.path.join(dir, *split_path)
    return new_path
with open(file_path("2091/data.txt"), "w") as f:
    f.write("Powerful you have become.")
If the file is in your parent folder, eg. follower.txt, you can simply use open('../follower.txt', 'r').read()
Get the path of the script's parent folder, then use os.path.join to append your relative file path to the end.
# get parent folder with `os.path`
import os.path
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
# now use BASE_DIR to get a file relative to the current script
os.path.join(BASE_DIR, "config.yaml")
The same thing with pathlib:
# get parent folder with `pathlib`'s Path
from pathlib import Path
BASE_DIR = Path(__file__).absolute().parent
# now use BASE_DIR to get a file relative to the current script
BASE_DIR / "config.yaml"
Python just passes the filename you give it to the operating system, which opens it. If your operating system supports relative paths like main/2091/data.txt (hint: it does), then that will work fine.
You may find that the easiest way to answer a question like this is to try it and see what happens.
Not sure if this work everywhere.
I'm using ipython in ubuntu.
If you want to read file in current folder's sub-directory:
/current-folder/sub-directory/data.csv
your script is in current-folder
simply try this:
import pandas as pd
path = './sub-directory/data.csv'
pd.read_csv(path)
When I was a beginner I found these descriptions a bit intimidating. As at first I would try
For Windows
f= open('C:\Users\chidu\Desktop\Skipper New\Special_Note.txt','w+')
print(f)
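and this would raise a syntax error. I used to get confused a lot. Then, after some searching on Google, I found out why the error occurred. Writing this for beginners:
It's because \U in 'C:\Users' starts a Unicode escape sequence in a normal string literal. Escaping that backslash with a second \ avoids the error:
f= open('C:\\Users\chidu\Desktop\Skipper New\Special_Note.txt','w+')
print(f)
And now it works. Doubling every backslash, or prefixing the string with r to make it a raw string, is safer, because other sequences such as \n or \t are also treated as escapes.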
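As a small illustration, these are three ways to spell the path from the example so that no backslash is treated as an escape. The variable names are made up, and PureWindowsPath is used only so that the comparison also runs on non-Windows systems.

```python
from pathlib import PureWindowsPath

# Three spellings of the same Windows path (the path is the one from the example).
escaped = 'C:\\Users\\chidu\\Desktop\\Skipper New\\Special_Note.txt'
raw = r'C:\Users\chidu\Desktop\Skipper New\Special_Note.txt'
forward = 'C:/Users/chidu/Desktop/Skipper New/Special_Note.txt'

# The escaped and raw literals produce identical strings.
same_text = (escaped == raw)
# Forward slashes denote the same path once parsed as a Windows path.
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)
```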
In Python 3.4 (PEP 428) the pathlib was introduced, allowing you to work with files in an object oriented fashion:
import os
from pathlib import Path
working_directory = Path(os.getcwd())
path = working_directory / "2091" / "sample.txt"
with path.open('r+') as fp:
    # do magic
    pass
The with keyword will also ensure that your resources get closed properly, even if something goes wrong (like an unhandled exception, SIGINT or similar).
