How to get Path in the form "file://///SERVER//folder1/folder2/ - python

I am rather new to Python and have the following problem (just an example):
import os
mypath = r'I:\Folder1'
for dirpath,_,filenames in os.walk(mypath):
    for f in filenames:
        getpath = os.path.abspath(os.path.join(dirpath, f))
returns the path in the form:
I:\Folder1\Folder2
which is normally fine for me.
However, "I:\" is one of our servers at work, and for further processing (HTML stuff) I need the exact address in this form:
file://///Servername/Subfolder/Folder1/Folder2
Edit: In other words:
My program may be used locally or on different servers; it just depends on the user. Put crudely, I need a function that returns what Windows 10 gives you via "right click on a folder --> Path Copy --> file:////....". All I know is that on my computer this path is called "I:\Folder1", but "I:\Folder1" actually lives on the server.
Edit 2: Solved (see comments)

If you are on Windows and need forward slashes, you can import an OS-specific version of the path module directly. For example, you could use posixpath.
To solve your problem, first strip mypath off each returned dirpath. Next, split the remainder into folder components using split with your operating system's separator, i.e. \. The components can then be rejoined with a server prefix using posixpath.join(). For example:
import posixpath
import os
mypath = r'I:\Folder1'
server = 'file://///Servername/Subfolder'
for dirpath,_,filenames in os.walk(mypath):
    for f in filenames:
        subfolder = dirpath[len(mypath):]
        server_path = posixpath.join(server, *subfolder.split(os.sep), f)
        print(server_path)
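The strip, split, and rejoin steps above can also be wrapped in a small helper. What follows is a minimal sketch rather than the answerer's code: the function name to_server_url and its parameters are made up for illustration, and it uses pathlib.PureWindowsPath so that Windows-style paths are parsed the same way on any operating system. It assumes you already know which server prefix the drive letter maps to.

```python
from pathlib import PureWindowsPath

def to_server_url(local_path, local_root, server_prefix):
    """Map a path under local_root onto server_prefix, using forward slashes.

    All three names are hypothetical; server_prefix is whatever the mapped
    drive actually points to, e.g. 'file://///Servername/Subfolder'.
    """
    # relative_to() raises ValueError if local_path is not under local_root
    relative = PureWindowsPath(local_path).relative_to(PureWindowsPath(local_root))
    # .parts is a tuple of the remaining folder and file names
    return "/".join([server_prefix.rstrip("/"), *relative.parts])

url = to_server_url(r"I:\Folder1\Folder2\report.html", "I:\\",
                    "file://///Servername/Subfolder")
print(url)  # file://///Servername/Subfolder/Folder1/Folder2/report.html
```

Passing "I:\\" as the root keeps Folder1 in the result; passing r"I:\Folder1" instead would drop it, as the loop above does.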

Related

Why does root returned from os.walk() contain / as directory separator but os.sep (or os.path.sep) return \ on Win10?

Why does the root element returned from os.walk() show / as the directory separator but os.sep (or os.path.sep) shows \ on Win10?
I'm just trying to create the complete path for a set of files in a folder as follows:
import os
base_folder = "c:/data/MA Maps"
for root, dirs, files in os.walk(base_folder):
    for f in files:
        if f.endswith(".png") and f.find("_N") != -1:
            print(os.path.join(root, f))
print(os.path.sep)
Here's what I get as an output:
c:/data/MA Maps\Map_of_Massachusetts_Nantucket_County.png
c:/data/MA Maps\Map_of_Massachusetts_Norfolk_County.png
\
I understand that some of Python's library functions (like open()) will work with mixed path separators (at least on Windows), but relying on that hack really can't be trusted across all libraries. It just seems like the items returned from os.walk() and os.path (.sep or .join()) should yield consistent results based on the operating system being used. Can anyone explain why this inconsistency is happening?
P.S. - I know there is a more consistent library for working with file paths (and lots of other file manipulation) called pathlib, which was introduced in Python 3.4 and does seem to fix all this. If your code is being used in 3.4 or beyond, is it best to use pathlib methods to resolve this issue? But if your code is targeted at systems using Python before 3.4, what is the best way to address this issue?
Here's a good basic explanation of pathlib: Python 3 Quick Tip: The easy way to deal with file paths on Windows, Mac and Linux
Here's my code & result using pathlib:
import os
from pathlib import Path
# All of this should work properly for any OS. I'm running Win10.
# You can even mix up the separators used (i.e."c:\data/MA Maps") and pathlib still
# returns the consistent result given below.
base_folder = "c:/data/MA Maps"
for root, dirs, files in os.walk(base_folder):
    # This converts the root path provided to one using the current operating system's
    # path separator (\ for Win10).
    root_folder = Path(root)
    for f in files:
        if f.endswith(".png") and f.find("_N") != -1:
            # The / operator, when used with a pathlib object, just concatenates
            # the path segments together using the current operating system path separator.
            print(root_folder / f)
c:\data\MA Maps\Map_of_Massachusetts_Nantucket_County.png
c:\data\MA Maps\Map_of_Massachusetts_Norfolk_County.png
This can even be done more succinctly using only pathlib and list comprehension (with all path separators correctly handled per OS used):
from pathlib import Path
base_folder = "c:/data/MA Maps"
path = Path(base_folder)
files = [item for item in path.iterdir() if item.is_file() and
         str(item).endswith(".png") and
         (str(item).find("_N") != -1)]
for file in files:
    print(file)
c:\data\MA Maps\Map_of_Massachusetts_Nantucket_County.png
c:\data\MA Maps\Map_of_Massachusetts_Norfolk_County.png
This is very Pythonic and at least I feel it is quite easy to read and understand. .iterdir() is really powerful and makes dealing with files and dirs reasonably easy and in a cross-platform way. What do you think?
The os.walk function always yields the initial part of the dirpath unchanged from what you pass in to it. It doesn't try to normalize the separators itself, it just keeps what you've given it. It does use the system-standard separators for the rest of the path, as it combines each subdirectory's name to the root directory with os.path.join. You can see the current version of the implementation of the os.walk function in the CPython source repository.
One option for normalizing the separators in your output is to normalize the base path you pass in to os.walk, perhaps using pathlib. If you normalize the initial path, all the output should use the system path separators automatically, since it will be the normalized path that will be preserved through the recursive walk, rather than the non-standard one. Here's a very basic transformation of your first code block to normalize the base_folder using pathlib, while preserving all the rest of the code, in its simplicity. Whether it's better than your version using more of pathlib's features is a judgement call that I'll leave up to you.
import os
from pathlib import Path
base_folder = Path("c:/data/MA Maps") # this will be normalized when converted to a string
for root, dirs, files in os.walk(base_folder):
    for f in files:
        if f.endswith(".png") and f.find("_N") != -1:
            print(os.path.join(root, f))
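For interpreters without pathlib, os.path.normpath performs the same separator normalization on Windows. The sketch below is an illustration, not part of the original answer. It imports ntpath (the Windows flavour of os.path) directly so that it behaves identically on any operating system; the file name Map.png is made up. It shows both the mixed-separator result from the question and the normalized alternative.

```python
import ntpath  # the Windows implementation of os.path, importable on any OS

mixed_root = "c:/data/MA Maps"

# join() only inserts a separator between the parts; the first part is kept as given
joined = ntpath.join(mixed_root, "Map.png")

# normpath() rewrites every separator to the Windows one
normalized_root = ntpath.normpath(mixed_root)
joined_clean = ntpath.join(normalized_root, "Map.png")

print(joined)        # c:/data/MA Maps\Map.png
print(joined_clean)  # c:\data\MA Maps\Map.png
```

On a real Windows machine, os.path.normpath(base_folder) before the call to os.walk would have the same effect as the Path conversion above.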

Linux is unable to find specified path

My Python script runs on Windows without any issues, but the same script has to run on a Linux machine as well. When run there, it reports that the specified path does not exist. Note that the variable "path" points to a cloud server.
After looking through some forums, I tried the os.path.join function, which also failed.
import os
import re
import sys
#List .xlsx files followed by the string ESC
path = '\\\\cloudnetworkonlinuxserver'
path2 = 'DBX'
path3 = 'SrcFiles'
path4 = 'MEBilling'
path5 = 'ParmFiles'
filenames = os.listdir(os.path.join(path, path2, path3, path4))
for filename in filenames:
    getdate = re.search(r'(?<=ESC_)\w+', filename)
    #Replace '_' with '-'
    if getdate:
        date = getdate.group(0).replace('_', '-')
        print('The following ESC file has date', date)
        #Create .prm file with following body
        f = open(os.path.join(path, path2, path5, "wf_SC_Monthend_Billing_XLS" + "." + "prm"), 'w')
        #f.write cannot take more than one argument. Write variables such
        a = '$$WF_PERIOD='
        b = date
        #Write in body of file
        f.write("[Global]\n")
        f.write('%s%s' % (a,b,))
        #Close writing process
        f.close()
What other ways are there to specify a path that is compatible with both operating systems?
Assuming that \\cloudnetworkonlinuxserver is a Samba share: this path syntax is Windows-specific. How the share is accessed differs from platform to platform.
On Linux, you'd first have to mount the share at some local path, for example /mnt/cloudshare, and then access that path instead.
You should take this path from a command-line argument or environment variable so that the correct path can be passed in each environment.
Or, if this should be part of the script, the script itself would have to mount the share when running on Linux (and ideally also unmount it when it is no longer needed).
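The command-line-or-environment suggestion could look roughly like this. It is a minimal sketch under stated assumptions: the environment variable name SHARE_ROOT and both function names are hypothetical, and the share is assumed to be already mounted (on Linux) or reachable as a UNC path (on Windows).

```python
import os
import sys

ENV_VAR = "SHARE_ROOT"  # hypothetical variable name, chosen for this sketch

def resolve_share_root(argv=None, environ=None):
    """Return the share root from the first CLI argument, else from ENV_VAR."""
    argv = sys.argv[1:] if argv is None else argv
    environ = os.environ if environ is None else environ
    if argv:
        return argv[0]
    if ENV_VAR in environ:
        return environ[ENV_VAR]
    raise SystemExit("Pass the share root as an argument or set %s" % ENV_VAR)

def billing_dir(root):
    """Build the source folder from the question below the given root."""
    return os.path.join(root, "DBX", "SrcFiles", "MEBilling")

# Example with explicit inputs, so nothing depends on the real environment:
linux_root = resolve_share_root(["/mnt/cloudshare"], {})
windows_root = resolve_share_root([], {ENV_VAR: r"\\cloudnetworkonlinuxserver"})
```

The rest of the script then calls os.listdir(billing_dir(root)) and never hard-codes a platform-specific prefix.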

How do I access a similar path to a file that only has a minor difference between computers?

I am trying to access a file from a Box folder as I am working on two different computers. So the file path is pretty much the same except for the username.
I am trying to load a numpy array from a .npy file and I could easily change the path each time, but it would be nice if I could make it universal.
Here is what the line of code looks like on my one computer:
y_pred_walking = np.load('C:/Users/Eric/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
And here is what the line of code looks like on the other computer:
y_pred_walking = np.load('C:/Users/erapp/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
The only difference is that the username on one computer is Eric and the other is erapp, but is there a way where I can make the line universal to all computers where all computers will have the Box folder?
You could either save the file to a path that doesn't depend on the user: e.g. 'C:/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'
Or you could do some string formatting. One way would be with an environment or configuration variable that indicates which is the relevant user, and then for your load statement:
import os
current_user = os.environ.get("USERNAME") # assuming you're running on the Windows box as the relevant user
# Now load the formatted string. f-strings are better, but this is more obvious since f-strings are still very new to Python
y_pred_walking = 'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'.format(user=current_user)
Yes, there is a way. At least for the problem as it stands, the solution is pretty simple: use f-strings
user='Eric'
y_pred_walking =np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
or, more generally,
def pred_walking(user):
    return np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
so on any machine you just do
y_pred_walking=pred_walking(user)
with user defined beforehand, to get the result.
Simply search the folders recursively for your file:
filename = 'y_pred_test.npy'
import os
import random
# creates 1000 directories with a 1% chance of having the file as well
for k in range(20):
    for i in range(10):
        for j in range(5):
            os.makedirs(f"./{k}/{i}/{j}")
            if random.randint(1,100) == 2:
                with open(f"./{k}/{i}/{j}/{filename}","w") as f:
                    f.write(" ")
# search the directories for your file
found_in = []
# this starts searching in your current folder - you can give it your c:\Users\ instead
for root,dirs,files in os.walk("./"):
    if filename in files:
        found_in.append(os.path.join(root,filename))
print(*found_in,sep = "\n")
File found in:
./17/3/1/y_pred_test.npy
./3/8/1/y_pred_test.npy
./16/3/4/y_pred_test.npy
./16/5/3/y_pred_test.npy
./14/2/3/y_pred_test.npy
./0/5/4/y_pred_test.npy
./11/9/0/y_pred_test.npy
./9/8/1/y_pred_test.npy
If you get read errors because of missing file/directory permissions you can start directly in the users folder:
# Source: https://stackoverflow.com/a/4028943/7505395
import os
from pathlib import Path
home = str(Path.home())
found_in = []
for root,dirs,files in os.walk(home):
    if filename in files:
        found_in.append(os.path.join(root,filename))
# use found_in[0] or break as soon as you find first file
You can use the expanduser function in the os.path module to modify a path to start from the home directory of a user
https://docs.python.org/3/library/os.path.html#os.path.expanduser
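A minimal sketch of the expanduser approach, assuming the Box folder sits directly inside each user's home directory, as in both paths from the question. The helper name box_file_path is made up for illustration.

```python
import os

# Path of the file relative to the home directory, taken from the question
BOX_RELATIVE = ("Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/"
                "bidir_lstm_50_50/predictions/y_pred_test.npy")

def box_file_path():
    """Return the file's full path below the current user's home directory."""
    home = os.path.expanduser("~")  # e.g. C:\Users\Eric or C:\Users\erapp
    return os.path.join(home, *BOX_RELATIVE.split("/"))

print(box_file_path())
# y_pred_walking = np.load(box_file_path())  # same line on both computers
```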

File not found from Python although file exists

I'm trying to load a simple text file with an array of numbers into Python. A MWE is
import numpy as np
BASE_FOLDER = 'C:\\path\\'
BASE_NAME = 'DATA.txt'
fname = BASE_FOLDER + BASE_NAME
data = np.loadtxt(fname)
However, this gives an error while running:
OSError: C:\path\DATA.txt not found.
I'm using VSCode, so in the debug window the link to the path is clickable. And, of course, if I click it the file opens normally, so this tells me that the path is correct.
Also, if I do print(fname), VSCode also gives me a valid path.
Is there anything I'm missing?
EDIT
As per your (very helpful for future reference) comments, I've changed my code using the os module and raw strings:
BASE_FOLDER = r'C:\path_to_folder'
BASE_NAME = r'filename_DATA.txt'
fname = os.path.join(BASE_FOLDER, BASE_NAME)
Still results in error.
Second EDIT
I've tried again with another file. Very basic path and filename
BASE_FOLDER = r'Z:\Data\Enzo\Waste_Code'
BASE_NAME = r'run3b.txt'
And again, I get the same error.
If I try an alternative approach,
os.chdir(BASE_FOLDER)
a = os.listdir()
then select the right file,
fname = a[1]
I still get the error when trying to import it. Even though I'm retrieving it directly from listdir.
>> os.path.isfile(a[1])
False
Using the module os you can check the existence of the file within python by running
import os
os.path.isfile(fname)
If it returns False, no file exists at the path given by fname. If it returns True, np.loadtxt() should be able to read it.
Extra: good practice working with files and paths
When working with files, it is advisable to use the functionality built into the standard library, specifically the os module, where os.path.join() takes care of joining path components no matter which operating system you are using.
fname = os.path.join(BASE_FOLDER, BASE_NAME)
In addition, it is advisable to use raw strings by adding an r before the opening quote. This makes writing paths less tedious, as it lets you copy-paste from the navigation bar. It will look something like BASE_FOLDER = r'C:\path'. Note that you don't need to add the trailing '\', as os.path.join takes care of it.
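When os.path.isfile() returns False, it can help to find out which part of the path is the problem. The helper below is a hypothetical diagnostic sketch (the name first_missing_part is not from any library): it walks up from the full path and reports the shortest leading part that does not exist. The demonstration uses a temporary directory so it runs anywhere.

```python
import os
import tempfile

def first_missing_part(path):
    """Return the shortest leading part of path that does not exist, or None."""
    path = os.path.abspath(path)
    missing = None
    while not os.path.exists(path):
        missing = path
        parent = os.path.dirname(path)
        if parent == path:  # reached the root; nothing further to check
            break
        path = parent
    return missing

with tempfile.TemporaryDirectory() as tmp:
    existing = first_missing_part(tmp)                                  # None
    broken = first_missing_part(os.path.join(tmp, "nope", "DATA.txt"))  # ends in "nope"

print(existing, broken)
```

If the reported part is a drive or share rather than the file itself, the process running Python may not see the same mapped drives as the editor.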
You may not have the full permission to read the downloaded file. Use
sudo chmod -R a+rwx file_name.txt
in the command prompt to give yourself permission to read if you are using Ubuntu.
For me the problem was that I was using the Linux home symbol in the link (~/path/file). Replacing it with the absolute path /home/user/etc_path/file worked like a charm.

How to allow only opening files in current directory in Python3?

I am writing a simple file server in Python. The filename is provided by the client and should be considered untrusted. How to verify that it corresponds to a file inside the current directory (within it or any of its subdirectories)? Will something like:
pwd=os.getcwd()
if os.path.commonpath((pwd,os.path.abspath(filename))) == pwd:
    open(filename,'rb')
suffice?
Convert the filename to a canonical path using os.path.realpath, get the directory portion, and see if the current directory (in canonical form) is a prefix of that:
import os, os.path
def in_cwd(fname):
    path = os.path.dirname(os.path.realpath(fname))
    return path.startswith(os.getcwd())
By converting fname to a canonical path we handle symbolic links and paths containing ../.
Update
Unfortunately, the above code has a little problem. For example,
'/a/b/cd'.startswith('/a/b/c')
returns True, but we definitely don't want that behaviour here! Fortunately, there's an easy fix: we just need to append os.sep to the paths before performing the prefix test. The new version also handles any OS pathname case-insensitivity issues via os.path.normcase.
import os, os.path
def clean_dirname(dname):
    dname = os.path.normcase(dname)
    return os.path.join(dname, '')
def in_cwd(fname):
    cwd = clean_dirname(os.getcwd())
    path = os.path.dirname(os.path.realpath(fname))
    path = clean_dirname(path)
    return path.startswith(cwd)
Thanks to DSM for pointing out the flaw in the previous code.
Here's a version that's a little more efficient. It uses os.path.commonpath, which is more robust than appending os.sep and doing a string prefix test.
def in_cwd(fname):
    cwd = os.path.normcase(os.getcwd())
    path = os.path.normcase(os.path.dirname(os.path.realpath(fname)))
    return os.path.commonpath((path, cwd)) == cwd
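To see the commonpath check reject traversal attempts, here is a small variation, offered as a sketch rather than a replacement for the answer's code. The function name is_inside is made up. Unlike in_cwd, it takes the base directory as an argument, compares the resolved file path itself rather than its parent directory, and treats the ValueError that os.path.commonpath raises for paths on different Windows drives as "not inside".

```python
import os
import tempfile

def is_inside(base_dir, filename):
    """Return True if filename, resolved relative to base_dir, stays within it."""
    base = os.path.normcase(os.path.realpath(base_dir))
    target = os.path.normcase(os.path.realpath(os.path.join(base, filename)))
    try:
        return os.path.commonpath((base, target)) == base
    except ValueError:  # e.g. paths on different drives on Windows
        return False

with tempfile.TemporaryDirectory() as base:
    inside = is_inside(base, os.path.join("sub", "file.txt"))
    escaped = is_inside(base, os.path.join("..", "secret.txt"))
    # a sibling directory whose name merely starts with the base directory's name
    sibling = is_inside(base, os.path.join("..", os.path.basename(base) + "x", "f"))

print(inside, escaped, sibling)  # True False False
```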
