Copying files from one location of a server to another using python

Copying files from one location of a server to another using python - python

Say I have a file that contains the different locations where some '.wav' files are present on a server. For example say the content of the text file location.txt containing the locations of the wav files is this
/home/user/test_audio_folder_1/audio1.wav
/home/user/test_audio_folder_2/audio2.wav
/home/user/test_audio_folder_3/audio3.wav
/home/user/test_audio_folder_4/audio4.wav
/home/user/test_audio_folder_5/audio5.wav
Now what I want to do is that I want to copy these files from different locations within the server to a particular directory within that server, for example say /home/user/final_audio_folder/ and this directory will contain all the audio files from audio1.wav to audio5.wav
I am trying to perform this task by using shutil, but the main problem with shutil that I am facing is that while copying the files, I need to name the file. I have written a demo version of what I am trying to do, but dont know how to scale it when I will be reading the paths of the '.wav' files from the txt file and copy them to my desired location using a loop.
My code for copying a single file goes as follows,
import shutil
original = r'/home/user/test_audio_folder_1/audio1.wav'
target=r'/home/user/final_audio_folder_1/final_audio1.wav'
shutil.copyfile(original,target)
Any suggestions will be really helpful. Thank you.

import shutil
i=0
with open(r'C:/Users/turing/Desktop/location.txt', "r") as infile:
for t in infile:
i+=1
x="audio"+str(i)+".wav"
t=t.rstrip('\n')
original= r'{}'.format(t)
target=r'C:/Users/turing/Desktop/audio_in/' + x
shutil.copyfile(original, target)

Use the built-in string's split() method within a for loop on the location.txt contents & split the name of the directory on the '/' character, then the last element in a new list would be your filename.

Related

Errors with Glob while outputting file names

I am combining two questions here because they are related to each other.
Question 1: I am trying to use glob to open all the files in a folder but it is giving me "Syntax Error". I am using Python 3.xx. Has the syntax changed for Python 3.xx?
Error Message:
File "multiple_files.py", line 29
files = glob.glob(/src/xyz/rte/folder/)
SyntaxError: invalid syntax
Code:
import csv
import os
import glob
from pandas import DataFrame, read_csv
#extracting
files = glob.glob(/src/xyz/rte/folder/)
for fle in files:
with open (fle) as f:
print("output" + fle)
f_read.close()
Question 2: I want to read input files, append "output" to the names and print out the names of the files. How can I do that?
Example: Input file name would be - xyz.csv and the code should print output_xyz.csv .
Your help is appreciated.

Your first problem is that strings, including pathnames, need to be in quotes. This:
files = glob.glob(/src/xyz/rte/folder/)
… is trying to divide a bunch of variables together, but the leftmost and rightmost divisions are missing operands, so you've confused the parser. What you want is this:
files = glob.glob('/src/xyz/rte/folder/')
Your next problem is that this glob pattern doesn't have any globs in it, so the only thing it's going to match is the directory itself.
That's perfectly legal, but kind of useless.
And then you try to open each match as a text file. Which you can't do with a directory, hence the IsADirectoryError.
The answer here is less obvious, because it's not clear what you want.
Maybe you just wanted all of the files in that directory? In that case, you don't want glob.glob, you want listdir (or maybe scandir): os.listdir('/src/xyz/rte/folder/').
Maybe you wanted all of the files in that directory or any of its subdirectories? In that case, you could do it with rglob, but os.walk is probably clearer.
Maybe you did want all the files in that directory that match some pattern, so glob.glob is right—but in that case, you need to specify what that pattern is. For example, if you wanted all .csv files, that would be glob.glob('/src/xyz/rte/folder/*.csv').
Finally, you say "I want to read input files, append "output" to the names and print out the names of the files". Why do you want to read the files if you're not doing anything with the contents? You can do that, of course, but it seems pretty wasteful. If you just want to print out the filenames with output appended, that's easy:
for filename in os.listdir('/src/xyz/rte/folder/'):
print('output'+filename)

This works in http://pyfiddle.io:
Doku: https://docs.python.org/3/library/glob.html
import csv
import os
import glob
# create some files
for n in ["a","b","c","d"]:
with open('{}.txt'.format(n),"w") as f:
f.write(n)
print("\nFiles before")
# get all files
files = glob.glob("./*.*")
for fle in files:
print(fle) # print file
path,fileName = os.path.split(fle) # split name from path
# open file for read and second one for write with modified name
with open (fle) as f,open('{}{}output_{}'.format(path,os.sep, fileName),"w") as w:
content = f.read() # read all
w.write(content.upper()) # write all modified
# check files afterwards
print("\nFiles after")
files = glob.glob("./*.*") # pattern for all files
for fle in files:
print(fle)
Output:
Files before
./d.txt
./main.py
./c.txt
./b.txt
./a.txt
Files after
./d.txt
./output_c.txt
./output_d.txt
./main.py
./output_main.py
./c.txt
./b.txt
./output_b.txt
./a.txt
./output_a.txt
I am on windows and would use os.walk (Doku) instead.
for d,subdirs,files in os.walk("./"): # deconstruct returned aktDir, all subdirs, files
print("AktDir:", d)
print("Subdirs:", subdirs)
print("Files:", files)
Output:
AktDir: ./
Subdirs: []
Files: ['d.txt', 'output_c.txt', 'output_d.txt', 'main.py', 'output_main.py',
'c.txt', 'b.txt', 'output_b.txt', 'a.txt', 'output_a.txt']
It also recurses into subdirs.

Using os.system() in a specific directory only

I have a directory containing mutliple files with similar names and subdirectories named after these so that files with like-names are located in that subdirectory. I'm trying to concatenate all the .sdf files in a given subdirectory to a single .sdf file.
import os
from os import system
for ele in os.listdir(Path):
if ele.endswith('.sdf'):
chdir(Path + '/' + ele[0:5])
system('cat' + ' ' + '*.sdf' + '>' + ele[0:5] + '.sdf')
However when I run this, the concatenated file includes every .sdf file from the original directory rather than just the .sdf files from the desired one. How do I alter my script to concatenate the files in the subdirectory only?

this is a very clumsy way of doing it. Using chdir is not recommended, and system either (deprecated, and overkill to call cat)
Let me propose a pure python implementation using glob.glob to filter the .sdf files, and read each file one by one and write to the big file opened before the loop:
import glob,os
big_sdf_file = "all_data.sdf" # I'll let you compute the name/directory you want
with open(big_sdf_file,"wb") as fw:
for sdf_file in glob.glob(os.path.join(Path,"*.sdf")):
with open(sdf_file,"rb") as fr:
fw.write(fr.read())
I left big_sdf_file not computed, I would not recommend to put it in the same directory as the other files, since running the script twice would result in taking the output as input as well.
Note that the drawback of this approach is that if the files are big, they're read fully into memory, which can cause problems. In that case, replace
fw.write(fr.read())
by:
shutil.copyfileobj(fr,fw)
(importing shutil is necessary in that case). That allows packet copy instead of full-file read/write.
I'll add that it's probably not the full solution you're expecting, since there seem to be something about scanning the sub-directories of Path to create 1 big .sdf file per sub-directory, but with the provided code which doesn't use any system command or chdir, it should be easier to adapt to your needs.

Extracting all file names in python

I have a application that converts from one photo format to another by inputting in cmd.exe following: "AppConverter.exe" "file.tiff" "file.jpeg"
But since i don't want to input this every time i want a photo converted, i would like a script that converts all files in the folder. So far i have this:
def start(self):
for root, dirs, files in os.walk("C:\\Users\\x\\Desktop\\converter"):
for file in files:
if file.endswith(".tiff"):
subprocess.run(['AppConverter.exe', '.tiff', '.jpeg'])
So how do i get the names of all the files and put them in subprocess. I am thinking taking basename (no ext.) for every file and pasting it in .tiff and .jpeg, but im at lost on how to do it.

I think the fastest way would be to use the glob module for expressions:
import glob
import subprocess
for file in glob.glob("*.tiff"):
subprocess.run(['AppConverter.exe', file, file[:-5] + '.jpeg'])
# file will be like 'test.tiff'
# file[:-5] will be 'test' (we remove the last 5 characters, so '.tiff'
# we add '.jpeg' to our extension-less string
All those informations are on the post I've linked in the comments o your original question.

You could try looking into os.path.splitext(). That allows you to split the file name into a tuple containing the basename and extension. That might help...
https://docs.python.org/3/library/os.path.html

How to input all files within a directory with a certain ending? Python

I have a folder called Test '/Desktop/Test/'
I have several files in the folder (eg. 1.fa,2.fa,3.fa,X.fa)
or '/Desktop/Test/1.fa','/Desktop/Test/2.fa','/Desktop/Test/3.fa','/Desktop/Test/X.fa'
I'm trying to create an element in my function to open up every file in the directory '/Desktop/Test/' that has an ending .fa without actually making a function with 20 or so variables (the only way I know how to do it)
Example:
def simple(input):
#input would be the directory '/Desktop/Test/'
#for each .fa in the directory (eg. 1.fa, 2.fa, 3.fa, X.fa) i need it to create a list of all the strings within each file
#query=open(input,'r').read().split('\n') is what I would use in my simple codes that only have one input
How can one input all files within a directory with a certain ending (.fa) ?

You can do:
import glob
import os
def simple(input)
os.chdir(input)
for file in glob.glob("*.fa"):
with open(file, 'r+') as f:
print f.readlines() #or do whatever

Concatenating fasta files from different folders

I have a large numbers of fasta files (these are just text files) in different subfolders. What I need is a way to search through the directories for files that have the same name and concatenate these into a file with the name of the input files. I can't do this manually as I have 10000+ genes that I need to do this for.
So far I have the following Python code that looks through one of the directories and then uses those file names to search through the other directories. This returns a list that has the full path for each file.
import os
from os.path import join, abspath
path = '/directoryforfilelist/' #Directory for source list
listing = os.listdir(path)
for x in listing:
for root, dirs, files in os.walk('/rootdirectorytosearch/'):
if x in files:
pathlist = abspath(join(root,x))
Where I am stuck is how to concatenate the files it returns that have the same name. The results from this script look like this.
/directory1/file1.fasta
/directory2/file1.fasta
/directory3/file1.fasta
/directory1/file2.fasta
/directory2/file2.fasta
/directory3/file2.fasta
In this case I would need the end result to be two files named file1.fasta and file2.fasta that contain the text from each of the same named files.
Any leads on where to go from here would be appreciated. While I did this part in Python anyway that gets the job done is fine with me. This is being run on a Mac if that matters.

Not tested, but here's roughly what I'd do:
from itertools import groupby
import os
def conc_by_name(names):
for tail, group in groupby(names, key=os.path.split):
with open(tail, 'w') as out:
for name in group:
with open(name) as f:
out.writelines(f)
This will create the files (file1.fasta and file2.fasta in your example) in the current folder.

For each file of your list, allocate the target file in append mode, read each line of your source file and write it to the target file.
Assuming that the target folder is empty to start with, and is not in /rootdirectorytosearch.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.