Running python script on multiple files - python

I have a python script that reads a file and extracts certain data from it, then stores the data in a dictionary, and finally inserts that into a mysql table. The first line of my code is where I enter in my filename:
filename = "examplelogfile"
which is a text file, and is later referred to as 'filename'. But I have to run this code on almost a thousand different files of this type, and I've moved all of these files to a certain server (I'm not running the script in terminal since I need to run it on the same server where the mysql database is located). What's the easiest way to run my code on all of the files?
Edit: I have tried this but it isn't giving me any output (it is supposed to print all the dictionaries):
import glob
files = glob.glob('~/Desktop/pythoncode/*logfile')
for filename in files:
    # rest of code
    print dict

Try the glob module to get all files like this:
import glob
path = 'C:/Users/telli/Desktop/Test Shapes/Shapes/Squares'
filenames = glob.glob(path + '/*.gif')
for filename in filenames:
    pass  # Do something with filename
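One likely reason the attempt in the question prints nothing: glob does not expand `~`, so the pattern '~/Desktop/pythoncode/*logfile' is looked up literally and matches no files. Below is a minimal sketch, assuming the per-file logic is moved into a function. Both find_logfiles and process_file are hypothetical names, and process_file only stands in for the existing parse-and-insert code.

```python
import glob
import os


def find_logfiles(pattern):
    """Return sorted paths matching pattern, expanding a leading ~ first."""
    # glob.glob() treats "~" as a literal character, so expand it explicitly.
    return sorted(glob.glob(os.path.expanduser(pattern)))


def process_file(filename):
    """Hypothetical placeholder for the existing parse-and-insert code."""
    with open(filename) as f:
        return {"filename": filename, "line_count": sum(1 for _ in f)}


if __name__ == "__main__":
    for filename in find_logfiles("~/Desktop/pythoncode/*logfile"):
        print(process_file(filename))
```

Keeping the per-file work in one function means the loop stays the same whether there are five files or a thousand.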

Related

Copying files from one location of a server to another using python

Say I have a file that lists the locations of some '.wav' files on a server. For example, suppose the text file location.txt contains the following:
/home/user/test_audio_folder_1/audio1.wav
/home/user/test_audio_folder_2/audio2.wav
/home/user/test_audio_folder_3/audio3.wav
/home/user/test_audio_folder_4/audio4.wav
/home/user/test_audio_folder_5/audio5.wav
I want to copy these files from their various locations on the server into one directory on the same server, say /home/user/final_audio_folder/, so that this directory contains all the audio files from audio1.wav to audio5.wav.
I am trying to do this with shutil, but the main problem I am facing is that I need to name each file when copying it. I have written a demo version of what I am trying to do, but I don't know how to scale it so that it reads the paths of the '.wav' files from the text file and copies them to my desired location in a loop.
My code for copying a single file is as follows:
import shutil
original = r'/home/user/test_audio_folder_1/audio1.wav'
target=r'/home/user/final_audio_folder_1/final_audio1.wav'
shutil.copyfile(original,target)
Any suggestions will be really helpful. Thank you.
import shutil
i = 0
with open(r'C:/Users/turing/Desktop/location.txt', "r") as infile:
    for t in infile:
        i += 1
        x = "audio" + str(i) + ".wav"
        t = t.rstrip('\n')
        original = r'{}'.format(t)
        target = r'C:/Users/turing/Desktop/audio_in/' + x
        shutil.copyfile(original, target)
Use the string split() method inside a for loop over the lines of location.txt: split each path on the '/' character, and the last element of the resulting list is the filename.
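The split() suggestion can be sketched as follows. This is a minimal sketch, assuming each line of the list file holds one path with '/' separators; copy_listed_files and its parameters are illustrative names, not part of the question.

```python
import os
import shutil


def copy_listed_files(list_path, dest_dir):
    """Copy every file named in list_path into dest_dir, keeping its name."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    with open(list_path) as infile:
        for line in infile:
            source = line.strip()
            if not source:
                continue  # ignore blank lines
            # The last element after splitting on '/' is the file name.
            name = source.split('/')[-1]
            target = os.path.join(dest_dir, name)
            shutil.copyfile(source, target)
            copied.append(target)
    return copied
```

os.path.basename(source) gives the same result as the split and also handles the platform's own separator, so it may be the safer choice on Windows.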

Python: How to read multiple .NBT files and export to JSON?

I am a newbie when it comes to programming and I'm looking for a way to iterate through a directory filled with .NBT files and export them to JSON.
I know I can loop through a directory using os.listdir, but I have trouble reading the files one by one and deciding what steps to take to get them into JSON format.
The actual assignment is to loop through a bunch of .NBT files to see which direction each Minecraft crate is facing.
This is what I have now:
import python_nbt.nbt as nbt
import os
for file in os.listdir("Directory to nbt files"):
    if file.endswith(".nbt"):
        filename = nbt.read_from_nbt_file(file)
From here I get a FileNotFoundError saying there is no such file or directory.
What am I doing wrong?
And what must be done to continue?
You didn't specify the folder the file is in. os.listdir returns bare file names, so pass the folder along with the name:
nbt.read_from_nbt_file(os.path.join('Directory to nbt files', file))
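A minimal sketch of the path fix as a reusable helper. iter_paths_with_suffix is an illustrative name, and the suffix check is case-insensitive on the assumption that the files may be named either .nbt or .NBT. The commented usage reuses nbt.read_from_nbt_file from the question and is not executed here.

```python
import os


def iter_paths_with_suffix(folder, suffix):
    """Yield the full path of each entry in folder whose name ends with suffix."""
    for name in sorted(os.listdir(folder)):
        # Compare in lower case so ".NBT" and ".nbt" are both accepted.
        if name.lower().endswith(suffix.lower()):
            # Join the folder back on: listdir() returns bare names only.
            yield os.path.join(folder, name)


# Usage with the question's library (not executed here):
# for path in iter_paths_with_suffix("Directory to nbt files", ".nbt"):
#     data = nbt.read_from_nbt_file(path)
```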

Trying to print name of all csv files within a given folder

I am trying to write a program in python that loops through data from various csv files within a folder. Right now I just want to see that the program can identify the files in a folder but I am unable to have my code print the file names in my folder. This is what I have so far, and I'm not sure what my problem is. Could it be the periods in the folder names in the file path?
import glob
path = "Users/Sarah/Documents/College/Lab/SEM EDS/1.28.20 CZTS hexane/*.csv"
for fname in glob.glob(path):
    print fname
No error messages are popping up but nothing will print. Does anyone know what I'm doing wrong?
Are you on a Linux-based system? If you're not, you can try switching the / for \\, although Python on Windows generally accepts forward slashes as well.
Is the directory you're giving the full path, from the root folder? You might need to specify a FULL path (drive included).
If that still fails, it may sound silly, but check that there actually are files in there, as your code otherwise seems fine.
The code below worked for me and listed the csv files appropriately (see the C:\\ part, which could be what you're missing).
import glob
path = "C:\\Users\\xhattam\\Downloads\\TEST_FOLDER\\*.csv"
for fname in glob.glob(path):
    print(fname)
The following code gets a list of the files in a folder and prints the name of each one that contains ".csv".
import os
path = r"C:\temp"
filesfolders = os.listdir(path)
for file in filesfolders:
    if ".csv" in file:
        print(file)
Note the indentation in my code. You need to be careful not to mix tabs and spaces, as these are not the same to Python.
Alternatively, you could use os:
import os
path = "."  # assumption: replace with the folder that holds the csv files
files_list = os.listdir(path)
out_list = []
for item in files_list:
    if item[-4:] == ".csv":
        out_list.append(item)
print(out_list)
Are you sure you are using the correct path?
Try moving the Python script into the folder where the CSV files are, and then change it to this:
import glob
path = "./*.csv"
for fname in glob.glob(path):
    print(fname)
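Because glob.glob returns an empty list for a directory that does not exist instead of raising an error, a wrong path and an empty folder look the same. The sketch below separates the two cases by resolving the pattern's directory to an absolute path and testing whether it exists; check_pattern is a hypothetical helper name. Note that the path in the question has no leading / or drive letter, so it is resolved relative to the current working directory.

```python
import glob
import os


def check_pattern(pattern):
    """Return (absolute_directory, directory_exists, matches) for a glob pattern."""
    # Assumes the wildcard appears only in the final path component.
    directory = os.path.abspath(os.path.dirname(pattern) or ".")
    return directory, os.path.isdir(directory), sorted(glob.glob(pattern))


if __name__ == "__main__":
    pattern = "Users/Sarah/Documents/College/Lab/SEM EDS/1.28.20 CZTS hexane/*.csv"
    directory, exists, matches = check_pattern(pattern)
    print("Looking in:", directory)
    print("Directory exists:", exists)
    print("Matches:", matches)
```

If "Directory exists" comes back False, the problem is the path itself rather than the periods or spaces in the folder names.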

taking data from files which are in folder

How do I get the data from multiple txt files that are placed in a specific folder? I started with the code below but could not fix it. It gives an error like 'No such file or directory: '.idea'' (??)
(Let's say I have an A folder and in that, there are x.txt, y.txt, z.txt and so on. I am trying to get and print the information from all the files x,y,z)
def find_get(folder):
    for file in os.listdir(folder):
        f = open(file, 'r')
        for data in open(file, 'r'):
            print data
find_get('filex')
Thanks.
If you just want to print each line:
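import glob
import os
def find_get(path):
    # glob already returns each match with the folder prefix included,
    # so the result can be opened directly without joining the path again
    for f in glob.glob(os.path.join(path, "*.txt")):
        with open(f) as data:
            for line in data:
                print(line)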
glob will find only your .txt files in the specified path.
Your error comes from not joining the path to the filename: unless the file is in the directory you are running the code from, Python cannot find it without the full path. Another issue is that you seem to have a directory named .idea, which would also raise an error when you try to open it as a file. This also presumes you actually have permission to read the files in the directory.
If your files are large, I would avoid reading them fully into memory and/or storing their full content.
First of all, make sure you add the folder name to the file name, so you can find the file relative to where the script is executed.
To do so, use os.path.join, which, as its name suggests, joins paths. So, using a generator:
import os
def find_get(folder):
    for filename in os.listdir(folder):
        relative_file_path = os.path.join(folder, filename)
        with open(relative_file_path) as f:
            # read() gives the entire data from the file
            yield f.read()
# this consumes the generator into a list
files_data = list(find_get('filex'))
See what we got in the list that consumed the generator:
print(files_data)
It may be more convenient to produce tuples which can be used to construct a dict:
def find_get(folder):
    for filename in os.listdir(folder):
        relative_file_path = os.path.join(folder, filename)
        with open(relative_file_path) as f:
            # read() gives the entire data from the file
            yield (relative_file_path, f.read())
# this consumes the generator into a dict
files_data = dict(find_get('filex'))
You will now have a mapping from each file's relative path to its content.
Also, take a look at the answer by @Padraic Cunningham. He brought up the glob module, which is suitable in this case.
The error you're facing is simple: listdir returns filenames, not full pathnames. To turn them into pathnames you can access from your current working directory, you have to join them to the directory path:
for filename in os.listdir(directory):
    pathname = os.path.join(directory, filename)
    with open(pathname) as f:
        pass  # do stuff with f
So, in your case, there's a file named .idea in the folder directory, but you're trying to open a file named .idea in the current working directory, and there is no such file.
There are at least four other potential problems with your code that you also need to think about and possibly fix after this one:
You don't handle errors. There are many very common reasons you may not be able to open and read a file--it may be a directory, you may not have read access, it may be exclusively locked, it may have been moved since your listdir, etc. And those aren't logic errors in your code or user errors in specifying the wrong directory, they're part of the normal flow of events, so your code should handle them, not just die. Which means you need a try statement.
You don't do anything with the files but print out every line. Basically, this is like running cat folder/* from the shell. Is that what you want? If not, you have to figure out what you want and write the corresponding code.
You open the same file twice in a row, without closing in between. At best this is wasteful, at worst it will mean your code doesn't run on any system where opens are exclusive by default. (Are there such systems? Unless you know the answer to that is "no", you should assume there are.)
You don't close your files. Sure, the garbage collector will get to them eventually--and if you're using CPython and know how it works, you can even prove the maximum number of open file handles that your code can accumulate is fixed and pretty small. But why rely on that? Just use a with statement, or call close.
However, none of those problems are related to your current error. So, while you have to fix them too, don't expect fixing one of them to make the first problem go away.
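The points about error handling, double opens and unclosed files can be addressed together. Below is a minimal sketch, assuming it is acceptable to skip entries that cannot be read; read_text_files is an illustrative name, not something from the question.

```python
import os


def read_text_files(folder):
    """Return ({path: content}, [skipped paths]) for the entries in folder."""
    contents, skipped = {}, []
    for filename in os.listdir(folder):
        pathname = os.path.join(folder, filename)
        try:
            # "with" opens the file once and closes it even if read() fails.
            with open(pathname) as f:
                contents[pathname] = f.read()
        except (OSError, UnicodeDecodeError):
            # Directories, permission problems, vanished or non-text files.
            skipped.append(pathname)
    return contents, skipped
```

Returning the skipped paths instead of silently dropping them lets the caller decide whether a failure such as the .idea directory matters.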
Full variant:
import os
def find_get(path):
    files = {}
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            with open(os.path.join(path, file), "r") as data:
                files[file] = data.read()
    return files
print(find_get("filex"))
Output:
{'1.txt': 'dsad', '2.txt': 'fsdfs'}
After that you could generate one file from that content, etc.
Key things:
os.listdir returns a list of names without the full path, so you need to join the initial path with each found item before operating on it.
A dict is an ideal container for the result :)
os.listdir returns both files and folders, so you need to check that each item really is a file.
You should check that each entry is actually a file and not a folder, since you can't open folders for reading. Also, you can't open the bare file name, since the file is inside a folder, so build the correct path with os.path.join. Check below:
import os
def find_get(folder):
    for file in os.listdir(folder):
        path = os.path.join(folder, file)
        # test the joined path; the bare name would be checked against the cwd
        if not os.path.isfile(path):
            continue  # skip directories
        with open(path, 'r') as f:
            for line in f:
                print(line)

Opening .out files in Python

Am I right in thinking Python cannot open and read from .out files?
My application currently spits out a bunch of .out files that would otherwise be read manually for logging purposes, so I'm building a Python script to automate this.
When the script gets to the following:
for file in os.listdir(DIR_NAME):
    if (file.endswith('.out')):
        open(file)
The script blows up with the following error "IOError : No such file or directory: 'Filename.out' "
I have a similar function with the same code that works fine, except that it reads .err files. Printing DIR_NAME before the code above also shows that the correct directory is being pointed to.
os.listdir() returns only filenames, not full paths. Use os.path.join() to create a full path:
for file in os.listdir(DIR_NAME):
    if (file.endswith('.out')):
        open(os.path.join(DIR_NAME, file))
As an alternative that I find a bit easier and more flexible to use:
import glob, os
for outfile in glob.glob(os.path.join(DIR_NAME, '*.out')):
    open(outfile)
Glob will also accept things like '*/*.out' or '*something*.out'. I also read files of certain types and have found this to be very handy.
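To illustrate those patterns, the sketch below builds a small throwaway directory tree (the file names are made up) and prints what each pattern matches. match_names is an illustrative helper. The recursive '**' form goes beyond the patterns mentioned here; it needs Python 3.5 or later together with recursive=True.

```python
import glob
import os
import tempfile


def match_names(root, pattern, recursive=False):
    """Return sorted paths, relative to root, that match pattern under root."""
    hits = glob.glob(os.path.join(root, pattern), recursive=recursive)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


# Build a small example tree in a temporary directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "run1"))
for rel in ("top.out", "top.err", "run1/job_something_1.out", "run1/other.out"):
    with open(os.path.join(root, *rel.split("/")), "w") as f:
        f.write("log\n")

print(match_names(root, "*.out"))              # files directly in root
print(match_names(root, "*/*.out"))            # files one folder down
print(match_names(root, "*/*something*.out"))  # name contains "something"
print(match_names(root, "**/*.out", recursive=True))  # any depth, root included
```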
