Python: read multiple files in order

I have files named in sequence, like this:
H2_000.csv,
H2_001.csv,
H2_002.csv,
H2_003.csv,
H2_004.csv,
H2_005.csv.
import glob
path = 'path/H2_*.csv'
files=glob.glob(path)
for file in files:
    f=open(file, 'r')
    print f
Output:
<open file 'path/H2_003.csv', mode 'r' at 0x7f3ce9eca150>
<open file 'path/H2_000.csv', mode 'r' at 0x7f3ce9eca1e0>
<open file 'path/H2_004.csv', mode 'r' at 0x7f3ce9eca150>
<open file 'path/H2_001.csv', mode 'r' at 0x7f3ce9eca1e0>
But this opens the files in an arbitrary order.
I want the files to be opened in order.
Can anyone help me? Thanks!

All you need to do is sort the list of files (and always use with when opening them):
import glob
path = 'path/H2_*.csv'
files=glob.glob(path)
for file in sorted(files):
    with open(file, 'r') as f:
        print(f)
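sorted() compares the names as strings, which gives the right order here because the numbers are zero-padded to the same width. If the names were not padded (for example H2_2.csv and H2_10.csv), a numeric sort key would be needed. A minimal sketch, assuming the index is the last run of digits before the extension; the helper name numeric_suffix is made up for illustration:

```python
import re

def numeric_suffix(filename):
    """Return the trailing number in a name such as 'H2_10.csv' as an int.

    Hypothetical helper: assumes the index is the last run of digits
    before the extension. Names without such digits get -1 and sort first.
    """
    match = re.search(r'(\d+)\.[^.]+$', filename)
    return int(match.group(1)) if match else -1

names = ['H2_10.csv', 'H2_2.csv', 'H2_1.csv']
print(sorted(names))                      # ['H2_1.csv', 'H2_10.csv', 'H2_2.csv']
print(sorted(names, key=numeric_suffix))  # ['H2_1.csv', 'H2_2.csv', 'H2_10.csv']
```

The same key can be passed as sorted(glob.glob(path), key=numeric_suffix).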

Related

How can I iterate through a list of files and open each file

I have a list of file names. I am trying to iterate over it and open each file with a with open statement.
#list of text files
files = ['file1.txt','file2.txt','file3.txt']
for file in files:
    with open(file as f ):
        file.readlines()
This should work. Note that I used os.chdir() to change the working directory to the directory containing the files. If your files list contains the full paths of the files, you won't need to do this.
import os
#change working directory to the directory containing the files
os.chdir("C:\\Folder1\\Folder Containing files")
files = ['file1.txt','file2.txt','file3.txt']
content = []
for file in files:
    with open(file, 'r') as f:
        content.append(f.readlines()) # note that it's f.readlines() and not file.readlines()
with open(file as f ) should be changed to with open(file, 'r') as f. This opens the file in read mode and stores the resulting file object in the variable f.
You should also replace file.readlines() with f.readlines(), because file is the string holding the file path rather than the file object itself.
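An alternative to changing the working directory is to join the folder and each file name explicitly. A minimal sketch, assuming the files are plain text; read_all, folder and filenames are illustrative names, not part of the original question:

```python
import os

def read_all(folder, filenames):
    """Return {filename: list of lines} for each name in filenames.

    folder and filenames are placeholders. os.path.join builds the full
    path, so the working directory is left unchanged.
    """
    content = {}
    for name in filenames:
        with open(os.path.join(folder, name), 'r') as f:
            content[name] = f.readlines()
    return content

# Example call (the folder is hypothetical):
# lines = read_all("C:\\Folder1\\Folder Containing files", ['file1.txt', 'file2.txt'])
```

Keying the result by file name makes it easy to tell afterwards which lines came from which file.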

How to open files based on extension?

I want to open any .txt file in the same directory.
In Ruby I can do
File.open("*.txt").each do |line|
  puts line
end
In Python I can't do this; it gives an error:
file = open("*.txt","r")
print(file.read())
file.close()
The error is "invalid argument".
Is there any way around it?
You can use the glob module directly for this:
import glob
for file in glob.glob('*.txt'):
    with open(file, 'r') as f:
        print(f.read())
Use os.listdir to list all files in the current directory.
import os
all_files = os.listdir()
Then filter the ones that have the extension you are looking for and open each of them in a loop.
for filename in all_files:
    if filename.lower().endswith('.txt'):
        with open(filename, 'rt') as f:
            print(f.read())
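The same idea can also be written with pathlib from the standard library. A minimal sketch, assuming the files are small enough to read whole; read_text_files is an illustrative name:

```python
from pathlib import Path

def read_text_files(directory='.', pattern='*.txt'):
    """Yield (name, text) for each file matching pattern, sorted by name."""
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():  # skip directories that happen to match
            yield path.name, path.read_text()

# Example: print every .txt file in the current directory
# for name, text in read_text_files():
#     print(name)
#     print(text)
```

Sorting the matches makes the order predictable, since neither glob nor os.listdir guarantees one.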

Copying all files of a directory to one text file in python

My intention is to copy the text of all my C# (and later aspx) files into one final text file, but it doesn't work.
For some reason, the "yo.txt" file is not created.
I know that the iteration over the files works, but I can't write the data into the .txt file.
The variable 'data' eventually does contain all the text from the files...
Could it be connected to the fact that there are some non-ASCII characters in the text of the C# files?
Here is my code:
import os
import sys
src_path = sys.argv[1]
os.chdir(src_path)
data = ""
for file in os.listdir('.'):
    if os.path.isfile(file):
        if file.split('.')[-1]=="cs" and (len(file.split('.'))==2 or len(file.split('.'))==3):
            print "Copying", file
            with open(file, "r") as f:
                data += f.read()
print data
with open("yo.txt", "w") as f:
    f.write(data)
If someone has an idea, it will be great :)
Thanks
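You have to ensure that the directory in which the file is created has write permission. If it does not, run
chmod -R 777 .
to make the directory writable. (Mode 777 grants write access to every user; chmod u+w . is enough if only your own account needs to write.)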
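import os

inputdir = '.'          # directory to scan (placeholder value)
outputfile = 'yo.txt'   # combined output file (placeholder value)

filelist = []
for r, d, f in os.walk(inputdir):
    for file in f:
        # os.path.join uses the right separator on every platform
        filelist.append(os.path.join(r, file))
with open(outputfile, 'w') as outfile:
    for f in filelist:
        with open(f) as infile:
            for line in infile:
                outfile.write(line)
            outfile.write('\n \n')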
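On the question of non-ASCII characters: in Python 3, open() decodes text with a platform-dependent default encoding, so naming the encoding explicitly avoids surprises. A minimal sketch, assuming Python 3 and that the sources are (mostly) UTF-8; concatenate_sources and its parameters are illustrative names:

```python
import os

def concatenate_sources(src_dir, out_path, extension='.cs'):
    """Copy every file in src_dir ending with extension into out_path.

    Returns the list of file names copied. encoding='utf-8' with
    errors='replace' is an assumption: undecodable bytes are replaced
    instead of raising UnicodeDecodeError.
    """
    copied = []
    out_abs = os.path.abspath(out_path)
    with open(out_path, 'w', encoding='utf-8') as outfile:
        for name in sorted(os.listdir(src_dir)):
            full = os.path.join(src_dir, name)
            if not os.path.isfile(full) or not name.endswith(extension):
                continue
            if os.path.abspath(full) == out_abs:
                continue  # never copy the output file into itself
            with open(full, 'r', encoding='utf-8', errors='replace') as infile:
                outfile.write(infile.read())
                outfile.write('\n')
            copied.append(name)
    return copied

# Example call (paths are hypothetical):
# concatenate_sources('.', 'yo.txt')
```

Returning the list of copied names gives a quick way to check that the filter matched the files you expected.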

How to sequentially read all the files in a directory and export the contents in Python?

I have a directory /directory/some_directory/ and in that directory I have a set of files. Those files are named in the following format: <letter>-<number>_<date>-<time>_<dataidentifier>.log, for example:
ABC1-123_20162005-171738_somestring.log
DE-456_20162005-171738_somestring.log
ABC1-123_20162005-153416_somestring.log
FG-1098_20162005-171738_somestring.log
ABC1-123_20162005-031738_somestring.log
DE-456_20162005-171738_somestring.log
I would like to read a subset of those files (for example, only the files named ABC1-123*.log) and export all their contents to a single CSV file (for example, output.csv), that is, a CSV file that contains the data from all the individual files.
The code that I have written so far:
#!/usr/bin/env python
import os
file_directory=os.getcwd()
m_class="ABC1"
m_id="123"
device=m_class+"-"+m_id
for data_file in sorted(os.listdir(file_dir)):
    if str(device)+"*" in os.listdir(file_dir):
        print data_file
I don't know how to read only the filtered subset of files, or how to export them to a common CSV file.
How can I achieve this?
Just use the re library to match the file name pattern, and the csv library to export.
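A minimal sketch of that suggestion, assuming each log line should become one CSV row of the form (file name, line); the question does not say how the log lines are structured, so the row layout and the name export_logs are assumptions:

```python
import csv
import os
import re

def export_logs(directory, device, out_path):
    """Write one CSV row (file name, line) per line of each matching log.

    Returns the matching file names in the order they were processed.
    """
    # re.escape keeps characters such as '-' in device literal
    pattern = re.compile(re.escape(device) + r'_.*\.log$')
    names = sorted(n for n in os.listdir(directory) if pattern.match(n))
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for name in names:
            with open(os.path.join(directory, name), 'r') as log_file:
                for line in log_file:
                    writer.writerow([name, line.rstrip('\n')])
    return names

# Example call (paths are hypothetical):
# export_logs('/directory/some_directory/', 'ABC1-123', 'output.csv')
```

The underscore after the device name in the pattern keeps a device such as ABC1-1234 from matching when ABC1-123 is requested.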
Only a few adjustments; you were close:
import os

device = "ABC1-123"
filesFromDir = os.listdir(os.getcwd())
fileList = [file for file in filesFromDir if file.startswith(device)]
with open("LogOutput.csv", "ab") as f:
    for file in fileList:
        # print("Processing", file)
        with open(file, "rb") as log_file:
            txt = log_file.read()
            f.write(txt)
            f.write(b"\n")  # bytes, because the output is opened in binary mode
Your question could be better stated. Based on your current code snippet, I'll assume that you want to:
1. Filter files in a directory based on a glob pattern.
2. Concatenate their contents into a file named output.csv.
In Python you can achieve (1.) by using glob to list the file names.
import glob
for filename in glob.glob('foo*bar'):
    print(filename)
That would print all files starting with foo and ending with bar in the current directory.
For (2.) you just read the file and write its content to your desired output, using Python's open() built-in function:
open('filename', 'r')
(Using 'r' as the mode you are asking Python to open the file for "reading"; using 'w' you are asking Python to open the file for "writing".)
The final code would look like the following:
import glob

device = 'ABC1-123'
with open('output.csv', 'w') as output:
    for filename in glob.glob(device+'*'):
        # named infile so the built-in input() is not shadowed
        with open(filename, 'r') as infile:
            output.write(infile.read())
You can use the os module to list the files.
import os
files = os.listdir(os.getcwd())
m_class = "ABC1"
m_id = "123"
device = m_class + "-" + m_id
file_extension = ".log"
# filter the files by their extension and the starting name
files = [x for x in files if x.startswith(device) and x.endswith(file_extension)]
# "a" appends, so existing content in output.csv is kept
with open("output.csv", "a") as f:
    for file in files:
        with open(file, "r") as data_file:
            f.write(data_file.read())
            f.write(",\n")

Python file-IO and zipfile. Trying to loop through all the files in a folder and then loop through the texts in respective file using Python

I am trying to extract all the zip files, giving each destination folder the same name as its archive.
Then I want to loop through all the files in the folder, and through the lines within those files, to write them to a different text file.
This is my code so far:
#!/usr/bin/env python3
import glob
import os
import zipfile
zip_files = glob.glob('*.zip')
for zip_filename in zip_files:
    dir_name = os.path.splitext(zip_filename)[0]
    os.mkdir(dir_name)
    zip_handler = zipfile.ZipFile(zip_filename, "r")
    zip_handler.extractall(dir_name)
path = dir_name
fOut = open("Output.txt", "w")
for filename in os.listdir(path):
    for line in filename.read().splitlines():
        print(line)
        fOut.write(line + "\n")
fOut.close()
This is the error that I encounter:
for line in filename.read().splitlines():
AttributeError: 'str' object has no attribute 'read'
You need to open the file, and also join the path to the file name. Using splitlines and then adding a newline back to each line is also a bit redundant:
path = dir_name
with open("Output.txt", "w") as fOut:
    for filename in os.listdir(path):
        # join filename to path to avoid file not being found
        with open(os.path.join(path, filename)) as f:
            for line in f:
                fOut.write(line)
You should always use with to open your files, as it will close them automatically. If the files are not large, you can simply call fOut.write(f.read()) and remove the inner loop.
You also set path = dir_name after the first loop, which means path will hold whatever the last value of dir_name was in that loop; this may or may not be what you want. You can also use iglob to avoid creating a full list: zip_files = glob.iglob('*.zip').
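If every archive should be processed, not just the last one, the reading step can be moved inside the extraction loop. A minimal sketch, assuming the archives contain only text files; extract_and_merge and its parameters are illustrative names:

```python
import os
import zipfile

def extract_and_merge(zip_paths, out_path):
    """Extract each archive into a folder named after it, then append
    the text of every extracted file to out_path."""
    with open(out_path, 'w') as f_out:
        for zip_filename in zip_paths:
            dir_name = os.path.splitext(zip_filename)[0]
            os.makedirs(dir_name, exist_ok=True)  # no error if it already exists
            with zipfile.ZipFile(zip_filename, 'r') as zip_handler:
                zip_handler.extractall(dir_name)
            # Read the folder just extracted, while dir_name still refers to it
            for root, _dirs, files in os.walk(dir_name):
                for filename in sorted(files):
                    with open(os.path.join(root, filename), 'r') as f:
                        f_out.write(f.read())

# Example call, mirroring the question:
# import glob
# extract_and_merge(sorted(glob.glob('*.zip')), 'Output.txt')
```

os.walk is used instead of os.listdir so that files in subfolders of an archive are read too.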
