Creating new CSV files from inputs in different directory - python

I'm simply trying to alter a CSV file with Python.
When all the files were in the same dir, everything was fine.
Now that the input files are in a different dir than where the output files will be, everything blows up, apparently b/c the files do not exist?
I first found this:
open() in Python does not create a file if it doesn't exist
Then I learned to change to the directory, which helped me loop over the CSVs in the target dir:
Moving up one directory in Python
When I run the command:
python KWRG.py ../Weekly\ Reports\ -\ Inbound/Search\ Activity/ 8/9/2021
I will get:
Traceback (most recent call last):
  File "KWRG.py", line 15, in <module>
    with open(args.input, 'r') as in_file, open(args.output, 'w') as out_file:
IsADirectoryError: [Errno 21] Is a directory: '../Weekly Reports - Inbound/Search Activity/'
Sorry if I'm missing the obvious here, but why is the file not being created in the directory that I'm pointing the script to (or at all, for that matter)?
The code:
import csv
import argparse
import os
# Create a parser to take arguments
#...snip...
cur_dir = os.getcwd()
reports_dir = os.chdir(os.path.join(cur_dir, args.dir))
for csv_file in os.listdir(reports_dir):
    # Shorthand the name of the file
    #...snip...
    # Open the in and out files
    with open(csv_file, 'r') as in_file, open(f'{out_name}-Search-Activity-{args.date}.csv', 'w+') as out_file:
        # Re-arrange CSV
# EOF

Your problem is with this line:
reports_dir = os.chdir(os.path.join(cur_dir, args.dir))
os.chdir() doesn't return anything (that is, it returns None); it just performs the operation requested: changing the current working directory. From an interactive session with the REPL:
>>> import os
>>> result = os.chdir("/Users/mattdmo/")
>>> result
>>>
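For your purposes, all you need is
reports_dir = os.path.join(cur_dir, args.dir)
and os.listdir(reports_dir) will work. Keep in mind that os.listdir() returns bare file names, so if you no longer change into that directory, open each input with os.path.join(reports_dir, csv_file).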
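A minimal sketch of the whole loop with the input and output directories kept separate, so no os.chdir() is needed. The function name convert_reports, the out_dir parameter, and the row-for-row copy are assumptions for illustration, since the original re-arranging logic was snipped. The date is also sanitized, because a value like 8/9/2021 contains path separators.

```python
import csv
import os


def convert_reports(in_dir, out_dir, date):
    """Copy every CSV in in_dir to a renamed CSV in out_dir (hypothetical helper)."""
    # Slashes in the date would be read as directory separators in a file name.
    safe_date = date.replace('/', '-')
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(in_dir)):
        if not name.lower().endswith('.csv'):
            continue
        # os.listdir() returns bare names, so join them with the directory.
        in_path = os.path.join(in_dir, name)
        out_name = os.path.splitext(name)[0]
        out_path = os.path.join(out_dir, f'{out_name}-Search-Activity-{safe_date}.csv')
        with open(in_path, newline='') as in_file, \
                open(out_path, 'w', newline='') as out_file:
            writer = csv.writer(out_file)
            for row in csv.reader(in_file):
                writer.writerow(row)  # placeholder for the real re-arranging step
        written.append(out_path)
    return written
```

Because both paths are built explicitly, the script behaves the same no matter which directory it is started from.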

Related

How to get the path of the file calling a function of an imported file from within the imported file

I'm trying to make parsing CSVs a little easier on me later down the road so I've created a small file to allow me to run parse_csv.toList('data.csv') and return a list to my script. Here is what the parse_csv.py imported file looks like:
parse_csv.py
import csv
def toList(file_location_name):
    result_list = []
    with open(file_location_name) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        for row in csv_reader:
            result_list.append(row)
    return result_list
This is how I'm calling it in my scripts that are trying to utilize that file:
import-test.py
import parse_csv
print(
    parse_csv.toList('../data.csv')
)
I'm getting the following error when I run import-test.py:
Error
Traceback (most recent call last):
  File "{system path placeholder}\directory-test\import-test.py", line 5, in <module>
    parse_csv.toList('../data.csv')
  File "{system path placeholder}\parse_csv.py", line 6, in toList
    with open(file_location_name) as csv_file:
FileNotFoundError: [Errno 2] No such file or directory: '../data.csv'
My current project directory structure looks like this
Project
|
|--parse_csv.py
|--data.csv
|--directory-test
   |
   |--import-test.py
My first thought is that when I call open, '../data.csv' is being relatively referenced according to the parse_csv.py file instead of the intended import-test.py file.
I just want to make it so parse_csv.py can be imported anywhere and it will respect relative file paths in the calling file.
Please let me know if I need to be more clear. I know my wording may be confusing here.
Edit for clarity: The goal is to only call parse_csv.toList() and have it accept a string of a relative path to the file that called it.
You can have your parse_csv.toList function accept a file object instead of a file path. This way you open a file, give that to the module and it will work. Something like:
import parse_csv
with open('../data.csv') as csvFile:
    print(parse_csv.toList(csvFile))
Or you can convert the relative path to an absolute path before calling toList; see How to get an absolute file path in Python. It only adds a few extra lines.
In import-test.py,
import os.path
import parse_csv
# to retrieve import-test.py's own absolute path
abs_path = os.path.abspath(__file__)
# its dir_path
dir_path = os.path.dirname(abs_path)
# data.csv's path
csv_path = os.path.join(dir_path, '..', 'data.csv')
# use the path
print(
    parse_csv.toList(csv_path)
)
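The first suggestion above (passing an open file instead of a path) could be sketched as follows. The name to_list and the check for a read attribute are assumptions for illustration, not part of the original parse_csv.py; the idea is only that the module never has to guess which directory a relative path refers to.

```python
import csv
import io


def to_list(source):
    """Return all CSV rows from a path or an already-open text file (hypothetical variant of toList)."""
    if hasattr(source, 'read'):
        # Already a file object: the caller decided where the file lives.
        return list(csv.reader(source))
    # Otherwise treat it as a path, resolved against the current working directory.
    with open(source, newline='') as csv_file:
        return list(csv.reader(csv_file))


rows = to_list(io.StringIO('a,b\n1,2\n'))
print(rows)  # [['a', 'b'], ['1', '2']]
```

A plain path still works, but it is then resolved against the working directory of the process, not against the location of parse_csv.py or of the calling file.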

Proper way of reading in files from a directory using Python 2.6 in bash shell

I am trying to read in files for text processing.
The idea is to run them through Hadoop pseudo distributed file system on my virtual machine using map-reduce code that I am writing. The interface is Ubuntu Linux, I am running Python 2.6 with the installation. I need to use sys.stdin for reading in the files, and sys.stdout so I pass from mapper to reducer.
Here is my test code for the mapper:
#!/usr/bin/env python
import sys
import string
import glob
import os
files = glob.glob(sys.stdin)
for file in files:
    with open(file) as infile:
        txt = infile.read()
        txt = txt.split()
        print(txt)
I'm not sure how glob works with sys.stdin and I get the following errors:
After testing with piping:
[training#localhost data]$ cat test | ./mapper.py
I get this:
cat: test: Is a directory
Traceback (most recent call last):
  File "./mapper.py", line 8, in <module>
    files = glob.glob(sys.stdin)
  File "/usr/lib64/python2.6/glob.py", line 16, in glob
    return list(iglob(pathname))
  File "/usr/lib64/python2.6/glob.py", line 24, in iglob
    if not has_magic(pathname):
  File "/usr/lib64/python2.6/glob.py", line 78, in has_magic
    return magic_check.search(s) is not None
TypeError: expected string or buffer
For the moment, I am just trying to read in three small .txt files in one directory.
Thanks!
I still don't fully understand what output you expect (a list or plain text), but the following should work:
#!/usr/bin/env python
import sys, glob
dir = sys.stdin.read().rstrip('\r\n')
files = glob.glob(dir + '/*')
for file in files:
    with open(file) as infile:
        txt = infile.read()
        txt = txt.split()
        print(txt)
Then execute with:
echo "test" | ./mapper.py
My recommendation is to feed the directory name via the command line argument, not via the stdin as above.
If you want to tweak the format of the output, please let me know.
Hope this helps.
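The recommendation to pass the directory as a command-line argument could look roughly like this. It is a sketch written for Python 3 rather than the Python 2.6 of the question, and the names read_words and main, as well as the restriction to *.txt files, are assumptions for illustration.

```python
import glob
import os
import sys


def read_words(directory):
    """Return the whitespace-separated words of every .txt file in directory."""
    words = []
    for path in sorted(glob.glob(os.path.join(directory, '*.txt'))):
        with open(path) as infile:
            words.extend(infile.read().split())
    return words


def main(argv):
    # argv is expected to hold exactly one item: the directory to read.
    if len(argv) != 1:
        sys.stderr.write('usage: mapper.py DIRECTORY\n')
        return 2
    for word in read_words(argv[0]):
        print(word)
    return 0
```

Ending the script with sys.exit(main(sys.argv[1:])) would let it be run as ./mapper.py test, with no piping involved.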
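files = os.listdir(path)
Use this to list every entry in the directory, then loop over the result. Note that the returned names are relative to path, so join them with path before opening.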

FileNotFoundError but file exists

I am creating a Python application that imports many JSON files. The files are in the same folder as the Python script. Before I moved the entire folder someplace else, the files imported perfectly. Since the script creates a file if none exists, it keeps creating the file in the home directory while ignoring the one in the same folder as the script. When I specify an absolute path (code below):
startT= time()
with open('~/Documents/CincoMinutos-master/settings.json', 'a+') as f:
    f.seek(0,0) # places pointer at start of file
    corrupted = False
    try:
        # turns all json info into vars with load
        self.s_settings = json.load(f)
        self.s_allVerbs = []
        # --- OFFLINE MODE INIT ---
        if self.s_settings['Offline Mode']: # conjugation file reading only happens if setting is on
            with open('~/Documents/CincoMinutos-master/verbconjugations.json', 'r+', encoding='utf-8') as f2:
                self.s_allVerbs = [json.loads(line) for line in f2]
        # --- END OFFLINE MODE INIT ---
        for key in self.s_settings:
            if not isinstance(self.s_settings[key], type(self.s_defaultSettings[key])): corrupted = True
    except Exception as e: # if any unexpected error occurs
        corrupted = True
        print('File is corrupted!\n',e)
    if corrupted or not len(self.s_settings):
        f.truncate(0) # if there are any errors, reset & recreate the file
        json.dump(self.s_defaultSettings, f, indent=2, ensure_ascii=False)
        self.s_settings = {key: self.s_defaultSettings[key] for key in self.s_defaultSettings}
# --- END FILE & SETTINGS VAR INIT ---
print("Finished loading file in {:4f} seconds".format(time()-startT))
It raises a FileNotFoundError:
Traceback (most recent call last):
  File "/Users/23markusz/Documents/CincoMinutos-master/__main__.py", line 709, in <module>
    frame = CincoMinutos(root)
  File "/Users/23markusz/Documents/CincoMinutos-master/__main__.py", line 42, in __init__
    with open('~/Documents/CincoMinutos-master/settings.json', 'a+') as f:
FileNotFoundError: [Errno 2] No such file or directory: '~/Documents/CincoMinutos-master/settings.json'
Keep in mind that I am perfectly able to access it with the same absolute path when I operate from terminal. Can somebody please explain what I need to do in order for the files to import correctly?
Also, I am creating this application for multiple users. While /Users/23markusz/Documents/CincoMinutos-master/verbconjugations.json does work, it will not on another user's system. This file is also in the SAME FOLDER as the script so it should import correctly.
UPDATE:
While my issue is solved using os.path.expanduser(), I still do not understand why python refuses to open a file that is within the same folder as the python script. It should automatically open the file with just the filename and not the absolute path.
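"~" isn't a real directory (and a path that starts with it does not qualify as an "absolute path"), and that's why the open doesn't work. The shell expands ~ to your home directory before a command ever sees it, which is why the same path works from the terminal; Python's open() passes the string to the operating system unchanged.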
In order to expand the tilde to an actual directory (e.g. /Users/23markusz), you can use os.path.expanduser:
import os
...
with open(os.path.expanduser('~/Documents/CincoMinutos-master/settings.json'), 'a+') as f:
    # Do stuff
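A short sketch of what the expansion does, using the path from the question. The variable names are assumptions for illustration; os.path.expanduser and pathlib's Path.expanduser are the standard-library calls.

```python
import os
from pathlib import Path

raw = '~/Documents/CincoMinutos-master/settings.json'

# open(raw) would look for a directory literally named "~" under the
# current working directory, which normally does not exist.
expanded = os.path.expanduser(raw)

# pathlib offers the same expansion as a method.
expanded_path = Path(raw).expanduser()

print(expanded)
```

As for the update in the question: a bare file name passed to open() is resolved against the current working directory of the process, not against the folder that contains the script, so a file next to the script is only found when the script happens to be started from that folder.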

CSV file creation error in python

I am getting an error while writing content to a CSV file in Python:
import sys
reload(sys)
sys.setdefaultencoding('utf8')
import csv
a = [['1/1/2013', '1/7/2013'], ['1/8/2013', '1/14/2013'], ['1/15/2013', '1/21/2013'], ['1/22/2013', '1/28/2013'], ['1/29/2013', '1/31/2013']]
f3 = open('test_'+str(a[0][0])+'_.csv', 'at')
writer = csv.writer(f3,delimiter = ',', lineterminator='\n',quoting=csv.QUOTE_ALL)
writer.writerow(a)
Error
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    f3 = open('test_'+str(a[0][0])+'_.csv', 'at')
IOError: [Errno 2] No such file or directory: 'test_1/1/2013_.csv'
What is the error, and how do I fix it?
The error message already tells you what is wrong:
the file test_1/1/2013_.csv doesn't exist.
The file name you build uses a[0][0], which in this case is 1/1/2013.
The two '/' characters are treated as path separators, so Python looks for a file named 2013_.csv inside the directory test_1/1 instead of in the current directory.
Check whether you want the file in the current directory or in a ./test_1/1 directory.
It's probably due to the directory not existing - Python will create the file for you if it doesn't exist already, but it won't automatically create directories.
To ensure the path to a file exists you can combine os.makedirs and os.path.dirname.
import os
file_name = 'test_'+str(a[0][0])+'_.csv'
# Get the directory the file resides in
directory = os.path.dirname(file_name)
# Create the directories if they do not exist yet
if not os.path.isdir(directory):
    os.makedirs(directory)
# Open the file
f3 = open(file_name, 'at')
If the directories aren't desired you should replace the slashes in the dates with something else, perhaps a dash (-) instead.
file_name = 'test_' + str(a[0][0]).replace('/', '-') + '_.csv'
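Putting the second option together, a sketch in Python 3 (the question itself uses Python 2) might look like the following. The helper name write_ranges and its directory parameter are assumptions for illustration. It also uses writerows so that each date range becomes its own row, whereas writerow(a) in the question would write the whole nested list as a single row.

```python
import csv
import os


def write_ranges(ranges, directory='.'):
    """Write one CSV row per date range and return the path written (hypothetical helper)."""
    # Replace '/' so the first date can be used inside a file name.
    start = ranges[0][0].replace('/', '-')
    path = os.path.join(directory, 'test_' + start + '_.csv')
    with open(path, 'a', newline='') as f3:
        writer = csv.writer(f3, delimiter=',', lineterminator='\n',
                            quoting=csv.QUOTE_ALL)
        writer.writerows(ranges)  # one row per inner list
    return path
```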

Relative paths break when executing Python script from Windows batch?

My Python script works perfectly if I execute it directly from the directory it's located in. However if I back out of that directory and try to execute it from somewhere else (without changing any code or file locations), all the relative paths break and I get a FileNotFoundError.
The script is located at ./scripts/bin/my_script.py. There is a directory called ./scripts/bin/data/. Like I said, it works absolutely perfectly as long as I execute it from the same directory... so I'm very confused.
Successful Execution (in ./scripts/bin/): python my_script.py
Failed Execution (in ./scripts/): Both python bin/my_script.py and python ./bin/my_script.py
Failure Message:
Traceback (most recent call last):
  File "./bin/my_script.py", line 87, in <module>
    run()
  File "./bin/my_script.py", line 61, in run
    load_data()
  File "C:\Users\XXXX\Desktop\scripts\bin\tables.py", line 12, in load_data
    DATA = read_file("data/my_data.txt")
  File "C:\Users\XXXX\Desktop\scripts\bin\fileutil.py", line 5, in read_file
    with open(filename, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'data/my_data.txt'
Relevant Python Code:
def read_file(filename):
    with open(filename, "r") as file:
        lines = [line.strip() for line in file]
        return [line for line in lines if len(line) == 0 or line[0] != "#"]
def load_data():
    global DATA
    DATA = read_file("data/my_data.txt")
Yes, that is logical. The files are relative to your working directory. You change that by running the script from a different directory.
What you could do is take the directory of the script you are running at run time and build from that.
import os
def read_file(filename):
    # get the directory of this source file; os.path.realpath(__file__) is its full path
    path, fl = os.path.split(os.path.realpath(__file__))
    # use path to create the fully qualified path to your data
    full_path = os.path.join(path, filename)
    with open(full_path, "r") as file:
        #etc
Your resource files are located relative to your script. This is OK, but you need to use
os.path.dirname(os.path.realpath(__file__))
or
os.path.dirname(sys.argv[0])
to obtain the directory where the script is located. Then use os.path.join() or a similar function to build the paths to the resource files.
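The same idea can be sketched with pathlib. The helper resource_path and its anchor_file parameter are assumptions for illustration; in a real module the anchor would normally be __file__. The filter in read_file keeps the behavior of the original (blank lines are kept, lines starting with # are dropped).

```python
from pathlib import Path


def resource_path(relative, anchor_file):
    """Resolve relative against the directory containing anchor_file (hypothetical helper)."""
    return Path(anchor_file).resolve().parent / relative


def read_file(path):
    """Return stripped lines, skipping those that start with '#'."""
    with open(path, 'r') as file:
        lines = [line.strip() for line in file]
    return [line for line in lines if not line.startswith('#')]
```

Inside a module that sits next to the data directory, the call would be read_file(resource_path('data/my_data.txt', __file__)), which gives the same result regardless of the directory the script is launched from.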
