The problem: I need to automatically create a file inside a folder that was itself created automatically just before.
Let me explain. First, here is the code I use to create the folder:
import os
from datetime import datetime
timestr = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
now = datetime.now()
newDirName = now.strftime("%Y%m%d-%H%M%S-%f")
folder = os.mkdir('C:\\Users\\User\\Desktop\\' + newDirName)
This code creates a folder on the Desktop named with a timestamp (microseconds included to make it as unique as possible).
Now I would also like to create a file (for example a .txt) inside that folder. I already have the code for that:
file = open('B' + timestr, 'w')
file.write('Exact time is: ' + timestr)
file.close()
How can I "combine" these two steps? First create the folder and then, almost immediately, the file inside it?
Thank you. If it's still not clear, feel free to ask.
Yes, just create a directory and then immediately a file inside it. All I/O operations in Python are synchronous by default, so you won't get any race conditions.
The resulting code (with some improvements to yours):
import os
from datetime import datetime
timestr = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
dir_path = os.path.join('C:\\Users\\User\\Desktop', timestr)
os.mkdir(dir_path)  # os.mkdir returns None, so there is nothing useful to assign
with open(os.path.join(dir_path, 'B' + timestr), 'w') as file:
    file.write('Exact time is: ' + timestr)
You can also make your code (almost) cross-platform by replacing the hard-coded Desktop directory path with
os.path.join(os.path.expanduser('~'), 'Desktop')
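Putting the pieces together, a minimal sketch could look like the following. The function name create_stamped_folder and its base_dir parameter are made up for illustration, and the temporary directory is only there so the example runs anywhere without touching the real Desktop.

```python
import os
import tempfile
from datetime import datetime


def create_stamped_folder(base_dir):
    """Create a timestamp-named folder in base_dir and a text file inside it.

    Returns the path of the file that was written.
    """
    timestr = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    dir_path = os.path.join(base_dir, timestr)
    os.mkdir(dir_path)  # raises FileExistsError if the folder already exists
    file_path = os.path.join(dir_path, "B" + timestr + ".txt")
    with open(file_path, "w") as f:
        f.write("Exact time is: " + timestr)
    return file_path


# Demo in a throwaway directory; pass the Desktop path instead for real use.
with tempfile.TemporaryDirectory() as tmp:
    created = create_stamped_folder(tmp)
    print(os.path.relpath(created, tmp))
```

For real use, something like create_stamped_folder(os.path.join(os.path.expanduser('~'), 'Desktop')) should do, assuming a Desktop folder exists in the home directory.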
I'm pretty new to Azure and have been having a problem while trying to export to a CSV. I want to rename the output file from the default part-0000-tid-12345 naming to something more recognisable. My problem is that when I export the file, it creates a subdirectory with the filename, and the file ends up inside that directory. Is there a way of getting rid of the directory that's created? The path looks like the write path below, but with an extra directory added: ...outbound/cs_notes_.csv/filename.csv
%python
import os, sys, datetime
readPath = "/mnt/publisheddatasmets1mig/metering/smets1mig/cs/system_data_build/notes/rg"
writePath = "/mnt/publisheddatasmets1mig/metering/smets1mig/cs/system_data_build/notes/outbound"
file_list = dbutils.fs.ls(readPath)
for i in file_list:
    file_path = i[0]
    file_name = i[1]
file_name
Current_Date = datetime.datetime.today().strftime ('%Y-%m-%d-%H-%M-%S')
fname = "CS_Notes_" + str(Current_Date) + ".csv"
for i in file_list:
    if i[1].startswith("part-00000"):
        dbutils.fs.cp(readPath+"/"+file_name,writePath+"/"+fname)
        dbutils.fs.rm(readPath+"/"+file_name)
Any help would be appreciated
Apache Spark does not let you set the output file name directly.
Spark writes through the Hadoop file output format, which requires the data to be partitioned; that is why you get part- files. You can easily rename the output file after processing, just like in the SO thread.
You may refer to a similar SO thread, which addressed a similar issue.
Hope this helps.
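The "rename after processing" step can be sketched with plain Python on a local filesystem. This is only an illustration of the idea, assuming the output directory is reachable as an ordinary path; on Databricks the same steps would be done with the dbutils.fs calls already used in the question. The function name promote_part_file and the sample file names are hypothetical.

```python
import os
import shutil
import tempfile


def promote_part_file(spark_out_dir, target_path):
    """Move the single part-* file out of spark_out_dir to target_path,
    then delete the now-unneeded output directory."""
    parts = [n for n in os.listdir(spark_out_dir) if n.startswith("part-")]
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found {len(parts)}")
    shutil.move(os.path.join(spark_out_dir, parts[0]), target_path)
    shutil.rmtree(spark_out_dir)  # also removes marker files such as _SUCCESS
    return target_path


# Demo with a fake Spark output directory.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "cs_notes.csv")  # Spark writes a directory here
    os.mkdir(out_dir)
    with open(os.path.join(out_dir, "part-00000-tid-12345.csv"), "w") as f:
        f.write("id,note\n1,hello\n")
    open(os.path.join(out_dir, "_SUCCESS"), "w").close()

    final = promote_part_file(out_dir, os.path.join(tmp, "CS_Notes.csv"))
    remaining = sorted(os.listdir(tmp))
    print(remaining)
```

The check for exactly one part file is a design choice for this sketch: with several partitions there is no single file to rename, so failing loudly seemed safer than picking one.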
I'm really new to Python and was attempting to download a CSV through FTP.
I've got the connection to go to the right folder, but I also want to print the table.
import pandas as pd
from ftplib import FTP
ftp = FTP('f20-preview.xxx.com')
ftp.login(user='xxx_test', passwd = 'xxxxxxx')
ftp.cwd('/testfolder/')
def grabFile():
    filename = 'MOCK_DATA.csv'
    localfile = open(filename, 'wb')
    ftp.retrbinary('RETR ' + filename, localfile.write, 1024)
data = pd.read_csv(filename)
data.head()
This causes a NameError: filename is not defined. I might be confusing myself, so clarification would help.
In your code you define a function, never call it, and afterwards expect to find a variable that was defined inside that function.
One way to fix things would be to eliminate the line with def completely.
A possibly better solution would be something like this
import pandas as pd
from ftplib import FTP
# reusable function to retrieve a file
def grabFile(ftp_obj, filename):
    # the with block closes the local file once the download has finished
    with open(filename, 'wb') as localfile:
        ftp_obj.retrbinary('RETR ' + filename, localfile.write, 1024)
# connect to the ftp server
ftp = FTP('f20-preview.xxx.com')
ftp.login(user='xxx_test', passwd = 'xxxxxxx')
ftp.cwd('/testfolder/')
# then get files and work them
# having a "target file"
filename = 'MOCK_DATA.csv'
# grab the file
grabFile(ftp, filename)
# work the file
data = pd.read_csv(filename)
data.head()
# now you could still use the same connected ftp object and grab another file, and so on
You did not call your grabFile function. But it appears the other answers helped with that issue, so I will merely share some quality-of-life code for working with data sets.
I often store my data files in a separate folder from the python code, so this can help you keep things straight and organized if you'd prefer to have the input data in another folder.
import os
import pandas as pd
original_dir = os.getcwd()
os.chdir('/home/user/RestOfPath/')
data = pd.read_csv('Filename')
os.chdir(original_dir)
data.head()
Could you possibly use the absolute/full path instead of just the name for the CSV file? My guess is that it's looking in the wrong folder.
The working directory of your Python script and the location where the CSV is stored need to be the same, given the function you provided.
However, you do not call the function.
If you call the function and still get an error, it is likely that MOCK_DATA.csv is not at the location /testfolder/MOCK_DATA.csv.
The simplest way to make filename accessible would be to delete the def grabFile(): line and un-indent its body.
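The scoping issue behind the NameError can be reproduced without any FTP server. The sketch below is only an illustration; the name grab_file is made up and no file is actually downloaded.

```python
def grab_file():
    # 'filename' is local: it exists only while grab_file() is running.
    filename = "MOCK_DATA.csv"
    return filename


# Defining the function runs nothing, so the name is still unknown out here.
try:
    filename  # deliberately looked up before it has been assigned
    lookup = "found"
except NameError:
    lookup = "NameError"
print(lookup)

# Calling the function and keeping its return value makes the name usable.
filename = grab_file()
print(filename)
```

Returning the value (rather than relying on a global) keeps the function reusable for other file names.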
I am trying to access a file from a Box folder as I am working on two different computers. So the file path is pretty much the same except for the username.
I am trying to load a numpy array from a .npy file and I could easily change the path each time, but it would be nice if I could make it universal.
Here is what the line of code looks like on my one computer:
y_pred_walking = np.load('C:/Users/Eric/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
And here is what the line of code looks like on the other computer:
y_pred_walking = np.load('C:/Users/erapp/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
The only difference is that the username on one computer is Eric and on the other is erapp. Is there a way to make the line work on any computer, given that every computer will have the Box folder?
You could either save the file to a path that doesn't depend on the user: e.g. 'C:/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'
Or you could do some string formatting. One way would be with an environment or configuration variable that indicates which is the relevant user, and then for your load statement:
import os
current_user = os.environ.get("USERNAME") # assuming you're running on the Windows box as the relevant user
# Now load the formatted string. f-strings are better, but this is more obvious since f-strings are still very new to Python
y_pred_walking = 'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'.format(user=current_user)
Yes, there is a way. At least for the problem as it stands, the solution is pretty simple: use f-strings.
user='Eric'
y_pred_walking =np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
or more general
def pred_walking(user):
    return np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
so on any machine you just do
y_pred_walking=pred_walking(user)
with user defined beforehand, to get the result.
Simply search the folders recursively for your file:
filename = 'y_pred_test.npy'
import os
import random
# creates 1000 directories with a 1% chance of having the file as well
for k in range(20):
    for i in range(10):
        for j in range(5):
            os.makedirs(f"./{k}/{i}/{j}")
            if random.randint(1,100) == 2:
                with open(f"./{k}/{i}/{j}/{filename}","w") as f:
                    f.write(" ")
# search the directories for your file
found_in = []
# this starts searching in your current folder - you can give it your c:\Users\ instead
for root,dirs,files in os.walk("./"):
    if filename in files:
        found_in.append(os.path.join(root,filename))
print(*found_in,sep = "\n")
File found in:
./17/3/1/y_pred_test.npy
./3/8/1/y_pred_test.npy
./16/3/4/y_pred_test.npy
./16/5/3/y_pred_test.npy
./14/2/3/y_pred_test.npy
./0/5/4/y_pred_test.npy
./11/9/0/y_pred_test.npy
./9/8/1/y_pred_test.npy
If you get read errors because of missing file/directory permissions you can start directly in the users folder:
# Source: https://stackoverflow.com/a/4028943/7505395
from pathlib import Path
home = str(Path.home())
found_in = []
for root,dirs,files in os.walk(home):
    if filename in files:
        found_in.append(os.path.join(root,filename))
# use found_in[0] or break as soon as you find first file
You can use the expanduser function in the os.path module to modify a path to start from the home directory of a user
https://docs.python.org/3/library/os.path.html#os.path.expanduser
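As a rough sketch of that idea applied to the path from the question (assuming the Box folder sits directly in each user's home directory, which is how both example paths look):

```python
import os
from pathlib import Path

# Location of the file relative to the user's home directory (from the question).
relative = ("Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/"
            "bidir_lstm_50_50/predictions/y_pred_test.npy")

# "~" is replaced by the current user's home directory.
npy_path = os.path.expanduser(os.path.join("~", relative))

# pathlib offers the same thing via Path.home().
npy_path_alt = Path.home() / relative

print(npy_path)
# y_pred_walking = np.load(npy_path)  # needs numpy and the file to exist
```

Neither variant hard-codes Eric or erapp, so the same line should work on both machines as long as the script runs under the matching user account.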
I have a question. I'm reading a FITS file and then using information from its header to determine the other files that are related to the original FITS file. But for some of the FITS files, the other files (blaze_file, bis_file, ccf_table) are not available. Because of that, my code fails with the unsurprising error "No such file or directory".
import pandas as pd
import sys, os
import numpy as np
from glob import glob
from astropy.io import fits
PATH = os.path.join("home", "Desktop", "2d_spectra")
for filename in os.listdir(PATH):
    if filename.endswith("_e2ds_A.fits"):
        e2ds_hdu = fits.open(filename)
        e2ds_header = e2ds_hdu[0].header
        date = e2ds_header['DATE-OBS']
        date2 = date = date[0:19]
        blaze_file = e2ds_header['HIERARCH ESO DRS BLAZE FILE']
        bis_file = glob('HARPS.' + date2 + '*_bis_G2_A.fits')
        ccf_table = glob('HARPS.' + date2 + '*_ccf_G2_A.tbl')
        if not all(file in os.listdir(PATH) for file in [blaze_file,bis_file,ccf_table]):
            continue
What I want is for my code to run only if all the files are available, and to skip the file otherwise. The problem is that I define the other files as variables inside the for loop, because I use the header information. So how can I define them before the for loop and then use something like
Can anyone help me out with this?
The filenames returned by os.listdir() are always relative to the path given there.
In order to be used, they have to be joined with this path.
Example:
PATH = os.path.join("home", "Desktop", "2d_spectra")
for filename in os.listdir(PATH):
    if filename.endswith("_e2ds_A.fits"):
        filepath = os.path.join(PATH, filename)
        e2ds_hdu = fits.open(filepath)
        …
Let the filenames be ['a', 'b', 'a_e2ds_A.fits', 'b_e2ds_A.fits']. The code now skips the first two names and prepends the directory path to the remaining two.
a_e2ds_A.fits becomes home/Desktop/2d_spectra/a_e2ds_A.fits and
b_e2ds_A.fits becomes home/Desktop/2d_spectra/b_e2ds_A.fits.
Now they can be opened from the same working directory in which os.listdir(PATH) succeeded, not just from inside the data directory.
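A small self-contained sketch of this behaviour, using a temporary directory and placeholder file names instead of real FITS data:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as data_dir:
    # Create two empty sample files; the names are placeholders.
    for name in ("a.txt", "a_e2ds_A.fits"):
        open(os.path.join(data_dir, name), "w").close()

    names = sorted(os.listdir(data_dir))                 # bare names only
    paths = [os.path.join(data_dir, n) for n in names]   # names joined with the directory

    # The joined paths can be opened regardless of the current working directory.
    all_found = all(os.path.isfile(p) for p in paths)
    print(names)
    print(all_found)
```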
I should become accustomed to reading a question in full before trying to answer it.
The problem I mentioned only matters if you start the script from a path outside the said directory. Nevertheless, applying the fix will make your code much more consistent.
Your real problem, however, lies somewhere else: you examine a file and then, after checking its contents, want to read files whose names depend on information from that first file.
There are several ways to accomplish your goal:
1. Just extend your loop with the proper tests.
Pseudo code:
for file in files:
    if file.endswith("fits"):
        open file
        read date from header
        create file names depending on date
        if all files exist:
            proceed
or
for file in files:
    if file.endswith("fits"):
        open file
        read date from header
        create file names depending on date
        if not all files exist:
            continue # actual keyword, no pseudo code!
        proceed
2. Put some functionality into functions (variation of 1.)
3. Create a loop in a generator function which yields the "interesting information" of one fits file (or alternatively nothing) and have another loop run over them to actually work with the data.
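The generator variant could be sketched roughly as follows. To keep it runnable without astropy, this sketch assumes the companion files share the main file's prefix; in the real script the names would come from the FITS header instead. The function name complete_sets and the sample file names are invented for the example.

```python
import os
import tempfile


def complete_sets(directory, suffix, companion_suffixes):
    """Yield (main_path, companion_paths) only when every companion exists.

    Assumption for this sketch: companions share the main file's prefix.
    """
    for name in sorted(os.listdir(directory)):
        if not name.endswith(suffix):
            continue
        prefix = name[: -len(suffix)]
        companions = [os.path.join(directory, prefix + s) for s in companion_suffixes]
        if not all(os.path.isfile(p) for p in companions):
            continue  # incomplete set: skip this main file
        yield os.path.join(directory, name), companions


with tempfile.TemporaryDirectory() as d:
    # "x" has both companions, "y" is missing its .tbl file.
    for n in ("x_e2ds_A.fits", "x_bis_G2_A.fits", "x_ccf_G2_A.tbl",
              "y_e2ds_A.fits", "y_bis_G2_A.fits"):
        open(os.path.join(d, n), "w").close()

    found = [os.path.basename(main) for main, _ in
             complete_sets(d, "_e2ds_A.fits", ["_bis_G2_A.fits", "_ccf_G2_A.tbl"])]
    print(found)
```

The loop that consumes the generator then only ever sees complete sets, so it needs no existence checks of its own.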
If I am still missing some points or am not detailed enough, please let me know.
Since you have to read the FITS file to know the names of the dependent files, there's no way you can avoid reading the FITS file first. The only thing you can do is test for the dependent files' existence before trying to read them and skip the rest of the loop (using continue) if they are missing.
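One detail worth keeping in mind for that existence test: glob() returns a list of matches (possibly empty), not a single file name, so the result cannot be compared directly against the entries of os.listdir(). A minimal sketch of a helper that turns the glob result into "a path or None" follows; the name first_match, the sample date and the file name are made up for illustration.

```python
import glob
import os
import tempfile


def first_match(directory, pattern):
    """Return the first path in directory matching pattern, or None."""
    # glob.escape protects special characters in the directory part only.
    matches = sorted(glob.glob(os.path.join(glob.escape(directory), pattern)))
    return matches[0] if matches else None


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "HARPS.2010-01-01T00-00-00.000_bis_G2_A.fits"), "w").close()
    date2 = "2010-01-01T00-00-00"  # made-up value standing in for the header date

    bis_file = first_match(d, "HARPS." + date2 + "*_bis_G2_A.fits")
    ccf_table = first_match(d, "HARPS." + date2 + "*_ccf_G2_A.tbl")

    # Inside the real loop this would be: if not ready: continue
    ready = all(p is not None for p in (bis_file, ccf_table))
    print(ready)
```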
Edit this line
e2ds_hdu = fits.open(filename)
And replace with
e2ds_hdu = fits.open(os.path.join(PATH, filename))
Could someone shed some light on file paths in Python?
For example, my code needs to read a batch of files whose names are listed in a .txt file, C:\filelist.txt, with this content:
C:\1stfile.txt
C:\2ndfile.txt
C:\3rdfile.txt
C:\4thfile.txt
C:\5thfile.txt
And the code starts with:
list_open = open('c:\\aaa.txt')
read_list = list_open.read()
line_in_list = read_list.split('\n')
All runs fine. But if I want to read files in another path, such as:
C:\WorkingFolder\6thfile.txt
C:\WorkingFolder\7thfile.txt
C:\WorkingFolder\8thfile.txt
C:\WorkingFolder\9thfile.txt
C:\WorkingFolder\10thfile.txt
It doesn’t work. I guess the path C:\WorkingFolder\ is not written properly, so Python cannot recognize it.
So how should I write it? Thanks.
Hello all,
sorry, maybe I didn't make myself clear.
The problem is that a text file, c:\aaa.txt, contains the following:
C:\1stfile.txt
C:\WorkingFolder\1stfile.txt
Why is only C:\1stfile.txt readable, but not the other one?
The reason your program isn't working is that you're not changing the directory properly. Use os.chdir() to do so, then open the files as normal:
import os
path = "C:\\WorkingFolder\\"
# Check current working directory.
retval = os.getcwd()
print("Current working directory %s" % retval)
# Now change the directory
os.chdir( path )
# Check current working directory.
retval = os.getcwd()
print("Directory changed successfully %s" % retval)
REFERENCES:
http://www.tutorialspoint.com/python/os_chdir.htm
import os
BASEDIR = "c:\\WorkingFolder"
list_open = open(os.path.join(BASEDIR, 'aaa.txt'))
Simply try using forward slashes instead.
list_open = open("C:/WorkingFolder/6thfile.txt", "rt")
It works for me.
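For paths written directly in Python source code, the different spellings can be compared with pathlib. PureWindowsPath only manipulates the text of a path, so this sketch runs on any operating system and touches no files. Note that escape sequences only matter in string literals; lines read from a text file such as the list file above are not processed this way.

```python
from pathlib import PureWindowsPath

# Three spellings of the same Windows path.
escaped = "C:\\WorkingFolder\\6thfile.txt"   # backslashes escaped
raw = r"C:\WorkingFolder\6thfile.txt"        # raw string, no escaping needed
forward = "C:/WorkingFolder/6thfile.txt"     # forward slashes

same = (PureWindowsPath(escaped) == PureWindowsPath(raw) == PureWindowsPath(forward))
print(same)

# An unescaped backslash in a normal string literal can turn into a control character:
risky = "C:\temp.txt"   # "\t" is a TAB here, not a backslash followed by "t"
has_tab = "\t" in risky
print(has_tab)
```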