Edit CSV filename in Python to append to the current filename

I'm trying to change the name of my CSV file with Python. When I ask for the filename, it gives me a path, for instance
C:/user/desktop/somefolder/[someword].csv
What I want to do is change this file name to something like
C:/user/desktop/somefolder/[someword] [somenumber].csv
I don't know the word or the number in advance: the word was generated by other code I don't have access to, and the number is generated by the Python code I have. I just want to change the file name so that it includes [someword] and [somenumber] before the .csv
I have the os library for Python available in case that's a good library to use for this.

Here is the solution (no extra libs needed):
import os
somenumber = 1 # use number generated in your code
fpath = "C:/user/desktop/somefolder"
for full_fname in os.listdir(fpath):
    # `someword` is a file name without an extension in that context
    someword, fext = os.path.splitext(full_fname)
    old_fpath = os.path.join(fpath, full_fname)
    new_fpath = os.path.join(fpath, f"{someword} {somenumber}{fext}")
    os.rename(old_fpath, new_fpath)
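The loop above renames every file in the folder. If only one known CSV file should be renamed, a pathlib variant might look like the following minimal sketch. The helper name `append_number` is hypothetical, and the sketch assumes that no file with the new name exists yet (on Windows the rename would fail in that case, on POSIX it would silently replace the file).

```python
from pathlib import Path


def append_number(csv_path, number):
    """Rename '<someword>.csv' to '<someword> <number>.csv' and return the new path."""
    path = Path(csv_path)
    # stem is the file name without its final extension; suffix is e.g. '.csv'
    new_path = path.with_name(f"{path.stem} {number}{path.suffix}")
    path.rename(new_path)
    return new_path
```

For example, `append_number("C:/user/desktop/somefolder/report.csv", 1)` would rename the file to `report 1.csv` in the same folder.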

Related

Using a value as a name for csv file

I'm working with a lot of different files and after I am done editing I want to save it as a new csv file.
print(files[0])
mhofmanmusselsT1_1L.raw
# I want the csv file for this dataset to be named waves_T1_1L.csv,
# but if I select files[3] it would be waves_T3_2S.csv
t = files[0]
testtype = t[14:19]
name = "waves_" + testtype
Using the to_csv code, it uses the df name as the file name. I am quite new to Python, so it might be something obvious, but is there a way to use
print(name)
waves_T1_1L
name = pd.DataFrame(df)
# where name would behave like the value shown by print(name),
# so it would automatically update if a different files[n] were used.
# Unfortunately, it won't allow me to do that.
UPDATE
I have figured it out; it took more steps than I expected.
Name = files[0]
testtype = Name[14:19]
filename = "waves_"+ testtype
wavedata.to_csv('out.csv')
old_name = r"workdirectory/out.csv"
new_name = r"workdirectory/" + filename + ".csv"  # note the separator after the directory
os.rename(old_name, new_name)
The output file is renamed from out.csv to waves_T1_1L.csv, and the name updates if a different file is selected as input.

The first argument of pandas.DataFrame.to_csv is the path to write to, so you can pass the file name you want directly.
df.to_csv('NewName.csv')
If you want to save it to a folder that does not exist yet, create the folder first, for example with os.makedirs from the os library
import os
os.makedirs('folder/subfolder', exist_ok=True)
df.to_csv('folder/subfolder/out.csv')
Relevant Documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
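Applied to the question, the file name can be computed first and then passed straight to to_csv, with no rename step. The following is a minimal sketch: the helper name `csv_name_for` is hypothetical, and it keeps the fixed `[14:19]` slice from the question, so it assumes every input name has the test type at that position.

```python
from pathlib import Path


def csv_name_for(raw_name, prefix="waves_"):
    """Build e.g. 'waves_T1_1L.csv' from 'mhofmanmusselsT1_1L.raw'."""
    stem = Path(raw_name).stem   # drop the '.raw' extension
    testtype = stem[14:19]       # same fixed slice as in the question
    return f"{prefix}{testtype}.csv"


# Usage with pandas (names taken from the question):
# wavedata.to_csv(csv_name_for(files[0]))
```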

Bulk convert files extension using Python

I'm trying to develop a bulk webp-to-png converter using Python.
I'm using the webptools library (https://pypi.org/project/webptools/)
The documentation above only shows how to convert one file at a time and requires the user to supply the file name.
What I am trying to do is scan the folder for *.webp files and convert each one to *.png with the original filename. I couldn't get the output file names right. I suppose that with the current code it keeps overwriting the same file x.png, so I end up with just one output file. I can't figure out how to fix this.
I am new to Python and hope to get some guidance or help here. Thank you very much.
from webptools import dwebp
import os, glob
os.chdir("./images") # working directory
webp_list = []
for file in glob.glob("*.webp"):
    webp_list = file
print([webp_list])
for files in webp_list:
    print(dwebp(input_image=webp_list, output_image="x.png", option="-o", logging="-v"))
# documentation - code allows only 1 input and 1 output
# print(dwebp(input_image="sample.webp", output_image="sample.png", option="-o", logging="-v"))

After you do
webp_list = []
for file in glob.glob("*.webp"):
    webp_list = file
print([webp_list])
webp_list is the name of the last matching file rather than a list of file names. glob.glob itself will
Return a possibly-empty list of path names that match pathname(...)
so there is no need for such a contortion and you can simply do
webp_list = glob.glob("*.webp")
instead. You then need a different output filename for each input, for which I propose the following solution:
for filename in webp_list:
    outname = filename[:-4] + "png"
    dwebp(input_image=filename, output_image=outname, option="-o", logging="-v")
filename[:-4] is the filename without its last 4 characters (webp in this case), which is then concatenated with png.
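The slice works because "webp" happens to be four characters long. A variant that does not depend on the extension length could use os.path.splitext, as in this minimal sketch (the helper name `png_name` is hypothetical):

```python
import os


def png_name(webp_name):
    """Return the same path with its final extension replaced by '.png'."""
    root, _ext = os.path.splitext(webp_name)
    return root + ".png"
```

Inside the loop, `outname = png_name(filename)` would then take the place of the slicing expression.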

I've never used this library before, so my suggestion is based just on how I guess it should work:
from webptools import dwebp
import os, glob
os.chdir("./images") # working directory
webp_list = []
for file in glob.glob("*.webp"):
    output_file = file[:-4] + 'png'
    dwebp(input_image=file, output_image=output_file, option="-o", logging="-v")

How do I access a similar path to a file that only has a minor difference between computers?

I am trying to access a file from a Box folder as I am working on two different computers. So the file path is pretty much the same except for the username.
I am trying to load a numpy array from a .npy file and I could easily change the path each time, but it would be nice if I could make it universal.
Here is what the line of code looks like on my one computer:
y_pred_walking = np.load('C:/Users/Eric/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
And here is what the line of code looks like on the other computer:
y_pred_walking = 'C:/Users/erapp/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'
The only difference is that the username on one computer is Eric and the other is erapp, but is there a way where I can make the line universal to all computers where all computers will have the Box folder?

You could either save the file to a path that doesn't depend on the user: e.g. 'C:/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'
Or you could do some string formatting. One way would be with an environment or configuration variable that indicates which is the relevant user, and then for your load statement:
import os
current_user = os.environ.get("USERNAME") # assuming you're running on the Windows box as the relevant user
# Now build the formatted path. An f-string (Python 3.6+) would also work; str.format is used here to make the substitution explicit
y_pred_walking = 'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'.format(user=current_user)

Yes, there is a way. For the problem as stated, the solution is pretty simple: use f-strings
user = 'Eric'
y_pred_walking = np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
or, more generally,
def pred_walking(user):
    return np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
so on any machine you just do
y_pred_walking = pred_walking(user)
with user defined beforehand, to get the result.

Simply search the folders recursively for your file:
filename = 'y_pred_test.npy'
import os
import random
# creates 1000 directories with a 1% chance of having the file as well
for k in range(20):
    for i in range(10):
        for j in range(5):
            os.makedirs(f"./{k}/{i}/{j}")
            if random.randint(1,100) == 2:
                with open(f"./{k}/{i}/{j}/{filename}","w") as f:
                    f.write(" ")
# search the directories for your file
found_in = []
# this starts searching in your current folder - you can give it your c:\Users\ instead
for root,dirs,files in os.walk("./"):
    if filename in files:
        found_in.append(os.path.join(root,filename))
print(*found_in,sep = "\n")
File found in:
./17/3/1/y_pred_test.npy
./3/8/1/y_pred_test.npy
./16/3/4/y_pred_test.npy
./16/5/3/y_pred_test.npy
./14/2/3/y_pred_test.npy
./0/5/4/y_pred_test.npy
./11/9/0/y_pred_test.npy
./9/8/1/y_pred_test.npy
If you get read errors because of missing file/directory permissions you can start directly in the users folder:
# Source: https://stackoverflow.com/a/4028943/7505395
from pathlib import Path
home = str(Path.home())
found_in = []
for root,dirs,files in os.walk(home):
    if filename in files:
        found_in.append(os.path.join(root,filename))
# use found_in[0] or break as soon as you find first file
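The "stop at the first match" idea from the last comment could be wrapped in a small function, so that the walk ends as soon as the file is found. This is a minimal sketch; the helper name `find_first` is hypothetical.

```python
import os


def find_first(filename, start_dir):
    """Return the path of the first file called `filename` under `start_dir`, or None."""
    for root, _dirs, files in os.walk(start_dir):
        if filename in files:
            return os.path.join(root, filename)  # stop walking at the first match
    return None
```

With the names from above, `find_first(filename, home)` would return one full path, or None if the file is nowhere below the home directory.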

You can use the expanduser function in the os.path module to modify a path to start from the home directory of a user
https://docs.python.org/3/library/os.path.html#os.path.expanduser
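A minimal sketch of that approach, assuming the Box folder sits directly inside the home directory on every machine (as it does in the two paths from the question):

```python
import os

# '~' is expanded to the current user's home directory, which would be
# C:/Users/Eric on one machine and C:/Users/erapp on the other.
RELATIVE_TO_HOME = (
    "~/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/"
    "bidir_lstm_50_50/predictions/y_pred_test.npy"
)
npy_path = os.path.expanduser(RELATIVE_TO_HOME)

# y_pred_walking = np.load(npy_path)  # numpy imported as np, as in the question
```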

A way to create files and directories without overwriting

You know how when you download something and the downloads folder contains a file with the same name, instead of overwriting it or throwing an error, the file ends up with a number appended to the end? For example, if I want to download my_file.txt, but it already exists in the target folder, the new file will be named my_file(2).txt. And if I try again, it will be my_file(3).txt.
I was wondering if there is a way in Python 3.x to check that and get a unique name (not necessarily create the file or directory). I'm currently implementing it doing this:
import os
def new_name(name, newseparator='_'):
    # name can be either a file or directory name
    base, extension = os.path.splitext(name)
    i = 2
    while os.path.exists(name):
        name = base + newseparator + str(i) + extension
        i += 1
    return name
In the example above, running new_name('my_file.txt') would return my_file_2.txt if my_file.txt already exists in the cwd. name can also contain a full or relative path; it will work as well.

I would use pathlib and do something along these lines:
from pathlib import Path
def new_fn(fn, sep='_'):
    p = Path(fn)
    if p.exists():
        if not p.is_file():
            raise TypeError
        rp = p.resolve(strict=True)
        parent = rp.parent
        extens = ''.join(rp.suffixes) # handle multiple ext such as .tar.gz
        base = rp.name[:len(rp.name) - len(extens)] # name without its extensions
        i = 2
        nf = parent / f"{base}{sep}{i}{extens}" # join with /, so the path separator is kept
        while nf.exists():
            i += 1
            nf = parent / f"{base}{sep}{i}{extens}"
        return nf
    else:
        return p.parent.resolve(strict=True) / p.name
This only handles files as written but the same approach would work with directories (which you added later.) I will leave that as a project for the reader.

Another way of getting a new name would be using the built-in tempfile module:
from pathlib import Path
from tempfile import NamedTemporaryFile
def new_path(path: Path, new_separator='_'):
    prefix = str(path.stem) + new_separator
    dir = path.parent
    suffix = ''.join(path.suffixes)
    with NamedTemporaryFile(prefix=prefix, suffix=suffix, delete=False, dir=dir) as f:
        return f.name
If you execute this function from within Downloads directory, you will get something like:
>>> new_path(Path('my_file.txt'))
'/home/krassowski/Downloads/my_file_90_lv301.txt'
where the 90_lv301 part was generated internally by Python's tempfile module.
Note: with the delete=False argument, the function creates (and leaves undeleted) an empty file with the new name. If you do not want an empty file left behind, remove delete=False; keeping it, however, prevents anyone else from creating a new file with that name before your next operation (though they could still overwrite it).
Simply put, delete=False prevents concurrency issues if you (or the end user) were to run your program twice at the same time.
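If the numbered names from the question (my_file_2.txt, my_file_3.txt, ...) are preferred over random ones, the same protection against concurrent runs can be had by creating the file in exclusive mode. This is a minimal sketch rather than a drop-in replacement: the helper name `reserve_name` is hypothetical, and like the tempfile approach it leaves an empty file behind.

```python
from pathlib import Path


def reserve_name(name, sep="_"):
    """Create and return the first free path among name, name_2, name_3, ..."""
    path = Path(name)
    candidate, i = path, 2
    while True:
        try:
            # Mode 'x' raises FileExistsError if the path already exists, so the
            # existence check and the creation happen in a single step.
            candidate.open("x").close()
            return candidate
        except FileExistsError:
            candidate = path.with_name(f"{path.stem}{sep}{i}{path.suffix}")
            i += 1
```

Calling `reserve_name("my_file.txt")` repeatedly in the same folder would create my_file.txt, then my_file_2.txt, then my_file_3.txt.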

Run only if "if " statement is true.!

I'm reading a FITS file and then using the information from its header to determine the other files that are related to the original FITS file. But for some of the FITS files, the other files (blaze_file, bis_file, ccf_table) are not available, and because of that my code fails with the obvious error: No such file or directory.
import pandas as pd
import sys, os
import numpy as np
from glob import glob
from astropy.io import fits
PATH = os.path.join("home", "Desktop", "2d_spectra")
for filename in os.listdir(PATH):
    if filename.endswith("_e2ds_A.fits"):
        e2ds_hdu = fits.open(filename)
        e2ds_header = e2ds_hdu[0].header
        date = e2ds_header['DATE-OBS']
        date2 = date = date[0:19]
        blaze_file = e2ds_header['HIERARCH ESO DRS BLAZE FILE']
        bis_file = glob('HARPS.' + date2 + '*_bis_G2_A.fits')
        ccf_table = glob('HARPS.' + date2 + '*_ccf_G2_A.tbl')
        if not all(file in os.listdir(PATH) for file in [blaze_file,bis_file,ccf_table]):
            continue
What I want is for my code to run only if all of the files are available, and to skip that file otherwise. The problem is that I define the other files as variables inside the for loop, because I'm using the header information. So how can I define them before the for loop?
Can anyone help me out with this?

The filenames returned by os.listdir() are always relative to the path given there.
In order to be used, they have to be joined with this path.
Example:
PATH = os.path.join("home", "Desktop", "2d_spectra")
for filename in os.listdir(PATH):
if filename.endswith("_e2ds_A.fits"):
filepath = os.path.join(PATH, filename)
e2ds_hdu = fits.open(filepath)
…
Let the filenames be ['a', 'b', 'a_e2ds_A.fits', 'b_e2ds_A.fits']. The code now excludes the first two names and then prepends the directory path to the remaining two.
a_e2ds_A.fits becomes home/Desktop/2d_spectra/a_e2ds_A.fits and
b_e2ds_A.fits becomes home/Desktop/2d_spectra/b_e2ds_A.fits.
Now they can be opened from the directory that PATH is relative to, not just from inside 2d_spectra itself; with an absolute PATH they can be opened from anywhere.
I should become accustomed to reading a question in full before trying to answer it.
The problem I mentioned only matters if you start the script from a path outside the said directory. Nevertheless, applying the fix will make your code much more consistent.
Your real problem, however, lies somewhere else: you examine a file and then, after checking its contents, want to read files whose names depend on information from that first file.
There are several ways to accomplish your goal:
1. Just extend your loop with the proper tests.
Pseudo code:
for file in files:
    if file.endswith("fits"):
        open file
        read date from header
        create file names depending on date
        if all files exist:
            proceed
or
for file in files:
    if file.endswith("fits"):
        open file
        read date from header
        create file names depending on date
        if not all files exist:
            continue # actual keyword, no pseudo code!
        proceed
2. Put some functionality into functions (variation of 1.)
3. Create a loop in a generator function which yields the "interesting information" of one fits file (or alternatively nothing) and have another loop run over them to actually work with the data.
If I am still missing some points or am not detailed enough, please let me know.
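The "do all files exist" test from the pseudo code could be sketched as the function below. It assumes the date string and the blaze file name have already been read from the header, as in the question. The helper name `companion_files` is hypothetical, and the glob patterns are copied from the question's code.

```python
import glob
import os


def companion_files(directory, date, blaze_file):
    """Return the companion file paths for one observation, or None if any is missing."""
    blaze_path = os.path.join(directory, blaze_file)
    bis = glob.glob(os.path.join(directory, f"HARPS.{date}*_bis_G2_A.fits"))
    ccf = glob.glob(os.path.join(directory, f"HARPS.{date}*_ccf_G2_A.tbl"))
    # glob returns a list, so an empty list means "no such file"
    if not (os.path.isfile(blaze_path) and bis and ccf):
        return None
    return {"blaze": blaze_path, "bis": bis[0], "ccf": ccf[0]}
```

Inside the loop, `if companion_files(PATH, date2, blaze_file) is None: continue` would then skip any FITS file whose companions are incomplete.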

Since you have to read the FITS file to learn the names of the dependent files, there's no way to avoid reading the FITS file first. The only thing you can do is test for the dependent files' existence before trying to read them, and skip the rest of the loop iteration (using continue) if any is missing.

Edit this line
e2ds_hdu = fits.open(filename)
And replace with
e2ds_hdu = fits.open(os.path.join(PATH, filename))
