python split string and list full path name


I am using glob to get a list of all PDF files in a folder (I need full path names to upload the files to the cloud).
During the upload I also need to assign a "title" to each file, which will be the item's name in the cloud.
I need to split at the last "\" and at the "." and get the value in between. For example:
pdf_list = glob.glob(r'C:\User\username\Desktop\pdf\*.pdf')
An item in the list will be: "c:\User\username\Desktop\pdf\4434343434331.pdf"
I need a Pythonic way to get the PDF's file name into a separate variable while still in the for loop.
for file in pdf_list:
    upload.file
    file.title(file.split(".")[0])
The split above does not return the result I want, but I am after something along those lines.
I am using a for loop to upload each PDF (using the file path).

Actually, there is already a function for this, os.path.basename:
import os
for file in pdf_list:
    file_name = os.path.basename(file)  # e.g. '4434343434331.pdf' (extension still attached)
    upload.file(file_name)
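Since os.path.basename keeps the extension, the title the question asks for takes one more step. Below is a minimal sketch, assuming the paths are Windows-style as in the question. title_from_path is a hypothetical helper name, and ntpath (the module os.path resolves to on Windows) is used so that backslashes are handled the same way on any platform:

```python
import ntpath  # the Windows flavour of os.path; importable on every platform


def title_from_path(full_path):
    """Return the file name without its directory or extension."""
    file_name = ntpath.basename(full_path)     # '4434343434331.pdf'
    title, _ext = ntpath.splitext(file_name)   # ('4434343434331', '.pdf')
    return title


full_path = r"c:\User\username\Desktop\pdf\4434343434331.pdf"
print(title_from_path(full_path))  # 4434343434331
```

Inside the loop, the full path would still go to the upload call, and the title to whatever sets the item's name.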

You can use pathlib, for example:
from pathlib import Path
p = list(Path('C:/User/username/Desktop/pdf').glob('*.pdf'))
first_filename = p[0].name  # file name with extension; p[0].stem drops the extension
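A hedged sketch of the same idea inside a loop: each Path object keeps the full path (via str()) and exposes the name without the extension as .stem. list_pdfs is a hypothetical helper, and the folder argument is whichever directory holds the PDFs:

```python
from pathlib import Path


def list_pdfs(folder):
    """Return (full_path, title) pairs for every PDF directly inside folder."""
    pairs = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        full_path = str(pdf.resolve())  # absolute path, e.g. for an upload call
        title = pdf.stem                # file name without '.pdf'
        pairs.append((full_path, title))
    return pairs
```

Each pair then carries both values the question needs: the path to upload and the title to assign.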

Related

Bulk convert files extension using Python

I am trying to develop a bulk WebP-to-PNG converter in Python.
I am using the webptools library (https://pypi.org/project/webptools/).
The documentation only shows how to convert one file at a time and requires the file name as input.
What I am trying to do is scan the folder for *.webp files and convert each one to *.png, keeping the original file name. I couldn't work out the output file names. I suppose my current code keeps overwriting the same file, x.png, so I end up with just one output file. I can't figure out how to fix this.
I am new to Python and hope to get some guidance here. Thank you very much.
from webptools import dwebp
import os, glob
os.chdir("./images") # working directory
webp_list = []
for file in glob.glob("*.webp"):
    webp_list = file
    print([webp_list])
for files in webp_list:
    print(dwebp(input_image=webp_list, output_image="x.png", option="-o", logging="-v"))
# documentation - code allows only 1 input and 1 output
# print(dwebp(input_image="sample.webp", output_image="sample.png", option="-o", logging="-v"))

After you do
webp_list = []
for file in glob.glob("*.webp"):
    webp_list = file
    print([webp_list])
webp_list is the name of the last matching file rather than a list of file names. According to its documentation, glob.glob itself will
Return a possibly-empty list of path names that match pathname(...)
so there is no need for such a contortion and you can simply do
webp_list = glob.glob("*.webp")
instead. You then need a different output file name for each input, for which I propose the following solution:
for filename in webp_list:
    outname = filename[:-4] + "png"
    dwebp(input_image=filename, output_image=outname, option="-o", logging="-v")
filename[:-4] is the file name without its last 4 characters (webp in this case, so the dot is kept), to which png is then appended.
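Slicing off a fixed number of characters works only while every name ends in exactly webp. As an alternative sketch (not part of webptools; png_name is a hypothetical helper), pathlib can swap the extension regardless of its length:

```python
from pathlib import Path


def png_name(webp_name):
    """Return the input file name with its last extension replaced by .png."""
    return str(Path(webp_name).with_suffix(".png"))


for name in ["sample.webp", "holiday.photo.webp"]:
    print(name, "->", png_name(name))
# sample.webp -> sample.png
# holiday.photo.webp -> holiday.photo.png
```

The result could then be passed as output_image in place of outname.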

I've never used this library before, so my suggestion is based only on how I guess it should work:
from webptools import dwebp
import os, glob
os.chdir("./images") # working directory
for file in glob.glob("*.webp"):
    output_file = file[:-4] + 'png'  # 'name.webp' -> 'name.png'
    dwebp(input_image=file, output_image=output_file, option="-o", logging="-v")

How to use python to create every word list?

Here are the wav and image files; you can download them from https://www.dropbox.com/s/iuwt6boc2r2fotc/word_images_file.zip?dl=0
Step 1: create a word-list txt file for every word.
Each list contains the image file names for that word, and the list is named after the word.
I don't know how to write Python code that creates the image list for every word.
example:
accordion-word.txt
file 'accordion_1_musical_instruments.jpg'
file 'accordion_2_musical_instruments.jpg'
file 'accordion_3_musical_instruments.jpg'
file 'accordion_musical_instruments.jpg'
Step 2: create an audio file list for every word.
I don't know how to write Python code that creates the audio list for every word.
accordion-audio.txt
file 'slience_2sec.mp3'
file 'This_is_.mp3'
file 'slience_2sec.mp3'
file 'accordion.mp3'
Thank you !

I prefer os.listdir when I only need file names, compared with the full paths that glob returns when it is given an absolute path.
I'm guessing that an image name without a number in it shares its prefix with the numbered ones. The regex got out of hand without this premise.
Here's the full code:
from os import listdir
import re
# Get the list of all image files in the directory
location = 'X:test folder/'
image_list = [name for name in listdir(location) if name.endswith('.jpg')]
# Collect the keyword in front of each '_<number>'; names without a number are ignored.
# A set drops the duplicates produced by accordion_1, accordion_2, ...
reg = re.compile(r'(^[^0-9]*(?=_[0-9]))')
keywords = {reg.match(name).group(0) for name in image_list if reg.match(name)}
# Create one txt file per keyword
for keyword in keywords:
    filtered = [f"file '{name}'" for name in image_list if name.startswith(keyword)]
    with open(location + keyword + '-word.txt', 'w') as file:
        file.write('\n'.join(filtered))
# Collect the .wav audio clips
audios = [f"file '{name}'" for name in listdir(location) if name.endswith('.wav')]
# Save the audio clip list
with open(location + 'audio.txt', 'w') as file:
    file.write('\n'.join(audios))
Results: on my server.
You can make the keyword step much simpler by changing the file names,
for example to air-conditioner_1 instead of air_conditioner_1. Then the keyword is always whatever comes before the first underscore, for all files. Much simpler.
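A sketch of that simpler scheme, assuming the files have already been renamed so that a keyword never contains an underscore. group_by_keyword and the sample names are illustrative only:

```python
from collections import defaultdict


def group_by_keyword(image_names):
    """Map each keyword (text before the first underscore) to its list lines."""
    groups = defaultdict(list)
    for name in sorted(image_names):
        keyword = name.split("_", 1)[0]
        groups[keyword].append(f"file '{name}'")
    return dict(groups)


names = [
    "air-conditioner_1_appliances.jpg",    # hypothetical renamed files
    "air-conditioner_appliances.jpg",
    "accordion_1_musical_instruments.jpg",
]
for keyword, lines in group_by_keyword(names).items():
    print(f"{keyword}-word.txt:", lines)
```

Each key gives the name of one txt file, and its list holds the lines to write into it, so no regex is needed.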

You could use Python's built-in glob module.
For example, to get a list of all mp3 files:
glob.glob(r'C:\Downloads\*.mp3')
Note that the path format in the example above is for Windows; the raw string (r'...') keeps the backslashes from being read as escape sequences.

Python string alphabet removal?

In my program, I am reading in files and processing them.
My output should show just the file name and then display some data.
When I loop through the files and print each one's name and data,
it displays, for example, myfile.txt. I don't want the .txt part, just myfile.
How can I remove the .txt from the end of this string?

The best way to do it is shown in this example:
import os
filename = 'myfile.txt'
print(filename)                       # myfile.txt
print(os.path.splitext(filename))     # ('myfile', '.txt')
print(os.path.splitext(filename)[0])  # myfile
More info about this very useful built-in module:
https://docs.python.org/3.8/library/os.path.html
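A short illustration of how os.path.splitext treats a few less obvious names; the sample names are made up for the demonstration. Only the last extension is split off, and a leading dot is not counted as an extension:

```python
import os

for name in ["myfile.txt", "archive.tar.gz", ".bashrc", "README"]:
    root, ext = os.path.splitext(name)
    print(repr(name), "->", repr(root), repr(ext))
# 'myfile.txt' -> 'myfile' '.txt'
# 'archive.tar.gz' -> 'archive.tar' '.gz'
# '.bashrc' -> '.bashrc' ''
# 'README' -> 'README' ''
```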

The answers given are totally right, but if you have other possible extensions, or don't want to import anything, try this:
name = file_name.rsplit(".", 1)[0]

You can use pathlib.Path which has a stem attribute that returns the filename without the suffix.
>>> from pathlib import Path
>>> Path('myfile.txt').stem
'myfile'

Well, if you only have .txt files, you can do this:
file_name = "myfile.txt"
name = file_name.replace('.txt', '')  # 'myfile'
This uses the built-in str.replace method, which returns a new string and leaves file_name unchanged.
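One caveat with replace is that it removes every occurrence of '.txt', not only the one at the end. Here is a small sketch of a safer variant; strip_suffix is a hypothetical helper (Python 3.9+ offers str.removesuffix for the same purpose):

```python
def strip_suffix(name, suffix=".txt"):
    """Remove suffix only when it appears at the very end of name."""
    if suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name


tricky = "report.txt.old.txt"
print(tricky.replace(".txt", ""))  # report.old
print(strip_suffix(tricky))        # report.txt.old
```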

Removing file extension from filename with file handle as input

I have the following code: f = open('01-01-2017.csv')
From the variable f, I need to remove the ".csv" and store the remaining "01-01-2017" in a variable called "date". What is the best way to accomplish this?

Just retrieve the name of the file using f.name and apply os.path.splitext, keeping the left part:
import os
date = os.path.splitext(os.path.basename(f.name))[0]
(I've used os.path.basename in case the file has an absolute path)
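A self-contained way to try this out, assuming nothing beyond the standard library: the sketch creates a throwaway '01-01-2017.csv' in a temporary folder (a stand-in for the question's file), reopens it, and derives the name from the file handle:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "01-01-2017.csv")
    with open(path, "w") as f:   # stand-in for the question's file
        f.write("placeholder\n")
    with open(path) as f:
        # f.name is the path passed to open(); strip the directory, then the extension
        date = os.path.splitext(os.path.basename(f.name))[0]

print(date)  # 01-01-2017
```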

Extracting all file names in python

I have an application that converts from one photo format to another when I enter the following in cmd.exe: "AppConverter.exe" "file.tiff" "file.jpeg"
Since I don't want to type this every time I want a photo converted, I would like a script that converts all files in the folder. So far I have this:
def start(self):
    for root, dirs, files in os.walk("C:\\Users\\x\\Desktop\\converter"):
        for file in files:
            if file.endswith(".tiff"):
                subprocess.run(['AppConverter.exe', '.tiff', '.jpeg'])
So how do I get the names of all the files and pass them to subprocess? I am thinking of taking the base name (no extension) of every file and appending .tiff and .jpeg to it, but I'm at a loss as to how to do it.

I think the fastest way would be to use the glob module to match the file names:
import glob
import subprocess
for file in glob.glob("*.tiff"):
    subprocess.run(['AppConverter.exe', file, file[:-5] + '.jpeg'])
    # file will be like 'test.tiff'
    # file[:-5] will be 'test' (the last 5 characters, '.tiff', are removed)
    # '.jpeg' is then appended to the extension-less string
All of this information is in the post I linked in the comments on your original question.

You could try looking into os.path.splitext(). It splits the file name into a tuple containing the root (everything before the extension) and the extension. That might help...
https://docs.python.org/3/library/os.path.html
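Putting the two suggestions together, here is a hedged sketch that keeps the question's os.walk traversal and uses os.path.splitext to build the output name. build_commands is a hypothetical helper, and it only assembles the argument lists; whether AppConverter.exe accepts full paths is an assumption:

```python
import os


def build_commands(folder, converter="AppConverter.exe"):
    """Return one argument list per .tiff file found under folder."""
    commands = []
    for root, _dirs, files in os.walk(folder):
        for file in sorted(files):
            base, ext = os.path.splitext(file)
            if ext.lower() == ".tiff":
                source = os.path.join(root, file)
                target = os.path.join(root, base + ".jpeg")
                commands.append([converter, source, target])
    return commands
```

Each entry could then be passed to subprocess.run(command, check=True) to perform the conversion.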
