Get the size of a folder in Linux server - python

While the following code works well on Windows, on a Linux server (PythonAnywhere) the function only returns 0, without errors. What am I missing?
import os
def folder_size(path):
    total = 0
    for entry in os.scandir(path):
        if entry.is_file():
            total += entry.stat().st_size
        elif entry.is_dir():
            total += folder_size(entry.path)
    return total
print(folder_size("/media"))
Ref: Code from https://stackoverflow.com/a/37367965/6546440

The solution was given by @gilen-tomas in the comments:
import os
def folder_size(path):
    total = 0
    for entry in os.scandir(path):
        if entry.is_file():
            total += entry.stat().st_size
        elif entry.is_dir():
            total += folder_size(entry.path)
    return total
print(folder_size("/home/your-user/your-project/media/"))
A complete path is needed!

Depending on the filesystem, the underlying struct dirent may not know if any given entry is a file or directory (or something else). Perhaps, on the filesystem used by PythonAnywhere, you need to stat first (stat_result.st_mode ought to be valid).
Edit: A look in discussion on os.scandir suggests the DT_UNKNOWN case is handled by doing another stat. I'd still try confirming those checks work as expected.
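One way to confirm those checks is to skip the dirent type entirely and classify each entry from an explicit lstat() call. The following is only a sketch: the function name folder_size_via_stat is made up for illustration, and resolving the path with os.path.abspath is an assumption that makes it obvious which directory is really being measured.

```python
import os
import stat

def folder_size_via_stat(path):
    """Sum regular-file sizes under path, classifying entries via lstat()."""
    path = os.path.abspath(path)  # make the resolved location explicit
    total = 0
    for name in os.listdir(path):
        full = os.path.join(path, name)
        info = os.lstat(full)  # lstat: do not follow symlinks
        if stat.S_ISREG(info.st_mode):
            total += info.st_size
        elif stat.S_ISDIR(info.st_mode):
            total += folder_size_via_stat(full)
    return total
```

If this returns the same 0 as the scandir version, the directory being read is probably not the one you expect (for example /media instead of the project's media folder).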

You can try this.
For Linux:
import os
path = '/home/user/Downloads'
# Sum the size of every file found while walking the tree
folder = sum(
    os.path.getsize(os.path.join(directory, fname))
    for directory, folders, files in os.walk(path)
    for fname in files
)
MB = 1024 * 1024.0
print("%.2f MB" % (folder / MB))
For Windows:
import win32com.client as com
folderPath = r"/home/user/Downloads"
fso = com.Dispatch("Scripting.FileSystemObject")
folder = fso.GetFolder(folderPath)
MB = 1024 * 1024.0
print("%.2f MB" % (folder.Size / MB))

It worked for me on Linux (Ubuntu Server 16.04, Python 3.5), but there could be permission errors if the process doesn't have permission to read a file.

Not a solution for this, but another way to get the size is to call du from Python:
import subprocess
import re
cmd = ["du", "-sh", "-b", "media"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
tmp = str(proc.stdout.read())
tmp = re.findall(r'\d+', tmp)[0]  # first run of digits in du's output
print(tmp)
If you are executing this from your project (instead of manually in the terminal), a complete path is needed in place of "media" ("/home/your-user/your-project/media/").

Related

Open installed apps on Windows intelligently

I am coding a voice assistant to automate my PC, which is running Windows 11, and I want to open apps using voice commands. I don't want to hard-code every installed app's .exe path. Is there any way to get a dictionary of the apps' names and their .exe paths? I am able to get currently running apps and close them using this:
import psutil
def close_app(app_name):
    running_apps = psutil.process_iter(['pid', 'name'])
    found = False
    for app in running_apps:
        sys_app = app.info.get('name').split('.')[0].lower()
        if sys_app in app_name.split() or app_name in sys_app:
            pid = app.info.get('pid')
            try:
                app_pid = psutil.Process(pid)
                app_pid.terminate()
                found = True
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass  # process already gone or not ours to terminate
    if not found:
        print(app_name + " is not running")
    else:
        print('Closed ' + app_name)
Possibly use wmic to list the installed apps, then use either which or gcm (Get-Command) to grab each path and build the dict?
The following is very basic code, not completely tested.
import subprocess
import shutil
Data = subprocess.check_output(['wmic', 'product', 'get', 'name'])
a = str(Data)
appsDict = {}
x = (a.replace("b\\'Name","").split("\\r\\r\\n"))
for i in range(len(x) - 1):
    appName = x[i+1].rstrip()
    appPath = shutil.which(appName)
    appsDict.update({appName: appPath})
print(appsDict)
Under Windows PowerShell there is a Get-Command utility. Finding Windows executables using Get-Command is described nicely in this issue. Essentially it's just running
Get-Command *
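Now you need to run this from Python and capture the command's output in a variable. Get-Command is a PowerShell cmdlet rather than an executable, so it has to be run through powershell:
import subprocess
data = subprocess.check_output(['powershell', '-Command', 'Get-Command *'])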
Probably this is not the best, and not a complete answer, but maybe it's a useful idea.
This can be accomplished via the following code:
import os
def searchfiles(extension, folder):
    # Write the full path of every matching file to e.g. "exefile.txt"
    with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
        for r, d, f in os.walk(folder):
            for file in f:
                if file.endswith(extension):
                    filewrite.write(f"{os.path.join(r, file)}\n")
searchfiles('.exe', 'H:\\')
Inspired from: https://pythonprogramming.altervista.org/find-all-the-files-on-your-computer/
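Since the question asks for a dictionary rather than a text file, the same walk can fill a dict directly. This is a rough sketch, not a complete answer: index_executables is a hypothetical helper name, and keying on the lower-cased file name without its extension is an assumption about how the voice assistant would look apps up.

```python
import os

def index_executables(folder, extension=".exe"):
    """Map lower-cased program names to the first matching path under folder."""
    apps = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(extension):
                key = os.path.splitext(name)[0].lower()
                # keep the first hit so later duplicates do not overwrite it
                apps.setdefault(key, os.path.join(root, name))
    return apps
```

Calling index_executables('H:\\') would then return something you can query with the spoken app name. Walking a whole drive is slow, so caching the result is probably worthwhile.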

Why i got error "The system cannot find the path specified." in subprocess python

import subprocess
import os
....
....
opn=open(ak,'w')
tt=txt.get(1.0,END)
opn.write(tt)
lst=[]
for i in range(0,len(ak)):
    if ak[i]=='/':
        lst.append(i)
    else:
        pass
val=lst[-1]+1
path=ak
file_name=path[val:]
sudo_path_name=path[3:val-1]
dir_name=path[:2]
path_name="cd "
for i in sudo_path_name:
    if i=='/':
        path_name+='\\'
    else:
        path_name+=i
command=dir_name+'&&'+path_name+'&&'+file_name
os.system(command)
output = subprocess.getoutput(command)
print(output)
ak is the path of a file.
My aim is just to print the output when the file at ak is executed.
But whenever I try to execute it, I get this output:
The system cannot find the path specified.
The system cannot find the path specified.
When I take the command and run it in the command prompt, it executes successfully with no error.
Thank you.
This depends on the editor. It happened to me too: I was using VS Code, and when I changed the cwd it worked. So that may be the problem.
Also try using the full path, or try this code:
from pathlib import Path
import os
os.chdir(Path(__file__).parent)
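Another option is to avoid building a "cd ... && file" command string at all and pass the working directory to subprocess directly. The sketch below is an illustration under assumptions: run_in_own_directory is a made-up name, and it assumes the file is a Python script run with the current interpreter, which the question does not state.

```python
import subprocess
import sys
from pathlib import Path

def run_in_own_directory(file_path):
    """Run a script with its parent folder as the working directory."""
    file_path = Path(file_path).resolve()
    result = subprocess.run(
        [sys.executable, str(file_path)],  # assumption: the file is a Python script
        cwd=file_path.parent,              # replaces the manual "cd" step
        capture_output=True,
        text=True,
    )
    # return both streams so error messages are not lost
    return result.stdout + result.stderr
```

Here print(run_in_own_directory(ak)) would show the script's output, and no drive letter or backslash handling is needed.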

Trying to create a run Key in the Registry using Python

Hey guys, I am using this code from a book. The code does take putty and copy it to the Documents folder, but it does not end up adding the registry key. I am running it with Python 2.7 on a Windows 7 64-bit machine.
import os # needed for getting working directory
import shutil # needed for file copying
import subprocess # needed for getting user profile
import _winreg as wreg # needed for editing registry DB
path = os.getcwd().strip('/n') #Get current working directory where the backdoor gets executed, we use the output to build our source path
Null,userprof = subprocess.check_output('set USERPROFILE', shell=True).split('=')
destination = userprof.strip('\n\r') + '\\Documents\\' +'putty.exe'
if not os.path.exists(destination):
shutil.copyfile(path+'\putty.exe', destination)
key = wreg.OpenKey(wreg.HKEY_CURRENT_USER, "Software\Microsoft\Windows\CurrentVersion\Run",0,
wreg.KEY_ALL_ACCESS)
wreg.SetValueEx(key, 'RegUpdater', 0, wreg.REG_SZ,destination)
key.Close()
I ran it in Python 3 (where the _winreg module is named winreg, imported here as wreg) and it works well:
path = os.getcwd().strip('\n')
Null, userprof = subprocess.check_output('set USERPROFILE', shell=True, stdin=subprocess.PIPE,
stderr=subprocess.PIPE).decode().split('=')
destination = userprof.strip('\n\r') + '\\Documents\\' + 'client.exe'
if not os.path.exists(destination):
shutil.copyfile(path + '\client.exe', destination)
key = wreg.OpenKey(wreg.HKEY_CURRENT_USER, "Software\Microsoft\Windows\CurrentVersion\Run", 0, wreg.KEY_ALL_ACCESS)
wreg.SetValueEx(key,'RegUpdater', 0 , wreg.REG_SZ, destination)
key.Close()

Get total length of videos in a particular directory in python

I have downloaded a bunch of videos from coursera.org and have them stored in one particular folder. There are many individual videos in a particular folder (Coursera breaks a lecture into multiple short videos). I would like to have a python script which gives the combined length of all the videos in a particular directory. The video files are .mp4 format.
First, install the ffprobe command (it's part of FFmpeg) with
sudo apt install ffmpeg
then use subprocess.run() to run this bash command:
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 -- <filename>
(which I got from http://trac.ffmpeg.org/wiki/FFprobeTips#Formatcontainerduration), like this:
from pathlib import Path
import subprocess
def video_length_seconds(filename):
    result = subprocess.run(
        [
            "ffprobe",
            "-v",
            "error",
            "-show_entries",
            "format=duration",
            "-of",
            "default=noprint_wrappers=1:nokey=1",
            "--",
            filename,
        ],
        capture_output=True,
        text=True,
    )
    try:
        return float(result.stdout)
    except ValueError:
        raise ValueError(result.stderr.rstrip("\n"))
# a single video
video_length_seconds('your_video.webm')
# all mp4 files in the current directory (seconds)
print(sum(video_length_seconds(f) for f in Path(".").glob("*.mp4")))
# all mp4 files in the current directory and all its subdirectories
# `rglob` instead of `glob`
print(sum(video_length_seconds(f) for f in Path(".").rglob("*.mp4")))
# all files in the current directory
print(sum(video_length_seconds(f) for f in Path(".").iterdir() if f.is_file()))
This code requires Python 3.7+ because that's when text= and capture_output= were added to subprocess.run. If you're using an older Python version, check the edit history of this answer.
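The sums above come out as a plain number of seconds. If you would rather read them as hours, minutes and seconds, a small helper along these lines could be used; format_duration is a hypothetical name and not part of ffprobe or the code above.

```python
def format_duration(total_seconds):
    """Render a number of seconds as H:MM:SS (fractions of a second are dropped)."""
    minutes, seconds = divmod(int(total_seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return "%d:%02d:%02d" % (hours, minutes, seconds)

print(format_duration(3725.4))  # 1:02:05
```

You would wrap any of the sum(...) expressions in format_duration(...) before printing.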
Download MediaInfo and install it (don't install the bundled adware)
Go to the MediaInfo source downloads and in the "Source code, All included" row, choose the link next to "libmediainfo"
Find MediaInfoDLL3.py in the downloaded archive and extract it anywhere.
Example location: libmediainfo_0.7.62_AllInclusive.7z\MediaInfoLib\Source\MediaInfoDLL\MediaInfoDLL3.py
Now make a script for testing (sources below) in the same directory.
Execute the script.
MediaInfo works on POSIX too. The only difference is that a .so (shared object) is loaded instead of a DLL.
Test script (Python 3!)
import os
os.chdir(os.environ["PROGRAMFILES"] + "\\mediainfo")
from MediaInfoDLL3 import MediaInfo, Stream
MI = MediaInfo()
def get_lengths_in_milliseconds_of_directory(prefix):
    for f in os.listdir(prefix):
        MI.Open(prefix + f)
        duration_string = MI.Get(Stream.Video, 0, "Duration")
        try:
            duration = int(duration_string)
            yield duration
            print("{} is {} milliseconds long".format(f, duration))
        except ValueError:
            print("{} ain't no media file!".format(f))
        MI.Close()
print(sum(get_lengths_in_milliseconds_of_directory(os.environ["windir"] + "\\Performance\\WinSAT\\"
)), "milliseconds of content in total")
In addition to Janus Troelsen's answer above, I would like to point out a small problem I encountered when implementing it. I followed his instructions one by one but got different results on Windows (7) and Linux (Ubuntu). His instructions worked perfectly under Linux, but I had to do a small hack to get them to work on Windows. I am using a 32-bit Python 2.7.2 interpreter on Windows, so I used MediaInfoDLL.py. But that was not enough to get it to work; I was receiving this error:
"WindowsError: [Error 193] %1 is not a valid Win32 application".
This meant that I was somehow using a resource that was not 32-bit; it had to be the DLL that MediaInfoDLL.py was loading. If you look at the MediaInfo installation directory you will see 3 DLLs; MediaInfo.dll is 64-bit while MediaInfo_i386.dll is 32-bit. MediaInfo_i386.dll is the one I had to use because of my Python setup. I went to MediaInfoDLL.py (which I had already included in my project) and changed this line:
MediaInfoDLL_Handler = windll.MediaInfo
to
MediaInfoDLL_Handler = WinDLL(r"C:\Program Files (x86)\MediaInfo\MediaInfo_i386.dll")
I didn't have to change anything for it to work on Linux.
Nowadays pymediainfo is available, so Janus Troelsen's answer could be simplified.
You need to install MediaInfo and pip install pymediainfo. Then the following code would print you the total length of all video files:
import os
from pymediainfo import MediaInfo
def get_track_len(file_path):
    media_info = MediaInfo.parse(file_path)
    for track in media_info.tracks:
        if track.track_type == "Video":
            return int(track.duration)
    return 0
video_dir = 'directory with video files'
# os.listdir() returns bare names, so join them with the directory first
print(sum(get_track_len(os.path.join(video_dir, f)) for f in os.listdir(video_dir)))
This link shows how to get the length of a video file https://stackoverflow.com/a/3844467/735204
import subprocess
def getLength(filename):
    result = subprocess.Popen(["ffprobe", filename],
        stdout = subprocess.PIPE, stderr = subprocess.STDOUT,
        universal_newlines = True)  # text output, so the "in" test also works on Python 3
    return [x for x in result.stdout.readlines() if "Duration" in x]
If you're using that function, you can then wrap it up with something like
import os
for f in os.listdir('.'):
    print("%s: %s" % (f, getLength(f)))
Here's my take. I did this on Windows. I took the answer from Federico above, and changed the python program a little bit to traverse a tree of folders with video files. So you need to go above to see Federico's answer, to install MediaInfo and to pip install pymediainfo, and then write this program, summarize.py:
import os
import sys
from pymediainfo import MediaInfo
number_of_video_files = 0
def get_alternate_len(media_info):
    myJson = media_info.to_data()
    myArray = myJson['tracks']
    for track in myArray:
        if track['track_type'] == 'General' or track['track_type'] == 'Video':
            if 'duration' in track:
                return int(track['duration'] / 1000)
    return 0
def get_track_len(file_path):
    global number_of_video_files
    media_info = MediaInfo.parse(file_path)
    for track in media_info.tracks:
        if track.track_type == "Video":
            number_of_video_files += 1
            if type(track.duration) == int:
                len_in_sec = int(track.duration / 1000)
            elif type(track.duration) == str:
                len_in_sec = int(float(track.duration) / 1000)
            else:
                len_in_sec = get_alternate_len(media_info)
            if len_in_sec == 0:
                print("File path = " + file_path + ", problem in type of track.duration")
            return len_in_sec
    return 0
sum_in_secs = 0.0
os.chdir(sys.argv[1])
for root, dirs, files in os.walk("."):
    for name in files:
        sum_in_secs += get_track_len(os.path.join(root, name))
hours = int(sum_in_secs / 3600)
remain = sum_in_secs - hours * 3600
minutes = int(remain / 60)
seconds = remain - minutes * 60
print("Directory: " + sys.argv[1])
print("Total number of video files is " + str(number_of_video_files))
print("Length: %d:%02d:%02d" % (hours, minutes, seconds))
Run it: python summarize.py <DirPath>
Have fun. I found I have about 1800 hours of videos waiting for me to have some free time. Yeah sure

Determine Device of Filesystem in Python

How do you use Python to determine which Linux device/partition contains a given filesystem?
e.g.
>>> get_filesystem_device('/')
/dev/sda
>>> get_filesystem_partition('/')
/dev/sda1
Your question was about Linux, so this is (more or less) Linux-specific.
Below are code examples for three ways of mapping major/minor numbers to a device name:
Parse /proc/partitions.
Ask hal. Hal also keeps track of the "parent" device, meaning you can easily get the disk as well as the partition.
Check sysfs yourself. This is where hal gets its information from.
I'd say that /proc/partitions is the simplest: it is just one file to open and check. hal gives you the most information and abstracts away lots of details. sysfs may be viewed as more correct than /proc/partitions and doesn't require hal to be running.
For a desktop program I would go for hal. On an embedded system I'd go with sysfs.
import os
def main():
    dev = os.stat("/home/").st_dev
    major, minor = os.major(dev), os.minor(dev)
    print("/proc/partitions says:", ask_proc_partitions(major, minor))
    print("HAL says:", ask_hal(major, minor))
    print("/sys says:", ask_sysfs(major, minor))
def _parse_proc_partitions():
    res = {}
    for line in open("/proc/partitions"):
        fields = line.split()
        try:
            tmaj = int(fields[0])
            tmin = int(fields[1])
            name = fields[3]
            res[(tmaj, tmin)] = name
        except (ValueError, IndexError):
            # just ignore parse errors in header/separator lines
            pass
    return res
def ask_proc_partitions(major, minor):
    d = _parse_proc_partitions()
    return d[(major, minor)]
def ask_hal(major, minor):
    import dbus
    bus = dbus.SystemBus()
    halobj = bus.get_object('org.freedesktop.Hal', '/org/freedesktop/Hal/Manager')
    hal = dbus.Interface(halobj, 'org.freedesktop.Hal.Manager')
    def getdevprops(p):
        bdevi = dbus.Interface(bus.get_object('org.freedesktop.Hal', p),
                               "org.freedesktop.Hal.Device")
        return bdevi.GetAllProperties()
    bdevs = hal.FindDeviceByCapability("block")
    for bdev in bdevs:
        props = getdevprops(bdev)
        if (props['block.major'], props['block.minor']) == (major, minor):
            parentprops = getdevprops(props['info.parent'])
            return (str(props['block.device']),
                    str(parentprops['block.device']))
def ask_sysfs(major, minor):
    from glob import glob
    needle = "%d:%d" % (major, minor)
    files = glob("/sys/class/block/*/dev")
    for f in files:
        if open(f).read().strip() == needle:
            return os.path.dirname(f)
    return None
if __name__ == '__main__':
    main()
It looks like this post has part of your answer (I'm still not sure how to grab the major/minor numbers out of the /dev/sda2 entry to match them up with what os.stat() returns for /):
Device number in stat command output
>>> import os
>>> print(hex(os.stat('/')[2]))
0x802
   \ \minor device number
    \major device number
[me@server /]$ ls -l /dev/sda2
brw-rw---- 1 root disk 8, 2 Jun 24 2004 /dev/sda2
[me@server jgaines2]$   \  \minor device number
                         \major device number
I recently had a need for this solution also. After seeing all the convoluted methods of getting the result I wanted through pure python, I decided to turn to the shell for help.
import subprocess
device = subprocess.check_output("grep '/filesystem' /proc/mounts | awk '{printf $1}'", shell=True)
print(device.decode())
This gives me exactly what I want, the device string for where my filesystem is mounted.
Short, sweet, and runs in python. :)
There are problems with quite a few of the above solutions. There's actually a problem with the question as well.
The last answer (searching /proc/mounts) just doesn't work: searching for "/" will match every line in /proc/mounts. Even correcting this like this won't work:
import subprocess
device = subprocess.check_output("awk '$2 == \"/filesystem\" { print $1}' /proc/mounts", shell=True)
print(device.decode())
When "/filesystem" is "/" you'll typically get two entries, one for "rootfs" and one for the actual device. It also won't work when the mounted file system name has spaces in it (the space appears as \040 in /proc/mounts).
The problem is made worse with btrfs subvolumes. Each subvolume is mounted separately but they all share the same device. If you're trying to use a btrfs snapshot for backups (as I was) then you need the subvolume name and an indication of the filesystem type.
This function returns a tuple of (device, mountpoint, filesystem) and seems to work:
import os
def get_filesystem_partition(fs):
    res = None
    dev = os.lstat(fs).st_dev
    with open('/proc/mounts') as mounts:
        for line in mounts:
            # lines are device, mountpoint, filesystem, <rest>
            # later entries override earlier ones
            # decode octal escapes such as \040 (Python 3 equivalent of 'string_escape')
            line = [s.encode().decode('unicode_escape') for s in line.split()[:3]]
            if dev == os.lstat(line[1]).st_dev:
                res = tuple(line)
    return res
That seems to work for all the cases I can think of, although I expect that there are still pathological cases where it falls to bits.
It is not the purdiest, but this will get you started:
#!/usr/bin/python
import os, stat, subprocess, shlex, re, sys
dev=os.stat('/')[stat.ST_DEV]
major=os.major(dev)
minor=os.minor(dev)
out = subprocess.Popen(shlex.split("df /"), stdout=subprocess.PIPE).communicate()
m=re.search(r'(/[^\s]+)\s',str(out))
if m:
    mp= m.group(1)
else:
    print("cannot parse df")
    sys.exit(2)
print("'/' mounted at '%s' with dev number %i, %i" % (mp,major,minor))
On OS X:
'/' mounted at '/dev/disk0s2' with dev number 14, 2
On Ubuntu:
'/' mounted at '/dev/sda1' with dev number 8, 1
To get the device name, chop off the minor number from the partition name. On OS X, also chop the 's' + minor number.
How about using the (Linux) blkid command (/sbin/blkid)?
$ uname --kernel-name --kernel-release
Linux 3.2.0-4-amd64
$ python --version
Python 2.7.3
#!/usr/bin/env python
import subprocess
sys_command = "/sbin/blkid"
proc = subprocess.Popen(sys_command,
                        stdout=subprocess.PIPE,
                        shell=True)
# proc.communicate() returns a tuple (stdout,stderr)
blkid_output = proc.communicate()[0]
print(blkid_output)
Here's the output on a dual-boot laptop with an (unmounted) USB drive (sdb1)
$ ./blkid.py
/dev/sda1: LABEL="RECOVERY" UUID="xxxx-xxxx" TYPE="vfat"
/dev/sda2: LABEL="OS" UUID="xxxxxxxxxxxxxxx" TYPE="ntfs"
/dev/sda5: UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="ext4"
/dev/sda6: UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="swap"
/dev/sda7: UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="ext4"
/dev/sdb1: LABEL="CrunchBang" TYPE="iso9660"
Here is how you can simply get the devices major and minor numbers:
import os
major, minor = divmod(os.stat('/').st_dev, 256)
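Note that dividing by 256 only matches the kernel's encoding for small device numbers; os.major() and os.minor() decode st_dev without that assumption. As a sketch, the pair can then be looked up under /sys/dev/block to get a kernel device name. The helper name device_for_path is made up here, and the lookup is assumed to return nothing for filesystems that have no block device entry (some virtual or overlay mounts).

```python
import os

def device_for_path(path):
    """Return (major, minor, kernel device name or None) for the filesystem holding path."""
    st_dev = os.stat(path).st_dev
    major, minor = os.major(st_dev), os.minor(st_dev)
    link = "/sys/dev/block/%d:%d" % (major, minor)
    # the sysfs entry is a symlink whose target directory is named after the device
    name = os.path.basename(os.path.realpath(link)) if os.path.exists(link) else None
    return major, minor, name

if __name__ == "__main__":
    print(device_for_path("/"))
```

On a typical single-disk machine the name would be something like sda1, which can be prefixed with /dev/ to get the partition node.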
