How to Check File and Folder Size in Python?

In this article, you’ll learn how to check the size of a file or directory in Python.

Python is one of the most versatile programming languages. It lets you build anything from a small CLI (command-line interface) program to a complex web application.

However, one of its most underrated features is the ability to interact with the operating system. Managing OS operations with Python can save you a lot of time when building automation processes.

Let’s see how Python interacts with the operating system.

How does Python interact with the operating system?

Python communicates with the operating system using the os, sys, pathlib, and subprocess modules.

Nobody can live in isolation from their environment. The same is true in Python, where it’s sometimes essential to communicate with the operating system to get things done.

Python has several modules that let us interact with the operating system. The most commonly used are os, sys, pathlib, and subprocess.

Since they’re built-in modules, you don’t need to install them with pip. You can import all of them with the following statements:

import os
import sys
import pathlib
import subprocess

The list below describes the main functionality of each of these imports:

  • os: A portable way of using system-specific (depending on your operating system) functionality. It’s the right choice in most cases unless you need something more advanced.
  • sys: System-specific parameters and functions. This module provides access to interpreter variables and functions. The os module interacts with the operating system, whereas sys interacts with the Python interpreter.
  • pathlib: Advanced path usage. It lets you represent filesystem paths as objects, with the relevant semantics for each operating system.
  • subprocess: Execution and management of subprocesses directly from Python. This involves working with stdin, stdout, and return codes. You can learn more about it by reading our Python subprocess guide.

There are high-level libraries that include even more specific functionality depending on your needs. However, you can usually get started with the modules above.

Note: Most of the functions provided by these modules produce different output depending on your operating system. Remember that UNIX and Python are usually the best match.

Now that you have a quick understanding of how Python interacts with the operating system, let’s dive into the methods for checking file and folder sizes. All of the following solutions are available in the File and folder size in Python GitHub repository.
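To make the division of labor concrete, here is a minimal sketch that pulls one small fact from each of the four modules. The function name describe_environment() and the dictionary keys are illustrative choices, not part of any of these libraries.

```python
import os
import subprocess
import sys
from pathlib import Path


def describe_environment():
    """Collect one small fact from each of the four modules (illustrative helper)."""
    # subprocess: run a child Python interpreter and capture what it prints.
    child = subprocess.run(
        [sys.executable, "-c", "print('hello from a subprocess')"],
        capture_output=True,
        text=True,
    )
    return {
        "os_name": os.name,                      # os: 'posix' on Linux/macOS, 'nt' on Windows
        "python_version": sys.version_info[:2],  # sys: information about the interpreter
        "current_dir": Path.cwd(),               # pathlib: the working directory as an object
        "child_output": child.stdout.strip(),    # subprocess: captured standard output
        "child_returncode": child.returncode,    # subprocess: exit status of the child
    }
```

Printing the returned dictionary on your own machine is a quick way to see which values depend on the operating system and which depend on the interpreter.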

os.stat().st_size

In this method, we’re going to use the stat() function from the os module. It returns a lot of information about a specific path.

Note: The os.path.getsize() function also gets the job done and returns the same st_size value. Both follow symbolic links by default; to get the size of the link itself rather than its target, use os.lstat() or os.stat(path, follow_symlinks=False).
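As a small check of how symbolic links affect the reported size, the sketch below compares the three calls on the same path. The helper name compare_sizes() is hypothetical, and the example assumes a platform where symbolic links are available (Linux or macOS).

```python
import os


def compare_sizes(path):
    """Return the size seen through a link and the size of the link itself."""
    followed = os.stat(path).st_size       # follows symbolic links by default
    via_getsize = os.path.getsize(path)    # also follows symbolic links
    link_itself = os.lstat(path).st_size   # describes the link, not its target
    return followed, via_getsize, link_itself
```

When the path is a regular file, all three values are equal. When it is a symbolic link, the first two describe the target file while the third describes the link entry.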

Before we go any further, let’s create a test file called lorem.txt, into which we’ll paste some dummy text. We can go to a Lorem Ipsum text generator and paste the generated text into the lorem.txt file.

In the same folder, create a file named method1.py and paste the code below:

import os

# st_size holds the size of the file in bytes
size = os.stat('lorem.txt').st_size
print(size)

Let’s break down what we’re doing with this code:

  • In the first line, we import the os module
  • The size variable contains the size of the file lorem.txt
    • The os.stat() function returns a lot of information related to the file
    • The st_size attribute holds the size of the file
  • We print the size variable

Try running the Python script. You’ll get a different result depending on the content of your lorem.txt file.

Output:

20064
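Since os.stat() returns more than the size, here is a minimal sketch that reads a few other fields from the same result object. The helper name summarize() and the dictionary keys are illustrative assumptions; the attributes themselves (st_size, st_mtime, st_mode) are standard.

```python
import os
import stat
from datetime import datetime


def summarize(path):
    """Return a few commonly used fields from os.stat() for one path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,                           # size in bytes
        "modified": datetime.fromtimestamp(info.st_mtime),    # last modification time
        "is_regular_file": stat.S_ISREG(info.st_mode),        # True for ordinary files
        "is_directory": stat.S_ISDIR(info.st_mode),           # True for directories
    }
```

Calling summarize('lorem.txt') makes a single system call and gives you the size together with the other details, which is cheaper than querying each one separately.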

The output is expressed in bytes. This isn’t very readable, so let’s humanize it to get a better idea of the size of the file.

First, install the humanize package by running the following command in your shell:

pip install humanize

Then you can use the naturalsize() function, which converts a value in bytes into a readable file size, for example KB, MB, GB, or TB.

import os
from humanize import naturalsize

size = os.stat('lorem.txt').st_size

print(size)               # raw size in bytes
print(naturalsize(size))  # human-readable size

The code above first prints the size of the file in bytes and then prints the result in a readable format.

Output:

20064
20.1 kB
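If installing a third-party package is not an option, a rough stand-in can be written with the standard library alone. The sketch below is not the humanize implementation; format_size() is a hypothetical helper that assumes decimal units (1 kB = 1000 bytes) and one decimal place.

```python
def format_size(num_bytes):
    """Format a byte count with decimal units (1 kB = 1000 bytes)."""
    units = ["Bytes", "kB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            if unit == "Bytes":
                return f"{int(value)} Bytes"
            return f"{value:.1f} {unit}"
        value /= 1000


print(format_size(20064))  # 20.1 kB
```

For anything beyond quick scripts, the humanize package remains the more complete choice, since it also supports binary units and other formats.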

Using pathlib

Although pathlib is designed to work exclusively with paths, it incorporates some useful functions from other modules as methods of Path objects (instances of the Path class).

Create a file method2.py and import the Path class.

from pathlib import Path

Then create a Path object, passing the path to the lorem.txt file as an argument.

file_ = Path('lorem.txt')

Now you have access to the stat() method of the Path class. It works the same as the os.stat() function, so you can print the size of the file.

print(file_.stat().st_size)

Output:

20064

As you can see, we got the same result as with the first method. This result is also printed in bytes, so we can use the humanize module to make it readable.

from pathlib import Path
from humanize import naturalsize

size = Path('lorem.txt').stat().st_size

print(naturalsize(size))

This code produces the next output:

20.1 kB
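Path.stat() raises FileNotFoundError when the path does not exist, so a script that accepts arbitrary paths may want to guard against that. A minimal sketch, assuming you prefer None over an exception (size_or_none() is a hypothetical helper name):

```python
from pathlib import Path


def size_or_none(path):
    """Return the size in bytes of a regular file, or None if it is missing or not a file."""
    candidate = Path(path)
    if not candidate.is_file():   # False for missing paths and for directories
        return None
    return candidate.stat().st_size
```

Returning None keeps the calling code simple, at the cost of hiding why the size is unavailable; raising the original exception is the better choice when the caller needs that detail.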

Using Unix commands with subprocess

The subprocess module lets us call and manage subprocesses from Python. Therefore, we can run any command and handle its output directly in Python.

Note: This method only works if you are using a Unix operating system (Linux, macOS).

Open a file method3.py and paste the code under:

from subprocess import run

# Run "du lorem.txt" and capture its output as a string
process = run(['du', 'lorem.txt'], capture_output=True, text=True)

print(process.stdout)

Diving into this piece of code:

  • We import the run function from the subprocess module
  • The process variable contains the result of running the command du lorem.txt
    • du is a Unix utility that reports the disk space used by a file
    • capture_output gives us access to the stdout attribute
    • text means we store the output as a string instead of bytes
  • We print the standard output of the process
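The effect of capture_output and text can be seen in isolation with a small experiment. To keep it runnable on any platform, this sketch launches a child Python interpreter instead of du; the command and its printed text are stand-ins chosen for illustration.

```python
import subprocess
import sys

# A stand-in command: a child Python process that prints a du-like line.
COMMAND = [sys.executable, "-c", "print('20\\tlorem.txt')"]

as_text = subprocess.run(COMMAND, capture_output=True, text=True)
as_bytes = subprocess.run(COMMAND, capture_output=True)

print(type(as_text.stdout).__name__)   # str, because text=True decodes the output
print(type(as_bytes.stdout).__name__)  # bytes, the default when text is not set
print(as_text.returncode)              # 0, the child exited successfully
```

Without capture_output=True, the stdout attribute is None and the child writes straight to your terminal.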

Running the code above will give you the following output:

20      lorem.txt

As you can see, it gives us the size and name of the file. If you just want the size of the file, split the output (remember it’s a string) and print the first element.

from subprocess import run

process = run(['du', 'lorem.txt'], capture_output=True, text=True)

# The output looks like "20<tab>lorem.txt"; keep only the first field
size = process.stdout.split()[0]

print(size)

Output:

20

This output isn’t very readable. We can infer that the unit of measurement is KB (because of the previous methods), but nobody else could guess the size of the file from it.

To solve this problem, we can use the -h (human-readable) flag.

Note: You can read the manual for this command by running man du or du --help.

from subprocess import run

# -h asks du for a human-readable size such as 20K
process = run(['du', '-h', 'lorem.txt'], capture_output=True, text=True)

size = process.stdout.split()[0]

print(size)

Now the output of this script will be much more readable:

20K
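A string such as 20K is easy to read but awkward to compute with. If you need a number, one option is to ask du for a fixed unit with the POSIX -k flag (1024-byte units) and convert the first field to an integer. This is a sketch under the same Unix-only assumption as above; disk_usage_kib() is a hypothetical helper name.

```python
import subprocess


def disk_usage_kib(path):
    """Return the disk usage of `path` in 1024-byte units, as reported by `du -sk`."""
    process = subprocess.run(
        ["du", "-sk", str(path)],   # -s: one summary line, -k: 1024-byte units
        capture_output=True,
        text=True,
        check=True,                 # raise CalledProcessError if du reports an error
    )
    return int(process.stdout.split()[0])
```

Passing check=True means a missing file surfaces as an exception instead of an empty stdout that would fail later at the split() step.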

To learn more about the subprocess module and its possible uses, check out our Python subprocess guide.

Get the size of a folder recursively

To find the size of a folder, you need to iterate over every file in the folder and its subfolders. We’ll do it with two methods:

  • Iterating over a path with pathlib
  • Using the du command with subprocess

The following code uses a path to a test folder in my home directory. You need to replace that path with the folder whose size you want to get.

Iterating over a path with pathlib

Let’s see how you can get the size of a folder by adding up the sizes of its files.

from pathlib import Path
from humanize import naturalsize

def get_size(path = '.'):
    size = 0

    # rglob('*') walks the folder recursively
    for file_ in Path(path).rglob('*'):

        size += file_.stat().st_size
    
    return naturalsize(size)

test_path = Path.home() / 'Documents/tests/'

print(get_size(test_path))

This piece of code looks a bit scary, so let’s break down what each part does.

  • Import the Path class and the naturalsize() function
  • Define the get_size() function with a path parameter, which points to the current directory by default
  • The size variable is just an accumulator to which we add the size of each file
  • Iterate over each file in the path
    • The rglob() method recursively returns the files that match the pattern
    • rglob('*') means we get all the files in the folder
  • Get the size of each file and add it to the size variable
  • Return the size variable in a human-readable format

Of course, I’m testing the function with a directory that exists only on my machine. Don’t forget to change the path to a folder that exists on your computer.

In my case I get the next output:

403.4 MB
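One detail worth knowing: rglob('*') yields subdirectories as well as files, so the loop above also adds the size of each directory entry to the total. If you want only file contents counted, a variant like the following filters with is_file(). It is a sketch rather than a replacement for the function above; get_size_bytes() is a hypothetical name, and it returns raw bytes so that formatting stays a separate step.

```python
from pathlib import Path


def get_size_bytes(path="."):
    """Sum the sizes of regular files under `path`, skipping directory entries."""
    return sum(
        entry.stat().st_size
        for entry in Path(path).rglob("*")
        if entry.is_file()   # note: this follows symbolic links to files
    )
```

Because is_file() follows symbolic links, a file reachable through a link is counted once per path that leads to it; add a check with is_symlink() if that matters for your folder.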

Using the du command with subprocess

This approach has some advantages:

  • The result is slightly more accurate, since du reports the space the files actually occupy on disk
  • It’s much faster

from subprocess import run
from pathlib import Path

test_path = Path.home() / 'Documents/tests/'

# -s prints a single total for the folder, -h makes it human-readable
process = run(['du', '-sh', test_path], capture_output=True, text=True)

size = process.stdout.split()[0]

print(size)

We use the same approach as in method 3, but this time we get the size of a folder instead of a file.

Output:

481M

As you can see, these two ways of getting the size of a folder give slightly different results. The bigger the directory, the greater the difference.
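One likely reason for the gap is that the two methods measure different things: st_size is the length of a file’s content, while du reports the space allocated for it on disk. On Unix systems you can compare both figures from Python, because os.stat() also exposes st_blocks, the number of 512-byte blocks allocated. This is a sketch for Unix only (st_blocks is not available on Windows), and apparent_and_allocated() is a hypothetical helper name.

```python
import os


def apparent_and_allocated(path):
    """Return (content length, allocated bytes) for one file on a Unix system."""
    info = os.stat(path)
    apparent = info.st_size            # bytes of content
    allocated = info.st_blocks * 512   # st_blocks counts 512-byte units
    return apparent, allocated
```

For many small files the allocated figure is typically larger than the content length, because filesystems hand out space in whole blocks.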

It’s up to you to choose between the pathlib and subprocess approaches. If you know you’ll be using Linux every time, use subprocess; otherwise, you can use the pathlib solution.
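If a script has to run on machines where you cannot be sure du exists, the choice can also be made at runtime. The following sketch checks for du with shutil.which() and falls back to a pathlib walk; folder_size() and the shape of its return value are assumptions made for this example.

```python
import shutil
import subprocess
from pathlib import Path


def folder_size(path="."):
    """Return (method, size) using du when it is available, else a pathlib walk."""
    if shutil.which("du"):
        process = subprocess.run(
            ["du", "-sh", str(path)], capture_output=True, text=True
        )
        if process.returncode == 0:
            return "du", process.stdout.split()[0]
    # Fallback: add up the content length of every regular file.
    total = sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())
    return "pathlib", f"{total} bytes"
```

Returning the method name alongside the size makes it clear which kind of measurement you are looking at, since the two are not directly comparable.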

To sum up

Python is extremely useful when interacting with the operating system. With Python you can automate processes and save a lot of time. The main modules for interacting with the operating system are os, sys, pathlib, and subprocess.

In this tutorial you learned:

  • How Python interacts with the operating system
  • How to use built-in modules to perform OS operations
  • How to use the humanize module for human-readable output
  • How to calculate the size of a file with 3 approaches
  • How to calculate the size of a directory recursively or with the du command
