How to Compare Two Directories in Python?

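To compare two directories in Python, you can use the standard library’s filecmp module: filecmp.dircmp lists the entries that both directories share and the entries that exist on only one side, and filecmp.cmp compares the contents of individual files. Here’s an example implementation that walks both directory trees recursively: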

import os
import filecmp

def compare_directories(dir1, dir2):
    # Create a dircmp object to compare directories
    dcmp = filecmp.dircmp(dir1, dir2)

    # Retrieve the common files and subdirectories
    common_files = dcmp.common_files
    common_dirs = dcmp.common_dirs

    # Compare common files
    for file in common_files:
        file1_path = os.path.join(dir1, file)
        file2_path = os.path.join(dir2, file)
        if not filecmp.cmp(file1_path, file2_path, shallow=False):
            # Print the full path so files in nested subdirectories are unambiguous
            print(f"Different contents: {file1_path}")

    # Recursively compare subdirectories
    for subdir in common_dirs:
        subdir1_path = os.path.join(dir1, subdir)
        subdir2_path = os.path.join(dir2, subdir)
        compare_directories(subdir1_path, subdir2_path)

    # Report entries (files or directories) that exist on only one side
    for name in dcmp.left_only:
        print(f"Missing in dir2: {os.path.join(dir1, name)}")
    for name in dcmp.right_only:
        print(f"Missing in dir1: {os.path.join(dir2, name)}")

# Specify the paths of the directories to compare
dir1_path = '/path/to/dir1'
dir2_path = '/path/to/dir2'

# Compare the directories
compare_directories(dir1_path, dir2_path)

In this example, the compare_directories function takes two directory paths as input and compares the directories recursively. It uses the filecmp.dircmp class to create a directory comparison object (dcmp) that provides information about the common files, common subdirectories, and differences between the directories.
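To make the dircmp attributes more concrete, here is a minimal sketch that builds two small throwaway directories and inspects what the comparison object reports. The summarize helper and the file and directory names are hypothetical and exist only for illustration.

```python
import filecmp
import os
import tempfile


def summarize(dir1, dir2):
    """Return the top-level dircmp attributes as a dict of sorted lists."""
    dcmp = filecmp.dircmp(dir1, dir2)
    return {
        "left_only": sorted(dcmp.left_only),
        "right_only": sorted(dcmp.right_only),
        "common_files": sorted(dcmp.common_files),
        "common_dirs": sorted(dcmp.common_dirs),
    }


# Build two small temporary trees (hypothetical names) to inspect.
with tempfile.TemporaryDirectory() as root:
    left = os.path.join(root, "left")
    right = os.path.join(root, "right")
    os.makedirs(os.path.join(left, "shared_dir"))
    os.makedirs(os.path.join(right, "shared_dir"))
    for path in (
        os.path.join(left, "both.txt"),
        os.path.join(right, "both.txt"),
        os.path.join(left, "only_left.txt"),
    ):
        with open(path, "w") as fh:
            fh.write("data")

    summary = summarize(left, right)

print(summary)
```

Note that these attributes describe only the top level of each directory, which is why the main function has to recurse into common_dirs itself.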

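The function then compares the common files by iterating over common_files and calling filecmp.cmp on each pair. Passing shallow=False makes filecmp.cmp read and compare the actual file contents instead of treating files as equal whenever their os.stat() signatures (file type, size, and modification time) match. The function prints a message for each file whose contents differ.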
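The effect of shallow=False can be seen in a small experiment. The sketch below writes two files with the same size but different contents, forces their modification times to match, and compares them both ways. The file names and the timestamp value are arbitrary choices made for this illustration.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    path_a = os.path.join(root, "a.txt")
    path_b = os.path.join(root, "b.txt")
    with open(path_a, "w") as fh:
        fh.write("abc")
    with open(path_b, "w") as fh:
        fh.write("xyz")

    # Force identical timestamps so both files share one os.stat() signature.
    os.utime(path_a, (1_000_000_000, 1_000_000_000))
    os.utime(path_b, (1_000_000_000, 1_000_000_000))

    # shallow=True is the default: only type, size, and mtime are checked.
    shallow_equal = filecmp.cmp(path_a, path_b)
    # shallow=False reads both files and compares their bytes.
    deep_equal = filecmp.cmp(path_a, path_b, shallow=False)

print(f"shallow: {shallow_equal}, full: {deep_equal}")
```

With these inputs the shallow check should report the files as equal, while the full comparison reports them as different. The shallow check is faster, so which mode to use depends on how much you trust size and timestamp as a proxy for content.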

Next, it recursively compares the common subdirectories by calling compare_directories on each pair of subdirectories.

Finally, the function reports entries that exist on only one side: dcmp.left_only holds the names of files and directories found only in dir1, and dcmp.right_only holds those found only in dir2. The function prints a message for each of them.
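If you would rather work with the results than print them, one possible variation is to return the differences as lists. The sketch below is not part of the example above: collect_differences and its prefix parameter are hypothetical names, and it swaps the per-file filecmp.cmp loop for filecmp.cmpfiles, which compares a list of file names in one call. It also assumes that files which cannot be compared should be counted as differing.

```python
import filecmp
import os
import tempfile


def collect_differences(dir1, dir2, prefix=""):
    """Return (differing, only_in_dir1, only_in_dir2) as lists of relative paths."""
    dcmp = filecmp.dircmp(dir1, dir2)

    # cmpfiles returns (match, mismatch, errors); shallow=False compares contents.
    _, mismatch, errors = filecmp.cmpfiles(
        dir1, dir2, dcmp.common_files, shallow=False
    )
    # Assumption for this sketch: files that could not be compared count as differing.
    differing = [os.path.join(prefix, name) for name in mismatch + errors]
    only1 = [os.path.join(prefix, name) for name in dcmp.left_only]
    only2 = [os.path.join(prefix, name) for name in dcmp.right_only]

    # Recurse into subdirectories that exist on both sides.
    for sub in dcmp.common_dirs:
        sub_diff, sub_only1, sub_only2 = collect_differences(
            os.path.join(dir1, sub),
            os.path.join(dir2, sub),
            os.path.join(prefix, sub),
        )
        differing += sub_diff
        only1 += sub_only1
        only2 += sub_only2
    return differing, only1, only2


def write(path, text):
    """Create parent directories as needed and write text to path."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(text)


# Small demonstration on temporary trees (hypothetical file names).
with tempfile.TemporaryDirectory() as root:
    left = os.path.join(root, "left")
    right = os.path.join(root, "right")
    write(os.path.join(left, "a.txt"), "one")
    write(os.path.join(right, "a.txt"), "two")
    write(os.path.join(left, "sub", "b.txt"), "same")
    write(os.path.join(right, "sub", "b.txt"), "same")
    write(os.path.join(left, "extra.txt"), "x")
    write(os.path.join(right, "sub", "new.txt"), "y")

    differing, only_left, only_right = collect_differences(left, right)

print(differing, only_left, only_right)
```

Returning relative paths keeps the output independent of where the two trees live, which makes the result easier to test or to feed into another tool.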

Make sure to replace '/path/to/dir1' and '/path/to/dir2' with the actual paths of the directories you want to compare.
