The tempfile Module in Python

The Python programming language boasts an expansive standard library, providing developers with modules for even the most specific and esoteric needs. In fact, there is a famous XKCD comic in which a character imports "antigravity" and gains the ability to fly.

One of Python's lesser-known modules, tempfile, lets programmers easily create and use temporary files. When a temporary file is created in a with statement, it is deleted automatically as soon as the block exits.
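To make the lifetime of such a file concrete, here is a minimal sketch. The variable names are only illustrative, and the file content is made up for the demonstration.

```python
import os
import tempfile

# Create a named temporary file; it exists only inside the with block.
with tempfile.NamedTemporaryFile(mode="w+", suffix=".txt") as tf:
    tf.write("scratch data")
    tf.flush()   # push buffered text out to the file
    tf.seek(0)   # rewind before reading it back
    contents = tf.read()
    path = tf.name
    existed_inside = os.path.exists(path)

# The with block has exited, so the file has been removed.
existed_after = os.path.exists(path)
print(contents, existed_inside, existed_after)
```

The path stays valid only while the block is open, so anything that needs the data has to read it before the block ends.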

I've found myself using this handy built-in module a couple of times for real-world tasks. For example, many CLI tools ask users for multi-line input by opening an editor. Examples include git, crontab, sudoedit, and even C-x C-e in the Bash shell. We can implement this feature in Python with a four-line function.

import os
import subprocess
import tempfile

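def get_input():
    # The temporary file is deleted when the with block exits.
    with tempfile.NamedTemporaryFile(suffix=".tmp") as tf:
        # Raises KeyError if the EDITOR environment variable is unset.
        subprocess.run([os.environ["EDITOR"], tf.name], check=True)

        # Reopen by name so we read whatever the editor saved.
        with open(tf.name, "r") as f:
            return f.read()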

print(get_input())
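A slightly more defensive variant might look like the sketch below. The name edit_text and its parameters are hypothetical, and the fallback to vi when EDITOR is unset is an assumption rather than a standard. It also pre-fills the file with starting text and splits EDITOR with shlex so that values such as "code --wait" work. Like the function above, it reopens the file by name while it is still open, which works on Unix but may fail on Windows.

```python
import os
import shlex
import subprocess
import sys
import tempfile


def edit_text(initial="", editor=None):
    """Open `initial` in an editor and return the edited text.

    `edit_text` and its parameters are hypothetical names for this sketch.
    """
    # Fall back to vi when EDITOR is unset or empty (an assumption).
    command = editor or shlex.split(os.environ.get("EDITOR") or "vi")
    with tempfile.NamedTemporaryFile(mode="w", suffix=".tmp") as tf:
        tf.write(initial)
        tf.flush()  # make sure the editor sees the initial text
        subprocess.run([*command, tf.name], check=True)
        # Reopen by name: some editors replace the file when saving.
        with open(tf.name, "r") as f:
            return f.read()


# Stand-in "editor" for a non-interactive demo: a child Python process
# that appends one line to the file it is given.
fake_editor = [
    sys.executable,
    "-c",
    "import sys; open(sys.argv[1], 'a').write('edited\\n')",
]
result = edit_text("draft\n", editor=fake_editor)
print(result)
```

Passing the editor command as a parameter keeps the function easy to exercise in scripts and tests, where no interactive editor is available.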

As another example, I needed to fetch the subtitles of a YouTube video. By calling the external CLI tool yt-dlp in a temporary directory, we can easily return the subtitles as a string. As soon as the function returns the string, the temporary directory and everything in it is deleted automatically. Better yet, many Linux distributions mount /tmp as tmpfs, which is backed by memory, so on those systems the subtitles are typically never even written to disk.

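import subprocess
import tempfile
from pathlib import Path

def get_subtitles(url):
    with tempfile.TemporaryDirectory() as d:
        # Run yt-dlp inside the temporary directory via cwd= rather than
        # os.chdir(), which would leave this process in a deleted directory.
        subprocess.run(
            [
                'yt-dlp',
                url,
                '--skip-download',
                '--write-sub',
                '--write-auto-sub',
                '--sub-lang',
                'en.*',
            ],
            cwd=d,
            check=True,
        )
        # Read the first subtitle file before the directory is removed.
        return next(Path(d).glob('*.vtt')).read_text()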
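The same pattern can be tried without yt-dlp or network access. The sketch below is a generalization under stated assumptions: run_in_scratch_dir and its parameters are hypothetical names, and a child Python process stands in for the external tool. It passes the directory through the cwd argument of subprocess.run, which leaves the calling process's working directory untouched, and it returns None instead of raising when no file matches.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_scratch_dir(command, pattern):
    """Run `command` in a temporary directory and return the text of the
    first file matching `pattern`, or None if nothing matched.

    The function name and parameters are hypothetical.
    """
    with tempfile.TemporaryDirectory() as d:
        # cwd= affects only the child process, not this one.
        subprocess.run(command, cwd=d, check=True)
        match = next(Path(d).glob(pattern), None)
        # Read the file before the directory is cleaned up on exit.
        return match.read_text() if match else None


# Stand-in for an external tool: a child process that writes one file.
writer = [sys.executable, "-c", "open('demo.vtt', 'w').write('WEBVTT')"]
text = run_in_scratch_dir(writer, "*.vtt")
missing = run_in_scratch_dir(writer, "*.srt")
print(text, missing)
```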

Python has many other obscure modules: for example, tabnanny, which checks for ambiguous indentation in Python scripts; tomllib, which parses TOML files; ipaddress, which manipulates IPv4 and IPv6 addresses; email, which handles MIME and email messages; and protocol clients for SMTP, IMAP, POP3, FTP, and of course HTTP.