Some of the comments on r/programming and r/PostgreSQL are hilarious. More serious ones on lobste.rs.

SQLite and DuckDB have earned their popularity in the data world, and for good reason. I’m a big fan of both. Their appeal is simple: they’re just files. You can see, copy, and move them around like any other file.

Of course, all databases are “just files” at the end of the day. But you can’t exactly copy-paste a PostgreSQL cluster as you can with a SQLite file. That said, with a little effort, you can simplify the process and make PostgreSQL feel much more approachable.

Here’s my core belief: if you take the time to understand how the components of your database system fit together (not even the internals, just the external pieces), you’ll move faster, debug more effectively, and build more confident workflows. It only takes a small upfront investment in learning the basics and trusting that you’ll be able to build on that foundation over time.

Veil of Ignorance

Most people install PostgreSQL through a package manager—sudo apt install postgresql-17—and promptly forget about it. It becomes a kind of black box: a service you interact with via psql, but rarely configure, inspect, or move. When something doesn’t work—say, a permissions issue—they just prepend sudo -u postgres and hope it goes away.

Another common approach is to spin up PostgreSQL in Docker containers. That works, of course, but in my experience, the tooling can be clunky and verbose for day-to-day development.

But it’s that casual sudo that does the most damage, not just because it can be destructive, but because it reinforces a layer of detachment between you and the system.

What I’ve learned is that if you’re willing to peel back that layer and start doing a few things manually, your data workflows get a lot smoother.

Lifting the Veil

Here’s the bigger picture:

When you install PostgreSQL via a package manager, it sets up a few defaults:

  • A system user named postgres
  • A data directory like /var/lib/postgresql/17/main
  • The database binary at /usr/lib/postgresql/17/bin/postgres
  • And a default database also named postgres

This overuse of the name postgres—user, binary, database—can get confusing fast. It’s no surprise beginners often mix them up. Let’s forget about all this and assume we’re just a normal everyday user.

At its core, postgres is simply a program that turns SQL queries into filesystem operations. A CREATE DATABASE becomes a mkdir, a CREATE TABLE becomes a new file, and an UPDATE eventually becomes: open a file, write to it, close it. It’s a complex and powerful system—but fundamentally, it’s just an executable that manipulates files. We’ll see this first-hand in a moment.

These files live in the so-called data_directory, often referenced by the PGDATA environment variable. To create that directory from scratch, you can run:

initdb /tmp/db0
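If initdb isn’t on your PATH (package managers often don’t put it there), it lives next to the postgres binary, e.g. /usr/lib/postgresql/17/bin/initdb on Debian-style systems. And there’s no magic in what it produces; it’s an ordinary directory. The listing below is trimmed, and the exact contents vary a little between versions:

ls /tmp/db0
# PG_VERSION  base  global  pg_hba.conf  pg_wal  pg_xact  postgresql.auto.conf  postgresql.conf  ...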

Now you can start a PostgreSQL server using:

PGPORT=1991 postgres -D /tmp/db0 -c shared_buffers="10GB"

This launches the database server on top of that directory. We’re also overriding the default shared_buffers setting here. The server picks up PGPORT as the port to listen on, and client tools like psql read the same variable when connecting, so it matters on both sides.
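One gotcha before going further, and it depends on how your postgres binary was built: Debian and Ubuntu packages compile in /var/run/postgresql as the default Unix-socket directory, which a regular user can’t write to, so the server may die at startup with a “could not create lock file” error. If you hit that, point the socket somewhere writable (and pass the same directory to clients later, e.g. psql -h /tmp):

PGPORT=1991 postgres -D /tmp/db0 -c unix_socket_directories=/tmp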

Package managers usually wrap this whole process in a system service that runs in the background. But under the hood, it’s basically just running that same command—plus a few more options.
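You don’t have to take that on faith. On a machine where the packaged service is running, the process list shows the exact invocation:

ps -o args= -C postgres

On a Debian-style install, the first line should resemble /usr/lib/postgresql/17/bin/postgres -D /var/lib/postgresql/17/main -c config_file=/etc/postgresql/17/main/postgresql.conf. The same command, just with the packaged paths.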

One of those options is the config_file, which lets you point to a custom postgresql.conf. Here’s an example you could save as db0.pgconf:

shared_buffers = 16GB
work_mem = 64MB
max_worker_processes = 24

Then you’d start the server like this:

PGPORT=1991 postgres -D /tmp/db0 -c config_file=./db0.pgconf

And to connect:

PGPORT=1991 psql postgres
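While connected, you can also verify the earlier claim that SQL statements are filesystem operations in disguise. The database name demo is an arbitrary choice here, and the numeric directory name (the new database’s OID) will differ on your machine:

PGPORT=1991 psql postgres -c "CREATE DATABASE demo"
ls /tmp/db0/base

A new numbered directory appears under base/, and the files inside it are the new database’s tables and indexes.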

When you’re done, stop the server (Ctrl+C works, since it’s running in the foreground) and just rm -rf /tmp/db0.

Pitfalls

To be clear, this is mostly a development workflow. You wouldn’t manage production Postgres by juggling folders and environment variables by hand. But for local setups, prototyping, testing, or even sharing demos, it’s powerful. You can version entire database clusters using git, store config files alongside your code, and reproduce environments with a simple script.
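Here’s a minimal sketch of such a script. The file name start-db.sh and the port are my own choices, and it assumes the db0.pgconf from earlier sits next to it:

#!/bin/sh
# start-db.sh: run a throwaway development cluster in the foreground
set -e

export PGPORT=1991
DB=/tmp/db0

# Create the cluster only on the first run
[ -d "$DB" ] || initdb "$DB"

# Foreground process: logs go to the terminal, Ctrl+C stops it
exec postgres -D "$DB" -c config_file=./db0.pgconf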

Should you decide to follow this path, though, you need absolute clarity over which executable is being run at any moment, which configuration parameters are in effect, and where they are taken from. I also recommend running everything as a foreground process, especially for development. Otherwise you can easily drive yourself to insanity, tweaking a configuration parameter in one file while the actual running process is reading another.
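Fortunately, you can ask the running server directly. These are stock PostgreSQL facilities; only the port is specific to our examples:

PGPORT=1991 psql postgres -c "SHOW config_file" -c "SHOW data_directory"
PGPORT=1991 psql postgres -c "SELECT name, setting, source, sourcefile FROM pg_settings WHERE name = 'shared_buffers'"

The source and sourcefile columns tell you exactly where a setting came from: a configuration file (with the path spelled out), the command line, or the built-in default.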

Why This Matters

Understanding how PostgreSQL actually works—how it’s started, where the files live, what the key binaries do—puts real power in your hands. It removes the mystery. It means you’re no longer just reacting when something breaks; you’re diagnosing, tweaking, even optimizing.

You don’t have to dive into internals or become a database engineer. You just need a mental model of the system as a set of files, a process, and a config. Once that veil of ignorance is lifted, everything becomes simpler: debugging, provisioning, versioning, backups, and even just experimenting with settings.

This is why SQLite and DuckDB are so beloved: they give you that power by default. PostgreSQL can give you the same—once you know how to take it.

If you’re building systems, writing data pipelines, or just want to feel more in control of your dev environment, that small bit of upfront learning pays off fast. You’ll move with more confidence, build cleaner workflows, and stop treating your database like a black box.

Plus, who knows? You may fall in love with PostgreSQL and start digging a bit deeper.