
Postgres RLS Is Your Real Tenant Isolation Layer, Not Django

Manual WHERE clauses in your ORM will eventually leak tenant data. Here's how to wire Postgres 17 RLS into Django 6.x properly.

the bug you haven't found yet

Every multi-tenant Django shop I've ever audited has the same architecture: a tenant_id column on every table, a custom manager that injects .filter(tenant_id=request.tenant.id) into every queryset, and a README comment somewhere that says "remember to use the tenant-aware manager". It works fine, until it doesn't.

The failure mode is always the same. Someone writes a raw query for a performance-sensitive report. Or they use Model.objects.filter() directly instead of Model.tenant_objects.filter() because they're in a management command and the request context isn't threaded through. Or a junior dev adds a new model, copies an old one, forgets to set the default manager. Or a library you depend on does an ORM query internally and has no idea your managers exist. The query runs clean in dev and staging because you only ever have one tenant in those environments. It hits production and suddenly tenant A is reading tenant B's invoices.

We've seen this exact scenario play out during security audits we've run for clients, and the terrifying part is that it often doesn't show up in application logs as an error. The query succeeds. The data comes back. It's just the wrong data.

The only reliable fix is to enforce isolation at the database level, where the database itself refuses to return rows that don't belong to the current tenant. Postgres 17 has row-level security and it's been production-ready for years. The reason most Django shops aren't using it isn't technical, it's that the Django docs don't mention it and the ORM makes it easy to pretend the problem is solved.

how postgres RLS actually works

RLS is a Postgres feature where you attach policies to tables, and those policies are evaluated as implicit WHERE clauses on every SELECT, INSERT, UPDATE, and DELETE. The policies can reference session-level configuration variables, which is the key to making this work with a connection pool.

The basic setup looks like this. You enable RLS on a table:

```sql
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
ALTER TABLE invoices FORCE ROW LEVEL SECURITY;
```

The FORCE ROW LEVEL SECURITY is important: without it, the table owner (usually your app's DB user) bypasses policies entirely. Then you write a policy:

```sql
CREATE POLICY tenant_isolation ON invoices
  USING (tenant_id = current_setting('app.current_tenant')::uuid);
```

Now every query against invoices that doesn't set app.current_tenant will return zero rows or throw an error, depending on how you configure the missing-variable behavior. Every query that does set it will only see rows where tenant_id matches.

The current_setting() function reads from the session's GUC (Grand Unified Configuration) namespace. You can set these with SET LOCAL app.current_tenant = '...', where SET LOCAL scopes the setting to the current transaction and discards it when the transaction commits or rolls back. This is crucial for connection pooling, because you don't want one request's tenant bleeding into the next request's connection.
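To make the scoping concrete, here's a minimal psql-style sketch (the UUID is a made-up placeholder):

```sql
BEGIN;
SET LOCAL app.current_tenant = '6f1c2a9e-0000-0000-0000-000000000001';
SELECT current_setting('app.current_tenant', true);  -- the UUID above
COMMIT;
-- Back outside the transaction, the setting has expired with the
-- transaction, so the same call no longer returns a tenant id.
SELECT current_setting('app.current_tenant', true);
```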

One thing that catches people: current_setting() throws an error if the variable isn't set, by default. You want current_setting('app.current_tenant', true) in your policy, where the second argument tells it to return NULL instead of erroring. Then your policy becomes:

```sql
CREATE POLICY tenant_isolation ON invoices
  USING (tenant_id = current_setting('app.current_tenant', true)::uuid);
```

This way, if somehow a query runs without the tenant context set, it just returns nothing rather than crashing. Fail safe over fail loud, at least until you've got proper alerting in place.

the connection pool problem

The reason most people avoid session-level variables with connection pools is that they're scared of state leaking between requests. That fear is valid but the solution is simple: always use SET LOCAL inside a transaction, never SET.

SET persists for the lifetime of the connection. SET LOCAL persists only for the current transaction and is automatically reset on COMMIT or ROLLBACK. Since every Django request runs inside a transaction (or should, with ATOMIC_REQUESTS = True), you can safely set app.current_tenant at the start of each request transaction and it'll be gone before the connection returns to the pool.

With PgBouncer in transaction pooling mode (which is what you should be using for a SaaS app under any real load), this works correctly because PgBouncer recycles connections at the transaction boundary, exactly when your SET LOCAL variables expire anyway.

The Django side of this lives in middleware. Here's a minimal implementation:

```python
from django.db import connection, transaction

class TenantMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        tenant = getattr(request, 'tenant', None)
        if tenant is None:
            return self.get_response(request)
        # SET LOCAL only takes effect inside a transaction block, so open
        # one here that spans the rest of the request.
        with transaction.atomic():
            with connection.cursor() as cursor:
                cursor.execute(
                    "SET LOCAL app.current_tenant = %s",
                    [str(tenant.id)],
                )
            return self.get_response(request)
```

One subtlety worth spelling out: SET LOCAL outside a transaction block is silently ignored (Postgres just emits a warning), and ATOMIC_REQUESTS = True wraps the view, not the middleware chain. That's why the middleware opens its own transaction.atomic() block around the rest of the request: the setting is in place before any view code runs and expires when the transaction does.

One gotcha: if you're using Django's async views or sync_to_async, the connection handling gets more complex because async views can share connections across coroutines. For fully async workloads you'll want to ensure each async context gets its own connection and its own SET LOCAL call. Django 6.x has improved async ORM support significantly, but this particular pattern still requires careful attention at the boundary.
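The underlying principle, independent of Django's internals, is that tenant context has to live per execution context, not per thread or per pooled connection. A self-contained sketch of that idea with Python's contextvars (no Django involved, all names made up):

```python
import asyncio
import contextvars

# Hypothetical context variable holding the current tenant id. Each
# asyncio task runs in a copy of the context, so a set() inside one
# task never leaks into a concurrently running task.
current_tenant_id = contextvars.ContextVar("current_tenant_id", default=None)

async def handle_request(tenant_id):
    current_tenant_id.set(tenant_id)
    await asyncio.sleep(0)  # yield, letting the other coroutine interleave
    # Despite the interleaving, each task still sees its own tenant.
    return current_tenant_id.get()

async def main():
    return await asyncio.gather(
        handle_request("tenant-a"),
        handle_request("tenant-b"),
    )

print(asyncio.run(main()))  # ['tenant-a', 'tenant-b']
```

The same isolation property is what you need from whatever mechanism hands a connection (and its SET LOCAL) to an async view.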

policies that survive JOIN-heavy queries

Simple USING policies work fine for single-table reads, but multi-tenant SaaS apps have JOINs everywhere, and that's where things get interesting.

Suppose you have invoices joined to line_items, and both have tenant_id. If you enable RLS on both tables with the same policy, Postgres evaluates the policy on each table independently during the JOIN. You don't need to do anything special; the isolation holds at every table in the query plan. This is one of the genuinely nice properties of RLS: it's applied per-table, not per-query.
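For instance, giving line_items the same treatment (assuming, as above, that it carries its own tenant_id column and an invoice_id foreign key):

```sql
ALTER TABLE line_items ENABLE ROW LEVEL SECURITY;
ALTER TABLE line_items FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON line_items
  USING (tenant_id = current_setting('app.current_tenant', true)::uuid);

-- Both policies are applied independently in the JOIN below; neither
-- side of the join can smuggle in another tenant's rows.
SELECT i.id, li.id
FROM invoices i
JOIN line_items li ON li.invoice_id = i.id;
```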

Where it gets trickier is when you have tables that don't have a direct tenant_id column, because they're logically tied to a tenant through a foreign key chain rather than a direct column. Say you have audit_logs that references invoices.id but doesn't store tenant_id directly. You've got two choices: denormalize and add tenant_id to audit_logs (my preference, and it makes the policy trivial), or write a subquery policy:

```sql
CREATE POLICY tenant_isolation ON audit_logs
  USING (
    invoice_id IN (
      SELECT id FROM invoices
      WHERE tenant_id = current_setting('app.current_tenant', true)::uuid
    )
  );
```

This works but it's a correlated subquery that runs for every row evaluation. Postgres's planner is usually smart enough to push it into a hash join, but you should EXPLAIN ANALYZE your most join-heavy queries after adding policies to make sure you're not introducing unexpected seq scans. I've seen cases where adding a policy caused Postgres to abandon a perfectly good index path because the subquery in the policy changed the cost estimate enough to tip the optimizer's decision.
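Checking a policy's effect on a plan is straightforward: run the candidate query inside a transaction with the tenant variable set, so the policy predicate is live (the UUID is a placeholder):

```sql
BEGIN;
SET LOCAL app.current_tenant = '6f1c2a9e-0000-0000-0000-000000000001';
-- The plan now includes the policy's subquery; look for a hash or
-- nested-loop join against invoices rather than a filter per row.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM audit_logs;
ROLLBACK;
```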

Denormalizing tenant_id into every table is more storage, but it keeps your policies simple, your query plans predictable, and your EXPLAIN output readable. On Postgres 17 with its improved statistics and planner, you're also more likely to get sensible multi-column index usage if tenant_id is physically present in the table. A composite index on (tenant_id, id) or (tenant_id, created_at) on your high-traffic tables will do more for query performance than almost anything else in this stack.
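As a sketch, on the invoices table that might look like this (index name and created_at column are assumptions):

```sql
-- Serves both the policy predicate and tenant-scoped list views;
-- CONCURRENTLY avoids blocking writes while the index builds.
CREATE INDEX CONCURRENTLY idx_invoices_tenant_created
    ON invoices (tenant_id, created_at);
```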

wiring this into django 6.x without surprises

Django 6.x doesn't have native RLS support, and I don't expect it to gain any for a while, so you're wiring this up manually. The middleware approach above covers web requests, but you've got other code paths to think about: management commands, Celery tasks, and any direct database connections your background workers open.

For Celery, the cleanest pattern is a task base class that accepts a tenant_id kwarg and sets the session variable before running:

```python
from celery import Task
from django.db import connection, transaction

class TenantTask(Task):
    abstract = True

    def __call__(self, *args, **kwargs):
        tenant_id = kwargs.pop('tenant_id', None)
        if tenant_id is None:
            return super().__call__(*args, **kwargs)
        # As in the middleware, SET LOCAL needs an enclosing transaction,
        # which also keeps the tenant setting alive for the task body.
        with transaction.atomic():
            with connection.cursor() as cursor:
                cursor.execute(
                    "SET LOCAL app.current_tenant = %s",
                    [str(tenant_id)],
                )
            return super().__call__(*args, **kwargs)
```

Then every tenant-scoped task inherits from TenantTask and gets called with some_task.delay(tenant_id=str(tenant.id), ...). It's a bit of boilerplate but it makes the tenant context explicit at the call site, which is exactly what you want.

Management commands are trickier because they often need cross-tenant access for things like aggregation, reporting, or backfill jobs. The right move is to create a separate database user for admin/superuser operations that's explicitly excluded from RLS policies:

```sql
CREATE USER app_admin WITH PASSWORD '...';
GRANT ALL ON ALL TABLES IN SCHEMA public TO app_admin;
-- Let this role skip policy checks on every RLS-enabled table.
-- (Ordinary roles are not exempt; only superusers and roles with
-- BYPASSRLS get past FORCE ROW LEVEL SECURITY.)
ALTER USER app_admin BYPASSRLS;
```

You configure this in Django's DATABASES setting with multiple database connections: the default one uses the RLS-enforced role, and you add an admin connection that uses the unrestricted role, used only in management commands that have a legitimate reason to see across tenants.
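A sketch of what that dual-connection setting might look like, assuming roles named app_user (RLS-enforced) and app_admin (BYPASSRLS); host, database name, and passwords are placeholders:

```python
# settings.py (sketch)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "saas",
        "USER": "app_user",        # RLS-enforced role: every request
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",
        "PORT": "5432",
        "ATOMIC_REQUESTS": True,   # each request runs in a transaction
    },
    "admin": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "saas",
        "USER": "app_admin",       # BYPASSRLS role: audited admin jobs only
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",
        "PORT": "5432",
    },
}
```

Management commands that legitimately need cross-tenant access then route explicitly with Model.objects.using("admin"), which keeps every cross-tenant query visible at the call site.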

One thing we enforce in our own projects: never let the admin connection touch any code path that could be triggered by user input. It's only for explicit, audited admin operations. The regular app connection is the one that handles every request, every API call, every webhook.

what you actually gain

Once this is in place, a developer can write Invoice.objects.all() in a view and get back only the current tenant's invoices, with no custom manager, no filter annotation, no nothing. The database enforces it. A raw SQL query through connection.cursor() gets the same isolation. A third-party library that runs its own queries gets the same isolation. A forgotten objects.filter() call instead of a custom manager doesn't blow the whole thing open.

There's also a meaningful difference in how you think about security audits. With ORM-layer filtering, proving to an auditor that tenant isolation is correct means tracing every code path that touches a tenant-scoped model and verifying that each one uses the right manager or annotation. With RLS, you show the auditor the policies, the middleware that sets the session variable, and the test suite that verifies the policy is enforced when the variable is missing or wrong. That's a much tighter argument.

The performance overhead is real but modest. Postgres evaluates the policy predicate as part of the query plan, and with proper indexing on tenant_id the incremental cost is negligible. We've run this in production on tables with tens of millions of rows and the policy evaluation doesn't show up as a meaningful contributor in pg_stat_statements. The subquery-style policies can be more expensive, which is another reason to denormalize.

The setup isn't zero-cost. You've got to apply policies to every table, handle the middleware and task boilerplate, manage the dual-connection setup for admin operations, and write tests that verify the isolation actually holds. At steezr we treat this as a one-time setup cost that pays for itself the first time a developer writes a slightly wrong query and nothing terrible happens. Given how many near-miss data leaks we've seen in audits of apps that relied purely on ORM-layer filtering, that trade-off is extremely easy to justify.

Written by Johnny Unar
