I rewrote my auth layer five times
Estimated reading time: 22 minutes
From sessions to dual tokens, bitmaps, and per-request deny checks.
A first-person retrospective, not a "how to do auth right" essay.
I'll walk through the five versions the summerrs-admin auth layer went through: what I disliked at each stage, why I changed it, and what new problems showed up. If you're building something similar, this might save you a few months of wandering.
0. Where I landed
The current shape:
Four keywords: dual tokens / access token serves business / bitmap-compressed permissions / optional per-request deny check.
To explain why it ended up this shape, I have to start from the most naive v1.
1. v1: Plain sessions, Redis on every request
I started simple. Login issues a UUID, the user info goes into Redis with that UUID as key, every request brings the UUID, the middleware fetches user info from Redis and stuffs it into request extensions.
Pros are obvious: simple. One GET per check, kick is one DEL, change permissions by overwriting the value, business code reads whatever it needs.
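A minimal sketch of the v1 shape, with a HashMap standing in for Redis. The type, field, and method names are illustrative assumptions, not taken from the project; the point is only that check, kick, and update are each a single key operation.

```rust
use std::collections::HashMap;

/// Hypothetical user record; the field names are illustrative only.
#[derive(Clone, Debug, PartialEq)]
struct UserInfo {
    login_id: u64,
    roles: Vec<String>,
}

/// In-memory stand-in for Redis: session UUID -> user info.
#[derive(Default)]
struct SessionStore {
    sessions: HashMap<String, UserInfo>,
}

impl SessionStore {
    /// Login: store the user under a freshly issued session id.
    /// Calling it again with the same id overwrites the value.
    fn login(&mut self, session_id: &str, user: UserInfo) {
        self.sessions.insert(session_id.to_string(), user);
    }

    /// Per-request check: one lookup (one GET in the Redis version).
    fn authenticate(&self, session_id: &str) -> Option<UserInfo> {
        self.sessions.get(session_id).cloned()
    }

    /// Kick: one delete (one DEL in the Redis version).
    fn kick(&mut self, session_id: &str) -> bool {
        self.sessions.remove(session_id).is_some()
    }
}
```

Every call to `authenticate` corresponds to one network round-trip in the real setup, which is exactly the cost described next.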
But pretty quickly I felt off:
Every single request — list pages, config changes, button clicks — runs a Redis round-trip.
At 10 QPS who cares. At scale, Redis is a hotspot. And fundamentally this is just session-cookie auth wearing a JWT-shaped costume. I wrote a bunch of Rust on axum and ended up with the same shape as $_SESSION from PHP days.
Pain point: Mandatory Redis on every request. Doesn't scale.
2. v2: I read open-source admin code. I couldn't follow their trade-offs.
Time to see how others do it. I opened a few thousand-star admin projects and decoded the tokens they hand out. The results surprised me.
A disclaimer up front: the projects below are thousand-star, production-running, mature systems. "I can't follow their trade-offs" ≠ "they got it wrong" — they almost certainly carry constraints I can't see (legacy weight, team habits, compatibility with some frontend SDK, compliance / audit requirements).
I'm only saying that as a brand-new project starting from zero, I can't copy these designs — not for moral reasons, but because once I copy them I can't explain why each field exists.
2.1 Classic flavor: JWT wrapping a UUID
Many projects return tokens like:
Decoded payload:
One field. One UUID. That UUID is the Redis key for fetching user info.
I can't see the upside. How is this different from my v1, fundamentally? The only difference is wrapping the UUID in JWT:
- No user info in the token (still hits Redis)
- No expiry field (no exp in payload)
- No refresh token
So why bother? You pay signing/verification cost without getting any of the benefits JWT was designed to give. HMAC prevents UUID tampering — but a UUID is already an unguessable random string; a tampered one just won't match anything.
The only visible "benefit" is matching the expectation that "modern projects use JWT." That's aesthetics, not engineering.
2.2 Frankenstein flavor: JWT with sessionId + tokenVersion
Some go further:
JWT with sessionId. Plus tokenVersion (for global invalidation?). Plus access/refresh distinction.
Now you maintain two worlds:
- JWT signature + expiry verification
- sessionId → Redis lookup (no savings)
- tokenVersion in some table
If your business genuinely demands this kind of "dual state" (think finance, or compliance rules requiring forced logout), the combo is actually a reasonable compromise — keep JWT's portability, but retain a central control point via sessionId/tokenVersion. For a generic admin panel though, complexity doesn't match the payoff — and if I copy it, I won't be able to articulate the contract of each field.
2.3 Raw UUID flavor
Some don't bother with JWT at all. Login returns UUID, straight to Redis.
Identical to my v1, same problems. I actually like this one better — at least it's not pretending to be "stateless."
After looking around, I didn't find what I wanted. Most projects either don't use JWT's stateless property (treating it as a random-string wrapper) or pull state right back in (JWT carrying sessionId).
So I decided to go to the other extreme — if I'm going to use JWT, I'll use it all the way. Stuff all the user info in the payload.
Enter v3.
3. v3: Stuff user info in the JWT payload, truly stateless
JWT's selling point is claims — the server signs data into the token, the client carries it back, no lookups needed.
People are nervous about this because the payload is base64 ("not safe!"). But JWT was never meant to encrypt — its guarantee is "you can't forge it," not "you can't read it." So only put non-sensitive info there.
I put basic user info in:
Middleware verifies the signature and reads roles / perms from claims, zero Redis interactions.
This is real statelessness. Latency dropped.
But after some time, two new problems surfaced:
3.1 User info changes don't propagate
I changed Admin's role from R_SUPER to R_USER in the backend, but the user's existing token still carried R_SUPER in claims, continuing to call APIs as super-admin.
Until the token expires (default 2 hours) and the user logs in again, they keep their old role.
Can the business tolerate 2-hour delays? Not for sensitive operations.
3.2 Tokens grow absurdly long
A super-admin with 200+ permission codes becomes a giant JSON array in claims. Base64-encoded JWT exceeds 4 KB, sent in Authorization header on every request.
Many reverse proxies cap headers at 8 KB. Any larger and you get 502. CDN cache keys explode too.
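A rough back-of-the-envelope sketch of where the size comes from. It assumes compact JSON with no escaping and unpadded base64url, and it only counts the permission array, not the rest of the claims, the JWT header, or the signature. The helper names are hypothetical.

```rust
/// Length in bytes of a compact JSON string array such as ["a","b"],
/// assuming no characters that need escaping.
fn json_array_len(codes: &[&str]) -> usize {
    if codes.is_empty() {
        return 2; // "[]"
    }
    // Each element contributes its own length plus two quotes.
    let quoted: usize = codes.iter().map(|c| c.len() + 2).sum();
    // Add the separating commas and the two brackets.
    quoted + (codes.len() - 1) + 2
}

/// Length of unpadded base64url output for `n` input bytes: ceil(4n / 3).
fn base64url_len(n: usize) -> usize {
    (n * 4 + 2) / 3
}
```

With 200 codes of 16 characters each (for example `system:user:list`), the array alone is 3801 bytes of JSON and about 5068 characters once base64url-encoded, so the permission list by itself can already pass 4 KB.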
Pain point: Stateless → stale info + token bloat.
4. v4: Dual tokens, 5-minute rotation
I started looking at dual tokens (access + refresh). Access tokens are short — 5-10 minutes. Refresh tokens are longer — 1-2 hours (or more), used only to swap for new access tokens.
Industry benefits:
- Access token leak window is short — useless after 5 minutes
- Business calls have zero Redis — same as v3
- Updates have a window — change a role, within 5 minutes the new access token reflects it
4.1 My dual-token design
Access token (business): claims carry user info + roles + permission bitmap (covered later)
Refresh token (swap only): payload is just a rid
Two Redis keys, each with a job (I mixed these up myself at first):
Why split?
- auth:refresh:{rid}: on refresh you only have a refresh token; you need to look up login_id + device from the rid. Keep the value as simple as possible; "1:web" is enough.
- auth:device:{login_id}:{device}: drives the "online devices" list, and makes device cleanup atomic (read the current rid, delete the refresh key along with it).
Both are strings, not hashes. Kicking one device touches exactly two keys — no clever data structure needed.
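A sketch of the two-key cleanup, using a HashMap in place of Redis strings. The function name and return value are assumptions for illustration, and the sketch does not model atomicity or TTLs; against a real Redis you would need a transaction or script to get the atomic behavior.

```rust
use std::collections::HashMap;

/// In-memory stand-in for Redis string keys.
type Keys = HashMap<String, String>;

/// Remove one device's session: read the current rid from the device key,
/// then delete the refresh key it points at.
/// Returns the number of keys removed (0, 1, or 2).
fn kick_device(keys: &mut Keys, login_id: &str, device: &str) -> usize {
    let device_key = format!("auth:device:{login_id}:{device}");
    // Removing the device key also hands back the rid stored in it.
    let Some(rid) = keys.remove(&device_key) else {
        return 0; // device was not online
    };
    let refresh_key = format!("auth:refresh:{rid}");
    1 + usize::from(keys.remove(&refresh_key).is_some())
}
```

Other devices of the same user are untouched because their device and refresh keys have different names.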
4.2 Flow
Key designs:
- Refresh token is one-shot: used once and burned, so reuse of an intercepted token fails
- Write new key before deleting old: if the server crashes mid-rotation, the old refresh key lingers under its TTL; users don't hit "I just refreshed and got 401"
- Refresh re-fetches user info: solves v3's stale-role problem
- Refresh only rejects banned, not refresh:{ts}: banned blocks refresh; a refresh:{ts} deny marker exists specifically to make the client refresh, so refresh must pass through
- Business APIs are pure JWT: same as v3, zero Redis
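The rotation rules above can be sketched with in-memory maps standing in for the Redis keys. This is an illustration under stated assumptions, not the project's code: the `Store` type, its method names, and the error variants are hypothetical, TTLs are omitted, and the step that re-fetches user info and signs the new access token is left out.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum RefreshError {
    UnknownOrReused,
    Banned,
}

/// In-memory stand-in for the Redis keys; TTLs are omitted.
#[derive(Default)]
struct Store {
    refresh: HashMap<String, String>, // auth:refresh:{rid} -> "login_id:device"
    device: HashMap<String, String>,  // auth:device:{login_id}:{device} -> rid
    deny: HashMap<String, String>,    // auth:deny:{login_id} -> "banned" | "refresh:{ts}"
}

impl Store {
    fn login(&mut self, login_id: &str, device: &str, rid: &str) {
        let owner = format!("{login_id}:{device}");
        self.refresh.insert(rid.to_string(), owner.clone());
        self.device.insert(owner, rid.to_string());
    }

    /// Rotate a one-shot refresh token; returns "login_id:device" on success.
    fn rotate(&mut self, old_rid: &str, new_rid: &str) -> Result<String, RefreshError> {
        // A burned or never-issued rid has no key, so reuse fails here.
        let owner = self
            .refresh
            .get(old_rid)
            .cloned()
            .ok_or(RefreshError::UnknownOrReused)?;
        let login_id = owner.split(':').next().unwrap_or("");
        // Only "banned" blocks a refresh; "refresh:{ts}" must pass through.
        if self.deny.get(login_id).map(String::as_str) == Some("banned") {
            return Err(RefreshError::Banned);
        }
        // Write the new key first, then delete the old one.
        self.refresh.insert(new_rid.to_string(), owner.clone());
        self.device.insert(owner.clone(), new_rid.to_string());
        self.refresh.remove(old_rid);
        Ok(owner)
    }
}
```

If the process stopped between the insert and the remove, both refresh keys would exist for a while, which is the harmless direction: the old one simply ages out.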
4.3 Looks perfect? New problems.
After running for a while:
Problem 1: Refresh hits the DB every time
Refresh fetches user, role, permissions — multiple JOINs. With active users refreshing every 5 minutes, DB load creeps up again.
Problem 2: Tokens still long
200 permission codes in "perms": [...] is still a few KB.
Problem 3: The hijack window is bigger than I thought
If an attacker grabs the refresh token and uses it before the victim does, they get a fresh access + refresh pair. The victim's next refresh fails because their refresh token has been overwritten → logged out.
The victim only finds out at next refresh time, when their refresh fails. The exposure window is up to access + refresh lifetime.
5. v5: Bitmap + per-request deny check
For v4's three problems, I patched each.
5.1 Permission bitmap: 200-line array → 8 bytes
Permission code strings can't be compressed. But I noticed:
A project's permission set is finite and rarely changes. At startup I have the full list.
So assign each permission a bit:
JWT payload becomes:
"pb" is the base64-encoded bitmap. Permission info shrinks from KB to tens of bytes.
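A minimal sketch of the bit assignment, assuming each permission's bit position is simply its index in a list loaded at startup. `PermRegistry` and its methods are hypothetical names, the bit order within a byte is an arbitrary choice, and the final base64 step is omitted because it needs an external crate.

```rust
/// Maps each permission code to a fixed bit position (its index).
struct PermRegistry {
    codes: Vec<String>,
}

impl PermRegistry {
    fn new(codes: &[&str]) -> Self {
        Self {
            codes: codes.iter().map(|c| c.to_string()).collect(),
        }
    }

    /// Set one bit per held permission; unknown codes are ignored.
    fn encode(&self, held: &[&str]) -> Vec<u8> {
        let mut bits = vec![0u8; (self.codes.len() + 7) / 8];
        for (pos, code) in self.codes.iter().enumerate() {
            if held.contains(&code.as_str()) {
                bits[pos / 8] |= 1 << (pos % 8);
            }
        }
        bits
    }

    /// Recover the permission codes whose bits are set.
    fn decode(&self, bits: &[u8]) -> Vec<&str> {
        let mut out = Vec::new();
        for (pos, code) in self.codes.iter().enumerate() {
            // Bytes missing from a short bitmap count as all zeros.
            let byte = bits.get(pos / 8).copied().unwrap_or(0);
            if ((byte >> (pos % 8)) & 1) == 1 {
                out.push(code.as_str());
            }
        }
        out
    }
}
```

Under this scheme a registry of 200 permissions needs 25 bytes of raw bitmap regardless of how many the user holds, which is in line with the "tens of bytes" figure once encoded.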
Permission check (abstracted from crates/summer-auth/src/session/manager.rs::validate_token):
Note: there's no "user_bitmap & required_bitmap" pure bit operation here. Reason: wildcard semantics (system:*:list) can't be expressed by bitmaps alone — you have to decode back to strings for matching. The real value of the bitmap is token-size compression, not matching speed (string matching itself runs in a few hundred nanoseconds; the bottleneck isn't there).
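To make the wildcard point concrete, here is one possible string matcher. It is a sketch under an assumption the post does not state: that `*` in a held permission matches exactly one colon-separated segment. The function names are hypothetical and this is not the code from `validate_token`.

```rust
/// Returns true if a held permission pattern grants the required code.
/// Assumption: `*` matches exactly one colon-separated segment.
fn grants(held: &str, required: &str) -> bool {
    let mut h = held.split(':');
    let mut r = required.split(':');
    loop {
        match (h.next(), r.next()) {
            // Both ran out together: every segment matched.
            (None, None) => return true,
            (Some(hs), Some(rs)) if hs == "*" || hs == rs => continue,
            // Segment mismatch, or one side is longer than the other.
            _ => return false,
        }
    }
}

/// True if any held permission (already decoded from the bitmap) grants
/// the required code.
fn has_perm(held: &[&str], required: &str) -> bool {
    held.iter().any(|h| grants(h, required))
}
```

A pure `user_bitmap & required_bitmap` test could only answer exact matches; a pattern like `system:*:list` has to be compared segment by segment against the required string.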
The sys.menu table got a bit_position column, loaded into memory at startup. Menu rarely changes in production, so reload-on-restart (or via broadcast) is fine.
Compatibility: string permission codes are still supported for cases like MCP tool calls and external webhooks where bitmap doesn't fit.
5.2 per-request deny check: a config knob for real-time
Core idea: a deny key is not a "killer," it's a state broadcaster.
It doesn't tell the request "die." It tells the request "your client-side state is stale, go refresh." That's why deny checks tolerate eventual consistency — 99% of paths miss, only 1% actually hit and trigger the client to renegotiate.
Every key in this section (banned / refresh:{ts} / login_id / device) is the same idea in a different shape.
The cost of stateless JWT is you can't change an issued token. I added a switch, per_request_deny_check.
When on, the middleware adds one Redis GET on auth:deny:{login_id} to every request.
This handles two needs:
5.2.1 Blocklist: banned
Admin marks a user as banned → write auth:deny:{login_id} = "banned" → all their requests are rejected immediately.
Doesn't depend on token expiry. The very next request takes effect.
5.2.2 Force-refresh: refresh:{ts}
This is the gem. When I change a user's role, I don't have to kick them off. I just write auth:deny:{login_id} = "refresh:1777902800".
This says: any token with iat earlier than 1777902800 must refresh.
Then:
- User's current access token was signed 5 minutes ago (iat = 1777902500) → match → return RefreshRequired
- Frontend sees the error → calls /auth/refresh automatically
- Refresh fetches latest role → new access token with iat = 1777902800+ → no longer matches deny → works
Zero user-facing disruption. Worst case is a 1-second blip while the frontend handles the refresh.
One deny entry, and every token issued before that moment automatically picks up the new role on the next request. This is the bit of the design I'm proudest of.
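The decision itself fits in one small function. This sketch takes the value read from auth:deny:{login_id} and the token's iat and returns what the middleware should do. The enum and function names are assumptions, and treating an unrecognized value as its own `Invalid` outcome is a choice made here for the sketch, not something the post specifies.

```rust
#[derive(Debug, PartialEq)]
enum DenyOutcome {
    Allow,
    Banned,
    RefreshRequired,
    /// Assumption: values that are neither "banned" nor "refresh:{ts}"
    /// are surfaced separately so the caller can fail closed.
    Invalid,
}

/// Evaluate the deny value (None = key missing) against the `iat` claim
/// of the presented access token.
fn check_deny(deny_value: Option<&str>, token_iat: u64) -> DenyOutcome {
    match deny_value {
        None => DenyOutcome::Allow,
        Some("banned") => DenyOutcome::Banned,
        Some(v) => match v.strip_prefix("refresh:").and_then(|ts| ts.parse::<u64>().ok()) {
            // Tokens signed before the deny moment must refresh.
            Some(ts) if token_iat < ts => DenyOutcome::RefreshRequired,
            // Tokens signed at or after it already carry the new state.
            Some(_) => DenyOutcome::Allow,
            None => DenyOutcome::Invalid,
        },
    }
}
```

With the numbers from the example, `check_deny(Some("refresh:1777902800"), 1777902500)` asks for a refresh, while a token with iat 1777902800 or later passes.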
5.2.3 deny TTLs are not the same
I earlier claimed "deny always uses access_timeout" — that's wrong. banned must be long-lived, otherwise a user banned a year ago would un-ban themselves when TTL expires. Not great.
5.2.4 Refresh does NOT delete the deny key
This one I learned the hard way. My first implementation deleted the deny key after a successful refresh, thinking "user has the new token, deny is useless now."
Multi-device reality broke it instantly:
The fix: don't delete the deny key on refresh; let it expire via TTL. Within that window, every device's old token is forced to refresh at least once.
Cost: auth:deny:{login_id} survives an extra access_timeout (~2h). Tens of bytes per user. Fine.
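One plausible reading of the multi-device failure, written as a tiny simulation. The scenario, timestamps, and function names are assumptions made for illustration: two devices hold tokens older than the deny moment, device A refreshes first, and the question is whether device B's stale token is still caught afterwards.

```rust
/// True if a token signed at `iat` must refresh given the deny timestamp
/// currently stored for the user (None = no deny key).
fn must_refresh(deny_ts: Option<u64>, iat: u64) -> bool {
    matches!(deny_ts, Some(ts) if iat < ts)
}

/// Simulates device A refreshing, then device B sending its old token.
/// Returns whether device B's stale token is still forced to refresh.
fn device_b_forced(delete_on_refresh: bool) -> bool {
    // The role changed at this moment and a refresh:{ts} deny was written.
    let mut deny_ts = Some(1_777_902_800u64);
    // Both devices hold access tokens signed before the change.
    let (iat_a, iat_b) = (1_777_902_500u64, 1_777_902_600u64);

    // Device A hits the deny check and refreshes.
    assert!(must_refresh(deny_ts, iat_a));
    if delete_on_refresh {
        deny_ts = None; // the first implementation
    }

    // Device B now sends its old access token.
    must_refresh(deny_ts, iat_b)
}
```

`device_b_forced(true)` returns false, meaning device B keeps using the old role until its token expires; `device_b_forced(false)` returns true, which is the behavior the TTL-only approach preserves.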
5.3 The "gentle" semantics of logout / kick
I iterated on this a few times; it deserves its own section.
The naive "kick off" pattern is: blocklist the target device's token, every request from that device returns 401. But in a multi-device world, that's too blunt:
The user is logged in on Web, Android, and iOS. They click "logout" on Web. Did they mean to log out of Android and iOS too? Obviously not.
The current implementation:
Step 2 is the subtle part: logging out one device writes a login_id-scoped deny. It might look like it "spills over" to other devices, but since the value is refresh:{ts} (not banned):
So one key serves three semantics: logout / force_refresh / ban. The comment in the code says it best:
I only fully appreciated the deny key after writing this part — it isn't a "blocker," it's a "rotation trigger."
5.4 The toggle cost
With per_request_deny_check = true, every request does one Redis GET. Aren't we back to v1?
Not really. The differences:
More importantly: it's a toggle. Off → v4 zero-Redis mode; On → real-time.
Engineering trade-off: Hand operators a knob, not a silver bullet.
5.5 The deny key in one table
For comparison, here are the three deny states side by side:
Writing this out, I realized: the whole deny mechanism collapses three concerns — online state, refresh rotation, ban — into a single Redis string. That kind of "one thing, three uses" minimalism is what Redis is good at.
6. Five-version comparison
6.5 Why this works — state distribution, not layering
I'm not drawing a pyramid here, because these four things aren't an upstream/downstream stack. They're orthogonal responsibilities. A single business request does not traverse four layers — it walks a flat path, but at each step on that path sits a piece of state with its own frequency and consistency profile:
Average per business request: 1 signature check + ε Redis lookups. ε depends on the deny toggle: off → 0; on → 1 lightweight GET (with extreme miss rates).
The essence of this design is: scatter state across positions by mutation frequency, instead of cramming it all into one synchronous interceptor.
This is also why I've opposed "stuff user_info into JWT" since v3 — that ties low-frequency static info to high-frequency signature checks. Changing user_info means waiting for every token to expire. Putting two different frequencies into the same slot is what made v3 fundamentally wrong.
7. What I learned
If I had to compress five iterations into a few lines:
- JWT is not a session-wrapper. JWT's value is readable claims + no server lookup. If your JWT payload is just a UUID and you still hit Redis, just use a UUID.
- Stateless and real-time are at odds. Want stateless → accept staleness. Want real-time → accept lookups. Don't try to have both; give the operator a switch (per_request_deny_check) or compromise via dual tokens.
- Dual tokens aren't about "security," they're about evolvability. Access tokens are short enough to tolerate small leaks; refresh tokens are one-shot and carry the rotation signal. "Security" is a vague word; what dual tokens actually do is shrink the user-perceived loss window.
- refresh:{ts} is the design I'm proudest of. A timestamped string that precisely invalidates "tokens issued before this moment" without kicking the user. It's basically CAS thinking applied to tokens: "I know when your token was signed; if it was signed before my deny moment, you should refresh."
- Bitmap compression isn't a flex; it's forced by token bloat. Watching a 4 KB Authorization header trigger 502s on the reverse proxy taught me where "inline data" physically ends. Every inlined claim deserves the question: "Can this stay under 1 KB?"
- Config toggles are an engineering courtesy. I used to want "the optimal solution." But every team's SLA, compliance, and performance bias differ. per_request_deny_check defaults off; you turn it on. Just like concurrent_login defaults on; you turn it off. Give every adopter a vote.
7.5 This is not a universal solution
I owe you a counter-current paragraph here — this stack is genuinely complex: dual tokens + bitmap + deny + device + refresh ts + a per-request switch + tiered TTLs. Stacked together, it does feel like a "combo punch."
I built it because summerrs-admin simultaneously carries three concrete needs: multi-tenant isolation, AI Gateway rate-limiting / billing, and admin-side force-kick. Each one wants its own state machine. If your scenario doesn't have these, don't copy this stack. Concretely:
My design was pushed by the business, not architected upfront. Each layer was added after the previous one hit a wall. This path may not fit you — but if you've hit the same wall, this post might save you a few months.
8. Unsolved / planned
To be honest, v5 isn't the end either.
- The 1-2 hour window after refresh-token theft: refresh-time checks block malicious refreshes, but one successful malicious refresh continues the session. Device fingerprinting / IP-binding helps, at the cost of false positives.
- Bulk permission changes: when changing roles for 10,000 users, do you write 10,000 deny entries? A "global deny timestamp" version exists, but that triggers everyone's refresh at once. DB pressure?
- Multi-tenant isolation: current deny key is auth:deny:{login_id}; tenant context isn't part of it. Deep-isolation tenants might want per-tenant deny.
- Passkey / WebAuthn: schema is there (sys.passkey_credential), flow isn't.
9. Closing
If you're building auth, don't just copy v5. Figure out your needs first:
Technical decisions aren't ranked by sophistication; they're ranked by constraints.
Hope this helps anyone going through similar wandering. Feel free to open an issue and tell me where I'm still wrong — maybe v6 is around the corner.
Further reading
- Tutorial: Auth & Authorization
- Architecture: Architecture Overview
- Source: crates/summer-auth/src/lib.rs
- Key files:
  - crates/summer-auth/src/middleware.rs: middleware layer
  - crates/summer-auth/src/strategy.rs: JWT strategy
  - crates/summer-auth/src/storage/: deny / refresh state
  - crates/summer-auth/src/bitmap.rs: permission bitmap
  - crates/summer-admin-macros/src/auth_macro.rs: #[has_perm] macro
