Review results
@@ -2,7 +2,7 @@

# Fonrey Platform Admin Console Technical Design

**Version**: v1.0
**Version**: v1.1
**Project**: Fonrey Real Estate Brokerage Management System
**Module**: Platform admin console (`apps/admin_console` + `apps/release`)
**Related PRD**: [`PRD/平台管理后台/平台管理后台PRD.md`](../PRD/平台管理后台/平台管理后台PRD.md) (v1.0)
@@ -19,6 +19,7 @@
| Date | Author | Change |
|---|---|---|
| 2026-05-02 | Sisyphus | Initial version: merged the former `客户端发布管理技术方案.md` and `系统管理技术文档.md`, unified the three dimensions (technology selection / page route table / API design), and added `ADR-20260502-002` |
| 2026-05-02 | Atlas | v1.1: added §7.0, dedicated subdomain and session isolation for the platform console (S-2); added §6.1.1, tenant-creation Saga and compensating transactions (PT-B-1) |

---

@@ -710,6 +711,186 @@ HX-Trigger: {"fonrey:toast":{"type":"info","message":"导出任务已提交"}}
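| Task | Trigger | Queue | Retries | Failure handling |
|---|---|---|---|---|
| `provision_tenant` | Runs asynchronously after a tenant is created: schema creation + migrations + default data | `admin_ops` | None | Mark `tenants.status='failed'`, roll back (see the compensations in §6.1.1), send an email alert |
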
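### 6.1.1 Tenant-Creation Saga and Compensating Transactions (response to PT-B-1)

> **Background**: Review finding PT-B-1 points out that the `provision_tenant` task spans several steps ("DB row write → schema creation → migrations → welcome email"). Without compensating transactions, a failure at any step can leave a dangling row in the `tenants` table, an orphaned schema, or inconsistent accounts. This section defines the full Saga flow and the compensation for each step.
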
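#### Saga steps and compensation matrix

| Step | Action | State after success | Compensation (rollback on failure) |
|---|---|---|---|
| **S1** | Insert into `public.tenants` (`status='provisioning'`) and write an audit row | DB row exists | Set `status` to `'failed'`; **do not delete the row** (kept for audit traceability) |
| **S2** | `CREATE SCHEMA {schema_name}` | Schema created | `DROP SCHEMA {schema_name} CASCADE` (if it exists) |
| **S3** | `python manage.py migrate_schemas --schema={schema_name}` (django-tenants) | All migrations applied | `DROP SCHEMA {schema_name} CASCADE` (the schema is in an unknown state; drop it and rebuild) |
| **S4** | Write default data into the tenant schema (roles, system settings, etc.) | Default data ready | Same as S3 (drop the whole schema) |
| **S5** | Create the initial Tenant Admin account in `{schema_name}.users` | Account usable | Same as S3 (the account is dropped with the schema) |
| **S6** | Update `public.tenants.status = 'active'` | Tenant is available to users | Set `status` to `'failed'`; send a platform alert (resources created by S2–S5 have been cleaned up) |
| **S7** | Send the welcome email asynchronously (`send_welcome_email`) | Email enqueued | Log the failure and raise a Sentry alert only; **do not roll back the Saga** (an email failure does not affect tenant availability) |
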
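> **Principles**:
> - S1–S6 form an all-or-nothing sequence: if any step fails, the compensations of the steps already completed must run in reverse order.
> - S7 is an idempotent tail step: it is retried independently and never triggers a Saga rollback.
> - Compensations are written to be idempotent and are expected to succeed. If one fails anyway (for example, `DROP SCHEMA` times out), write a `platform_audit_logs` row (`action_type='PROVISION_COMPENSATION_FAILED'`) and trigger a PagerDuty alert so that operations can intervene manually.
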
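The reverse-order rule above can be illustrated with a small framework-free sketch. `SagaStep` and `run_saga` are hypothetical names introduced only for this illustration; they are not part of the Fonrey codebase, and the real task below inlines the same idea rather than using a generic runner.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    """One Saga step: a forward action plus the compensation that undoes it."""
    name: str
    action: Callable[[], None]
    compensate: Callable[[], None]


def run_saga(steps: List[SagaStep]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse and re-raise."""
    completed: List[SagaStep] = []
    try:
        for step in steps:
            step.action()
            completed.append(step)
    except Exception:
        # Only steps that finished are compensated; the failing step is not.
        for step in reversed(completed):
            try:
                step.compensate()
            except Exception as comp_exc:
                # Hypothetical escalation point (audit row + pager in the real design).
                print(f"compensation failed for {step.name}: {comp_exc}")
        raise
    return [step.name for step in completed]
```

With three steps where the third raises, the first two actions run, then their compensations run as S2 followed by S1, and the original exception still propagates to the caller.
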
#### `provision_tenant` Celery task implementation

```python
# apps/admin_console/tasks/provision.py
import logging

from celery import shared_task
from django.core.management import call_command
from django.db import connection, transaction
from django_tenants.utils import schema_context

logger = logging.getLogger("provision")


@shared_task(bind=True, acks_late=True, autoretry_for=(), max_retries=0)
def provision_tenant(self, tenant_id: str):
    """
    Tenant-creation Saga.
    No retries (max_retries=0): after a failure, operations decide from the
    audit log whether to trigger the task again.
    """
    from apps.admin_console.models import Tenant
    from apps.admin_console.services import audit_service

    tenant = Tenant.objects.get(id=tenant_id)
    completed_steps = []

    try:
        # S1: the tenants row was already written by the view (status='provisioning')
        completed_steps.append("S1_row_written")

        # S2: CREATE SCHEMA
        _create_schema(tenant)
        completed_steps.append("S2_schema_created")

        # S3: migrate
        _run_migrations(tenant)
        completed_steps.append("S3_migrated")

        # S4: default data
        _seed_default_data(tenant)
        completed_steps.append("S4_seeded")

        # S5: initial Tenant Admin
        _create_initial_admin(tenant)
        completed_steps.append("S5_admin_created")

        # S6: activate
        with transaction.atomic():
            tenant.status = "active"
            tenant.save(update_fields=["status", "updated_at"])
            audit_service.write_audit(
                action_type="CREATE_TENANT",
                target_type="Tenant",
                target_id=str(tenant.id),
                result="success",
            )
        completed_steps.append("S6_activated")

    except Exception as exc:
        logger.exception("provision_tenant failed at steps=%s tenant=%s", completed_steps, tenant_id)
        _compensate(tenant, completed_steps, exc)
        raise  # keep the Celery task in the failed state so Sentry is triggered

    # S7: welcome email (idempotent tail step). It sits outside the try block above
    # so that an enqueue failure can never trigger compensation of an active tenant.
    from apps.admin_console.tasks.notifications import send_welcome_email
    try:
        send_welcome_email.apply_async(
            kwargs={"tenant_id": tenant_id},
            countdown=5,
            retry=True,                       # retry publishing to the broker
            retry_policy={"max_retries": 5},  # execution retries are configured on the task itself
        )
    except Exception:
        logger.exception("failed to enqueue send_welcome_email tenant=%s", tenant_id)


def _create_schema(tenant):
    # schema_name is interpolated into DDL, so quote it as an identifier
    schema = connection.ops.quote_name(tenant.schema_name)
    with connection.cursor() as cur:
        # Idempotent: if the schema already exists (a previous compensation was
        # incomplete), drop it before creating it again
        cur.execute(f"DROP SCHEMA IF EXISTS {schema} CASCADE")
        cur.execute(f"CREATE SCHEMA {schema}")


def _run_migrations(tenant):
    call_command("migrate_schemas", schema_name=tenant.schema_name, interactive=False, verbosity=0)


def _seed_default_data(tenant):
    with schema_context(tenant.schema_name):
        from apps.admin_console.seeds import seed_tenant_defaults
        seed_tenant_defaults(tenant)


def _create_initial_admin(tenant):
    with schema_context(tenant.schema_name):
        from apps.account.services import account_service
        account_service.create_initial_admin(
            tenant=tenant,
            phone=tenant.contact_phone,
        )


def _compensate(tenant, completed_steps: list, exc: Exception):
    """
    Run the compensations of the completed steps in reverse order.
    """
    from apps.admin_console.services import audit_service

    # S2–S5: if the schema was created, drop the whole schema
    if any(s in completed_steps for s in ("S2_schema_created", "S3_migrated", "S4_seeded", "S5_admin_created")):
        try:
            schema = connection.ops.quote_name(tenant.schema_name)
            with connection.cursor() as cur:
                cur.execute(f"DROP SCHEMA IF EXISTS {schema} CASCADE")
            logger.info("compensation: dropped schema %s", tenant.schema_name)
        except Exception as comp_exc:
            logger.error("compensation FAILED (DROP SCHEMA): %s", comp_exc)
            audit_service.write_audit(
                action_type="PROVISION_COMPENSATION_FAILED",
                target_type="Tenant",
                target_id=str(tenant.id),
                result="failed",
                error_message=str(comp_exc),
            )
            # Raise a PagerDuty alert
            from apps.admin_console.alerts import trigger_pagerduty
            trigger_pagerduty(
                title=f"provision_tenant compensation failed: {tenant.schema_name}",
                body=str(comp_exc),
            )

    # S1: mark the tenants row as failed (the row is kept for audit traceability)
    try:
        tenant.status = "failed"
        tenant.save(update_fields=["status", "updated_at"])
        audit_service.write_audit(
            action_type="CREATE_TENANT",
            target_type="Tenant",
            target_id=str(tenant.id),
            result="failed",
            error_message=str(exc),
        )
    except Exception as comp_exc:
        logger.error("compensation FAILED (mark tenant failed): %s", comp_exc)
```

#### Idempotency guarantees

- `_create_schema` runs `DROP SCHEMA IF EXISTS ... CASCADE` before creating the schema, so re-triggering the Saga does not fail on a leftover schema.
- The `provision_tenant` task ID is bound to `tenant_id`; if a task for the same `tenant_id` is already in the `PROGRESS` or `SUCCESS` state, the view refuses to enqueue it again.
- `create_initial_admin` does a `get_or_create` keyed on `phone`, so calling it again is safe.

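The enqueue guard in the second bullet could look roughly like the sketch below. The helper names and the `provision_tenant:<tenant_id>` ID format are assumptions made for illustration. In a Celery deployment, the state would typically be read with `AsyncResult(task_id).state` and the ID passed via `apply_async(task_id=...)`; those calls are left out so the sketch stays self-contained.

```python
from typing import Optional

# States in which a second enqueue must be refused (taken from the bullet above).
BLOCKING_STATES = {"PROGRESS", "SUCCESS"}


def provision_task_id(tenant_id: str) -> str:
    """Deterministic task ID so that one tenant maps to exactly one task (assumed format)."""
    return f"provision_tenant:{tenant_id}"


def should_enqueue(current_state: Optional[str]) -> bool:
    """Return True when the view may enqueue provision_tenant for this tenant."""
    return current_state not in BLOCKING_STATES
```

A task that has never run reports `PENDING` in Celery, so the first enqueue passes the guard, and a `FAILURE` state also passes, which matches the "operations re-trigger manually" policy of the task.
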
#### Observability

| Signal | Implementation |
|---|---|
| Saga step progress | `task.update_state(state='PROGRESS', meta={'step': step_name})` |
| Final status | `platform_audit_logs` (`action_type='CREATE_TENANT'`, `result='success'/'failed'`) |
| Compensation-failure alert | `action_type='PROVISION_COMPENSATION_FAILED'` + PagerDuty |
| Task duration monitoring | Celery Flower + Prometheus `celery_task_runtime_seconds{name="provision_tenant"}` |

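As a rough illustration of the step-progress row, the sketch below reports each step through `update_state` before running it. `run_steps_with_progress` and `FakeTask` are hypothetical helpers; inside a bound Celery task the `task` argument would be `self`, whose `update_state(state=..., meta=...)` method is the call named in the table.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_steps_with_progress(task: Any, steps: Iterable[Tuple[str, Callable[[], None]]]) -> None:
    """Publish the current step name as PROGRESS metadata, then execute the step."""
    for name, fn in steps:
        task.update_state(state="PROGRESS", meta={"step": name})
        fn()


class FakeTask:
    """Stand-in for a bound Celery task, used only to exercise the helper."""

    def __init__(self) -> None:
        self.states: List[Tuple[str, str]] = []

    def update_state(self, state: str, meta: dict) -> None:
        self.states.append((state, meta["step"]))
```

Because the state is published before the step runs, a task that dies mid-step still shows the step it was attempting, which is usually what an operator wants to see.
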
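| Task | Trigger | Queue | Retries | Failure handling |
|---|---|---|---|---|
| `auto_resume_suspended` | Beat scans `suspended_until <= NOW()` every 10 min | `admin_ops` | 3 times / 60s | Sentry alert |
| `purge_pending_delete` | Beat scans daily at 03:00 for expired cooling-off periods | `admin_ops` | None | Mark `failed_to_purge` |
| `hard_delete_tenant` | Triggered from a view | `admin_ops` | None | Mark as partially deleted + alert |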
@@ -836,6 +1017,159 @@ def is_enabled(tenant, flag_key: str, *, user=None) -> bool:
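
## 7. Security and Compliance

### 7.0 Dedicated Subdomain and Session Isolation for the Platform Console (response to S-2)

> **Background**: Review finding S-2 points out that if PlatformAdmin sessions and tenant-user sessions share a cookie domain, a session could be reused across the privilege boundary. This section defines the isolation boundary and how it is enforced.
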
#### 7.0.1 Domain separation

| Role | Domain | Notes |
|---|---|---|
| Tenant business users | `*.fonrey.com` (one subdomain per tenant) | django-tenants routes to the tenant schema by Host |
| Platform admin console | `admin.fonrey.com` | Dedicated server block; cookie domain physically separated |
| Client API | `app.fonrey.com` | Client runtime API; dedicated server block |

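#### 7.0.2 Cookie isolation settings

```python
# settings/admin.py
# Pinned to the admin host; never the .fonrey.com wildcard.
# Note: a Domain attribute also matches subdomains of admin.fonrey.com;
# leaving this as None would make the cookie host-only.
SESSION_COOKIE_DOMAIN = "admin.fonrey.com"
SESSION_COOKIE_SECURE = True              # HTTPS only
SESSION_COOKIE_HTTPONLY = True            # Not readable from JS
SESSION_COOKIE_SAMESITE = "Strict"        # Not sent on cross-site requests
SESSION_COOKIE_NAME = "adminSessionId"    # Named differently from the tenant-side sessionid
CSRF_COOKIE_DOMAIN = "admin.fonrey.com"
```

> The tenant side uses `SESSION_COOKIE_NAME = "sessionid"`. The cookie name and the Domain both differ between the two sides, so even with both sites open in the same browser neither cookie is sent to the other site, provided the tenant-side cookie is not scoped to `.fonrey.com`.
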
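To make the Domain reasoning concrete, here is a simplified sketch of the cookie domain-matching rule from RFC 6265. `cookie_sent_to` is a hypothetical helper written for this note; it ignores path, `Secure`, IP-address hosts and public-suffix handling, so it illustrates the rule and is not a browser-accurate implementation.

```python
from typing import Optional


def cookie_sent_to(request_host: str, set_by_host: str, domain_attr: Optional[str]) -> bool:
    """Return True if a cookie set by `set_by_host` would accompany a request to `request_host`."""
    request_host = request_host.lower()
    if domain_attr is None:
        # No Domain attribute: host-only cookie, exact host match required.
        return request_host == set_by_host.lower()
    domain = domain_attr.lower().lstrip(".")
    # With a Domain attribute the cookie matches that domain and all of its subdomains.
    return request_host == domain or request_host.endswith("." + domain)


# Admin cookie pinned to admin.fonrey.com is not sent to a tenant subdomain.
print(cookie_sent_to("tenant1.fonrey.com", "admin.fonrey.com", "admin.fonrey.com"))
# A tenant cookie scoped to .fonrey.com would reach the admin host.
print(cookie_sent_to("admin.fonrey.com", "tenant1.fonrey.com", ".fonrey.com"))
```

The second call is the case the settings above are meant to rule out: a wildcard Domain on either side would make that side's cookie visible to the other.
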
#### 7.0.3 `AdminSessionMiddleware`: session-isolation middleware

On every request to `admin.fonrey.com`, the middleware runs the following checks:

```python
# apps/admin_console/middleware.py
import logging
from datetime import timedelta

from django.db import connection
from django.shortcuts import redirect
from django.utils import timezone
from django_tenants.utils import get_public_schema_name

security_logger = logging.getLogger("security")


class AdminSessionMiddleware:
    """
    Session-isolation gatekeeper.
    Must be listed in MIDDLEWARE after SessionMiddleware and
    before any view runs.
    """

    EXEMPT_PATHS = {
        "/admin/login/",
        "/admin/mfa/setup/",
        "/admin/mfa/verify/",
        "/health/",
    }
    SESSION_TTL = timedelta(minutes=30)

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if request.path not in self.EXEMPT_PATHS:
            rejection = self._enforce_isolation(request)
            if rejection is not None:
                return rejection  # short-circuit: the view is never called
        return self.get_response(request)

    def _enforce_isolation(self, request):
        """
        Three checks:
        1. The schema must be public (tenant schemas may not enter)
        2. The session must contain platform_admin_id
        3. The admin_sessions record must exist and not be expired
        Fail-closed: on failure the session is flushed and a 302 to the
        login page is returned. Returns None when the request may proceed.
        """
        # 1. Schema isolation: only the public schema may reach the console
        if connection.schema_name != get_public_schema_name():
            return self._reject(request, "non-public schema access denied")

        # 2. The session must carry platform_admin_id
        admin_id = request.session.get("platform_admin_id")
        if not admin_id:
            return self._reject(request, "no platform_admin_id in session")

        # 3. The admin_sessions record must be valid
        from apps.admin_console.models import AdminSession

        session = AdminSession.objects.filter(
            admin_id=admin_id,
            session_key=request.session.session_key,
            is_active=True,
            expires_at__gt=timezone.now(),
        ).first()

        if not session:
            return self._reject(request, "admin session expired or revoked")

        # 4. Rolling renewal: every valid request pushes expires_at out by 30 min
        session.expires_at = timezone.now() + self.SESSION_TTL
        session.save(update_fields=["expires_at"])

        # 5. Attach the admin object to the request for views to use directly
        request.platform_admin = session.admin
        return None

    @staticmethod
    def _reject(request, reason: str):
        security_logger.warning(
            "AdminSessionMiddleware rejected: %s | path=%s | ip=%s",
            reason,
            request.path,
            # Behind Nginx, REMOTE_ADDR is the proxy; prefer the X-Real-IP header it sets
            request.META.get("HTTP_X_REAL_IP") or request.META.get("REMOTE_ADDR"),
        )
        request.session.flush()  # clear the session so nothing is left behind
        return redirect("/admin/login/")
```

#### 7.0.4 Nginx as the physical line of defence

```nginx
# /etc/nginx/conf.d/admin.fonrey.com.conf
server {
    listen 443 ssl http2;
    server_name admin.fonrey.com;

    # IP allowlist (second line of defence alongside the app-level AdminIPWhitelistMiddleware)
    include /etc/nginx/conf.d/admin_ip_whitelist.conf;
    deny all;

    # Block tenant subdomains from reaching /admin/ paths (prevents cross-domain probing).
    # Note: this check only fires for requests that land in this server block;
    # hosts matched by another server block never reach it.
    if ($host ~* "^(?!admin\.).*\.fonrey\.com$") {
        return 404;
    }

    location / {
        proxy_pass http://gunicorn_cluster;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

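The host rule in the `if ($host ...)` check can be explored outside Nginx. The sketch below applies the same pattern with Python's `re` module, which handles this negative lookahead the same way PCRE does for these inputs; `is_blocked_host` is a hypothetical helper used only to show which hosts the rule would reject.

```python
import re

# Same pattern as the Nginx rule; ~* in Nginx means case-insensitive matching.
TENANT_HOST = re.compile(r"^(?!admin\.).*\.fonrey\.com$", re.IGNORECASE)


def is_blocked_host(host: str) -> bool:
    """Return True if the rule would answer 404 for this Host header."""
    return TENANT_HOST.match(host) is not None


for host in ("tenant1.fonrey.com", "admin.fonrey.com", "app.fonrey.com", "admin.x.fonrey.com"):
    print(host, is_blocked_host(host))
```

Two edge cases are worth noting when reviewing the rule: `app.fonrey.com` matches and would be blocked if it ever reached this server block, and any host that merely starts with `admin.` (such as `admin.x.fonrey.com`) is let through by the lookahead.
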
#### 7.0.5 Security regression test checklist

| Scenario | Expected result |
|---|---|
| Access `admin.fonrey.com` with a tenant user's cookie | Rejected by `AdminSessionMiddleware`, 302 → login page |
| Access a tenant domain with a platform admin's cookie | Cookie Domain does not match, the browser does not send it, authentication fails |
| Access `/admin/...` with any tenant `*.fonrey.com` Host | Nginx 404 (`if $host` rule) |
| Access after more than 30 min of session inactivity | `expires_at` has passed, rejected by the middleware, 302 → login page |
| Access while `connection.schema_name != 'public'` | Rejected by the middleware, 302 → login page |

### 7.1 Authentication and Sessions

| Item | Requirement |