码到成功

A Look at Python 3.14

2025-10-12

Three things to understand from one diagram

[Figure: Python 3.14 feature map]

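Summed up in one sentence each:

t-strings suit safe rendering, SQL parameterization, structured logging, and lightweight DSLs.
Deferred evaluation of annotations suits type-heavy projects, especially ones with mutually referencing models, plugin systems, or frameworks that scan signatures.
Multiple interpreters suit CPU-bound tasks, isolated execution, and early-stage plugin sandboxes.

They share one idea: things that used to be collapsed straight into a result are now kept as structure. Strings are not joined right away, annotations are not evaluated right away, and tasks no longer have to share one copy of state.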

t-strings: not a replacement for f-strings, but an interface for template processing

[Figure: t-string workbench]

An f-string evaluates immediately to a str:

name = "Zoy"
msg = f"hello {name}"
print(type(msg))  # <class 'str'>

A t-string instead returns a string.templatelib.Template:

name = "Zoy"
tpl = t"hello {name}"

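print(type(tpl))           # <class 'string.templatelib.Template'>
print(tpl.strings)         # ('hello ', '')
print(tpl.values)          # ('Zoy',)
print(tpl.interpolations)  # a tuple holding one Interpolation object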

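Think of it this way: Python first separates the static text from the dynamic interpolations, and how they are finally combined is up to you.

What a Template contains

Template has three commonly used attributes:

strings: the static string fragments, as a tuple that always has one more element than interpolations (empty strings fill the gaps).
interpolations: the interpolation objects, each carrying a value, the expression text, a conversion flag, and a format spec.
values: the values of all interpolations.

Example: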

price = 19.8
tpl = t"price={price:.2f}"

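print(tpl.strings)  # ('price=', '')
print(tpl.values)   # (19.8,)

# Iteration yields the non-empty strings and the Interpolation objects in order.
for item in tpl:
    print(repr(item))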

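The biggest difference from an f-string: an f-string hands you only the result, while a t-string hands you an intermediate structure you can process. Note that the format spec (.2f here) is only recorded on the Interpolation; tpl.values still holds the raw 19.8. That structure is what makes t-strings a good fit for safe output.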

HTML escaping with t-strings

The following example keeps the static text as is and escapes every interpolated value:

from html import escape
from string.templatelib import Interpolation, Template


def html_safe(template: Template) -> str:
    parts: list[str] = []

    for part in template:
        if isinstance(part, Interpolation):
            parts.append(escape(str(part.value), quote=True))
        else:
            parts.append(part)

    return "".join(parts)


username = '<img src=x onerror=alert(1)>'
page = html_safe(t"<p>Hello, {username}</p>")

print(page)

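It prints <p>Hello, &lt;img src=x onerror=alert(1)&gt;</p>: the user input is treated as dynamic data instead of being dropped straight into the HTML. The advantage is that the caller still writes a natural string template, while the renderer applies the security policy in one place. Note that this minimal version calls str() on the raw value, so any conversion or format spec written in the template is ignored.

SQL parameterization with t-strings

Never concatenate dynamic values directly into SQL. A t-string lets us split the SQL text from the parameter values: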

from string.templatelib import Interpolation, Template


def sql_params(template: Template) -> tuple[str, list[object]]:
    sql_parts: list[str] = []
    params: list[object] = []

    for part in template:
        if isinstance(part, Interpolation):
            sql_parts.append("?")
            params.append(part.value)
        else:
            sql_parts.append(part)

    return "".join(sql_parts), params


user_id = 1001
status = "active"

sql, params = sql_params(
    t"select * from users where id={user_id} and status={status}"
)

print(sql)
print(params)

The result looks like:

select * from users where id=? and status=?
[1001, 'active']

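This pattern is pleasant to work with: business code stays readable, and the low-level function is responsible for handing the dynamic values to the database driver.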
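
To make the last step concrete, here is a small sketch of passing such a (sql, params) pair to a driver. It assumes sqlite3, whose placeholder style is `?`, and a made-up users table; other drivers use other placeholder styles, so the processing function would need to emit whatever the target driver expects.

```python
import sqlite3

# The pair a t-string processor like sql_params() would produce.
sql = "select name from users where id=? and status=?"
params = [1001, "active"]

conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer, name text, status text)")
conn.executemany(
    "insert into users values (?, ?, ?)",
    [(1001, "Zoy", "active"), (1002, "Ann", "disabled")],
)

# The driver binds the values; they never become part of the SQL text.
rows = conn.execute(sql, params).fetchall()
print(rows)  # [('Zoy',)]

# A hostile value is compared as an ordinary string and matches nothing.
hostile_status = "active' or '1'='1"
rows_hostile = conn.execute(sql, [1001, hostile_status]).fetchall()
print(rows_hostile)  # []

conn.close()
```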

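Where t-strings fit

HTML, Markdown, email templates: handle static fragments and user input separately.
SQL and shell command construction: parameterize uniformly and write less risky concatenation.
Structured logging: interpolated values can go into log fields on their own.
Lightweight DSLs: both the expression text and the value are available, which suits rule engines.

Do not treat t-strings as "cooler f-strings". If you are just building a single log line, an f-string is still more direct. The value of a t-string lies in the cases where you need to intercept, inspect, escape, or parameterize the interpolated values.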

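Deferred evaluation of annotations: forward references finally get easy

[Figure: deferred evaluation of annotations]

Type annotations used to run into forward-reference problems all the time. If a function's parameter type was defined later in the file, the old style needed quotes: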

def handle(user: "User") -> None:
    print(user.name)


class User:
    def __init__(self, name: str) -> None:
        self.name = name

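Under the default semantics in Python 3.14, annotations are not evaluated at the point of definition but only when they are actually accessed. Many forward references can therefore be written naturally: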

def handle(user: User) -> None:
    print(user.name)


class User:
    def __init__(self, name: str) -> None:
        self.name = name

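This is especially friendly to frameworks and large projects: when models reference each other, there is no need to write a pile of string annotations just to satisfy definition and import order.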

annotationlib: read annotations in the format you need

Python 3.14 adds annotationlib, which can read annotations in several formats.

from annotationlib import Format, get_annotations


def load_plugin(plugin: Plugin) -> Result:
    ...


print(get_annotations(load_plugin, format=Format.STRING))
print(get_annotations(load_plugin, format=Format.FORWARDREF))


class Plugin:
    pass


class Result:
    pass


print(get_annotations(load_plugin, format=Format.VALUE))

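When to reach for each of the three formats:

Format.STRING: for documentation generators and code scanners that only want to display the annotation text.
Format.FORWARDREF: for framework scanning, where references that cannot be resolved yet are allowed to remain.
Format.VALUE: for the final runtime stage, when real objects are required.

Note: reading annotations can itself trigger code execution, and STRING and FORWARDREF are not sandboxes either. They are useful for reducing the need to have real objects available, but do not casually introspect annotations on untrusted modules.

Framework code can be gentler

Suppose we are writing a route scanner that only needs the function's parameter types to generate API documentation: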

from annotationlib import Format, get_annotations
from collections.abc import Callable


def describe_endpoint(func: Callable) -> dict[str, str]:
    annotations = get_annotations(func, format=Format.STRING)
    return {
        name: str(annotation)
        for name, annotation in annotations.items()
    }


def create_order(payload: CreateOrderRequest) -> OrderResponse:
    ...


print(describe_endpoint(create_order))

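This scanner does not need to actually import CreateOrderRequest or OrderResponse, nor resolve every dependency up front. For a framework, this ability to read the structure first and resolve it later is very practical.

Relationship to `from __future__ import annotations`

If a module still contains: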

from __future__ import annotations

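its annotations keep the stringified semantics. Do not rush to delete the import everywhere during migration; adjust module by module instead. A fairly safe order is:

First confirm how your framework, ORM, and serialization library read annotations.
Then change annotation-reading logic from direct __annotations__ access to annotationlib or typing.get_type_hints.
Only then decide whether to remove the future import.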
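
The second step matters because a module that keeps the future import exposes strings, not objects, in `__annotations__`. A small sketch of the difference follows. The module source is run through exec() only so that the future import can sit at the top of its own module; `MODULE_SOURCE` and `demo_module` are names made up for this example.

```python
import typing

MODULE_SOURCE = '''
from __future__ import annotations


def handle(user: User) -> None:
    ...


class User:
    ...
'''

# Execute the source as if it were a separate module.
namespace: dict[str, object] = {"__name__": "demo_module"}
exec(MODULE_SOURCE, namespace)
handle = namespace["handle"]
User = namespace["User"]

# Direct access returns the stringified annotations.
raw = handle.__annotations__
# get_type_hints() evaluates the strings in the function's globals.
resolved = typing.get_type_hints(handle)

print(raw)  # {'user': 'User', 'return': 'None'}
print(resolved == {"user": User, "return": type(None)})  # True
```

Code that compares raw annotations against classes silently breaks in such modules, which is why routing reads through a resolving helper first makes the later removal of the import less risky.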

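Multiple interpreters: isolated parallelism within one process

[Figure: interpreter pool]

Multiple interpreters can be understood as several mutually isolated Python interpreters running in the same process. They are not ordinary threads, because each interpreter has its own runtime state; they are not subprocesses either, because they all live in one process.

This brings one very attractive property: since each interpreter has its own GIL, multiple interpreters can run Python code in parallel on different CPU cores. The cost is that objects cannot be shared freely and communication has to be explicit.

Creating an interpreter directly

The standard library provides concurrent.interpreters: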

from concurrent import interpreters


interp = interpreters.create()
interp.exec("print('hello from another interpreter')")

result = interp.call(len, [1, 2, 3])
print(result)

interp.close()

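Here interp.exec() runs source code in the other interpreter, and interp.call() sends a callable and its arguments over and brings the result back. Both calls block the calling thread until the other interpreter is done.

Running CPU tasks with InterpreterPoolExecutor

If you prefer the concurrent.futures style, use InterpreterPoolExecutor: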

from concurrent.futures import InterpreterPoolExecutor


def count_prime(limit: int) -> int:
    total = 0
    for n in range(2, limit):
        for i in range(2, int(n ** 0.5) + 1):
            if n % i == 0:
                break
        else:
            total += 1
    return total


limits = [30_000, 32_000, 34_000, 36_000]

with InterpreterPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(count_prime, limits))

print(results)

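The interface looks like a thread pool, but each worker runs tasks in its own interpreter. The callable and its arguments are serialized with pickle on the way in, and the return value on the way back. For pure-Python CPU computation this has more potential than ordinary threads, because the workers do not contend for a single GIL.

Communication must be explicit; do not count on sharing large objects

Isolation between interpreters is strong. Modules imported, globals modified, and state opened in one interpreter are not automatically synchronized to another. Most mutable objects cannot be shared directly either.

This is not a flaw but a design choice: it forces you to spell out the concurrency relationships.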

from concurrent import interpreters


queue = interpreters.create_queue()
replies = interpreters.create_queue()
worker = interpreters.create()

worker.prepare_main(queue=queue, replies=replies)
queue.put("job-001")

worker.exec("""
item = queue.get()
replies.put(f"done: {item}")
""")

print(replies.get())

worker.close()

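In practice you can treat it as an in-process actor: each interpreter owns its own state, and the outside world exchanges data with it only through message passing.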

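Migration and usage advice

[Figure: Python 3.14 migration checklist]

Advice for t-strings

Prefer t-strings for safe output, especially for HTML, SQL, shell, and logging.
Do not convert a Template to a string first and process it afterwards; that throws away the structural advantage.
Wrap rendering in one shared function so the business layer only writes templates and the security policy is enforced in one place.
When exposing an API, state that the parameter type is Template so callers do not pass plain strings.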

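Advice for deferred annotations

Framework code should not read __annotations__ blindly; prefer the reading tools the standard library provides.
Documentation generators should prefer Format.STRING; framework scanning should consider Format.FORWARDREF first.
Use Format.VALUE or typing.get_type_hints only when real objects are truly needed.
Be careful when reading annotations from untrusted modules; do not treat any format as a security sandbox.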

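Advice for multiple interpreters

CPU-bound tasks are worth trying with InterpreterPoolExecutor.
Design state isolation up front; do not treat it as a drop-in replacement for a thread pool.
Keep data sent across interpreters simple and serializable where possible.
Verify separately whether third-party C extensions support multiple interpreters.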
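
One cheap way to act on the "simple and serializable" advice is to check payloads before dispatching them. The sketch below assumes that surviving a pickle round trip is a reasonable proxy for being copyable to another interpreter; `can_cross` is a hypothetical helper, and the exact sharing rules should be checked against the concurrent.interpreters documentation.

```python
import pickle
import threading


def can_cross(obj: object) -> bool:
    """Return True if obj survives a pickle round trip."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception:
        return False
    return True


plain_job = {"job": "job-001", "limits": [30_000, 32_000]}
callback = lambda value: value + 1  # lambdas cannot be pickled
lock = threading.Lock()             # neither can locks

print(can_cross(plain_job))  # True
print(can_cross(callback))   # False
print(can_cross(lock))       # False
```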

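A small combination: t-strings + annotation scanning + interpreter pool

[Figure: Python 3.14 practice roadmap]

Imagine a reporting system:

Define SQL templates with t-strings and convert them automatically to parameterized queries.
Use deferred annotations to scan report function signatures and generate parameter documentation.
Use multiple interpreters to compute several heavy reports in parallel.

The code looks roughly like this: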

from annotationlib import Format, get_annotations
from concurrent.futures import InterpreterPoolExecutor
from string.templatelib import Interpolation, Template


def to_sql(template: Template) -> tuple[str, list[object]]:
    sql_parts: list[str] = []
    params: list[object] = []

    for part in template:
        if isinstance(part, Interpolation):
            sql_parts.append("?")
            params.append(part.value)
        else:
            sql_parts.append(part)

    return "".join(sql_parts), params


def report_schema(func) -> dict[str, str]:
    return {
        key: str(value)
        for key, value in get_annotations(func, format=Format.STRING).items()
    }


def build_sales_report(region: str, minimum: int) -> dict[str, object]:
    sql, params = to_sql(
        t"select * from sales where region={region} and amount>={minimum}"
    )
    return {"sql": sql, "params": params}


print(report_schema(build_sales_report))

jobs = [("east", 100), ("west", 200), ("south", 150)]

with InterpreterPoolExecutor(max_workers=3) as pool:
    futures = [
        pool.submit(build_sales_report, region, minimum)
        for region, minimum in jobs
    ]
    print([future.result() for future in futures])

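The example is not meant to be complex, but it shows the new flavor of Python 3.14: structured templates, controlled annotation reading, and isolated parallel tasks can be combined.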