A new AI compliance service sits between AI models and end users to flag and replace any messages that might present a compliance problem.
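AI commentary · With funding of about ten million US dollars, it intercepts non-compliant content at the AI output stage, opening a new track in model safety.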
28 related items · from the historical archive

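IT之家 (ITHome), June 2: Reuters reported yesterday that the EU plans to introduce new standards for tenders on highly sensitive government cloud projects, excluding large US technology companies such as Microsoft, Amazon, and Google. Citing Reuters, IT之家 notes that the proposal is part of the European Commission's Cloud and AI Development Act and is expected to be formally announced on Wednesday. It aims to further reduce dependence on US companies and promote Europe's domestic industry. Meanwhile, the EU continues to pursue "digital sovereignty" in sensitive sectors such as banking, energy, and healthcare…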
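AI commentary · The EU is accelerating its "digital sovereignty" strategy and restricting US companies from cloud tenders, which will reshape the global technology competition landscape.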

In this post, we use a lakehouse data agent to demonstrate how you can use Policy for deterministic access control and Lambda interceptors for dynamic validation. We then show how…
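AI commentary · A new Amazon Bedrock feature enables security controls for AI agents, combining policy with dynamic validation to give the industry a deployable protection approach.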
Aligning Large Language Models (LLMs) with human values often degrades their general capabilities, termed the alignment tax. Existing methods mitigate this by balancing dual objectives, which heavily…
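AI commentary · Uses local policy distillation to reduce the cost of alignment, preserving safety while avoiding degradation of general capabilities, and offers a new approach to efficient safe alignment.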
Our approach to AI policy and political advocacy: transparency, support for thoughtful regulation and AI safety, and the principle that no outside political group speaks on the company’s behalf.
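AI commentary · OpenAI publicly sets out its AI policy stance, emphasizing transparency and safety regulation, reflecting the responsibility and influence of a tech giant in AI governance.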
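36氪 (36Kr) has learned that Huajin Securities (华金证券) published a research report stating that, based on a review of history, the core factors driving the A-share market in June are policy and external events, fundamentals, and liquidity. This June, A-shares may continue a volatile but relatively strong trend, with limited impact from factors such as the World Cup. On sector allocation, the technology theme may remain unchanged in June, and the report recommends continuing to buy on dips: first, sectors where policy and industry trends are improving, namely electronics (semiconductors, AI hardware), communications (AI hardware), power equipment and new energy (AI power, lithium batteries), defense (commercial aerospace), media (AI applications, games), and computers (A…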
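AI commentary · The continuity of the technology theme resonates with policy events; structural opportunities in a range-bound market are worth watching.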
Pope Leo XIV’s new encyclical on artificial intelligence includes a statement that warrants serious attention from technologists and policymakers: “Technology is never neutral.” Ma…
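AI commentary · The papal encyclical pinpoints the non-neutral nature of technology, offering an ethical anchor for humanity in the AI era.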
Pope Leo XIV’s new encyclical on artificial intelligence includes a statement that warrants serious attention from technologists and policymakers: “Technology is never neutral.” Ma…
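AI commentary · The papal encyclical pinpoints that technology is not neutral, offering AI ethics a moral framework beyond utilitarianism that merits reflection from the technology and policy communities.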
Building strong reward models (RMs) for language model alignment is bottlenecked by the cost and difficulty of acquiring diverse and reliable preference data from human annotation or judge models. It…
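AI commentary · Improving reward models with data generated by the model itself breaks through the human-annotation bottleneck and is an efficient self-supervised path for RLHF.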
Group-advantage-based reinforcement learning methods, such as GRPO and DAPO, have demonstrated strong performance across diverse domains, including mathematical reasoning and text-to-image generation.…
On-policy distillation (OPD) trains a student on prefixes sampled from its own policy while matching a stronger teacher. This addresses the prefix mismatch of offline distillation, but early student r…
Video world models (WMs) have shown promise for policy evaluation and improvement by imagining realistic future observations conditioned on ego-robot actions. While WMs can model distributions over fu…
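AI commentary · Driving video world models with stress-test scenarios improves the robustness of robot policy evaluation and the resulting policy improvement.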

Here’s why Anthropic and OpenAI are on board with Illinois safety testing.
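AI commentary · Illinois's new AI law wins OpenAI's support, Trump's regulatory influence weakens, and industry safety standards may be at a turning point.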

Here’s why Anthropic and OpenAI are on board with Illinois safety testing.
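AI commentary · Illinois's new law weakens federal dominance over AI regulation and has the support of Anthropic and OpenAI, underscoring the industry's … regarding safety testing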
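AI commentary · Zig explicitly rejects AI and develops independently, revealing new thinking in programming-language communities about technological autonomy.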
Explore OpenAI’s Frontier Governance Framework and how our AI safety, security, and risk practices align with emerging EU and California regulations.
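AI commentary · Discloses for the first time concrete practices for aligning AI safety with EU and US regulation, providing a reference for industry compliance.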
Explore OpenAI’s Frontier Governance Framework and how our AI safety, security, and risk practices align with emerging EU and California regulations.
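AI commentary · The Frontier Governance Framework for the first time benchmarks AI safety practices against EU and California regulations, providing a template for industry compliance.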
We study two-level autoresearch for cooperation: an outer-loop AI agent autonomously redesigns the inner-loop pipeline of an LLM policy-synthesis system for multi-agent Sequential Social Dilemmas (SSD…
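AI commentary · An AI pipeline design that automatically explores cooperation strategies, offering an innovative solution for multi-agent sequential social dilemmas.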
While GUI agents have advanced rapidly, they often lack the robustness to recover from their own errors, hindering real-world deployment. To bridge this gap at both the evaluation and data levels, we…
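AI commentary · Provides GUI agents with an evaluation benchmark for self-correction ability and a trajectory synthesis method, filling a key gap in real-world deployment.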
Reinforcement learning (RL) can be used to improve the policy (denoiser) of diffusion large language models (dLLMs), while being hindered by the intractability of the policy likelihood. A dominant and…
Speculative decoding accelerates large language model inference by pairing a target model with a lightweight draft model whose proposed tokens are verified in parallel. A common way to build draft mod…
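AI commentary · Improves speculative decoding efficiency through on-policy distillation, offering a better training scheme for accelerating large-model inference.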
When a large language model under reinforcement learning commits a wrong reasoning step early in a trajectory, standard algorithms force it to keep generating until the maximum horizon, spending compu…
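AI commentary · Uses an early-stopping mechanism to improve the efficiency of reinforcement-learning training for large models, substantially reducing wasted computation.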
AI commentary · ripgrep integrates AI search and may become a powerful productivity tool for developers.
On-policy distillation (OPD) trains a student on its own rollouts with token-level teacher supervision. Recent selective OPD methods exploit the non-uniformity of OPD signals by prioritizing high-entr…
Customized image editing aims to equip pre-trained diffusion models with specific visual effects using limited paired data, typically via Low-Rank Adaptation (LoRA). As the number of desired effects g…
AI commentary · Multi-teacher online distillation lets a single LoRA model integrate 50 image effects, significantly reducing deployment cost.
Reinforcement learning with verifiable rewards (RLVR) has become a core technique for post-training of Large Language Models (LLMs). While policy optimization is driven by all sampled tokens under a g…
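AI commentary · Temporal scheduling extends reinforcement learning from the spatial dimension to the temporal dimension, opening a new approach to improving reasoning efficiency.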
Multi-agent LLM workflows route inference through specialized roles to lift end-task accuracy, but jointly training those roles with reinforcement learning is unstable in ways that are poorly understo…
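AI commentary · The key to improving large-model collaboration efficiency with multi-agent reinforcement learning lies in understanding the trade-off between the scale of role specialization and policy sharing.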
Runtime security monitoring and control for AI agents. Catches malicious tool use, prompt injection, and policy drift in real time, before the agent acts.