chaos-api

Author	SHA1	Message	Date
CaIon	fddf54ccc5	perf: reduce heap residency for large base64 relay requests Three layered optimizations targeting Gemini-style 5MB base64 payloads where RSS could balloon to tens of GB under concurrent load: 1. Byte-based param override (relay/common/override.go) - Switch legacy/operations hot paths from common.Marshal round-trips and map[string]any conversions to gjson/sjson on []byte directly. - Avoids cloning 5MB strings during each Set/Delete operation. 2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go) - Replace string concatenation + strings.Join when assembling "![image](data:...;base64,DATA)" content for inline image responses. - Pre-allocates capacity from inline_data byte sizes. 3. Outbound BodyStorage + streaming Decoder (this commit's core) - New relay/common/outbound_body.go helper wraps marshaled upstream bodies in common.BodyStorage, allowing disk-cache mode to offload jsonData to a temp file while waiting for upstream TTFB. The original []byte can then be GC'd, removing ~5MB/req of heap residency during the longest window of a request. - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/ rerank) plus chat_completions_via_responses adopt the helper with defer closer.Close() and explicit jsonData = nil. - relay/common/relay_info.go: new UpstreamRequestBodySize so relay/channel/api_request.go can populate req.ContentLength (lost when body becomes a type-erased io.Reader). - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and content-type is JSON, decode via DecodeJson(storage) instead of storage.Bytes()+Unmarshal, removing one transient 5MB copy per request. memory mode and form/multipart paths unchanged.	2026-05-22 19:08:38 +08:00
Seefs	0936e25046	perf: avoid eager formatting in debug log calls (#4929)	2026-05-19 12:11:24 +08:00
Seefs	9ecad90652	refactor: optimize billing flow for OpenAI-to-Anthropic convert	2026-03-23 14:22:12 +08:00
Seefs	2cf3c1836c	fix: preserve explicit zero values in native relay requests	2026-03-01 15:47:03 +08:00
Seefs	0f09dbda2b	Merge branch 'upstream-main' into feature/improve-param-override # Conflicts: # relay/channel/api_request_test.go # relay/common/override_test.go # web/src/components/table/channels/modals/EditChannelModal.jsx	2026-02-25 13:39:54 +08:00
Calcium-Ion	49eb6d3c1e	feat: add missing OpenAI/Claude/Gemini request fields (#2971) * feat: add missing OpenAI/Claude/Gemini request fields and responses stream options * fix: skip field filtering when request passthrough is enabled * fix: include subscription in personal sidebar module controls * feat: gate Claude inference_geo passthrough behind channel setting and add field docs	2026-02-22 23:31:18 +08:00
Seefs	8cfc2b4398	fix: claude affinity cache counter (#2980) * fix: claude affinity cache counter * fix: claude affinity cache counter * fix: stabilize cache usage stats format and simplify modal rendering	2026-02-22 23:30:02 +08:00
Seefs	7633863c96	feat: unify param/header overrides with retry-aware conditions and flexible header operations	2026-02-22 00:45:49 +08:00
Seefs	aebc8ae254	feat: add retry-aware param override with return_error and prune_objects	2026-02-22 00:10:49 +08:00
Seefs	1dfffcf1ea	fix: skip field filtering when request passthrough is enabled	2026-02-19 15:09:13 +08:00
CaIon	29d48e262e	feat: refactor request body handling to use BodyStorage for improved efficiency	2026-02-12 01:51:27 +08:00
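Seefs	a972722367	fix: some channels called through the OpenAI-compatible interface still used the OpenAI input_token deduction logic even when the final endpoint was the Claude native endpoint	2026-02-07 14:21:19 +08:00
CaIon	116004fd44	refactor: abstract a unified billing session (BillingSession) Consolidates the pre-consume/settle/refund logic scattered across multiple files into unified BillingSession lifecycle management: - Add BillingSettler interface (relay/common/billing.go) to avoid circular imports - Add FundingSource interface + WalletFunding / SubscriptionFunding implementations (service/funding_source.go) - Add BillingSession wrapping the pre-consume/settle/refund atomic operations (service/billing_session.go) - Add SettleBilling unified settlement helper, replacing the quotaDelta pattern in each handler - Rewrite PreConsumeBilling as the BillingSession factory entry point - controller/relay.go refund guard now uses BillingSession.Refund() Bugs fixed: - Token quota leak: no rollback when PreConsumeTokenQuota succeeded but DecreaseUserQuota failed - Missed subscription refund: refund was skipped when FinalPreConsumedQuota=0 but SubscriptionPreConsumed>0 - Subscription overcharge: subConsume was forced to 1 while FinalPreConsumedQuota was not kept in sync - Inconsistent refund paths: wallet/subscription refund logic is now dispatched uniformly by FundingSource.Refund	2026-02-06 23:14:25 +08:00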
Seefs	f244a9e661	fix: channel affinity (#2799) * fix: channel affinity log styles * fix: Issue with incorrect data storage when switching key sources * feat: support not retrying after a single rule configuration fails * fix: render channel affinity tooltip as multiline content * feat: channel affinity cache hit * fix: prevent ChannelAffinityUsageCacheModal infinite loading and hide data before fetch * chore: format backend with gofmt and frontend with prettier/eslint autofix	2026-02-02 14:37:31 +08:00
Seefs	d9321b7da3	feat: channel affinity (#2669) * feat: channel affinity * feat: channel affinity -> model setting * fix: channel affinity * feat: channel affinity op * feat: channel_type setting * feat: clean * feat: cache supports both memory and Redis. * feat: Optimise ui/ux * feat: Optimise ui/ux * feat: Optimise codex usage ui/ux * feat: Optimise ui/ux * feat: Optimise ui/ux * feat: Optimise ui/ux * feat: If the affinitized channel fails and a retry succeeds on another channel, update the affinity to the successful channel	2026-01-26 19:57:41 +08:00
Seefs	68e1e635e9	feat: logs show reject reason	2026-01-25 14:52:18 +08:00
Seefs	38791fa46d	feat: log shows request conversion	2026-01-20 23:43:29 +08:00
Seefs	f9c7daedcf	fix: for chat-based calls to the Claude model, tagging is required. Using Claude's rendering logs, the two approaches handle input rendering differently.	2026-01-15 15:28:02 +08:00
Seefs	2a15e3b152	feat: codex channel (#2652) * feat: codex channel * feat: codex channel * feat: codex oauth flow * feat: codex refresh cred * feat: codex usage * fix: codex err message detail * fix: codex setting ui * feat: codex refresh cred task * fix: import err * fix: codex store must be false * fix: chat -> responses tool call * fix: chat -> responses tool call	2026-01-14 22:29:43 +08:00
Seefs	71460cba15	feat: /v1/chat/completion -> /v1/response (#2629) * feat: /v1/chat/completion -> /v1/response	2026-01-11 21:38:07 +08:00
CaIon	87a75b0565	feat(ratio): add functions to check for audio ratios and clean up unused code	2025-12-31 21:29:10 +08:00
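CaIon	b5a0c822d2	feat(adaptor): add support for several Bailian image generation models - wan2.6 series image generation and editing, with multi-image generation billing - wan2.5 series image generation and editing - z-image-turbo image generation, with prompt_extend billing	2025-12-29 23:00:17 +08:00
长安	6e3bc06fa6	fix: correct cache billing for the Anthropic channel ## Problem When an Anthropic channel is called through the `/v1/chat/completions` endpoint with caching enabled, the billing logic wrongly subtracts the cache tokens, causing a severe revenue loss (94.5%). ## Root cause Different APIs define `prompt_tokens` differently: - Anthropic API: the `input_tokens` field is already pure input tokens (cache excluded) - OpenAI API: the `prompt_tokens` field contains all tokens (cache included) - OpenRouter API: the `prompt_tokens` field contains all tokens (cache included) The current `postConsumeQuota` function subtracts cache tokens for every channel, which is wrong for the Anthropic channel because its `input_tokens` already excludes the cache. ## Fix In the `postConsumeQuota` function in `relay/compatible_handler.go`, add a channel type check: ```go if relayInfo.ChannelType != constant.ChannelTypeAnthropic { baseTokens = baseTokens.Sub(dCacheTokens) } ``` Cache tokens are subtracted only for non-Anthropic channels. ## Impact analysis ### ✅ Unaffected scenarios 1. Calls without cache (all channels) - cache_tokens = 0 - subtracting 0 = no subtraction - result: identical 2. OpenAI/OpenRouter channel + cache - cache is still subtracted (because ChannelType != Anthropic) - result: identical 3. Anthropic channel + /v1/messages endpoint - uses PostClaudeConsumeQuota (not modified) - result: completely unaffected ### ✅ Fixed scenarios 4. Anthropic channel + /v1/chat/completions + cache - before the fix: cache wrongly subtracted, causing a 94.5% revenue loss - after the fix: cache not subtracted, billing is correct ## Verification data Using actual record 143509 as an example: \| Item \| Before fix \| After fix \| Difference \| \|------\|--------\|--------\|------\| \| Quota \| 10,489 \| 191,330 \| +180,841 \| \| Cost \| ¥0.020978 \| ¥0.382660 \| +¥0.361682 \| \| Revenue recovered \| - \| - \| +1724.1% \| ## Testing suggestions 1. Test the Anthropic channel + cache scenario 2. Test the OpenAI channel + cache scenario (make sure it is unaffected) 3. Test the no-cache scenario (make sure it is unaffected) ## Related issue Fixes the billing error when an Anthropic channel uses prompt caching.	2025-12-20 14:17:12 +08:00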
CaIon	be2a863b9b	feat(audio): enhance audio request handling with token type detection and streaming support	2025-12-13 17:24:23 +08:00
CaIon	1fededceb3	feat: refactor token estimation logic - Introduced new OpenAI text models in `common/model.go`. - Added `IsOpenAITextModel` function to check for OpenAI text models. - Refactored token estimation methods across various channels to use estimated prompt tokens instead of direct prompt token counts. - Updated related functions and structures to accommodate the new token estimation approach, enhancing overall token management.	2025-12-02 21:34:39 +08:00
Seefs	dbdcb14e7a	feat: embedding param override && internal params	2025-11-22 18:27:17 +08:00
Seefs	5010f2d004	format: package name -> github.com/QuantumNous/new-api (#2017)	2025-10-11 15:30:09 +08:00
Seefs	d075fbee23	fix: missing field & field control	2025-10-02 00:14:35 +08:00
CaIon	552d795742	Merge branch 'alpha'	2025-09-19 14:20:15 +08:00
creamlike1024	f6984272bf	fix: OpenAI Responses API did not count billing for image generation calls	2025-09-16 12:47:59 +08:00
Xyfacai	c176e713f7	fix: using the SystemPrompt setting on non-OpenAI channels causes a panic	2025-09-15 19:38:31 +08:00
CaIon	1a8d781721	Revert "feat: gemini-2.5-flash-image-preview text and image output billing" This reverts commit `a45513a7a6`.	2025-09-13 12:53:28 +08:00
Xyfacai	3f9adc9992	fix: create cache tokens were not billed for OpenAI-format requests to Claude	2025-09-10 15:30:23 +08:00
creamlike1024	a45513a7a6	feat: gemini-2.5-flash-image-preview text and image output billing	2025-08-27 21:30:52 +08:00
CaIon	1b8bcfb000	fix: update error types for upstream errors and JSON marshal failure	2025-08-26 16:26:56 +08:00
CaIon	13301d8544	fix: enhance error handling for invalid request types in relay handlers	2025-08-23 13:34:56 +08:00
CaIon	060ce89286	refactor: rename relay-text.go to compatible_handler.go for clarity	2025-08-23 13:13:57 +08:00

37 Commits
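
Note on commit 6e3bc06fa6: the channel type check described in that message can be illustrated with a minimal, standalone sketch. The `ChannelType` type, its two constants, and the `BillableInputTokens` function below are hypothetical names introduced only for this illustration. The real code lives in `postConsumeQuota` and works on decimal values, while this sketch uses plain integers.

```go
package main

import "fmt"

// ChannelType is a hypothetical stand-in for the project's channel type constants.
type ChannelType int

const (
	ChannelTypeOpenAI ChannelType = iota
	ChannelTypeAnthropic
)

// BillableInputTokens returns the input tokens billed at the base rate.
// Per the commit message, OpenAI-style prompt_tokens include cached tokens,
// so they are subtracted; Anthropic input_tokens already exclude them.
func BillableInputTokens(ch ChannelType, promptTokens, cacheTokens int) int {
	if ch != ChannelTypeAnthropic {
		return promptTokens - cacheTokens
	}
	return promptTokens
}

func main() {
	// OpenAI-style usage: 1000 prompt tokens, of which 800 were cached.
	fmt.Println(BillableInputTokens(ChannelTypeOpenAI, 1000, 800)) // 200
	// Anthropic-style usage: 200 input tokens reported separately from 800 cached.
	fmt.Println(BillableInputTokens(ChannelTypeAnthropic, 200, 800)) // 200
}
```

Both calls describe the same request shape (200 uncached and 800 cached tokens) and yield the same billable base. Subtracting the cache in the Anthropic case would instead give a negative base here, which is the kind of under-billing the commit reports.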