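# Introduction

> Venice API reference documentation

The Venice API provides HTTP-based REST and streaming interfaces for building AI applications with uncensored models and private inference. You can build text generation, image generation, embeddings, and more, all without restrictive content policies. Integration examples and SDKs are available in the [documentation](/overview/getting-started). The API reference is also available as an [OpenAPI YAML spec](https://api.venice.ai/doc/api/swagger.yaml).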

## Authentication

The Venice API uses API keys for authentication. Create and manage your API keys in the [API settings](https://venice.ai/settings/api).

Every API request requires HTTP Bearer authentication:

```
Authorization: Bearer VENICE_API_KEY
```

<Note>
  API keys are secret. Do not share them or expose them in client-side code.
</Note>
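
As a minimal sketch of the header format above, the Python helper below builds the `Authorization` header from an environment variable so the key never appears in source code. The function name `venice_auth_headers` is hypothetical; only the `Authorization: Bearer ...` format and the `VENICE_API_KEY` variable name come from this page.

```python
import os


def venice_auth_headers(env=None):
    """Build HTTP headers for a Venice API request (hypothetical helper).

    Reads the key from the VENICE_API_KEY environment variable rather than
    hard-coding it, and fails loudly if the variable is missing or empty.
    """
    env = os.environ if env is None else env
    api_key = env.get("VENICE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("VENICE_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # Placeholder value for demonstration; a real key should come from the environment.
    print(venice_auth_headers({"VENICE_API_KEY": "example-key"}))
```

Passing a mapping explicitly, as in the demo, keeps the helper easy to test without touching the real environment.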

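## OpenAI Compatibility

Venice's API implements the OpenAI API specification, which ensures compatibility with existing OpenAI clients and tools. You can integrate with Venice through the familiar OpenAI interface while still accessing Venice's own features and uncensored models.

### Setup

Configure your client to use Venice's base URL (`https://api.venice.ai/api/v1`) and send your first request: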

<CodeGroup>
  ```bash curl theme={"system"}
  curl https://api.venice.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $VENICE_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "venice-uncensored",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
  ```

  ```javascript JavaScript theme={"system"}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.VENICE_API_KEY,
    baseURL: "https://api.venice.ai/api/v1",
  });

  const response = await client.chat.completions.create({
    model: "venice-uncensored",
    messages: [{ role: "user", content: "Hello!" }]
  });

  console.log(response.choices[0].message.content);
  ```

  ```python Python theme={"system"}
  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ.get("VENICE_API_KEY"),
      base_url="https://api.venice.ai/api/v1"
  )

  response = client.chat.completions.create(
      model="venice-uncensored",
      messages=[{"role": "user", "content": "Hello!"}]
  )

  print(response.choices[0].message.content)
  ```
</CodeGroup>

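## Venice-Specific Features

### System Prompts

Venice provides a default system prompt designed to ensure natural, uncensored model responses. There are two options for handling system prompts:

1. **Default behavior**: Your system prompt is appended to Venice's default
2. **Custom behavior**: Venice's system prompt is disabled entirely

#### Disabling the Venice System Prompt

To remove Venice's default system prompt, use the `venice_parameters` option: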

<CodeGroup>
  ```bash curl theme={"system"}
  curl https://api.venice.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $VENICE_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "venice-uncensored",
      "messages": [
        {"role": "system", "content": "Your custom system prompt"},
        {"role": "user", "content": "Why is the sky blue?"}
      ],
      "venice_parameters": {
        "include_venice_system_prompt": false
      }
    }'
  ```

  ```javascript JavaScript theme={"system"}
  const completion = await client.chat.completions.create({
    model: "venice-uncensored",
    messages: [
      {
        role: "system",
        content: "Your custom system prompt",
      },
      {
        role: "user",
        content: "Why is the sky blue?",
      },
    ],
    venice_parameters: {
      include_venice_system_prompt: false,
    },
  });
  ```

  ```python Python theme={"system"}
  response = client.chat.completions.create(
      model="venice-uncensored",
      messages=[
          {"role": "system", "content": "Your custom system prompt"},
          {"role": "user", "content": "Why is the sky blue?"}
      ],
      extra_body={
          "venice_parameters": {
              "include_venice_system_prompt": False
          }
      }
  )
  ```
</CodeGroup>

### Venice Parameters

The `venice_parameters` object gives access to Venice-specific features that are not available in the standard OpenAI API:

| Parameter                            | Type    | Description | Default |
| ------------------------------------ | ------- | ----------- | ------- |
| `character_slug`                     | string  | The character slug of a public Venice character (discoverable as the "Public ID" on the published character page) | -       |
| `strip_thinking_response`            | boolean | Strip `<think></think>` blocks from the response (for models that use the legacy `<think>` tag format). See [Reasoning Models](/guides/features/reasoning-models). | `false` |
| `disable_thinking`                   | boolean | On supported reasoning models, disable thinking and strip `<think></think>` blocks from the response | `false` |
| `enable_web_search`                  | string  | Enable web search for this request (`off`, `on`, `auto`; `auto` enables it at the model's discretion)<br />Additional usage-based pricing applies, see [pricing](/overview/pricing#web-search-and-scraping). | `off`   |
| `enable_web_scraping`                | boolean | Enable web scraping for up to 5 URLs detected in the user message. Scraped content augments the response and bypasses web search. Only successfully scraped URLs are billed.<br />Additional usage-based pricing applies, see [pricing](/overview/pricing#web-search-and-scraping). | `false` |
| `enable_x_search`                    | boolean | Enable xAI's native search (web + X/Twitter) on supported Grok models (e.g., `grok-4-20-beta`). Uses xAI's search infrastructure for higher-quality search results. When enabled, Venice's standard web search is bypassed.<br />Additional usage-based pricing applies, see [pricing](/overview/pricing#web-search-and-scraping). | `false` |
| `enable_web_citations`               | boolean | When web search is enabled, ask the LLM to cite its sources using the `[REF]0[/REF]` format | `false` |
| `include_search_results_in_stream`   | boolean | Experimental: include search results in the stream as the first emitted chunk | `false` |
| `return_search_results_as_documents` | boolean | Surface search results in an OpenAI-compatible tool call named `venice_web_search_documents`, for LangChain integration | `false` |
| `include_venice_system_prompt`       | boolean | Whether to include Venice's default system prompt alongside the specified system prompt | `true`  |
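
Several of these parameters can be combined in a single request. The sketch below assembles a chat completion request body that turns on web search with citations. `build_chat_body` is a hypothetical helper and the user message is made up; the parameter names and allowed values are taken from the table above. With the OpenAI Python SDK, the same `venice_parameters` dictionary would be passed through `extra_body`.

```python
import json


def build_chat_body(model, user_message, venice_parameters=None):
    """Assemble a /chat/completions request body (hypothetical helper)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    if venice_parameters:
        # Venice-specific options are grouped under one top-level key.
        body["venice_parameters"] = dict(venice_parameters)
    return body


body = build_chat_body(
    "venice-uncensored",
    "Summarize today's top technology headlines.",
    venice_parameters={
        "enable_web_search": "auto",            # string: "off", "on", or "auto"
        "enable_web_citations": True,           # boolean
        "include_venice_system_prompt": False,  # boolean, defaults to true
    },
)

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```

Note that `enable_web_search` takes a string while the other two options are booleans, which is an easy detail to get wrong when building the body by hand.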

<Note>
  These parameters can also be specified as model suffixes appended to the model name (e.g., `zai-org-glm-5:enable_web_search=auto`). For details, see [Model Feature Suffixes](/api-reference/endpoint/chat/model_feature_suffix).
</Note>
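
The suffix form in the note above can be produced with a small string helper. This is a sketch only: `with_feature_suffix` is a hypothetical name, it covers just the single-parameter form shown in the note, and rendering booleans as lowercase `true`/`false` is an assumption that should be checked against the model feature suffix page.

```python
def with_feature_suffix(model, parameter, value):
    """Return `model:parameter=value` (hypothetical helper).

    Only the single-parameter form from the note is covered. Booleans are
    rendered as lowercase true/false, which is an assumption.
    """
    if isinstance(value, bool):
        value = "true" if value else "false"
    return f"{model}:{parameter}={value}"


if __name__ == "__main__":
    print(with_feature_suffix("zai-org-glm-5", "enable_web_search", "auto"))
```

The resulting string would be used as the `model` value of the request in place of the plain model ID.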

### Prompt Caching

Venice supports prompt caching on select models to reduce latency and cost for repeated content. On supported models, Venice caches system prompts automatically, with no code changes required. You can also mark content for caching manually with the `cache_control` property on message content.

| Parameter          | Type   | Description |
| ------------------ | ------ | ----------- |
| `prompt_cache_key` | string | Optional routing hint to improve cache hit rates. When provided, Venice routes requests to the same backend infrastructure, which increases the likelihood of cache hits in multi-turn conversations. |

For details on how caching works, billing, and best practices, see [Prompt Caching](/guides/features/prompt-caching).
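
One way to use `prompt_cache_key` is to derive a stable value per conversation so that every turn carries the same routing hint. The sketch below does this with a hash. The helper names, the `chat-42` conversation ID, and the placement of `prompt_cache_key` as a top-level field of the request body are all assumptions; confirm them in the prompt caching guide before relying on this.

```python
import hashlib


def cache_key_for(conversation_id):
    """Derive a short, stable routing hint from a conversation ID (hypothetical)."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).hexdigest()
    return f"conv-{digest[:16]}"


def build_turn(model, messages, conversation_id):
    """Build a request body whose cache key is identical on every turn."""
    return {
        "model": model,
        "messages": messages,
        # Assumed to be a top-level field of the request body.
        "prompt_cache_key": cache_key_for(conversation_id),
    }


if __name__ == "__main__":
    turn = build_turn(
        "venice-uncensored",
        [{"role": "user", "content": "Hi"}],
        "chat-42",
    )
    print(turn["prompt_cache_key"])
```

Hashing the conversation ID keeps the key deterministic without sending the raw identifier in the request.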

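## Response Headers Reference

Every Venice API response includes HTTP headers that carry metadata about the request, rate limits, model information, and account balance. In addition to the error codes returned in the API response, you can inspect these headers to get the unique ID of a specific API request, monitor rate limits, and track your account balance.

Venice recommends logging the request ID (the `CF-RAY` header) in production deployments so that issues can be resolved more efficiently with the support team if needed.

The table below is a comprehensive reference for all headers you may encounter: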

| Header                                      | Type   | Purpose                                                         | When Returned                                   |
| ------------------------------------------- | ------ | --------------------------------------------------------------- | ----------------------------------------------- |
| **Standard HTTP Headers**                   |        |                                                                 |                                                 |
| `Content-Type`                              | string | MIME type of the response body (`application/json`, `text/csv`, `image/png`, etc.) | Always                                          |
| `Content-Encoding`                          | string | Encoding used to compress the response body (`gzip`, `br`) | When client sends `Accept-Encoding` header      |
| `Content-Disposition`                       | string | How the content should be displayed (e.g., `attachment; filename=export.csv`) | When downloading files or exports               |
| `Date`                                      | string | RFC 7231 formatted timestamp of when the response was generated | Always                                          |
| **Request Identification**                  |        |                                                                 |                                                 |
| `CF-RAY`                                    | string | Unique identifier for this API request, used for troubleshooting and support requests | Always                                          |
| `x-venice-version`                          | string | Current version/revision of the Venice API service (e.g., `20250828.222653`) | Always                                          |
| `x-venice-timestamp`                        | string | Server timestamp of when the request was processed (ISO 8601 format) | When timestamp tracking is enabled              |
| `x-venice-host-name`                        | string | Hostname of the server that processed the request | Error responses and debugging scenarios         |
| **Model Information**                       |        |                                                                 |                                                 |
| `x-venice-model-id`                         | string | Unique identifier of the AI model used for the request (e.g., `venice-01-lite`) | Inference endpoints using AI models             |
| `x-venice-model-name`                       | string | Friendly/display name of the AI model used (e.g., `Venice Lite`) | Inference endpoints using AI models             |
| `x-venice-model-router`                     | string | Router/backend service that handled the model inference | Inference endpoints when routing info available |
| `x-venice-model-deprecation-warning`        | string | Warning message for models scheduled for deprecation | When using a deprecated model                   |
| `x-venice-model-deprecation-date`           | string | Date when the model will be deprecated (ISO 8601 date) | When using a deprecated model                   |
| **Rate Limiting Information**               |        |                                                                 |                                                 |
| `x-ratelimit-limit-requests`                | number | Maximum number of requests allowed in the current time window | All authenticated requests                      |
| `x-ratelimit-remaining-requests`            | number | Number of requests remaining in the current time window | All authenticated requests                      |
| `x-ratelimit-reset-requests`                | number | Unix timestamp at which the request rate limit resets | All authenticated requests                      |
| `x-ratelimit-limit-tokens`                  | number | Maximum number of tokens (prompt + completion) allowed in the time window | All authenticated requests                      |
| `x-ratelimit-remaining-tokens`              | number | Number of tokens remaining in the current time window | All authenticated requests                      |
| `x-ratelimit-reset-tokens`                  | number | Seconds until the token rate limit resets | All authenticated requests                      |
| `x-ratelimit-type`                          | string | Type of rate limit applied (`user`, `api_key`, `global`) | When rate limiting is enforced                  |
| **Pagination Headers**                      |        |                                                                 |                                                 |
| `x-pagination-limit`                        | number | Number of items per page | Paginated endpoints                             |
| `x-pagination-page`                         | number | Current page number (1-based) | Paginated endpoints                             |
| `x-pagination-total`                        | number | Total number of items across all pages | Paginated endpoints                             |
| `x-pagination-total-pages`                  | number | Total number of pages | Paginated endpoints                             |
| **Account Balance Information**             |        |                                                                 |                                                 |
| `x-venice-balance-diem`                     | string | DIEM token balance before the request was processed | All authenticated requests                      |
| `x-venice-balance-usd`                      | string | USD credit balance before the request was processed | All authenticated requests                      |
| **Content Safety Headers**                  |        |                                                                 |                                                 |
| `x-venice-is-blurred`                       | string | Whether the generated image was blurred due to content policy (`true`/`false`) | Image generation with Safe Venice enabled       |
| `x-venice-is-content-violation`             | string | Whether the content violates Venice's content policy (`true`/`false`) | Content generation endpoints                    |
| `x-venice-is-adult-model-content-violation` | string | Whether the content violates the adult model content policy (`true`/`false`) | Image generation endpoints                      |
| `x-venice-contains-minor`                   | string | Whether the image contains a minor (`true`/`false`) | Image analysis endpoints with age detection     |
| **Client Information**                      |        |                                                                 |                                                 |
| `x-venice-middleface-version`               | string | Venice middleface client version | Requests from Venice middleface clients         |
| `x-venice-mobile-version`                   | string | Venice mobile app client version | Requests from mobile applications               |
| `x-venice-request-timestamp-ms`             | number | Client-provided request timestamp (milliseconds) | When client provides timestamp in request       |
| `x-venice-control-instance`                 | string | Control instance identifier for debugging | Image generation endpoints for debugging        |
| **Authentication Headers**                  |        |                                                                 |                                                 |
| `x-auth-refreshed`                          | string | Whether the authentication token was refreshed during the request (`true`/`false`) | When authentication tokens are auto-refreshed   |
| `x-retry-count`                             | number | Number of retry attempts for the request | When request retries occur                      |

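### Important Notes

* **Header name casing**: HTTP headers are case-insensitive, but Venice uses lowercase with hyphens for consistency
* **String values**: Boolean values in headers are returned as strings (`"true"` or `"false"`)
* **Numeric values**: Large numbers and balance values may be returned as strings to avoid precision loss
* **Optional headers**: Not every header is returned on every response; presence depends on the endpoint and request context
* **Compression**: Send `Accept-Encoding: gzip, br` in your requests to receive compressed responses where supported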
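
The notes above imply some conversion work on the client side. The following sketch, in Python, turns string-encoded booleans and balances into typed values and tolerates missing headers. The helper names and the sample header values are hypothetical; the header names come from the table above.

```python
from decimal import Decimal


def parse_bool_header(value):
    """Convert a "true"/"false" header string to bool; None if absent."""
    if value is None:
        return None
    return value.strip().lower() == "true"


def parse_decimal_header(value):
    """Convert a numeric header string to Decimal; None if absent."""
    if value is None:
        return None
    # Decimal keeps the exact digits, unlike float.
    return Decimal(value.strip())


def summarize_headers(headers):
    """Extract a few typed values from a header mapping (hypothetical helper)."""
    # Lowercase the names because HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "request_id": lowered.get("cf-ray"),
        "usd_balance": parse_decimal_header(lowered.get("x-venice-balance-usd")),
        "is_blurred": parse_bool_header(lowered.get("x-venice-is-blurred")),
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {"CF-RAY": "example-ray-id", "x-venice-balance-usd": "12.50"}
    print(summarize_headers(sample))
```

Returning `None` for absent headers keeps "not sent" distinct from `false` or a zero balance, which matters because most headers are optional.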

### Example: Accessing Response Headers

```javascript theme={"system"}
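// After making an API request, read headers from the raw HTTP response:
// a fetch() Response, or the `response` returned by the OpenAI SDK's
// `.withResponse()`, not the parsed completion object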
const requestId = response.headers.get('CF-RAY');
const remainingRequests = response.headers.get('x-ratelimit-remaining-requests');
const remainingTokens = response.headers.get('x-ratelimit-remaining-tokens');
const usdBalance = response.headers.get('x-venice-balance-usd');

// Check for model deprecation warnings
const deprecationWarning = response.headers.get('x-venice-model-deprecation-warning');
if (deprecationWarning) {
  console.warn(`Model Deprecation: ${deprecationWarning}`);
}
```

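## Best Practices

1. **Rate limiting**: Monitor the `x-ratelimit-remaining-requests` and `x-ratelimit-remaining-tokens` headers and implement exponential backoff
2. **Balance monitoring**: Track the `x-venice-balance-usd` and `x-venice-balance-diem` headers to avoid service interruptions
3. **System prompts**: Test with and without Venice's system prompt to find what works best for your use case
4. **API keys**: Store API keys securely and rotate them regularly
5. **Request logging**: Log the `CF-RAY` header value for troubleshooting with support
6. **Model deprecation**: Check the `x-venice-model-deprecation-warning` header when using models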
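
The rate limiting advice in item 1 can be sketched as follows. This is one possible approach, not Venice's prescribed algorithm: the function names, the default base and cap values, and the assumption that header names arrive lowercased are all hypothetical. Production code would typically also add random jitter to the delay.

```python
import time


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff without jitter: base * 2**attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def seconds_until_request_reset(headers, now=None):
    """Seconds until the request window resets, or None if the header is missing.

    Treats `x-ratelimit-reset-requests` as a Unix timestamp, as listed in the
    response headers table.
    """
    raw = headers.get("x-ratelimit-reset-requests")
    if raw is None:
        return None
    now = time.time() if now is None else now
    return max(0.0, float(raw) - now)


def next_wait(headers, attempt, now=None):
    """Pick how long to wait before retrying a rate-limited request."""
    remaining = headers.get("x-ratelimit-remaining-requests")
    reset_in = seconds_until_request_reset(headers, now)
    if remaining is not None and int(remaining) == 0 and reset_in is not None:
        # The window is exhausted, so wait for the reset instead of guessing.
        return reset_in
    return backoff_delay(attempt)


if __name__ == "__main__":
    for attempt in range(4):
        print(attempt, backoff_delay(attempt))
```

When the headers show the request window is exhausted, waiting for the advertised reset avoids retries that are certain to fail; otherwise the delay doubles with each attempt up to the cap.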

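## Differences from OpenAI's API

Venice maintains high compatibility with the OpenAI API specification, but there are a few key differences:

1. **venice\_parameters**: Additional configuration such as `enable_web_search`, `character_slug`, and `strip_thinking_response` for extended functionality
2. **System prompts**: Venice appends your system prompt to a default optimized for uncensored responses (disable with `include_venice_system_prompt: false`)
3. **Model ecosystem**: Venice offers its own [model lineup](/overview/models), including uncensored and reasoning models. Use Venice model IDs rather than OpenAI mappings
4. **Response headers**: Unique headers for balance tracking (`x-venice-balance-usd`, `x-venice-balance-diem`), model deprecation warnings, and content safety flags
5. **Content policy**: A more permissive policy, with dedicated uncensored models and optional content filtering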

## API Stability

Venice maintains backward compatibility for v1 endpoints and parameters. For the model lifecycle policy, deprecation notices, and migration guides, see [Deprecations](/overview/deprecations).

## OpenAPI Specification & Raw Data

The following resources are available for programmatic access to Venice API documentation and data, including use with Retrieval-Augmented Generation (RAG):

* [OpenAPI Spec (YAML)](https://api.venice.ai/doc/api/swagger.yaml): the complete API specification in YAML format
* [API Docs Source](https://github.com/veniceai/api-docs/archive/refs/heads/main.zip): all documentation pages (`.mdx` format) as a downloadable archive

***

<sub>Request fields not listed in this documentation may be passed through, but they are not validated or guaranteed to work.</sub>