Use cases of SDK

For more detailed instructions, including API documentation and usage examples, please refer to the Use case.

Download model

from pycsghub.snapshot_download import snapshot_download
token = "your_access_token"

endpoint = "https://hub.opencsg.com"
repo_id = 'OpenCSG/csg-wukong-1B'
cache_dir = '/Users/hhwang/temp/'
result = snapshot_download(repo_id, cache_dir=cache_dir, endpoint=endpoint, token=token)

Download a model, allowing only files that match '*.json' and ignoring files that match '*_config.json'

from pycsghub.snapshot_download import snapshot_download
token = "your_access_token"

endpoint = "https://hub.opencsg.com"
repo_id = 'OpenCSG/csg-wukong-1B'
cache_dir = '/Users/hhwang/temp/'
allow_patterns = ["*.json"]
ignore_patterns = ["*_config.json"]
result = snapshot_download(repo_id, cache_dir=cache_dir, endpoint=endpoint, token=token, allow_patterns=allow_patterns, ignore_patterns=ignore_patterns)
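To check which files a pair of patterns would select before downloading, the filtering can be approximated locally. This is a minimal sketch that assumes the patterns are shell-style globs matched the way Python's fnmatch does; the SDK's exact matching rules are not confirmed here, and filter_files is a hypothetical helper, not part of pycsghub.

```python
from fnmatch import fnmatch


def filter_files(names, allow_patterns=None, ignore_patterns=None):
    """Keep names matching any allow pattern and no ignore pattern.

    Hypothetical helper for illustration; not part of pycsghub.
    """
    kept = []
    for name in names:
        # Skip names that match none of the allow patterns (if any were given).
        if allow_patterns is not None and not any(fnmatch(name, p) for p in allow_patterns):
            continue
        # Skip names that match at least one ignore pattern.
        if ignore_patterns is not None and any(fnmatch(name, p) for p in ignore_patterns):
            continue
        kept.append(name)
    return kept


files = ["config.json", "generation_config.json", "tokenizer.json", "model.safetensors"]
selected = filter_files(files, ["*.json"], ["*_config.json"])
print(selected)
```

Under this assumption, "config.json" is kept because it does not end in "_config.json", while "generation_config.json" is dropped.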

Download dataset

from pycsghub.snapshot_download import snapshot_download
token = "your_access_token"
endpoint = "https://hub.opencsg.com"
repo_id = 'AIWizards/tmmluplus'
repo_type = "dataset"
cache_dir = '/Users/xiangzhen/Downloads/'
result = snapshot_download(repo_id, repo_type=repo_type, cache_dir=cache_dir, endpoint=endpoint, token=token)

Download single file

Use the http_get function to download a single file from a URL

from pycsghub.file_download import http_get
token = "your_access_token"

url = "https://hub.opencsg.com/api/v1/models/OpenCSG/csg-wukong-1B/resolve/tokenizer.model"
local_dir = '/home/test/'
file_name = 'test.txt'
headers = None
cookies = None
http_get(url=url, token=token, local_dir=local_dir, file_name=file_name, headers=headers, cookies=cookies)
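The url passed to http_get can be assembled from its parts instead of being written by hand. The sketch below assumes the path layout seen in the single example URL above (endpoint, then /api/v1/models/, the repo id, /resolve/, and the file path); build_model_file_url is a hypothetical helper, not part of pycsghub, and the layout for other repository types is not shown in this document.

```python
from urllib.parse import quote


def build_model_file_url(endpoint, repo_id, file_path):
    """Build a model file URL following the pattern of the example URL.

    Hypothetical helper; the path layout is inferred from one example only.
    """
    base = endpoint.rstrip("/")
    # Keep "/" unescaped so "namespace/name" and nested file paths stay intact.
    repo_part = quote(repo_id, safe="/")
    file_part = quote(file_path, safe="/")
    return f"{base}/api/v1/models/{repo_part}/resolve/{file_part}"


url = build_model_file_url("https://hub.opencsg.com", "OpenCSG/csg-wukong-1B", "tokenizer.model")
print(url)
```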

Use the file_download function to download a single file from a repository

from pycsghub.file_download import file_download
token = "your_access_token"

endpoint = "https://hub.opencsg.com"
repo_id = 'OpenCSG/csg-wukong-1B'
cache_dir = '/home/test/'
result = file_download(repo_id, file_name='README.md', cache_dir=cache_dir, endpoint=endpoint, token=token)

Upload file

from pycsghub.file_upload import http_upload_file

token = "your_access_token"

endpoint = "https://hub.opencsg.com"
repo_type = "model"
repo_id = 'wanghh2000/myprivate1'
result = http_upload_file(repo_id, endpoint=endpoint, token=token, repo_type=repo_type, file_path='test1.txt')

Upload multiple files

from pycsghub.file_upload import http_upload_file

token = "your_access_token"

endpoint = "https://hub.opencsg.com"
repo_type = "model"
repo_id = 'wanghh2000/myprivate1'

repo_files = ["1.txt", "2.txt"]
for item in repo_files:
    http_upload_file(repo_id=repo_id, repo_type=repo_type, file_path=item, endpoint=endpoint, token=token)
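When uploading many files in a loop, it can help to record which uploads failed instead of stopping at the first problem. The sketch below shows one way to do that. It assumes the upload call raises an exception on failure, which this document does not confirm for http_upload_file; upload_all is a hypothetical helper, and fake_upload stands in for the real call so the example runs without network access.

```python
def upload_all(file_paths, upload_fn):
    """Call upload_fn(path) for each path; return (succeeded, failed).

    failed maps each path to the exception it raised, so one bad file
    does not stop the remaining uploads.
    """
    succeeded, failed = [], {}
    for path in file_paths:
        try:
            upload_fn(path)
            succeeded.append(path)
        except Exception as exc:  # broad on purpose: record it, then continue
            failed[path] = exc
    return succeeded, failed


# Stand-in for http_upload_file so the sketch runs offline.
def fake_upload(path):
    if path == "missing.txt":
        raise FileNotFoundError(path)


ok, errors = upload_all(["1.txt", "missing.txt", "2.txt"], fake_upload)
print(ok, list(errors))
```

With the SDK, upload_fn could be a small wrapper such as `lambda p: http_upload_file(repo_id=repo_id, repo_type=repo_type, file_path=p, endpoint=endpoint, token=token)`.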

Upload the local path to repo

Before starting, please make sure you have Git-LFS installed (see here for installation instructions).

from pycsghub.repository import Repository

token = "your access token"

r = Repository(
    repo_id="wanghh2003/ds15",
    upload_path="/Users/hhwang/temp/bbb/jsonl",
    user_name="wanghh2003",
    token=token,
    repo_type="dataset",
)

r.upload()

Upload the local path to the specified path in the repo

Before starting, please make sure you have Git-LFS installed (see here for installation instructions).

from pycsghub.repository import Repository

token = "your access token"

r = Repository(
    repo_id="wanghh2000/model01",
    upload_path="/Users/hhwang/temp/jsonl",
    path_in_repo="test/abc",
    user_name="wanghh2000",
    token=token,
    repo_type="model",
    branch_name="v1",
)

r.upload()

Model loading compatible with Hugging Face

The transformers library supports directly inputting the repo_id from Hugging Face to download and load related models, as shown below:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained('model/repoid')

In this code, the Hugging Face Transformers library first downloads the model to a local cache folder, then reads the configuration, and loads the model by dynamically selecting the relevant class for instantiation.

To ensure compatibility with Hugging Face, version 0.2 of the CSGHub SDK includes the most commonly used features: downloading and loading models. Models can be downloaded and loaded as follows:

# import os 
# os.environ['CSGHUB_TOKEN'] = 'your_access_token'
from pycsghub.repo_reader import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained('model/repoid')

This code:

  1. Uses snapshot_download from the CSGHub SDK to download the related files.

  2. Dynamically generates, in batches and via class-name reflection, classes with the same names as those that transformers loads automatically.

  3. Assigns the from_pretrained method to each generated class, so the model that is loaded is an hf-transformers model.
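The three steps can be illustrated with a small, self-contained sketch. It is not the SDK's actual implementation: fake_transformers and fake_snapshot_download are stand-ins invented here so the example runs without transformers or network access, and make_wrapper is a hypothetical name.

```python
import types

# Stand-in for the transformers module (assumption: the real SDK wraps
# the classes that transformers exposes under these names).
fake_transformers = types.SimpleNamespace()


class _BaseAuto:
    @classmethod
    def from_pretrained(cls, path):
        return f"{cls.__name__} loaded from {path}"


for _name in ("AutoModelForCausalLM", "AutoTokenizer"):
    setattr(fake_transformers, _name, type(_name, (_BaseAuto,), {}))


def fake_snapshot_download(repo_id):
    """Stand-in for snapshot_download: pretend to return a local cache path."""
    return f"/tmp/cache/{repo_id}"


def make_wrapper(name):
    """Create a class named `name` whose from_pretrained downloads, then delegates."""
    target = getattr(fake_transformers, name)  # look the class up by name

    def from_pretrained(cls, repo_id):
        local_path = fake_snapshot_download(repo_id)  # step 1: download
        return target.from_pretrained(local_path)     # step 3: delegate

    return type(name, (), {"from_pretrained": classmethod(from_pretrained)})


# Step 2: generate the same-named classes in a batch.
wrappers = {n: make_wrapper(n) for n in ("AutoModelForCausalLM", "AutoTokenizer")}
model = wrappers["AutoModelForCausalLM"].from_pretrained("model/repoid")
print(model)
```

Because each wrapper only changes where the files come from, the object returned by from_pretrained is whatever the wrapped class produces.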

Sandbox (async HTTP client)

The SDK includes pycsghub.sandbox_client for CSGHub sandbox lifecycle and runtime APIs (async). Default base_url matches the public Hub (https://hub.opencsg.com, see DEFAULT_CSGHUB_DOMAIN). Set CsgHubSandboxConfig if you use a self-hosted Hub or a separate AI Gateway (aigateway_url; empty string means runtime calls use the same host as base_url).

Authentication uses the same token resolution as the rest of the SDK: optional token= on CsgHubSandbox, else CSGHUB_TOKEN / token file via get_token_to_send. HTTP failures raise SandboxHttpError or SandboxTransportError; stream_execute_command yields ERROR: ... lines on failure (it does not raise).
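Because stream_execute_command reports failures as ERROR: lines rather than exceptions, callers need to inspect the stream themselves. The sketch below shows that pattern. It assumes the method behaves as an async generator of text lines; its arguments are not shown in this document, so fake_stream is a stand-in and collect is a hypothetical helper.

```python
import asyncio


async def fake_stream():
    """Stand-in for stream_execute_command: an async generator of text lines."""
    for line in ("hello", "ERROR: command failed", "done"):
        yield line


async def collect(stream):
    """Drain the stream; return (lines, had_error)."""
    lines, had_error = [], False
    async for line in stream:
        lines.append(line)
        # Failures arrive as ordinary lines, so check the prefix explicitly.
        if line.startswith("ERROR:"):
            had_error = True
    return lines, had_error


lines, had_error = asyncio.run(collect(fake_stream()))
print(lines, had_error)
```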

import asyncio
from pycsghub.sandbox_client import CsgHubSandbox, SandboxCreateRequest

async def main() -> None:
    client = CsgHubSandbox(token="your_access_token")
    spec = SandboxCreateRequest(
        image="your-runner-image:tag",
        resource_id=77,
        sandbox_name="my-sandbox",
    )
    resp = await client.create_sandbox(spec)
    print(resp.spec.sandbox_name, resp.state.status)

asyncio.run(main())

Sandbox (CLI)

After installing the package, use the csghub-cli sandbox command group. Subcommands include create, get, start, stop, delete (same semantics as stop), exec, upload, and health. Shared options: -e / --endpoint (Hub base_url, default https://hub.opencsg.com), --aigateway-url for runtime routes when the gateway differs from the Hub, and -k / --token (optional; otherwise uses CSGHUB_TOKEN / token file like the rest of the SDK).

Examples:

csghub-cli sandbox create -i your-runner-image:tag -n my-sandbox -k YOUR_TOKEN
csghub-cli sandbox get my-sandbox -k YOUR_TOKEN
csghub-cli sandbox exec my-sandbox "echo hello" -k YOUR_TOKEN
csghub-cli sandbox upload my-sandbox ./local-file.txt -k YOUR_TOKEN
csghub-cli sandbox health my-sandbox -k YOUR_TOKEN

For a full SandboxCreateRequest body, pass --spec path/to/spec.json instead of --image / --name (--spec takes precedence and ignores --image / --name). Lifecycle commands print JSON; exec streams lines to stdout (exit code 1 if any line starts with ERROR:); upload prints the JSON response message; health prints ok on success.
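A spec file for --spec can be generated from Python. The sketch below reuses the field names from the SandboxCreateRequest example in the async client section (image, resource_id, sandbox_name); treating those names as the JSON keys that --spec expects is an assumption, and the full schema is not shown in this document.

```python
import json
import tempfile
from pathlib import Path

# Field names copied from the SandboxCreateRequest example; using them as
# the JSON keys expected by --spec is an assumption.
spec = {
    "image": "your-runner-image:tag",
    "resource_id": 77,
    "sandbox_name": "my-sandbox",
}

# Write the spec to a temporary directory so the sketch leaves no stray files.
spec_path = Path(tempfile.mkdtemp()) / "spec.json"
spec_path.write_text(json.dumps(spec, indent=2), encoding="utf-8")
print(spec_path)
```

The resulting path would then be passed as `csghub-cli sandbox create --spec <path> -k YOUR_TOKEN`.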