How do I save bnb-4bit model with merged_16bit? #326

@azulika

Description

This is essentially the same issue as unslothai/unsloth#2749 .

Pull request #254 claims to have fixed this, but the implementation does not seem to solve the core issue. While the pull request adds support for 16-bit base models and mxfp4 native models, it does not address the problem that the original poster in #2749 and others were facing: saving a bnb-4bit model like Qwen3-8B-bnb-4int with merged_16bit.

Currently, it is impossible to use save_pretrained_merged with merged_16bit for bnb-4bit models like unsloth/gemma-3-27b-it-bnb-4bit and unsloth/Mistral-Small-3.2-24B-Instruct-2506-unsloth-bnb-4bit due to the following code in saving_utils.py:

        if base_model_is_quantized and (quant_type == "nf4" or quant_type == "fp4") and save_method == "merged_16bit":
            warnings.warn("Base model should be a 16bits or mxfp4 base model for a 16bit model merge. Use `save_method=forced_merged_4bit` instead")
            return None

Since bnb-4bit models use nf4 quantization, this condition always evaluates to true, so the function returns None and the merge never happens.
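For illustration, the guard can be reduced to the following minimal sketch (the helper name `blocks_16bit_merge` is hypothetical; the condition itself mirrors the snippet from saving_utils.py quoted above):

```python
import warnings

def blocks_16bit_merge(base_model_is_quantized: bool,
                       quant_type: str,
                       save_method: str) -> bool:
    """Hypothetical reduction of the guard in saving_utils.py:
    returns True when the 16-bit merge would be aborted."""
    if (base_model_is_quantized
            and quant_type in ("nf4", "fp4")
            and save_method == "merged_16bit"):
        warnings.warn(
            "Base model should be a 16bits or mxfp4 base model for a "
            "16bit model merge. Use `save_method=forced_merged_4bit` instead"
        )
        return True
    return False

# Any bnb-4bit checkpoint reports quant_type == "nf4",
# so the guard fires for every such model:
print(blocks_16bit_merge(True, "nf4", "merged_16bit"))    # True: merge aborted
print(blocks_16bit_merge(True, "mxfp4", "merged_16bit"))  # False: mxfp4 passes
```

This makes the failure mode clear: for nf4/fp4 checkpoints there is no code path that reaches the actual merge when `merged_16bit` is requested.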

Does this mean there is currently no way to save bnb-4bit models as 16-bit merged models, or is this feature simply not implemented yet?
