GreenBitAI/Qwen3-VL-32B-Instruct-layer-mix-bpw-3.5-mlx

This low-bit quantized model was converted to MLX format from GreenBitAI/Qwen3-VL-32B-Instruct-layer-mix-bpw-3.5 using gbx-lm version 0.4.2. Refer to the original model card for more details on the model.

Use with mlx

pip install gbx-lm

from gbx_lm import load, generate

# Load the quantized model and its tokenizer from the Hugging Face Hub
model, tokenizer = load("GreenBitAI/Qwen3-VL-32B-Instruct-layer-mix-bpw-3.5-mlx")

prompt = "hello"

# Wrap the raw prompt in the model's chat template when one is available
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

# Generate a response; verbose=True streams tokens to stdout as they are produced
response = generate(model, tokenizer, prompt=prompt, verbose=True)
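
The generate call also returns the completed text, so you can capture it instead of relying on streaming output. A minimal sketch, assuming gbx_lm mirrors the mlx-lm generate API (the max_tokens parameter is an assumption based on that API, not confirmed by this card):

# Sketch: capture the output silently and bound the response length.
# NOTE: max_tokens is assumed to follow the mlx-lm generate signature.
response = generate(
    model, tokenizer, prompt=prompt, verbose=False, max_tokens=256
)
print(response)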