IBM granite-4.0-h-small (32B) served from a highly compressed, semi-lossless T3 GGUF (8.76 GB)

Granite 4 is an open-source LLM supporting a 1M context window. This demo is limited to a 2K-token context and at most 1K output tokens.
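
A rough sketch of how the two demo limits interact, assuming "2K" and "1K" mean 2048 and 1024 tokens and that the context limit covers prompt and output together, as is typical for GGUF runtimes. The function names are hypothetical and are not part of the demo itself.

```python
# Assumed values: the demo states "2K context" and "max 1K output tokens";
# the exact numbers (2048 / 1024) are an assumption.
CONTEXT_TOKENS = 2048
MAX_OUTPUT_TOKENS = 1024


def prompt_token_budget(context_tokens: int = CONTEXT_TOKENS,
                        max_output_tokens: int = MAX_OUTPUT_TOKENS) -> int:
    """Tokens left for the prompt if the full output allowance is reserved."""
    if max_output_tokens >= context_tokens:
        raise ValueError("output allowance must be smaller than the context")
    return context_tokens - max_output_tokens


def clamp_output_tokens(prompt_tokens: int,
                        context_tokens: int = CONTEXT_TOKENS,
                        max_output_tokens: int = MAX_OUTPUT_TOKENS) -> int:
    """Output tokens that still fit after a prompt of the given length."""
    remaining = context_tokens - prompt_tokens
    return max(0, min(max_output_tokens, remaining))
```

Under these assumptions, a prompt longer than about 1K tokens leaves less than the full 1K output allowance, which is worth keeping in mind when pasting long inputs into the demo.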
