IBM Granite 4 served from highly compressed semi-lossless T3 GGUF

Granite 4 is an open-source LLM that supports a 1M-token context window. This demo is limited to a 2K-token context and at most 1K output tokens.
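To make the demo limits concrete, here is a minimal sketch of how the two caps interact. It assumes "2K" and "1K" mean 2048 and 1024 tokens, and that prompt and generated tokens share one context window, as is typical for decoder-only models. The function name `output_budget` is hypothetical and not part of the demo.

```python
# Assumed values: the demo states "2K context" and "max 1K output tokens".
CONTEXT_TOKENS = 2048
MAX_OUTPUT_TOKENS = 1024


def output_budget(prompt_tokens: int, requested: int = MAX_OUTPUT_TOKENS) -> int:
    """Return how many tokens can still be generated for a given prompt length.

    Assumes the prompt and the generated text share the same context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_TOKENS - prompt_tokens
    # The result is capped by the request, the output limit, and the space left.
    return max(0, min(requested, MAX_OUTPUT_TOKENS, remaining))
```

Under these assumptions, a prompt of up to 1024 tokens still leaves room for the full 1K output, while longer prompts shrink the space left for generation.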
