IBM Granite 4 served from local GGUF server

Granite 4 is an open-source LLM supporting a 1M context window. This demo uses only 2K context and max 1K output tokens. View Documentation

0 1
0 2
0 1
0 100
1 2000