Common Compilation Failures: Large Input Size

Compiling a model with an input image size that is too large can cause compilation to fail. As the image size grows, every layer grows with it and consumes more resources. Eventually the compiler cannot allocate sufficient resources for a layer, and compilation fails. The larger the model scale (n, s, m, l, x), the more likely you are to hit this issue at large input sizes; we normally see it with l- and x-scale models. Models that contain attention layers hit it more often as well, because the cost of an attention layer scales quadratically with sequence length, and sequence length grows with image size.
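To make the scaling concrete, here is a rough back-of-the-envelope sketch. It assumes a square input and an attention layer that operates on a feature map downsampled by a stride of 32; both the stride and the helper names are illustrative assumptions, not values taken from the compiler or from any particular model.

```python
def token_count(image_size: int, stride: int = 32) -> int:
    """Number of spatial positions (tokens) in a square feature map.

    `stride` is an assumed downsampling factor, used only for illustration.
    """
    side = image_size // stride
    return side * side


def attention_entries(image_size: int, stride: int = 32) -> int:
    """Entries in one tokens-by-tokens attention matrix (single head)."""
    n = token_count(image_size, stride)
    return n * n


for size in (640, 1280):
    print(size, token_count(size), attention_entries(size))
# 640 400 160000
# 1280 1600 2560000
```

Under these assumptions, doubling the image side from 640 to 1280 quadruples the number of tokens and multiplies the attention matrix size by 16, which is why attention layers tend to be the first to exceed the available resources.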

Instead of using a large input size, you can try one of two approaches: tiling the image, or training with the P2-head variant of the model.
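As an illustration of the tiling approach, the sketch below splits a large image into fixed-size, overlapping tiles so that the model can be compiled at the smaller tile size. The tile size of 640, the overlap of 64, and the function names are assumptions chosen for the example; this is not a prescribed implementation, and you would still need to run inference per tile and merge the results.

```python
import numpy as np


def tile_starts(length: int, tile: int, overlap: int) -> list[int]:
    """Start offsets along one axis so that tiles of size `tile` cover `length`."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the far edge
    return starts


def tile_image(image: np.ndarray, tile: int = 640, overlap: int = 64):
    """Yield (x, y, crop) for each tile; (x, y) is the crop's top-left corner.

    If the image is smaller than `tile` along an axis, the crop is smaller too.
    """
    height, width = image.shape[:2]
    for y in tile_starts(height, tile, overlap):
        for x in tile_starts(width, tile, overlap):
            yield x, y, image[y:y + tile, x:x + tile]


# Example: a 1920x1080 frame split into 640x640 tiles.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
tiles = list(tile_image(frame))
print(len(tiles))  # 8
```

Each tile's (x, y) offset can be added back to any box coordinates predicted on that tile to map them into the original image. The overlap gives objects that straddle a tile boundary a chance to appear whole in at least one tile, at the cost of duplicate detections that need to be merged afterwards.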