b8532

📅 Mar 26, 2026 · 📦 llama-cpp

✨ 2 features · 🔧 7 symbols

Summary

This release adds F32 kernel type support for `CONV_TRANSPOSE_2D` on both the CUDA and CPU backends. It also substantially refactors the 2D transposed-convolution implementation to improve parameter handling and to work across kernel data types.

Migration Steps

Review any code that uses `conv2d_transpose_params`: it was removed from the CUDA implementation and replaced by direct kernel launch dispatching.
If you rely on a specific kernel type being instantiated in the 2D transpose tests, note that the test structures now take their constructor arguments in a different order and iterate over kernel types dynamically.
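As a rough illustration of the second point, the sketch below builds test cases by looping over kernel types instead of hard-coding one. Everything in it (`kernel_type`, `transpose_case`, `make_transpose_cases`, the field order and the shapes) is hypothetical and is not the actual `test_conv_transpose_2d` definition; check the real constructor before adapting call sites.

```cpp
#include <vector>

// Hypothetical stand-in for the kernel data type selector.
enum class kernel_type { F16, F32 };

// Hypothetical test-case description. The kernel type is an ordinary
// field, so call sites that pass positional arguments depend on its position.
struct transpose_case {
    kernel_type ktype;
    int in_w, in_h;   // input width and height
    int k_w, k_h;     // kernel width and height
    int stride;
};

// Build one case per (kernel type, stride) pair rather than fixing
// the kernel type at a single value.
std::vector<transpose_case> make_transpose_cases() {
    std::vector<transpose_case> cases;
    for (kernel_type kt : {kernel_type::F16, kernel_type::F32}) {
        for (int stride : {1, 2}) {
            cases.push_back({kt, 3, 2, 2, 2, stride});
        }
    }
    return cases;
}
```

Code that previously assumed a single kernel type per test would, under this layout, see every shape once per kernel type.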

✨ New Features

Added support for F32 kernel type for `CONV_TRANSPOSE_2D` on CUDA and CPU.
Templated the CPU `CONV_TRANSPOSE_2D` implementation on the kernel type, so the same implementation handles both F16 and F32 kernels.
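To make the idea of templating on the kernel type concrete, here is a minimal single-channel reference sketch with zero padding. It is not the ggml code: `half_bits`, `half_to_float`, `to_f32`, and `conv_transpose_2d_ref` are illustrative names invented here, and the real implementation also handles channels, batching, and threading.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for F16 storage: raw IEEE 754 binary16 bits.
struct half_bits { std::uint16_t bits; };

// Decode binary16 bits to float (zero, subnormal, normal, inf, NaN).
inline float half_to_float(std::uint16_t h) {
    const bool neg = ((h >> 15) & 1) != 0;
    const int  exp = (h >> 10) & 0x1F;
    const int  man = h & 0x3FF;
    float v;
    if (exp == 0) {
        v = std::ldexp(static_cast<float>(man), -24);            // zero or subnormal
    } else if (exp == 31) {
        v = man ? std::numeric_limits<float>::quiet_NaN()
                : std::numeric_limits<float>::infinity();
    } else {
        v = std::ldexp(static_cast<float>(man | 0x400), exp - 25); // normal
    }
    return neg ? -v : v;
}

// One conversion overload per supported kernel type.
inline float to_f32(float v)     { return v; }
inline float to_f32(half_bits v) { return half_to_float(v.bits); }

// Transposed 2D convolution, one channel, padding 0.
// Output size per dimension: (in - 1) * stride + k.
template <typename kernel_t>
std::vector<float> conv_transpose_2d_ref(const std::vector<float>& input, int in_w, int in_h,
                                         const std::vector<kernel_t>& kernel, int k_w, int k_h,
                                         int stride) {
    assert(input.size()  == static_cast<std::size_t>(in_w) * in_h);
    assert(kernel.size() == static_cast<std::size_t>(k_w) * k_h);
    const int out_w = (in_w - 1) * stride + k_w;
    const int out_h = (in_h - 1) * stride + k_h;
    std::vector<float> out(static_cast<std::size_t>(out_w) * out_h, 0.0f);
    // Each input element scatters a scaled copy of the kernel into the output.
    for (int iy = 0; iy < in_h; ++iy)
        for (int ix = 0; ix < in_w; ++ix)
            for (int ky = 0; ky < k_h; ++ky)
                for (int kx = 0; kx < k_w; ++kx)
                    out[(iy * stride + ky) * out_w + (ix * stride + kx)] +=
                        input[iy * in_w + ix] * to_f32(kernel[ky * k_w + kx]);
    return out;
}
```

Calling `conv_transpose_2d_ref` with a `std::vector<float>` kernel or a `std::vector<half_bits>` kernel instantiates the same loop nest; only the `to_f32` overload differs, which is the flexibility the templated kernel type is meant to provide.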

Affected Symbols

`conv2d_transpose_params`
`conv2d_transpose_kernel`
`ggml_cuda_conv_2d_transpose_p0`
`test_conv_transpose_2d`
`ggml_compute_forward_conv_transpose_2d`
`ggml-cuda/conv2d-transpose.cu`
`ggml-cpu/ggml-cpu.c`