Change8

b8532

📦 llama-cpp
2 features · 🔧 7 symbols

Summary

This release adds F32 kernel type support for `CONV_TRANSPOSE_2D` on both the CUDA and CPU backends. It also substantially refactors the 2D transposed convolution implementation to improve parameter handling and make it more flexible across data types.

Migration Steps

  1. Review any code that references `conv2d_transpose_params`: it was removed from the CUDA implementation and replaced by direct kernel launch dispatching.
  2. If you rely on a specific kernel type being instantiated in the 2D transposed convolution tests, note that the test structures now take their constructor arguments in a different order and iterate over the kernel types dynamically.

✨ New Features

  • Added support for the F32 kernel type in `CONV_TRANSPOSE_2D` on CUDA and CPU.
  • Reworked the CPU 2D transposed convolution implementation around a templated kernel type, so the same code path handles both F16 and F32 kernels.
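
To illustrate what a templated kernel type means in practice, here is a minimal, single-channel reference sketch in C++. It is not the llama.cpp/ggml code: the names `f16_bits`, `kernel_to_float`, and `conv_transpose_2d_ref` are hypothetical, and the data layout (row-major, one input channel, one output channel, no padding) is an assumption chosen to keep the example short. The only point is that the accumulation loop is written once and the kernel element type is a template parameter.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for an F16 value: raw IEEE 754 binary16 bits.
struct f16_bits { std::uint16_t bits; };

// Convert a kernel element to float. One overload per supported kernel type.
inline float kernel_to_float(float v) { return v; }

inline float kernel_to_float(f16_bits h) {
    const std::uint32_t sign = (h.bits >> 15) & 0x1u;
    const std::uint32_t exp  = (h.bits >> 10) & 0x1Fu;
    const std::uint32_t man  = h.bits & 0x3FFu;
    float value;
    if (exp == 0) {
        // Zero or subnormal: man * 2^-24
        value = std::ldexp(static_cast<float>(man), -24);
    } else if (exp == 31) {
        // Infinity or NaN
        value = man ? std::numeric_limits<float>::quiet_NaN()
                    : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + man) * 2^(exp - 25)
        value = std::ldexp(static_cast<float>(man | 0x400u), static_cast<int>(exp) - 25);
    }
    return sign ? -value : value;
}

// Naive 2D transposed convolution, templated on the kernel element type K.
// Assumed layout: row-major, single channel, no padding, same stride in x and y.
template <typename K>
std::vector<float> conv_transpose_2d_ref(const std::vector<float>& src, int in_w, int in_h,
                                         const std::vector<K>& kernel, int k_w, int k_h,
                                         int stride) {
    const int out_w = (in_w - 1) * stride + k_w;
    const int out_h = (in_h - 1) * stride + k_h;
    std::vector<float> dst(static_cast<std::size_t>(out_w) * out_h, 0.0f);

    for (int iy = 0; iy < in_h; ++iy) {
        for (int ix = 0; ix < in_w; ++ix) {
            const float v = src[static_cast<std::size_t>(iy) * in_w + ix];
            // Scatter this input value, weighted by the kernel, into the output.
            for (int ky = 0; ky < k_h; ++ky) {
                for (int kx = 0; kx < k_w; ++kx) {
                    const int oy = iy * stride + ky;
                    const int ox = ix * stride + kx;
                    dst[static_cast<std::size_t>(oy) * out_w + ox] +=
                        v * kernel_to_float(kernel[static_cast<std::size_t>(ky) * k_w + kx]);
                }
            }
        }
    }
    return dst;
}
```

Calling `conv_transpose_2d_ref<float>` and `conv_transpose_2d_ref<f16_bits>` with equivalent kernel values should give the same output whenever those values are exactly representable in F16. A test that iterates over kernel types can compare the two instantiations in this way.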

Affected Symbols