
SOFT: Softmax-free Transformer with Linear Complexity,
Jiachen Lu, Jinghan Yao, Junge Zhang, Xiatian Zhu, Hang Xu, Weiguo Gao, Chunjing Xu, Tao Xiang, Li Zhang
NeurIPS 2021 (Spotlight)

Softmax-free Linear Transformers,
Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang
IJCV 2024

  • [2024/02/12] Our journal extension Softmax-free Linear Transformers is accepted by IJCV.
  • [2022/07/05] SOFT is now available for downstream tasks! An efficient normalization is applied to SOFT. Please refer to SOFT-Norm.
    1. We propose a normalized softmax-free self-attention with stronger generalizability.
    2. SOFT is now available on more vision tasks (object detection and semantic segmentation).
  • timm==0.3.2

  • torch>=1.7.0 and a torchvision version that matches the PyTorch installation

  • cuda>=10.2

Compilation may fail on CUDA versions below 10.2. We have compiled it successfully with CUDA 10.2 and CUDA 11.2.
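Before building the CUDA kernel, it can help to confirm which CUDA toolkit the local PyTorch build targets; a minimal check, assuming a CUDA-enabled PyTorch install (nothing here is specific to SOFT):

```python
# Quick sanity check of the local PyTorch/CUDA setup (not part of SOFT itself).
import torch

print(torch.version.cuda)         # CUDA version PyTorch was built with, e.g. "10.2" or "11.2"
print(torch.cuda.is_available())  # True if a GPU and a matching driver are usable
```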

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder: the training and validation data are expected to be in the train/ and val/ folders respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
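For reference, the layout above is exactly what torchvision's `datasets.ImageFolder` consumes; a minimal loading sketch follows (the transform and paths are illustrative only, the actual pipeline is driven by the config files and `dist_train.sh` below):

```python
# Illustrative only: verifies that the folder layout above loads with ImageFolder.
import torchvision.transforms as T
from torchvision.datasets import ImageFolder

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
train_set = ImageFolder("/path/to/imagenet/train", transform=transform)
val_set = ImageFolder("/path/to/imagenet/val", transform=transform)
print(len(train_set.classes), len(train_set), len(val_set))  # e.g. 1000 classes
```
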
git clone https://github.com/fudan-zvg/SOFT.git
python -m pip install -e SOFT

| Model | Resolution | Params | FLOPs | Top-1 (%) | Config | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- |
| SOFT-Tiny | 224 | 13M | 1.9G | 79.3 | SOFT_Tiny.yaml, SOFT_Tiny_cuda.yaml | SOFT_Tiny, SOFT_Tiny_cuda |
| SOFT-Small | 224 | 24M | 3.3G | 82.2 | SOFT_Small.yaml, SOFT_Small_cuda.yaml | |
| SOFT-Medium | 224 | 45M | 7.2G | 82.9 | SOFT_Meidum.yaml, SOFT_Meidum_cuda.yaml | |
| SOFT-Large | 224 | 64M | 11.0G | 83.1 | SOFT_Large.yaml, SOFT_Large_cuda.yaml | |
| SOFT-Huge | 224 | 87M | 16.3G | 83.3 | SOFT_Huge.yaml, SOFT_Huge_cuda.yaml | |
| SOFT-Tiny-Norm | 224 | 13M | 1.9G | 79.4 | SOFT_Tiny_norm.yaml | SOFT_Tiny_norm |
| SOFT-Small-Norm | 224 | 24M | 3.3G | 82.4 | SOFT_Small_norm.yaml | SOFT_Small_norm |
| SOFT-Medium-Norm | 224 | 45M | 7.2G | 83.1 | SOFT_Meidum_norm.yaml | SOFT_Medium_norm |
| SOFT-Large-Norm | 224 | 64M | 11.0G | 83.3 | SOFT_Large_norm.yaml | SOFT_Large_norm |
| SOFT-Huge-Norm | 224 | 87M | 16.3G | 83.4 | SOFT_Huge_norm.yaml | |

| Backbone | Method | lr schd | box mAP | mask mAP | Params |
| --- | --- | --- | --- | --- | --- |
| SOFT-Tiny-Norm | RetinaNet | 1x | 40.0 | - | 23M |
| SOFT-Tiny-Norm | Mask R-CNN | 1x | 41.2 | 38.2 | 33M |
| SOFT-Small-Norm | RetinaNet | 1x | 42.8 | - | 34M |
| SOFT-Small-Norm | Mask R-CNN | 1x | 43.8 | 40.1 | 44M |
| SOFT-Medium-Norm | RetinaNet | 1x | 44.3 | - | 55M |
| SOFT-Medium-Norm | Mask R-CNN | 1x | 46.6 | 42.0 | 65M |
| SOFT-Large-Norm | RetinaNet | 1x | 45.3 | - | 74M |
| SOFT-Large-Norm | Mask R-CNN | 1x | 47.0 | 42.2 | 84M |

| Backbone | Method | Crop size | lr schd | mIoU | Params |
| --- | --- | --- | --- | --- | --- |
| SOFT-Small-Norm | UperNet | 512x512 | 1x | 46.2 | 54M |
| SOFT-Medium-Norm | UperNet | 512x512 | 1x | 48.0 | 76M |

We provide two implementations of the Gaussian kernel: a PyTorch version and the exact Gaussian function implemented in CUDA. Config files whose names contain "cuda" use the CUDA implementation. Both implementations yield the same performance. Please install SOFT (as above) before running the CUDA version.
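For intuition, here is a minimal sketch of the exact (quadratic-cost) Gaussian-kernel attention that SOFT uses in place of softmax attention; the function name and the `2 * sqrt(d)` scaling are illustrative, and this is not the repository's optimized linear-complexity implementation:

```python
import torch

def gaussian_kernel_attention(q, k, v):
    """Illustrative exact Gaussian-kernel attention (quadratic cost).

    The weight between tokens i and j is exp(-||q_i - k_j||^2 / (2 * sqrt(d))),
    with no softmax applied; the scaling constant here is illustrative.
    """
    d = q.shape[-1]
    dist2 = torch.cdist(q, k) ** 2               # (batch, n, n) squared distances
    attn = torch.exp(-dist2 / (2.0 * d ** 0.5))  # softmax-free attention weights
    return attn @ v                              # (batch, n, d)

# Example: queries and keys can share the same features, as in SOFT.
x = torch.randn(2, 196, 64)
out = gaussian_kernel_attention(x, x, x)
```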

./dist_train.sh ${GPU_NUM} --data ${DATA_PATH} --config ${CONFIG_FILE}
# For example, train SOFT-Tiny on the ImageNet training set with 8 GPUs
./dist_train.sh 8 --data ${DATA_PATH} --config config/SOFT_Tiny.yaml
./dist_train.sh ${GPU_NUM} --data ${DATA_PATH} --config ${CONFIG_FILE} --eval_checkpoint ${CHECKPOINT_FILE} --eval

# For example, test SOFT-Tiny on the ImageNet validation set with 8 GPUs

./dist_train.sh 8 --data ${DATA_PATH} --config config/SOFT_Tiny.yaml --eval_checkpoint ${CHECKPOINT_FILE} --eval
@inproceedings{SOFT,
    title={SOFT: Softmax-free Transformer with Linear Complexity}, 
    author={Lu, Jiachen and Yao, Jinghan and Zhang, Junge and Zhu, Xiatian and Xu, Hang and Gao, Weiguo and Xu, Chunjing and Xiang, Tao and Zhang, Li},
    booktitle={NeurIPS},
    year={2021}
}
@article{Softmax,
    title={Softmax-free Linear Transformers}, 
    author={Lu, Jiachen and Zhang, Li and Zhang, Junge and Zhu, Xiatian and Feng, Jianfeng and Xiang, Tao},
    journal={International Journal of Computer Vision},
    year={2024}
}

MIT

Thanks to previous open-sourced repos:
Detectron2
T2T-ViT
PVT
Nystromformer
pytorch-image-models
