Improve Inference Efficiency with Batch Inference

As an algorithm engineer, you will inevitably face the problem of putting models into production in your daily work. For less demanding scenarios, you can handle this with a web framework: for each user request, run model inference and return the result. However, this straightforward implementation often fails to make full use of the GPU and falls short in scenarios with high performance requirements....
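The excerpt contrasts per-request inference with batched inference. Below is a minimal, hypothetical sketch of dynamic batching under assumed settings (the toy model, `MAX_BATCH`, and `WAIT_SECONDS` are placeholders, not taken from the post): requests are buffered briefly and pushed through the model as one batch, so the GPU does one large forward pass instead of many tiny ones.

```python
import queue
import threading
import torch

# Hypothetical toy model standing in for the real serving model.
model = torch.nn.Linear(16, 2).eval()

request_queue: queue.Queue = queue.Queue()

MAX_BATCH = 8        # assumed batch-size cap
WAIT_SECONDS = 0.01  # assumed time budget for filling a batch


def inference_worker() -> None:
    while True:
        # Block until at least one request arrives.
        first_input, first_out = request_queue.get()
        inputs, outs = [first_input], [first_out]
        # Opportunistically grab more requests for the same batch.
        try:
            while len(inputs) < MAX_BATCH:
                x, out = request_queue.get(timeout=WAIT_SECONDS)
                inputs.append(x)
                outs.append(out)
        except queue.Empty:
            pass
        with torch.no_grad():
            # One forward pass for the whole batch.
            preds = model(torch.stack(inputs))
        for pred, out in zip(preds, outs):
            out.put(pred)


threading.Thread(target=inference_worker, daemon=True).start()


def handle_request(x: torch.Tensor) -> torch.Tensor:
    """What a web-framework handler would call for a single user request."""
    out: queue.Queue = queue.Queue(maxsize=1)
    request_queue.put((x, out))
    return out.get()  # wait for the batched result


if __name__ == "__main__":
    print(handle_request(torch.randn(16)))
```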

June 20, 2021 · 7 min · Yuanhao

Tricks of Semantic Segmentation

[Image: HuBMAP - Hacking the Kidney]

Last month, I spent some time on the "HuBMAP - Hacking the Kidney" competition on Kaggle. The goal of this competition is to implement a successful and robust glomeruli FTU detector. It is a classical binary semantic segmentation problem. This is my second semantic segmentation competition, and our team finished in 43rd place and won a silver medal....

May 30, 2021 · 4 min · Yuanhao

Training large models with your GPU

In the last post, I shared my story of the Kaggle Jigsaw Multilingual Toxic Comment Classification competition. At that time, I only had a 1080 Ti with 11 GB of VRAM, which made it impossible to train the SOTA XLM-RoBERTa large model, as it requires more VRAM than I had. In this post, I want to share some tips on reducing VRAM usage so that you can train larger deep neural networks with your GPU....
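One common way to cut VRAM usage is gradient accumulation; the post's actual tips may differ or go further (mixed precision, gradient checkpointing, etc.). The sketch below uses a hypothetical toy model and an assumed `ACCUM_STEPS` of 4: several small micro-batches are accumulated before each optimizer step, so the effective batch size stays large while per-step memory stays small.

```python
import torch
from torch import nn

# Hypothetical toy model and data standing in for a large transformer.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

ACCUM_STEPS = 4  # assumed: effective batch = micro-batch size * 4


def fake_batches(n):
    """Stand-in for a real DataLoader yielding small micro-batches."""
    for _ in range(n):
        yield torch.randn(8, 128), torch.randint(0, 2, (8,))


optimizer.zero_grad()
for step, (x, y) in enumerate(fake_batches(16), start=1):
    # Scale the loss so accumulated gradients average correctly.
    loss = criterion(model(x), y) / ACCUM_STEPS
    loss.backward()  # gradients accumulate across micro-batches
    if step % ACCUM_STEPS == 0:
        optimizer.step()
        optimizer.zero_grad()
```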

April 15, 2021 · 4 min · Yuanhao