Improve Inference Efficiency with Batch Inference

As an algorithm engineer, you will inevitably face the problem of putting models into production in your daily work. For less demanding scenarios, you can handle this with a web framework: for each user request, run model inference and return the result. However, this straightforward implementation often fails to make full use of the GPU and falls short in scenarios with high performance requirements....
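The excerpt contrasts per-request inference with batched inference. Below is a minimal, hypothetical sketch of dynamic batching under assumed settings (the toy model, `MAX_BATCH`, and `WAIT_SECONDS` are placeholders, not taken from the post): requests are buffered briefly and pushed through the model as one batch, so the GPU does one large forward pass instead of many tiny ones.

```python
import queue
import threading
import torch

# Hypothetical toy model standing in for the real serving model.
model = torch.nn.Linear(16, 2).eval()

request_queue: queue.Queue = queue.Queue()

MAX_BATCH = 8        # assumed batch-size cap
WAIT_SECONDS = 0.01  # assumed time budget for filling a batch


def inference_worker() -> None:
    while True:
        # Block until at least one request arrives.
        first_input, first_out = request_queue.get()
        inputs, outs = [first_input], [first_out]
        # Opportunistically grab more requests for the same batch.
        try:
            while len(inputs) < MAX_BATCH:
                x, out = request_queue.get(timeout=WAIT_SECONDS)
                inputs.append(x)
                outs.append(out)
        except queue.Empty:
            pass
        with torch.no_grad():
            # One forward pass for the whole batch.
            preds = model(torch.stack(inputs))
        for pred, out in zip(preds, outs):
            out.put(pred)


threading.Thread(target=inference_worker, daemon=True).start()


def handle_request(x: torch.Tensor) -> torch.Tensor:
    """What a web-framework handler would call for a single user request."""
    out: queue.Queue = queue.Queue(maxsize=1)
    request_queue.put((x, out))
    return out.get()  # wait for the batched result


if __name__ == "__main__":
    print(handle_request(torch.randn(16)))
```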

June 20, 2021 · 7 min · Yuanhao

Tricks of Semantic Segmentation

[Image: HuBMAP - Hacking the Kidney]

Last month, I spent some time on the "HuBMAP - Hacking the Kidney" competition on Kaggle. The goal of this competition is to implement a successful and robust glomeruli FTU detector. It is a classical binary semantic segmentation problem. This is my second semantic segmentation competition, and our team finished in 43rd place and won a silver medal....

May 30, 2021 · 4 min · Yuanhao

Training large models with your GPU

In the last post, I shared my story of the Kaggle Jigsaw Multilingual Toxic Comment Classification competition. At that time, I only had a 1080 Ti with 11 GB of VRAM, which made it impossible to train the SOTA XLM-RoBERTa large model, as it requires more VRAM than I had. In this post, I want to share some tips on reducing VRAM usage so that you can train larger deep neural networks with your GPU....
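One common way to cut VRAM usage is gradient accumulation; the post's actual tips may differ or go further (mixed precision, gradient checkpointing, etc.). The sketch below uses a hypothetical toy model and an assumed `ACCUM_STEPS` of 4: several small micro-batches are accumulated before each optimizer step, so the effective batch size stays large while per-step memory stays small.

```python
import torch
from torch import nn

# Hypothetical toy model and data standing in for a large transformer.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

ACCUM_STEPS = 4  # assumed: effective batch = micro-batch size * 4


def fake_batches(n):
    """Stand-in for a real DataLoader yielding small micro-batches."""
    for _ in range(n):
        yield torch.randn(8, 128), torch.randint(0, 2, (8,))


optimizer.zero_grad()
for step, (x, y) in enumerate(fake_batches(16), start=1):
    # Scale the loss so accumulated gradients average correctly.
    loss = criterion(model(x), y) / ACCUM_STEPS
    loss.backward()  # gradients accumulate across micro-batches
    if step % ACCUM_STEPS == 0:
        optimizer.step()
        optimizer.zero_grad()
```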

April 15, 2021 · 4 min · Yuanhao