In real-world scenarios, data often follow a long-tailed distribution, and training deep neural networks on such imbalanced datasets remains a great challenge. The main problem caused by a long-tailed distribution is that the common classes dominate training, so the model achieves very low accuracy on the rare classes. Recent work focuses on improving the representation ability of the network to overcome the long-tailed problem, but it often ignores adapting the classifier to the long-tailed setting, which causes an “incompatibility” between the network representation and the classifier. In this paper, we use knowledge distillation to address the long-tailed distribution problem and optimize the network representation and the classifier jointly. We propose multi-expert knowledge distillation with class-balanced sampling to jointly learn a high-quality representation and classifier. A channel activation-based knowledge distillation method is also proposed to further improve performance. State-of-the-art results on several large-scale long-tailed classification benchmarks demonstrate the strong generalization of our method.
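The abstract only names the main ingredients; as a rough illustration of two of them, the sketch below shows class-balanced sampling (every class drawn with equal probability) and a standard soft-label distillation loss that a student could minimize against averaged multi-expert teacher logits. The function names, the temperature T, and the weight alpha are assumptions for this sketch; the paper's actual multi-expert and channel activation-based distillation losses are not reproduced here.

```python
# Minimal sketch: class-balanced sampling + a soft-label distillation loss (assumed form).
import torch
import torch.nn.functional as F
from torch.utils.data import WeightedRandomSampler

def class_balanced_sampler(labels):
    """Weight every example inversely to its class frequency so that
    each class is sampled with (roughly) equal probability."""
    labels = torch.as_tensor(labels)
    class_counts = torch.bincount(labels).float()
    weights = 1.0 / class_counts[labels]  # rare classes receive larger weights
    return WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)

def distillation_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    """Blend cross-entropy on the hard labels with a KL term matching the
    student's softened predictions to the teacher's (e.g., averaged experts)."""
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1.0 - alpha) * kd
```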