Abstract:
The best-performing supervised learning models are often ensembles of many base classifiers or a single very large and complex classifier, but such models are difficult to deploy on resource-constrained smartphones and Internet of Things (IoT) devices. Model compression, or distillation, addresses this problem by turning a large and complex model, or an ensemble of models, into a smaller and faster model better suited to such devices, usually without a significant loss in performance. However, existing offline distillation methods rely on a strong pre-trained teacher model to solve complex problems, which leads to a lengthy and complex multi-phase training procedure. Online counterparts address this limitation by training the student and teacher models simultaneously, with peer learning providing additional teaching knowledge. Although online distillation sometimes outperforms teacher-based offline distillation, this simultaneous teacher-student learning strategy can degenerate into a "the blind leading the blind" paradigm. To avoid these problems, we present a new single-stage training procedure named Mixture of Distillation (MoD), which introduces a distinct independent-dependent group learning scheme for both student and teacher models and exploits the complementary strengths of the offline and online distillation loss functions. The main objective of this hybrid approach is to improve accuracy and reduce training time. Extensive evaluations on the SVHN, MNIST, NumtaDB, CIFAR-10, and CIFAR-100 datasets substantiate that the proposed Mixture of Distillation improves generalization performance more significantly than existing distillation methods.
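To make the combination of loss terms concrete, the sketch below illustrates one plausible way to mix an offline term (matching a pre-trained teacher) with an online term (matching a simultaneously trained peer) alongside the supervised loss. It is a minimal PyTorch example under our own assumptions; the function name, temperature, and weighting coefficients are illustrative and do not reproduce the paper's actual MoD formulation.

```python
import torch
import torch.nn.functional as F

def mixed_distillation_loss(student_logits, teacher_logits, peer_logits,
                            targets, T=4.0, alpha=0.3, beta=0.3):
    """Hypothetical mixture of offline and online distillation losses.

    T, alpha, and beta are illustrative hyper-parameters, not the paper's.
    """
    # Standard supervised cross-entropy on ground-truth labels
    ce = F.cross_entropy(student_logits, targets)

    # Offline term: match the softened outputs of a pre-trained teacher
    offline = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    # Online term: match the softened outputs of a peer trained in parallel
    online = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(peer_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    return (1 - alpha - beta) * ce + alpha * offline + beta * online
```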