Article type: Research article
Authors
1 Assistant Professor, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran
2 PhD student, Faculty of Informatics Engineering, University of Porto, Porto, Portugal
3 BSc graduate, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran
Abstract
Keywords
Article title [English]
Authors [English]
Automatic keyphrase generation plays an important role in many text analysis and natural language processing tasks. Many existing methods can only select keyphrases from the terms and phrases that are present in the target text. This limitation can be overcome with sequence-to-sequence methods. However, many such methods require huge training datasets, which poses a challenge for low-resource languages such as Persian. Transfer learning, in which a pre-trained model is adapted to a new task using a smaller dataset, is very useful in such circumstances. In this paper, we present a sequence-to-sequence method that uses a transformer model for Persian keyphrase generation. To this end, a corpus of 70K Persian scientific abstracts and their corresponding keyphrases has been gathered. A pre-trained mT5 model is fine-tuned on this corpus for the task of Persian keyphrase generation. The resulting model is compared with several other keyphrase generation methods. The results indicate that the proposed method outperforms existing methods by at least 2.71 percent.
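The sequence-to-sequence framing described in the abstract can be illustrated with a minimal sketch of the data preparation step. It is not the authors' code. It assumes that each record is turned into a (source, target) text pair in which the target is the list of keyphrases joined by a separator, and that generated text is split back on the same separator. The separator `SEP` and the helper names `build_example`, `parse_prediction`, and `split_present_absent` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# Hypothetical separator between keyphrases in the target string (an assumption).
SEP = " ; "


def build_example(abstract: str, keyphrases: List[str]) -> Tuple[str, str]:
    """Turn one (abstract, keyphrases) record into a (source, target) text pair."""
    source = " ".join(abstract.split())  # collapse runs of whitespace
    target = SEP.join(k.strip() for k in keyphrases if k.strip())
    return source, target


def parse_prediction(generated: str) -> List[str]:
    """Split a generated target string into a de-duplicated keyphrase list."""
    seen, result = set(), []
    for phrase in generated.split(SEP.strip()):
        phrase = phrase.strip()
        if phrase and phrase.lower() not in seen:
            seen.add(phrase.lower())
            result.append(phrase)
    return result


def split_present_absent(source: str, keyphrases: List[str]) -> Tuple[List[str], List[str]]:
    """Separate keyphrases that occur verbatim in the source from those that do not."""
    lowered = source.lower()
    present = [k for k in keyphrases if k.lower() in lowered]
    absent = [k for k in keyphrases if k.lower() not in lowered]
    return present, absent
```

A purely extractive method can return only the phrases in the `present` list, whereas a generative model trained on such (source, target) pairs can in principle also produce phrases from the `absent` list.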
Keywords [English]