We propose a new approach to text simplification: fine-tuning powerful pretrained language models with reinforcement learning. Without using complex-simple sentence pairs, we update a pretrained language model fine-tuned for paraphrasing so that it simplifies complex sentences. To this end, we propose an unsupervised simplification reward function and update the pretrained language model to optimize this reward. By exploring the space of possible simplifications and receiving feedback from the reward function, the paraphrasing model learns to simplify sentences rather than merely paraphrase them. We have also experimented with a supervised reward for fine-tuning the paraphrasing model; even with the supervised reward, we need far less aligned simplification data to obtain a high-quality text simplification model than other supervised simplification approaches. Report: https://drive.google.com/file/d/15iBgWhBedi2ERmazz3CRSVIBAdim6YpW/view?usp=sharing
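To illustrate how an unsupervised simplification reward might be constructed, here is a minimal Python sketch. It scores readability with the Flesch-Kincaid grade-level formula and rewards outputs that read simpler than their source. The syllable heuristic and the function names (`simplicity_reward`, `fkgl`) are illustrative assumptions for this sketch, not the exact reward used in this repository.

```python
import re

def count_syllables(word):
    # Crude heuristic: one syllable per contiguous vowel group (assumption,
    # not the repository's actual syllable counter).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fkgl(text):
    # Flesch-Kincaid grade level: lower values mean easier-to-read text.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (n_words / sentences) + 11.8 * (n_syllables / n_words) - 15.59

def simplicity_reward(source, output):
    # Hypothetical unsupervised reward: positive when the model's output
    # has a lower (easier) grade level than the complex source sentence.
    # A reward like this could be plugged into an RL loop (e.g. PPO) that
    # updates the paraphrasing model.
    return fkgl(source) - fkgl(output)
```

In an RL fine-tuning loop, the paraphrasing model would generate a candidate simplification for each complex sentence, and this reward would supply the scalar feedback used to update the policy.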
imohammad12/trl