AILC Abstract2

This paper presents work on automatic text simplification for Italian. It creates a new corpus for Italian text simplification by merging existing resources. It also fine-tunes a transformer model for sentence simplification, achieving state-of-the-art results for Italian. Additionally, it attempts to create an adaptive model that can simplify text according to specific target populations based on parameterized grammatical features. The baseline simplification model achieves a SARI score of 51.51 on test data from the new corpus, improving the state of the art. The adaptive model achieves the highest reported SARI score of 60.12 for a controllable Italian text simplification system.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/371156476

Controllable Sentence Simplification with a Unified Text-to-Text Transfer Transformer

Conference Paper · May 2023


All content following this page was uploaded by Martina Galletti on 30 May 2023.


Automatic Text Simplification for Italian poor readers and comprehenders: a new corpus, a model and an adaptive component

Francesca Padovani [1][2], Martina Galletti* [1][3][5] & Daniele Nardi [3][4][5]
[1] Sony Computer Science Laboratories-Paris (Sony CSL - Paris), France
[2] University of Trento, Italy
[3] Sapienza University of Rome, Italy
[4] CINI-AIIS, Italy
[5] Centro di Studi e Ricerche Enrico Fermi, Italy

*martina.galletti@sony.com

Automatic Text Simplification (ATS) is the process of modifying a text to reduce its overall linguistic complexity.
Automating this process requires a number of non-trivial operations: assessing the complexity of the source text,
identifying its fundamental words and parts, and modifying these elements appropriately in the subsequent
simplification stages, at the level of vocabulary, syntax or discourse. Several studies have proposed different
methodologies to tackle the task for English, but other languages, such as Italian, are less explored. This is due
not only to the limited amount of data available but also to the poor quality of the accessible data. For Italian
there are only two small manually curated datasets¹ and only one large corpus², PaCCSS-IT, created with a
data-driven approach. Moreover, most ATS systems produce the same output for every target group, whereas different
categories of people, such as those with cognitive and linguistic disabilities, may benefit from a text simplified
according to their vulnerabilities. The contribution of this work is threefold. First, we built a new enriched
corpus of parallel complex/simple sentences for Italian, robust in quality and large in size, by merging PaCCSS-IT
with the existing manually curated resources³ and a small dataset harvested from the Italian Wikipedia in a
semi-automatic way⁴, and by translating sentences from an English dataset. Second, we fine-tuned a
transformer-based encoder-decoder model inspired by the state of the art available for English⁵. Finally, we
attempted to parameterise grammatical text features to control simplifications, with the goal of making them
adaptive to a specific target population. In our evaluation, the baseline sentence simplification model achieved a
SARI score of 51.51 on the test set of the corpus we built. This result improves the state of the art for Italian
by 1.51 points. Our first attempt at the adaptive model reached a SARI score of 60.12, the highest obtained for a
controllable simplification system for Italian text.
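
To make the idea of parameterised control concrete, the following is a minimal sketch of how control tokens can be prepended to a source sentence before it is passed to a text-to-text model. It is an illustration under stated assumptions, not the system described above: the token names `C` (character-length ratio) and `W` (word-count ratio), the `simplify:` task prefix, and all function names are hypothetical, and the abstract does not specify which grammatical features were actually parameterised.

```python
from typing import Dict


def char_ratio(source: str, target: str) -> float:
    """Character-length ratio of the simple sentence to the complex one."""
    if not source:
        raise ValueError("source sentence must be non-empty")
    return round(len(target) / len(source), 2)


def word_ratio(source: str, target: str) -> float:
    """Word-count ratio of the simple sentence to the complex one."""
    source_words = source.split()
    if not source_words:
        raise ValueError("source sentence must contain at least one word")
    return round(len(target.split()) / len(source_words), 2)


def add_control_tokens(source: str, ratios: Dict[str, float],
                       task_prefix: str = "simplify:") -> str:
    """Prepend hypothetical control tokens such as 'C_0.45' to the source."""
    tokens = " ".join(f"{name}_{value:.2f}" for name, value in ratios.items())
    return f"{task_prefix} {tokens} {source}"


# Illustrative sentence pair (invented for this sketch, not taken from the corpus).
complex_sentence = "Il gatto dorme sul divano rosso"
simple_sentence = "Il gatto dorme"

ratios = {
    "C": char_ratio(complex_sentence, simple_sentence),
    "W": word_ratio(complex_sentence, simple_sentence),
}
model_input = add_control_tokens(complex_sentence, ratios)
print(model_input)
```

In a setup of this kind, the ratios would be computed from each complex/simple pair during training, while at inference time they could be set by hand to target a given reader profile.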

¹ Brunato, D., Dell’Orletta, F., Venturi, G., & Montemagni, S. (2015, June). Design and annotation of the first Italian corpus for text simplification. In Proceedings of The 9th Linguistic Annotation Workshop (pp. 31-41).
² Brunato, D., Cimino, A., Dell’Orletta, F., & Venturi, G. (2016, November). PaCCSS-IT: A parallel corpus of complex-simple sentences for automatic text simplification. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 351-361).
³ Brunato, D., Dell’Orletta, F., Venturi, G., & Montemagni, S. (2015, June). Design and annotation of the first Italian corpus for text simplification. In Proceedings of The 9th Linguistic Annotation Workshop (pp. 31-41).
⁴ Tonelli, S., Aprosio, A. P., & Saltori, F. (2016). SIMPITIKI: a Simplification corpus for Italian. In CLiC-it/EVALITA (pp. 4333-4338).
⁵ Sheang, K. C., & Saggion, H. (2021, August). Controllable Sentence Simplification with a Unified Text-to-Text Transfer Transformer. In Proceedings of the 14th International Conference on Natural Language Generation (pp. 341-352).
