Towards Automatic Subtitling: Assessing the Quality of Old and New Resources
Growing needs in localising multimedia content for global audiences have resulted in Neural Machine Translation (NMT) gradually becoming an established practice in the field of subtitling in order to reduce costs and turn-around times. Contrary to text translation, subtitling is subject to spatial a...
Main Authors: | Alina Karakanta, Matteo Negri, Marco Turchi |
---|---|
Format: | article |
Language: | EN |
Published: | Accademia University Press, 2020 |
Subjects: | Social Sciences; Computational linguistics. Natural language processing |
Online Access: | https://doaj.org/article/133e90f0aee140eeb5a36011b52f3917 |
id |
oai:doaj.org-article:133e90f0aee140eeb5a36011b52f3917 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:133e90f0aee140eeb5a36011b52f3917 2021-12-02T09:52:20Z Towards Automatic Subtitling: Assessing the Quality of Old and New Resources 2499-4553 10.4000/ijcol.649 https://doaj.org/article/133e90f0aee140eeb5a36011b52f3917 2020-06-01T00:00:00Z http://journals.openedition.org/ijcol/649 https://doaj.org/toc/2499-4553 Growing needs in localising multimedia content for global audiences have resulted in Neural Machine Translation (NMT) gradually becoming an established practice in the field of subtitling in order to reduce costs and turn-around times. Contrary to text translation, subtitling is subject to spatial and temporal constraints, which greatly increase the post-processing effort required to restore the NMT output to a proper subtitle format. In our previous work (Karakanta, Negri, and Turchi 2019), we identified several missing elements in the corpora available for training NMT systems specifically tailored for subtitling. In this work, we compare the previously studied corpora with MuST-Cinema, a corpus enabling end-to-end speech to subtitles translation, in terms of the conformity to the constraints of: 1) length and reading speed; and 2) proper line breaks. We show that MuST-Cinema conforms to these constraints and discuss the recent progress the corpus has facilitated in end-to-end speech to subtitles translation. Alina Karakanta Matteo Negri Marco Turchi Accademia University Press article Social Sciences H Computational linguistics. Natural language processing P98-98.5 EN IJCoL, Vol 6, Iss 1, Pp 63-76 (2020) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Social Sciences H Computational linguistics. Natural language processing P98-98.5 |
spellingShingle |
Social Sciences H Computational linguistics. Natural language processing P98-98.5 Alina Karakanta Matteo Negri Marco Turchi Towards Automatic Subtitling: Assessing the Quality of Old and New Resources |
description |
Growing needs in localising multimedia content for global audiences have resulted in Neural Machine Translation (NMT) gradually becoming an established practice in the field of subtitling in order to reduce costs and turn-around times. Contrary to text translation, subtitling is subject to spatial and temporal constraints, which greatly increase the post-processing effort required to restore the NMT output to a proper subtitle format. In our previous work (Karakanta, Negri, and Turchi 2019), we identified several missing elements in the corpora available for training NMT systems specifically tailored for subtitling. In this work, we compare the previously studied corpora with MuST-Cinema, a corpus enabling end-to-end speech to subtitles translation, in terms of the conformity to the constraints of: 1) length and reading speed; and 2) proper line breaks. We show that MuST-Cinema conforms to these constraints and discuss the recent progress the corpus has facilitated in end-to-end speech to subtitles translation. |
format |
article |
author |
Alina Karakanta Matteo Negri Marco Turchi |
author_facet |
Alina Karakanta Matteo Negri Marco Turchi |
author_sort |
Alina Karakanta |
title |
Towards Automatic Subtitling: Assessing the Quality of Old and New Resources |
title_short |
Towards Automatic Subtitling: Assessing the Quality of Old and New Resources |
title_full |
Towards Automatic Subtitling: Assessing the Quality of Old and New Resources |
title_fullStr |
Towards Automatic Subtitling: Assessing the Quality of Old and New Resources |
title_full_unstemmed |
Towards Automatic Subtitling: Assessing the Quality of Old and New Resources |
title_sort |
towards automatic subtitling: assessing the quality of old and new resources |
publisher |
Accademia University Press |
publishDate |
2020 |
url |
https://doaj.org/article/133e90f0aee140eeb5a36011b52f3917 |
work_keys_str_mv |
AT alinakarakanta towardsautomaticsubtitlingassessingthequalityofoldandnewresources AT matteonegri towardsautomaticsubtitlingassessingthequalityofoldandnewresources AT marcoturchi towardsautomaticsubtitlingassessingthequalityofoldandnewresources |
_version_ |
1718397929100673024 |