Polynomial multiplication on embedded vector architectures

High-degree, low-precision polynomial arithmetic is a fundamental computational primitive underlying structured lattice-based cryptography. Its algorithmic properties and suitability for implementation on different compute platforms are an active area of research, and this article contributes to thi...

Bibliographic Details
Main Authors:	Hanno Becker, Jose Maria Bermudo Mera, Angshuman Karmakar, Joseph Yiu, Ingrid Verbauwhede
Format:	article
Language:	EN
Published:	Ruhr-Universität Bochum 2021
Subjects:	Post-Quantum Cryptography; Polynomial multiplication; IoT; Cortex-M55; Cortex-M4; M-profile Vector Extension (MVE); Computer engineering. Computer hardware; TK7885-7895; Information technology; T58.5-58.64
Online Access:	https://doaj.org/article/14960de09daa41c9ac5a32eb740c9297

id	oai:doaj.org-article:14960de09daa41c9ac5a32eb740c9297
record_format	dspace
spelling	oai:doaj.org-article:14960de09daa41c9ac5a32eb740c9297; 2021-11-19T14:36:07Z; Polynomial multiplication on embedded vector architectures; DOI 10.46586/tches.v2022.i1.482-505; ISSN 2569-2925; https://doaj.org/article/14960de09daa41c9ac5a32eb740c9297; 2021-11-01T00:00:00Z; https://tches.iacr.org/index.php/TCHES/article/view/9305; https://doaj.org/toc/2569-2925; Hanno Becker, Jose Maria Bermudo Mera, Angshuman Karmakar, Joseph Yiu, Ingrid Verbauwhede; Ruhr-Universität Bochum; article; EN; Transactions on Cryptographic Hardware and Embedded Systems, Vol 2022, Iss 1 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Post-Quantum Cryptography; Polynomial multiplication; IoT; Cortex-M55; Cortex-M4; M-profile Vector Extension (MVE); Computer engineering. Computer hardware; TK7885-7895; Information technology; T58.5-58.64
description	High-degree, low-precision polynomial arithmetic is a fundamental computational primitive underlying structured lattice-based cryptography. Its algorithmic properties and suitability for implementation on different compute platforms are an active area of research, and this article contributes to this line of work. Firstly, we present memory-efficiency and performance improvements for the Toom-Cook/Karatsuba polynomial multiplication strategy. Secondly, we provide implementations of those improvements on the Arm® Cortex®-M4 CPU, as well as on the newer Cortex-M55 processor, the first M-profile core implementing the M-profile Vector Extension (MVE), also known as Arm® Helium™ technology. We also implement the Number Theoretic Transform (NTT) on the Cortex-M55 processor. We show that, despite the Cortex-M55 being single-issue and in-order and offering only 8 vector registers, compared to 32 on A-profile SIMD architectures like Arm® Neon™ technology and the Scalable Vector Extension (SVE), careful register management and instruction scheduling yield a 3× to 5× performance improvement over already highly optimized implementations on Cortex-M4, while maintaining the low area and energy profile necessary for use in the embedded market. Finally, as a real-world application, we integrate our multiplication techniques into the post-quantum key-encapsulation mechanism Saber.
format	article
author	Hanno Becker, Jose Maria Bermudo Mera, Angshuman Karmakar, Joseph Yiu, Ingrid Verbauwhede
author_facet	Hanno Becker, Jose Maria Bermudo Mera, Angshuman Karmakar, Joseph Yiu, Ingrid Verbauwhede
author_sort	Hanno Becker
title	Polynomial multiplication on embedded vector architectures
title_short	Polynomial multiplication on embedded vector architectures
title_full	Polynomial multiplication on embedded vector architectures
title_fullStr	Polynomial multiplication on embedded vector architectures
title_full_unstemmed	Polynomial multiplication on embedded vector architectures
title_sort	polynomial multiplication on embedded vector architectures
publisher	Ruhr-Universität Bochum
publishDate	2021
url	https://doaj.org/article/14960de09daa41c9ac5a32eb740c9297
