Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures.
Non-volatile memory technology is now available in commodity hardware. This technology can be used as a backup memory for an external DRAM cache without needing to modify the software. However, the higher read and write latencies of non-volatile memory may exacerbate the memory wall problem....
Main Authors: | Adrián Lamela, Óscar G Ossorio, Guillermo Vinuesa, Benjamín Sahelices |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2021 |
Subjects: | Medicine (R); Science (Q) |
Online Access: | https://doaj.org/article/b88b59bd771a421591ff0e913e85798d |
id |
oai:doaj.org-article:b88b59bd771a421591ff0e913e85798d |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:b88b59bd771a421591ff0e913e85798d; 2021-12-02T20:08:17Z; Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures.; 1932-6203; 10.1371/journal.pone.0257047; https://doaj.org/article/b88b59bd771a421591ff0e913e85798d; 2021-01-01T00:00:00Z; https://doi.org/10.1371/journal.pone.0257047; https://doaj.org/toc/1932-6203; Non-volatile memory technology is now available in commodity hardware. This technology can be used as a backup memory for an external DRAM cache without needing to modify the software. However, the higher read and write latencies of non-volatile memory may exacerbate the memory wall problem. In this work we present a novel off-chip prefetch technique based on a Hidden Markov Model that specifically deals with the latency problem caused by the complexity of off-chip memory access patterns. First, we present a thorough analysis of off-chip memory access patterns to identify their complexity in multicore processors. Based on this study, we propose a prefetching module, located in the LLC, that uses two small tables and whose computational complexity is linear in the number of computing threads. Our Markov-based technique can track and cluster several simultaneous groups of memory accesses coming from multiple simultaneous threads in a multicore processor. It can quickly identify complex address groups and trigger prefetches with very high accuracy. Our simulations show an improvement of up to 76% in the hit ratio of an off-chip DRAM cache for a multicore architecture over the conventional prefetch technique (G/DC). Also, the overhead of prefetch requests (failed prefetches) is reduced by 48% in single-core simulations and by 83% in multicore simulations.; Adrián Lamela; Óscar G Ossorio; Guillermo Vinuesa; Benjamín Sahelices; Public Library of Science (PLoS); article; Medicine (R); Science (Q); EN; PLoS ONE, Vol 16, Iss 9, p e0257047 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine (R); Science (Q) |
spellingShingle |
Medicine (R); Science (Q); Adrián Lamela; Óscar G Ossorio; Guillermo Vinuesa; Benjamín Sahelices; Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures. |
description |
Non-volatile memory technology is now available in commodity hardware. This technology can be used as a backup memory for an external DRAM cache without needing to modify the software. However, the higher read and write latencies of non-volatile memory may exacerbate the memory wall problem. In this work we present a novel off-chip prefetch technique based on a Hidden Markov Model that specifically deals with the latency problem caused by the complexity of off-chip memory access patterns. First, we present a thorough analysis of off-chip memory access patterns to identify their complexity in multicore processors. Based on this study, we propose a prefetching module, located in the LLC, that uses two small tables and whose computational complexity is linear in the number of computing threads. Our Markov-based technique can track and cluster several simultaneous groups of memory accesses coming from multiple simultaneous threads in a multicore processor. It can quickly identify complex address groups and trigger prefetches with very high accuracy. Our simulations show an improvement of up to 76% in the hit ratio of an off-chip DRAM cache for a multicore architecture over the conventional prefetch technique (G/DC). Also, the overhead of prefetch requests (failed prefetches) is reduced by 48% in single-core simulations and by 83% in multicore simulations. |
format |
article |
author |
Adrián Lamela; Óscar G Ossorio; Guillermo Vinuesa; Benjamín Sahelices |
author_facet |
Adrián Lamela; Óscar G Ossorio; Guillermo Vinuesa; Benjamín Sahelices |
author_sort |
Adrián Lamela |
title |
Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures. |
title_short |
Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures. |
title_full |
Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures. |
title_fullStr |
Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures. |
title_full_unstemmed |
Off-chip prefetching based on Hidden Markov Model for non-volatile memory architectures. |
title_sort |
off-chip prefetching based on hidden markov model for non-volatile memory architectures. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/b88b59bd771a421591ff0e913e85798d |
work_keys_str_mv |
AT adrianlamela offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures AT oscargossorio offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures AT guillermovinuesa offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures AT benjaminsahelices offchipprefetchingbasedonhiddenmarkovmodelfornonvolatilememoryarchitectures |
_version_ |
1718375203473457152 |
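
The description in this record mentions a prefetching module that uses two small tables and tracks groups of memory accesses per thread. The record gives no implementation details, so the following is only a rough illustration of the general idea of table-based Markov prediction, not the authors' design. It uses a plain first-order Markov chain over address deltas rather than a Hidden Markov Model, and every name in it (`MarkovDeltaPrefetcher`, `history`, `transitions`, `min_confidence`) is a hypothetical choice made for this sketch.

```python
from collections import Counter, defaultdict


class MarkovDeltaPrefetcher:
    """Toy first-order Markov predictor over address deltas.

    Hypothetical sketch: a visible Markov chain, not the Hidden Markov
    Model described in the article.
    """

    def __init__(self, min_confidence=0.6):
        # Assumed threshold: only prefetch when the best transition is likely enough.
        self.min_confidence = min_confidence
        # Table 1 (assumed): thread id -> (last address, last delta).
        self.history = {}
        # Table 2 (assumed): (thread id, last delta) -> counts of the delta that followed.
        self.transitions = defaultdict(Counter)

    def access(self, thread_id, address):
        """Record one access and return an address to prefetch, or None."""
        last = self.history.get(thread_id)
        if last is None:
            # First access from this thread: nothing to learn or predict yet.
            self.history[thread_id] = (address, None)
            return None

        last_address, last_delta = last
        delta = address - last_address
        if last_delta is not None:
            # Learn the transition last_delta -> delta for this thread.
            self.transitions[(thread_id, last_delta)][delta] += 1
        self.history[thread_id] = (address, delta)

        # Predict the delta most often seen after the current delta.
        candidates = self.transitions.get((thread_id, delta))
        if not candidates:
            return None
        next_delta, count = candidates.most_common(1)[0]
        if count / sum(candidates.values()) < self.min_confidence:
            return None
        return address + next_delta


if __name__ == "__main__":
    prefetcher = MarkovDeltaPrefetcher()
    # Two interleaved threads with different strides (made-up addresses).
    trace = [(0, 0), (1, 1000), (0, 64), (1, 992), (0, 128), (1, 984)]
    for thread_id, address in trace:
        print(thread_id, address, prefetcher.access(thread_id, address))
```

Because both tables are keyed by thread, interleaved streams from different threads do not disturb each other's predictions, and the stored state grows with the number of threads. That loosely echoes the per-thread tracking the description mentions, but it should not be read as a model of the published technique.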