| Yunfeng Huang, Fang-Jing Wu, Christian Hakert, Georg Brüggen, Kuan-Hsun Chen, Jian-Jia Chen, Patrick Böcker, Petr Chernikov, Luis Cruz, Zeyi Duan, Ahmed Gheith, Anand Gopalan, Yantao Gong, Karthik Prakash, Ammar Tauqir and Yue Wang. Demo Abstract: Perception vs. Reality - Never Believe in What You See. In 19th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), Virtual Conference, 2020.
@inproceedings { ipsndemo2020,
author = {Huang, Yunfeng and Wu, Fang-Jing and Hakert, Christian and Br\"uggen, Georg and Chen, Kuan-Hsun and Chen, Jian-Jia and B\"ocker, Patrick and Chernikov, Petr and Cruz, Luis and Duan, Zeyi and Gheith, Ahmed and Gong, Yantao and Gopalan, Anand and Prakash, Karthik and Tauqir, Ammar and Wang, Yue},
title = {Demo Abstract: Perception vs. Reality - Never Believe in What You See},
booktitle = {19th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN)},
year = {2020},
address = {Virtual Conference},
keywords = {kuan, georg},
file = {https://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2020-ipsn.pdf},
confidential = {n},
abstract = {The increasing availability of heterogeneous ambient sensing systems challenges the corresponding information processing systems to analyse and compare a variety of different systems in a single scenario. For instance, localization of objects can be performed by image processing systems as well as by radio-based localization. If such systems are utilized to localize the same objects, synergy of the outputs is important to enable comparable and meaningful analysis. This demo showcases the practical deployment and challenges of such an example system.},
}
|
| Christian Hakert, Kuan-Hsun Chen, Mikail Yayla, Georg von der Brüggen, Sebastian Bloemeke and Jian-Jia Chen. Software-Based Memory Analysis Environments for In-Memory Wear-Leveling. In 25th Asia and South Pacific Design Automation Conference (ASP-DAC 2020), Invited Paper, Beijing, China, 2020.
@inproceedings { nvmsimulator,
author = {Hakert, Christian and Chen, Kuan-Hsun and Yayla, Mikail and von der Br\"uggen, Georg and Bloemeke, Sebastian and Chen, Jian-Jia},
title = {Software-Based Memory Analysis Environments for In-Memory Wear-Leveling},
booktitle = {25th Asia and South Pacific Design Automation Conference ASP-DAC 2020, Invited Paper},
year = {2020},
address = {Beijing, China},
keywords = {kuan, nvm-oma, georg},
file = {https://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2020-aspdac-nvm.pdf},
confidential = {n},
abstract = {Emerging non-volatile memory (NVM) architectures are considered as a replacement for DRAM and storage in the near future, since NVMs provide low power consumption, fast access speed, and low unit cost. Due to the lower write-endurance of NVMs, several in-memory wear-leveling techniques have been studied over the last years. Since most approaches propose or rely on specialized hardware, the techniques are often evaluated based on assumptions and in-house simulations rather than on real systems. To address this issue, we develop a setup consisting of a gem5 instance and an NVMain2.0 instance, which simulates an entire system (CPU, peripherals, etc.) together with an NVM plugged into the system. Taking a recorded memory access pattern from a low-level simulation into consideration to design and optimize wear-leveling techniques as operating system services allows a cross-layer design of wear-leveling techniques. With the insights gathered by analyzing the recorded memory access patterns, we develop a software-only wear-leveling solution, which does not require special hardware at all. This algorithm is evaluated afterwards by the full system simulation.},
}
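The recorded memory access patterns mentioned above lend themselves to simple offline analysis. As a minimal sketch (not the gem5/NVMain2.0 tooling itself; the `(address, is_write)` trace format and the 4 KiB page size are illustrative assumptions), per-page write counts could be aggregated like this:

```python
# Hypothetical sketch: aggregate per-page write counts from a recorded
# memory access trace, as input for designing a wear-leveling service.
# The trace format (address, is_write) is an assumption for illustration.
from collections import Counter

PAGE_SIZE = 4096  # bytes per page (assumed)

def page_write_counts(trace):
    """Count writes per page from (address, is_write) tuples."""
    counts = Counter()
    for addr, is_write in trace:
        if is_write:
            counts[addr // PAGE_SIZE] += 1
    return counts

trace = [(0x1000, True), (0x1008, True), (0x2000, False), (0x2010, True)]
counts = page_write_counts(trace)
# page 1 (0x1000-0x1FFF) receives two writes, page 2 receives one
```

A skewed histogram like this is exactly what motivates moving frequently written pages around, which the wear-leveling entries below address.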
|
| Christian Hakert, Kuan-Hsun Chen, Simon Kuenzer, Sharan Santhanam, Shuo-Han Chen, Yuan-Hao Chang, Felipe Huici and Jian-Jia Chen. Split’n Trace NVM: Leveraging Library OSes for Semantic Memory Tracing. In 9th Non-Volatile Memory Systems and Applications Symposium (NVMSA), Virtual Conference, 2020.
@inproceedings { hakert2020nvmsa,
author = {Hakert, Christian and Chen, Kuan-Hsun and Kuenzer, Simon and Santhanam, Sharan and Chen, Shuo-Han and Chang, Yuan-Hao and Huici, Felipe and Chen, Jian-Jia},
title = {Split’n Trace NVM: Leveraging Library OSes for Semantic Memory Tracing},
booktitle = {9th Non-Volatile Memory Systems and Applications Symposium (NVMSA)},
year = {2020},
address = {Virtual Conference},
keywords = {kuan, nvm-oma},
file = {https://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2020-nvmsa-hakert.pdf},
confidential = {n},
abstract = {With the rise of non-volatile memory (NVM) as a replacement for traditional main memories (e.g. DRAM), memory access analysis is becoming an increasingly important topic. NVMs suffer from technical shortcomings, such as reduced cell endurance, which call for precise memory access analysis in order to design maintenance strategies that can extend the memory’s lifetime. While existing memory access analyzers trace memory accesses at various levels, from the application level with code instrumentation, down to the hardware level where software is executed on special analysis hardware, they usually interpret main memory as a consecutive area, without investigating the application semantics of different memory regions.
In contrast, this paper presents a memory access simulator, which splits the main memory into semantic regions and enriches the simulation result with semantics from the analyzed application. We leverage a library-based operating system called Unikraft by ascribing memory regions of the simulation to the relevant OS libraries. This novel approach allows us to derive a detailed analysis of which libraries (and thus functionalities) are responsible for which memory access patterns. Through offline profiling with our simulator, we provide a fine-granularity analysis of memory access patterns that provide insights for the design of efficient NVM maintenance strategies.},
}
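The attribution step described above — mapping each traced address to the OS library that owns the containing region — can be sketched as an interval lookup. The region layout, library names, and trace addresses below are illustrative assumptions, not Unikraft’s actual memory map:

```python
# Hypothetical sketch of semantic attribution: given sorted, non-overlapping
# per-library memory regions, attribute each traced access to a region and
# build a per-library access histogram. Layout and names are assumed.
import bisect

regions = [  # (start, end, name) -- illustrative layout, not Unikraft's
    (0x0000, 0x0FFF, "libukboot"),
    (0x1000, 0x1FFF, "libukalloc"),
    (0x2000, 0x2FFF, "app"),
]
starts = [r[0] for r in regions]

def attribute(addr):
    """Return the name of the region containing addr, or 'unknown'."""
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0 and addr <= regions[i][1]:
        return regions[i][2]
    return "unknown"

hist = {}
for addr in [0x10, 0x1004, 0x1008, 0x2ABC]:  # assumed trace addresses
    name = attribute(addr)
    hist[name] = hist.get(name, 0) + 1
# hist now attributes one access to libukboot, two to libukalloc, one to app
```

Per-library histograms of this kind are what let the analysis say which functionality, not just which address range, produces a given access pattern.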
|
| Christian Hakert, Kuan-Hsun Chen, Paul R. Genssler, Georg Brüggen, Lars Bauer, Hussam Amrouch, Jian-Jia Chen and Jörg Henkel. SoftWear: Software-Only In-Memory Wear-Leveling for Non-Volatile Main Memory. CoRR, abs/2004.03244, 2020.
@article { hakert2020softwear,
author = {Hakert, Christian and Chen, Kuan-Hsun and Genssler, Paul R. and Br\"uggen, Georg and Bauer, Lars and Amrouch, Hussam and Chen, Jian-Jia and Henkel, J\"org},
title = {SoftWear: Software-Only In-Memory Wear-Leveling for Non-Volatile Main Memory},
journal = {CoRR},
year = {2020},
volume = {abs/2004.03244},
url = {https://arxiv.org/pdf/2004.03244.pdf},
keywords = {kuan, nvm-oma, georg},
confidential = {n},
abstract = {Several emerging technologies for byte-addressable non-volatile memory (NVM) have been considered to replace DRAM as the main memory in computer systems during the last years. The disadvantage of a lower write endurance, compared to DRAM, of NVM technologies like Phase-Change Memory (PCM) or Ferroelectric RAM (FeRAM) has been addressed in the literature. As a solution, in-memory wear-leveling techniques have been proposed, which aim to balance the wear-level over all memory cells to achieve an increased memory lifetime. Generally, to apply such advanced aging-aware wear-leveling techniques proposed in the literature, additional special hardware is introduced into the memory system to provide the necessary information about the cell age and thus enable aging-aware wear-leveling decisions.
This paper proposes software-only aging-aware wear-leveling based on common CPU features and does not rely on any additional hardware support from the memory subsystem. Specifically, we exploit the memory management unit (MMU), performance counters, and interrupts to approximate the memory write counts as an aging indicator. Although the software-only approach may lead to slightly worse wear-leveling, it is applicable on commonly available hardware. We achieve page-level coarse-grained wear-leveling by approximating the current cell age through statistical sampling and performing physical memory remapping through the MMU. This method results in non-uniform memory usage patterns within a memory page. Hence, we further propose a fine-grained wear-leveling in the stack region of C/C++ compiled software.
By applying both wear-leveling techniques, we achieve up to 78.43% of the ideal memory lifetime, which is a lifetime improvement of more than a factor of 900 compared to the lifetime without any wear-leveling. },
}
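The coarse-grained remapping idea can be sketched abstractly: move the logically hottest page onto the physically least-worn frame. This toy model assumes direct per-frame wear and per-page write counters (which SoftWear only approximates via performance counters and the MMU) and ignores the actual page copying and TLB maintenance:

```python
# Hypothetical sketch of page-level wear-leveling by remapping.
# mapping: logical page -> physical frame; counters are assumed to be
# directly available, unlike in the sampled, software-only setting.
def rebalance(mapping, frame_wear, page_writes):
    """Swap the hottest page's frame with the least-worn frame."""
    hot_page = max(page_writes, key=page_writes.get)
    cold_frame = min(frame_wear, key=frame_wear.get)
    # find which page currently occupies the least-worn frame
    other = next(p for p, f in mapping.items() if f == cold_frame)
    mapping[hot_page], mapping[other] = mapping[other], mapping[hot_page]
    return mapping

mapping = {0: 0, 1: 1, 2: 2}
frame_wear = {0: 900, 1: 50, 2: 400}   # writes absorbed per frame (assumed)
page_writes = {0: 1000, 1: 10, 2: 100}  # recent writes per page (assumed)
rebalance(mapping, frame_wear, page_writes)
# hot page 0 now maps to the least-worn frame 1
```

Repeating such swaps periodically is what spreads the write load; the frequency of rebalancing trades remapping overhead against wear balance.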
|
| Christian Hakert, Kuan-Hsun Chen and Jian-Jia Chen. Can Wear-Aware Memory Allocation be Intelligent?. In 2020 ACM/IEEE Workshop on Machine Learning for CAD (MLCAD ’20), November 16–20, 2020, Virtual Event, Iceland, 2020.
@inproceedings { mlcad2020intelliheap,
author = {Hakert, Christian and Chen, Kuan-Hsun and Chen, Jian-Jia},
title = {Can Wear-Aware Memory Allocation be Intelligent?},
booktitle = {2020 ACM/IEEE Workshop on Machine Learning for CAD (MLCAD ’20), November 16–20, 2020, Virtual Event, Iceland},
year = {2020},
file = {https://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2020-mlcad-hakert.pdf},
confidential = {n},
abstract = {Many non-volatile memories (NVM) suffer from severely reduced cell endurance and therefore require wear-leveling. Heap memory, as one segment that is potentially mapped to an NVM, exhibits strongly application-dependent characteristics regarding the amount of memory accesses and allocations. A simple deterministic strategy for wear-leveling of the heap may suffer when the available action space becomes too large. Therefore, we investigate the employment of a reinforcement learning agent as a substitute for such a strategy in this paper. The agent’s objective is to learn a strategy that is optimal with respect to the total memory wear-out. We conclude this work with an evaluation in which we compare the deterministic strategy with the proposed agent. We report that our proposed agent outperforms the simple deterministic strategy in several cases. However, we also report further optimization potential in the agent design and deployment.},
}
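The framing of allocation as a learning problem can be illustrated with a toy epsilon-greedy bandit that rewards handing out less-worn heap blocks. This is a stand-in sketch, not the paper’s agent; the reward shaping, step size, exploration rate, and block granularity are all assumptions:

```python
# Hypothetical sketch: each allocation is an action (which heap block to
# hand out); less-worn blocks yield higher reward, so the agent learns to
# rotate allocations and balance wear. Toy model, not the paper's design.
import random

random.seed(0)
NUM_BLOCKS = 4
wear = [0] * NUM_BLOCKS        # writes each heap block has absorbed
q = [0.0] * NUM_BLOCKS         # estimated reward for choosing each block

for step in range(500):
    if random.random() < 0.1:                       # explore
        a = random.randrange(NUM_BLOCKS)
    else:                                           # exploit
        a = max(range(NUM_BLOCKS), key=lambda i: q[i])
    reward = -wear[a]           # less-worn blocks give higher reward
    wear[a] += 1                # serving the allocation wears the block
    q[a] += 0.5 * (reward - q[a])   # constant step size: wear is nonstationary
```

Because the reward of a block drops as it wears, a greedy policy naturally cycles through blocks, which is the balancing behaviour a deterministic strategy would have to encode by hand.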
|
| Sebastian Buschjäger, Jian-Jia Chen, Kuan-Hsun Chen, Mario Günzel, Christian Hakert, Katharina Morik, Rodion Novkin, Lukas Pfahler and Mikail Yayla. Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized Neural Networks. CoRR, abs/2002.00909, 2020.
@article { buschjger2020explainable,
author = {Buschj\"ager, Sebastian and Chen, Jian-Jia and Chen, Kuan-Hsun and G\"unzel, Mario and Hakert, Christian and Morik, Katharina and Novkin, Rodion and Pfahler, Lukas and Yayla, Mikail},
title = {Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized Neural Networks},
journal = {CoRR},
year = {2020},
volume = {abs/2002.00909},
url = {https://arxiv.org/pdf/2002.00909.pdf},
keywords = {kuan, nvm-oma, mario},
confidential = {n},
abstract = {Non-volatile memory, such as resistive RAM (RRAM), is an emerging energy-efficient storage, especially for low-power machine learning models on the edge. It is reported, however, that the bit error rate of RRAMs can be up to 3.3% in the ultra low-power setting, which might be crucial for many use cases. Binary neural networks (BNNs), a resource efficient variant of neural networks (NNs), can tolerate a certain percentage of errors without a loss in accuracy and demand lower resources in computation and storage. The bit error tolerance (BET) in BNNs can be achieved by flipping the weight signs during training, as proposed by Hirtzlin et al., but their method has a significant drawback, especially for fully connected neural networks (FCNN): The FCNNs overfit to the error rate used in training, which leads to low accuracy under lower error rates. In addition, the underlying principles of BET are not investigated. In this work, we improve the training for BET of BNNs and aim to explain this property. We propose straight-through gradient approximation to improve the weight-sign-flip training, by which BNNs adapt less to the bit error rates. To explain the achieved robustness, we define a metric that aims to measure BET without fault injection. We evaluate the metric and find that it correlates with accuracy over error rate for all FCNNs tested. Finally, we explore the influence of a novel regularizer that optimizes with respect to this metric, with the aim of providing a configurable trade-off in accuracy and BET.},
}
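The fault model behind such bit-error-tolerance experiments can be sketched by flipping stored weight signs with the reported error probability and measuring how many outputs of a toy binarized model change. Everything here (the one-neuron model, the random data, and using the 3.3% rate as a flip probability) is an illustrative assumption, not the paper’s training method or metric:

```python
# Hypothetical sketch of bit-error injection into binarized weights:
# each +1/-1 weight flips sign with probability p, modelling an RRAM bit
# error; we then measure how often the faulty model's outputs still match
# the clean model's outputs. Toy setup for illustration only.
import random

random.seed(42)

def binarize(x):
    return 1 if x >= 0 else -1

def predict(weights, x):
    return binarize(sum(w * xi for w, xi in zip(weights, x)))

def inject(weights, p):
    # each stored weight flips sign with probability p (a bit error)
    return [-w if random.random() < p else w for w in weights]

weights = [1, -1, 1, 1, -1, 1, -1, 1]       # toy "trained" binarized weights
inputs = [[binarize(random.gauss(0, 1)) for _ in weights] for _ in range(200)]
labels = [predict(weights, x) for x in inputs]   # clean-model outputs

faulty = inject(weights, 0.033)             # 3.3% flip rate (assumed model)
match = sum(predict(faulty, x) == y
            for x, y in zip(inputs, labels)) / len(labels)
# match is the fraction of outputs unchanged under bit errors
```

Averaging `match` over many injection trials would give an empirical tolerance estimate; the paper’s contribution is a metric that predicts this robustness without performing the injection at all.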
|