| Jörg Henkel, Lars Bauer, Joachim Becker, Oliver Bringmann, Uwe Brinkschulte, Samarjit Chakraborty, Michael Engel, Rolf Ernst, Hermann Härtig, Lars Hedrich, Andreas Herkersdorf, Rüdiger Kapitza, Daniel Lohmann, Peter Marwedel, Marco Platzner, Wolfgang Rosenstiel, Ulf Schlichtmann, Olaf Spinczyk, Mehdi Tahoori, Jürgen Teich, Norbert Wehn and Hans-Joachim Wunderlich. Design and Architectures for Dependable Embedded Systems. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Taipei, Taiwan, October 2011 [BibTeX][PDF][Abstract]@inproceedings { SPP1500:11,
author = {Henkel, J{\"o}rg and Bauer, Lars and Becker, Joachim and Bringmann, Oliver and Brinkschulte, Uwe and Chakraborty, Samarjit and Engel, Michael and Ernst, Rolf and H{\"a}rtig, Hermann and Hedrich, Lars and Herkersdorf, Andreas and Kapitza, R{\"u}diger and Lohmann, Daniel and Marwedel, Peter and Platzner, Marco and Rosenstiel, Wolfgang and Schlichtmann, Ulf and Spinczyk, Olaf and Tahoori, Mehdi and Teich, J{\"u}rgen and Wehn, Norbert and Wunderlich, Hans-Joachim},
title = {Design and Architectures for Dependable Embedded Systems},
booktitle = {Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)},
year = {2011},
address = {Taipei, Taiwan},
month = {oct},
keywords = {ders},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-esweek-marwedel.pdf},
confidential = {n},
abstract = {The paper presents an overview of a major research project on dependable embedded systems that started in Fall 2010 and is running for a projected duration of six years. The aim is a `dependability co-design' that spans various levels of abstraction in the design process of embedded systems, from the gate level through the operating system and application software up to the system architecture. In addition, we present a new classification of faults, errors, and failures.},
} The paper presents an overview of a major research project on dependable embedded systems that started in Fall 2010 and is running for a projected duration of six years. The aim is a 'dependability co-design' that spans various levels of abstraction in the design process of embedded systems, from the gate level through the operating system and application software up to the system architecture. In addition, we present a new classification of faults, errors, and failures.
|
| Constantin Timm, Frank Weichert, David Fiedler, Christian Prasse, Heinrich Müller, Michael Hompel and Peter Marwedel. Decentralized Control of a Material Flow System enabled by an Embedded Computer Vision System. In Proceedings of the IEEE ICC 2011 Workshop on Embedding the Real World into the Future Internet June 2011 [BibTeX][PDF][Abstract]@inproceedings { Timm:2011a,
author = {Timm, Constantin and Weichert, Frank and Fiedler, David and Prasse, Christian and M{\"u}ller, Heinrich and Hompel, Michael and Marwedel, Peter},
title = {Decentralized Control of a Material Flow System enabled by an Embedded Computer Vision System},
booktitle = {Proceedings of the IEEE ICC 2011 Workshop on Embedding the Real World into the Future Internet},
year = {2011},
month = {jun},
file = {http://dx.doi.org/10.1109/iccw.2011.5963564},
confidential = {n},
abstract = {In this study, a novel sensor/actuator network approach for scalable automated facility logistics systems is presented. The approach comprises (1) a new sensor combination (cameras and a few RFID scanners) for distributed detection, localization and identification of parcels and bins and (2) a novel middleware approach based on a service-oriented architecture tailored towards utilization in sensor/actuator networks. The latter enables more flexible deployment of automated facility logistics systems, while the former presents a novel departure for the detection and tracking of bins and parcels in such systems: light barriers and bar code readers are substituted by low-cost cameras, local conveyor-mounted embedded evaluation units and a few RFID readers. By combining vision-based systems and RFID systems, this approach can compensate for the drawbacks of each respective system. By utilizing a state-of-the-art middleware for connecting all computer systems of an automated facility logistics system, the costs for deploying and reconfiguring the system can be decreased.
The paper describes image processing methods specific to the given problem for tracking and reading visual markers attached to parcels or bins, for processing the data on an embedded system, and for the communication/middleware aspects between the different computer systems of an automated facility logistics system, such as a database holding the loading and routing information of the conveyed objects as a service for the different visual sensor units. In addition, information from the RFID system is used to narrow the decision space for detection and identification. From an economic point of view, this approach enables a high density of identification while lowering hardware costs compared to state-of-the-art applications and, due to decentralized control, minimizing the effort for (re-)configuration. These innovations will make automated material flow systems more cost-efficient.},
} In this study, a novel sensor/actuator network approach for scalable automated facility logistics systems is presented. The approach comprises (1) a new sensor combination (cameras and a few RFID scanners) for distributed detection, localization and identification of parcels and bins and (2) a novel middleware approach based on a service-oriented architecture tailored towards utilization in sensor/actuator networks. The latter enables more flexible deployment of automated facility logistics systems, while the former presents a novel departure for the detection and tracking of bins and parcels in such systems: light barriers and bar code readers are substituted by low-cost cameras, local conveyor-mounted embedded evaluation units and a few RFID readers. By combining vision-based systems and RFID systems, this approach can compensate for the drawbacks of each respective system. By utilizing a state-of-the-art middleware for connecting all computer systems of an automated facility logistics system, the costs for deploying and reconfiguring the system can be decreased.
The paper describes image processing methods specific to the given problem for tracking and reading visual markers attached to parcels or bins, for processing the data on an embedded system, and for the communication/middleware aspects between the different computer systems of an automated facility logistics system, such as a database holding the loading and routing information of the conveyed objects as a service for the different visual sensor units. In addition, information from the RFID system is used to narrow the decision space for detection and identification. From an economic point of view, this approach enables a high density of identification while lowering hardware costs compared to state-of-the-art applications and, due to decentralized control, minimizing the effort for (re-)configuration. These innovations will make automated material flow systems more cost-efficient.
|
| Michael Engel, Florian Schmoll, Andreas Heinig and Peter Marwedel. Temporal Properties of Error Handling for Multimedia Applications. In Proceedings of the 14th ITG Conference on Electronic Media Technology Dortmund / Germany, February 2011 [BibTeX][PDF][Abstract]@inproceedings { engel:11:itg,
author = {Engel, Michael and Schmoll, Florian and Heinig, Andreas and Marwedel, Peter},
title = {Temporal Properties of Error Handling for Multimedia Applications},
booktitle = {Proceedings of the 14th ITG Conference on Electronic Media Technology},
year = {2011},
address = {Dortmund / Germany},
month = {feb},
keywords = {ders},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-itg-engel.pdf},
confidential = {n},
abstract = {In embedded consumer electronics devices, cost pressure is one of the driving design objectives. Devices that handle multimedia information, like DVD players or digital video cameras, require high computing performance and real-time capabilities while adhering to the cost restrictions. The cost pressure often results in system designs that barely exceed the minimum requirements for such a system.
Thus, hardware-based fault tolerance methods are frequently ignored due to their cost overhead. However, the number of transient faults showing up in semiconductor-based systems is expected to increase sharply in the near future. Thus, low-overhead methods to correct related errors in such systems are required. Considering restrictions in processing speed, the real-time properties of a system with added error handling are of special interest. In this paper, we present our approach to flexible error handling and discuss the challenges as well as the inherent timing dependencies of deploying it in a typical soft real-time multimedia system, an H.264 video decoder.},
} In embedded consumer electronics devices, cost pressure is one of the driving design objectives. Devices that handle multimedia information, like DVD players or digital video cameras, require high computing performance and real-time capabilities while adhering to the cost restrictions. The cost pressure often results in system designs that barely exceed the minimum requirements for such a system.
Thus, hardware-based fault tolerance methods are frequently ignored due to their cost overhead. However, the number of transient faults showing up in semiconductor-based systems is expected to increase sharply in the near future. Thus, low-overhead methods to correct related errors in such systems are required. Considering restrictions in processing speed, the real-time properties of a system with added error handling are of special interest. In this paper, we present our approach to flexible error handling and discuss the challenges as well as the inherent timing dependencies of deploying it in a typical soft real-time multimedia system, an H.264 video decoder.
|
| Michael Engel, Florian Schmoll, Andreas Heinig and Peter Marwedel. Unreliable yet Useful -- Reliability Annotations for Data in Cyber-Physical Systems. In Proceedings of the 2011 Workshop on Software Language Engineering for Cyber-physical Systems (WS4C) Berlin / Germany, October 2011 [BibTeX][PDF][Abstract]@inproceedings { engel:11:ws4c,
author = {Engel, Michael and Schmoll, Florian and Heinig, Andreas and Marwedel, Peter},
title = {Unreliable yet Useful -- Reliability Annotations for Data in Cyber-Physical Systems},
booktitle = {Proceedings of the 2011 Workshop on Software Language Engineering for Cyber-physical Systems (WS4C)},
year = {2011},
address = {Berlin / Germany},
month = {oct},
keywords = {ders},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-ws4c-engel.pdf},
confidential = {n},
abstract = {Today, cyber-physical systems face yet another challenge in addition to the traditional constraints in energy, computing power, or memory. Shrinking semiconductor structure sizes and supply voltages imply that the number of errors that manifest themselves in a system will rise significantly. Most CP systems have to survive errors, but many systems do not have sufficient resources to correct all errors that show up. Thus, it is important to spend the available resources on handling errors with the most critical effect. We propose an ``unreliability'' annotation for data types in C programs that indicates whether an error showing up in a specific variable or data structure can cause a severe problem like a program crash or will only show rather negligible effects, e.g., a discolored pixel in video decoding. This classification of data is supported by static analysis methods that verify that the value contained in a variable marked as unreliable does not end up as part of a critical operation, e.g., an array index or loop termination condition. This classification enables several approaches to flexible error handling. For example, a CP system designer might choose to selectively safeguard variables marked as non-unreliable or to employ memories with different reliability properties to store the respective values.},
} Today, cyber-physical systems face yet another challenge in addition to the traditional constraints in energy, computing power, or memory. Shrinking semiconductor structure sizes and supply voltages imply that the number of errors that manifest themselves in a system will rise significantly. Most CP systems have to survive errors, but many systems do not have sufficient resources to correct all errors that show up. Thus, it is important to spend the available resources on handling errors with the most critical effect. We propose an "unreliability" annotation for data types in C programs that indicates whether an error showing up in a specific variable or data structure can cause a severe problem like a program crash or will only show rather negligible effects, e.g., a discolored pixel in video decoding. This classification of data is supported by static analysis methods that verify that the value contained in a variable marked as unreliable does not end up as part of a critical operation, e.g., an array index or loop termination condition. This classification enables several approaches to flexible error handling. For example, a CP system designer might choose to selectively safeguard variables marked as non-unreliable or to employ memories with different reliability properties to store the respective values.
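A concrete illustration may help: the abstract does not show the annotation syntax itself, so the following C sketch is purely hypothetical, assuming a clang-style annotate attribute as the carrier; the UNRELIABLE macro and the macroblock field names are invented.

    /* Hypothetical sketch; the paper's concrete syntax is not given in the
     * abstract. Assumes clang's annotate attribute as the carrier. */
    #include <stdint.h>

    #define UNRELIABLE __attribute__((annotate("unreliable")))

    typedef struct {
        UNRELIABLE uint8_t luma[16][16]; /* bit flips merely discolor pixels */
        int mb_x, mb_y;                  /* reliable: later used as array
                                            indices, so an error here could
                                            crash the decoder */
    } macroblock_t;

    /* A static analysis as described would reject code such as
     *     buf[unreliable_value] = 0;    // unreliable data as array index
     * while tolerating unreliable data in pure pixel arithmetic. */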
|
| Constantin Timm, Frank Weichert, Peter Marwedel and Heinrich Müller. Multi-Objective Local Instruction Scheduling for GPGPU Applications. In Proceedings of the International Conference on Parallel and Distributed Computing Systems 2011 (PDCS) Dallas, USA, December 2011 [BibTeX][PDF][Abstract]@inproceedings { timm:2011:pdcs,
author = {Timm, Constantin and Weichert, Frank and Marwedel, Peter and M{\"u}ller, Heinrich},
title = {Multi-Objective Local Instruction Scheduling for GPGPU Applications},
booktitle = {Proceedings of the International Conference on Parallel and Distributed Computing Systems 2011 (PDCS)},
year = {2011},
address = {Dallas, USA},
month = {dec},
publisher = {IASTED/ACTA Press},
file = {http://www.actapress.com/PaperInfo.aspx?paperId=453074},
confidential = {n},
abstract = {In this paper, a new optimization approach (MOLIS: Multi-Objective Local Instruction Scheduling) is presented which maximizes the performance and minimizes the energy consumption of GPGPU applications. The design process of writing efficient GPGPU applications is time-consuming. This disadvantage mainly arises from the fact that the optimization of an application is accomplished in an expensive trial-and-error manner without efficient compiler support. In particular, efficient register utilization and load balancing of the concurrently working instruction and memory pipelines have not been considered in the compilation process up to now. Another drawback of the state-of-the-art GPGPU application design process is that energy consumption is not taken into account, which is important in the face of green computing. In order to optimize performance and energy consumption simultaneously, a multi-objective genetic algorithm was utilized. The optimization of GPGPU applications in MOLIS employs local instruction scheduling methods. The optimization potential of MOLIS was evaluated by profiling the runtime and the energy consumption on a real platform. The optimization approach was tested with several real-world benchmarks stemming from the Nvidia CUDA examples, the VSIPL-GPGPU-Library and the Rodinia benchmark suite. By applying MOLIS to the real-world benchmarks, up to 9% energy and 12% runtime can be saved.},
} In this paper, a new optimization approach (MOLIS: Multi-Objective Local Instruction Scheduling) is presented which maximizes the performance and minimizes the energy consumption of GPGPU applications. The design process of writing efficient GPGPU applications is time-consuming. This disadvantage mainly arises from the fact that the optimization of an application is accomplished in an expensive trial-and-error manner without efficient compiler support. In particular, efficient register utilization and load balancing of the concurrently working instruction and memory pipelines have not been considered in the compilation process up to now. Another drawback of the state-of-the-art GPGPU application design process is that energy consumption is not taken into account, which is important in the face of green computing. In order to optimize performance and energy consumption simultaneously, a multi-objective genetic algorithm was utilized. The optimization of GPGPU applications in MOLIS employs local instruction scheduling methods. The optimization potential of MOLIS was evaluated by profiling the runtime and the energy consumption on a real platform. The optimization approach was tested with several real-world benchmarks stemming from the Nvidia CUDA examples, the VSIPL-GPGPU-Library and the Rodinia benchmark suite. By applying MOLIS to the real-world benchmarks, up to 9% energy and 12% runtime can be saved.
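To illustrate the multi-objective aspect (this is not MOLIS itself), a genetic algorithm over the two objectives named in the abstract needs a Pareto-dominance test to rank candidate instruction schedules. A minimal C sketch with invented names:

    #include <stdbool.h>

    typedef struct { double runtime_s; double energy_j; } objectives_t;

    /* a dominates b iff a is no worse in both objectives and strictly
     * better in at least one (both objectives are minimized). */
    static bool dominates(objectives_t a, objectives_t b) {
        bool no_worse = a.runtime_s <= b.runtime_s && a.energy_j <= b.energy_j;
        bool strictly = a.runtime_s <  b.runtime_s || a.energy_j <  b.energy_j;
        return no_worse && strictly;
    }

The schedules that no other candidate dominates form the Pareto front, from which a designer can pick a runtime/energy trade-off.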
|
| Timon Kelter, Heiko Falk, Peter Marwedel, Sudipta Chattopadhyay and Abhik Roychoudhury. Bus-Aware Multicore WCET Analysis through TDMA Offset Bounds. In Proceedings of the 23rd Euromicro Conference on Real-Time Systems (ECRTS), pages 3-12 Porto / Portugal, July 2011 [BibTeX][PDF][Abstract]@inproceedings { kelter:11:ecrts,
author = {Kelter, Timon and Falk, Heiko and Marwedel, Peter and Chattopadhyay, Sudipta and Roychoudhury, Abhik},
title = {Bus-Aware Multicore WCET Analysis through TDMA Offset Bounds},
booktitle = {Proceedings of the 23rd Euromicro Conference on Real-Time Systems (ECRTS)},
year = {2011},
pages = {3-12},
address = {Porto / Portugal},
month = {jul},
keywords = {wcet},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-ecrts_1.pdf},
confidential = {n},
abstract = {In the domain of real-time systems, the analysis of the timing behavior of programs is crucial for guaranteeing the schedulability and thus the safety of a system. Static analyses of the \textit{WCET} (Worst-Case Execution Time) have proven to be a key element for timing analysis, as they provide safe upper bounds on a program's execution time. For single-core systems, industrial-strength WCET analyzers are already available, but up to now, only first proposals have been made to analyze the WCET in multicore systems, where the different cores may interfere during the access to shared resources. An important example of this is a shared bus which connects the cores to a shared main memory. The time to gain access to the shared bus may vary significantly, depending on the bus arbitration protocol used and the access timings. In this paper, we propose a new technique for analyzing the duration of accesses to shared buses. We implemented a prototype tool which uses the new analysis and tested it on a set of real-world benchmarks. Results demonstrate that our analysis achieves the same precision as the best existing approach while drastically outperforming it in matters of analysis time.},
} In the domain of real-time systems, the analysis of the timing behavior of programs is crucial for guaranteeing the schedulability and thus the safety of a system. Static analyses of the WCET (Worst-Case Execution Time) have proven to be a key element for timing analysis, as they provide safe upper bounds on a program's execution time. For single-core systems, industrial-strength WCET analyzers are already available, but up to now, only first proposals have been made to analyze the WCET in multicore systems, where the different cores may interfere during the access to shared resources. An important example of this is a shared bus which connects the cores to a shared main memory. The time to gain access to the shared bus may vary significantly, depending on the bus arbitration protocol used and the access timings. In this paper, we propose a new technique for analyzing the duration of accesses to shared buses. We implemented a prototype tool which uses the new analysis and tested it on a set of real-world benchmarks. Results demonstrate that our analysis achieves the same precision as the best existing approach while drastically outperforming it in matters of analysis time.
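To give a flavor of what a TDMA offset bound captures, the following simplified C sketch computes the waiting time of a single bus access for one concrete offset; the analysis in the paper propagates intervals of such offsets through the program, and all parameter names here are assumptions:

    #include <stdint.h>

    /* Waiting time (in cycles) for core `core` issuing an access of `dur`
     * cycles at TDMA offset `off`, with `slot` cycles per slot and `cores`
     * cores sharing the bus. Assumes dur <= slot. */
    static uint32_t tdma_wait(uint32_t off, uint32_t core, uint32_t dur,
                              uint32_t slot, uint32_t cores) {
        uint32_t period = slot * cores;
        uint32_t start  = core * slot;            /* own slot begins here   */
        uint32_t rel    = (off + period - start) % period;
        if (rel + dur <= slot)                    /* fits into current slot */
            return 0;
        return period - rel;                      /* wait for next own slot */
    }

Tracking a bound on `off` rather than enumerating concrete offsets is what the title's "offset bounds" refers to.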
|
| Sascha Plazar, Jan C. Kleinsorge, Heiko Falk and Peter Marwedel. WCET-driven Branch Prediction aware Code Positioning. In Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES), pages 165-174 Taipei, Taiwan, October 2011 [BibTeX][PDF][Abstract]@inproceedings { plazar:11:cases,
author = {Plazar, Sascha and Kleinsorge, Jan C. and Falk, Heiko and Marwedel, Peter},
title = {WCET-driven Branch Prediction aware Code Positioning},
booktitle = {Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES)},
year = {2011},
pages = {165-174},
address = {Taipei, Taiwan},
month = {oct},
keywords = {wcet},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-cases_2.pdf},
confidential = {n},
abstract = {In the past decades, embedded system designers moved from simple, predictable system designs towards complex systems equipped with caches, branch prediction units and speculative execution. This step was necessary in order to fulfill increasing requirements on computational power. Static analysis techniques considering such speculative units had to be developed to allow the estimation of an upper bound of the execution time of a program. This bound is called the worst-case execution time (WCET). Its knowledge is crucial to verify whether hard real-time systems satisfy their timing constraints, and the WCET is a key parameter for the design of embedded systems.
In this paper, we propose a WCET-driven branch prediction aware optimization which reorders the basic blocks of a function in order to reduce the number of jump instructions and mispredicted branches. We employed a genetic algorithm which rearranges basic blocks in order to decrease the WCET of a program. This enables a first estimation of the possible optimization potential at the cost of high optimization runtimes. To avoid time-consuming repetitive WCET analyses, we developed a new algorithm employing integer-linear programming (ILP). The ILP models the worst-case execution path (WCEP) of a program and takes branch prediction effects into account. This algorithm enables short optimization runtimes at the price of slightly decreased optimization results. In a case study, the genetic algorithm is able to reduce the benchmarks’ WCET by up to 24.7% whereas our ILP-based approach is able to decrease the WCET by up to 20.0%.
},
} In the past decades, embedded system designers moved from simple, predictable system designs towards complex systems equipped with caches, branch prediction units and speculative execution. This step was necessary in order to fulfill increasing requirements on computational power. Static analysis techniques considering such speculative units had to be developed to allow the estimation of an upper bound of the execution time of a program. This bound is called the worst-case execution time (WCET). Its knowledge is crucial to verify whether hard real-time systems satisfy their timing constraints, and the WCET is a key parameter for the design of embedded systems.
In this paper, we propose a WCET-driven branch prediction aware optimization which reorders the basic blocks of a function in order to reduce the number of jump instructions and mispredicted branches. We employed a genetic algorithm which rearranges basic blocks in order to decrease the WCET of a program. This enables a first estimation of the possible optimization potential at the cost of high optimization runtimes. To avoid time-consuming repetitive WCET analyses, we developed a new algorithm employing integer-linear programming (ILP). The ILP models the worst-case execution path (WCEP) of a program and takes branch prediction effects into account. This algorithm enables short optimization runtimes at the price of slightly decreased optimization results. In a case study, the genetic algorithm is able to reduce the benchmarks’ WCET by up to 24.7% whereas our ILP-based approach is able to decrease the WCET by up to 20.0%.
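As a simplified illustration of the optimization objective (the paper minimizes the WCET along the worst-case path, not a frequency-weighted sum), a code-positioning cost function can count the control-flow edges that do not become fall-throughs under a given block order; all names below are invented:

    #include <stddef.h>

    typedef struct { int src, dst; double freq; } edge_t;

    /* Cost of a linear block order, where pos[b] is the position of basic
     * block b: an edge is free if its target is placed directly after its
     * source (fall-through); otherwise it costs its execution frequency
     * (a jump or potentially mispredicted branch). */
    static double order_cost(const edge_t *e, size_t n, const int *pos) {
        double cost = 0.0;
        for (size_t i = 0; i < n; ++i)
            if (pos[e[i].dst] != pos[e[i].src] + 1)
                cost += e[i].freq;
        return cost;
    }

A genetic algorithm can mutate the `pos` permutation and re-evaluate such a cost, which mirrors the GA/ILP split described above: the GA explores orders, while the ILP model captures the effect of an order on the worst-case path.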
|
| Daniel Cordes, Andreas Heinig, Peter Marwedel and Arindam Mallik. Automatic Extraction of Pipeline Parallelism for Embedded Software Using Linear Programming. In Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems (ICPADS), 2011, pages 699 -706 Tainan, Taiwan, December 2011 [BibTeX][PDF][Abstract]@inproceedings { cordes:2011:icpads,
author = {Cordes, Daniel and Heinig, Andreas and Marwedel, Peter and Mallik, Arindam},
title = {Automatic Extraction of Pipeline Parallelism for Embedded Software Using Linear Programming},
booktitle = {Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems (ICPADS), 2011},
year = {2011},
pages = {699-706},
address = {Tainan, Taiwan},
month = {dec},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-icpads-cordes.pdf},
confidential = {n},
abstract = {The complexity and performance requirements of embedded software are continuously increasing, making Multiprocessor System-on-Chip (MPSoC) architectures more and more important in the domain of embedded and cyber-physical systems. Using multiple cores in a single system reduces problems concerning energy consumption and heat dissipation, and increases performance. Nevertheless, these benefits do not come for free. Porting existing, mostly sequential, applications to MPSoCs requires extracting efficient parallelism to utilize all available cores. Many embedded applications, like network services and multimedia tasks for voice, image and video processing, operate on data streams and thus have a streaming-based structure. Despite the abundance of parallelism in streaming applications, it is a non-trivial task to split and efficiently map sequential applications to MPSoCs. Therefore, we present an algorithm which automatically extracts pipeline parallelism from sequential ANSI-C applications. The presented tool employs an integer linear programming (ILP) based approach enriched with an adequate cost model to automatically control the granularity of the parallelization. By applying our tool to real-life applications, we show that our approach is able to speed up applications by a factor of up to 3.9 on a four-core MPSoC architecture, compared to sequential execution.},
} The complexity and performance requirements of embedded software are continuously increasing, making Multiprocessor System-on-Chip (MPSoC) architectures more and more important in the domain of embedded and cyber-physical systems. Using multiple cores in a single system reduces problems concerning energy consumption and heat dissipation, and increases performance. Nevertheless, these benefits do not come for free. Porting existing, mostly sequential, applications to MPSoCs requires extracting efficient parallelism to utilize all available cores. Many embedded applications, like network services and multimedia tasks for voice, image and video processing, operate on data streams and thus have a streaming-based structure. Despite the abundance of parallelism in streaming applications, it is a non-trivial task to split and efficiently map sequential applications to MPSoCs. Therefore, we present an algorithm which automatically extracts pipeline parallelism from sequential ANSI-C applications. The presented tool employs an integer linear programming (ILP) based approach enriched with an adequate cost model to automatically control the granularity of the parallelization. By applying our tool to real-life applications, we show that our approach is able to speed up applications by a factor of up to 3.9 on a four-core MPSoC architecture, compared to sequential execution.
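The kind of transformation the tool automates can be shown by hand on a toy streaming loop. The following C sketch (placeholder stage functions; pthreads chosen only for illustration) splits the sequential loop "for (i) out += encode(filter(i));" into two pipeline stages that run concurrently:

    #include <pthread.h>
    #include <stdio.h>

    #define N 1024

    static int filtered[N];              /* stage-1 -> stage-2 buffer */
    static int ready = 0;                /* items produced so far     */
    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;

    static int filter(int x) { return 2 * x; }  /* placeholder stage 1 */
    static int encode(int x) { return x + 1; }  /* placeholder stage 2 */

    static void *stage1(void *arg) {     /* first pipeline stage */
        (void)arg;
        for (int i = 0; i < N; ++i) {
            int v = filter(i);
            pthread_mutex_lock(&m);
            filtered[i] = v;
            ready = i + 1;
            pthread_cond_signal(&c);
            pthread_mutex_unlock(&m);
        }
        return NULL;
    }

    int main(void) {                     /* second stage on main thread */
        pthread_t t;
        long out = 0;
        pthread_create(&t, NULL, stage1, NULL);
        for (int i = 0; i < N; ++i) {
            pthread_mutex_lock(&m);
            while (ready <= i)           /* wait until item i exists */
                pthread_cond_wait(&c, &m);
            int v = filtered[i];
            pthread_mutex_unlock(&m);
            out += encode(v);
        }
        pthread_join(t, NULL);
        printf("%ld\n", out);
        return 0;
    }

Deciding where to cut the loop body, and whether the communication cost pays off, is what the ILP-based cost model described above settles automatically.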
|
| Peter Marwedel and Michael Engel. Embedded System Design 2.0: Rationale Behind a Textbook Revision. In Proceedings of Workshop on Embedded Systems Education (WESE) Taipei, Taiwan, October 2011 [BibTeX][PDF][Abstract]@inproceedings { marwedel:2011:wese,
author = {Marwedel, Peter and Engel, Michael},
title = {Embedded System Design 2.0: Rationale Behind a Textbook Revision},
booktitle = {Proceedings of Workshop on Embedded Systems Education (WESE)},
year = {2011},
address = {Taipei, Taiwan},
month = {oct},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-wese-marwedel.pdf},
confidential = {n},
abstract = {Seven years after its first release, it became necessary to publish a new edition of the author’s textbook on embedded system design. This paper explains the key changes that were incorporated into the second edition. These changes reflect seven years of teaching the subject, with two courses every year. The rationale behind these changes can also be found in the paper. In this way, the paper also reflects changes in the area over time, as the area matures. The paper helps in understanding why a particular topic is included in this curriculum for embedded system design and why a certain structure of the course is suggested.},
} Seven years after its first release, it became necessary to publish a new edition of the author’s textbook on embedded system design. This paper explains the key changes that were incorporated into the second edition. These changes reflect seven years of teaching the subject, with two courses every year. The rationale behind these changes can also be found in the paper. In this way, the paper also reflects changes in the area over time, as the area matures. The paper helps in understanding why a particular topic is included in this curriculum for embedded system design and why a certain structure of the course is suggested.
|
| Jan C. Kleinsorge, Heiko Falk and Peter Marwedel. A Synergetic Approach To Accurate Analysis Of Cache-Related Preemption Delay. In Proceedings of the International Conference on Embedded Software (EMSOFT), pages 329-338 Taipei, Taiwan, October 2011 [BibTeX][PDF][Abstract]@inproceedings { kleinsorge:11:emsoft,
author = {Kleinsorge, Jan C. and Falk, Heiko and Marwedel, Peter},
title = {A Synergetic Approach To Accurate Analysis Of Cache-Related Preemption Delay},
booktitle = {Proceedings of the International Conference on Embedded Software (EMSOFT)},
year = {2011},
pages = {329-338},
address = {Taipei, Taiwan},
month = {oct},
keywords = {wcet},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-emsoft.pdf},
confidential = {n},
abstract = {The worst-case execution time (WCET) of a task denotes the largest possible execution time for all possible inputs and thus, hardware states. For non-preemptive multitask scheduling, techniques for the static estimation of safe upper bounds have been established in industrial practice for years. For preemptive scheduling, however, the isolated analysis of tasks becomes imprecise as interferences among tasks cannot be considered with sufficient precision. For such scenarios, the cache-related preemption delay (CRPD) denotes a key metric as it reflects the effects of preemptions on the execution behavior of a single task. Until recently, proposals for CRPD analyses were often limited to direct-mapped caches or comparably imprecise for k-way set-associative caches.
In this paper, we propose how the current best techniques for CRPD analysis, which have only been proposed separately and for different aspects of the analysis, can be brought together to construct an efficient CRPD analysis with unique properties. Moreover, along the construction, we propose several different enhancements to the methods employed. We also exploit that in a complete approach, analysis steps are synergetic and can be combined into a single analysis pass solving all formerly separate steps at once. In addition, we argue that it is often sufficient to carry out the combined analysis at basic block boundaries, which further lowers the overall complexity. The result is a proposal for a fast CRPD analysis of very high accuracy.
},
} The worst-case execution time (WCET) of a task denotes the largest possible execution time for all possible inputs and thus, hardware states. For non-preemptive multitask scheduling, techniques for the static estimation of safe upper bounds have been established in industrial practice for years. For preemptive scheduling, however, the isolated analysis of tasks becomes imprecise as interferences among tasks cannot be considered with sufficient precision. For such scenarios, the cache-related preemption delay (CRPD) denotes a key metric as it reflects the effects of preemptions on the execution behavior of a single task. Until recently, proposals for CRPD analyses were often limited to direct-mapped caches or comparably imprecise for k-way set-associative caches.
In this paper, we propose how the current best techniques for CRPD analysis, which have only been proposed separately and for different aspects of the analysis, can be brought together to construct an efficient CRPD analysis with unique properties. Moreover, along the construction, we propose several different enhancements to the methods employed. We also exploit that in a complete approach, analysis steps are synergetic and can be combined into a single analysis pass solving all formerly separate steps at once. In addition, we argue that it is often sufficient to carry out the combined analysis at basic block boundaries, which further lowers the overall complexity. The result is a proposal for a fast CRPD analysis of very high accuracy.
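For context, a classic CRPD bound that such analyses refine multiplies the number of cache sets that are both useful to the preempted task (UCBs) and evicted by the preempting task (ECBs) with the miss penalty. The C sketch below models the direct-mapped case with invented names; the paper's handling of k-way set-associative caches is considerably more precise:

    #include <stdint.h>

    /* ucb, ecb: bitmasks over cache sets (bit i set = cache set i is
     * useful to the preempted task / evicted by the preempting task). */
    static uint32_t crpd_bound(uint64_t ucb, uint64_t ecb,
                               uint32_t miss_penalty) {
        uint64_t reload = ucb & ecb;      /* sets that must be reloaded */
        uint32_t n = 0;
        while (reload) {                  /* Kernighan popcount */
            reload &= reload - 1;
            ++n;
        }
        return n * miss_penalty;
    }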
|
| Peter Marwedel, Jürgen Teich, Georgia Kouveli, Iuliana Bacivarov, Lothar Thiele, Soonhoi Ha, Chanhee Lee, Qiang Xu and Lin Huang. Mapping of Applications to MPSoCs. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Taipei, Taiwan, October 2011 [BibTeX][PDF][Abstract]@inproceedings { marwedel:2011:codes-isss2,
author = {Marwedel, Peter and Teich, J\"urgen and Kouveli, Georgia and Bacivarov, Iuliana and Thiele, Lothar and Ha, Soonhoi and Lee, Chanhee and Xu, Qiang and Huang, Lin},
title = {Mapping of Applications to MPSoCs},
booktitle = {Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)},
year = {2011},
address = {Taipei, Taiwan},
month = {oct},
file = {http://ls12-www.cs.tu-dortmund.de/daes/media/documents/publications/downloads/2011-codes-isss-marwedel.pdf},
confidential = {n},
abstract = {The advent of embedded many-core architectures results in the need to come up with techniques for mapping embedded applications onto such architectures. This paper presents a representative set of such techniques. The techniques focus on optimizing performance, temperature distribution, reliability and fault tolerance for various models.},
} The advent of embedded many-core architectures results in the need to come up with techniques for mapping embedded applications onto such architectures. This paper presents a representative set of such techniques. The techniques focus on optimizing performance, temperature distribution, reliability and fault tolerance for various models.
|
| Constantin Timm, Pascal Libuschewski, Dominic Siedhoff, Frank Weichert, Heinrich Müller and Peter Marwedel. Improving Nanoobject Detection in Optical Biosensor Data. In Proceedings of the 5th International Symposium on Bio- and Medical Informatics and Cybernetics, BMIC 2011 July 2011 [BibTeX][PDF][Abstract]@inproceedings { Timm:2011b,
author = {Timm, Constantin and Libuschewski, Pascal and Siedhoff, Dominic and Weichert, Frank and M{\"u}ller, Heinrich and Marwedel, Peter},
title = {Improving Nanoobject Detection in Optical Biosensor Data},
booktitle = {Proceedings of the 5th International Symposium on Bio- and Medical Informatics and Cybernetics, BMIC 2011},
year = {2011},
month = {jul},
file = {http://www.iiis.org/CDs2011/CD2011SCI/BMIC_2011/PapersPdf/BA536CW.pdf},
confidential = {n},
abstract = {The importance of real-time capable mobile biosensors increases in the face of rising numbers of global virus epidemics. Such biosensors can be used for on-site diagnosis, e.g. at airports, to prevent further spread of virus-transmitted diseases by answering the question of whether or not a sample contains a certain virus. In-depth laboratory analysis might furthermore demand measurements of the concentration of virus particles in a sample. The novel PAMONO sensor technique allows for accomplishing both tasks. One of its basic prerequisites is an efficient analysis of the biosensor image data by means of digital image processing and classification. In this study, we present a high-performance approach to this analysis: The diagnosis of whether a virus occurs in the sample can be carried out in real time with high accuracy. An estimate of the concentration can be obtained in real time as well, if that concentration is not too high.
The contribution of this work is an optimization of our processing pipeline used for PAMONO sensor data analysis. The following objectives are optimized: detection quality, speed and consumption of resources (e.g. energy, memory). Thus, our approach respects the constraints imposed by medical applicability, as well as the constraints on resource consumption arising in embedded systems. The parameters to be optimized are descriptive (virus appearance parameters) and hardware-related (design space exploration).
},
} The importance of real-time capable mobile biosensors increases in the face of rising numbers of global virus epidemics. Such biosensors can be used for on-site diagnosis, e.g. at airports, to prevent further spread of virus-transmitted diseases by answering the question of whether or not a sample contains a certain virus. In-depth laboratory analysis might furthermore demand measurements of the concentration of virus particles in a sample. The novel PAMONO sensor technique allows for accomplishing both tasks. One of its basic prerequisites is an efficient analysis of the biosensor image data by means of digital image processing and classification. In this study, we present a high-performance approach to this analysis: The diagnosis of whether a virus occurs in the sample can be carried out in real time with high accuracy. An estimate of the concentration can be obtained in real time as well, if that concentration is not too high.
The contribution of this work is an optimization of our processing pipeline used for PAMONO sensor data analysis. The following objectives are optimized: detection quality, speed and consumption of resources (e.g. energy, memory). Thus, our approach respects the constraints imposed by medical applicability, as well as the constraints on resource consumption arising in embedded systems. The parameters to be optimized are descriptive (virus appearance parameters) and hardware-related (design space exploration).
|