Research Seminar

General Information

We regularly organize a seminar about the research topics that we are working on. Everyone is welcome to join.

The schedule for the next meetings and summaries of the old meetings are given below.

The seminar takes place bi-weekly on Mondays at 14:15 in OH16/E18, starting from 02.09.19.


The default sequence is as follows:

Christian Hakert, Qiao Yu, Mikail Yayla, Marco Dürr, Lea Schönberger, Niklas Ueter, Junjie Shi, Georg von der Brüggen, Kuan-Hsun Chen, Jian-Jia Chen


When there is an exceptional event, e.g., a rehearsal talk, the person who gives it is automatically added to the end of the default sequence, following a FIFO policy.

*If there is more than one exceptional event for the same person in the same round, the above rule is only triggered once.
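
The rule above can be sketched in a few lines of Python. This is only an illustration under one reading of the rule: a person who gives an exceptional talk is taken out of their current slot and appended to the end of the sequence, in the order in which the events occurred, and at most once per round. The function name `apply_exceptional_events` and the short placeholder names are hypothetical.

```python
def apply_exceptional_events(sequence, events):
    """Return the presenter order for one round.

    sequence: default presenter order for the round.
    events:   presenters of exceptional talks, in order of occurrence.
    Assumption: an exceptional talk moves its presenter to the end of
    the sequence (FIFO), and the rule fires once per person per round.
    """
    order = list(sequence)
    already_moved = set()
    for person in events:
        if person in already_moved:
            continue  # second exceptional event in the same round: ignored
        already_moved.add(person)
        if person in order:
            order.remove(person)
        order.append(person)  # FIFO: earlier events end up earlier in the tail
    return order


if __name__ == "__main__":
    print(apply_exceptional_events(["A", "B", "C", "D"], ["B", "A", "B"]))
```

With the placeholder sequence A, B, C, D and exceptional talks by B, A, and B again, the resulting order is C, D, B, A.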


Date Presenter Topic Reference Abstract
16.12.19 Junjie Shi
02.12.19 Jian-Jia Chen Writing Workshop
18.11.19 Niklas Ueter
04.11.19 Lea Schönberger
21.10.19 Qiao Yu Accelerating k-means algorithm based on efficient filtering methods Master Thesis K-means is a well-known clustering algorithm in data mining and machine learning. It is widely applicable in various domains such as computer vision, market segmentation, social network analysis, etc. However, k-means wastes a large amount of time on unnecessary distance calculations. Thus, accelerating k-means has become a worthy and important topic. Accelerated k-means algorithms achieve the same result as k-means, only faster. In this paper, we present a novel accelerated exact k-means algorithm named Fission-Fusion k-means that is significantly faster than the state-of-the-art accelerated k-means algorithms. The additional memory consumption of our algorithm is also much lower than that of other accelerated k-means algorithms. Fission-Fusion k-means accelerates k-means by automatically grouping a number of points during the iterations. It balances the expense of distance calculations against the time cost of filtering well. We conduct extensive experiments on real-world datasets. The experiments verify that Fission-Fusion k-means can considerably outperform the state-of-the-art accelerated k-means algorithms, especially when the datasets are low-dimensional and the number of clusters is quite large. In addition, for more separated and naturally clustered datasets, our algorithm is relatively faster than other accelerated k-means algorithms.
07.10.19 Marco Dürr End-to-End Timing Analysis of Sporadic Cause-Effect Chains in Distributed Systems Rehearsal Talk
A cause-effect chain is used to define the logical order of data dependent tasks, which is independent from the execution order of the jobs of the (periodic/sporadic) tasks. Analyzing the worst-case End-to-End timing behavior, associated to a cause-effect chain, is an important problem in embedded control systems. For example, the detailed timing properties of modern automotive systems are specified in the AUTOSAR Timing Extensions.
In this paper, we present a formal End-to-End timing analysis for distributed systems. We consider the two most important End-to-End timing semantics, i.e., the button-to-action delay (termed as the maximum reaction time) and the worst-case data freshness (termed as the maximum data age). Our contribution is significant due to the consideration of the sporadic behavior of job activations, whilst the results in the literature have been mostly limited to periodic activations. The proof strategy shows the (previously unexplored) connection between the reaction time (data age, respectively) and immediate forward (backward, respectively) job chains. Our analytical results dominate the state of the art for sporadic task activations in distributed systems and the evaluations show a clear improvement for synthesized task systems as well as for a real world automotive benchmark setting.
23.09.19 Project group F1/10 - Autonomous Racing PG 620: F1/10 - Autonomous Racing Live demo in CILAB
09.09.19 Jiang Bian/Mikail Yayla Parameter dropping during inference for efficient and scalable neural networks

Marco Dürr

End-to-End Timing Behavior Analysis of Immediate Forward and Backward Job-Chains based on AUTOSAR Timing Extensions 

Junjie Shi

How to Apply Dependency Graph Approach in Synchronizing Periodic Real-Time Tasks in Multiprocessor

Mostafa Jafari Nodoushan


Niklas Ueter

RTSS rehearsal talk

Lea Schönberger

Do Nothing, but Carefully: Fault Tolerance with Timing Guarantees for Multiprocessor Systems devoid of Online Adaption PRDC rehearsal talk Many practical real-time systems must be able to sustain several reliability threats induced by their physical environments that cause short-term abnormal system behavior, such as transient faults. To cope with this change of system behavior, online adaptions, which may introduce a high computation overhead, are performed in many cases to ensure the timeliness of the more important tasks while no guarantees are provided for the less important tasks. In this work, we propose a system model which does not require any online adaption, but, according to the concept of dynamic real-time guarantees, provides full timing guarantees as well as limited timing guarantees, depending on the system behavior. For the normal system behavior, timeliness is guaranteed for all tasks; otherwise, timeliness is guaranteed only for the more important tasks while bounded tardiness is ensured for the less important tasks. Aiming to provide such dynamic timing guarantees, we propose a suitable system model and discuss, how this can be established by means of partitioned as well as semi-partitioned strategies. Moreover, we propose an approach for handling abnormal behavior with a longer duration, such as intermittent faults or overheating of processors, by performing task migration in order to compensate the affected system component and to increase the system’s reliability. We show by comprehensive experiments that good acceptance ratios can be achieved under partitioned scheduling, which can be further improved under semi-partitioned strategies. In addition, we demonstrate that the proposed migration techniques lead to a reasonable trade-off between the decrease in schedulability and the gain in robustness of the system. The presented approaches can also be applied to mixed-criticality systems with two criticality levels.

Jian-Jia Chen


Marco Dürr

Anas Toma

Keynotes in retreat
03.09.18 Niklas Ueter

Lea Schönberger

Kuan-Hsun Chen

Analysis of Deadline Miss Rates for Uniprocessor Fixed-Priority Scheduling (Kuan) RTCSA rehearsals Timeliness is an important feature for many embedded systems. Although soft real-time embedded systems can tolerate and allow certain deadline misses, it is still important to quantify them to justify whether the considered systems are acceptable. In this paper, we provide a way to safely over-approximate the expected deadline miss rate for a specific sporadic real-time task under fixed-priority preemptive scheduling in uniprocessor systems. Our approach is compatible with the existing results in the literature that calculate the probability of deadline misses either based on convolution-based approaches or analytically. We demonstrate our approach by considering randomly generated task sets with an execution behavior that simulates jobs that are subjected to soft errors incurred by hardware transient faults under a given fault rate. To empirically gather the deadline miss rates, we implemented an event-based simulator with a fault-injection module and release the scripts. With extensive simulations under different fault rates, we evaluate the efficiency and the pessimism of our approach. The evaluation results show that our approach is effective in deriving an upper bound of the expected deadline miss rate and efficient with respect to the required computation time.
30.07.18 Junjie Shi Finding a better mapping for TensorFlow  Due to the high demand for running TensorFlow on heterogeneous platforms in the machine learning field, the mapping problem is becoming more and more important. We are trying to find an efficient way to tune a better mapping for TensorFlow on different heterogeneous platforms.
The state of the art is based on a Genetic Algorithm (GA) and Gradient Boosting Regressors. However, the results depend on the initial settings of the GA (e.g., starting points, probability of performing crossover, probability of mutation, and so on). We believe that our MBO method can outperform the existing methods.
28.06.18  Georg von der Brüggen ECRTS rehearsal 
28.06.18 Anas Toma OSPERT rehearsal
04.06.18 Prof. Fang-Jing Wu Crowd Estimation: Approximating Crowd Sizes with Multi-modal Data Invited Talk   Crowd mobility has attracted attention in Internet-of-Things (IoT) applications. We address the crowd estimation problem and build an IoT service to share the crowd estimation results across different systems. The crowd estimation problem is to approximate the crowd size in a targeted area using the observed information (e.g., Wi-Fi data). We exploit Wi-Fi probe request packets ("Wi-Fi probes" for short) broadcast by mobile devices to solve this problem. However, using only Wi-Fi probes to estimate the crowd size may yield inaccurate results due to various environmental uncertainties, which may lead to crowd overestimation or underestimation. Moreover, the ground truth is unavailable because the coverage of Wi-Fi signals is time-varying and invisible. Our system introduces auxiliary sensors, stereoscopic cameras, to collect the near ground truth at a specified calibration choke point. The key idea of the proposed crowd estimation algorithm is to calibrate the Wi-Fi-only crowd estimation based on the correlations between the two types of data modalities. To verify the proposed system, we have launched an indoor pilot study in the Wellington Railway Station and an outdoor pilot study in the Christchurch Re:START Mall in New Zealand. The large-scale pilot studies show that stereoscopic cameras can reach a minimum accuracy of 85% and high-precision detection for providing the near ground truth. The proposed calibration algorithms reduce estimation errors by 43.68% on average compared to the Wi-Fi-only approach.
14.05.18 Kuan and JJ  Infrastructure Meeting
23.04.18 Prof. Jian-Jia Chen  Internal Workshop
05.03.18 Kuan-Hsun Chen OPT of Decision Trees
05.02.18 Prof. Jian-Jia Chen
27.11.17 Ching-Chi Lin
20.11.17 Lea Schönberger Improving Hardware-Based Message Acceptance Filtering for Controller Area Network (CAN)  In the field of automotive engineering, the Controller Area Network (CAN) is frequently used for the purpose of connecting multiple Electronic Control Units (ECUs). Unfortunately, even though this technique is prevalent in the respective domain, it entails additional computation overhead, since CAN is a broadcast bus. In fact, any message transmitted is possibly received by each ECU and thus must be evaluated in terms of its relevance for the respective receiving node. Such filtering for desired messages can be performed either by means of hardware or of software mechanisms, the latter of which is preferably avoided due to the ECUs' resource limitations. Hardware-based approaches, in contrast, are much more cost-efficient and can reduce the number of irrelevant messages arriving at a receiving node drastically or even completely, provided that the configuration has been set properly.

We provide novel methods for finding a minimal (feasible) filter configuration in the first instance as well as for determining a minimal (feasible) configuration under limited hardware resources, i.e., an insufficient amount of filters, such that the number of unintentionally accepted messages is minimized.  
06.11.17 Junjie Shi The next phase of sub-project A3: Methods for Efficient Resource Utilization in Machine Learning Algorithm  During the optimization process in RAMBO, many points need to be evaluated, and different points have different profits and execution times. Also, if two points are close to each other, the total profit is less than the sum of their individual profits (informally, 1+1 < 2). My topic is how to evaluate these candidates to maximize the total profit under the constraints of a limited time budget and a limited number of CPUs (a kind of Quadratic Knapsack Problem).
23.10.17 Wei Liu

An Analytical Model for a GPU Architecture with Memory-level and Thread-level Parallelism Awareness

GPU architectures are increasingly important in the multi-core era due to their high number of parallel processors. Programming thousands of massively parallel threads is a big challenge for software engineers, but understanding the performance bottlenecks of those parallel programs on GPU architectures to improve application performance is even more difficult. Current approaches rely on programmers to tune their applications by exploiting the design space exhaustively without fully understanding the performance characteristics of their applications. To provide insights into the performance bottlenecks of parallel applications on GPU architectures, a simple analytical model is proposed to estimate the execution time of massively parallel programs. The key component of this model is estimating the number of parallel memory requests (we call this the memory warp parallelism) by considering the number of running threads and memory bandwidth. Based on the degree of memory warp parallelism, the model estimates the cost of memory requests, thereby estimating the overall execution time of a program. 
09.10.17 Anas Toma  Auxiliary Resources in Mobile Cloud Computing  A middleware was already presented in a former seminar to save energy in embedded systems by using nearby resources. In this talk, the results of experimental evaluations of the proposed middleware will be presented and discussed. They also include power consumption evaluation of the ODROID-XU4 for different operation modes.



Georg von der Brüggen

1) Parametric Utilization Bounds for Implicit-Deadline Periodic Tasks in Automotive Systems


2) Release Enforcement in Resource-Oriented Partitioned Scheduling for Multiprocessor Systems

Rehearsal: RTNS Conference Presentation

1) Fixed-priority scheduling has been widely used in safety-critical applications. This paper explores the parametric utilization bounds for implicit-deadline periodic tasks in automotive uniprocessor systems, where the period of a task is either 1, 2, 5, 10, 20, 50, 100, 200, or 1000 milliseconds. We prove a parametric utilization bound of 90%+z for such automotive task systems under rate-monotonic preemptive scheduling (RM-P), where z is a parameter defined by the input task set with 0 ≤ z ≤ 10%. Moreover, we explain how to perform an exact schedulability test for an automotive task set under RM-P by validating only three conditions. Furthermore, we extend our analyses to rate-monotonic non-preemptive scheduling (RM-NP). We show that very reasonable utilization values can still be achieved under RM-NP if the execution time of all tasks is below 1 millisecond. The analyses presented here are compatible with angle synchronous tasks by applying the related arrival curves. It is shown in the evaluations that scheduling those angle-synchronous tasks according to their minimum inter-arrival time instead of assigning them to the highest priority can drastically increase the acceptance ratio in some settings.


2) When partitioned scheduling is used in real-time multiprocessor systems, access to shared resources can jeopardize the schedulability if the task partition is not done carefully. To tackle this problem we change our view angle from focusing on the computing tasks to focusing on the shared resources by applying resource-oriented partitioned scheduling. We use a release enforcement technique to shape the interference from the higher-priority jobs to be sporadic, analyze the schedulability, and provide strategies for partitioning both the critical and the non-critical sections of tasks onto processors individually. Our approaches are shown to be effective, both in the evaluations and from a theoretical point of view by providing a speedup factor of 6, improving previously known results.

To tackle the unavoidable self-suspension behavior due to I/O-intensive interactions, multi-core processors, computation offloading systems with coprocessors, etc., the dynamic and the segmented self-suspension sporadic task models have been widely used in the literature. We propose new self-suspension models that are hybrids of the dynamic and the segmented models. Those hybrid models are capable of exploiting knowledge about execution paths, potentially reducing modelling pessimism. In addition, we provide the corresponding schedulability analysis under fixed-relative-deadline (FRD) scheduling and explain how the state-of-the-art FRD scheduling strategy can be applied. Empirically, these hybrid approaches are shown to be effective with regards to the number of schedulable task sets.
19.07.17 Wei Liu OpenCL Offloading framework on Virus Detection 
21.06.17 Prof. Jian-Jia Chen and Junjie Shi

Implementation and Evaluation of Multiprocessor Resource Synchronization Protocol (MrsP) on LITMUSRT

Rehearsal for ECRTS and OSPERT
07.06.17 Kuan-Hsun Chen Probabilistic Schedulability Tests for Uniprocessor Fixed-Priority Scheduling under Soft Errors Rehearsal Talk for SIES'17 Due to rising integrations, low voltage operations, and environmental influences such as electromagnetic interference and radiation, transient faults may cause soft errors and corrupt the execution state. Such soft errors can be recovered by applying fault-tolerant techniques. Therefore, the execution time of a job of a sporadic/periodic task may differ, depending upon the occurrence of soft errors and the applied error detection and recovery mechanisms. We model a periodic/sporadic real-time task under such a scenario by using two different worst-case execution times (WCETs), in which one is with the occurrence of soft errors and another is not. Based on a probabilistic soft-error model, the WCETs are hence with different probabilities. In this paper, we present efficient probabilistic schedulability tests that can be applied to verify the schedulability based on probabilistic arguments under fixed-priority scheduling on a uniprocessor system. We demonstrate how the Chernoff bounds can be used to calculate the task workloads based on their probabilistic WCETs. In addition, we further consider how to calculate the probability of -consecutive deadline misses of a task. The pessimism and the efficiency of our approaches are evaluated against the tighter and approximated convolution-based approaches, by running extensive evaluations under different soft-error rates. The evaluation results show that our approaches are effective to derive the probability of deadline misses and efficient with respect to the needed calculation time.
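
The general idea of bounding a workload with a Chernoff bound can be sketched as follows. This is a generic illustration of the textbook bound P(S >= t) <= E[exp(s*S)] / exp(s*t) for any s > 0, not the schedulability test from the talk. It assumes independent jobs, each with a normal WCET and an abnormal WCET that occurs with a given probability; the function name, the example numbers, and the grid of s values are all hypothetical.

```python
import math


def chernoff_overload_bound(jobs, t, s_values):
    """Upper-bound the probability that the total execution time reaches t.

    jobs:     list of (c_normal, c_abnormal, p_abnormal), assumed independent.
    t:        length of the interval the workload has to fit into.
    s_values: positive candidates for the Chernoff parameter s; every
              s > 0 yields a valid bound, so the minimum is kept.
    """
    best = 1.0  # a probability never exceeds 1
    for s in s_values:
        # log of the moment generating function of the summed workload
        log_mgf = sum(
            math.log((1.0 - p) * math.exp(s * c_n) + p * math.exp(s * c_a))
            for c_n, c_a, p in jobs
        )
        exponent = log_mgf - s * t
        if exponent < 0.0:
            best = min(best, math.exp(exponent))
    return best


if __name__ == "__main__":
    # Five jobs: 1 time unit normally, 2 time units with probability 0.01.
    jobs = [(1.0, 2.0, 0.01)] * 5
    s_grid = [0.1 * i for i in range(1, 101)]
    print(chernoff_overload_bound(jobs, t=8.0, s_values=s_grid))
```

A finer grid of s values can only tighten the result, since each candidate is a valid upper bound on its own.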
31.05.17 Santiago Pagani Ultra-low power and dependability for IoT devices
16.05.17 Kuan-Hsun Chen How to deploy experiments on our servers?
09.05.17 Ching-Chi Lin
Research Proposal: Energy-efficient Containers-to-Server and Tasks-to-Core mapping in Cloud Computing System
16.02.17 Kevin/Georg Framework for Empirical Evaluation on Schedulability Tests for Real-Time Scheduling Algorithms
02.02.17 Anas Toma Power-Aware Performance Adaptation of Concurrent Applications in Heterogeneous Many-Core Systems http://dl.acm.org/citation.cfm?id=2934612
19.01.17 Ingo Korb The Cake Cutting Problem Fair distribution of a finite set of resources among multiple agents can be a complex, but important problem -- consider for example the problem of splitting a disputed territory among multiple neighbouring countries. Mathematicians tend to formulate it in a more pleasant way by thinking of a cake as the shared resource to be distributed. This talk will present a few interesting algorithms and results from this problem space. 
08.12.16 Wei Liu A Simplified Acceleration Framework for Data Offloading and Workload Scheduling Today’s trend to use accelerators like GPGPUs in heterogeneous computer systems has entailed several low-level APIs for accelerator programming. However, programming with these APIs is often tedious and therefore unproductive. Seeking faster application performance without significant programming effort is necessary for scientific programmers. In this work, we present a parallel acceleration framework with a set of simplified API functions. In our framework, based on these API functions, a data offloading algorithm and a heterogeneous scheduling algorithm are explored. We compare the performance of our framework with CUDA for some real-world applications. The experimental results indicate that our framework is more efficient than low-level APIs while offering high programming efficiency.
17.11.16 (10:00) Georg von der Brüggen and Sheng-Wei Cheng Georg: Systems with Dynamic Real-Time Guarantees in Uncertain and Faulty Execution Environments Rehearsal: RTSS Conference Presentation In many practical real-time systems, the physical environment and the system platform can impose uncertain execution behaviour on the system. For example, if transient faults are detected, the execution time of a task instance can be increased due to recovery operations. Such fault recovery routines make the system very vulnerable with respect to meeting hard real-time deadlines. In theory and in practical systems, this problem is often handled by aborting less important tasks to guarantee the response time of the more important tasks. However, for most systems such faults occur rarely and the results of less important tasks might still be useful, even if they are a bit late. This suggests not aborting these less important tasks but keeping them running even if faults occur, provided that the more important tasks still meet their hard real-time properties. In this paper, we present Systems with Dynamic Real-Time Guarantees to model this behaviour and determine if the system can provide full timing guarantees or limited timing guarantees without any online adaptation after a fault occurred. We present a schedulability test, provide an algorithm for optimal priority assignment, determine the maximum interval length until the system will again provide full timing guarantees and explain how we can monitor the system state online. The approaches presented in this paper can also be applied to mixed criticality systems with dual criticality levels.
10.11.16 (10:00) Kevin Wen-Hung Huang


Resource-Oriented Partitioned Scheduling in Multiprocessor Systems: How to Partition and How to Share?


 Rehearsal: RTSS Conference Presentation
When concurrent real-time tasks have to access shared resources, to prevent race conditions, the synchronization and resource access must ensure mutual exclusion, e.g., by using semaphores. That is, no two concurrent accesses to one shared resource are in their critical sections at the same time. For uniprocessor systems, the priority ceiling protocol (PCP) has been widely accepted and supported in real-time operating systems. However, it is still arguable whether there exists a preferable approach for resource sharing in multiprocessor systems. In this paper, we show that the proposed resource-oriented partitioned scheduling using PCP combined with a reasonable allocation algorithm can achieve a non-trivial speedup factor guarantee. Specifically, we prove that our task mapping and resource allocation algorithm has a speedup factor 11.
27.10.16 Jian-Jia Chen
13.10.16 Georg von der Brüggen Uniprocessor Scheduling Strategies for Self-Suspending Task Systems
 Rehearsal: RTNS Conference Presentation

We study uniprocessor scheduling for hard real-time self-suspending task systems where each task may contain a single self-suspension interval. We focus on improving state-of-the-art fixed-relative-deadline (FRD) scheduling approaches, where an FRD scheduler assigns a separate relative deadline to each computation segment of a task. Then, FRD schedules different computation segments by using the earliest-deadline-first (EDF) scheduling policy, based on the assigned deadlines for the computation segments. Our proposed algorithm, Shortest Execution Interval First Deadline Assignment (SEIFDA), greedily assigns the relative deadlines of the computation segments, starting with the task with the smallest execution interval length, i.e., the period minus the self-suspension time. We show that any reasonable deadline assignment under this strategy has a speedup factor of 3. Moreover, we present how to approximate the schedulability test and a generalized mixed integer linear programming (MILP) that can be formulated based on the tolerable loss in the schedulability test defined by the users. We show by both analysis and experiments that through designing smarter relative deadline assignment policies, the resulting FRD scheduling algorithms yield significantly better performance than existing schedulers for such task systems.

22.09.16 Kuan-Hsun Chen

Overrun Handling for Mixed-Criticality Support in RTEMS

Rehearsal: WMC Workshop
Real-time operating systems are not only used in embedded real-time systems but also useful for the simulation and validation of those systems. During the evaluation of our paper about Systems with Dynamic Real-Time Guarantees that appears in RTSS 2016 we discovered certain unexpected system behavior in the open-source real-time operating system RTEMS. In the current implementation of RTEMS (version 4.11), overruns of an implicit-deadline task, i.e., deadline misses, result in unexpected system behavior as they may lead to a shift of the release pattern of the task. This also has the consequence that some task instances are not released as they should be. In this paper we explain the reason why such problems occur in RTEMS and our solutions.
08.09.16 (12:30) Anas Toma

Auxiliary Middleware Resources for Embedded Systems

  In this talk, a middleware for client-server applications will be presented. The middleware can be used to save energy on the client device and also to reduce the workload of the server. The main idea is to provide auxiliary resources through the middleware by exploiting the nearby devices. 
28.07.16  Ingo Korb


  Something about trees, probably of the decision kind
14.07.16  Wei Liu

Deep learning for Vision: The Caffe Framework

  Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and Matlab bindings for training and deploying general purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU. Caffe allows experimentation and seamless switching among platforms for ease of development from prototyping machines to cloud environments.
30.06.16 Jian-Jia Chen

Open research data

16.06.16 Georg von der Brüggen

Presentation course 

02.06.16 Kuan-Hsun Chen

Compensate or Ignore? Meeting Control Robustness Requirements through Adaptive Soft-Error Handling

Rehearsal Talk for LCTES'16 To avoid catastrophic events like unrecoverable system failures on mobile and embedded systems caused by soft errors, software-based error detection and compensation techniques have been proposed. Methods like error-correction codes or redundant execution can offer high flexibility and allow for application-specific fault-tolerance selection without the need for special hardware support. However, such software-based approaches may lead to system overload due to the execution time overhead. An adaptive deployment of such techniques to meet both application requirements and system constraints is desired. From our case study, we observe that a control task can tolerate limited errors with acceptable performance loss. Such tolerance can be modeled as an (m, k) constraint, which requires at least m out of any k consecutive runs to be correct. In this paper, we discuss how a given (m, k) constraint can be satisfied by adopting patterns of task instances with individual error detection and compensation capabilities. We introduce static strategies and provide a formal feasibility analysis for validation. Furthermore, we develop an adaptive scheme that extends our initial approach with online awareness that increases efficiency while preserving analysis results. The effectiveness of our method is shown in a real-world case study as well as for synthesized task sets.
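
To make the (m, k) constraint concrete, the following sketch checks a recorded sequence of run outcomes against it with a sliding window. It only illustrates the definition given in the abstract, not the static or adaptive strategies of the paper. The function name `satisfies_mk` is hypothetical, and treating a trace shorter than k runs as satisfying the constraint is an assumption.

```python
def satisfies_mk(outcomes, m, k):
    """Check that every window of k consecutive runs has at least m correct runs.

    outcomes: list of booleans, True meaning the run was correct.
    Assumption: a trace with fewer than k runs has no full window
    and is therefore treated as satisfying the constraint.
    """
    if k <= 0 or not 0 <= m <= k:
        raise ValueError("require k > 0 and 0 <= m <= k")
    if len(outcomes) < k:
        return True
    correct = sum(outcomes[:k])  # correct runs in the first window
    if correct < m:
        return False
    for i in range(k, len(outcomes)):
        # slide the window by one run: add the new run, drop the oldest
        correct += outcomes[i] - outcomes[i - k]
        if correct < m:
            return False
    return True


if __name__ == "__main__":
    trace = [True, True, False, True, True, False]
    print(satisfies_mk(trace, m=2, k=3))
```

For the trace above, every window of three runs contains exactly two correct runs, so a (2, 3) constraint holds while a (3, 3) constraint does not.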
19.05.16  Sheng-Wei Cheng

Many-Core Real-Time Task Scheduling with Scratchpad Memory 


 This work is motivated by the demand for scheduling tasks upon the increasingly popular island-based many-core architectures. On such an architecture, homogeneous cores are grouped into islands, each of which is equipped with a scratchpad memory module (referred to as local memory). We first show the NP-hardness and the inapproximability of the scheduling problem. Despite the inapproximability, positive results can still be found when different cases of the problem are investigated. A (3 − 1/F)-approximation algorithm is proposed for the minimization of the maximum system utilization, where F is the number of cores in the platform. When the technique of resource augmentation is considered, this paper further develops a (γ + 1)-memory (2γ−1)/(γ−1)-approximation algorithm, where γ represents the trade-off between CPU utilization and local memory space. On the other hand, a special case is also considered when the ratio of the worst-case execution time of a task without and with using the local memory is bounded by a constant. The capabilities of the proposed algorithms are then evaluated with benchmarks from MRTC, UTDSP, NetBench and DSPstone, where the maximum system utilization can be significantly reduced even when the local memory size is only 5% of the total footprint of all of the tasks.

19.05.16 Kevin Huang  

Utilization Bounds on Allocating Rate-Monotonic Scheduled Multi-Mode Tasks on Multiprocessor Systems

Rehearsal Talk for DAC'16


Formal models used for representing recurrent real-time processes have traditionally been characterized by a collection of jobs that are released periodically. However, such modeling may result in resource under-utilization in systems whose behaviors are not entirely periodic. For instance, tasks in cyber-physical systems (CPS) may change their service levels, e.g., periods and/or execution times, to adapt to changes in the environment. In this work, we study a model that is a generalization of the periodic task model, called the multi-mode task model: a task has several modes specified with different execution times and periods to switch between during runtime, independent of other tasks.
Moreover, we study the problem of allocating a set of multi-mode tasks on a homogeneous multiprocessor system.
We present a scheduling algorithm using any reasonable allocation decreasing (RAD) algorithm for task allocations for scheduling multi-mode tasks on multiprocessor systems.
We prove that this algorithm achieves 38% utilization for implicit-deadline rate-monotonic (RM) scheduled multi-mode tasks on multiprocessor systems.

10.03.16  Kuan-Hsun Chen

GetSURE-II Progress Report

Rehearsal Talk for SPP1500 
  1. Task Mapping for Redundant Multithreading in Multi-Cores with Reliability and Performance Heterogeneity
  2. Systems with Dynamic Real-Time Guarantees in Uncertain and Faulty Execution Environments
10.03.16 Kevin Huang Self-Suspension Real-Time Tasks under Fixed-Relative-Deadline Fixed-Priority Scheduling  Rehearsal Talk for DATE'16 Self-suspension is becoming a prominent characteristic in real-time systems such as (i) I/O-intensive systems, (ii) multi-core processors, and (iii) computation offloading systems with coprocessors, like Graphics Processing Units (GPUs). In this work, we study self-suspension systems under the fixed-priority (FP) fixed-relative-deadline (FRD) algorithm by using release enforcement to control self-suspension tasks' behavior. Specifically, we use equal-deadline assignment (EDA) to assign the release phases of computations and suspensions. We provide analysis for deriving the speedup factor of the FP FRD scheduler using suspension-laxity-monotonic (SLM) priority assignment. This is the first positive result to provide bounded speedup factor guarantees for general multi-segment self-suspending task systems.
18.02.16 (14:00) Anas Toma Brain-Computer Interface - Potential Research Areas   Brain-Computer Interface (BCI) is a communication or control technique based on reading the neural electrical activity of the human brain. The commands are detected in spatiotemporal electroencephalograms (EEG) recorded by electrodes distributed over the scalp. This technology is mainly used to help people with severe motor disabilities. In this talk, I will present a portable neuroheadset, a resource-constrained device that reads raw EEG data from the brain and offloads it to a remote processing unit. Furthermore, I will introduce different computation offloading techniques used especially in wearable devices. Finally, potential research areas and cooperation related to the presented techniques will be discussed (e.g. performance and energy optimization, real-time scheduling, reliability, parallel processing, pattern recognition, etc.).
21.01.16  Ingo Korb  Care and Feeding of Benchmarks   Runtime benchmarking appears to be a simple topic - just run the program and measure how long it took. There are however some pitfalls that can influence the run time of your program and thus increase the variance of your results. This talk will demonstrate a few of them and give hints for avoiding them.
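
The specific pitfalls covered in the talk are not listed here. As a hedged illustration of one common precaution, the sketch below repeats a measurement several times after a few warm-up runs and reports the minimum, median, and standard deviation instead of a single timing. The helper name `benchmark` and its parameters are hypothetical and not taken from the talk.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(func: Callable[[], object], repeats: int = 10, warmup: int = 2) -> Dict[str, float]:
    """Time func() several times and summarize the spread of the results."""
    if repeats < 2:
        raise ValueError("need at least two repeats to compute a deviation")
    # Warm-up runs are executed but not measured (caches, lazy imports, ...).
    for _ in range(warmup):
        func()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        func()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),
    }


if __name__ == "__main__":
    print(benchmark(lambda: sum(range(100_000))))
```

A large standard deviation relative to the median is a hint that something other than the program itself (background load, frequency scaling, and so on) is influencing the run time.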
07.01.16 Wei Liu  Data Offloading for Remote GPU Acceleration in Distributed Systems    

As computation-intensive applications become more common on mobile embedded systems, GPUs (Graphics Processing Units) can be used to accelerate them, even in distributed systems. Traditionally, applications use Remote Procedure Call (RPC) to access GPUs over the network. However, its simplicity also limits its efficiency in existing implementations. Specifically, the API requires the application to execute many system calls such as select, accept, read, and write. Each of these calls crosses the protection boundary between user space and the operating system, which is expensive. When several applications access GPUs on a remote server, concurrency on the server becomes a bottleneck, and the response time for applications on embedded systems increases considerably.

To solve this problem, we propose a computation offloading framework for remote GPU acceleration in distributed systems. The framework provides a set of APIs for remote GPU acceleration, and an offloading decision algorithm makes efficient use of the GPU resources in the network. Moreover, data communication in our framework is based on user-space network protocols. Compared with traditional Linux-based network protocols, our implementation can substantially increase the concurrency of GPU utilization when a large number of applications offload data to the GPU server. The average response time on GPU servers is also substantially reduced.

10.12.15  Maolin Yang 

The partitioned fixed-priority scheduling of multiprocessor real-time systems with shared resources.

  Multiprocessor scheduling has been studied for decades, and one well-known and well-understood scheduling policy is partitioned fixed-priority scheduling. However, when shared resources protected by suspension-based locks are modeled explicitly in the system, the situation is much less understood: which synchronization strategy should be used, and how should real-time tasks be partitioned among multiple cores? In this work, we present a dedicated-core synchronization framework, in which dedicated cores are reserved only for shared resources such that all requests to the shared resources are carried out on these cores. This synchronization strategy theoretically outperforms the other two well-known synchronization strategies in terms of speedup factor. Meanwhile, the interplay between task assignment and response-time analysis is avoided, which enables an efficient task assignment and simplifies the corresponding analysis.
26.11.15 Prof. Jian-Jia Chen / Kuan-Hsun Chen


Experiments and Benchmarks! How do they matter?
(Embedding 3-d figures in Portable Document Format (PDF) files)


12.11.15 Georg von der Brueggen


Dynamic Hard and Soft Real-Time Service Level Guarantees in Uncertain and Faulty Execution Environments 


In many practical real-time systems, the physical environment and the system platform impose some kind of uncertain behaviour on the system. If faults are detected, the execution time of a task instance can be enlarged due to recovery operations. These fault recovery routines make the system very vulnerable with respect to meeting hard real-time deadlines. In theory and in practical systems, this problem is often handled by aborting "not so important" tasks to guarantee the response time of the more important tasks. However, for most systems those faults occur rarely, and the results of "not so important" tasks might still be useful even if they are a bit late. This leads to the idea of not aborting these "not so important" tasks but keeping them running when faults occur, as long as the hard real-time properties of the important tasks are still guaranteed. We present a new task model, related schedulability tests, and an optimal priority assignment to handle this case.

29.10.15 Kevin Huang

Response Time Bounds for Sporadic Arbitrary-Deadline Tasks under Global Fixed-Priority Scheduling on Multiprocessors

  In this paper, we study the problem of scheduling arbitrary-deadline real-time sporadic task sets on a multiprocessor system under global fixed-priority scheduling. Two contributions are made in this paper. First, the existing response time analysis for arbitrary-deadline systems is shown to be flawed: the response time may be larger than the derived bound. This paper provides a revised analysis that resolves the problems with the original approach and then proposes a corresponding schedulability test. Second, we derive a linear-time upper bound on the response time of arbitrary-deadline tasks in multiprocessor systems. To the best of our knowledge, this is the first work presenting a linear-time response time upper bound for arbitrary-deadline sporadic tasks in multiprocessor systems. Empirically, this linear-time response time bound is shown to be highly effective in terms of the number of task sets that are deemed schedulable.
06.10.15 Wei Liu

Object Detection with Discriminatively Trained Part Based Models


We describe an object detection system based on mixtures of multiscale deformable part models. Mixtures of deformable part models are trained using a discriminative method that only requires bounding boxes for the objects in an image. The approach leads to efficient object detectors that achieve state of the art results on the PASCAL and INRIA person datasets.

22.09.15 Kuan-Hsun Chen

Reliability-Aware Task Mapping on Many-Cores with Performance Heterogeneity


Due to architectural design, process variations and aging, cores in many-core systems may exhibit heterogeneous performance.
A commonly adopted soft error mitigation technique is Redundant Multithreading (RMT), which achieves error detection and recovery through redundant thread execution on different cores for an application.
However, task mapping and determining the task execution mode (i.e. a task executes in a reliable mode with RMT or unreliable mode without RMT) need to be considered for achieving resource-efficient reliability.
This paper explores how to efficiently assign the tasks onto different cores with heterogeneous performance properties and determine the execution modes of tasks in order to achieve high reliability and satisfy the tolerance of timeliness.
Our results illustrate that, compared to the state of the art, the proposed approaches achieve up to 80% reliability improvement (20% on average) under different scenarios of chip frequency variation maps.

15.09.15 Emiily Wu

Online Energy Scheduling


Online energy scheduling is an online job scheduling problem. Each job has an arrival time, deadline, and execution time. A feasible solution is to finish every job before its deadline. A processor has three different states: busy, idle and sleep. The objective is to minimize the total energy consumption. In this talk I will introduce some previous work and share my preliminary idea for improvement.

25.08.15 Maolin Yang

Response-time analysis for multiprocessor real-time systems with shared resources


When real-time applications synchronize access to shared resources with binary semaphores (i.e., "mutexes" or suspension-based locks), a real-time locking protocol is required to bound the undesired priority inversions that increase the response time of tasks. This talk will discuss the challenges in semaphore protocol analysis and the most recent analytical method for semaphore protocols under fixed-priority scheduling (both G-FP and P-FP), and it will explore potential extensions to further improve schedulability.

11.08.15  Kevin Huang

Techniques for Schedulability Analysis in Mode Change Systems under Fixed-Priority Scheduling

 Accepted at RTCSA15

Rehearsal Talk for RTCSA

28.07.15 Prof. Jian-Jia Chen

What should be done if the paper is WRONG.




Wei Liu 

Parameter selection for Real-time tasks in Camera-based Object Detection System




Georg von der Brueggen 

Schedulability and Optimization Analysis for Non-Preemptive Static Priority Scheduling Based on Task Utilization and Blocking Factors

 Accepted ECRTS'15 paper

Rehearsal Talk 


Kuan-Hsun Chen

Semi-automatic R2Pi Navigation


Fachprojekt discussion


Kevin Huang

Timing Analysis of Real-Time Self-Suspending Tasks under Fixed-Priority Scheduling

 Accepted DAC'15 paper

Rehearsal Talk 


Georg von der Brueggen 

Monte-Carlo Method


A short introduction to the Monte-Carlo Method. The general idea of the Monte-Carlo Method is to use random sampling to get (hopefully good) solutions for mathematical, physical or computer science problems that are too difficult to solve analytically.
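
A standard textbook illustration of this idea (not necessarily the example used in the talk) is estimating π by random sampling: draw points uniformly in the unit square and count the fraction that falls inside the quarter circle of radius 1, whose area is π/4. The function name `estimate_pi` and the fixed seed are choices made for this sketch.

```python
import random


def estimate_pi(num_samples: int, seed: int = 0) -> float:
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    for n in (1_000, 100_000):
        print(n, estimate_pi(n))
```

The error of such an estimate shrinks roughly with the square root of the number of samples, which is why the results are only "hopefully good" for small sample counts.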


Kevin Huang

Timing Analysis of Real-Time Self-Suspending Tasks under Fixed-Priority Scheduling


Self-suspension is becoming an increasingly prominent characteristic in real-time systems such as:

(i) I/O-intensive systems, where applications interact intensively with I/O devices, (ii) multi-core processors, where tasks running on different cores have to synchronize and communicate with each other, and (iii) computation offloading systems with coprocessors, like Graphics Processing Units (GPUs).


Wei Liu

System-Level Performance Optimization through Data Offloading and Parallel GPU Executions


GPUs are becoming extremely important for improving system performance in many embedded systems. Running massively parallel workloads on GPUs is challenging for overall system performance, especially when massive workloads are executed concurrently. In this paper, we develop a mechanism to optimize the system-level performance. Two scheduling algorithms can be used to schedule parallel workloads with data offloading and parallel executions. Experiments show the performance of our algorithms and the feasibility of our mechanism across different platforms.


Kuan-Hsun Chen

Dependable Task Mapping on Many Cores


To mitigate soft errors, redundant copies of an application can execute on different cores in redundant multithreading (RMT) to achieve much higher reliability. On the other hand, due to the architectural design, process variations, and aging, individual cores in many-core systems may have heterogeneous performance. Therefore, we discuss how to allocate tasks to cores with heterogeneous performance for dependable execution.


Prof. Jian-Jia Chen

Prof. Peter Marwedel

How to write a paper 

Jian-Jia: (staff only)

Peter: #1(staff only), #2(staff only)



Georg von der Brueggen

dSpace Tutorial: Control Desk - Next Generation (Basic)

dSpace GmbH

Introduction to the basic concepts of the dSpace Control Desk. It can be used for hardware-in-the-loop simulation, using either hardware or software simulators. We covered the properties of the Control Desk, the creation and configuration of use cases, the definition of variables, input and output devices, and data recording.