PhD Thesis Defenses 2018

PhD thesis defenses are a public affair and open to anyone who is interested. Attending them is a great way to get to know the work going on by your peers in the various research groups. On this page you will find a list of upcoming and past defense talks.

Please go here for electronic access to most of the doctoral dissertations from Saarbrücken Computer Science going back to about 1990.


Quantifying & Characterizing Information Diets of Social Media Users
(Advisor: Prof. Krishna Gummadi)
Thursday, 20.12.2018, 16:00 h, in building E1 5, room 0.29


An increasing number of people are relying on online social media platforms like Twitter and Facebook to consume news and information about the world around them. This change has led to a paradigm shift in the way news and information is exchanged in our society – from traditional mass media to online social media.
With the changing environment, it’s essential to study the information consumption of social media users and to audit how automated algorithms (like search and recommendation systems) are modifying the information that social media users consume. In this thesis, we fulfill this high-level goal with a two-fold approach. First, we propose the concept of information diets as the composition of information produced or consumed. Next, we quantify the diversity and bias in the information diets that social media users consume via the three main consumption channels on social media platforms: (a) word of mouth channels that users curate for themselves by creating social links, (b) recommendations that platform providers give to the users, and (c) search systems that users use to find interesting information on these platforms. We measure the information diets of social media users along three different dimensions of topics, geographic sources, and political perspectives.
Our work is aimed at making social media users aware of the potential biases in their consumed diets, and at encouraging the development of novel mechanisms for mitigating the effects of these biases.

Applicable and Sound Polyhedral Optimization of Low-Level Programs
(Advisor: Prof. Sebastian Hack)
Wednesday, 19.12.2018, 10:15 h, in building E1 7, room 0.01


Computers become increasingly complex. Current and future systems feature configurable hardware, multiple cores with different capabilities, as well as accelerators. In addition, the memory subsystem becomes diversified too. The cache hierarchy grows deeper, is augmented with scratchpads, low-latency memory, and high-bandwidth memory. The programmer alone cannot utilize this enormous potential. Compilers have to provide insight into the program behavior, or even arrange computations and data themselves. Either way, they need a more holistic view of the program. Local transformations, which treat the iteration order, computation unit, and data layout as fixed, will not be able to fully utilize a diverse system.
The polyhedral model, a high-level program representation and transformation framework, has shown great success tackling various problems in the context of diverse systems. While it is widely acknowledged for its analytical powers and transformation capabilities, it is also widely assumed to be too restrictive and fragile for real-world programs.
In this thesis, we improve the applicability and profitability of polyhedral-model-based techniques. Our efforts guarantee a sound polyhedral representation and extend the applicability to a wider range of programs. In addition, we introduce new applications to utilize the information available in the polyhedral program representation, including standalone optimizations and techniques to derive high-level properties.

Alexander WEINERT
Optimality and Resilience in Parity Games
(Advisor: Dr. Martin Zimmermann)
Friday, 14.12.2018, 14:15 h, in building E1 7, room 0.01


Modeling reactive systems as infinite games has yielded a multitude of results in the fields of program verification and program synthesis. The canonical parity condition used for such games, however, neither suffices to express non-functional requirements on the modeled system, nor to capture malfunctions of the deployed system.
In this work we address these issues by investigating quantitative games in which the above characteristics can be expressed. Parity games with costs are a variant of parity games in which traversing an edge incurs some nonnegative cost. The cost of a play is the limit of the cost incurred between answering odd colors by larger even ones. We extend that model by allowing for integer costs, obtaining parity games with weights, and determine both the computational complexity of solving the resulting games as well as the memory required by the protagonist to win such games.
Our results show that the increase in expressiveness comes at a steep cost in terms of both measures. Subsequently, we address the optimality question, i.e., the question asking whether the protagonist is able to ensure a given bound on the cost, and show that answering this question is harder than just solving the underlying game, even for special cases of parity games with weights.
Moreover, we further extend this model to allow for modeling disturbances that may occur during the operation of the resulting strategy and show how to compute a strategy that is resilient against as many such disturbances as possible. Finally, we show that the protagonist is able to trade memory, quality, and resilience of strategies for one another and we show how to compute the possible tradeoffs between these three values.

Paarijaat ADITYA
Towards Privacy-Compliant Mobile Computing
(Advisor: Prof. Peter Druschel)
Monday, 03.12.2018, 16:30 h, in building E1 5, room 0.29


Sophisticated mobile computing, sensing and recording devices like smartphones, smartwatches, and wearable cameras are carried by their users virtually around the clock, blurring the distinction between the online and offline worlds. While these devices enable transformative new applications and services, they also introduce entirely new threats to users‘ privacy, because they can capture a complete record of the user’s location, online and offline activities, and social encounters, including an audiovisual record. Such a record of users‘ personal information is highly sensitive and is subject to numerous privacy risks. In this thesis, we have investigated and built systems to mitigate two such privacy risks:
1) privacy risks due to ubiquitous digital capture, where bystanders may inadvertently be captured in photos and videos recorded by other nearby users,
2) privacy risks to users‘ personal information introduced by a popular class of apps called ‘mobile social apps’. In this thesis, we present two systems, called I-Pic and EnCore, built to mitigate these two privacy risks.
Both systems aim to put the users back in control of what personal information is being collected and shared, while still enabling innovative new applications. We built working prototypes of both systems and evaluated them through actual user deployments. Overall we demonstrate that it is possible to achieve privacy-compliant digital capture and it is possible to build privacy-compliant mobile social apps, while preserving their intended functionality and ease-of-use. Furthermore, we also explore how the two solutions can be merged into a powerful combination, one which could enable novel workflows for specifying privacy preferences in image capture that do not currently exist.


Algorithmic Results for Clustering and Refined Physarum Analysis
(Advisor: Prof. Kurt Mehlhorn)
Tuesday, 27.11.2018, 16:15 h, in building E1 5, room 0.29


In the first part of this thesis, we study the Binary $ell_0$-Rank-$k$ problem which given a binary matrix $A$ and a positive integer $k$, seeks to find a rank-$k$ binary matrix $B$ minimizing the number of non-zero entries of $A-B$. A central open question is whether this problem admits a polynomial time approximation scheme. We give an affirmative answer to this question by designing the first randomized almost-linear time approximation scheme for constant $k$ over the reals, $mathbb{F}_2$, and the Boolean semiring. In addition, we give novel algorithms for important variants of $ell_0$-low rank approximation. The second part of this dissertation, studies a popular and successful heuristic, known as Approximate Spectral Clustering (ASC), for partitioning the nodes of a graph $G$ into clusters with small conductance. We give a comprehensive analysis, showing that ASC runs efficiently and yields a good approximation of an optimal $k$-way node partition of $G$. In the final part of this thesis, we present two results on slime mold computations: i) the continuous undirected Physarum dynamics converges for undirected linear programs with a non-negative cost vector; and ii) for the discrete directed Physarum dynamics, we give a refined analysis that yields strengthened and close to optimal convergence rate bounds, and shows that the model can be initialized with any strongly dominating point.

A Verified Compiler for a Linear Imperative/Functional Intermediate Language
(Advisors: Prof. Sebastian Hack & Prof. Gert Smolka)
Friday, 16.11.2018, 12:00 h, in building E1 7, room 0.01


This thesis describes the design of the verified compiler LVC. LVC’s main novelty is the way its first-order, term-based intermediate language IL realizes the advantages of static single assignment (SSA) for verified compilation. IL is a term-based language not based on a control-flow graph (CFG) but defined in terms of an inductively defined syntax with lexically scoped mutually recursive function definitions. IL replaces the usual dominance-based SSA definition found in unverified and verified compilers with the novel notion of coherence. The main research question this thesis studies is whether IL with coherence offers a faithful implementation of SSA, and how the design influences the correctness invariants and the proofs in the verified compiler LVC. To study this question, we verify dead code elimination, several SSA-based value optimizations including sparse conditional constant propagation and SSA-based register allocation approach including spilling. In these case studies, IL with coherence provides the usual advantages of SSA and improves modularity of proofs. Furthermore, we propose a novel SSA construction algorithm based on coherence, and leverage the term structure of IL to obtain an inductive proof method for simulation proofs. LVC is implemented and verified with over 50,000 lines of code using the proof assistant Coq. To underline practicability of our approach, we integrate LVC with CompCert to obtain an executable compiler that generates PowerPC assembly code.

Supporting User’s Influence in Gamification Settings and Game Live-Streams
(Advisor: Prof. Antonio Krüger)
Tuesday, 06.11.2018, 15:00 h, in building D3 2 (DFKI), room -2.17 (Reuse)


Playing games has long been important to mankind. One reason for this is the associated autonomy, as players can decide on many aspects on their own and can shape the experience. Game-related sub-fields have appeared in Human-Computer Interaction where this autonomy is questionable: in this thesis, we consider gamification and game live-streams and here, we support the users‘ influence at runtime. We hypothesize that this should affect the perception of autonomy and should lead to positive effects overall. Our contribution is three-fold: first, we investigate crowd-based, self-sustaining systems in which the user’s influence directly impacts the outcome of the system’s service. We show that users are willing to expend effort in such systems even without additional motivation, but that gamification is still beneficial here. Second, we introduce „bottom-up“ gamification, i.e., the idea of self-tailored gamification. Here, users have full control over the gamification used in a system, i.e., they can set it up as they see fit at the system’s runtime. Through user studies, we show that this has positive behavioral effects and thus adds to the ongoing efforts to move away from „one-size-fits-all“ solutions. Third, we investigate how to make gaming live-streams more interactive, and how viewers perceive this. We also consider shared game control settings in live-streams, in which viewers have full control, and we contribute options to support viewers‘ self-administration here.


Mining Sandboxes
(Advisor: Prof. Andreas Zeller)
Monday, 22.10.2018, 11:00 h, in building E9 1, room 0.01


Modern software is ubiquitous, yet insecure. It has the potential to expose billions of humans to serious harm, up to and including losing fortunes and taking lives. Existing approaches for securing programs are either exceedingly hard and costly to apply, significantly decrease usability, or just don’t work well enough against a determined attacker.
In this thesis we propose a new solution that significantly increases application security yet it is cheap, easy to deploy, and has minimal usability impact. We combine in a novel way the best of what existing techniques of test generation, dynamic program analysis and runtime enforcement have to offer: We introduce the concept of sandbox mining. First, in a phase called mining, we use automatic test generation to discover application behavior. Second, we apply a sandbox to limit any behavior during normal usage to the one discovered during mining. Users of an application running in a mined sandbox are thus protected from the application suddenly changing its behavior, as compared to the one observed during automatic test generation. As a consequence, backdoors, advanced persistent threats and other kinds of attacks based on the passage of time become exceedingly hard to conduct covertly. They are either discovered in the secure mining phase, where they can do no damage, or are blocked altogether. Mining is cheap because we leverage fully automated test generation to provide baseline behavior. Usability is not degraded: the sandbox runtime enforcement impact is negligible; the mined behavior is comprehensive and presented in a human readable format, thus any unexpected behavior changes are rare and easy to reason about. Our BOXMATE prototype for Android applications shows the approach is technically feasible, has an easy setup process, and is widely applicable to existing apps. Experiments conducted with BOXMATE show less than one hour is required to mine Android applications sandboxes, requiring few to no confirmations for frequently used functionality.


Correctness of Multi-Core Processors with Operating System Support
(Advisor: Prof. Wolfgang Paul)
Thursday, 27.09.2018, 10:00 h, in building E1 5, room 0.29


In the course of adding hardware support for operating systems and hypervisors (like Microsoft Hyper-V), we verify a realistic multi-core processor with integrated mechanisms for two levels of virtual address translation. The second level of address translation (SLAT) is required to allow guests of a hypervisor (typically operating systems) to execute their programs in translated mode. In hardware, the scheme of SLAT is implemented in a dedicated processor component, called the two-level memory management unit (MMU). Both the specification and the implementation of the two-level MMU are presented in full detail. The main contribution is the complete paper and pencil correctness proof for the pipelined multi-core implementation of the MIPS-86 ISA with SLAT. MIPS-86 combines the instruction set of MIPS with the memory model of x86. First, we consider the sequential implementation, which serves to demonstrate integration of the two-level MMU into the MIPS processor. In the proof of our main result — correctness of the pipelined implementation — we refer to the sequential case to show correctness of the MMU operation. This allows us to shift the focus towards the problems of pipelining a machine with speculative execution and interrupts, which are necessary to consider in combination with address translation.

Xucong ZHANG
Gaze Estimation and Interaction in Real-World Environments
(Advisor: Dr. Andreas Bulling)
Wednesday, 26.09.2018, 11:00 h, in building E1 4, room 0.19


Human eye gaze has been widely used in human-computer interaction, as it is a promising modality for natural, fast, pervasive, and non-verbal interaction between humans and computers. Appearance-based gaze estimation methods directly regress from eye images to gaze targets without eye feature detection, and therefore have great potential to work with ordinary devices. However, these methods require a large amount of domain-specific training data to cover the significant variability in eye appearance. In this thesis, we focus on developing appearance-based gaze estimation methods and corresponding attentive user interfaces with a single webcam for challenging real-world environments. We collected a large-scale gaze estimation dataset from real-world setting, proposed a full-face gaze estimation method, and studied data normalization. We applied our gaze estimation methods to real-world interactive application including eye contact detection and gaze estimation with multiple personal devices.

Yongtao SHUAI
Dynamic Adaptive Video Streaming with Minimal Buffer Sizes

(Advisor: Prof. Thorsten Herfet)
Wednesday, 19.09.2018, 16:00 h, in building E1 1, room 407


Recently, adaptive streaming has been widely adopted in video streaming services to improve the quality-of-experience (QoE) of video delivery over the Internet. However, state-of-the-art bitrate adaptation achieves satisfactory performance only with extensive buffering of several tens of seconds. This leads to high playback latency in video delivery, which is undesirable especially in the context of live content and interactive features with a low upper bound on the latency. Therefore, this thesis aims at pushing the application of adaptive streaming to its limit with respect to the buffering size, which is the dominant factor of the streaming latency.
In the current work, we first address the minimum buffering size required in adaptive streaming, which provides us with guidelines to determine a reasonable low latency for streaming systems. Then, we tackle the fundamental challenge of achieving such a low latency streaming by developing a novel adaptation algorithm that stabilizes buffer dynamics despite a small buffering size. We also present advanced improvements by designing a novel adaptation architecture with low-delay feedback for the bitrate selection and optimizing the underlying transport layer to offer efficient real-time streaming. Experimental evaluations demonstrate that our approach achieves superior QoE in adaptive video streaming, especially in the particularly challenging case of low latency streaming.


Understanding and Assessing Security on Android via Static Code Analysis
(Advisor: Prof. Michael Backes)
Tuesday, 28.08.2018, 14:00 h, in building E9 1, room 0.05


Smart devices have become a rich source of sensitive information including personal data (contacts and account data) and context information like GPS data that is continuously aggregated by onboard sensors. As a consequence, mobile platforms have become a prime target for malicious and over-curious applications. The growing complexity and the quickly rising number of mobile apps have further reinforced the demand for comprehensive application security vetting. This dissertation presents a line of work that advances security testing on Android via static code analysis. In the first part of this dissertation, we build an analysis framework that statically models the complex runtime behavior of apps and Android’s application framework (on which apps are built upon) to extract privacy and security-relevant data-flows. We provide the first classification of Android’s protected resources within the framework and generate precise API-to-permission mappings that excel over prior work. We then propose a third-party library detector for apps that is resilient against common code obfuscations to measure the outdatedness of libraries in apps and to attribute vulnerabilities to the correct software component. Based on these results, we identify root causes of app developers not updating their dependencies and propose actionable items to remedy the current status quo. Finally, we measure to which extent libraries can be updated automatically without modifying the application code.

Seong Joon OH
Image Manipulation against Learned Models: Privacy and Security Implications
(Advisor: Prof. Bernt Schiele)
Monday, 06.08.2018, 16:00 h, in building E1 4, room 0.24


Machine learning is transforming the world. Its application areas span privacy sensitive and security critical tasks such as human identification and self-driving cars. These applications raise privacy and security related questions that are not fully understood or answered yet: Can automatic person recognisers identify people in photos even when their faces are blurred? How easy is it to find an adversarial input for a self-driving car that makes it drive off the road?
This thesis contributes one of the first steps towards a better understanding of such concerns in the presence of data manipulation. From the point of view of user’s privacy, we show the inefficacy of common obfuscation methods like face blurring, and propose more advanced techniques based on head inpainting and adversarial examples. We discuss the duality of model security and user privacy problems and describe the implications of research in one area for the other. Finally, we study the knowledge aspect of the data manipulation problem: the more one knows about the target model, the more effective manipulations one can craft. We propose a game theoretic framework to systematically represent the partial knowledge on the target model and derive privacy and security guarantees. We also demonstrate that one can reveal architectural details and training hyperparameters of a model only by querying it, leading to even more effective data manipulations against it.


Quantifying and Mitigating Privacy Risks in Biomedical Data
(Advisor: Prof. Michael Backes)
Wednesday, 25.07.2018, 14:00 h, in building E9 1 (CISPA), room 001


The decreasing costs of molecular profiling have fueled the biomedical research community with a plethora of new types of biomedical data, allowing for a breakthrough towards a more precise and personalized medicine. However, the release of these intrinsically highly sensitive, interdependent data poses a new severe privacy threat.
In this thesis, we provide means to quantify and protect the privacy of individuals’ biomedical data. Besides the genome, we specifically focus on two of the most important epigenetic elements influencing human health: microRNAs and DNA methylation. We quantify the privacy for multiple realistic attack scenarios. Our results underline that the privacy risks inherent to biomedical data have to be taken seriously. Moreover, we present and evaluate solutions to preserve the privacy of individuals. Our mitigation techniques stretch from the differentially private release of epigenetic data, considering its utility, up to cryptographic constructions to securely, and privately evaluate a random forest on a patient’s data.

On Flows, Paths, Roots, and Zeros
(Advisor: Prof. Kurt Mehlhorn)
Monday, 23.07.2018, 14:00 h, in building E1 7, room 001


This thesis has two parts; in the first of which we give new results for various network flow problems. (1) We present a novel dual ascent algorithm for min-cost flow and show that an implementation of it is very efficient on certain instance classes. (2) We approach the problem of numerical stability of interior point network flow algorithms by giving a path following method that works with integer arithmetic solely and is thus guaranteed to be free of any nu-merical instabilities. (3) We present a gradient descent approach for the undirected transship-ment problem and its special case, the single source shortest path problem (SSSP). For distrib-uted computation models this yields the first SSSP-algorithm with near-optimal number of communication rounds.
The second part deals with fundamental topics from algebraic computation. (1) We give an algorithm for computing the complex roots of a complex polynomial. While achieving a com-parable bit complexity as previous best results, our algorithm is simple and promising to be of practical impact. It uses a test for counting the roots of a polynomial in a region that is based on Pellet’s theorem. (2) We extend this test to polynomial systems, i.e., we develop an algorithm that can certify the existence of a k-fold zero of a zero-dimensional polynomial system within a given region. For bivariate systems, we show experimentally that this approach yields signifi-cant improvements when used as inclusion predicate in an elimination method.

Incentives in Dynamic Markets
(Advisor: Prof. Martin Hoefer, now Uni Frankfurt)
Monday, 23.07.2018, 11:00 h, in building E1 7, room 001


In this thesis, we consider a variety of combinatorial optimization problems within a common theme of uncertainty and selfish behavior.
In our first scenario, the input is collected from selfish players. Here, we study extensions of the so-called smoothness framework for mechanisms, a very useful technique for bounding the inefficiency of equilibria, to the cases of varying mechanism availability and participation of risk-averse players. In both of these cases, our main results are general theorems for the class of (λ,μ)-smooth mechanisms. We show that these mechanisms guarantee at most a (small) constant factor performance loss in the extended settings.
In our second scenario, we do not have access to the exact numerical input. Within this context, we explore combinatorial extensions of the well-known secretary problem under the assumption that the incoming elements only reveal their ordinal position within the set of previously arrived elements. We first observe that many existing algorithms for special matroid structures maintain their competitive ratio in the ordinal model. In contrast, we provide a lower bound for algorithms that are oblivious to the matroid structure. Finally, we design new algorithms that obtain constant competitive ratios for a variety of combinatorial problems.


Cryptographic Techniques for Privacy and Access Control in Cloud-Based Applications
(Advisor: Prof. Matteo Maffei, now Uni Wien)
Friday, 29.06.2018, 15:30 h, in building E9 1 (CISPA), room 001


Digitization is one of the key challenges for today’s industries and society. It affects more and more business areas and also user data and, in particular, sensitive information. Due to its sensitivity, it is important to treat personal information as secure and private as possible yet enabling cloud-based software to use that information when requested by the user.
In this thesis, we focus on the privacy-preserving outsourcing and sharing of data, the querying of outsourced protected data, and the usage of personal information as an access control mechanism for rating platforms, which should be protected from coercion attacks. In those three categories, we present cryptographic techniques and protocols that push the state of the art. In particular, we first present multi-client oblivious RAM (ORAM), which augments standard ORAM with selective data sharing through access control, confidentiality, and integrity. Second, we investigate on recent work in frequency-hiding order-preserving encryption and show that the state of the art misses rigorous treatment, allowing for simple attacks against the security of the existing scheme. As a remedy, we show how to fix the security definition and that the existing scheme, slightly adapted, fulfills it. Finally, we design and develop a coercion-resistant rating platform. Coercion-resistance has been dealt with mainly in the context of electronic voting yet also affects other areas of digital life such as rating platforms.

Understanding Regulatory Mechanisms Underlying Stem Cells Helps to Identify Cancer Biomarks
(Advisor: Prof. Volkhard Helms)
Thursday, 28.06.2018, 16:30 h, in building E2 1, room 001


In this thesis, we present novel algorithms to unravel regulatory mechanisms underlying stem cell differentiation and cancerogenesis. Inspired by the tightly interwoven topology of the regulators controlling the pluripotency network in mouse embryonic stem cells, we formulated the problem where a set of master regulatory genes in a regulatory network is identified by solving either of two combinatorial optimization problems namely a minimum dominating set or a minimum connected dominating set in weakly and strongly connected components. Then we applied the developed methods to regulatory cancer networks to identify disease-associated genes and anti-cancer drug targets in breast cancer and hepatocellular carcinoma. As not all the nodes in the solutions are critical, we developed a prioritization method to rank a set of candidate genes which are related to a certain disease based on systematic analysis of the genes that are differentially expressed in tumor and normal conditions. Moreover, we demonstrated that the topological features in regulatory networks surrounding differentially expressed genes are highly consistent in terms of using the output of several analysis tools for identifying differentially expressed genes. We compared two randomization strategies for TF-miRNA co-regulatory networks to infer significant network motifs underlying cellular identity. We showed that the edge-type conserving method surpasses the non-conserving method in terms of biological relevance and centrality overlap. We presented several web servers and software packages that were made publicly available at no cost. The Cytoscape plugin of minimum connected dominating set identifies a set of key regulatory genes in a user-provided regulatory network based on a heuristic approach. The ILP formulations of minimum dominating set and minimum connected dominating set return the optimal solutions for the aforementioned problems. The web servers TFmiR and TFmiR2 construct disease-, tissue-, process-specific networks for the sets of deregulated genes and miRNAs provided by a user. They highlight topological hotspots and offer detection of three- and four-node FFL motifs as a separate web service for mouse and human.

Model Reconstruction for Moment-based Stochastic Chemical Kinetics
(Advisor: Prof. Verena Wolf)
Monday, 25.06.2018, 12:00 h, in building E1 7, room 0.01


Based on the theory of stochastic chemical kinetics, the inherent randomness and stochasticity of biochemical reaction networks can be accurately described by discrete-state continuous-time Markov chains, where each chemical reaction corresponds to a state transition of the process. However, the analysis of such processes is computationally expensive and sophisticated numerical methods are required. The main complication comes due to the largeness problem of the state space, so that analysis techniques based on an exploration of the state space are often not feasible and the integration of the moments of the underlying probability distribution has become a very popular alternative. In this thesis we propose an analysis framework in which we integrate a number of moments of the process instead of the state probabilities. This results in a more time-efficient simulation of the time evolution of the process. In order to regain the state probabilities from the moment representation, we combine the moment-based simulation (MM) with a maximum entropy approach: the maximum entropy principle is applied to derive a distribution that fits best to a given sequence of moments. We further extend this approach by incorporating the conditional moments (MCM) which allows not only to reconstruct the distribution of the species present in high amount in the system, but also to approximate the probabilities of species with low molecular counts. For the given distribution reconstruction framework, we investigate the numerical accuracy and stability using case studies from systems biology, compare two different moment approximation methods (MM and MCM), examine if it can be used for the reaction rates estimation problem and describe the possible future applications.

Practical Dynamic Information Flow Control
(Advisor: Prof. Christian Hammer, now Uni Potsdam)
Friday, 22.06.2018, 16:30 h, in building E2 1, room 007


Computer systems and applications handle a plethora of private user information. These are often assisted by a set of bugs with both malicious and non-malicious intent leading to information leaks. Information flow control has been studied extensively as an approach to mitigate such leaks but has focussed mainly on static analyses. However, some of the applications are developed using dynamic languages like JavaScript that make the static analyses techniques ineffective. Although there has been a growing interest to develop dynamic analysis techniques, it still isn’t at the helm of information flow security in such settings; the prime reason being that these techniques either over-approximate or are too restrictive in most cases. This thesis focuses on improving the usability of dynamic information flow analyses. It begins by presenting a sound improvement and enhancement of the permissive-upgrade strategy. It, then, presents a sound and precise dynamic information flow analysis for handling complex features like exceptions in higher-order languages. To address the issue of permissiveness, this thesis develops a sound approach to bound leaks dynamically while allowing information release in accordance to a developer-specified budget. It then applies these techniques to a Web browser and explores a policy specification mechanism for Web applications.

Shilpa GARG
Computational Haplotyping: theory and practice
(Advisor: Prof. Tobias Marschall)
Thursday, 21.06.2018, 16:30 h, in building E2 1, room 007


Genomics has paved a new way to comprehend life and its evolution, and also to investigate causes of diseases and their treatment. One of the important problems in genomic analyses is haplotype assembly. Constructing complete and accurate haplotypes plays an essential role in understanding population genetics and how species evolve.In this thesis, we focus on computational approaches to haplotype assembly from third generation sequencing technologies. This involves huge amounts of sequencing data, and such data contain errors due to the single molecule sequencing protocols employed. Taking advantage of combinatorial formulations helps  to correct for these errors to solve the haplotyping problem. Various computational techniques such as dynamic programming, parameterized algorithms, and graph algorithms are used to solve this problem.
This thesis presents several contributions concerning the area of haplotyping. First, a novel algorithm based on dynamic programming is proposed to provide approximation guarantees for phasing a single individual. Second, an integrative approach is introduced to combining multiple sequencing datasets to generating complete and accurate haplotypes. The effectiveness of this integrative approach is demonstrated on a real human genome. Third, we provide a novel efficient approach to phasing pedigrees and demonstrate its advantages in comparison to phasing a single individual.Fourth, we present a generalized graph-based framework for performing haplotype-aware de novo assembly. Specifically, this generalized framework consists of a hybrid pipeline for generating accurate and complete haplotypes from data stemming from multiple sequencing technologies, one that provides accurate reads and other that provides long reads.

A tale of two packing problems: Improved algorithms and tighter bounds for online bin packing and the geometric knapsack problem
(Advisor: Prof. Rob van Stee, now Uni Siegen)
Tuesday, 12.06.2018, 16:15 h, in building E1 4, room 024


In this thesis, we deal with two packing problems: the online bin packing and the geometric knapsack problem. In online bin packing, the aim is to pack a given number of items of different size into a minimal number of containers. The items need to be packed one by one without knowing future items. For online bin packing in one dimension, we present a new family of algorithms that constitutes the first improvement over the previously best algorithm in almost 15 years. While the algorithmic ideas are intuitive, an elaborate analysis is required to prove its competitive ratio. We also give a lower bound for the competitive ratio of this family of algorithms. For online bin packing in higher dimensions, we discuss lower bounds for the competitive ratio and show that the ideas from the one-dimensional case cannot be easily transferred to obtain better two-dimensional algorithms. In the geometric knapsack problem, one aims to pack a maximum weight subset of given rectangles into one square container. For this problem, we consider offline approximation algorithms. For geometric knapsack with square items, we improve the running time of the best known PTAS and obtain an EPTAS. This shows that large running times caused by some standard techniques for geometric packing problems are not always necessary and can be improved. Finally, we show how to use resource augmentation to compute optimal solutions in EPTAS-time, thereby improving upon the known PTAS for this case.

Georg NEIS
Compositional Compiler Correctness Via Parametric Simulations
(Advisor: Prof. Derek Dreyer)
Thursday, 07.06.2018, 14:00 h, in building E1 5, room 0.29


Compiler verification is essential for the construction of fully verified software, but most prior work (such as CompCert) has focused on verifying whole-program compilers. To support separate compilation and to enable linking of results from different verified compilers, it is important to develop a compositional notion of compiler correctness that is modular (preserved under linking), transitive (supports multi-pass compilation), and flexible (applicable to compilers that use different intermediate languages or employ non-standard program transformations). In this thesis, we formalize such a notion of correctness based on parametric simulations, thus developing a novel approach to compositional compiler verification.


Andreas TEUCKE
An Approximation and Refinement Approach to First-Order Automated Reasoning
(Advisor: Prof. Christoph Weidenbach)
Wednesday, 16.05.2018, 14:00 h, in building E1 5, room 0.02


With the goal of lifting model-based guidance from the propositional setting to first-order logic, I have developed an approximation theorem proving approach based on counterexample-guided abstraction refinement. A given clause set is transformed into a simplified form where satisfiability is decidable.This approximation extends the signature and preserves unsatisfiability:if the simplified clause set is satisfiable, so is the original clause set. A resolution refutation generated by a decision procedure on the simplified clause set can then either be lifted to a refutation in the original clause set, or it guides a refinement excluding the previously found unliftable refutation. This way the approach is refutationally complete. I have implemented my approach based on the SPASS theorem prover. On certain satisfiable problems, the implementation shows the ability to beat established provers such as SPASS, iProver, and Vampire.

Georg TAMM
Architectures for Ubiquitous 3D on Heterogeneous Computing Platforms
(Advisor: Prof. Philipp Slusallek)
Wednesday, 09.05.2018, 14:00 h, in building D3 4,  VisCenter DFKI


Today, a wide scope for 3D graphics applications exists, including domains such as scientific visualization, 3D-enabled web pages, and entertainment. At the same time, the devices and platforms that run and display the applications are more heterogeneous than ever. Display environments range from mobile devices to desktop systems and ultimately to distributed displays. While the capability of the client devices may vary considerably, the visualization experiences running on them should be consistent. The field of application should dictate how and on what devices users access the application, not the technical requirements to realize the 3D output.
The goal of this thesis is to examine the diverse challenges involved in providing consistent and scalable visualization experiences to heterogeneous computing platforms and display setups. We developed a comprehensive set of rendering architectures in the major domains of scientific and medical visualization, web-based 3D applications, and movie virtual production. To provide the required service quality, performance, and scalability for different client devices and displays, our architectures focus on the efficient utilization and combination of the available client, server, and network resources. We present innovative solutions that incorporate methods for hybrid and distributed rendering as well as means to stream rendering results. We establish the browser as a promising platform for accessible and portable visualization services. We collaborated with experts from the medical field and the movie industry to evaluate the usability of our technology in real-world scenarios.

Integrated Timing Verification for Distributed Embedded Real-Time Systems
(Advisor: Prof. Reinhard Wilhelm)
Tuesday, 08.05.2018, 14:00 h, in building E1 1,  room 407


The verification of timing properties with commercial tools based on static analysis has been available for safety-critical automotive systems for more than a decade. Although they offer many other advantagesin addition to safety guarantees, these tools are rarely applied in the automotive industry. This thesis analyses different possible reasons for this lack of application and suggests solutions where possible. Amongst the results of this thesis are technical findings, like the further increase of automation (e.g. for loop-bound detection) or the automatic reduction of WCET overestimation for highly variant systems. With regard to the development process, technical as well as non-technical solutions are presented that target the actual users of the timing verification, which range from software developers to system architects. The central result of this thesis is the integration of existing and newly developed timing verification methods and tools into a holistic development process approach called Timing Oriented Application Development (TOAD).


Wenbin LI
From Perception over Anticipation to Manipulation
(Advisor: Dr. Mario Fritz)
Wednesday, 25.04.2018, 16:00 h, in building E1 4,  room 019


From autonomous driving cars to surgical robots, robotic system has enjoyed significant growth over the past decade. With the rapid development in robotics alongside the evolution in the related fields, such as computer vision and machine learning, integrating perception, anticipation and manipulation is key to the success of future robotic system. In this talk, we summarize our efforts of exploring different ways of such integration to extend the capabilities of a robotic system to take on more challenging real world tasks. On anticipation and perception, we address the recognition of ongoing activity from videos. In particular we focus on long-duration and complex activities and hence propose a new challenging dataset to facilitate the work. On manipulation with perception, we propose an efficient framework for programming a robot to use human tools and evaluate it on a Baxter research robot. Finally, combining perception, anticipation and manipulation, we focus on a block stacking task. We first explore how to guide robot to place a single block into the scene without collapsing the existing structure and later introduce the target stacking task where the agent stacks blocks to reproduce a tower shown in an image. We validate our model on both synthetic and real-world settings.

Towards the understanding of transcriptional and translational regulatory complexity
(Advisor: Prof. Dr. Volkhard Helms)
Wednesday, 11.04.2018, 10:00 h, in building E2 1,  room 001


Considering the same genome within every cell, the observed phenotypic diversity can only arise from highly regulated mechanisms beyond the encoded DNA sequence. We investigated several mechanisms of protein biosynthesis and analyzed DNA methylation patterns, alternative translation sites, and genomic mutations. As chromatin states are determined by epigenetic modifications and nucleosome occupancy,we conducted a structural superimposition approach between DNA methyltransferase 1 (DNMT1) and the nucleosome, which suggests that DNA methylation is dependent on accessibility of DNMT1 to nucleosome–bound DNA. Considering translation, alternative non–AUG translation initiation was observed. We developed reliable prediction models to detect these alternative start sites in a given mRNA sequence. Our tool PreTIS provides initiation confidences for all frame–independent non–cognate and AUG starts. Despite these innate factors, specific sequence variations can additionally affect a phenotype. We conduced a genome–wide analysis with millions of mutations and found an accumulation of SNPs next to transcription starts that could relate to a gene–specific regulatory signal. We also report similar conservation of canonical and alternative translation sites, highlighting the relevance of alternative mechanisms. Finally, our tool MutaNET automates variation analysis by scoring the impact of individual mutations on cell function while also integrating a gene regulatory network.


Biomedical Knowledge Base Construction form Text and its Application in Knowledge-based Systems
(Advisor: Prof. Dr. Klaus Berberich, now HTW Saar)
Monday, 12.03.2018, 10:00 h, in building E1 4,  room 024


Today in this Big Data era, overwhelming amounts of textual information across different sources with a high degree of redundancy has made it hard for a consumer to retrospect on past events. A plausible solution is to link semantically similar information contained across the different sources to enforce a structure thereby providing multiple access paths to relevant information. Keeping this larger goal in view, this work uses Wikipedia and online news articles as two prominent yet disparate information sources to address the following three problems:
• We address a linking problem to  connect Wikipedia excerpts to news articles by casting it into an IR task. Our novel approach integrates time, geolocations, and entities with text to identify relevant documents thatcan be linked to a given excerpt.
• We address an unsupervised extractive multi-document summarization task to generate a fixed-length event digest that facilitates efficient consumption of information contained within a large set of documents. Our novel approach proposes an ILP for global inference across text, time, geolocations, and entities associated with the event.
• To estimate temporal focus of short event descriptions, we present a semi-supervised approach that leverages redundancy within a longitudinal news collection to estimate accurate probabilistic time models.
Extensive experimental evaluations demonstrate the effectiveness and viability of our proposed approaches towards achieving the larger goal.

Patrick ERNST
Biomedical Knowledge Base Construction from Text and its Application in Knowledge-based Systems
(Advisor: Prof. Dr. Gerhard Weikum)
Tuesday, 06.03.2018, 10:00 h, in building E1 4,  room 024


While general-purpose Knowledge Bases (KBs) have gone a long way in compiling comprehensive knowledge about people, events, places, etc., domain-specific KBs, such as on health, are equally important, but are less explored. Consequently, a comprehensive and expressive health KB that spans all aspects of biomedical knowledge is still missing. The main goal of this thesis is to develop principled methods for building such a KB and enabling knowledge-centric applications. We address several challenges and make the following contributions:
– To construct a health KB, we devise a largely automated and scalable pattern-based knowledge extraction method covering a spectrum of different text genres and distilling a wide variety of facts from different biomedical areas.
– To consider higher-arity relations, crucial for proper knowledge representation in advanced domain such as health, we generalize the fact-pattern duality paradigm of previous methods. A key novelty is the integration of facts with missing arguments by extending our framework to partial patterns and facts by reasoning overthe composability of partial facts.
– To demonstrate the benefits of a health KB, we devise systems for entity-aware search and analytics andfor entity-relationship-oriented exploration.
Extensive experiments and use-case studies demonstrate the viability of the proposed approaches.


Roland LEIßA
Language Support for Programming High-Performance Code
(Advisor: Prof. Dr. Sebastian Hack)
Thursday, 22.02.2018, 14:00 h, in building E1 7,  room 001


Nowadays, the computing landscape is becoming increasingly heterogeneous and this trend is currently showingno signs of turning around. In particular, hardware becomes more and more specialized and exhibits different formsof parallelism. For performance-critical codes it is indispensable to address hardware-specific peculiarities. Because of the halting problem, however, it is unrealistic to assume that a program implemented in a general-purpose programming language can be fully automatically compiled to such specialized hardware while still delivering peak performance. One form of parallelism is single instruction, multiple data (SIMD).
In this talk, we first introduce Sierra: an extension for C++ that facilitates portable and effective SIMD programming. Second, we present AnyDSL. This framework allows to embed a so-called domain-specific language (DSL) into a host language. On the one hand, a DSL offers the application developer a convenient interface; on the other hand, a DSL can perform domain-specific optimizations and effectively map DSL constructs to various architectures. In order to implement a DSL, one usually has to write or modify a compiler. With AnyDSL though, the DSL constructs are directly implemented in the host language while a partial evaluator removes any abstractions that are required in the implementation of the DSL.

Gaze and Peripheral Vision Analysis for Human-Environment Interaction: Applications in Automotive and Mixed-Reality Scenarios
(Advisor: Prof. Dr. Wolfgang Wahlster)
Wednesday, 14.02.2018, 16:00 h, in building D3 2, Reuse seminar room -2.17


This thesis studies eye-based user interfaces which integrate information about the user’s perceptual focus-of-attention into multimodal systems to enrich the interaction with the surrounding environment. We examine two new modalities:gaze input and output in the peripheral field of view. All modalities are considered in the whole spectrum of the mixed-reality continuum. We show the added value of these new forms of multimodal interaction in two important application domains: Automotive User Interfaces and Human-Robot Collaboration. We present experiments that analyze gaze under various conditions and help to design a 3D model for peripheral vision. Furthermore, this work presents several new algorithms for eye-based interaction, like deictic reference in mobile scenarios, for non-intrusive user identification, or exploiting the peripheral field view for advanced multimodal presentations. These algorithms have been integrated into a number of software tools for eye-based interaction, which are used to implement 15 use cases for intelligent environment applications. These use cases cover a wide spectrum of applications, from spatial interactions with a rapidly changing environment from within a moving vehicle, to mixed-reality interaction between teams of human and robots.

Justifying The Strong Memory Semantics of Concurrent High-Level Programming Languages for System Programming
(Advisor: Prof. Dr. Wolfgang Paul)
Wednesday, 07.02.2018, 15:00 h, in E1 7, room 001


We justify the strong memory semantics of high-level concurrent languages, such as C11 or Java, when compiled to x86-like machines. Languages such as C11 guarantee that programs that obey a specific software discipline behave as if they were executed on a simple coherent shared memory multi-processor. Real x86 machines, on the other hand, do not provide a coherent shared memory, and compilers add slow synchronizing instructions to provide the illusion of coherency. We show that one software discipline that closely matches software disciplines of languages such as C11 and Java – in a nutshell, that racing memory accesses are annotated as such by the programmer – and one particular way to add relatively few synchronization instructions – in a nutshell, between an annotated store and an annotated load in the same thread – suffice to create this illusion.
The software discipline has to be obeyed by the program in the semantics of the high-level language, therefore allowing the verification effort to take place completely in the strong memory semantics of the high-level language. We treat a machine model with operating system support, and accordingly our theorems can be used to verify interruptible multi-core operating systems, including those where users can run unverified programs that do not obey the software discipline.

Variational Image Fusion
(Advisor: Prof. Dr. Joachim Weickert)
Monday, 05.02.2018, 10:15 h, in E1 1, room 407


The main goal of this work is the fusion of multiple images to a single composite that offers more information than the individual input images. We approach those fusion tasks within a variational framework.
First, we present iterative schemes that are well-suited for such variational problems and related tasks. They lead to efficient algorithms that are simple to implement and well-parallelisable. Next, we design a general fusion technique that aims for an image with optimal local contrast. This is the key for a versatile method that performs well in many application areas such as multispectral imaging, decolourisation, and exposure fusion. To handle motion within an exposure set, we present the following two-step approach: First, we introduce the complete rank transform to design an optic flow approach that is robust against severe illumination changes.
Second, we eliminate remaining misalignments by means of brightness transfer functions that relate the brightness values between frames. Additional knowledge about the exposure set enables us to propose the first fully coupled method that jointly computes an aligned high dynamic range image and dense displacement fields.
Finally, we present a technique that infers depth information from differently focused images. In this context, we additionally introduce a novel second order regulariser that adapts to the image structure in an anisotropic way.


Numerical Analysis of Stochastic Biochemical Reaction Networks
(Advisor: Prof. Dr. Verena Wolf)
Wednesday, 31.01.2018, 14:00 h, in E1 7, room 001


Numerical solution of the chemical master equation for stochastic reaction networks typically suffers from the state space explosion problem due to the curse of dimensionality and from stiffness due to multiple time scales.The dimension of the state space equals the number of molecular species involved in the reaction network and the size of the system of differential equations equals the number of states in the corresponding continuous-time Markov chain, which is usually enormously huge and often even infinite. Thus, efficient numerical solution approaches must be able to handle huge, possibly infinite and stiff systems of differential equations efficiently.
In this thesis, we present efficient techniques for the numerical analysis of the biochemical reaction networks. We present an approximate numerical integration approach that combines a dynamical state space truncation procedure with efficient numerical integration schemes for systems of ordinary differential equations including adaptive step size selection based on local error estimates. We combine our dynamical state space truncation with the method of conditional moments, and present the implementation details and numerical results. We also incorporate ideas from importance sampling simulations into a non-simulative numerical method that approximates transient rare event probabilities based on a dynamical truncation of the state space.
Finally, we present a maximum likelihood method for the estimation of the model parameters given noisy time series measurements of molecular counts. All approaches presented in this thesis are implemented as part of the tool STAR, which allows to model and simulate the biochemical reaction networks. The efficiency and accuracy is demonstrated by numerical examples.

Relational Cost Analysis
(Advisor: Dr. Deepak Garg)
Monday, 22.01.2018, 16:00 h, in E1 5, room 029


Programming languages research has made great progress towards statically estimating the execution cost of a program. However, when one is interested in how the execution costs of two programs compare to each other (i.e., relational cost analysis), the use of unary techniques does not work well in many cases. In order to support a relational cost analysis, we must ultimately support reasoning about not only the executions of a single program, but also the executions of two programs, taking into account their similarities. This dissertation makes several contributions to the understanding and development of such a relational cost analysis. It shows how:
• Refinement types and effect systems can express functional and relational quantitative properties of pairs of programs, including the difference in execution costs.
• Relational cost analysis can be adapted to reason about dynamic stability, a measure of the update times of incremental programs as their inputs change.
• A sound and complete bidirectional type system can be developed (and implemented) for relational cost analysis.