PhD Thesis Defenses 2019
PhD thesis defenses are a public affair and open to anyone who is interested. Attending them is a great way to get to know the work being done by your peers in the various research groups. On this page you will find a list of upcoming and past defense talks.
Electronic access to most of the doctoral dissertations from Saarbrücken Computer Science, going back to about 1990, is available online.
Perception-driven Rendering: Techniques for the Efficient Visualization of 3D Scenes including View- and Gaze-contingent Approaches
(Advisor: Prof. Philipp Slusallek)
Thursday, 19.12.2019, 15:00 h, in building D3 4 (DFKI), VisCenter room
Modern image synthesis approaches can create digital images of astonishing complexity and beauty, yet processing resources remain a limiting factor. At the same time, advances in display technologies drive the development of novel display devices, thereby further tightening the requirements on rendering approaches. Rendering efficiency is a central challenge involving a trade-off between visual fidelity and interactivity. However, the human visual system has limitations, and the highest possible visual quality is not always necessary. Knowledge of those limitations can be used to develop better and more efficient rendering systems, a field known as perception-driven rendering. The central question in this thesis is how to exploit the limitations, or use the potential, of perception to enhance the quality of different rendering techniques while maintaining their performance, and vice versa. To this end, this thesis presents state-of-the-art research and models that exploit the limitations of perception in order to increase visual quality and to reduce workload alike. This research results in several practical rendering approaches that tackle some of the fundamental challenges of computer graphics. Here, different sampling, filtering, and reconstruction techniques aid the visual quality of the synthesized images. An in-depth evaluation of the presented systems, including benchmarks, comparative examination with image metrics, as well as user studies and experiments, demonstrates that the methods introduced are visually superior to, or on the same qualitative level as, ground truth, while having a significantly reduced run time.
Learning-based Face Reconstruction and Editing
(Advisor: Prof. Christian Theobalt)
Thursday, 12.12.2019, 16:15 h, in building E1 4, room 0.19
Photo-realistic face editing – an important basis for a wide range of applications in movie and game productions, and applications for mobile devices – is based on computationally expensive algorithms that often require many tedious time-consuming manual steps. This thesis advances state-of-the-art face performance capture and editing pipelines by proposing machine learning-based algorithms for high-quality inverse face rendering in real time and highly realistic neural face rendering, and a video-based refocusing method for faces and general videos. In particular, the proposed contributions address fundamental open challenges towards real-time and highly realistic face editing. The first contribution addresses face reconstruction and introduces a deep convolutional inverse rendering framework that jointly estimates all facial rendering parameters from a single image in real time. The proposed method is based on a novel boosting process that iteratively updates the synthetic training data to better reflect the distribution of real-world images. Second, the thesis introduces a method for face video editing at previously unseen quality. It is based on a generative neural network with a novel space-time architecture, which enables photo-realistic re-animation of portrait videos using an input video. It is the first method to transfer the full 3D head position, head rotation, face expression, eye gaze and eye blinking from a source actor to a portrait video of a target actor. Third, the thesis contributes a new refocusing approach for faces and general videos in postprocessing. The proposed algorithm is based on a new depth-from-defocus algorithm that computes space-time-coherent depth maps, deblurred all-in-focus video and the focus distance for each frame. 
The high-quality results obtained in various applications and challenging scenarios demonstrate the effectiveness of the contributions presented in this thesis, and also show the potential of machine learning-driven algorithms to solve various open problems in computer graphics.
Search and Analytics Using Semantic Annotations
(Advisor: Dr. Klaus Berberich)
Thursday, 12.12.2019, 14:00 h, in building E1 4, room 0.24
Search systems help users locate relevant information in the form of text documents for keyword queries. Using text alone, it is often difficult to satisfy the user’s information need. To discern the user’s intent behind queries, we turn to semantic annotations (e.g., named entities and temporal expressions) that natural language processing tools can now deliver with great accuracy.
This thesis develops methods and an infrastructure that leverage semantic annotations to efficiently and effectively search large document collections.
This thesis makes contributions in three areas: indexing, querying, and mining of semantically annotated document collections. First, we describe an indexing infrastructure for semantically annotated document collections. The indexing infrastructure can support knowledge-centric tasks such as information extraction, relationship extraction, question answering, fact spotting and semantic search at scale across millions of documents. Second, we propose methods for exploring large document collections by suggesting semantic aspects for queries. These semantic aspects are generated by considering annotations in the form of temporal expressions, geographic locations, and other named entities. The generated aspects help guide the user to relevant documents without the need to read their contents. Third and finally, we present methods that can generate events, structured tables, and insightful visualizations from semantically annotated document collections.
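As an illustrative, purely hypothetical sketch of what an annotation-aware index looks like, the following keeps posting lists keyed by both plain terms and semantic annotations, so that keyword and annotation constraints can be intersected without reading document contents. The class name and example data are invented for illustration and are not taken from the thesis infrastructure:

```python
from collections import defaultdict

# Hypothetical sketch: postings keyed by both plain terms and semantic
# annotations (entities, temporal expressions), so queries can combine
# keyword and annotation constraints directly on the index.
class AnnotationIndex:
    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, terms, annotations):
        for term in terms:
            self.postings[("term", term)].add(doc_id)
        for kind, value in annotations:          # e.g. ("entity", "Einstein")
            self.postings[(kind, value)].add(doc_id)

    def query(self, *keys):
        # intersect posting lists for all query constraints
        sets = [self.postings[k] for k in keys]
        return set.intersection(*sets) if sets else set()

idx = AnnotationIndex()
idx.add(1, ["physicist", "relativity"], [("entity", "Einstein"), ("time", "1905")])
idx.add(2, ["physicist"], [("entity", "Bohr")])
print(idx.query(("term", "physicist"), ("time", "1905")))  # -> {1}
```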
Decision Procedures for Linear Arithmetic
(Advisors: Prof. Christoph Weidenbach and Dr. Thomas Sturm, now LORIA, Nancy)
Tuesday, 10.12.2019, 14:00 h, in building E1 5, room 002
In this thesis, we present new decision procedures for linear arithmetic in the context of SMT solvers and theorem provers:
1) CutSAT++, a calculus for linear integer arithmetic that combines techniques from SAT solving and quantifier elimination in order to be sound, terminating, and complete.
2) The largest cube test and the unit cube test, two sound (although incomplete) tests that find integer and mixed solutions in polynomial time. The tests are especially efficient on absolutely unbounded constraint systems, which are difficult to handle for many other decision procedures.
3) Techniques for the investigation of equalities implied by a constraint system. Moreover, we present several applications for these techniques.
4) The Double-Bounded reduction and the Mixed-Echelon-Hermite transformation, two transformations that reduce any constraint system in polynomial time to an equisatisfiable constraint system that is bounded. The transformations are beneficial because they turn branch-and-bound into a complete and efficient decision procedure for unbounded constraint systems.
We have implemented the above decision procedures (except for CutSAT++) as part of our linear arithmetic theory solver SPASS-IQ and as part of our CDCL(LA) solver SPASS-SATT. We also present various benchmark evaluations that confirm the practical efficiency of our new decision procedures.
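To make the unit cube test (item 2) concrete, here is a minimal sketch on a toy example of our own, not from the thesis: each half-space a·x ≤ b_i of a system Ax ≤ b is tightened by half the 1-norm of its row, and any rational solution of the tightened system rounds coordinate-wise to an integer solution of the original system.

```python
from fractions import Fraction

def tighten(A, b):
    # Unit cube test (sketch): shrink each half-space a.x <= b_i by ||a_i||_1 / 2.
    # A rational solution of the tightened system certifies that a unit cube
    # fits inside the original polyhedron, so rounding yields an integer solution.
    return [bi - Fraction(sum(abs(a) for a in row), 2) for row, bi in zip(A, b)]

def satisfies(A, b, x):
    return all(sum(a * xi for a, xi in zip(row, x)) <= bi for row, bi in zip(A, b))

# illustrative system: x + y <= 10, -x <= 0, -y <= 0
A = [[1, 1], [-1, 0], [0, -1]]
b = [Fraction(10), Fraction(0), Fraction(0)]
b_t = tighten(A, b)                          # x + y <= 9, x >= 1/2, y >= 1/2
witness = [Fraction(3, 2), Fraction(5, 2)]   # a rational point of the tightened system
assert satisfies(A, b_t, witness)
rounded = [int(v + Fraction(1, 2)) for v in witness]  # round to nearest integer
print(rounded)  # -> [2, 3], an integer solution of the original system
```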
Felipe Fernandes ALBRECHT
Analyzing Epigenomic Data in a Large-Scale Context
(Advisor: Prof. Thomas Lengauer)
Monday, 09.12.2019, 16:15 h, in building E1 4, room 0.23
While large amounts of epigenomic data are publicly available, their retrieval in a form suitable for downstream analysis is a bottleneck in current research. In a typical analysis, users are required to download huge files that span the entire genome, even if they are only interested in a small subset or an aggregation thereof. The DeepBlue Epigenomic Data Server mitigates this issue by providing a robust server that affords a powerful API for searching, filtering, transforming, aggregating, enriching, and downloading data from several epigenomic consortia. This work also presents companion tools that utilize the DeepBlue API to enable users not proficient in programming to analyze epigenomic data: (i) an R/Bioconductor package that integrates DeepBlue into the R analysis workflow; (ii) a web portal that enables users to search, select, filter, and download the epigenomic data available in the DeepBlue Server; (iii) DIVE, a web data analysis tool that allows researchers to perform large-scale epigenomic data analysis in a programming-free environment. Furthermore, these tools are integrated and capable of sharing results among themselves, creating a powerful large-scale epigenomic data analysis environment.
Model Counting for Reactive Systems
(Advisor: Prof. Bernd Finkbeiner)
Thursday, 05.12.2019, 15:00 h, in building E1 7, room 0.01
Model counting is the problem of computing the number of solutions for a logical formula. In the last few years, it has been primarily studied for propositional logic, and has been shown to be useful in many applications. In planning, for example, propositional model counting has been used to compute the robustness of a plan in an incomplete domain. In information-flow control, model counting has been applied to measure the amount of information leaked by a security-critical system.
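The propositional special case described above can be sketched by brute force; real #SAT solvers are far more sophisticated, and the example formula here is our own:

```python
from itertools import product

def count_models(clauses, n_vars):
    # Brute-force propositional model counting (#SAT): enumerate all 2^n
    # assignments and count those satisfying every clause. Clauses are lists
    # of 1-indexed literals; a negative literal means the variable is negated.
    count = 0
    for assignment in product([False, True], repeat=n_vars):
        if all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            count += 1
    return count

# (x1 or x2) and (not x1 or x3): how many of the 8 assignments satisfy it?
print(count_models([[1, 2], [-1, 3]], 3))  # -> 4
```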
In this thesis, we introduce the model counting problem for linear-time properties, and show its applications in formal verification. In the same way propositional model counting generalizes the satisfiability problem for propositional logic, counting models for linear-time properties generalizes the emptiness problem for languages over infinite words to one that asks for the number of words in a language. The model counting problem, thus, provides a foundation for quantitative extensions of model checking, where not only the existence of computations that violate the specification is determined, but also the number of such violations.
We solve the model counting problem for the prominent class of omega-regular properties. We present algorithms for solving the problem for different classes of properties, and show the advantages of our algorithms in comparison to indirect approaches based on encodings into propositional logic.
We further show how model counting can be used for solving a variety of quantitative problems in formal verification, including quantitative information-flow in security-critical systems, and the synthesis of approximate implementations for reactive systems.
Improved Methods and Analysis for Semantic Image Segmentation
(Advisor: Prof. Mario Fritz)
Tuesday, 03.12.2019, 15:00 h, in building E1 4, room 0.24
Modern deep learning has enabled remarkable advances in computer vision in recent years. As a fundamental task, semantic segmentation aims to predict class labels for each pixel of an image, which empowers machines to perceive the visual world. In spite of the recent success of fully convolutional networks, several challenges remain to be addressed. In this work, we focus on this topic, under different kinds of input formats and various types of scenes. Specifically, our study contains two aspects:
(1) Data-driven neural modules for improved performance.
(2) Leveraging datasets to train systems with higher performance and better data-privacy guarantees.
Digital Fabrication of Custom Interactive Objects with Rich Materials
(Advisor: Prof. Jürgen Steimle)
Tuesday, 03.12.2019, 14:00 h, in building E1 7, room 0.01
This thesis investigates digital fabrication as a key technology for prototyping physical user interfaces with custom interactivity. It contributes four novel design and fabrication approaches for interactive objects that leverage rich materials of everyday objects. The contributions enable easy, accessible, and versatile design and fabrication of interactive objects with custom stretchability, input and output on complex geometries and diverse materials, tactile output on 3D-object geometries, and capabilities of changing their shape and material properties.
Together, the contributions of this thesis advance the fields of digital fabrication, rapid prototyping, and ubiquitous computing towards the bigger goal of exploring interactive objects with rich materials as a new generation of physical interfaces.
Plane and Sample: Using Planar Subgraphs for Efficient Algorithms
(Advisor: Prof. Kurt Mehlhorn)
Monday, 02.12.2019, 11:00 h, in building E1 4, room 0.24
In my thesis, we showcase how planar subgraphs with special structural properties can be used to obtain efficient algorithms for two NP-hard problems in combinatorial optimization.
In this talk, I will describe how we developed algorithms for the computation of Tutte paths and show how these special subgraphs can be used to efficiently compute long cycles and other relaxations of Hamiltonicity if we restrict the input to planar graphs.
In addition, I will give an introduction to the Maximum Planar Subgraph problem (MPS) and show how dense planar subgraphs can be used to develop new approximation algorithms for this NP-hard problem. All new algorithms and arguments I present are based on a novel approach that focuses on maximizing the number of triangular faces in the computed subgraph.
Credibility Analysis of Textual Claims with Explainable Evidence
(Advisor: Prof. Gerhard Weikum)
Tuesday, 26.11.2019, 16:00 h, in building E1 5, room 0.29
Despite being a vast resource of valuable information, the Web has been polluted by the spread of false claims. Increasing hoaxes, fake news, and misleading information on the Web have given rise to many fact-checking websites that manually assess these doubtful claims. However, the speed and scale at which misinformation spreads have made manual verification a bottleneck. This calls for credibility assessment tools that can automate this verification process. Prior works in this domain make strong assumptions about the structure of the claims and the communities where they are made. Most importantly, the black-box techniques proposed in prior works lack the ability to explain why a certain statement is deemed credible or not.
To address these limitations, this dissertation proposes a general framework for automated credibility assessment that does not make any assumption about the structure or origin of the claims. Specifically, we propose a feature-based model, which automatically retrieves relevant articles about the given claim and assesses its credibility by capturing the mutual interaction between the language style of the relevant articles, their stance towards the claim, and the trustworthiness of the underlying web sources. We further enhance our credibility assessment approach and propose a neural-network-based model. Unlike the feature-based model, this model does not rely on feature engineering and external lexicons. Both our models make their assessments interpretable by extracting explainable evidence from judiciously selected web sources.
We utilize our models and develop a Web interface, CredEye, which enables users to automatically assess the credibility of a textual claim and dissect the assessment by browsing through judiciously and automatically selected evidence snippets. In addition, we study the problem of stance classification and propose a neural-network-based model for predicting the stance of diverse user perspectives regarding controversial claims. Given a controversial claim and a user comment, our stance classification model predicts whether the comment supports or opposes the claim.
Cryptography for bitcoin and friends
(Advisor: Prof. Aniket Kate, now Purdue)
Thursday, 21.11.2019, 16:00 h, in building E1 7, room 0.01
Numerous cryptographic extensions to Bitcoin have been proposed since Satoshi Nakamoto introduced the revolutionary design in 2008. However, only a few proposals have been adopted in Bitcoin and other prevalent cryptocurrencies, whose resistance to fundamental changes has proven to grow with their success. In this dissertation, we introduce four cryptographic techniques that advance the functionality and privacy provided by Bitcoin and similar cryptocurrencies without requiring fundamental changes in their design. First, we realize smart contracts that disincentivize parties in distributed systems from making contradicting statements by penalizing such behavior with the loss of funds in a cryptocurrency. Second, we propose CoinShuffle++, a coin mixing protocol that improves the anonymity of cryptocurrency users by combining their transactions, thereby making it harder for observers to trace those transactions. The core of CoinShuffle++ is DiceMix, a novel and efficient protocol for broadcasting messages anonymously without the help of any trusted third-party anonymity proxies and in the presence of malicious participants. Third, we combine coin mixing with the existing idea of hiding payment values in homomorphic commitments to obtain the ValueShuffle protocol, which enables us to overcome major obstacles to the practical deployment of coin mixing protocols. Fourth, we show how to prepare the aforementioned homomorphic commitments for a safe transition to post-quantum cryptography.
Assessing the Security of Hardware-Assisted Isolation Techniques
(Advisor: Prof. Christian Rossow)
Thursday, 21.11.2019, 11:00 h, in building E9 1, room 0.01
Modern computer systems execute a multitude of different processes on the same hardware. To guarantee that these processes do not interfere with one another, either accidentally or maliciously, modern processors implement various isolation techniques, such as process, privilege, and memory isolation. Consequently, these isolation techniques constitute a fundamental requirement for a secure computer system.
This dissertation investigates the security guarantees of various isolation techniques that are used by modern computer systems. To this end, we present the isolation challenges from three different angles. First, we target fine-grained memory isolation that is used by code-reuse mitigation techniques to thwart attackers with arbitrary memory read vulnerabilities. We demonstrate that dynamically generated code can be used to undermine such memory isolation techniques and, ultimately, propose a way to improve the mitigation. Second, we study side effects of legitimate hardware features, such as speculative execution, and show how they can be leveraged to leak sensitive data across different security contexts, thereby breaking process and memory isolation techniques. Finally, we demonstrate that in the presence of a faulty hardware implementation, all isolation guarantees can be broken. We do this by using a novel microarchitectural issue—discovered by us—to leak arbitrary memory contents across all security contexts.
Symbolic Reactive Synthesis
(Advisor: Prof. Bernd Finkbeiner)
Friday, 15.11.2019, 16:00 h, in building E1 7, room 0.01
In this thesis, we develop symbolic algorithms for the synthesis of reactive systems. Synthesis, that is the task of deriving correct-by-construction implementations from formal specifications, has the potential to eliminate the need for the manual—and error-prone—programming task. The synthesis problem can be formulated as an infinite two-player game, where the system player has the objective to satisfy the specification against all possible actions of the environment player. The standard synthesis algorithms represent the underlying synthesis game explicitly and, thus, they scale poorly with respect to the size of the specification.
We provide an algorithmic framework to solve the synthesis problem symbolically. In contrast to the standard approaches, we use a succinct representation of the synthesis game which leads to improved scalability in terms of the symbolically represented parameters. Our algorithm reduces the synthesis game to the satisfiability problem of quantified Boolean formulas (QBF) and dependency quantified Boolean formulas (DQBF). In the encodings, we use propositional quantification to succinctly represent different parts of the implementation, such as the state space and the transition function.
We develop highly optimized satisfiability algorithms for QBF and DQBF. Based on a counterexample-guided abstraction refinement (CEGAR) loop, our algorithms avoid an exponential blow-up by using the structure of the underlying symbolic encodings. Further, we extend the solving algorithms to extract certificates in the form of Boolean functions, from which we construct implementations for the synthesis problem. Our empirical evaluation shows that our symbolic approach significantly outperforms previous explicit synthesis algorithms with respect to scalability and solution quality.
Unsupervised multiple kernel learning approaches for integrating molecular cancer patient data
(Advisor: Prof. Nico Pfeifer, now Tübingen)
Monday, 11.11.2019, 14:00 h, in building E1 7, room 0.01
Cancer is the second leading cause of death worldwide. A characteristic of this disease is its complexity leading to a wide variety of genetic and molecular aberrations in the tumors. This heterogeneity necessitates personalized therapies for the patients. However, currently defined cancer subtypes used in clinical practice for treatment decision-making are based on relatively few selected markers and thus provide only a coarse classification of tumors. The increased availability in multi-omics data measured for cancer patients now offers the possibility of defining more informed cancer subtypes. Such a more fine-grained characterization of cancer subtypes harbors the potential of substantially expanding treatment options in personalized cancer therapy.
In this work, we identify comprehensive cancer subtypes using multidimensional data. For this purpose, we apply and extend unsupervised multiple kernel learning methods. Three challenges of unsupervised multiple kernel learning are addressed: robustness, applicability, and interpretability. First, we show that regularization of the multiple kernel graph embedding framework, which enables the implementation of dimensionality reduction techniques, can increase the stability of the resulting patient subgroups. This improvement is especially beneficial for data sets with a small number of samples. Second, we adapt the objective function of kernel principal component analysis to enable the application of multiple kernel learning in combination with this widely used dimensionality reduction technique. Third, we improve the interpretability of kernel learning procedures by performing feature clustering prior to integrating the data via multiple kernel learning. On the basis of these clusters, we derive a score indicating the impact of a feature cluster on a patient cluster, thereby facilitating further analysis of the cluster-specific biological properties. All three procedures are successfully tested on real-world cancer data.
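As a loose, hypothetical sketch of the setting (with fixed rather than learned kernel weights, and random data standing in for omics measurements), multiple kernels computed on different patient views can be combined into one matrix and embedded via kernel PCA:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # pairwise squared distances -> Gaussian (RBF) kernel matrix
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_pca(K, n_components=2):
    # center the kernel matrix, then project onto its top eigenvectors
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

rng = np.random.default_rng(0)
view1 = rng.normal(size=(6, 10))   # stand-in for, e.g., gene expression per patient
view2 = rng.normal(size=(6, 5))    # stand-in for, e.g., methylation per patient
weights = [0.5, 0.5]               # fixed here; multiple kernel learning learns these
K = weights[0] * rbf_kernel(view1) + weights[1] * rbf_kernel(view2)
embedding = kernel_pca(K, n_components=2)   # 6 patients -> 2D coordinates
```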
On Approximate Polynomial Identity Testing and Real Root Finding
(Advisor: Prof. Markus Bläser)
Monday, 11.11.2019, 10:00 h, in building E1 7, room 0.01
In this thesis we study the following three themes, which share a connection through the (arithmetic) circuit complexity of polynomials.
1. Rank of symbolic matrices: While there exist simple randomized polynomial-time algorithms for computing the commutative rank of symbolic matrices, finding deterministic polynomial-time algorithms remains elusive. We first demonstrate a deterministic polynomial-time approximation scheme (PTAS) for computing the commutative rank. Prior to this work, deterministic polynomial-time algorithms were known only for computing a 1/2-approximation of the commutative rank. We give two distinct proofs that our algorithm is a PTAS. We also give a min-max characterization of the commutative and non-commutative ranks of symbolic matrices.
2. Computation of real roots of real sparse polynomials: It is known that solving a system of polynomial equations reduces to solving a univariate polynomial equation. We describe a polynomial-time algorithm for sparse polynomials, which computes approximations of all the real roots (though it may also compute approximations of some complex roots). Moreover, we show that the roots of integer trinomials are well separated.
3. Complexity of symmetric polynomials: It is known that symmetric Boolean functions are easy to compute. In contrast, we show that the assumption of inequality of VP and VNP implies that there exist hard symmetric polynomials. To prove this result, we use an algebraic analogue of the classical Newton’s iteration.
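As a small illustration of theme 2, the real roots of an integer trinomial can be approximated by plain bisection once they are isolated; the thesis's algorithms are considerably more refined and exploit sparsity, so this is only a sketch with a hand-picked example:

```python
def bisect_root(f, lo, hi, tol=1e-12):
    # Classic bisection: assumes f(lo) and f(hi) have opposite signs and
    # halves the bracketing interval until it is shorter than tol.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(lo) < 0) == (f(mid) < 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# trinomial x^3 - 2x + 1 = (x - 1)(x^2 + x - 1): three real roots,
# bracketed here by hand; isolating the brackets is the hard part in general
f = lambda x: x**3 - 2 * x + 1
roots = [bisect_root(f, -2, -1), bisect_root(f, 0, 0.9), bisect_root(f, 0.9, 2)]
```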
Mobile Eye Tracking for Everyone
(Advisor: Prof. Andreas Bulling, now Stuttgart)
Friday, 08.11.2019, 12:00 h, in building E1 4, room 0.24
Mobile eye tracking has significant potential to overcome fundamental limitations of remote systems in terms of mobility and practical usefulness. However, mobile eye tracking currently faces two fundamental challenges that prevent it from being practically usable and that, consequently, have to be addressed before mobile eye tracking can truly be used by everyone: mobile eye tracking needs to be advanced and made fully functional in unconstrained environments, and it needs to be made socially acceptable. Nevertheless, solving technical challenges and social obstacles alone is not sufficient to make mobile eye tracking attractive for the masses. The key to success is the development of convincingly useful, innovative, and essential applications. This thesis provides solutions for all of these challenges and paves the way for the development of socially acceptable, privacy-aware, but highly functional mobile eye tracking devices and novel applications, so that mobile eye tracking can develop its full potential and become an everyday technology for everyone.
Optimization Landscape of Deep Neural Networks
(Advisor: Prof. Matthias Hein, now Tübingen)
Friday, 18.10.2019, 13:00 h, in building E1 7, room 0.01
Deep learning has drawn increasing attention in artificial intelligence due to its superior empirical performance in various fields of applications, including computer vision, natural language processing and speech recognition. This raises important and fundamental questions on why these methods work so well in practice.
The goal of this thesis is to improve our theoretical understanding of the optimization landscape of deep neural networks. We show that under certain over-parameterization conditions, in particular when the width of one of the hidden layers exceeds the number of training samples, every stationary point of the loss with some full-rank structure is a global minimum with zero training error. Moreover, every sub-level set of the loss is connected and unbounded. This suggests that by increasing over-parameterization, the loss function of deep neural networks can become much more favorable to local search algorithms than other "wildly" non-convex functions. In an attempt to bring our results closer to practice, we introduce a new class of network architectures that provably have a nice optimization surface while empirically achieving good generalization performance on standard benchmark datasets. Lastly, we complement our results by identifying an intuitive function class that neural networks of arbitrary depth might not be able to learn if their hidden layers are not wide enough.
Non-Reformist Reform for Haskell Modularity
(Advisor: Prof. Derek Dreyer)
Tuesday, 15.10.2019, 15:00 h, in building E1 5, room 0.29
In this thesis, I present Backpack, a new language for building separately typecheckable packages on top of a weak module system like Haskell's. The design of Backpack is the first to bring the rich world of type systems to the practical world of packages via mixin modules. It is inspired by the MixML module calculus of Rossberg and Dreyer, but, by choosing practicality over expressivity, Backpack both simplifies that semantics and supports a flexible notion of applicative instantiation. Moreover, this design is motivated less by foundational concerns and more by the practical concern of integration into Haskell. The result is a new approach to writing modular software at the scale of packages.
Understanding Quantities in Web Tables and Text
(Advisor: Prof. Gerhard Weikum)
Tuesday, 08.10.2019, 15:00 h, in building E1 4, room 0.24
There is a wealth of schema-free tables on the web. The text accompanying these tables explains and qualifies the numerical quantities given in the tables. Despite this ubiquity of tabular data, there is little research that harnesses this wealth of data by semantically understanding the information that is conveyed rather ambiguously in these tables. This information can be disambiguated only by the help of the accompanying text.
In the process of understanding quantity mentions in tables and text, we face the following challenges. First, there is no comprehensive knowledge base for anchoring quantity mentions. Second, tables are created ad hoc, without a standard schema and with ambiguous header names; table cells also often contain abbreviations. Third, quantities can be written in multiple forms and units of measure; for example, "48 km/h" is equivalent to "30 mph". Fourth, the text usually refers to the quantities in tables using aggregation, approximation, and different scales.
In this thesis, we target these challenges through the following contributions:
We present the Quantity Knowledge Base (QKB), a knowledge base for representing Quantity mentions. We construct the QKB by importing information from Freebase, Wikipedia, and other online sources.
We propose Equity: a system for automatically canonicalizing header names and cell values onto concepts, classes, entities, and uniquely represented quantities registered in a knowledge base.
We devise a probabilistic graphical model that captures coherence dependencies between cells in tables and candidate items in the space of concepts, entities, and quantities. Then, we cast the inference problem into an efficient algorithm based on random walks over weighted graphs.
We introduce the quantity alignment problem: computing bidirectional links between textual mentions of quantities and the corresponding table cells. We propose BriQ: a system for computing such alignments. BriQ copes with the specific challenges of approximate quantities, aggregated quantities, and calculated quantities.
We design ExQuisiTe: a web application that identifies mentions of quantities in text and tables, aligns quantity mentions in the text with related quantity mentions in tables, and generates salient suggestions for extractive text summarization systems.
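The unit-of-measure challenge above ("48 km/h" vs. "30 mph") can be sketched as normalization to a canonical base unit followed by a tolerance-based comparison; the conversion table and tolerance below are illustrative choices, not the thesis's actual method:

```python
# Hypothetical sketch of quantity normalization to a canonical unit (m/s),
# illustrating why "48 km/h" and "30 mph" should match as roughly equal.
TO_METERS_PER_SECOND = {"km/h": 1000 / 3600, "mph": 1609.344 / 3600, "m/s": 1.0}

def normalize(value, unit):
    return value * TO_METERS_PER_SECOND[unit]

def roughly_equal(a, b, tol=0.05):
    # relative tolerance absorbs rounding in text ("30 mph" for "48 km/h")
    return abs(a - b) <= tol * max(abs(a), abs(b))

print(roughly_equal(normalize(48, "km/h"), normalize(30, "mph")))  # -> True
```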
Paths and Walks, Forests and Planes – Arcadian Algorithms and Complexity
(Advisor: Prof. Holger Dell, now ITU Copenhagen)
Thursday, 26.09.2019, 10:00 h, in building E1 7, room 0.01
We present new results on parameterized algorithms for hard graph problems. We generalize and improve established methods (such as color-coding and representative families) by an algebraic formulation of the problems in the language of exterior algebra. Our method gives (1) a faster algorithm to estimate the number of simple paths of given length in directed graphs, (2) faster deterministic algorithms for finding such paths if the input graph contains only few of them, and (3) faster deterministic algorithms to find spanning trees with few leaves. We also consider the algebraic foundations of our new method in exterior and commutative algebra.
Additionally, we investigate the fine-grained complexity of determining the number of forests with a given number of edges in a given undirected graph.
We (1) complete the complexity classification of the Tutte plane under the exponential time hypothesis, and (2) prove that counting forests with a given number of edges is at least as hard as counting cliques of a given size.
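The classical Color-Coding baseline for estimating the number of simple paths can be sketched as follows: color vertices uniformly at random with k colors, count "colorful" paths by dynamic programming over color sets, and rescale by the probability k!/k^k that a fixed simple path receives all-distinct colors (a minimal illustration of the baseline technique, not the thesis's exterior-algebra method):

```python
import random
from math import factorial

def colorful_path_count(adj, k, colors):
    """Count directed paths on k vertices whose colors are pairwise
    distinct, by dynamic programming over (endpoint, used color set)."""
    dp = {(v, frozenset([colors[v]])): 1 for v in range(len(adj))}
    for _ in range(k - 1):
        ndp = {}
        for (v, used), cnt in dp.items():
            for w in adj[v]:
                if colors[w] not in used:
                    key = (w, used | {colors[w]})
                    ndp[key] = ndp.get(key, 0) + cnt
        dp = ndp
    return sum(dp.values())  # every surviving state uses k distinct colors

def estimate_simple_paths(adj, k, trials=500, seed=0):
    """Unbiased Color-Coding estimator: a fixed simple k-path is
    colorful with probability k!/k**k under a uniform random coloring."""
    rng = random.Random(seed)
    p_colorful = factorial(k) / k ** k
    total = sum(
        colorful_path_count(adj, k, [rng.randrange(k) for _ in range(len(adj))])
        for _ in range(trials)
    )
    return total / trials / p_colorful
```

Averaging over many random colorings concentrates the estimate around the true path count.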
Accountable Infrastructure and Its Impact on Internet Security and Privacy
(Advisor: Prof. Michael Backes)
Tuesday, 17.09.2019, 14:00 h, in building E9 1, room 001
The Internet infrastructure relies on the correct functioning of the basic underlying protocols, which were designed for functionality. Security and privacy have been added post hoc, mostly by applying cryptographic means to different layers of communication. Lacking accountability as a fundamental property, the Internet infrastructure has no built-in ability to associate an action with the responsible entity, nor to detect or prevent misbehavior.
In this thesis, we study accountability from several different perspectives. First, we study the need for accountability in anonymous communication networks as a mechanism that provides repudiation for the proxy nodes by tracing back selected outbound traffic in a provable manner. Second, we design a framework that provides a foundation to support the enforcement of the right-to-be-forgotten law in a scalable and automated manner. The framework provides a technical means for users to prove their eligibility for content removal from search results. Third, we analyze the Internet infrastructure, determining potential security risks and threats imposed by dependencies among entities on the Internet. Finally, we evaluate the feasibility of using hop count filtering as a mechanism for mitigating Distributed Reflective Denial-of-Service attacks, and conceptually show that it cannot work to prevent these attacks.
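Hop count filtering can be sketched as follows: infer a packet's hop count from its TTL, assuming the sender started from one of the common initial TTL values, and flag packets whose inferred hop count disagrees with a previously learned per-source baseline (a simplified illustration with hypothetical values, not the thesis's evaluation setup):

```python
# Common initial TTL values used by mainstream operating systems.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)

def infer_hop_count(observed_ttl):
    # The sender's initial TTL is unknown; assume the smallest common
    # initial TTL that is >= the observed value.
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial - observed_ttl

def is_spoofed(src_ip, observed_ttl, hop_table, tolerance=0):
    """Flag a packet whose inferred hop count deviates from the learned
    baseline for its claimed source address."""
    expected = hop_table.get(src_ip)
    if expected is None:
        return False  # no baseline yet; accept and learn
    return abs(infer_hop_count(observed_ttl) - expected) > tolerance
```

Spoofed packets carry the reflector's or attacker's TTL, which rarely matches the hop count of the forged source, so a mismatch hints at spoofing.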
On some covering, partition, and connectivity problems in graphs
(Advisor: Dr. Andreas Karrenbauer)
Monday, 09.09.2019, 14:00 h, in building E1 4, room 0.24
We look at some graph problems related to covering, partition, and connectivity. First, we study the problems of covering and partitioning edges with bicliques, especially from the viewpoint of parameterized complexity. For the partition problem, we develop much more efficient algorithms than the ones previously known. In contrast, for the cover problem, our lower bounds show that the known algorithms are probably optimal. Next, we move on to graph coloring, which is probably the most extensively studied partition problem in graphs. Hadwiger’s conjecture is a long-standing open problem related to vertex coloring. We prove the conjecture for a special class of graphs, namely squares of 2-trees, and show that square graphs are important in connection with Hadwiger’s conjecture. Then, we study a coloring problem that has been emerging recently, called rainbow coloring. This problem lies in the intersection of coloring and connectivity. We study different variants of rainbow coloring and present bounds and complexity results on them. Finally, we move on to another parameter related to connectivity called spanning tree congestion (STC). We give tight bounds for STC in general graphs and random graphs. While proving the results on STC, we also make some contributions to the related area of connected partitioning.
Applications, challenges and new perspectives on the analysis of transcriptional regulation using epigenomic and transcriptomic data
(Advisor: Dr. Marcel Schulz, now Frankfurt)
Monday, 26.08.2019, 10:00 h, in building E1 4, room 0.24
The integrative analysis of epigenomics and transcriptomics data is an active research field in Bioinformatics. New methods are required to interpret and process large omics data sets, as generated within consortia such as the International Human Epigenomics Consortium. In this thesis, we present several approaches illustrating how combined epigenomics and transcriptomics datasets, e.g. for differential or time series analysis, can be used to derive new biological insights on transcriptional regulation. In my work, I focused on regulatory proteins called transcription factors (TFs), which are essential for orchestrating cellular processes.
I present novel approaches that combine epigenomics data, such as DNaseI-seq, predicted TF binding scores, and gene-expression measurements in interpretable machine learning models. In joint work with our collaborators within and outside IHEC, we have shown that our methods lead to biologically meaningful results, which could be validated with wet-lab experiments.
Aside from providing the community with new tools to perform integrative analysis of epigenomics and transcriptomics data, we have studied the characteristics of chromatin accessibility data and its relation to gene-expression in detail to better understand the implications of both computational processing and of different experimental methods on data interpretation.
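As a toy illustration of the kind of interpretable model that relates TF binding scores to gene expression, one might fit a ridge regression whose coefficients are directly readable as per-TF importances (entirely synthetic data; the thesis's actual models are more elaborate):

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_tfs = 200, 5
X = rng.random((n_genes, n_tfs))                       # predicted TF binding scores
true_w = np.array([2.0, 0.0, -1.0, 0.0, 0.5])          # hypothetical TF effects
y = X @ true_w + 0.01 * rng.standard_normal(n_genes)   # "measured" expression

# Ridge regression in closed form: each coefficient is a per-TF importance.
lam = 1e-3
w = np.linalg.solve(X.T @ X + lam * np.eye(n_tfs), X.T @ y)
```

With enough genes, the recovered coefficients track the underlying effects, so the strongest regulator can be read off directly.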
Decidable Fragments of First-Order Logic and of First-Order Linear Arithmetic with Uninterpreted Predicates
(Advisor: Prof. Christoph Weidenbach)
Wednesday, 31.07.2019, 14:00 h, in building E1 4, room 0.24
First-order logic (FOL) is one of the most prominent formalisms in computer science. Unfortunately, FOL satisfiability is not solvable algorithmically in full generality. The classical decision problem is the quest for a delineation between the decidable and the undecidable parts of FOL. This talk aims to shed new light on the boundary. To this end, recently discovered structure on the decidable side of the first-order landscape shall be discussed.
The primary focus will be on the syntactic concept of separateness of variables and its applicability to the classical decision problem and beyond. Two disjoint sets of first-order variables are separated in a given formula if each atom in that formula contains variables from at most one of the two sets. This simple notion facilitates the definition of decidable extensions of many well-known decidable FOL fragments. Although the extensions exhibit the same expressive power as the respective originals, certain logical properties can be expressed much more succinctly. In at least two cases the succinctness gap cannot be bounded using any elementary function.
The secondary focus will be on linear first-order arithmetic over the rationals with uninterpreted predicates. Novel decidable fragments of this logical language shall be discussed briefly.
Counting Problems on Quantum Graphs
(Advisor: Dr. Holger Dell)
Monday, 15.07.2019, 12:00 h, in building E1 7, room 001
Quantum graphs, as defined by Lovász in the late 60s, are formal linear combinations of simple graphs with finite support. They allow for the complexity analysis of the problem of computing finite linear combinations of homomorphism counts, the latter of which constitute the foundation of the structural hardness theory for parameterized counting problems: The framework of parameterized counting complexity was introduced by Flum and Grohe, and McCartin in 2002 and forms a hybrid between the classical field of computational counting as founded by Valiant in the late 70s and the paradigm of parameterized complexity theory due to Downey and Fellows which originated in the early 90s.
The problem of computing homomorphism numbers of quantum graphs subsumes general motif counting problems; its complexity-theoretic implications only became apparent recently through a breakthrough on the parameterized subgraph counting problem by Curticapean, Dell and Marx in 2017.
We study the problems of counting partially injective and edge-injective homomorphisms, counting induced subgraphs, as well as counting answers to existential first-order queries. We establish novel combinatorial, algebraic and even topological properties of quantum graphs that allow us to provide exhaustive parameterized and exact complexity classifications, including necessary, sufficient and mostly explicit tractability criteria, for all of the previous problems.
Engineering Formal Systems in Constructive Type Theory
(Advisor: Prof. Gert Smolka)
Friday, 12.07.2019, 14:15 h, in building E1 7, room 001
This thesis presents a practical methodology for formalizing the meta-theory of formal systems with binders and coinductive relations in constructive type theory. While constructive type theory offers support for reasoning about formal systems built out of inductive definitions, support for syntax with binders and coinductive relations is lacking. We provide this support. We implement syntax with binders using well-scoped de Bruijn terms and parallel substitutions. We solve substitution lemmas automatically using the rewriting theory of the σ-calculus. We present the Autosubst library to automate our approach in the proof assistant Coq. Our approach to coinductive relations is based on an inductive tower construction, which is a type-theoretic form of transfinite induction. The tower construction allows us to reduce coinduction to induction. This leads to a symmetric treatment of induction and coinduction and allows us to give a novel construction of the companion of a monotone function on a complete lattice. We demonstrate our methods with a series of case studies. In particular, we present a proof of type preservation for CCω, a proof of weak and strong normalization for System F, a proof that systems of weakly guarded equations have unique solutions in CCS, and a compiler verification for a compiler from a non-deterministic language into a deterministic language. All technical results in the thesis are formalized in Coq.
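The de Bruijn representation with parallel substitutions can be sketched in untyped form (a plain Python illustration of the technique, far removed from the well-scoped Coq/Autosubst formalization):

```python
# de Bruijn lambda terms: ("var", n) | ("app", s, t) | ("lam", s)

def rename(t, f):
    """Apply a renaming f: index -> index, shifting it under binders."""
    kind = t[0]
    if kind == "var":
        return ("var", f(t[1]))
    if kind == "app":
        return ("app", rename(t[1], f), rename(t[2], f))
    return ("lam", rename(t[1], lambda i: 0 if i == 0 else f(i - 1) + 1))

def subst(t, sigma):
    """Apply a parallel substitution sigma: index -> term."""
    kind = t[0]
    if kind == "var":
        return sigma(t[1])
    if kind == "app":
        return ("app", subst(t[1], sigma), subst(t[2], sigma))
    # "lam": lift the substitution under the binder (the "up" operation)
    up = lambda i: ("var", 0) if i == 0 else rename(sigma(i - 1), lambda j: j + 1)
    return ("lam", subst(t[1], up))
```

Beta reduction of (λ.s) t is then a single parallel substitution that maps index 0 to t and shifts all other indices down by one.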
Formal Verification of the Equivalence of System F and the Pure Type System L2
(Advisor: Prof. Gert Smolka)
Thursday, 11.07.2019, 16:15 h, in building E1 7, room 001
We develop a formal proof of the equivalence of two different variants of System F. The first is close to the original presentation where expressions are separated into distinct syntactic classes of types and terms. The second, L2, is a particular pure type system (PTS) where the notions of types and terms, and the associated expressions are unified in a single syntactic class. The employed notion of equivalence is a bidirectional reduction of the respective typing relations. A machine-verified proof of this result turns out to be surprisingly intricate, since the two variants noticeably differ in their expression languages, their type systems and the binding of local variables.
Most of this work is executed in the Coq theorem prover and encompasses a general development of the PTS metatheory, an equivalence result for a stratified and a PTS variant of the simply typed lambda-calculus as well as the subsequent extension to the full equivalence result for System F. We utilise nameless de Bruijn syntax with parallel substitutions for the representation of variable binding and develop an extended notion of context morphism lemmas as a structured proof method for this setting.
We also provide two developments of the equivalence result in the proof systems Abella and Beluga, where we rely on higher-order abstract syntax (HOAS). This allows us to compare the three proof systems, as well as HOAS and de Bruijn for the purpose of developing formal metatheory.
Measuring User Experience for Virtual Reality
(Advisor: Prof. Antonio Krüger)
Wednesday, 10.07.2019, 13:00 h, in building D3 2, Reuse seminar room (-2.17)
In recent years, Virtual Reality (VR) and 3D User Interfaces (3DUI) have seen a drastic increase in popularity, especially in terms of consumer-ready hardware and software. These technologies have the potential to create new experiences that combine the advantages of reality and virtuality. While the technology for input as well as output devices is market ready, only a few solutions for everyday VR – online shopping, games, or movies – exist, and empirical knowledge about performance and user preferences is lacking. All this makes the development and design of human-centered user interfaces for VR a great challenge.
This thesis investigates the evaluation and design of interactive VR experiences. We introduce the Virtual Reality User Experience (VRUX) model based on VR-specific external factors and evaluation metrics such as task performance and user preference. Based on our novel UX evaluation approach, we contribute by exploring the following directions: shopping in virtual environments, as well as text entry and menu control in the context of everyday VR. Along with this, we summarize our findings by design spaces and guidelines for choosing optimal interfaces and controls in VR.
Matrix Factorization over Dioids and its Applications in Data Mining
(Advisor: Prof. Pauli Miettinen, now University of Eastern Finland)
Wednesday, 10.07.2019, 10:30 h, in building E1 5, room 0.29
While classic matrix factorization methods, such as NMF and SVD, are known to be highly effective at finding latent patterns in data, they are limited by the underlying algebraic structure. In particular, it is often difficult to distinguish heavily overlapping patterns because they interfere with each other. To deal with this problem, we study matrix factorization over algebraic structures known as dioids, which are characterized by the idempotency of addition (a + a = a). Idempotency ensures that at every data point only a single pattern contributes, which makes the patterns easier to distinguish. In this thesis, we consider different types of dioids, ranging from continuous (subtropical and tropical algebras) to discrete (Boolean algebra).
The Boolean algebra is perhaps the best-known dioid, and there exist Boolean matrix factorization algorithms that produce low reconstruction errors. In this thesis, however, a different objective function is used: the description length of the data, which enables us to obtain more compact and highly interpretable results.
The tropical and subtropical algebras are much less known in the data mining field. While they find applications in areas such as job scheduling and discrete event systems, they are virtually unknown in the context of data analysis. We will use them to obtain idempotent nonnegative factorizations that are similar to NMF, but are better at separating most prominent features of the data.
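The contrast between standard and idempotent arithmetic can be seen in the two matrix products involved (a generic sketch; factorization algorithms over these structures are considerably more involved):

```python
import numpy as np

def boolean_product(A, B):
    # Boolean semiring: addition is OR (1 + 1 = 1), so overlapping
    # patterns cannot "add up" beyond 1.
    return (A.astype(int) @ B.astype(int) > 0).astype(int)

def maxtimes_product(W, H):
    # Subtropical (max-times) dioid: a + b = max(a, b), so each entry
    # is explained by the single strongest pattern (winner-take-all).
    return np.max(W[:, :, None] * H[None, :, :], axis=1)
```

In both products the idempotent addition prevents overlapping factors from interfering, which is exactly the property the thesis exploits.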
Intents and Preferences Prediction Based on Implicit Human Cues
(Advisor: Dr. Mario Fritz)
Tuesday, 02.07.2019, 13:00 h, in building E1 4, room 0.22
Understanding implicit human cues is a key factor in building advanced human-machine interfaces. Modern interfaces should be able to understand and predict human intents and assist users based on implicit data rather than waiting for explicit input. Eye movements, electroencephalogram (EEG), electrocardiogram (ECG), and body shape are several sources of implicit human cues. In this thesis, we focus on eye data and body shape.
The eye is known as a window to the mind and can reveal a lot about the ongoing processes in the mind. Thus, in the first part of the thesis, we focus on advancing techniques for search target prediction from human gaze data. We propose a series of search target inference techniques based on human fixation data recorded during visual search, which broaden the scope of search target prediction to categorical classes, such as object categories and attributes.
Moreover, body shape can reveal a lot of information about people's choice of outfit, food, sport, or living style. Hence, in the second part of this thesis, we study how body shape influences people's outfit preferences. We propose a novel and robust multi-photo approach to estimate the body shape of each user and build a conditional model of clothing categories given body shape. However, an accurate depiction of the naked body is considered highly private and therefore might not be consented to by most people. First, we study the perception of such technology via a user study. Then, in the last part of this thesis, we ask whether the automatic extraction of such information can be effectively evaded.
Interpretable Machine Learning Methods for Prediction and Analysis of Genome Regulation in 3D
(Advisor: Prof. Nico Pfeifer, now Tübingen)
Monday, 24.06.2019, 16:00 h, in building E1 4, room 0.24
With the development of chromosome conformation capture-based techniques, we now know that chromatin is packed in three-dimensional (3D) space inside the cell nucleus. Changes in the 3D chromatin architecture have already been implicated in diseases such as cancer. Thus, a better understanding of this 3D conformation is of interest to help enhance our comprehension of the complex, multi-pronged regulatory mechanisms of the genome. In my colloquium, I will present interpretable machine learning-based approaches for prediction and analysis of long-range chromatin interactions and how they can help gain novel, fine-grained insights into genome regulatory mechanisms. Specifically, I will first present our string kernels-based support vector classification method for prediction of long-range chromatin interactions. This method was one of the first bioinformatic approaches to suggest a potential general role of short tandem repeat sequences in long-range genome regulation and its 3D organization. I will then present CoMIK, a method we developed for handling variable-length DNA sequences in classification scenarios. CoMIK can not only identify the features important for classification but also locate them within the individual sequences, thus enabling the possibility of more fine-grained insights.
Artificial Intelligence for Efficient Image-based View Synthesis
(Advisor: Prof. Hans-Peter Seidel and Prof. Tobias Ritschel, now University College London)
Monday, 24.06.2019, 9:00 h, in building E1 4, room 0.19
Synthesizing novel views from image data is a widely investigated topic in both computer graphics and computer vision, and has many applications like stereo or multi-view rendering for virtual reality, light field reconstruction, and image post-processing. While image-based approaches have the advantage of reduced computational load compared to classical model-based rendering, efficiency is still a major concern. We demonstrate how concepts and tools from artificial intelligence can be used to increase the efficiency of image-based view synthesis algorithms. In particular it is shown how machine learning can help to generate point patterns useful for a variety of computer graphics tasks, how path planning can guide image warping, how sparsity-enforcing optimization can lead to significant speedups in interactive distribution effect rendering, and how probabilistic inference can be used to perform real-time 2D-to-3D conversion.
Azim Dehghani Amirabad
From genes to transcripts: Integrative modeling and analysis of regulatory networks
(Advisor: Dr. Marcel Schulz, now Frankfurt)
Tuesday, 11.06.2019, 10:00 h, in building E1 7, room 002
Although all the cells in an organism possess the same genome, regulatory mechanisms give rise to highly specific cell types. Elucidating these regulatory mechanisms is a great challenge in systems biology research. It is known that a large fraction of our genome is comprised of regulatory elements, yet the precise mechanisms by which different combinations of regulatory elements control gene expression and cell identity are poorly understood. This thesis describes algorithms and approaches for modeling and analyzing different modes of gene regulation. We present POSTIT, a novel algorithm for modeling and inferring transcript isoform regulation from transcriptomics and epigenomics data. POSTIT uses multi-task learning with a structured-sparsity-inducing regularizer to share regulatory information between the isoforms of a gene, which is shown to lead to accurate isoform expression prediction and inference of regulators.
We developed a novel statistical method, called SPONGE, for large-scale inference of ceRNA networks. In this framework, we designed an efficient empirical p-value computation approach, by sampling from derived null models, which addresses important confounding factors such as sample size, number of involved regulators and strength of correlation.
Finally, we present an integrative analysis of miRNA and protein-based post-transcriptional regulation. We postulate a competitive regulation of the RNA-binding protein IMP2 with miRNAs binding the same RNAs using expression and RNA binding data. This function of IMP2 is relevant in the contribution to disease in the context of adult cellular metabolism.
Model-based Human Performance Capture in Outdoor Scenes
(Advisor: Prof. Christian Theobalt)
Tuesday, 21.05.2019, 10:00 h, in building E1 4, room 019
Technologies for motion and performance capture of real actors have enabled the creation of realistic-looking virtual humans through detail and deformation transfer at the cost of extensive manual work and sophisticated in-studio marker-based systems. This thesis pushes the boundaries of performance capture by proposing automatic algorithms for robust 3D skeleton and detailed surface tracking in less constrained multi-view outdoor scenarios.
Contributions include new multi-layered human body representations designed for effective model-based time-consistent reconstruction in complex dynamic environments with varying illumination, from a set of vision cameras. We design dense surface refinement approaches to enable smooth silhouette-free model-to-image alignment, as well as coarse-to-fine tracking techniques to enable joint estimation of skeleton motion and fine-scale surface deformations in complicated scenarios. High-quality results attained on challenging application scenarios confirm the contributions and show great potential for the automatic creation of personalized 3D virtual humans.
Monitoring with Parameters
(Advisor: Prof. Bernd Finkbeiner)
Friday, 17.05.2019, 14:15 h, in building E1 7, room 0.01
Runtime monitoring of embedded systems is a method to safeguard their reliable operation by detecting runtime failures within the system and recognizing unexpected environment behavior during system development and operation. Specification languages for runtime monitoring aim to provide the succinct, understandable, and formal specification of system and component properties.
Starting from temporal logics, this thesis extends the expressivity of specification languages for runtime monitoring in three key aspects while maintaining the predictability of resource usage. First, we provide monitoring algorithms for linear-time temporal logic with parameters (PLTL), where the parameters bound the number of steps until an eventuality is satisfied. Second, we introduce Lola 2.0, which adds data parameterization to the stream specification language Lola. Third, we integrate real-time specifications in RTLola and add real-time sliding windows, which aggregate data over real-time intervals. For the combination of these extensions, we present a design-time specification analysis which provides resource guarantees.
We report on a case study on the application of the language in an autonomous UAS. Component and system properties were specified together with domain experts in the developed stream specification language and evaluated in a real-time hardware-in-the-loop testbed with a complex environment simulation.
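The real-time sliding windows added in RTLola can be approximated by a deliberately naive sketch that keeps raw timestamped events in a deque (hypothetical stream values; the thesis's design-time analysis provides resource guarantees that such an unbounded buffer does not):

```python
from collections import deque

class SlidingSum:
    """Sum of stream values over the last `horizon` seconds of real time."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.events = deque()   # (timestamp, value) pairs still in the window
        self.total = 0.0

    def update(self, t, value):
        """Ingest an event at time t and return the current window sum."""
        self.events.append((t, value))
        self.total += value
        # Evict events that have fallen out of the real-time window.
        while self.events and self.events[0][0] <= t - self.horizon:
            _, v = self.events.popleft()
            self.total -= v
        return self.total
```

Pre-aggregating the window into a fixed number of buckets instead of storing raw events is one way to bound memory regardless of the input rate.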
Computational Approaches for Improving Treatment and Prevention of Viral Infections
(Advisor: Prof. Nico Pfeifer, now Tübingen)
Tuesday, 30.04.2019, 14:00 h, in building E2 1, room 0.01
The treatment of infections with HIV or HCV is challenging. Thus, novel drugs and new computational approaches that support the selection of therapies are required. This work presents methods that support therapy selection as well as methods that advance novel antiviral treatments.
geno2pheno[ngs-freq] identifies drug resistance from HIV-1 or HCV samples that were subjected to next-generation sequencing by interpreting their sequences either via support vector machines or a rules-based approach. geno2pheno[coreceptor-hiv2] determines the coreceptor that is used for viral cell entry by analyzing a segment of the HIV-2 surface protein with a support vector machine. openPrimeR is capable of finding optimal combinations of primers for multiplex polymerase chain reaction by solving a set cover problem and using a novel logistic regression model to predict amplification events arising from polymerase chain reaction. geno2pheno[ngs-freq] and geno2pheno[coreceptor-hiv2] enable the personalization of antiviral treatments and support clinical decision making. The application of openPrimeR to human immunoglobulin sequences has resulted in novel primer sets that improve the isolation of broadly neutralizing antibodies against HIV-1. The methods developed in this work thus constitute important contributions towards improving the prevention and treatment of viral infectious diseases.
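The set cover view of primer selection admits the classic greedy approximation, sketched here with hypothetical primer names and template sets (openPrimeR's actual solver and amplification scoring are more elaborate):

```python
def greedy_set_cover(universe, candidates):
    """Greedy set cover: repeatedly pick the candidate primer that
    covers the most still-uncovered templates.

    candidates: primer name -> set of templates it amplifies
    Returns (chosen primer names, templates left uncovered).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        if not candidates[best] & uncovered:
            break  # the remaining templates cannot be covered
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen, uncovered
```

The greedy rule yields a cover within a logarithmic factor of optimal, which is the standard guarantee for this NP-hard problem.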
Stochastic Modeling of DNA Demethylation Dynamics in ESCs
(Advisor: Prof. Verena Wolf)
Tuesday, 30.04.2019, 11:00 h, in building E1 5, room 0.29
DNA methylation and demethylation are opposing processes that, when in balance, create stable patterns of epigenetic memory. The control of DNA methylation pattern formation in replication-dependent and -independent demethylation processes has been suggested to be influenced by Tet-mediated oxidation of a methylated cytosine, 5mC, to a hydroxylated cytosine, 5hmC. Based on in vitro experiments alone, several alternative mechanisms have been proposed for how 5hmC influences replication-dependent maintenance of DNA methylation and replication-independent processes of active demethylation. In this thesis we design an extended and easily generalizable hidden Markov model that takes hairpin (oxidative) bisulfite sequencing data as input to precisely determine the temporal dynamics of 5mC and 5hmC, as well as to infer the activities of the involved enzymes at single-CpG resolution. Developing the appropriate statistical and computational tools, we apply the model to discrete high-depth sequenced genomic loci, and on a whole-genome scale with a much smaller sequencing depth. Analyzing the model's output on mESC data, we show that the presence of Tet enzymes and 5hmC has a very strong impact on replication-dependent demethylation by establishing a passive demethylation mechanism, implicitly impairing methylation maintenance, but also down-regulating the de novo methylation activity.
Analysis of the Protein-Ligand and Protein-Peptide Interactions Using a Combined Sequence- and Structure-Based Approach
(Advisor: Prof. Olga Kalinina)
Wednesday, 17.04.2019, 15:00 h, in building E1 4, room 0.24
Proteins participate in most of the important processes in cells, and their ability to perform their function ultimately depends on their three-dimensional structure. They usually act in these processes through interactions with other molecules. Because of the importance of their role, proteins are also the common target for small molecule drugs that inhibit their activity, which may include targeting protein interactions. Understanding protein interactions and how they are affected by mutations is thus crucial for combating drug resistance and aiding drug design.
This dissertation combines bioinformatics studies of protein interactions at both primary sequence and structural level. We analyse protein-protein interactions through linear motifs, as well as protein-small molecule interactions, and study how mutations affect them. This is done in the context of two systems. In the first study of drug resistance mutations in the protease of the human immunodeficiency virus type 1, we successfully apply molecular dynamics simulations to estimate the effects of known resistance-associated mutations on the free binding energy, also revealing molecular mechanisms of resistance. In the second study, we analyse consensus profiles of linear motifs that mediate the recognition by the mitogen-activated protein kinases of their target proteins. We thus gain insights into the cellular processes these proteins are involved in.
Question Answering over Knowledge Bases with Continuous Learning
(Advisor: Prof. Gerhard Weikum)
Friday, 12.04.2019, 11:00 h, in building E1 4, room 0.24
Answering complex natural language questions with crisp answers is crucial towards satisfying the information needs of advanced users. With the rapid growth of knowledge bases (KBs) such as Yago and Freebase, this goal has become attainable by translating questions into formal queries like SPARQL queries. Such queries can then be evaluated over knowledge bases to retrieve crisp answers. To this end, two research issues arise:
(i) how to develop methods that are robust to lexical and syntactic variations in questions and can handle complex questions, and
(ii) how to design and curate datasets to advance research in question answering. In this dissertation, we make the following contributions in the area of question answering (QA). For issue (i), we present QUINT, an approach for answering natural language questions over knowledge bases using automatically learned templates. Templates are an important asset for QA over KBs, simplifying the semantic parsing of input questions and generating formal queries for interpretable answers. We also introduce NEQA, a framework for continuous learning for QA over KBs. NEQA starts with a small seed of training examples in the form of question-answer pairs, and improves its performance over time. NEQA combines both syntax, through template-based answering, and semantics, via a semantic similarity function. For issues (i) and (ii), we present TEQUILA, a framework for answering complex questions with explicit and implicit temporal conditions over KBs. TEQUILA is built on a rule-based framework that detects and decomposes temporal questions into simpler sub-questions that can be answered by standard KB-QA systems. TEQUILA reconciles the results of sub-questions into final answers. TEQUILA is accompanied with a dataset called TempQuestions, which consists of 1,271 temporal questions with gold-standard answers over Freebase. This collection is derived by judiciously selecting time-related questions from existing QA datasets. For issue (ii), we publish ComQA, a large-scale manually-curated dataset for QA. ComQA contains questions that represent real information needs and exhibit a wide range of difficulties such as the need for temporal reasoning, comparison, and compositionality. ComQA contains paraphrase clusters of semantically-equivalent questions that can be exploited by QA systems.
Generalizations of the Multicut Problem for Computer Vision
(Advisor: Prof. Bernt Schiele)
Monday, 08.04.2019, 11:30 h, in building E1 4, room 0.24
Graph decompositions have always been an essential part of computer vision as a large variety of tasks can be formulated in this framework. One way to optimize for a decomposition of a graph is called the multicut problem, also known as correlation clustering. Its particular feature is that the number of clusters is not required to be known a priori, but is rather deduced from the optimal solution. On the downside, it is NP-hard to solve.
In this thesis we present several generalizations of the multicut problem that allow modeling richer dependencies between the nodes of a graph. We also propose efficient local search algorithms that in practice find good solutions in reasonable time, and show applications to image segmentation, multi-human pose estimation, multi-object tracking, and many others.
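A rudimentary local search for the multicut objective is greedy joining: start from singleton clusters and repeatedly merge the pair of clusters with the largest positive (attractive) total inter-cluster cost (a minimal sketch, far simpler than the algorithms proposed in the thesis; edge costs are hypothetical):

```python
def greedy_joining(n, costs):
    """Greedy agglomeration for correlation clustering / multicut.

    costs: dict mapping vertex pairs (u, v) with u < v to a reward for
    placing u and v in the same cluster (negative = repulsive).
    """
    clusters = [{v} for v in range(n)]

    def inter_cost(a, b):
        return sum(costs.get((min(u, v), max(u, v)), 0.0)
                   for u in a for v in b)

    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gain = inter_cost(clusters[i], clusters[j])
                if gain > best_gain:
                    best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            break  # no merge improves the objective
        i, j = best_pair
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters
```

Because the number of clusters is decided by where the merging stops, it never has to be fixed a priori, which mirrors the defining feature of the multicut formulation.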
On Static Execution-Time Analysis – Compositionality, Pipeline Abstraction, and Predictable Hardware
(Advisor: Prof. Reinhard Wilhelm)
Thursday, 04.04.2019, 16:00 h, in building E1 1, room 407
Proving timeliness is an integral part of the verification of safety-critical real-time systems. To this end, timing analysis computes upper bounds on the execution times of programs that execute on a given hardware platform.
At least since multi-core processors have emerged, timing analysis separates concerns by analysing different aspects of the system’s timing behaviour individually. In this work, we validate the underlying assumption that a timing bound can be soundly composed from individual contributions. We show that even simple processors exhibit counter-intuitive behaviour – a locally slow execution can lead to an even slower overall execution – that impedes the soundness of the composition. We present the compositional base bound analysis, which accounts for any such amplifying effects within its timing contribution. This enables a sound, albeit expensive, compositional analysis even for complex contemporary processors.
In addition, we discuss future hardware designs that enable efficient compositional analyses. We introduce a hardware design, the strictly in-order pipeline, that behaves monotonically with respect to the progress of a program’s execution. Monotonicity enables us to formally reason about properties such as compositionality.
Joanna (Asia) BIEGA
Enhancing Privacy and Fairness in Search Systems
(Advisor: Prof. Gerhard Weikum)
Monday, 01.04.2019, 15:15 h, in building E1 4, room 024
Search systems have a substantial potential to harm individuals and society. First, since they collect vast amounts of data about their users, they have the potential to violate user privacy. Second, in applications where rankings influence people’s economic livelihood outside of the platform, such as sharing-economy or hiring-support websites, search engines have an immense economic power over their users in that they control user exposure in ranked results.
This thesis develops new models and methods broadly covering different aspects of privacy and fairness in search systems for both searchers and search subjects:
* We propose a model for computing individually fair rankings where search subjects get exposure proportional to their relevance. The exposure is amortized over time using constrained optimization to overcome searcher attention biases while preserving ranking utility.
* We propose a model for computing sensitive search exposure where each subject gets to know the sensitive queries that lead to her profile in the top-k search results. The problem of finding exposing queries is technically modeled as reverse nearest neighbor search, followed by a weakly supervised learning-to-rank model ordering the queries by privacy-sensitivity.
* We propose a model for quantifying privacy risks from textual data in online communities. The method builds on a topic model where each topic is annotated by a crowdsourced sensitivity score, and privacy risks are associated with a user’s relevance to sensitive topics. We propose relevance measures capturing different dimensions of user interest in a topic and show how they correlate with human risk perceptions.
* Last but not least, we propose a model for privacy-preserving personalized search where search queries of different users are split and merged into synthetic profiles. The model mediates the privacy-utility trade-off by keeping semantically coherent fragments of search histories within individual profiles, while trying to minimize the similarity of any of the synthetic profiles to the original user profiles.
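The first model's notion of exposure amortized over time can be sketched informally. The attention model and the "owed exposure" heuristic below are illustrative assumptions standing in for the constrained optimization used in the thesis:

```python
import math

def position_bias(rank):
    # DCG-style attention model: higher-ranked positions get more exposure.
    return 1.0 / math.log2(rank + 2)  # rank is 0-based

def amortized_ranking(relevance, exposure):
    # Rank subjects by how much exposure they are still "owed" relative to
    # their relevance share; a toy stand-in for constrained optimization.
    total_rel = sum(relevance.values())
    total_exp = sum(exposure.values()) or 1.0
    owed = {s: relevance[s] / total_rel - exposure[s] / total_exp
            for s in relevance}
    return sorted(relevance, key=lambda s: owed[s], reverse=True)

relevance = {"a": 0.5, "b": 0.3, "c": 0.2}
exposure = {"a": 0.0, "b": 0.0, "c": 0.0}
for _ in range(100):  # simulate repeated issuing of the same query
    ranking = amortized_ranking(relevance, exposure)
    for pos, subject in enumerate(ranking):
        exposure[subject] += position_bias(pos)
```

Over many impressions, each subject's accumulated exposure share tracks its relevance share, which is the individual-fairness criterion stated above.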
Intelligent Tutoring in Virtual Reality for Highly Dynamic Pedestrian Safety Training
(Advisor: Prof. Jörg Siekmann)
Monday, 01.04.2019, 14:00 h, in building E1 7, room 0.01
This thesis presents the design, implementation, and evaluation of an Intelligent Tutoring System (ITS) with a Virtual Reality (VR) interface for child pedestrian safety training. This system enables children to train practical skills in a safe and realistic virtual environment without the time and space dependencies of traditional roadside training. This system also employs Domain and Student Modeling techniques to automatically analyze user data during training and to provide appropriate instructions and feedback.
Thus, the traditional requirement of constant monitoring from teaching personnel is greatly reduced. Compared to previous work, the second aspect in particular is a principal novelty for this domain. To achieve this, a novel Domain and Student Modeling method was developed in addition to a modular and extensible virtual environment for the target domain. While the Domain and Student Modeling framework is designed to handle the highly dynamic nature of training in traffic and the ill-defined characteristics of pedestrian tasks, the modular virtual environment supports different interaction methods and a simple and efficient way to create and adapt exercises. The thesis is complemented by two user studies with elementary school children. These studies attest to great overall user acceptance and the system’s potential for improving key pedestrian skills through autonomous learning. Last but not least, the thesis presents experiments with different forms of VR input and provides directions for future work.
Novel Approaches to Anonymity and Privacy in Decentralized, Open Settings
(Advisor: Prof. Michael Backes)
Friday, 29.03.2019, 11:30 h, in building E9 1, room 0.01
The Internet has undergone dramatic changes in the last two decades, evolving from a mere communication network to a global multimedia platform in which billions of users actively exchange information. While this transformation has brought tremendous benefits to society, it has also created new threats to online privacy that existing technology is failing to keep pace with.
In this dissertation, we present the results of two lines of research that developed two novel approaches to anonymity and privacy in decentralized, open settings. First, we examine the issue of attribute and identity disclosure in open settings and develop the novel notion of (k,d)-anonymity for open settings that we extensively study and validate experimentally. Furthermore, we investigate the relationship between anonymity and linkability using the notion of (k,d)-anonymity and show that, in contrast to the traditional closed setting, anonymity within one online community does not necessarily imply unlinkability across different online communities in the decentralized, open setting.
Second, we consider the transitive diffusion of information that is shared in social networks and spread through pairwise interactions of users connected in this social network. We develop the novel approach of exposure minimization to control the diffusion of information within an open network, allowing the owner to minimize their exposure by suitably choosing whom they share their information with. We implement our algorithms and investigate the practical limitations of user-side exposure minimization in large social networks.
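The exposure-minimization idea can be sketched minimally, under the simplifying assumption that each chosen contact forwards the item independently with a fixed probability (the thesis's diffusion model is more general):

```python
def expected_exposure(chosen, reach, p_forward):
    # Expected number of users reached: the chosen contacts themselves,
    # plus, with probability p_forward each, their own contacts.
    return len(chosen) + p_forward * sum(reach[c] for c in chosen)

def minimize_exposure(candidates, reach, m, p_forward=0.1):
    # Greedy: among all contacts the owner might share with, pick the m
    # whose own reach contributes least to the expected spread.
    ranked = sorted(candidates, key=lambda c: reach[c])
    return ranked[:m]

# Hypothetical contacts and their follower counts.
reach = {"ann": 10, "bob": 2, "eve": 50}
chosen = minimize_exposure(["ann", "bob", "eve"], reach, m=2)
```

Sharing with the two low-reach contacts keeps the expected exposure small, whereas including the well-connected contact would dominate the diffusion.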
At their core, both of these approaches present a departure from the provable privacy guarantees that we can achieve in closed settings and a step towards sound assessments of privacy risks in decentralized, open settings.
What we leave behind: reproducibility in chromatin analysis within and across species
(Advisor: Prof. Thomas Lengauer)
Thursday, 14.03.2019, 11:30 h, in building E2 1, room 007
Epigenetics is the field of biology that investigates heritable factors regulating gene expression without being directly encoded in the genome of an organism. In my colloquium, I am going to present approaches dealing with three challenging problems in computational epigenomics: first, we designed a solution strategy for improving reproducibility of computational analysis as part of a large national epigenetics research consortium. Second, we developed a software tool for the comparative epigenome analysis of cell types within a species. For this project, we relied on a well-established and compact representation of epigenetic information known as chromatin states, which enabled us to realize fast and interpretable comparative chromatin analysis even for small sample numbers. Third, we extended comparative epigenomics beyond canonical model organisms to also include species for which little to no epigenetic data is available. In a proof of concept study, we implemented a computational pipeline for predicting biologically relevant information, e.g., cell-type specific gene expression profiles, in non-model organisms based only on reference epigenomes from well-studied species such as human or mouse.
Scalable positioning of commodity mobile devices using audio signals
(Advisor: Prof. Peter Druschel)
Thursday, 14.02.2019, 9:00 h, in building E1 5, room 0.29
This thesis explores the problem of computing a position map for co-located mobile devices. The positioning should happen in a scalable manner without requiring specialized hardware and without requiring specialized infrastructure (except basic Wi-Fi or cellular access). At events like meetings, talks, or conferences, a position map can aid spontaneous communication among users based on their relative position in two ways. First, it enables users to choose message recipients based on their relative position, which also enables the position-based distribution of documents. Second, it enables senders to attach their position to messages, which can facilitate interaction between speaker and audience in a lecture hall and enables the collection of feedback based on users' locations.
In this thesis, we present Sonoloc, a mobile app and system that, by relying on acoustic signals, allows a set of commodity smart devices to determine their _relative_ positions. Sonoloc can position any number of devices within acoustic range with a constant number of acoustic signals emitted by a subset of devices. Our experimental evaluation with up to 115 devices in real rooms shows that — despite substantial background noise — the system can locate devices with an accuracy of tens of centimeters using no more than 15 acoustic signals.
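The core geometric step, recovering relative positions from acoustic distance estimates, can be sketched for the three-device case (Sonoloc itself handles arbitrary numbers of devices and substantial measurement noise):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def distance_from_flight_time(dt_seconds):
    # An acoustic signal's time of flight translates directly into distance.
    return SPEED_OF_SOUND * dt_seconds

def embed_three(d01, d02, d12):
    # Place three devices in the plane from their pairwise distances.
    # Relative positions only: the layout is determined up to rotation,
    # translation, and reflection.
    p0 = (0.0, 0.0)
    p1 = (d01, 0.0)
    x = (d01 ** 2 + d02 ** 2 - d12 ** 2) / (2 * d01)
    y = math.sqrt(max(d02 ** 2 - x ** 2, 0.0))
    return p0, p1, (x, y)

# A 3-4-5 triangle of devices.
p0, p1, p2 = embed_three(4.0, 3.0, 5.0)
```

With more than three devices, the same distance information over-determines the layout, which is what lets the full system average out background noise.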
Techniques to Protect Confidentiality and Integrity of Persistent and In-Memory Data
(Advisor: Prof. Peter Druschel)
Tuesday, 05.02.2019, 17:30 h, in building E1 5, room 0.29
Today, computers store and analyze valuable and sensitive data. As a result, we need to protect this data against confidentiality and integrity violations that can result in the illicit release, loss, or modification of a user’s and an organization’s sensitive data such as personal media content or client records. Existing techniques for protecting confidentiality and integrity either lack efficiency or are vulnerable to malicious attacks. In this thesis we suggest two techniques, Guardat and ERIM, to efficiently and robustly protect persistent and in-memory data.
To protect the confidentiality and integrity of persistent data, clients specify per-file policies to Guardat declaratively, concisely and separately from code. Guardat enforces policies by mediating I/O in the storage layer. In contrast to prior techniques, we protect against accidental or malicious circumvention of higher software layers. We present the design and prototype implementation, and demonstrate that Guardat efficiently enforces example policies in a web server.
To protect the confidentiality and integrity of in-memory data, ERIM isolates sensitive data using Intel Memory Protection Keys (MPK), a recent x86 extension to partition the address space. However, MPK does not protect against malicious attacks by itself. We prevent malicious attacks by combining MPK with call gates to trusted entry points and ahead-of-time binary inspection. In contrast to existing techniques, ERIM efficiently protects frequently-used session keys of web servers, an in-memory reference monitor’s private state, and managed runtimes from native libraries. These use cases result in high switch rates on the order of 10^5–10^6 switches/s. Our experiments demonstrate less than 1% runtime overhead per 100,000 switches/s, thus outperforming existing techniques.
Muhammad Bilal ZAFAR
Discrimination in Algorithmic Decision Making: From Principles to Measures and Mechanisms
(Advisor: Prof. Krishna Gummadi)
Monday, 04.02.2019, 18:00 h, in building E1 5, room 0.29
The rise of algorithmic decision making in a variety of applications has also raised concerns about its potential for discrimination against certain social groups. However, incorporating nondiscrimination goals into the design of algorithmic decision making systems (or, classifiers) has proven to be quite challenging. These challenges arise mainly due to the computational complexities involved in the process, and the inadequacy of existing measures to computationally capture discrimination in various situations. The goal of this thesis is to tackle these problems. First, with the aim of incorporating existing measures of discrimination (namely, disparate treatment and disparate impact) into the design of well-known classifiers, we introduce a mechanism, decision boundary covariance, that can be included in the formulation of any convex boundary-based classifier in the form of convex constraints. Second, we propose alternative measures of discrimination. Our first proposed measure, disparate mistreatment, is useful in situations when unbiased ground truth training data is available. The other two measures, preferred treatment and preferred impact, are useful in situations when feature and class distributions of different social groups are significantly different, and can additionally help reduce the cost of nondiscrimination (as compared to the existing measures). We also design mechanisms to incorporate these new measures into the design of convex boundary-based classifiers.
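The decision boundary covariance measure can be sketched as follows; constraining its absolute value during training is what keeps the fairness requirement convex (the toy data below is purely illustrative):

```python
def boundary_covariance(theta, X, z):
    # Empirical covariance between a binary sensitive attribute z and the
    # signed distance theta^T x of each point to the decision boundary.
    # Training with the constraint |covariance| <= c limits how strongly
    # the boundary aligns with the sensitive attribute.
    n = len(X)
    z_mean = sum(z) / n
    distances = [sum(t * xi for t, xi in zip(theta, x)) for x in X]
    return sum((zi - z_mean) * d for zi, d in zip(z, distances)) / n

X = [(1, 0), (-1, 0), (2, 0), (-2, 0)]
z = [1, 0, 1, 0]
# A boundary aligned with the sensitive attribute has high covariance...
biased = boundary_covariance((1.0, 0.0), X, z)
# ...while an orthogonal boundary has none.
fair = boundary_covariance((0.0, 1.0), X, z)
```

Because the covariance is linear in the classifier parameters theta, adding it as a constraint preserves the convexity of the underlying training problem.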
Ubiquitous Head-Mounted Gaze Tracking
(Advisor: Prof. Antonio Krüger)
Thursday, 24.01.2019, 15:00 h, in building D3 2, „Reuse“ meeting room
Recently, gaze-based interfaces in stationary settings have become a viable option, as remote eye trackers are easily available at a low price. However, making gaze-based interfaces ubiquitous remains an open challenge that cannot be resolved using remote approaches. The miniaturization in camera technology allows for gaze estimation with head-mounted devices, which are similar to a pair of glasses. A crucial step towards gaze estimation using such devices is calibration. Although the development is mainly technology driven, a hypothetical fully calibrated system in which all parameters are known and valid is affected by calibration drift. In addition, attributes like minimal intrusiveness and obstruction, easy and flexible setup, and high accuracy are not present in the latest commercial devices. Hence, their applicability to spontaneous interaction or long-lasting research experiments is questionable.
In this thesis we enable spontaneous, calibration-free and accurate mobile gaze estimation. We contribute by investigating the following three areas: Efficient Long-Term Usage of a head-mounted eye tracker; Location, Orientation & Target Independent head mounted eye tracking; and Mobile & Accurate Calibration-Free gaze estimation. Through the elaboration of the theoretical principles, we investigate novel concepts for each aspect; these are implemented and evaluated, converging into a final device, to show how to overcome current limitations.