PhD Thesis Defenses
PhD thesis defenses are a public affair and open to anyone who is interested. Attending them is a great way to get to know the work your peers are doing in the various research groups. On this page you will find a list of upcoming and past defense talks.
Please go here for electronic access to most of the doctoral dissertations from Saarbrücken Computer Science going back to about 1990.
Automatic Detection of Dementia and related Affective Disorders through Processing of Speech and Language
(Advisor: Prof. Antonio Krüger)
Friday, 24.03.23, 15:00 h, building D3 2, DFKI, ViS Room (SB-1.61)
In 2019, dementia became a trillion-dollar disorder. Alzheimer’s disease (AD) is a type of dementia in which the main observable symptom is a decline in cognitive functions, notably memory, as well as language and problem-solving. Experts agree that early detection is crucial to effectively develop and apply interventions and treatments, underlining the need for effective and pervasive assessment and screening tools. This thesis explores how computational techniques can be used to process speech and language samples produced by patients suffering from dementia or related affective disorders, with the aim of automatically detecting them in large populations using machine learning models. A strong focus is laid on the detection of early-stage dementia (MCI), as most clinical trials today focus on intervention at this level. To this end, novel automatic and semi-automatic analysis schemes for a speech-based cognitive task, i.e., verbal fluency, are explored and evaluated as an appropriate screening task. Due to a lack of available patient data in most languages, world-first multilingual approaches to detecting dementia are introduced in this thesis. Results are encouraging, and clear benefits on a small French dataset become visible. Lastly, the task of detecting those people with dementia who also suffer from an affective disorder called apathy is explored. Since they are likely to progress to later stages of dementia faster, it is crucial to identify them. These are the first experiments that consider this task using solely speech and language as inputs. Results are again encouraging, using either only speech or only language data elicited with emotional questions. Overall, strong results encourage further research in establishing speech-based biomarkers for early detection and monitoring of these disorders to better patients’ lives.
Designing Tactile Experiences for Immersive Virtual Environments
(Advisor: Prof. Antonio Krüger)
Tuesday, 21.03.23, 14:00 h, building D3 2, Reuse meeting room
Designing for the sense of touch is essential in creating convincing and realistic experiences in Virtual Reality (VR). Currently, a variety of methods exist for simulating touch experiences. However, developing effective and convincing haptic feedback still remains challenging. In this work, we study how real-world touch experiences can inform haptic design processes for VR. Firstly, we investigate the reproduction of haptic features by capturing and fabricating surface microgeometry. We show that haptic reproduction is able to create a wide range of feel aesthetics. Furthermore, we build upon procedural design by generating and fabricating haptically-varying surface structures. We show that digital design processes are able to generate flexible and universal structures that directly influence tactile dimensions, such as roughness and hardness. Lastly, we investigate correspondences between different sensory modalities to enhance the design of tactile experiences. We show that vocal expressions can translate a designer’s intent into effective haptic feedback, while providing a rapid in-situ design process. This thesis advances the fields of VR, haptic design, and fabrication by contributing knowledge to the question of how effective tactile experiences can be designed.
Anna HAKE (née Feldmann)
Predicting and analyzing HIV-1 adaptation to broadly neutralizing antibodies and the host immune system using machine learning
(Advisor: Prof. Nico Pfeifer, now Uni Tübingen)
Monday, 20.03.23, 14:00 h, building E1 4, Rm 0.24
With neither a cure nor a vaccine at hand, infection with the human immunodeficiency virus type 1 (HIV-1) is still a major global health threat. Viral control is usually gained using lifelong therapy with antiretroviral drugs and rarely by the immune system alone. Without drug exposure, interindividual differences in viral control are partly influenced by host genetic factors like the human leukocyte antigen (HLA) system, and viral genetic factors like the predominant coreceptor usage of the virus. Thanks to its extraordinarily high mutation and replication rate, HIV-1 is, however, able to rapidly adapt to the selection pressure imposed by the host immune system or antiretroviral drug exposure.
For a successful control of the virus, it is thus vital to have fast and reliable methods in place that assess the viral adaptation to drugs of interest prior to their (further) administration. For a better assessment of our ability to control the virus, it is also important to estimate the viral adaptation to the host immune system.
In this talk, I will present four studies, all aiming to further our understanding of HIV-1 adaptation and our ability to reliably predict it. In particular, we present an SVM approach to predict HIV adaptation to broadly neutralizing antibodies (bNAbs), a promising new treatment option. In addition, we use statistical learning to further characterize antibody-mediated therapy with the promising bNAb 3BNC117 by investigating its ability (i) to suppress the virus and (ii) to boost the immune system. Finally, I will introduce a novel way to predict HIV-1 adaptation to the host immune system using Bayesian generalized linear mixed models, which allowed us to investigate the relationship between HIV-1 coreceptor usage and its adaptation to the host HLA system.
Bharat Lal Bhatnagar
Modelling 3D Humans: Pose, Shape, Clothing and Interactions
(Advisor: Prof. Gerard Pons-Moll, now Uni Tübingen)
Thursday, 16.03.23, 18:00 h, building E1 4, Rm 0.24
Digital humans are increasingly becoming a part of our lives with applications like animation, gaming, virtual try-on, Metaverse and much more. In recent years there has been a great push to make our models of digital humans as real as possible. In this thesis we present methodologies to model two key characteristics of real humans, their “appearance” and “actions”. To this end, we discuss several questions, among them: What are the best representations for humans, clothing and their interactions with their surroundings? How can we extract human appearance cues like pose, shape and clothing from scans, point clouds and images? How can we capture and, in turn, model human-object interaction?
On a Notion of Abduction and Relevance for First-Order Logic Clause Sets
(Advisors: Prof. Christoph Weidenbach and Dr. Sophie Tourret)
Thursday, 09.03.23, 14:00 h, building E1 4, Rm 0.24
I propose techniques to help explain entailment and non-entailment in first-order logic. For entailment, I classify clauses as necessary for any possible deduction (syntactically relevant), usable for some deduction (syntactically semi-relevant), or unusable (syntactically irrelevant), along with a semantic characterization via conflict literals (contradictory simple facts). This offers a novel insight beyond the existing notion of minimal unsatisfiable set. The need to test if a clause is syntactically semi-relevant leads to a generalization of the completeness result of a well-known resolution strategy: resolution with the set-of-support (SOS) strategy is refutationally complete on a clause set N and SOS M if and only if there is a resolution refutation from N ∪ M using a clause in M. For non-entailment, abductive reasoning helps find extensions of a knowledge base to obtain an entailment of some missing consequence. I focus on EL TBox abduction, which is lightweight but prevalent in practice. The solution space can be huge, so to help separate the wheat from the chaff, I introduce connection-minimality, a criterion such that accepted hypotheses always immediately relate the observation to the given axioms. I show that such hypotheses are computable using prime implicate-based abduction in first-order logic. I evaluate this notion on ontologies from the medical domain using an implementation with SPASS as a prime implicate generation engine.
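The SOS completeness statement above can be illustrated on propositional clauses. The following is only a minimal sketch under stated assumptions, not the thesis's first-order procedure or its SPASS-based implementation: clauses are sets of non-zero integers (a negative number is a negated atom), and the function names `resolvents` and `sos_refutes` are hypothetical. Every resolution step takes at least one parent from the set of support or its descendants.

```python
def resolvents(c1, c2):
    """All binary resolvents of two propositional clauses (frozensets of ints)."""
    out = set()
    for lit in c1:
        if -lit in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {-lit})))
    return out


def sos_refutes(n, m):
    """True iff SOS resolution derives the empty clause from N with set of support M.

    Given-clause loop: each step resolves a clause from the set of support
    (or one of its descendants) against any clause seen so far.
    """
    usable = {frozenset(c) for c in n}   # clauses of N, never selected themselves
    sos = {frozenset(c) for c in m}      # set of support, still to be processed
    processed = set()                    # SOS clauses already selected
    if frozenset() in sos:
        return True
    while sos:
        given = sos.pop()
        processed.add(given)
        for other in usable | processed:
            for r in resolvents(given, other):
                if not r:                # empty clause: refutation found
                    return True
                if r not in processed and r not in sos:
                    sos.add(r)
    return False


# Atoms: p = 1, q = 2.
# N is unsatisfiable on its own, but no refutation can use the clause {q}:
print(sos_refutes([{1}, {-1}], [{2}]))            # False
# N is again unsatisfiable, yet a refutation using {not q} exists:
print(sos_refutes([{1}, {-1}, {-1, 2}], [{-2}]))  # True
```

In both examples N alone is unsatisfiable, so the classical precondition (N satisfiable) does not apply; whether the SOS strategy succeeds depends on whether some refutation uses a clause from M, which is what the generalized statement says.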
Towards Enabling Cross-layer Information Sharing to Improve Today’s Content Delivery Systems
(Advisor: Prof. Anja Feldmann)
Thursday, 02.03.23, 15:00 h, building E1 4, Rm 0.24
Content is omnipresent and without content the Internet would not be what it is today. End users consume content throughout the day, from checking the latest news on Twitter in the morning, to streaming music in the background (while working), to streaming movies or playing online games in the evening, and to using apps (e.g., sleep trackers) even while they sleep at night. All of these different kinds of content have very specific and different requirements on a transport: on one end, online gaming often requires a low-latency connection but needs little throughput, and, on the other, streaming a video requires high throughput, but performs quite poorly under packet loss. Yet, all content is transferred opaquely over the same transport, adhering to a strict separation of network layers. Even a modern transport protocol such as Multi-Path TCP, which is capable of utilizing multiple paths, cannot take the (above) requirements or needs of that content into account for its path selection. In this work we challenge the layer separation and show that sharing information across the layers is beneficial for consuming web and video content. To this end, we created an event-based simulator for evaluating how applications can make informed decisions about which interfaces to use for delivering different content based on a set of pre-defined policies that encode the (performance) requirements or needs of that content. Our policies achieve speedups of a factor of two in 20% of our cases, have benefits in more than 50%, and create no overhead in any of the cases. For video content we created a full streaming system that allows an even finer-grained information sharing between the transport and the application. Our streaming system, called VOXEL, enables applications to select dynamically and on a frame granularity which video data to transfer based on the current network conditions.
VOXEL drastically reduces video stalls at the 90th percentile by up to 97% while not sacrificing the stream’s visual fidelity. We confirmed our performance improvements in a real-user study where 84% of the participants clearly preferred watching videos streamed with VOXEL over the state-of-the-art.
Hazard-Free Clock Synchronization
(Advisor: Dr. Christoph Lenzen)
Tuesday, 28.02.23, 13:00 h, building E1 4, Rm 0.24
The growing complexity of microprocessors makes it infeasible to distribute a single clock source over the whole processor with small clock skew. Hence, chips are split into multiple clock regions, which are each covered by a single clock source. This poses a problem for communication between these clock regions. Clock synchronization algorithms promise an advantage over state-of-the-art solutions, such as GALS systems. When clock regions are synchronous, the communication latency improves significantly over handshake-based solutions. We focus on the implementation of clock synchronization algorithms.
Hazardous signals are a major obstacle when implementing circuits on clock domain crossings. By extending Boolean logic with a third value ‘u’, we can formally define hazards. In this thesis we describe a theory for design and analysis of hazard-free circuits. We develop strategies for hazard-free encoding and construction of hazard-free circuits from finite state machines. Furthermore, we discuss clock synchronization algorithms and a possible combination of them.
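To make the three-valued view of hazards concrete, here is a small sketch. It assumes one common formalisation, which may differ in detail from the definitions used in the thesis: gates are evaluated in Kleene's three-valued logic, and a circuit has a hazard at an input if it outputs ‘u’ although every way of replacing the ‘u’ inputs by 0 or 1 yields the same Boolean output. The multiplexer circuits and all function names are illustrative choices.

```python
from itertools import product

U = "u"  # third value: an unstable or unknown signal


def t_not(x):
    return U if x == U else 1 - x


def t_and(x, y):
    if x == 0 or y == 0:
        return 0          # a stable 0 forces the output regardless of u
    if x == 1 and y == 1:
        return 1
    return U


def t_or(x, y):
    if x == 1 or y == 1:
        return 1          # a stable 1 forces the output regardless of u
    if x == 0 and y == 0:
        return 0
    return U


def mux_naive(a, b, s):
    """Multiplexer: outputs a when s = 0 and b when s = 1."""
    return t_or(t_and(a, t_not(s)), t_and(b, s))


def mux_hazard_free(a, b, s):
    """Same Boolean function, with the extra term (a AND b) added."""
    return t_or(mux_naive(a, b, s), t_and(a, b))


def resolutions(inputs):
    """All Boolean inputs obtained by replacing each u with 0 or 1."""
    return product(*[(0, 1) if x == U else (x,) for x in inputs])


def has_hazard(circuit, n_inputs):
    """True if some input yields u although all its resolutions agree."""
    for inputs in product((0, 1, U), repeat=n_inputs):
        outputs = {circuit(*r) for r in resolutions(inputs)}
        if len(outputs) == 1 and circuit(*inputs) == U:
            return True
    return False


print(mux_naive(1, 1, U))              # u: hazard, both data inputs are 1
print(mux_hazard_free(1, 1, U))        # 1
print(has_hazard(mux_naive, 3))        # True
print(has_hazard(mux_hazard_free, 3))  # False
```

The naive multiplexer propagates the unstable select signal even when both data inputs agree; the redundant term masks it, at the cost of an additional gate.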
Said Jawad SAIDI
Characterizing the IoT Ecosystem at Scale
(Advisor: Prof. Anja Feldmann)
Friday, 24.02.23, 15:00 h, building E1 4, Rm 0.24
Internet of Things (IoT) devices are extremely popular with home, business, and industrial users. To provide their services, they typically rely on a backend server infrastructure on the Internet; together, devices and backend form the IoT ecosystem. This ecosystem is rapidly growing and offers users an increasing number of services. It has also been a source and target of significant security and privacy risks. Notable examples are recent large-scale coordinated global attacks, like Mirai, which disrupted large service providers. Thus, characterizing this ecosystem yields insights that help end-users, network operators, policymakers, and researchers better understand it, obtain a detailed view, and keep track of its evolution. In addition, they can use these insights to inform their decision-making process for mitigating this ecosystem’s security and privacy risks. In this dissertation, we characterize the IoT ecosystem at scale by (i) detecting the IoT devices in the wild, (ii) conducting a case study to measure how deployed IoT devices can affect users’ privacy, and (iii) detecting and measuring the IoT backend infrastructure.
To conduct our studies, we collaborated with a large European Internet Service Provider (ISP) and a major European Internet eXchange Point (IXP). They routinely collect large volumes of passive, sampled data, e.g., NetFlow and IPFIX, for their operational purposes. These data sources help providers obtain insights about their networks, and we used them to characterize the IoT ecosystem at scale.
We start with IoT devices and study how to track and trace their activity in the wild. We developed and evaluated a scalable methodology to accurately detect and monitor IoT devices with limited, sparsely sampled data in the ISP and IXP.
Next, we conduct a case study to measure how a myriad of deployed devices can affect the privacy of ISP subscribers. Unfortunately, we found that the privacy of a substantial fraction of IPv6 end-users is at risk. We noticed that a single device at home that encodes its MAC address into the IPv6 address could be utilized as a tracking identifier for the entire end-user prefix—even if other devices use IPv6 privacy extensions. Our results showed that IoT devices contribute the most to this privacy leakage.
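The leak described above can be sketched in a few lines. The example assumes the modified EUI-64 interface identifier format from RFC 4291 (the MAC address is split in half, `ff:fe` is inserted in the middle, and the universal/local bit of the first octet is flipped); the helper names and the documentation-prefix addresses are hypothetical and not taken from the dissertation's measurement pipeline.

```python
import ipaddress


def mac_from_eui64(addr):
    """Return the MAC address embedded in an IPv6 address, or None.

    Only the lower 64 bits (the interface identifier) are inspected.
    """
    iid = int(ipaddress.IPv6Address(addr)) & ((1 << 64) - 1)
    b = iid.to_bytes(8, "big")
    if b[3:5] != b"\xff\xfe":          # marker inserted by modified EUI-64
        return None
    mac = bytes([b[0] ^ 0x02]) + b[1:3] + b[5:8]  # undo the flipped U/L bit
    return ":".join(f"{octet:02x}" for octet in mac)


def same_device(addr_a, addr_b):
    """True if both addresses embed the same MAC address."""
    mac_a, mac_b = mac_from_eui64(addr_a), mac_from_eui64(addr_b)
    return mac_a is not None and mac_a == mac_b


# The same device seen under two different subscriber prefixes:
before = "2001:db8:1:2:211:22ff:fe33:4455"
after = "2001:db8:aaaa:bbbb:211:22ff:fe33:4455"
# An address with a randomized interface identifier:
private = "2001:db8:aaaa:bbbb:a1b2:c3d4:e5f6:718"

print(mac_from_eui64(before))       # 00:11:22:33:44:55
print(same_device(before, after))   # True
print(mac_from_eui64(private))      # None
```

Because the interface identifier stays constant when the prefix changes, an observer can link the old and new prefix through this one device, and with it every other device behind the same prefix, including those using randomized identifiers.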
Finally, we focus on the backend server infrastructure and propose a methodology to identify and locate IoT backend servers operated by cloud services and IoT vendors. We analyzed their IoT traffic patterns as observed in the ISP. Our analysis sheds light on their diverse operational and deployment strategies.
The need for issuing a priori unknown network-wide queries against large volumes of network flow capture data, which we used in our studies, motivated us to develop Flowyager. It is a system built on top of existing traffic capture utilities, and it relies on flow summarization techniques to reduce (i) the storage and transfer cost of flow captures and (ii) query response time. We deployed a prototype of Flowyager at both the IXP and ISP.
Learning from Imperfect Data: Incremental Learning and Few-shot Learning
(Advisor: Prof. Bernt Schiele)
Friday, 27.01.23, 16:30 h, building E1 4, Rm 0.24
In recent years, artificial intelligence (AI) has achieved great success in many fields. Although impressive advances have been made, AI algorithms still suffer from an important limitation: they rely on static and large-scale datasets. In contrast, human beings naturally possess the ability to learn novel knowledge from imperfect real-world data such as a small number of samples or a non-static continual data stream. Attaining such an ability is particularly appealing and will push AI models one step further toward human-level intelligence. In this talk, I will present my work on addressing these challenges in the context of class-incremental learning and few-shot learning. Specifically, I will first discuss how to get better exemplars for class-incremental learning based on optimization. I parameterize exemplars and optimize them in an end-to-end manner to obtain high-quality memory-efficient exemplars. I will then present my work on how to apply incremental techniques to a more challenging and realistic scenario, object detection, including the algorithm design of a transformer-based incremental object detection framework. Finally, I will briefly mention my work on addressing other challenges and discuss future research directions.
Mechanised Metamathematics: An Investigation of First-Order Logic and Set Theory in Constructive Type Theory
(Advisor: Prof. Gert Smolka)
Friday, 27.01.23, 15:15 h, building E1 1, Rm 4.07
In this thesis, we investigate several key results in the canon of metamathematics, applying the contemporary perspective of formalisation in constructive type theory and mechanisation in the Coq proof assistant. Concretely, we consider the central completeness, undecidability, and incompleteness theorems of first-order logic as well as properties of the axiom of choice and the continuum hypothesis in axiomatic set theory. Due to their fundamental role in the foundations of mathematics and their technical intricacies, these results have a long tradition in the codification as standard literature and, in more recent investigations, increasingly serve as a benchmark for computer mechanisation.
With the present thesis, we continue this tradition by uniformly analysing the aforementioned cornerstones of metamathematics in the formal framework of constructive type theory. This programme offers novel insights into the constructive content of completeness, a synthetic approach to undecidability and incompleteness that largely eliminates the notorious tedium obscuring the essence of their proofs, as well as natural representations of set theory in the form of a second-order axiomatisation and of a fully type-theoretic account. The mechanisation concerning first-order logic is organised as a comprehensive Coq library open to usage and contribution by external users.
Following the trail of cellular signatures: Computational methods for the analysis of molecular high-throughput profiles
(Advisor: Prof. Hans-Peter Lenhof)
Friday, 13.01.23, 11:00 h, building E2 1, Rm 007
Over the last three decades, high-throughput techniques, such as next-generation sequencing, microarrays, or mass spectrometry, have revolutionized biomedical research by enabling scientists to generate detailed molecular profiles of biological samples on a large scale. These profiles are usually complex, high-dimensional, and often prone to technical noise, which makes a manual inspection practically impossible. Hence, powerful computational methods are required that enable the analysis and exploration of these data sets and thereby help researchers to gain novel insights into the underlying biology.
In this thesis, we present a comprehensive collection of algorithms, tools, and databases for the integrative analysis of molecular high-throughput profiles. We developed these tools with two primary goals in mind: the detection of deregulated biological processes in complex diseases, like cancer, and the identification of driving factors within those processes.
Our first contribution in this context is a set of major extensions of the GeneTrail web service that make it one of the most comprehensive toolboxes for the analysis of deregulated biological processes and signaling pathways. GeneTrail offers a collection of powerful enrichment and network analysis algorithms that can be used to examine genomic, epigenomic, transcriptomic, miRNomic, and proteomic data sets. In addition to approaches for the analysis of individual -omics types, our framework also provides functionality for the integrative analysis of multi-omics data sets, the investigation of time-resolved expression profiles, and the exploration of single-cell experiments. Besides the analysis of deregulated biological processes, we also focus on the identification of driving factors within those processes, in particular, miRNAs and transcriptional regulators.
For miRNAs, we created the miRNA pathway dictionary database miRPathDB, which compiles links between miRNAs, target genes, and target pathways. Furthermore, it provides a variety of tools that help to study associations between them. For the analysis of transcriptional regulators, we developed REGGAE, a novel algorithm for the identification of key regulators that have a significant impact on deregulated genes, e.g., genes that show large expression differences in a comparison between disease and control samples. To analyze the influence of transcriptional regulators on deregulated biological processes we also created the RegulatorTrail web service. In addition to REGGAE, this tool suite compiles a range of powerful algorithms that can be used to identify key regulators in transcriptomic, proteomic, and epigenomic data sets.
Moreover, we evaluate the capabilities of our tool suite through several case studies that highlight the versatility and potential of our framework. In particular, we used our tools to conduct a detailed analysis of a Wilms’ tumor data set. Here, we could identify a circuitry of regulatory mechanisms, including new potential biomarkers, that might contribute to the blastemal subtype’s increased malignancy, which could potentially lead to new therapeutic strategies for Wilms’ tumors.
In summary, we present and evaluate a comprehensive framework of powerful algorithms, tools, and databases to analyze molecular high-throughput profiles. The provided methods are of broad interest to the scientific community and can help to elucidate complex pathogenic mechanisms.