Accurate Model Compression at GPT Scale

A key barrier to the wide deployment of highly-accurate machine learning models, whether for language or vision, is their high computational and memory overhead. Although we possess the mathematical tools for highly-accurate compression of such models, these theoretically-elegant techniques require second-order information about the model’s loss function, which is hard to even approximate efficiently at the scale of billion-parameter models. In this talk, I will describe our work on bridging this computational divide, which enables accurate second-order pruning and quantization of models at truly massive scale. Compressed using our techniques, models with billions and even trillions of parameters can be executed efficiently on a few GPUs, with significant speedups and negligible accuracy loss. Models created using our techniques have been downloaded millions of times from open-source repositories such as HuggingFace.
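The flavor of second-order saliency behind such pruning can be illustrated on a toy diagonal-Hessian model, in the spirit of classic Optimal Brain Damage. This is a hedged sketch for intuition only, not the speaker's actual method; all names and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy quadratic loss L(w) = 0.5 * sum_i H_ii * (w_i - w*_i)^2 around trained weights w*.
n = 8
H_diag = rng.uniform(0.1, 10.0, size=n)   # diagonal Hessian approximation
w = rng.normal(0.0, 1.0, size=n)          # trained weights

# OBD-style saliency: under the local quadratic model, zeroing weight i
# increases the loss by approximately 0.5 * H_ii * w_i^2.
saliency = 0.5 * H_diag * w**2

def prune(weights, scores, k):
    """Zero out the k weights with the smallest scores."""
    out = weights.copy()
    out[np.argsort(scores)[:k]] = 0.0
    return out

def loss_increase(pruned):
    d = pruned - w
    return 0.5 * np.sum(H_diag * d**2)

k = 4
second_order = prune(w, saliency, k)       # second-order pruning
magnitude = prune(w, np.abs(w), k)         # magnitude baseline

print(loss_increase(second_order) <= loss_increase(magnitude))  # True under this model
```

Because the saliency is exactly the per-weight loss increase of this quadratic model, removing the lowest-saliency weights provably minimizes the damage here, which is why magnitude pruning alone can be suboptimal when curvature varies across weights.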

Dan Alistarh (Institute of Science and Technology Austria)

Dan Alistarh is a Professor at ISTA. Previously, he was a Visiting Professor at MIT, a Researcher at Microsoft, and a Postdoc at MIT CSAIL, and he received his PhD from EPFL. His research is on algorithms for efficient machine learning and high-performance computing, with a focus on scalable DNN inference and training, for which he was awarded ERC Starting and Proof-of-Concept Grants. In his spare time, he works with the ML research team at Neural Magic, a startup based in Boston, on making compression faster, more accurate, and accessible to practitioners.

Interactive Machine Learning with Graph-Structured Data

In this talk I will give an overview of our contributions to what I call interactive machine learning. Often, interaction in computer science is interpreted as the interaction of humans with the computer, but I intend a broader meaning: the interaction of machine learning algorithms with the real world, including but not restricted to humans. Interactions with humans span a broad range: they can be intentional and guided by the human, or they can be guided by the computer such that the human is oblivious to being guided. Another example of an interaction with the real world is the use of machine learning algorithms in cyclic discovery processes such as drug design. Important properties of interactive machine learning algorithms include efficiency, effectiveness, responsiveness, and robustness. In the talk I will show how these can be achieved in a variety of interactive contexts, focusing on graph-structured data.

Thomas Gärtner (TU Wien)

Thomas Gärtner has been Professor of Machine Learning at TU Wien since 2019. From 2015 to 2019 he was Full Professor of Data Science at the University of Nottingham. Before that, he led a research group jointly hosted by the University of Bonn and Fraunhofer IAIS, during which time he received an award in the Emmy Noether Programme of the DFG. His main area of research is computationally efficient and effective machine learning algorithms with theoretical guarantees and practical demonstrations in real-world applications. He has been awarded grants from the DFG, UKRI, FFG, WWTF, EU, and FWF. He has given tutorials at ICML and ECML PKDD, and was program co-chair of ECML PKDD and of more than 10 international workshops. He has been an editor for the Machine Learning journal since 2006 and a board member of the CAIML at TU Wien since 2021, and has regularly served as Area Chair for ECML PKDD, NeurIPS, and ICML.

Submodular optimization and interpretable machine learning

Submodular functions characterize the diminishing-returns property, which appears in many application areas, including information summarization, sensor placement, viral marketing, and more. Optimizing submodular functions has a rich history in mathematics and operations research, and recently the subject has received increased attention due to the prevalent role of submodular functions in data-science applications. In this talk we will discuss two recent projects on the topic of interpretable classification, both of which make interesting connections with submodular optimization. In the first project, we address the problem of multi-label classification via concise and discriminative rule sets. Submodularity is used to account for diversity, which helps avoid redundancy and thus control the number of rules in the solution set. In the second project we aim to find accurate decision trees that have small size and are thus interpretable. We study a general family of impurity functions, including the popular entropy and Gini-index functions, and show that a simple enhancement, relying on the framework of adaptive submodular ranking, yields a logarithmic approximation guarantee on the tree complexity.
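Diminishing returns, and the classic greedy (1 − 1/e)-approximation for maximizing a monotone submodular function under a cardinality constraint, can be sketched on a toy coverage function. This is an illustrative example, not the rule-selection method from the talk; all names are made up:

```python
from itertools import chain

# Toy coverage instance: each candidate (think: rule or sensor) covers a set
# of items. f(S) = |union of covered items| is monotone and submodular.
cover = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
    "d": {1, 6},
}

def f(S):
    """Coverage of a list of candidates."""
    return len(set(chain.from_iterable(cover[x] for x in S)))

def greedy(k):
    """Greedy selection: repeatedly add the candidate with the largest
    marginal gain; a (1 - 1/e)-approximation for monotone submodular f."""
    S = []
    for _ in range(k):
        best = max((x for x in cover if x not in S),
                   key=lambda x: f(S + [x]) - f(S))
        S.append(best)
    return S

S = greedy(2)
print(S, f(S))
```

Diminishing returns means the marginal gain of any candidate can only shrink as the selected set grows, e.g. "b" gains 2 items on its own but only 1 once "a" is chosen; this is exactly the structure the greedy analysis exploits.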

Aristides Gionis (KTH Royal Institute of Technology)

Aristides Gionis is a WASP professor at KTH Royal Institute of Technology, Sweden, and an adjunct professor at Aalto University, Finland. He obtained his PhD from Stanford University, USA, and he has been a senior research scientist at Yahoo! Research. He has contributed to several areas of data science, such as data clustering and summarization, graph mining and social-network analysis, analysis of data streams, and privacy-preserving data mining. His current research is funded by the Wallenberg AI, Autonomous Systems and Software Program (WASP) and by the European Commission through an ERC Advanced Grant (REBOUND) and the project SoBigData++.

An Update to Dynamic Graph Algorithms

From a procedural viewpoint, an algorithm is a list of instructions that transforms a given input into the desired output. In many scenarios, however, the input is not given to the algorithm in its entirety from the start and might undergo changes that the algorithm needs to react to. Dynamic algorithms are a class of algorithms that are specifically designed to quickly adapt their output after an update to the input data. In recent years, near-optimal dynamic algorithms have been designed for many "textbook" problems, and a successful research program has been established around using dynamic algorithms to speed up static algorithms. However, these efforts have arguably created little real-world impact. In this talk, I propose a systematic focus shift for dynamic algorithms towards problems relevant to practitioners in domains like network science and machine learning. As a case study, I present recent work on dynamic k-center clustering on evolving graphs: we give the first efficient approximation algorithms for updating the k-center clustering of a graph undergoing edge insertions and/or deletions.
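To make the setting concrete, here is the naive baseline in a simple metric setting: Gonzalez's farthest-point traversal, a classic 2-approximation for k-center, recomputed from scratch after every update. This is a hedged illustration of the problem only, not the dynamic algorithms from the talk, which are precisely about avoiding such recomputation (and work on graphs rather than point sets):

```python
import math

def gonzalez(points, k):
    """Farthest-point traversal: pick an arbitrary first center, then
    repeatedly add the point farthest from the current centers.
    Classic 2-approximation for metric k-center."""
    centers = [points[0]]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(far)
    return centers

def radius(points, centers):
    """Largest distance from any point to its nearest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)

pts = [(0, 0), (1, 0), (10, 0), (10, 1)]
centers = gonzalez(pts, k=2)
print(centers, radius(pts, centers))

# An "update" arrives: a new point is inserted. The static baseline must
# recompute everything; a dynamic algorithm would adjust the clustering fast.
pts.append((20, 0))
centers = gonzalez(pts, k=2)
print(centers, radius(pts, centers))
```

Recomputing after each of m updates costs m full runs, which is exactly the overhead that efficient update procedures for evolving inputs aim to eliminate.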

Sebastian Forster (Paris Lodron University of Salzburg)

Sebastian Forster is a professor at the Department of Computer Science of the Paris Lodron University of Salzburg, where he performs basic research in the areas of distributed and dynamic algorithms. Sebastian completed his PhD under the supervision of Monika Henzinger at the University of Vienna in 2015. His thesis on dynamic graph algorithms was awarded the Heinz Zemanek Award of the Austrian Computer Society. He joined the Paris Lodron University of Salzburg in 2017 and received an ERC Starting Grant in 2020. Over the course of his career, he went on research stays at Microsoft Research in Mountain View (2014), the Simons Institute for the Theory of Computing at UC Berkeley (2015), the Max Planck Institute for Informatics in Saarbrücken (2016), and Google Research in Zurich (2023).

Inequality and Fairness in social networks and algorithms

While algorithms promise many benefits, including efficiency, objectivity, and accuracy, they may also introduce or amplify biases. In this talk, I show how biases in our social networks are fed into and amplified by ranking and recommender systems. Drawing from social theories and the fairness literature, I argue that biases in social connections need to be taken into consideration when designing people recommender systems.

Fariba Karimi (TU Graz)

Fariba Karimi is a data scientist who develops mathematical and computational models to study inequalities in socio-technical networks and algorithms. She is currently Full Professor of Data Science at the Faculty of Computer Science and Biomedical Engineering at Graz University of Technology.
Fariba Karimi received her doctorate from Umeå University in 2015. She then spent four years as a researcher in the computational social science department at the Leibniz Institute for the Social Sciences in Cologne, Germany. Since March 2021, she has been the lead of the “Network Inequality” group at the Complexity Science Hub in Vienna. Before joining TU Graz, she also served as a tenure-track professor at the Department of Computer Science at the Vienna University of Technology. In 2023, she received the prestigious Young Scientist Award from the German Physical Society for her contributions to modeling minorities and inequalities in networks.

Making Sense of Large Networks

A graph is a very simple concept, consisting only of nodes and edges. Yet real-world graphs exhibit a surprisingly rich set of patterns. For instance, nodes can form groups with different characteristics, e.g., groups of densely connected nodes, or groups of nodes that form bipartite or tree-like patterns. Nodes can play different structural roles, such as hubs, spokes, and bridges. Graphs are usually represented by adjacency matrices or edge lists. Even for a small graph, it is hard to understand its patterns from such a representation alone; for large real-world social or biological networks, it is impossible. Therefore, we need methods for representation learning, clustering, and summarization in order to make sense of networks. This talk covers some recent approaches, ranging from methods that follow an information-theoretic objective to deep representation learning approaches.
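As a minimal illustration of these two standard representations (a hypothetical toy example, not from the talk): even the simplest structural role, a hub, is invisible in the raw edge list until we compute something from it, which is the basic motivation for summarization and representation learning.

```python
import numpy as np

# The same small undirected graph in two standard representations.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # edge list
n = 4

A = np.zeros((n, n), dtype=int)            # adjacency matrix
for u, v in edges:
    A[u, v] = A[v, u] = 1                  # symmetric, since undirected

degrees = A.sum(axis=1)                    # simplest possible node summary
print(degrees)                             # node 2 has the highest degree here
```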

Claudia Plant (University of Vienna)

Claudia Plant is a full professor and leader of the Data Mining and Machine Learning research group at the Faculty of Computer Science, University of Vienna, Austria. Her group focuses on new methods for exploratory data mining, e.g., clustering, anomaly detection, graph mining, and matrix factorization. Many approaches relate unsupervised learning to data compression: the better the found patterns compress the data, the more information we have learned. Other methods rely on finding statistically independent patterns or multiple non-redundant solutions, on ensemble learning, or on nature-inspired concepts such as synchronization. Indexing techniques and methods for parallel hardware support exploring massive data. Claudia Plant has co-authored over 150 peer-reviewed publications, among them more than 30 contributions to the top-level data mining conferences KDD and ICDM, and 4 Best Paper Awards. Papers on scalability aspects have appeared at SIGMOD and ICDE, and the results of interdisciplinary projects in leading application-related journals such as Bioinformatics, Cerebral Cortex, and Water Research.

32 Milliseconds to the Matterhorn - On in-situ sensing and process understanding of natural hazards and climate change in mountain areas

The high Alpine cryosphere is one of the physical environments most affected by the current climate crisis. Glacial retreat hit unprecedented values in 2022, and mass movements of increasing magnitude have been observed over the past decade. Observations and data from remote, hard-to-reach, and often hazardous mountain areas are traditionally scarce, limiting our ability to understand processes and devise models, e.g., for forecasting. The Global Climate Observing System (GCOS) as well as the Global Atmosphere Watch (GAW) have recently pointed out that high-altitude mountain environments are among the few “blind spots” on planet Earth where observations are scarce. With our research agenda of applying low-power wireless sensors and data-science techniques to long-lived environmental process monitoring and natural hazard mitigation applications, we have demonstrated that highly detailed observation records, of value for both basic research and decision making, can indeed be obtained. We will examine select examples from high-mountain areas and discuss how future combined sensory and robotic systems could make the difference, especially in extremely hard-to-monitor steep and high-altitude mountain areas.

Jan Beutel (University of Innsbruck)

Jan Beutel received his MSc and PhD in Electrical Engineering from the Swiss Federal Institute of Technology (ETH) Zurich in 2000 and 2005, respectively. He has worked at u-blox AG and spent time as a visiting researcher at the Berkeley Wireless Research Center. At ETH Zurich he headed a research group on networked embedded systems at the Computer Engineering and Networks Lab (TIK). In 2020, he joined the University of Innsbruck as a full Professor. In his research, Jan Beutel has pioneered the use of in-situ wireless sensors for long-lived environmental monitoring and natural hazard mitigation applications, especially in high-mountain areas, leading to many highly cited publications. The sensor networks and associated data on the slopes of the Matterhorn (CH) constitute the longest and densest data record in mountain permafrost research worldwide, feeding into both basic research and international climate monitoring and policy making.

Special Sessions

Young Experts: Minute Madness

In the "Young Experts: Minute Madness" session, excellent doctoral students in the field of computer science at Austrian universities will present their research work in a 1-minute overview talk, followed by a poster session.

  • Philipp Hofer (Johannes Kepler Universität Linz):

    Enhancing Privacy-Preserving Biometric Authentication through Decentralization

  • Timo Bertram (Johannes Kepler Universität Linz):

    Contrastive Learning for Game AI

  • Maximilian Heisinger (Johannes Kepler Universität Linz):

    Encoding, Solving, And Benchmarking for SAT and Extensions

  • Michael Holly (TU Graz):

    Exploring User Engagement in Virtual Learning Environments for STEM Education

  • Saeedeh Barzegar Khalilsaraei (TU Graz):

    Learning Subdivision Curves and Surfaces

  • Viet Man Le (TU Graz):

    Intelligent Techniques for Efficient Diagnostic Reasoning

  • Antonis Skarlatos (Paris Lodron University of Salzburg):

    Dynamic Algorithms for $k$-Clustering Problems

  • Charlotte Hoffmann (Institute of Science and Technology Austria):

    Theory and Applications of Verifiable Delay Functions

  • Konstantin Kueffner (Institute of Science and Technology Austria):

    Statistical Monitoring of Algorithmic Fairness in Stochastic Systems

  • Maximilian Thiessen (TU Wien):

    Iterative Learning in Graphs and Convexity Spaces

  • Petra Hozzová (TU Wien):

    Inductive Reasoning in Superposition

  • Matthias König (TU Wien):

    The Computational Cost and Benefit of Collective Attacks in Abstract Argumentation

  • Maximilian Vötsch (University of Vienna):

    Design and Implementation of Algorithms for Data Science

  • Sricharan Arunapuram Rangaramanujam (University of Vienna):

    Interplay of Differential Privacy and Dynamic Algorithms

  • Lena Bauer (University of Vienna):

    Mining High-Dimensional Data with Applications in Medicine

  • Fabian Mitterwallner (University of Innsbruck):

    Strong Undecidability Results for Termination of Rewriting

  • Andreas Peintner (University of Innsbruck):

    Sequential Recommendation: A Graph-based Perspective

  • Ramsha Ali (University of Klagenfurt):

    Artificial Intelligence Methods for Scheduling in Production and Supply Chain Management

  • Laura Waltersdorfer (WU Wien):

    Towards an Infrastructure for Continuous AI Auditability

  • Sajjad Khan (WU Wien):

    Towards an Adaptive, Trustworthy and Privacy Preserving Federated Learning System

New Professors Session

In the New Professors Session, recently hired professors in Austria will say "hi" and introduce themselves to the Austrian community. In short presentations, they provide insights into what drives them in their work and highlight their current and future research plans. We look forward to the following presentations: