Title: Trust in the Untrusted World: Private Access over Public
Infrastructures
Abstract: We live in interesting times: our digital lives have become increasingly interdependent and
interconnected. Such interconnections rely on a vast network of multiple actors whose
trustworthiness is not always guaranteed. Over the past three decades, rapid advances in
computing and communication technologies have provided billions of users with access to
information and connectivity at their fingertips. Unfortunately, this rapid digitization of our
personal lives has also left us vulnerable to invasions of privacy. In particular, we now have to worry
about the malicious intent of individual actors in the network as well as large and powerful
organizations such as service providers and nation states. Against the backdrop of this reality of the
untrusted world, we raise the following research questions: (i) Can we design a scalable
infrastructure for voice communication that will hide the knowledge of who is communicating
with whom? (ii) Can we design a scalable system for oblivious search for documents from public
repositories? (iii) Can we develop scalable solutions for private query processings over public
databases? These are some of the iconic problems that must be solved before we can embark
on building trusted platforms and services over untrusted infrastructures. In this talk, we present
a detailed overview of a system for voice communication that hides communication metadata
over fully untrusted infrastructures and scales to tens of thousands of users. We also note that
solutions to the above problems rely on an intermediary service provider. We conclude this talk
with an open question on the efficacy of the decentralized paradigm underlying cryptocurrencies,
which, applied in the broader context of our digital lives, could potentially eliminate the need for
an intermediary in provisioning trusted services and platforms.
Biography: Divy Agrawal is a Distinguished Professor and Chair of Computer Science at the University of California at
Santa Barbara. He also holds the Leadership Endowed Chair in the Department of Computer Science at
UCSB. He received BE(Hons) from BITS Pilani in Electrical Engineering and then received MS and PhD
degrees in Computer Science from State University of New York at Stony Brook. Since 1987, he has been
on the faculty of computer science at the University of California at Santa Barbara. His research expertise
is in the areas of databases, distributed systems, cloud computing, and big data infrastructures and
analysis. Over the course of his career, he has published more than 400 research articles and has
mentored approximately 50 PhD students. He serves as Editor-in-Chief of the Proceedings of the ACM on Management of Data and of the Springer journal Distributed and Parallel Databases
and has either served or is serving on several Editorial Boards including ACM Transactions on Databases,
IEEE Transactions on Data and Knowledge Engineering, ACM Transaction on Spatial Algorithms and
Systems, ACM Books, and the VLDB Journal. He served as a Trustee on the VLDB Endowment and is
currently serving as the Chair of ACM Special Interest Group on Management of Data (ACM SIGMOD). He
received a Gold Medal from BITS Pilani. Professor Agrawal is the recipient of the UCSB Academic Senate
Award for Outstanding Graduate Mentoring. He and his co-authors are recipients of best paper awards
(ICDE 2002, MDM 2011), influential paper (NDSS 2024), and test-of-time awards (ICDT, MDM). He is a
Fellow of the ACM, the IEEE, and the AAAS. https://sites.cs.ucsb.edu/~agrawal/
Title: Novel Algorithms for Linking Records
Abstract: Given multiple data sets, the problem of record linkage is to cluster their records such that each cluster contains all the information pertaining to a single entity and no other information. In this presentation we summarize some of the novel algorithms that we have recently created in the context of record linkage.
Blocking is a technique that is typically used to speed up record linkage algorithms. Recently, we have introduced a novel algorithm for blocking called SuperBlocking. We have created novel record linkage algorithms that employ SuperBlocking. Experimental comparisons reveal that our algorithms outperform state-of-the-art algorithms for record linkage. We have also developed parallel versions of our record linkage algorithms and they obtain close to linear speedups. We will provide details on these algorithms in this presentation.
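To make the idea of blocking concrete, the sketch below shows the standard blocking scheme that SuperBlocking improves upon: records are grouped by a blocking key, and expensive pairwise comparisons are performed only within a block. This is a generic illustration (the record format and key function are hypothetical), not the speakers' SuperBlocking algorithm.

```python
from collections import defaultdict
from itertools import combinations

def block_by_key(records, key):
    """Group records by a blocking key so that only records sharing
    the same key are ever compared (standard blocking)."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key(rec)].append(rec)
    return blocks

def candidate_pairs(blocks):
    """Yield the pairs of records that fall into the same block."""
    for recs in blocks.values():
        yield from combinations(recs, 2)

# Example: block person records on the first letter of the surname.
records = [
    {"name": "Smith, John"},
    {"name": "Smyth, Jon"},
    {"name": "Jones, Mary"},
]
blocks = block_by_key(records, key=lambda r: r["name"][0])
pairs = list(candidate_pairs(blocks))
# Only the two "S" records form a candidate pair; "Jones, Mary"
# is never compared against either of them.
```

With n records split into b roughly equal blocks, the number of comparisons drops from O(n^2) to roughly O(n^2 / b), which is where the speedup comes from.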
We can think of each record as a string of characters. Numerous distance metrics for strings can be found in the literature, and the performance of a record linkage algorithm may depend on the metric used. Popular examples include the edit distance (also known as the Levenshtein distance), the q-gram distance, and the Hausdorff distance. The Jaro distance is another popular metric that is widely used in applications such as record linkage. The best-known prior algorithms for computing the Jaro distance between two strings took quadratic time. Recently, we presented a linear-time algorithm for Jaro distance computation, which we will also summarize in this presentation.
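For reference, here is a sketch of the classical quadratic-time Jaro computation that the talk's linear-time algorithm improves upon; this is the textbook method, not the speakers' new algorithm, and the function name is our own. It computes the Jaro similarity in [0, 1] (the Jaro distance is one minus this value).

```python
def jaro(s1: str, s2: str) -> float:
    """Classical Jaro similarity: counts characters that match within a
    sliding window, then penalizes out-of-order (transposed) matches.
    Worst-case O(len(s1) * len(s2)) time."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two equal characters "match" only if their positions differ
    # by at most this window.
    window = max(len1, len2) // 2 - 1
    match1 = [False] * len1
    match2 = [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not match2[j] and s2[j] == c:
                match1[i] = match2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different
    # order in the two strings, counted in pairs.
    t = 0
    j = 0
    for i in range(len1):
        if match1[i]:
            while not match2[j]:
                j += 1
            if s1[i] != s2[j]:
                t += 1
            j += 1
    t //= 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3
```

On the standard example pair "MARTHA"/"MARHTA" there are 6 matches and 1 transposition, giving a similarity of (1 + 1 + 5/6) / 3 = 17/18.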
Biography: Sanguthevar Rajasekaran received his M.E. degree in Automation from the Indian Institute of Science (Bangalore) in 1983, and his Ph.D. degree in Computer Science from Harvard University in 1988. Currently he is the Director of the School of Computing, Board of Trustees Distinguished Professor, and Pratt & Whitney Chair Professor of CSE at the University of Connecticut. Before joining UConn, he served as a faculty member in the CISE Department of the University of Florida and in the CIS Department of the University of Pennsylvania. During 2000-2002 he was the Chief Scientist for Arcot Systems. His research interests include Big Data, AI and Machine Learning, Bioinformatics, Algorithms, Data Mining, Randomized Computing, and HPC. He has published over 350 research articles in journals and conferences. He has co-authored two texts on algorithms and co-edited six books on algorithms and related topics. He has been awarded numerous research grants from agencies such as NSF, NIH, the US Census Bureau, CIA, DARPA, and DHS, as well as industry (totaling more than $22M). He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE), the American Association for the Advancement of Science (AAAS), the American Institute for Medical and Biological Engineering (AIMBE), and the Asia-Pacific Artificial Intelligence Association (AIAA). He is also an elected member of the Connecticut Academy of Science and Engineering.
https://raj.cse.uconn.edu/
Title: Formal Methods: Whence and Whither
Abstract: Alan Turing arguably wrote the first paper on formal methods 75 years ago. Since then, there have been claims and counterclaims about formal methods. Tool development has been slow, but has been aided by the increasing power of computers under Moore's Law. Although formal methods are not in widespread practical use at a heavyweight level, their influence has crept into software engineering practice to the extent that they are no longer necessarily called formal methods in their use. In addition, with the increasing use of computers in applications where safety and security are important, formal methods are a viable way to improve the reliability of such software-based systems. Their use in hardware, where a mistake can be very costly, is also important. This talk explores the journey of formal methods to the present day and speculates on future directions.
Biography: Jonathan Bowen, FBCS FRSA, is Emeritus Professor of Computing at London South Bank University, where he headed the Centre for Applied Formal Methods, and Chairman of Museophile Limited, a consultancy company. From 2017, he has also been an Adjunct Professor at Southwest University, Chongqing, China. Previously he was Professor of Computer Science at Birmingham City University, a lecturer at the University of Reading, a Research Officer at the Oxford University Computing Laboratory, and a Research Assistant at Imperial College London. He has also been a visiting professor at several institutions internationally. He has been involved with the field of computing both within industry (including Marconi Instruments, Logica, Silicon Graphics Inc., and Altran Praxis) and academia since 1977. His research interests have ranged from formal methods, safety-critical systems, the Z notation, provably correct systems, rapid prototyping using logic programming, decompilation, hardware compilation, software/hardware co-design, linking semantics, and software testing, to the history of computing, online museums, and digital culture. Bowen has been an academic co-editor of the journal Innovations in Systems and Software Engineering and an associate editor of ACM Computing Surveys. He originally studied Engineering Science at Oxford University.
Title: Engineering resilient AI and autonomous systems: Advances and open
challenges
Abstract: AI and autonomous systems are being adopted at pace in healthcare,
transportation, manufacturing, and other important domains. These systems
are capable of performing sophisticated tasks and making decisions with
limited or no human oversight. As such, they have the potential to perform
or support complex missions that are dangerous, difficult or tedious for
humans. Nevertheless, to achieve this potential, AI and autonomous systems
must operate resiliently in open, real-world environments. This keynote
will present recent advances in the use of formal methods to engineer AI
and autonomous systems capable of providing their required functionality
despite operating in the presence of uncertainty, change, faults, failure
and other disruptions.
Title: Echoes of Authenticity: Reclaiming Human Sentiment in the LLM Era
Abstract: This paper scrutinizes the unintended consequences of employing large language models (LLMs) like ChatGPT for editing user-generated content, particularly focusing on alterations in sentiment. Through a detailed analysis of a climate change tweet dataset, we uncover that LLM-rephrased tweets tend to display a more neutral sentiment than their original counterparts. By replicating an established study on public opinions regarding climate change, we illustrate how such sentiment alterations can potentially skew the results of research relying on user-generated content. To counteract the biases introduced by LLMs, our research outlines two effective strategies. First, we employ predictive models capable of retroactively identifying the true human sentiment underlying the original communications, utilizing the altered sentiment expressed in LLM-rephrased tweets as a basis. While useful, this approach faces limitations when the origin of the text—whether directly crafted by a human or modified by an LLM—remains uncertain. To address such scenarios where the text's provenance is ambiguous, we develop a second approach based on the fine-tuning of LLMs. This fine-tuning process not only helps in aligning the sentiment of LLM-generated texts more closely with human sentiment but also offers a robust solution to the challenges posed by the indeterminate origins of digital content. This research highlights the impact of LLMs on the linguistic characteristics and sentiment of user-generated content, and more importantly, offers practical solutions to mitigate these biases, thereby ensuring the continued reliability of sentiment analysis in research and policy.
Biography: Ram D. Gopal is the Information Systems Society’s Distinguished Fellow and Alan Turing Institute’s Turing Fellow, and a Professor of Information Systems Management and Analytics at the Warwick Business School. He also serves as the Academic Director of the Gillmore Centre for Financial Technology at the Warwick Business School. He previously served as the Pro-Dean for Research at the Warwick Business School (2020-2023) and as Head of the Department of Operations and Information Management in the School of Business, University of Connecticut (2008-2018). He has a diverse and rich portfolio of research that spans analytics, health informatics, financial technologies, information security, privacy and valuation, intellectual property rights, online market design, and business impacts of technology. He has served on the editorial boards of top journals including Information Systems Research and has served as the President of the Workshop on Information Technologies and Systems organization from 2016 to 2018. At the Warwick Business School, he teaches ‘Digital Transformation’ on the Full-time MBA and Executive MBA (London), as well as ‘Digital Finance, Blockchain & Cryptocurrencies’ on the MSc in Management of Information Systems and Digital Innovation, and ‘Text Analytics’ on the MSc in Business Analytics.
Abstract: In 2018, Krenn reported that certain problems related to perfect matchings and colourings of graphs emerge from studying the constructability of general quantum states using modern photonic technologies.
He realized that if we can prove that the weighted matching index of a graph, a parameter defined in terms of perfect matchings and colourings of the graph, is at most 2, this could lead to exciting insights on the potential of resources of quantum interference. Motivated by this, he conjectured that the weighted matching index of any graph is at most 2. The first result on this conjecture was by Bogdanov, who proved that the (unweighted) matching index is at most 2, thus classifying graphs non-isomorphic to K_4 into Type 0, Type 1, and Type 2. By definition, the weighted matching index of Type 0 graphs is 0. We give a structural characterization of Type 2 graphs, using which we settle Krenn's conjecture for Type 2 graphs.
We also present several other results regarding Krenn's conjecture: (1) Krenn's conjecture is true for multi-graphs whose underlying simple graph has maximum degree at most 3; (2) Krenn's conjecture is true when the underlying simple graph has vertex connectivity at most 2; and (3) some non-constructability results hold when the experiment graph is assumed to be simple.
Biography: Dr. L. Sunil Chandran is a professor in the Department of Computer Science and Automation at the Indian Institute of Science, Bangalore. He received his Ph.D. from the Indian Institute of Science, Bangalore,
and was a postdoctoral fellow at the Max Planck Institute for Informatics, Saarbruecken, Germany. His areas of research are graph theory, combinatorics, and graph algorithms. He is a Fellow of the
Indian National Science Academy (INSA) and the Indian National Academy of Engineering (INAE).
Abstract: Generative capabilities of Large Language Models (LLMs) have captured the
imagination of technologists and common users alike. What are these
language models? How exactly are they trained? How important is the corpus? How do they generate answers that look like human-produced output? In this talk, we will discuss the basics of these issues. Despite the immense success of GPT-like models for English, other languages, and most notably Indian languages, seem far behind. We will hence focus on Indian languages next.
We will show how Indian languages have some markedly different characteristics that require special processing. In particular, we will draw inspiration from Indian theories of text processing. Finally, we will touch upon an ambitious national project, BharatGPT, that aims to put India on a high pedestal in generative AI capabilities.
Biography: Arnab Bhattacharya is a Professor at the Department of Computer Science and
Engineering at the Indian Institute of Technology, Kanpur, with which he has been
affiliated since December 2007. He received his MS and PhD degrees in Computer Science from the University of California, Santa Barbara, USA, and his Bachelor's degree from Jadavpur University. His current research interests include natural language processing, information retrieval, data mining, databases, machine learning, and artificial intelligence. Arnab holds multiple patents in databases and has over 75 publications in reputed journals and conferences. He is also the founding coordinator of the Centre for Indian Knowledge System (IKS) studies at IIT Kanpur.