19th Workshop on Compression, Text, and Algorithms (WCTA 2024)

Puerto Vallarta, Jalisco, México
September 26th, 2024

19th Workshop on Compression, Text, and Algorithms (WCTA 2024)

WCTA 2024 - Program

THURSDAY

Keynote: Tomohiro I. Constructing Compact Indexes for Parameterized Pattern Matching
Session chair: Yuto Nakashima

Session 1
Session chair: Giuseppe Romana
  • Rocco Ascone
    A unifying taxonomy of pattern matching in degenerate strings and founder graphs
  • Alessio Campanellli
    Where the patterns are: repetition-aware compression for colored de Bruijn graphs
  • Francisco Olivares
    Near-Optimal Search Time in δ-Optimal Space, and Vice Versa
  • Session 2.
    Session chair: Tomohiro I

  • Alejandro Pacheco
    Counting on General Run-Length Grammars
  • Joseph Winjum Enhancing
    Space Efficiency in Grammar Compressed Strings with Bitpacking
  • Cristian Urbina
    Generalized Straight-Line Programs
  • Session 3
    Session chair: Gonzalo Navarro

  • Peaker Guo
    New Properties of Fibonacci and Thue-Morse Words Resolve Conjecture on Net Frequency
  • Daniel Puttini
    Near-Optimal Time Construction of Minimum Wheeler DFAs
  • Davide Cenzato
    On computing the deterministic colex width of regular languages

  • Keynote Speaker

    Constructing Compact Indexes for Parameterized Pattern Matching

    A parameterized string (p-string) is a string over an alphabet (Σs ∪ Σp), where Σs and Σp are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively. Two p-strings x and y are said to parameterized match (p-match) if and only if x can be transformed into y by applying a bijection on Σp to every occurrence of p-symbols in x. P-matching has a wide range of applications in software maintenance, image processing, RNA structural matching, and so on, and thus, algorithms and data structures for p-matching have been extensively studied since Baker's seminal work [STOC 1993]. A popular problem is the indexing problem, in which we preprocess a p-string T of length n so that we can efficiently find the occurrences of substrings of T that p-match with a given pattern. Let σs and respectively σp be the numbers of distinct s-symbols and p-symbols that appear in T and σ = σs + σp. Extending the Burrows-Wheeler Transform (BWT) based index for exact string pattern matching, Ganguly et al. [SODA 2017] proposed parameterized BWTs (pBWTs) to design the first compact index for p-matching, and posed an open problem on how to construct the pBWT-based index in compact space, i.e., in O(n  log  σ) bits of space. Hashimoto et al. [SPIRE 2022] showed how to construct the pBWT for T, in O(n  log  σ) bits of space and O(n σp  log  n /  log   log  n) time in an online manner while reading the symbols of T from right to left. In our recent paper [ICALP 2024], we refine Hashimoto et al.'s algorithm to run in O(n  log  σp  log  n /  log   log  n) time. Our result has an immediate application to constructing parameterized suffix arrays in O(n  log  σp  log  n /  log   log  n) time and O(n  log  σ) bits of working space. We also show that our data structure can support backward search, a core procedure of BWT-based indexes, at any stage of the online construction, making it the first compact index for p-matching that can be constructed in compact space and even in an online manner.

    Tomohiro I

    Kyushu Institute of Technology, Japan

    He is Associate Professor at Kyushu Institute of Technology, Japan. His main research interests are string processing algorithms and data structures including those working in small or compressed space. He is also interested in combinatorial properties on strings such as repetitions and palindromes with the most prominent contribution to the proof of the "Runs" theorem [SIAM J. Comput. 2017].

     

     


    Contact

    For more information, please contact karina.figueroa at umich.mx