Inverse Problems for Topological Transforms


                                 by
                    Yitzchak Elchanan Solomon
            B.S. / M.A., UCLA; Los Angeles, CA, 2013
           M.S., Brown University; Providence, RI, 2016


        A dissertation submitted in partial fulfillment of the
        requirements for the degree of Doctor of Philosophy
       in the Department of Mathematics at Brown University


                     Providence, Rhode Island
                             May 2019
© Copyright 2019 by Yitzchak Elchanan Solomon
  This dissertation by Yitzchak Elchanan Solomon is accepted in its present form by
       the Department of Mathematics as satisfying the dissertation requirement
                        for the degree of Doctor of Philosophy.


Date
                                   Jeffrey F. Brock, Ph. D., Advisor


                       Recommended to the Graduate Council


Date
                                      Thomas Goodwillie, Reader


Date
                                       Richard Schwartz, Reader


                         Approved by the Graduate Council


Date
                                        Andrew G. Campbell
                                     Dean of the Graduate School


Vita


Elchanan Solomon attended UCLA, where he received Bachelor of Science and Master of Arts
degrees in Mathematics. He graduated Summa Cum Laude as a UCLA Regent’s Scholar,
and with the Daus prize in mathematics.


For Victòria, who was always proud of me.


To my family, who received the souvenirs from all my conferences.

Acknowledgements


There are many people I’d like to thank for making the content of this thesis possible and
the time spent researching it a pleasure.


   Firstly, my advisors.

  • Jeff Brock: Thank you for your mentorship and generosity during my years studying
     geometric topology, and your continued support during my extended academic transi-
     tion. Thank you as well for helping arrange so many unique research, mentoring, and
     teaching opportunities, from which I have benefited both academically and profession-
     ally. I like to think that the distinctly geometric flavor of my current research is a
     product of the discussions we had during my first years as a graduate student.

  • Steve Oudot: Without your mentorship, generosity of time and spirit, enthusiasm,
     humor, patience, and interminable emendations, I would never have produced the
     work in this thesis, much less enjoyed doing so. If the reader finds the exposition below
     coherent, at times verging on lucid, they have Steve to thank. Merci!

   Thanks as well to the other members of my defense committee, Thomas Goodwillie and
Richard Schwartz, for taking the time to read this thesis and discuss its contents.


   Next, my academic mentors and collaborators.

  • Sara Kališnik: For introducing me to Steve Oudot and single-handedly starting my
     academic transition, for all your conference invitations and professional advice, for
     networking me with everybody I have ever needed to meet, for always looking out for
     me, and for being a wonderful and inspiring friend – Hvala!

  • Sam Watson: Thank you for everything you taught me about teaching and course
     design.

  • Clément Maria: Thank you for our ongoing collaboration, which has become a
     substantial component of this thesis. I hope to visit you again in Sophia-Antipolis, perhaps
     when there is less rain and more sanglier.

  • Jean-Marc Schlenker: Thank you for your generosity of time and energy during my
     summer working in Luxembourg, as well as our continued collaboration in Singapore.
     Though we were never able to complete our project, I learned a lot in the process.

  • Paul Carter: Though it started as a joke, I’m proud of the pedagogical paper we wrote
     together.

   Next, to all the friends who have tolerated my pranks and childish humor with a smile
and occasionally a laugh. Special thanks to:

  • Laura Walton: For teaching me about generosity and letting me eat your almonds.

  • Amalia Culiuc: For accepting me during my Murakami phase.

  • Samir Chowdhury: For being my academic big brother.


  • Doreen Reuchsel: For playing the role of my Bodhisattva.

  • Elizabeth Crites: For our musical adventures.

  • Alicia Harper: For all our late-night rambles.

  • Seoyoung Kim: For our teatime chats in broken French.

  • Melissa McGuirl: For being a fantastic seminar and workshop co-organizer.

  • Ashley Weber, Sunny Yang Xiao, and Peihong Jiang: For friendship and support as
     academic siblings.

   Next, to all the lovely staff of the math department, for all their help and good humor.
Special thanks to:

  • Doreen Pappas: For all the irreverent laughter and exotic sweets we’ve shared.

   And finally:


   For Victòria: Thank you for everything.

   To my family: Thank you for everything.


Contents


1 Introduction
  1.1   Inverse Problems in Machine Learning
  1.2   Topological Data Analysis
  1.3   Inverse Problems in TDA:
        1.3.1   Topological Transforms
  1.4   Results

2 Background
  2.1   Metric Geometry
        2.1.1   The Hausdorff and Gromov-Hausdorff Metrics
        2.1.2   Metric Measure Spaces
        2.1.3   Reeb Graphs
        2.1.4   Metric Graphs
  2.2   Algebraic Topology
        2.2.1   Persistent Homology
        2.2.2   Extended Persistence
        2.2.3   Point Cloud Persistence
        2.2.4   Morse-Type Functions
        2.2.5   Persistence for Reeb Graphs
        2.2.6   Euler Calculus

3 Prior Work

4 The Intrinsic Persistent Homology Transform
  4.1   Motivation
  4.2   The IPHT and IECT
  4.3   Stability Results
  4.4   Injectivity Results
  4.5   Overview of the Proofs from Section 4.4

5 The Distance Kernel Transform
  5.1   Motivation
  5.2   The Distance Kernel
  5.3   The Distance Kernel Embedding
  5.4   Stability and Inverse Results
  5.5   Topological Kernel Transforms
  5.6   Metric Stability and Operator Perturbation

6 Conclusion

Appendices

A Proofs for the IPHT and IECT
  A.1   Proof of Theorem 4.4.7
  A.2   Proofs of Lemmata 4.5.1, 4.5.2, and 4.5.3
  A.3   Proof of Proposition 4.4.5

B Proof of Proposition 4.5.5 for Ψ_G
        B.0.1   Technical Results
        B.0.2   Case Analysis
        B.0.3   Comparing Distinct Cases
        B.0.4   Comparing Identical Cases

C Proof of Proposition 4.5.5 for χ_G
        C.0.1   The three cases
        C.0.2   Technical Results
        C.0.3   Reduction to the three cases
  C.1   Cases (I) and (I)
  C.2   Cases (I) and (II)
  C.3   Cases (I) and (III)
  C.4   Cases (II) and (III)
  C.5   Cases (III) and (III)

D The Case of Topological Self-Loops and Few Vertices

Bibliography


List of Tables


List of Figures


 1.1   Inverse Problems
 1.2   Bag-of-Words Model
 1.3   TDA Pipeline
 1.4   A family of non-isometric point clouds with the same persistence module.
 1.5   A visualization of the IPHT.
 1.6   Non-isomorphic graphs with the same topological transform.
 1.7   Plots of the eigenfunctions of the operator D_X on a circle and a torus.

 2.1   A barcode and its associated persistence diagram.
 2.2   Point Cloud Complexes

 3.1   Visualization of the PHT.
 3.2   An example of an Euler curve.

 4.1   Non-isomorphic graphs with the same topological transform.
 4.2   Embedding counterexamples.
 4.3   Aut(G) ≠ 0 but Ψ_G not injective.

 A.1   A metric graph with basepoint x.
 A.2   The cactus approximation of G.

 B.1   When the smallest death time is larger than expected.
 B.2   Visualizing a birth time.
 B.3   Visualizing a simple cycle.
 B.4   Visualizing a birth time.
 B.5   Case A
 B.6   Case B
 B.7   Case C
 B.8   When p and q lie on the same edge.
 B.9   Cases A and B
 B.10  Cases B and C
 B.11  Cases A and C
 B.12  Case A and Case A
 B.13  Case B and Case B
 B.14  Case C and Case C

 C.1   A non-constant barcode giving rise to a constant Euler curve.
 C.2   Case (I).
 C.3   Case (II).
 C.4   Case (III).
 C.5   The ten possibilities resulting from a cancellation.
 C.6   The two possible scenarios in which the smallest nonzero death time in Ψ_G(p)
       is undetectable in χ_G(p).
 C.7   Cases (I) and (I).
 C.8   Cases (I) and (I).
 C.9   Cases (I) and (II).
 C.10  Case (II) and (III).
 C.11  Forcing cancellation.
 C.12  Cases (III) and (III).
 C.13  Cases (III) and (III).

 D.1   Cases (A) and (D).
 D.2   Cases (B) and (D).
 D.3   Cases (C) and (D).
 D.4   Cases (D) and (D).
 D.5   A self-loop in G becomes a leaf edge in IPHT(G) or IECT(G).
 D.6   A figure-eight graph becomes a circle after applying IECT(G).
 D.7   When G has two vertices and self-loops.
 D.8   The image of a two-vertex graph under the IPHT or IECT.

                                                                    CHAPTER 1


Introduction


1.1      Inverse Problems in Machine Learning

In recent decades, success in machine learning has revolved around the study of non-linear
feature extraction and non-linear models. This paradigm uses large training sets and in-
creased processing power to produce highly flexible models with ever increasing prediction
accuracies. However, there is an emerging awareness among machine learning researchers
and end-users that these non-linear techniques can be very hard to interpret. Often, the
mapping from the input (data) space to the target (modeling) space is so complex that it is
virtually impossible to predict what simple transformations in the target space might mean
for real-world data, if they can be given any interpretation at all. Similarly, it is possible for
slightly different input data sets to produce wildly divergent models. As prediction accuracy
is only one part of the data analysis pipeline, many researchers are now studying the hard
mathematical problems underlying the explainability and interpretability of machine learning
algorithms. Two central challenges are the discriminativity and preimage problems:

   • Discriminativity: Do distinct data sets produce distinct descriptors? In other words,
      does the mapping from the input space to the target space have a left inverse? If this
      is not the case, it may be hard to explain exactly what kind of information is being
      stored in the descriptor.

   • Preimage: Does every descriptor correspond to at least one data set? In other words,
      does the mapping from the input space to the target space have a right inverse? This
      is essential for our ability to interpret the meaning of our features.

[Figure 1.1: diagram with two panels, Data and Features.]
Figure 1.1: In the figure above, the two green data structures are mapped to a common blue feature,
causing a lack of discriminativity. Additionally, there is no preimage data structure for the orange feature.


    See Figure 1.1 for a visualization of these phenomena.


    We will use the general term inverse problem to refer to any question that can be posed
about the inverse of a feature extraction model. To illustrate this with an example, consider
the bag-of-words model in text analysis. This is a model that represents text as a set of
words, disregarding word order but keeping track of multiplicity. In this context, we have
the following inverse problems:


   • Discriminativity: Are there distinct texts that produce the same bag-of-words? (Yes,
      consider "The man ate the dog" and "The dog ate the man".)

   • Preimage: Are there bags-of-words that do not correspond to any text? (Yes, consider
      the multiset {hello:8,the:4,banana:1,whereas:783}.)

   Though the bag-of-words feature vector has its limitations, there are certain scenarios
in which it can be used to recover the original text (see Figure 1.2), and one can ask how
likely this is to occur. Moreover, it can be enriched by storing n-tuples of adjacent words,
for n > 1, which may serve to improve its discriminatory power. We see that, even for this
simple model, one can pose a variety of interesting and nontrivial inverse problems.
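
   The two inverse-problem observations above can be checked on a toy scale. The sketch
below is only an illustration under simple assumptions: words are lowercased and split on
whitespace, and the helper names bag_of_words and bag_of_ngrams are hypothetical, not taken
from any particular library.

```python
from collections import Counter


def bag_of_words(text):
    """Multiset of lowercased, whitespace-separated words (assumed tokenization)."""
    return Counter(text.lower().split())


def bag_of_ngrams(text, n):
    """Multiset of n-tuples of adjacent words, the enrichment mentioned above."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


a = "The man ate the dog"
b = "The dog ate the man"

# The plain bag-of-words cannot tell the two sentences apart ...
same_bag = bag_of_words(a) == bag_of_words(b)
# ... whereas the bag of adjacent word pairs (n = 2) can.
same_bigrams = bag_of_ngrams(a, 2) == bag_of_ngrams(b, 2)
```

   In this example the bags of words coincide while the bags of bigrams differ, which is one
small instance of how storing n-tuples can improve discriminatory power.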

[Figure 1.2: an unknown text "?? ?? ?? ??? ?? ??" (left) and its bag-of-words {to: 2, be: 2, or: 1, not: 1} (right).]
Figure 1.2: Can you figure out which famous text belongs on the left-hand side above? Is this the only
grammatically correct possibility?


1.2      Topological Data Analysis

The focus of this thesis is on applied topology, and in particular, Topological Data Analy-
sis (TDA). TDA is a set of feature extraction and modeling algorithms built around ideas
and techniques from algebraic topology and metric geometry. As a result, it is particularly
well-suited to studying data sets of complex shapes. The central invariant of TDA is per-
sistent homology, which maps an input shape to a descriptor consisting of a set of intervals
on the real line (called a barcode). A rigorous definition of persistent homology will be given
in Subsection 2.2.1, but we will dedicate the rest of this section to providing an intuitive
explanation.


   Among the central objects of study in algebraic topology are the homology groups H_d(X)
of a topological space X. These groups count the various d-dimensional holes in X, and
enjoy the important property of being invariant under homotopy equivalence. The rank of
the group H_d(X) is called the d-th Betti number β_d(X). The mapping that sends X to its
vector of Betti numbers (β_0(X), β_1(X), β_2(X), ...) transforms a complex, intractable data
structure (a topological space) into one that fits neatly into a machine learning framework.
However, because there are many distinct spaces with isomorphic homology groups, this
mapping is not discriminative.
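
   As a small illustration of this lack of discriminativity, the sketch below computes the Betti
numbers of a finite graph, viewed as a one-dimensional space, using the standard facts that
β_0 is the number of connected components and β_1 = |E| − |V| + β_0. The function names and
the two example graphs (a figure-eight and a theta graph) are chosen here for illustration
and do not come from the thesis.

```python
def graph_betti_numbers(num_vertices, edges):
    """Return (beta_0, beta_1) of a finite graph given as a list of vertex pairs."""
    parent = list(range(num_vertices))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = num_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    beta0 = components
    beta1 = len(edges) - num_vertices + beta0
    return beta0, beta1


def degree_sequence(num_vertices, edges):
    """Sorted vertex degrees, a simple invariant of graph isomorphism."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree)


# Two triangles glued at vertex 0 (a figure-eight).
figure_eight = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
# Three paths of length two joining vertices 0 and 1 (a theta graph).
theta = [(0, 2), (2, 1), (0, 3), (3, 1), (0, 4), (4, 1)]

betti_eight = graph_betti_numbers(5, figure_eight)
betti_theta = graph_betti_numbers(5, theta)
```

   Both graphs have Betti numbers (1, 2), yet their degree sequences differ, so they are not
isomorphic as graphs: the Betti vector alone cannot separate them.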


   The transition from homology to persistent homology occurs when we study filtered
topological spaces, i.e. nested families of spaces varying over a real parameter. This situation
is quite general: any real-valued function f : X → R on a topological space X allows us
to pull back the linear ordering on R to a filtration of X by its sub- or super-level sets.
Thus, a real-valued function on X gives rise to a family of spaces X_t, parametrized by the
real value t ∈ R. We can encode the change of the homology groups by recording the Betti
curves β_d(t) = β_d(X_t). These Betti curves contain more topological information than the
Betti numbers, but come at a computational cost: Betti curves are vectors in an
infinite-dimensional function space, rather than being simple integers. However, because suitable
function spaces, such as L^2, are Hilbert spaces, we have not left the domain of standard
machine learning techniques. Moreover, under mild assumptions on the space X and function f,
these Betti curves are piecewise constant, and can be stored using a discrete data structure:
the multi-set S(X, f) of values at which the curve increases and decreases, together with the
value at zero.


   To move beyond Betti curves, we use the functoriality of homology. That is, because
the sub- or super-level sets {X_t} are nested, and hence come with natural inclusion maps,
there are natural, induced maps between their homology groups. What this allows us to do
is impose further structure on the multi-set S(X, f). Namely, if a Betti curve increases at
t = a and decreases at t = b, we can determine algebraically whether one of the homological
generators born at t = a died at t = b; when this is the case, we replace the elements
a and b in the multi-set with the interval [a, b]. By performing this pairing on all the
points in S(X, f), allowing −∞ as a left endpoint and +∞ as a right endpoint, we can
convert S(X, f) into a set of intervals: the barcode B(X, f). Because we are measuring the
way that homological features in X persist along the family of spaces {X_t}, this procedure,
and the resulting barcode, are called the persistent homology of the pair (X, f), and
denoted PH(X, f). As before, there is a price to pay for gathering this extra information:
the resulting data structure, a set of intervals, is not a vector in a Hilbert space, and hence
is more challenging to work with than a function or an integer. Nevertheless, the space of
barcodes has a metric, and there are methods of using barcodes to compare and analyze
shapes. Additionally, there are ways of vectorizing barcodes that preserve more, or at the
very least, different, information than simply resorting to Betti curves.
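
   The pairing of births with deaths can be made concrete in the simplest setting: degree-0
persistence of the sublevel sets of a function sampled along a path. The sketch below is a
minimal illustration, not the construction used later in the thesis. It assumes the usual
"elder rule" (when two components merge, the one born later dies), discards zero-length
intervals, and treats bars as half-open intervals [birth, death) when counting; the function
names are hypothetical.

```python
def sublevel_persistence_0d(values):
    """Degree-0 barcode of the sublevel sets of a function on a path graph.

    values[i] is the function value at vertex i; vertices i and i + 1 are joined
    by an edge that appears once both endpoints are present.
    """
    n = len(values)
    parent = [None] * n          # None marks a vertex not yet in the sublevel set
    birth = {}                   # root vertex -> birth value of its component
    bars = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the later birth dies now.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < values[i]:
                    bars.append((birth[young], values[i]))
                parent[young] = old

    # Components that never merge persist forever.
    roots = {find(i) for i in range(n)}
    bars.extend((birth[r], float("inf")) for r in roots)
    return sorted(bars)


def betti_curve(bars, t):
    """Number of bars alive at parameter t, using half-open bars [birth, death)."""
    return sum(1 for b, d in bars if b <= t < d)


bars = sublevel_persistence_0d([0, 2, 1, 3, 0.5])
```

   For the sample values above, the local minima at 0, 1 and 0.5 each give birth to a
component, and the barcode records which birth is matched with which merge value. The
Betti curve can be recovered from the barcode, but not conversely.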


   Let us illustrate all this with an example. Let P ⊂ R^d be a finite set of points, called a
point cloud. We believe the set P has been sampled from some subset V ⊂ R^d, whose
geometry and topology are of interest. We can try to reconstruct V from P by cleverly building a
simplicial complex on top of P. Let X = Δ(P) be the full (|P| − 1)-simplex having P as its
vertices. Whereas P is a collection of isolated points, Δ(P) is fully connected, and neither
of these necessarily resembles V. Indeed, certain simplices in Δ(P) are more likely to appear
in V than others. To that end, for a subset Q ⊂ P, we define τ(Q) to be the smallest r such
that ⋂_{q∈Q} B(q, r) ≠ ∅. When τ(Q) is small, the points in Q are tightly clustered, and so it
is more reasonable to introduce into our model complex a simplex with Q as its vertex set.
By picking a real scale parameter r > 0, we can construct the subcomplex Δ_r(P) ⊂ Δ(P)
defined by Δ_r(P) = {σ ∈ Δ(P) | τ(σ) ≤ r}. When r is small, Δ_r(P) is sparsely connected,
and as r increases we become more generous in the inclusion of simplices. These
subcomplexes are sub-level sets of the function τ : Δ(P) → R.


   For a fixed r, the Betti numbers of Δ_r(P) provide a guess for the Betti numbers of V.
However, if r is too small our reconstruction of V will be too sparse, and if r is too large we
will lose any fine-scale information. Unfortunately, we do not know, a priori, which choice
of r is best, and moreover, it may be the case that each choice of r misses part of the
picture. However, the persistent homology of the family of spaces {Δ_r(P)} keeps track of how
the homology evolves across a range of parameters. A long interval in PH(Δ(P), τ) then
corresponds to a topological feature that persists over many scales, which is more likely to
appear in V.
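
   The dependence on the scale r can be seen already in degree 0. For a pair of points,
τ({p, q}) is half the distance between p and q, and β_0 of a simplicial complex depends only
on its vertices and edges, so β_0(Δ_r(P)) is the number of connected components of the graph
joining p and q whenever |p − q| / 2 ≤ r. The sketch below computes this count; the function
name and the sample points are illustrative assumptions.

```python
import math


def betti0_at_scale(points, r):
    """beta_0 of Delta_r(P): components of the graph with an edge when |p - q| / 2 <= r."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n
    for i in range(n):
        for j in range(i + 1, n):
            # tau of a pair is half the Euclidean distance between its points.
            if math.dist(points[i], points[j]) / 2 <= r:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    components -= 1
    return components


cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
counts = [betti0_at_scale(cloud, r) for r in (0.4, 0.5, 4.5)]
```

   Here a small r sees three isolated points, an intermediate r merges the two nearby points,
and a large r connects everything: no single choice of r tells the whole story, which is the
motivation for tracking all scales at once.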


   There is another way to associate the point cloud P to a family of topological spaces.
Define the function d_P : R^d → R that sends a point x ∈ R^d to its distance from the set
P. The sub-level set {x ∈ R^d | d_P(x) ≤ r} is precisely the union of the balls of radius r
centered at the points in P. The contractibility of balls in Euclidean space, together with the
well-known Nerve Theorem, implies that the persistent homology PH(R^d, d_P) is the same
as PH(Δ(P), τ). Thus, both approaches produce identical topological descriptors.


    This example is illustrated in Figure 1.3.


[Figure 1.3: pipeline diagram. Data → (Derive Filtration) → Filtered Topological Spaces →
(Persistent Homology) → Persistence Module (Barcode) → (Vectorization) → Feature Vectors in R^n.]
Figure 1.3: We start with a discrete set of points P in Euclidean space, sampled from a shape of interest
(in this case, a human face). We can use P to define two increasing families of spaces, the union of Euclidean
balls and the subcomplexes of Δ(P), shown on the top-right corner of the figure. Both of these filtered
spaces produce the same barcode. Finally, there exist methods of vectorizing barcodes for use in machine
learning algorithms. Part of this figure has been adapted from [GCPZ06, Fig. 6].


    There are many ways of adapting the persistent homology pipeline to produce new in-
variants, some of which will be introduced in Chapter 2.


1.3        Inverse Problems in TDA:

Let us reconsider the inverse problems discussed earlier in the context of TDA:


   • Preimage: Does every barcode arise as the persistent homology of a real-valued function
        f : X → R on a topological space X? What if we stipulate that X is a particular
        manifold, say an interval or torus, or that X = R^d and f is the distance function from
        a point cloud?

   • Discriminativity: If PH(X, f) = PH(Y, g), must X and Y be isomorphic? If X =
        Y = R^d, and f and g are distance functions from point clouds P and Q, is there an
        affine transformation sending P to Q?


      The focus of this thesis is on the latter question of discriminativity, to which there is, in
general, a negative answer¹:


   • Rotating and translating a point cloud in R^d does not affect the persistent homology
        of its Vietoris-Rips (VR) filtration (the same is true for the α- or Čech filtration).

   • The persistent homology of the VR or α-filtration of a point cloud can also be
        preserved by non-isometries. Consider the three-point metric space P_θ obtained by taking
        the vertices of the triangle in Figure 1.4 below. For any choice of θ ∈ [π/2, π], the
        persistence module of its VR filtration is the same (likewise for the α- or Čech filtration).


  ¹ For more on the preimage problem, the interested reader can consult the survey [OS18] on inverse
problems in applied topology, written by Oudot and the author of this thesis.


[Figure 1.4: a triangle with two sides labelled 1 and an angle θ ≥ π/2.]

                            PH(P_θ) = {(0, +∞)} ∪ {(0, 1)} ∪ {(0, 1)}
         Figure 1.4: A family of non-isometric point clouds with the same persistence module.


  • Injectivity can also fail for intrinsic metric spaces. Indeed, the persistence module of
     the Čech filtration is identical for every geodesic tree, see e.g. Lemma 2 in [GGP+18].
     The same fact holds for the VR filtration, as shown in Appendix A of [OS17].

  • In [Cur17], Curry characterized the fiber of the persistence map for functions on the
     unit interval, describing precisely which functions produce the same persistence mod-
     ule. However, in most settings, this is a hard, open problem.
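
   Returning to the three-point family P_θ of Figure 1.4, the degree-0 part of the claim can be
checked numerically. The sketch below rests on two assumptions that are not spelled out in the
text: that θ is the angle between the two sides of length 1, and that an edge enters the VR
filtration at parameter equal to its length (a normalisation chosen here to match the bars
(0, 1) in the figure). Under these assumptions the finite degree-0 death times are the edge
lengths of a minimum spanning tree, computed below with Kruskal's algorithm. The function
names are hypothetical.

```python
import math


def vr_h0_deaths(points):
    """Finite degree-0 death times of the VR filtration: MST edge lengths (Kruskal)."""
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(length)   # two components merge at this scale
    return deaths


def triangle(theta):
    """Vertices of a triangle with two unit sides meeting at angle theta (assumed layout)."""
    return [(0.0, 0.0), (1.0, 0.0), (math.cos(theta), math.sin(theta))]


angles = [math.pi / 2, 2 * math.pi / 3, math.pi]
deaths_by_angle = [[round(d, 9) for d in vr_h0_deaths(triangle(t))] for t in angles]
third_sides = [round(math.dist(triangle(t)[1], triangle(t)[2]), 9) for t in angles]
```

   For each sampled angle the two finite death times equal 1, even though the third side
length changes with θ, so the triangles are pairwise non-isometric. This only verifies the
degree-0 bars; it says nothing about higher degrees or about other filtrations.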


   These limitations lead us to consider more sophisticated invariants: topological trans-
forms.


1.3.1     Topological Transforms

We have seen that individual persistence diagrams are not injective invariants of a metric
or topological structure X. However, the methods of applied topology give us the freedom
to pick the real-valued function f : X → R, and we are not limited to picking just one.
From a category-theoretic perspective, a lot of information about the space X is contained
in the space hom(X, R) of continuous, real-valued maps on X. By computing the persistent
homology of these maps, one can associate to X a large family of persistence diagrams,
parametrized by hom(X, R).


   Now, the set hom(X, R) is massive and unwieldy. In practice, we would like to isolate a
smaller family of functions F ⊂ hom(X, R) that is easy to parametrize and is rich in
topological and geometric data. Denote by PH(X, F) the resulting set of persistence diagrams.
The mapping X ↦ PH(X, F) will be called a topological transform, in that it provides a
representation of X in terms of its topological statistics. We will also allow our topological
transforms to take values in other sets of topological invariants, like Betti curves or
vectorizations of persistence diagrams.


   Our aim is to find classes of functions F that produce topological transforms which are
both injective and computable, and hence give rise to discriminative shape invariants. In
practice, the choice of F can depend on the structure of X (topological, geometric, measure-
theoretic, embedded vs. intrinsic) as well as the application in mind.


1.4     Results

In this section, we outline the novel constructions and results contained in this thesis. The
objects of interest are intrinsic metric spaces, that is, metric spaces that do not come with
an embedding into any ambient space.


The Intrinsic Persistent Homology Transform

Let (X, dX ) be a metric space. Our aim is to find a family of functions F ⊂ hom(X, R)
that contains a lot of the geometric information of X. One natural choice is to construct,
for each basepoint p ∈ X, the distance-to-the-basepoint function dp : X → R, defined by
dp (x) = dX (p, x). The sublevel sets of these functions are geodesic balls, based at p, of
increasing radius. We will write ΨX to denote the mapping from X to the space of barcodes
that sends p ∈ X to P H(X, dp ). The image of ΨX , as a subset of Barcode space, is the
topological transform we call the Intrinsic Persistent Homology Transform, or IPHT (see
Figure 1.5). This invariant was first studied in [DSW15], in the setting of metric graphs.
If we compute the Euler characteristics of the sublevel sets of the functions dp , instead of
their persistence, we get a variant of the IPHT we call the Intrinsic Euler Characteristic
Transform, or IECT.
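
As a rough illustration of the Euler characteristic variant, the following Python sketch evaluates the Euler characteristic of a closed geodesic ball around a basepoint in a finite metric graph. It is a sketch under stated assumptions, not the construction used later in the thesis: the graph is assumed connected and is encoded as a vertex count plus a list of (u, v, length) triples, the basepoint p is assumed to be a vertex, and the function names are hypothetical. It relies on the observation that a partially covered edge retracts onto its covered endpoint, so only vertices and fully covered edges contribute.

```python
import heapq

def distances_from(n, edges, p):
    """Dijkstra: shortest-path distance from vertex p to every vertex."""
    adj = [[] for _ in range(n)]
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))
    dist = [float("inf")] * n
    dist[p] = 0.0
    heap = [(0.0, p)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in adj[u]:
            if d + length < dist[v]:
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist

def euler_characteristic_of_ball(n, edges, p, r):
    """Euler characteristic of {x : d_p(x) <= r}, for a vertex p and r >= 0."""
    dist = distances_from(n, edges, p)
    vertices_in = sum(1 for d in dist if d <= r)
    # The largest value of d_p on an edge (u, v) is (d(u) + d(v) + length) / 2,
    # so the edge lies entirely inside the ball exactly when 2r reaches that sum.
    edges_in = sum(1 for u, v, length in edges
                   if 2 * r >= dist[u] + dist[v] + length)
    return vertices_in - edges_in

# Hypothetical example: a cycle with three unit-length edges, based at vertex 0.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
curve = [euler_characteristic_of_ball(3, triangle, 0, r) for r in (0.5, 1.0, 1.5)]
print(curve)  # [1, 1, 0]
```

The ball is contractible until it closes up around the cycle at radius 1.5, at which point its Euler characteristic drops to that of a circle.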

                                                    Barcode Space


                            Figure 1.5: A visualization of the IPHT.


   Studying the injectivity of the IPHT or IECT is, in general, a very challenging prospect.
In the work contained in this thesis, the author and Steve Oudot focus on the case when
(X, dX ) is a compact, connected metric graph, i.e. a metric space homeomorphic to the
geometric realization of a connected one-dimensional simplicial complex. Section 4.2 gives
formal definitions of the constructions involved in these topological transforms, and demon-
strates that, when X is a compact metric graph, the map ΨX , and its Euler characteristic
analogue χG , are Lipschitz continuous:

Lemma (4.2.6). Fix a compact metric graph G. Let p, q ∈ G be any two basepoints. Then
d_FD(Φ_G(p), Φ_G(q)) ≤ d_G(p, q) and d_B(Ψ_G(p), Ψ_G(q)) ≤ d_G(p, q).

Theorem (4.2.7; [DSW15], §3). Let G, G′ be a pair of compact metric graphs, and let M be
a correspondence between them realizing the Gromov-Hausdorff distance δ = d_GH(G, G′). If
p ∈ G and p′ ∈ G′ are a pair of points with (p, p′) ∈ M, then the two Reeb graphs Φ_G(p) and
Φ_G′(p′) are within 6δ of each other in the functional distortion distance, and the resulting
barcodes Ψ_G(p) and Ψ_G′(p′) are within 18δ of each other in the bottleneck distance.

Lemma (4.2.8). Let G = (V, E, d_G) and G′ = (V′, E′, d_G′) be a pair of compact metric
graphs. Define

\[ N = \max\{\deg(G) - 2|V| + 2,\ \deg(G') - 2|V'| + 2\}. \]

Then for all p ∈ G, p′ ∈ G′, α ∈ (1, ∞),

\[ \|\chi_G(p) - \chi_{G'}(p')\|_{\alpha} \le 2N \bigl(2\, d_B(\Psi_G(p), \Psi_{G'}(p'))\bigr)^{1/\alpha}. \]


   Section 4.3 provides stability results for our topological transforms (as well as a measure-
theoretic variant), demonstrating that if two metric graphs are close in the appropriate
metric, then their topological transforms are likewise close:

Theorem (4.3.2; [DSW15]). For a pair of metric graphs G, H, dP D (G, H) ≤ 18 dGH (G, H).

Theorem (4.3.2). For a pair of metric graphs G, H with vertex sets V and V′, define

\[ N = \max\{\deg(G) - 2|V| + 2,\ \deg(H) - 2|V'| + 2\}. \]

Then d_PD(G, H) ≤ 18 d_GH(G, H). Moreover, for α ∈ [1, ∞),

\[ d^{\alpha}_{ED}(G, H) \le 2N \bigl(36\, d_{GH}(G, H)\bigr)^{1/\alpha}. \]


Theorem (4.3.3). For a pair of (full-support) metric measure graphs (G, µG ) and (H, µH ),


                     dM P D ((G, µG ), (H, µH )) ≤ 18 D∞ ((G, µG ), (H, µH ))


   Having studied the basic properties of our topological transforms, Section 4.4 turns to
questions of injectivity. The first observation is that there exist nonisometric graphs G and
H with IPHT(G) = IPHT(H), as in Figure 1.6.


                       G                                                 H


Figure 1.6: G and H are not isomorphic, but have the same image under all three of our topological
transforms.


   This demonstrates that our topological transforms are not injective on the full space of
metric graphs. Moreover, one can embed this counterexample in any open neighborhood of
metric graphs:

Proposition (4.4.3). Every open GH-ball in MGraphs contains a pair of non-isometric
graphs with the same intrinsic persistent homology transform.

   We do, however, obtain the following local injectivity result, which asserts that a metric
graph cannot be continuously deformed while preserving its IPHT:

Theorem (4.4.7). IPHT is locally injective in the following sense: ∀G ∈ MGraphs there
exists a constant ε(G) > 0 such that ∀G′ ∈ MGraphs with 0 < d_GH(G, G′) < ε(G) we have
d_PD(G, G′) > 0.

   Going further, we identify subsets of metric graphs for which global injectivity results
can be obtained:

Theorem (4.4.4). The IPHT and IECT are injective up to isometry when restricted to
the sets {G ∈ MGraphs | ΨG injective} and {G ∈ MGraphs | χG injective} respectively,
noting that the latter contains the former. Moreover, the intrinsic persistent homology
measure transform (IPHMT) is injective up to measure-preserving isometry on the set
{(G, µ) ∈ MMGraphs | ΨG injective}.

   Our next goal is to show that these subsets are appropriately large. The first positive
result is stated in terms of the Gromov-Hausdorff topology:

Proposition (4.4.5). The set {G ∈ MGraphs | ΨG injective} is GH-dense in MGraphs.

   To obtain a generic injectivity result, we move to the slightly coarser fibered topology:

Theorem (4.4.8.A). There is a subset U ⊂ MGraphs containing {G ∈ MGraphs |
ΨG injective} and {G ∈ MGraphs | χG injective} which is open and dense in the fibered
topology, and such that the IPHT and IECT are injective on U , up to isometry.

Theorem (4.4.8.B). Let MGraphs∗ be the subset of MGraphs consisting of graphs whose
underlying combinatorial graph has (i) no topological self-loops and (ii) at least three ver-
tices of valence not equal to two. Let π : MMGraphs → MGraphs be the forgetful map
π(G, µ) = G. Then the intrinsic persistent homology measure transform is injective on
π −1 (U ∩ MGraphs∗ ), up to measure-preserving isometry.


   The upshot is that the set of counterexamples for injectivity sits inside a nowhere dense
set of metric graphs (in the fibered topology), which, depending on the application in mind,
may be of no consequence.


   Finally, in Section 4.5, we outline the proofs of these injectivity results, with the technical
casework relegated to the appendices.


The Distance Kernel Transform

The second topological transform we study takes as input a compact metric measure space
(X, dX , µX ). The functions of interest are linear combinations of the eigenfunctions of the
following integral operator D^X : L²(X) → L²(X):

\[ (D^X f)(x) = \int_X d_X(x, y)\, f(y)\, d\mu_X(y) \]


   The eigenfunctions φi of D^X, with eigenvalues λi, encode the geometry of X in a very compact way, and
can be thought of as generalizations of the distance-to-the-basepoint functions dp considered
earlier. This is the reason that we favor the use of these eigenfunctions over those of the
Laplacian. See Figure 1.7 for plots of these eigenfunctions on the circle and the torus.
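
For a finite metric measure space, the operator above reduces to a matrix acting on vectors of function values, and its leading eigenpair can be approximated numerically. The Python sketch below uses plain power iteration; it is an illustration under stated assumptions rather than the method of the thesis. It assumes at least three points with strictly positive pairwise distances and strictly positive weights (so that the leading eigenvalue is positive and strictly dominant), and the helper names are hypothetical.

```python
import math

def apply_operator(dist, mu, f):
    """(D f)(x_i) = sum_j dist[i][j] * f[j] * mu[j] on a finite space."""
    n = len(mu)
    return [sum(dist[i][j] * f[j] * mu[j] for j in range(n)) for i in range(n)]

def inner(mu, f, g):
    """Inner product in L^2(mu)."""
    return sum(m * a * b for m, a, b in zip(mu, f, g))

def leading_eigenpair(dist, mu, iterations=200):
    """Power iteration for the eigenpair of largest absolute eigenvalue."""
    f = [1.0] * len(mu)
    for _ in range(iterations):
        g = apply_operator(dist, mu, f)
        norm = math.sqrt(inner(mu, g, g))
        f = [value / norm for value in g]
    # Rayleigh quotient; f has unit norm in L^2(mu) at this point.
    eigenvalue = inner(mu, apply_operator(dist, mu, f), f)
    return eigenvalue, f

# Hypothetical example: three points at mutual distance 1 with uniform measure.
dist = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
mu = [1 / 3, 1 / 3, 1 / 3]
eigenvalue, phi = leading_eigenpair(dist, mu)
print(eigenvalue, phi)  # approximately 2/3 and the constant function 1
```

The normalization is taken in L²(µ) rather than in the Euclidean norm, so that the output matches the convention that eigenfunctions are unit vectors for the measure µX.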


   Before considering the topological transform induced by this family of functions, we study
their geometric and analytic properties. In Section 5.3 we show that, when the associated
eigenvalue is nonzero, our eigenfunctions are Lipschitz and smooth:
Lemma. (5.6.6) When λi ≠ 0, φi is √(Vol(X)/|λi|)-Lipschitz.


          Figure 1.7: Plots of the eigenfunctions of the operator DX on a circle and a torus.


Lemma. (5.3.4) Any eigenfunction φ of the distance kernel operator DX with nonzero eigen-
value is smooth.

   We then go on to define coordinates αi : X → C ≅ R^2 given by αi(x) = √λi φi(x) (note
that the eigenvalues may be negative, with imaginary square root) and a map Φn : X →
C^n ≅ R^{2n} given by Φn = (α1, α2, · · · , αn). When n = ∞, we write Φ for Φ∞. We obtain the
following injectivity results, which assert that using infinitely many eigenfunctions separates
points on X (injectivity), and using finitely many eigenfunctions separates sufficiently distant
points on X (coarse injectivity):

Lemma. (5.3.7) Let (X, dX , µX ) be a compact, strictly positive metric measure space. Then
the map Φ : X → R∞ is injective.

Corollary. (5.3.11) Let M be a complete k-dimensional Riemannian manifold with positive
injectivity radius R. For every r ≤ R/2 there is a natural number N = N (k, r) such that if
Φn (x) = Φn (y) for n ≥ N then dX (x, y) ≤ 3r.

Corollary. (5.3.14) Let (X, dX , µX ) be a compact (a, b)-standard metric measure space with
threshold parameter r. For every s ≤ r there is a natural number N = N (s, a, b) such that if
Φn (x) = Φn (y) for n ≥ N then dX (x, y) ≤ 3s.

      Moving on, Section 5.4 considers injectivity results on the space of metric measure spaces.
The following lemma tells us that, under appropriate regularity conditions, Φ(X1 ) = Φ(X2 )
implies X1 = X2 .

Lemma. (5.4.2) Fix a set X. Let µ1 and µ2 be strictly positive measures on X, with
µ1 absolutely continuous with respect to µ2 , and d1 and d2 metrics on X making X1 =
(X, d1 , µ1 ) and X2 = (X, d2 , µ2 ) metric measure spaces. Let D1 and D2 be the resulting
integral operators. If Φ(X1 ) = Φ(X2 ), then d1 = d2 . If, furthermore, the Radon-Nikodym
derivative dµ1 /dµ2 is continuous, then µ1 = µ2 .

      When only finitely many eigenfunctions are used, we show that the Hausdorff distance
between Φn (X) and Φn (Y ) controls the Gromov-Hausdorff distance between X and Y:

Theorem 1.4.2. (5.4.9) Let (X, dX, µX) and (Y, dY, µY) be finite metric measure spaces,
with eigenvalues {λi} and {νi}. Let k ≤ |X|, |Y|, and suppose that d_H^{L^2}(Φk(X), Φk(Y)) ≤ ε.
Let Xθ ⊆ X be those points x ∈ X with µX(x) ≥ θ. Then for any θ ≥ 0,

\[ d_{GH}(X_\theta, Y_\theta) \le 2\varepsilon \max\bigl(\sqrt{|\lambda_1|}, \sqrt{|\nu_1|}\bigr) + \varepsilon^2 + \frac{\lambda_{k+1} + \nu_{k+1}}{\theta} \]

Theorem 1.4.3. (5.4.14) Let (X, dX, µX) and (Y, dY, µY) be finite, doubling metric mea-
sure spaces, with eigenvalues {λi} and {νi}. Fix k ∈ N and δ > 0, and suppose that
d_H^{L^2}(Φk(X), Φk(Y)) ≤ ε. Then there exists an Nk such that

\[ d_{GH}(X, Y) \le 2(\varepsilon + 2\delta) \max\bigl(\sqrt{|\lambda_1 + \delta|}, \sqrt{|\nu_1 + \delta|}\bigr) + (\varepsilon + 2\delta)^2 + N_k(\lambda_{k+1} + \nu_{k+1} + 2\delta) + 2\delta \]

       In Section 5.5, we use the geometric and regularity results of the prior sections, together
with some analytic results in the theory of persistence, to prove the existence of persistence
diagrams and Euler curves for our eigenfunctions:

Proposition. (5.5.7)
       Let (X, dX, µX) be a compact metric measure space. For any homological degree k ≥ 0,
and any finite linear combination f = Σ_{i=1}^n ci φi of eigenfunctions of D^X with nonzero
eigenvalue, we (tentatively) define the degree-k Betti curve to be the sum of the indicator
functions of the intervals in the degree-k persistent homology of (X, f ):

\[ \beta_k(X, f) = \sum_{I \in PH_k(X, f)} \mathbf{1}_I \]

The Euler curve is then (tentatively) defined to be the alternating sum of these Betti curves:

\[ \chi(X, f) = \sum_{k=0}^{\infty} (-1)^k \beta_k(X, f) \]


Suppose now that X is homeomorphic to the geometric realization of a finite simplicial com-
plex, and implies bounded degree-q total persistence. Let p = 1/q. Then for any homological
degree k, the sum defining βk (X, f ) converges in Lp . Under the same hypothesis, the sum
defining χ(X, f ) is finite, so that the Euler curve is likewise shown to exist as a function in
Lp .
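
When the barcode is finite, the Betti and Euler curves defined above can be evaluated pointwise by counting intervals. The following Python sketch shows this; it assumes, for illustration only, that each interval is half-open of the form [birth, death) and that barcodes are supplied as a dictionary from homological degree to a list of (birth, death) pairs. The function names and the example barcode are hypothetical.

```python
def betti_curve(intervals, t):
    """Value at t of the sum of indicator functions of the intervals [b, d)."""
    return sum(1 for birth, death in intervals if birth <= t < death)

def euler_curve(barcodes_by_degree, t):
    """Alternating sum over degrees k of the Betti curves, evaluated at t."""
    return sum((-1) ** k * betti_curve(bars, t)
               for k, bars in barcodes_by_degree.items())

# Hypothetical barcode: one essential class in degree 0 born at 0,
# and one essential class in degree 1 born at 2.
inf = float("inf")
barcodes = {0: [(0.0, inf)], 1: [(2.0, inf)]}
print(euler_curve(barcodes, 1.0))  # 1
print(euler_curve(barcodes, 3.0))  # 0
```

For infinite barcodes the sums need not be finite at every point, which is why the proposition above treats convergence in Lp rather than pointwise.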

       We then introduce a number of new topological transforms. Embedding X into R2n
via Φn and then computing persistence diagrams or Euler curves of functions of the form
fv(x) = ⟨x, v⟩ for v ∈ S^{2n−1}, we obtain the Embedded Persistence Kernel Transform (E-PKT)
or Embedded Euler Kernel Transform (E-EKT). Alternatively, we might compute persistence
diagrams and Euler curves of normalized linear combinations of eigenfunctions on X, giving
rise to the Intrinsic Persistence Kernel Transform (I-PKT) or Intrinsic Euler Kernel Trans-
form (I-EKT). We conclude by demonstrating that whereas the intrinsic transforms enjoy
nice continuity properties, the embedded transforms enjoy coarse injectivity properties:

Proposition 1.4.4. (5.5.12) Suppose that X is a compact metric measure space. The
I-PKTn is continuous on S^{2n−1}.

Theorem 1.4.5. (5.5.13) Let X and Y be compact, strictly positive metric measure spaces.
There exists a function g_{X,Y} : N → R+ with the following property: if E-PKTn(X) =
E-PKTn(Y) or E-EKTn(X) = E-EKTn(Y), then:

\[ d_G(X, Y) \le g_{X,Y}(n) \]


   When Φn is injective on X, the intrinsic and embedded transforms are identical, and we
obtain an invariant which is both continuous and coarsely injective.
                                                                 CHAPTER 2


Background


Throughout this thesis, we will assume familiarity with the standard theory of metric spaces,
real analysis, measure theory, point-set and algebraic topology. The purpose of this chapter
is to introduce the reader to more advanced, relevant concepts in these fields. Some of the
following topics are essential to understanding the results of the thesis, whereas others are
important for understanding the proofs. For readers most interested in parsing the results,
we have indicated which topics can be skipped.


2.1     Metric Geometry

2.1.1    The Hausdorff and Gromov-Hausdorff Metrics

In this section, we define two metrics on spaces of metric objects: the Hausdorff and Gromov-
Hausdorff metrics. The Hausdorff metric allows us to compare compact subsets of a fixed



metric space.

Definition 2.1.1. Let (Z, dZ) be a metric space, and define C(Z) to be the set of compact
subsets of Z. The metric dZ on Z induces a natural metric on C(Z), called the Hausdorff
metric d^Z_H, defined as follows. Firstly, for X ∈ C(Z) and ε > 0, define N_ε(X) = {z ∈ Z |
d(z, X) ≤ ε}. Then, for X, Y ∈ C(Z), set d^Z_H(X, Y) = inf{ε | X ⊆ N_ε(Y) and Y ⊆ N_ε(X)}.
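
For finite subsets of Euclidean space, the infimum in this definition is attained and can be computed directly: the smallest ε with X ⊆ N_ε(Y) is the largest distance from a point of X to its nearest point of Y. The Python sketch below illustrates this special case only; the function name is hypothetical, and points are assumed to be given as coordinate tuples.

```python
import math

def hausdorff_distance(X, Y):
    """Hausdorff distance between two finite, non-empty point sets."""
    def directed(A, B):
        # Smallest eps such that A is contained in the eps-neighborhood of B.
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(directed(X, Y), directed(Y, X))

X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(0.0, 0.0), (3.0, 0.0)]
print(hausdorff_distance(X, Y))  # 2.0
```

The two directed terms differ here (1.0 and 2.0), which shows why the definition requires both inclusions.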

   The Gromov-Hausdorff metric allows us to compare compact metric spaces that need not
be embedded in a common space.

Definition 2.1.2. Let (X, dX ), (Y, dY ) be any two compact metric spaces. A correspondence
M between X and Y is a subset of X × Y whose projections to X and Y are surjective, i.e.
πX (M) = X and πY (M) = Y . Intuitively, a correspondence pairs up elements of our two
spaces so that each element of one space is paired up with at least one element of the other.
The cost of a correspondence M is defined as


\[ \mathrm{cost}(M) = \sup_{(x,y),(x',y') \in M} |d_X(x, x') - d_Y(y, y')| \]


   The Gromov-Hausdorff distance between X and Y is then defined to be the infimum
over the costs of all correspondences between them.


\[ d_{GH}(X, Y) = \inf_{M} \mathrm{cost}(M) \]
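
For very small finite metric spaces, the definition can be evaluated literally by enumerating every correspondence. The Python sketch below does exactly that, using the cost convention stated above (with no factor of 1/2). It is only meant to make the definition concrete: the search is exponential in |X| · |Y|, the function name is hypothetical, and the spaces are assumed to be given as distance matrices.

```python
from itertools import combinations, product

def gromov_hausdorff_bruteforce(dX, dY):
    """Infimum of cost(M) over all correspondences M, by exhaustive search."""
    nX, nY = len(dX), len(dY)
    all_x, all_y = set(range(nX)), set(range(nY))
    pairs = list(product(range(nX), range(nY)))
    best = float("inf")
    for size in range(max(nX, nY), len(pairs) + 1):
        for M in combinations(pairs, size):
            # A correspondence must project onto all of X and all of Y.
            if {x for x, _ in M} != all_x or {y for _, y in M} != all_y:
                continue
            cost = max(abs(dX[x][x2] - dY[y][y2])
                       for (x, y) in M for (x2, y2) in M)
            best = min(best, cost)
    return best

# Two points at distance 1 versus two points at distance 3.
dX = [[0, 1], [1, 0]]
dY = [[0, 3], [3, 0]]
print(gromov_hausdorff_bruteforce(dX, dY))  # 2
```

In this example the optimum is attained by a bijection; any correspondence pairing one point of X with both points of Y has cost 3.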


   There are many other formulations of the Gromov-Hausdorff distance: interested readers
can consult [BBI01]. An important result is that for compact spaces (as we are considering
here), the infimum in the above definition is actually a minimum, i.e. it is realized by some
(not necessarily unique) minimal cost correspondence, see [IIT16].
   Before moving on, two points of note:

  • If X, Y are common subsets of a space Z, dGH (X, Y ) ≤ dZH (X, Y ), and this inequality
     will generally be strict. That is, the intrinsic quantity dGH (X, Y ) need not agree with
     the extrinsic quantity dZH (X, Y ).

  • Whereas the Hausdorff distance can be efficiently computed and approximated, this is
     not the case for the Gromov-Hausdorff distance. Thus, there is a need for computable,
     robust methods of comparing metric objects.


2.1.2    Metric Measure Spaces

A metric measure space is a metric space (X, dX ) equipped with a Borel probability measure
µX . For a general exposition on metric measure spaces, see [Mém11].


   We will restrict our attention to metric measure spaces of full support, i.e. triples
(X, dX , µX ) with X = supp(µX ). The space of such objects can be endowed with a va-
riety of metrics, with one choice being the D∞ metric defined in [Mém11], which we recall
here. Given a pair of compact metric measure spaces (X, dX , µX ) and (Y, dY , µY ), a metric
coupling π is a measure on X × Y with µX and µY as marginals. The set of all metric
couplings is denoted M(µX , µY ). For a fixed metric coupling π ∈ M(µX , µY ) with support
supp(π), we define the cost of that coupling as

\[ J_\infty(\pi) = \frac{1}{2} \sup_{(x,y),(x',y') \in \mathrm{supp}(\pi)} \Gamma_{X,Y}(x, y, x', y') \]

where

\[ \Gamma_{X,Y}(x, y, x', y') = |d_X(x, x') - d_Y(y, y')| \]

   The D∞ metric is then defined to be the infimum of the cost J∞(π) over all possible
couplings π.

\[ D_\infty((X, \mu_X), (Y, \mu_Y)) = \inf_{\pi \in M(\mu_X, \mu_Y)} J_\infty(\pi) \]


   The support of a measure coupling is always a correspondence (Lemma 2.2 of [Mém11]),
but not every correspondence between the supports of two measures comes from a measure
coupling; thus, this quantity is generally larger than the Gromov-Hausdorff distance.
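
To make the cost of a single coupling concrete, the Python sketch below evaluates J∞(π) for finite spaces, where a coupling is a non-negative matrix whose row and column sums are the two measures. It does not attempt the minimization over all couplings that defines D∞; the function names and the matrix encoding are assumptions made for illustration.

```python
def is_coupling(coupling, muX, muY, tol=1e-12):
    """Check that the row sums are muX and the column sums are muY."""
    rows_ok = all(abs(sum(row) - m) <= tol for row, m in zip(coupling, muX))
    cols_ok = all(abs(sum(row[j] for row in coupling) - m) <= tol
                  for j, m in enumerate(muY))
    return rows_ok and cols_ok

def coupling_cost(dX, dY, coupling):
    """J_inf(pi): half the largest metric distortion over the support of pi."""
    support = [(i, j) for i, row in enumerate(coupling)
               for j, mass in enumerate(row) if mass > 0]
    return 0.5 * max(abs(dX[x][x2] - dY[y][y2])
                     for (x, y) in support for (x2, y2) in support)

dX = [[0, 1], [1, 0]]
dY = [[0, 3], [3, 0]]
diagonal = [[0.5, 0.0], [0.0, 0.5]]    # supported on a bijection
product = [[0.25, 0.25], [0.25, 0.25]]  # independent coupling, full support
print(coupling_cost(dX, dY, diagonal))  # 1.0
print(coupling_cost(dX, dY, product))   # 1.5
```

Both matrices are couplings of the uniform measures, but the product coupling has a larger support and therefore a larger cost.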


2.1.3    Reeb Graphs

The theory of Reeb graphs will play an important, technical role in this thesis. Therefore,
as a guide to the curious reader, we provide here some basic, relevant definitions. Those
mainly interested in the results of the thesis, however, can skip the details of this section,
sufficing with the following summary: a Reeb graph is a metric space Rf (X) associated to
a real-valued function f : X → R on a topological space X. Under mild assumptions on
the topological space X, Rf (X) will be homeomorphic to a graph (justifying the nomencla-
ture). When X is a metric graph, and f = dX (p, ·) is the distance-from-p function, Rf (X)
is homeomorphic to X but has a distinct metric.


Definition 2.1.3. Given a topological space X and a continuous function f : X → R,
one can define an equivalence relation ∼f between points in X, where x ∼f y if and only if
f (x) = f (y) and x, y belong to the same path-component of f −1 (f (x)) = f −1 (f (y)). The
Reeb graph Rf (X) is the quotient space X/ ∼f , and since f is constant on equivalence
classes, it descends to a well-defined function on Rf (X).

   The space Rf (X) is equipped with the following (potentially infinite-valued) metric.

\[ d_f(x, x') = \min_{\pi : x \to x'} \Bigl\{ \max_{t \in [0,1]} f \circ \pi(t) - \min_{t \in [0,1]} f \circ \pi(t) \Bigr\} \]

         where π : [0, 1] → Rf ranges over all continuous paths from x to x′ in Rf .


   Let us denote by Reeb the space of all compact Reeb graphs. Reeb admits a few natural
metrics, with one common choice being the functional distortion distance, or FD distance.
It is defined as follows.

Definition 2.1.4.

\[ d_{FD}(R_f, R_g) = \inf_{\phi, \psi} \max\Bigl\{ \|f - g \circ \phi\|_\infty,\ \|f \circ \psi - g\|_\infty,\ \tfrac{1}{2} D(\phi, \psi) \Bigr\} \]

   where

   • φ : Rf → Rg and ψ : Rg → Rf are continuous maps.

   • D(φ, ψ) = sup{|d_f(x, x′) − d_g(y, y′)| such that (x, y), (x′, y′) ∈ C(φ, ψ)}, where C(φ, ψ) =
      {(x, φ(x)) | x ∈ Rf} ∪ {(ψ(y), y) | y ∈ Rg}.

   Thus, the FD distance has three components, two that make sure that the approximating
maps preserve the f - and g-functions, and a third which ensures that distances in Rf and
Rg are preserved through the maps φ, ψ. The following lemma states that the FD distance
is well-defined on isometry classes of Reeb graphs.

Lemma 2.1.5. Let Rf and Rg be Reeb graphs. Suppose that dF D (Rf , Rg ) = 0. Then there
are a pair of isometries φ : Rf → Rg and ψ : Rg → Rf which preserve functions, i.e.
∀x ∈ Rf , f (x) = g(φ(x)), and ∀y ∈ Rg , g(y) = f (ψ(y)).

Proof. Let φn and ψn be continuous maps for which the term

\[ \max\Bigl\{ \tfrac{1}{2} D(\phi_n, \psi_n),\ \|f - g \circ \phi_n\|_\infty,\ \|f \circ \psi_n - g\|_\infty \Bigr\} \]

is less than 1/n, i.e. approximate matchings between our Reeb graphs whose distortion
approaches the infimum, which in this case is zero. The fact that the term ½ D(φn, ψn)
goes to zero means that the Gromov-Hausdorff distance between Rf and Rg goes to zero,
and hence, by Theorem 7.3.30 in [BBI01], Rf and Rg are isometric. In fact, the proof of
that theorem demonstrates that some subsequence of these maps φn and ψn converges to a
pair of isometries φ, ψ. The requirements that kf − g ◦ φn k∞ → 0 and kf ◦ ψn − gk∞ → 0
ensure that these isometries preserve the height functions.


2.1.4     Metric Graphs

In this paper, a combinatorial graph will be a 1-dimensional cell complex, also called a
multigraph in the literature. These graphs may contain self-loops and multiple edges between
vertices. A metric graph is a combinatorial graph with weighted edges equipped with the
induced path metric, and a metric measure graph is a metric graph equipped with a Borel
measure. In the setting of this paper we will always assume our graphs are compact. We
will write MGraphs to denote the space of all compact metric graphs, and MMGraphs
to denote the space of metric measure spaces (G, dG , µG ), where (G, dG ) ∈ MGraphs and
µG is a Borel measure of full support.

Definition 2.1.6. A simple cycle in a graph G is a sequence of vertices v_0, v_1, · · · , v_{n−1}, v_n =
v_0 with each [v_i, v_{i+1}] an edge in G, and no vertex repeated aside from the base vertex v_0 = v_n.

Definition 2.1.7. A topological self-loop in a metric graph G is a simple cycle where every
non-base vertex has valence equal to two. Any metric graph G is isometric to a graph
G′ obtained from G by deleting vertices of valence two and merging the adjacent edges. A
topological self-loop in G becomes a proper self-loop in the graph G′.

   The space MGraphs can be topologized in various ways, two of which we highlight here:

Definition 2.1.8. The Gromov-Hausdorff topology on MGraphs is the one induced by the
Gromov-Hausdorff metric.

   We now motivate the second topology. Let Ω be the following disjoint union of Euclidean
cones.

\[ \Omega = \coprod_{X = (V, E)} \mathbb{R}^E_{>0} / \mathrm{Aut}(X) \]

where the disjoint union iterates over the set of distinct, unlabeled combinatorial graphs
(considered up to graph isomorphism). On each R^E_{>0}, we quotient out by the following action
of Aut(X): an automorphism γ of X permuting the edges of X gives rise to a map on R^E_{>0},
sending f ∈ R^E_{>0} to f ◦ γ (precomposition by γ). There is a bijection p : Ω → MGraphs
that, for a given combinatorial graph X and an assignment of edge weights \vec{v}, gives the
induced path metric. This gives rise to the following topology on MGraphs, which we call
the fibered topology.

Definition 2.1.9. Identify MGraphs with Ω using the above bijection, and give Ω the
disjoint union topology coming from the L2 topology on each copy of R^E_{>0} / Aut(X) (which
are homeomorphic to Euclidean fans). We will refer to this topology as the fibered topology
on MGraphs, as it decomposes this space into a countable family of disjoint open sets (the
fibers of MGraphs), each corresponding to a distinct combinatorial structure.

Remark 2.1.10. The fibered topology is Hausdorff on MGraphs, where metric graphs are
identified up to isometric isomorphism (i.e. the isometry must also preserve the combinatorial
structure). If we choose to identify metric graphs up to isometry, we will have to work with
the associated quotient topology coming from this coarser equivalence relation.

Remark 2.1.11. Note that the fibered topology arises naturally when considering probability
measures on MGraphs defined as mixture-models, where one first selects one of (countably
many) combinatorial graphs X = (V, E) and then chooses edge weights with a Borel measure
on R^E_{>0} with density with respect to Lebesgue measure.


2.2     Algebraic Topology

2.2.1    Persistent Homology

In this section, we define the persistent homology construction in general terms, starting
with identifying its domain and range categories.

Definition 2.2.1. We define the category of R-filtered topological spaces RTop as follows.
An object X of RTop is a family of topological spaces X(r) indexed by r ∈ R, with set
inclusions X(r ≤ s) : X(r) ,→ X(s) for all r ≤ s ∈ R. A morphism of R-filtered topological
spaces X and Y is a family ψ of continuous maps, ψ(r) : X(r) → Y (r), with ψ(s) |X(r) = ψ(r)
for r ≤ s. Equivalently, we assert that the following square commutes.

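\[
\begin{array}{ccc}
X(r) & \xrightarrow{\;X(r \le s)\;} & X(s) \\
\big\downarrow{\scriptstyle \psi(r)} & & \big\downarrow{\scriptstyle \psi(s)} \\
Y(r) & \xrightarrow{\;Y(r \le s)\;} & Y(s)
\end{array}
\]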

   A rich source of examples of R-filtered topological spaces stems from point clouds (see
Figure 1.3). There are various ways to obtain an object in RTop from a point cloud X, with
one of the most common being the Vietoris-Rips complex.

Definition 2.2.2. Let X ⊂ Rd be a point cloud. The Vietoris-Rips (VR) filtration V R(X)
is a filtration on the full simplex on the set X (i.e. the simplex of dimension |X| − 1). For
r ∈ R, the subspace (V R(X)) (r) consists of those simplices of diameter ≤ r.
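
As a concrete reading of this definition, the Python sketch below lists the simplices of the Vietoris-Rips filtration at a single scale r, by testing the diameter of every vertex subset up to a chosen dimension. It is an illustrative brute-force enumeration for small point clouds, with the function name and the max_dim cutoff introduced here as assumptions.

```python
from itertools import combinations
import math

def vietoris_rips(points, r, max_dim=2):
    """Simplices (as index tuples) of diameter <= r, up to dimension max_dim."""
    simplices = []
    for k in range(1, max_dim + 2):  # k vertices span a (k - 1)-simplex
        for simplex in combinations(range(len(points)), k):
            # The diameter of a single vertex is 0 (the default of the empty max).
            diameter = max((math.dist(points[i], points[j])
                            for i, j in combinations(simplex, 2)), default=0.0)
            if diameter <= r:
                simplices.append(simplex)
    return simplices

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(len(vietoris_rips(points, 1.0)))  # 5: three vertices and two edges
print(len(vietoris_rips(points, 1.5)))  # 7: the third edge and the triangle appear
```

Because a simplex enters as soon as all of its edges are present, the long edge and the triangle appear at the same scale in this example.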

   One can also obtain R-filtered topological spaces by using real-valued functions.

Definition 2.2.3. Let X be a topological space, and f : X → R a continuous, real-valued
function. We will write (X, f ) to denote the R-filtered topological space consisting of the
sublevel sets of f ,
                              (X, f )(r) = {x ∈ X | f (x) ≤ r}.

Definition 2.2.4. We now define the category of persistence modules k-Mod. An object
M of k-Mod is a family of vector spaces M (r) indexed by r ∈ R, together with linear maps
M (r ≤ s) : M (r) → M (s) for all r ≤ s ∈ R. These linear maps are required to satisfy the
following compatibility axioms: M (r ≤ r) = idM (r) , and M (r ≤ t) = M (s ≤ t) ◦ M (r ≤ s)
for r ≤ s ≤ t ∈ R. A morphism ψ of persistence-modules M and N is a family of maps
ψ(r) : M (r) → N (r) making the following square commute for all r ≤ s.
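\[
\begin{array}{ccc}
M(r) & \xrightarrow{\;M(r \le s)\;} & M(s) \\
\big\downarrow{\scriptstyle \psi(r)} & & \big\downarrow{\scriptstyle \psi(s)} \\
N(r) & \xrightarrow{\;N(r \le s)\;} & N(s)
\end{array}
\]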

   We define the persistence map as follows.

Definition 2.2.5. Let X be an R-filtered topological space. The associated degree-d per-
sistence module M has the degree-d singular homology group M (r) = Hd (X(r); k) at each
index r ∈ R, and the morphism M (r ≤ s) : M (r) → M (s) induced in homology by the
inclusion X(r) ,→ X(s) for each r ≤ s ∈ R. We will use the notation P Hd (X) = M to
indicate that M is the degree-d persistent homology of X. When our R-filtered topological
space is the sublevel set filtration induced by a continuous real-valued function on a topo-
logical space, f : T → R, we will write P Hd (T, f ) for the resulting persistence module; this
is called functional persistence in the literature.¹
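
In degree 0, functional persistence of a function on a finite graph can be computed with a union-find structure. The Python sketch below assumes a common discretization that is not taken from the thesis: the function is given by its values on vertices, and an edge enters the sublevel set at the larger of its two endpoint values. Components are merged in order of appearance, and the younger component dies at each merge (the elder rule). The function name is hypothetical.

```python
def sublevel_persistence_h0(values, edges):
    """Degree-0 barcode of the sublevel-set filtration of a function on a graph.

    values[v] is f(v); an edge (u, v) enters at max(f(u), f(v)).
    """
    parent = list(range(len(values)))
    birth = list(values)  # birth[root] is the minimum of f on that component

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    bars = []
    for u, v in sorted(edges, key=lambda e: max(values[e[0]], values[e[1]])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # the edge closes a cycle, so H_0 is unchanged
        t = max(values[u], values[v])
        if birth[ru] > birth[rv]:
            ru, rv = rv, ru  # make rv the younger component
        if birth[rv] < t:
            bars.append((birth[rv], t))  # skip zero-length bars
        parent[rv] = ru
    roots = {find(v) for v in range(len(values))}
    bars.extend((birth[root], float("inf")) for root in roots)
    return sorted(bars)

# A path a - b - c with f(a) = 0, f(b) = 2, f(c) = 1.
print(sublevel_persistence_h0([0.0, 2.0, 1.0], [(0, 1), (1, 2)]))
# [(0.0, inf), (1.0, 2.0)]
```

The component born at the local minimum c merges into the older component of a when the sublevel set reaches b, giving the finite bar (1.0, 2.0).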

       For the remainder of the survey, we will omit any reference to the choice of field k, except
when it is necessary to be explicit.

       When computing homology in multiple degrees, we will want to keep track of all the re-
sulting persistence modules at once. The appropriate algebraic object is a graded persistence
module.

Definition 2.2.6. A graded persistence module M = ⊕_{i∈N} M_i is the direct sum of a family
of persistence modules indexed over the natural numbers, together with the labeling
that records which factor is associated to which number². The graded persistence module
associated to an R-filtered topological space X is then

\[ PH(X) = \bigoplus_{i \in \mathbb{N}} PH_i(X). \]


       Though persistence modules are not vectors, they still live in a metric space. Indeed, the
category k-Mod comes equipped with an extended pseudo-metric: the interleaving distance
dI .

Definition 2.2.7. An ε-interleaving of persistence modules M and N consists of two families
of morphisms, f (r) : M (r) → N (r + ε) and g(r) : N (r) → M (r + ε), making the following
four diagrams commute for all r ≤ s.
   1
     Throughout the survey, we will use capital letters such as X and Y to refer to elements of both RTop
and Top. It will always be made clear, either explicitly or from the context, which one is intended.
   2
     Note that the grading here happens in the category of abelian groups, rather than in the category of
modules. That is, the grading does not come with a multiplicative structure.
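\[
\begin{array}{ccc}
M(r) & \xrightarrow{\;M(r \le s)\;} & M(s) \\
\big\downarrow{\scriptstyle f(r)} & & \big\downarrow{\scriptstyle f(s)} \\
N(r+\epsilon) & \xrightarrow{\;N(r+\epsilon \le s+\epsilon)\;} & N(s+\epsilon)
\end{array}
\qquad
\begin{array}{ccc}
N(r) & \xrightarrow{\;N(r \le s)\;} & N(s) \\
\big\downarrow{\scriptstyle g(r)} & & \big\downarrow{\scriptstyle g(s)} \\
M(r+\epsilon) & \xrightarrow{\;M(r+\epsilon \le s+\epsilon)\;} & M(s+\epsilon)
\end{array}
\]

\[
\begin{array}{ccc}
M(r) & \xrightarrow{\;M(r \le r+2\epsilon)\;} & M(r+2\epsilon) \\
 & {\scriptstyle f(r)}\searrow \qquad \nearrow{\scriptstyle g(r+\epsilon)} & \\
 & N(r+\epsilon) &
\end{array}
\qquad
\begin{array}{ccc}
N(r) & \xrightarrow{\;N(r \le r+2\epsilon)\;} & N(r+2\epsilon) \\
 & {\scriptstyle g(r)}\searrow \qquad \nearrow{\scriptstyle f(r+\epsilon)} & \\
 & M(r+\epsilon) &
\end{array}
\]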

   Intuitively, one can think of such an interleaving as an approximate isomorphism of
persistence modules. Indeed, a 0-interleaving is exactly an isomorphism.

Definition 2.2.8. The interleaving distance dI between M and N is the infimum of values ε
for which an ε-interleaving exists. It satisfies the triangle inequality but can be zero between
non-isomorphic modules, or equal to infinity.

   The category of persistence modules is abelian, which, among other things, allows one to
take direct sums of persistence modules, defined pointwise.

Definition 2.2.9. Let M and N be a pair of persistence modules. We define their direct
sum M ⊕ N to be the persistence module with vector spaces (M ⊕ N )(r) = M (r) ⊕ N (r)
and maps (M ⊕ N )(r ≤ s) = M (r ≤ s) ⊕ N (r ≤ s) for any r ≤ s.

   An indecomposable persistence module is one that cannot be written as the direct sum of
two nonzero persistence modules. Examples of such modules include the interval persistence
modules kI , defined as follows. Given an interval I ⊂ R, let kI be the persistence module
with kI (r) = k for r ∈ I and kI (r) = 0 otherwise, and with kI (r ≤ s) = idk for r ≤ s ∈ I
and the zero map otherwise.
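
The pointwise definitions above translate directly into a small computation. The sketch below assumes a module given as a finite direct sum of half-open interval modules k_[a, b), represented as a list of (a, b) pairs; the helper names `dim_at` and `rank_between` are hypothetical. Dimensions and ranks add over direct summands, and the structure map of a single interval module from r to s has rank 1 exactly when both r and s lie in the interval.

```python
def dim_at(barcode, r):
    """Dimension of M(r) for M the direct sum of interval modules k_[a, b)."""
    return sum(1 for (a, b) in barcode if a <= r < b)

def rank_between(barcode, r, s):
    """Rank of the structure map M(r <= s): count intervals containing r and s."""
    if r > s:
        raise ValueError("need r <= s")
    return sum(1 for (a, b) in barcode if a <= r and s < b)

barcode = [(0, 2), (1, 3)]             # M = k_[0,2) (+) k_[1,3)
print(dim_at(barcode, 1.5))            # both intervals contain 1.5
print(rank_between(barcode, 0.5, 2.5)) # no interval contains both indices
```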

       The category k-Mod contains some wild objects that are difficult to work with. Thus,
it is necessary to restrict our attention to a class of well-behaved persistence modules which
suffices for practical applications:

Definition 2.2.10. We say that a persistence module M is pointwise finite-dimensional
(pfd) if each vector space M (r) is finite dimensional.

       The following theorem asserts that every pfd persistence module has a particularly simple
decomposition into indecomposables, and highlights the important role played by interval
modules in the theory of persistence.

Theorem 2.2.11 ([CB15]). Every pfd persistence module is isomorphic to the direct sum of
interval modules. Moreover, the decomposition is unique up to isomorphism and reordering
of the terms.

Definition 2.2.12. A barcode B is a multi-set of intervals, i.e. intervals can appear with
multiplicity.

Definition 2.2.13. Every barcode B can be associated with a multi-set of points in the
plane, Dg(B), called its persistence diagram. This is done by mapping the interval ⟨a, b⟩³ to
the point (a, b) ∈ R². See Figure 2.1.

       From Theorem 2.2.11, we see that pfd persistence modules admit a complete invariant:
the barcode formed by the collection of intervals involved in the direct sum decomposition
of the module. This terminology comes from plotting the intervals along a common axis, as
in the left-hand side of Figure 2.1.


   3
    This notation is used to specify the endpoints of the interval, without asserting that it be open or closed
at either end.
                    Figure 2.1: The persistence diagram associated to a barcode.


   The space of barcodes has a natural metric: the bottleneck distance dB , defined as follows.

Definition 2.2.14. An ε-matching between multi-sets of intervals ℐ and 𝒥 is a bijection
between subsets ℐ′ ⊆ ℐ and 𝒥′ ⊆ 𝒥 such that if the interval [a, b] = I ∈ ℐ′ is matched with
the interval [c, d] = J ∈ 𝒥′ then max{|a − c|, |b − d|} ≤ ε, and such that any interval in
ℐ \ ℐ′ or 𝒥 \ 𝒥′ has diameter at most 2ε. The bottleneck distance between barcodes is the
infimum of values ε for which there exists an ε-matching between them.
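
For very small barcodes the definition can be evaluated by exhaustive search. The following sketch is illustrative only: it assumes finite intervals given as (a, b) pairs, pads each barcode with "unmatched" slots so that every partial matching becomes a permutation, and charges an unmatched interval half its length (the smallest ε for which its diameter is at most 2ε). The function name is hypothetical, and the running time is factorial in the total number of intervals, so this is not how the distance is computed in practice.

```python
from itertools import permutations

def bottleneck_distance(bars_i, bars_j):
    """Brute-force bottleneck distance between two small barcodes."""
    n, m = len(bars_i), len(bars_j)
    # None marks an "unmatched" slot; n + m slots on each side suffice.
    left = list(bars_i) + [None] * m
    right = list(bars_j) + [None] * n

    def cost(p, q):
        if p is None and q is None:
            return 0.0
        if p is None:                       # q is left unmatched
            return (q[1] - q[0]) / 2.0
        if q is None:                       # p is left unmatched
            return (p[1] - p[0]) / 2.0
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    best = float("inf")
    for perm in permutations(range(n + m)):
        worst = max((cost(left[i], right[perm[i]]) for i in range(n + m)),
                    default=0.0)
        best = min(best, worst)
    return best

print(bottleneck_distance([(0, 10), (3, 4)], [(1, 10)]))
```

In the printed example the long bars are matched to each other at cost 1, while the short bar (3, 4) is left unmatched at cost 1/2, so the maximum over the matching is 1.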

Observation 2.2.15. It is not hard to see that dB is a proper metric on the set of persistence
diagrams containing no diagonal points.

   Persistent homology enjoys a variety of stability theorems. We recall here three of the
most fundamental ones:

Theorem 2.2.16 (Algebraic Stability, [CDSGO16, BL14, CCSG+ 09a]). For a pair M, N
of pfd persistence modules with barcodes B(M ), B(N ), the interleaving distance bounds the
bottleneck distance:
                                 dB (B(M ), B(N )) ≤ dI (M, N ).

   In fact, the above inequality is an equality, a result known as the isometry theorem, cf.
[Les15, CDSGO16].

Theorem 2.2.17 (Geometric Stability, [CDSO14]). Let X and Y be totally bounded metric
spaces whose VR complexes have degree-i persistence modules M and N respectively. If we let
B(M ) and B(N ) denote the respective barcodes of these persistence modules, and dGH (X, Y )
denote the Gromov-Hausdorff distance between these spaces, then


                               dB (B(M ), B(N )) ≤ 2dGH (X, Y ).


Theorem 2.2.18 (Functional Stability, [CDSGO16, CSEH05]). Let X be a topological space,
and let f, g : X → R be two functions whose sublevel sets have finite-dimensional homology
groups. Then (X, f ) and (X, g) give rise to pfd functional persistence modules M and N
with
                                dB (B(M ), B(N )) ≤ ‖f − g‖∞ .

   In the remainder of the survey, we will slightly abuse notation and write dB (M, N ) in
place of dB (B(M ), B(N )).


2.2.2     Extended Persistence

We have seen that a function on a topological space defines two filtrations, using either
the sub-level or super-level sets of the function, giving rise to a pair of persistence diagrams.
However, it is also possible to incorporate the data of both filtrations into a single persistence
diagram, called the extended persistence diagram, and this additional information will be
crucial in formulating our inverse results. This technical section can be skipped by those
readers focusing on the results of the thesis.

Definition 2.2.19. Let f be a real-valued function on a topological space X. The family
{X^{(−∞,α]}}_{α∈R} of sublevel sets of f defines a filtration, that is, it is nested w.r.t. inclusion:
X^{(−∞,α]} ⊆ X^{(−∞,β]} for all α ≤ β ∈ R. The family {X^{[α,+∞)}}_{α∈R} of superlevel sets of f is also
nested but in the opposite direction: X^{[α,+∞)} ⊇ X^{[β,+∞)} for all α ≤ β ∈ R. We can turn it
into a filtration by reversing the order on the real line. Specifically, let R^op = {x̃ | x ∈ R},
ordered by x̃ ≤ ỹ ⇔ x ≥ y. We index the family of superlevel sets by R^op, so now we have
a filtration: {X^{[α̃,+∞)}}_{α̃∈R^op}, with X^{[α̃,+∞)} ⊆ X^{[β̃,+∞)} for all α̃ ≤ β̃ ∈ R^op.


   Extended persistence connects the two filtrations at infinity as follows. First, replace
each superlevel set X^{[α̃,+∞)} by the pair of spaces (X, X^{[α̃,+∞)}) in the second filtration. This
maintains the filtration property since we have (X, X^{[α̃,+∞)}) ⊆ (X, X^{[β̃,+∞)}) for all α̃ ≤ β̃ ∈
R^op. Then, let R_Ext = R ∪ {+∞} ∪ R^op, where the order is completed by α < +∞ < β̃
for all α ∈ R and β̃ ∈ R^op. This poset is isomorphic to (R, ≤). Finally, define the extended
filtration of f over R_Ext by:

      F_α = X^{(−∞,α]} for α ∈ R,   F_{+∞} = X ≡ (X, ∅),   and   F_α̃ = (X, X^{[α̃,+∞)}) for α̃ ∈ R^op,

where we have identified the space X with the pair of spaces (X, ∅) at infinity. The subfamily
{F_α}_{α∈R} is the ordinary part of the filtration, while {F_α̃}_{α̃∈R^op} is the relative part.


   Applying the homology functor H∗ to this filtration gives the so-called extended persis-
tence module V of f , which is a family of vector spaces connected by linear maps induced by
the inclusions in the extended filtration. For functions having finitely many critical values
(i.e. those of Morse type), like the distance from a fixed basepoint in a compact metric graph,
the extended persistence module can be decomposed as a finite direct sum of half-open
interval modules (see e.g. [CDSGO12]): V ≃ ⊕_{k=1}^{n} I[b_k, d_k), where each summand I[b_k, d_k) is
made of copies of the field of coefficients at every index α ∈ [b_k, d_k), and of copies of the zero
space elsewhere, the maps between copies of the field being identities. Each summand
represents the lifespan of a homological feature (connected component, hole, void, etc.) within
the filtration. More precisely, the birth time b_k and death time d_k of the feature are given
by the endpoints of the interval. Then, a convenient way to represent the structure of the
module is to plot each interval in the decomposition as a point in the extended plane, whose
coordinates are given by the endpoints. Such a plot is called the extended persistence diagram
of f , denoted Dg(f ). The distinction between ordinary and relative parts of the filtration
allows us to classify the points in Dg(f ) as follows:

  • p = (x, y) is called an ordinary point if x, y ∈ R;

  • p = (x, y) is called a relative point if x̃, ỹ ∈ R^op;

  • p = (x, y) is called an extended point if x ∈ R, ỹ ∈ R^op.

Note that ordinary points lie strictly above the diagonal ∆ = {(x, x) | x ∈ R} and relative
points lie strictly below ∆, while extended points can be located anywhere, including on ∆
(e.g. when a connected component lies inside a single critical level). It is common to
partition Dg(f ) according to this classification: Dg(f ) = Ord(f ) ⊔ Rel(f ) ⊔ Ext⁺(f ) ⊔ Ext⁻(f ),
where Ext⁺(f ) are those extended points above the diagonal, Ext⁻(f ) are those below the
diagonal, and by convention Ext⁺(f ) includes the extended points located on the diagonal ∆.


   We can extend the bottleneck distance to the space of extended persistence diagrams
by allowing matchings between points without regard for the labels Ext, Ord, and Rel.
Throughout this thesis, we assume that our extended persistence diagrams come labelled by
dimension, but do not have any labels indicating if a particular point comes from ordinary,
relative, or extended persistence.

   In the setting of this thesis, the utility of using extended persistence in place of ordinary
persistence is twofold. Firstly, it will be crucial for stability, as adding a small loop to a graph
produces a new graph which is similar to the original one in the Gromov-Hausdorff sense
but will not have a similar ordinary persistence diagram, as it now contains a new feature
that lives forever, and hence sits infinitely far away from the diagonal; not so in extended
persistence, where that feature dies shortly after it is born. Secondly, for G contractible
(i.e. a tree), there is no interesting ordinary homology, but the identifications coming from
the relative part of the filtration produce many features whose birth times and death times
correspond to when boundary leaves appear and when branches merge.


   We have the following general stability result:

Theorem 2.2.20 ([CDSGO12], §6.2). Let X be a topological space homeomorphic to the
realization of a simplicial complex, and f, g : X → R two continuous functions whose sublevel
set and superlevel set filtrations give extended persistence barcodes Bf and Bg . Then
dB (Bf , Bg ) ≤ ‖f − g‖∞ .

   Since every combinatorial graph can be turned into a simplicial complex by adding extra
vertices along self-loops and multiple edges, and since all the functions we will be considering
are continuous, this result applies in our setting.


2.2.3     Point Cloud Persistence

Data often takes the form of point clouds: finite subsets of Euclidean space Rd . There are
various ways of deriving a filtered topological space from a point cloud, three of which we
recall here. See Figure 2.2 .

Definition 2.2.21. Let P ⊂ Rd be a point cloud. For Q ⊆ P , define τV R (Q) = inf{r |
d(q1 , q2 ) ≤ r/2 ∀q1 , q2 ∈ Q}. Let ∆(P ) be the abstract (|P | − 1)-dimensional simplex on
the points of P . The Vietoris-Rips (VR) filtered complex V R(P ) is (∆(P ), τV R ). In this
complex, a simplex with vertex set Q appears when the r-balls centered at any two points
in Q overlap.
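
The filtration values in Definition 2.2.21 are easy to compute directly. The sketch below reads the scale convention literally off the displayed formula (the smallest r such that all pairwise distances in Q are at most r/2, i.e. twice the diameter of Q); other normalizations of r differ only by a constant factor. Points are assumed to be coordinate tuples, and the names `tau_vr` and `vr_filtration` are hypothetical.

```python
import math
from itertools import combinations

def tau_vr(Q):
    """Appearance value of the simplex on vertex set Q: the smallest r with
    d(q1, q2) <= r/2 for all q1, q2 in Q, i.e. twice the diameter of Q."""
    if len(Q) < 2:
        return 0.0
    return 2.0 * max(math.dist(p, q) for p, q in combinations(Q, 2))

def vr_filtration(P, max_dim=2):
    """List (simplex, tau) pairs for all simplices of dimension <= max_dim,
    sorted by appearance value; simplices are tuples of indices into P."""
    out = []
    for k in range(1, max_dim + 2):            # k = number of vertices
        for idx in combinations(range(len(P)), k):
            out.append((idx, tau_vr([P[i] for i in idx])))
    return sorted(out, key=lambda pair: (pair[1], len(pair[0]), pair[0]))

P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
for simplex, value in vr_filtration(P):
    print(simplex, round(value, 3))
```

Because the value of a simplex is determined by its longest edge, a simplex enters at the same moment as the last of its edges.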

Definition 2.2.22. Let P ⊂ Rd be a point cloud. For Q ⊆ P , define τC (Q) = inf{r |
⋂_{q∈Q} B(q, r/2) ≠ ∅}. The Čech filtered complex Čech(P ) is (∆(P ), τC ). In this complex, a
simplex with vertex set Q appears when the set of r-balls {B(q, r)}_{q∈Q} has a common point
of intersection.

Definition 2.2.23. Let P ⊂ Rd be a point cloud. The α-complex α(P ) is a sub-complex of
the Čech complex, defined as follows. Every point p ∈ P has an associated Voronoi region
V (p) = {x ∈ Rd | d(x, p) ≤ d(x, q) ∀q ∈ P }. In the α-complex, a simplex with vertex
set Q appears when the collection of subsets {B(q, r) ∩ V (q)}q∈Q has a common point of
intersection.

        Chazal et al. studied the stability of point cloud persistence in [CDSO14]. It is there that
one can find a proof of the following well-known stability result, albeit greatly generalized:

Theorem 2.2.24. Let P and Q be finite point clouds in Rd . Then for any dimension k,


                             dB (P Hk (Čech(P )), P Hk (Čech(Q))) ≤ dH (P, Q),

where dH denotes the Hausdorff distance.


        The following theorem of Chazal et al. demonstrates stability for Vietoris-Rips
persistence in terms of Gromov-Hausdorff distances.

Theorem 2.2.25 ([CCSG+ 09b], Theorem 3.1). Let X and Y be finite metric spaces4 . Then
    4
        In our setting, these will be point clouds, thought of as metric spaces with the induced metric
for any dimension k,


                           dB (P Hk (V R(X)), P Hk (V R(Y ))) ≤ dGH (X, Y )


[Figure 2.2 panels: Vietoris-Rips Complex (top left), Čech Complex (top right), α-Complex (bottom center)]


Figure 2.2: The top-left figure demonstrates the Vietoris-Rips complex of a point cloud at a fixed scale parameter.
Note the existence of a green, solid tetrahedron. In the top-right figure, the Čech complex is considered.
This differs from the Vietoris-Rips complex in that the interior of the tetrahedron is not filled in, as the four
associated balls do not have a common point of intersection. Lastly, the bottom-center figure demonstrates
the α-Complex. Because of the restrictions imposed by the Voronoi diagram, certain edges present in the
Vietoris-Rips and Čech complexes are not included here.


2.2.4          Morse-Type Functions

Before studying the theory of persistence for Reeb graphs, we identify an important condition
that rules out wild graphs: being of Morse type. This technical section can be skipped by
those readers focusing on the results of the thesis.

Definition 2.2.26. Let (X, f ) be a pair consisting of a topological space and a real-valued
function. For an interval I ⊆ R, define X I = f −1 (I) to be the preimage of this interval.

Definition 2.2.27. A continuous real-valued function f on a topological space X is of Morse
type if:
    (i) there is a finite set Crit(f ) = {a_1 < ... < a_n} ⊂ R, called the set of critical values,
such that over every open interval (a_0 = −∞, a_1), ..., (a_i, a_{i+1}), ..., (a_n, a_{n+1} = +∞) there is a
compact and locally connected space Y_i and a homeomorphism µ_i : Y_i × (a_i, a_{i+1}) → X^{(a_i,a_{i+1})}
such that ∀i = 0, ..., n, f |_{X^{(a_i,a_{i+1})}} = π_2 ∘ µ_i^{−1}, where π_2 is the projection onto the second factor;
    (ii) ∀i = 1, ..., n − 1, µ_i extends to a continuous function µ̄_i : Y_i × [a_i, a_{i+1}] → X^{[a_i,a_{i+1}]};
similarly, µ_0 extends to µ̄_0 : Y_0 × (−∞, a_1] → X^{(−∞,a_1]} and µ_n extends to µ̄_n : Y_n × [a_n, +∞) →
X^{[a_n,+∞)};
    (iii) each levelset f^{−1}(t) has finitely-generated homology.


2.2.5          Persistence for Reeb Graphs

This technical section can be skipped by those readers focusing on the results of the thesis.


    We have seen in Section 2.1.3 that the space of Reeb graphs can be given a metric
structure using the FD distance. At the same time, Reeb graphs can also be compared via the
Bottleneck distance between their 1-dimensional extended persistence barcodes. When our
Reeb graphs are of Morse type, the following result from [CO17] relates these two distances.

Theorem 2.2.28 ([CO17]). Let R_f and R_g be two Reeb graphs, with critical values {a_1, · · · , a_n}
and {b_1, · · · , b_m}. Let a_f = min_i {a_{i+1} − a_i}, and a_g = min_i {b_{i+1} − b_i}. Let K ∈ (0, 1/22]. If
d_FD(R_f, R_g) ≤ max{a_f, a_g}/(8(1 + 22K)), then

\[ K\, d_{FD}(R_f, R_g) \le d_B(R_f, R_g) \le 2\, d_{FD}(R_f, R_g), \]


   where dB (Rf , Rg ) is the bottleneck distance between their respective extended persistence
barcodes. Note that the upper bound on the bottleneck distance holds more generally for any
pair of Reeb graphs; the qualifications are necessary only for the lower bound.
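
As a quick check of the constants, consider the largest admissible value K = 1/22, a choice made here purely for illustration. The hypothesis and conclusion of Theorem 2.2.28 then read:

```latex
K = \tfrac{1}{22} \;\Longrightarrow\; 1 + 22K = 2,
\qquad
d_{FD}(R_f, R_g) \le \frac{\max\{a_f, a_g\}}{16}
\;\Longrightarrow\;
\tfrac{1}{22}\, d_{FD}(R_f, R_g) \;\le\; d_B(R_f, R_g) \;\le\; 2\, d_{FD}(R_f, R_g).
```

Smaller values of K enlarge the ball on which the lower bound holds, up to radius max{a_f, a_g}/8 in the limit K → 0, at the price of a weaker lower bound.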

   Incorporated in this theorem are the two facts that taking extended persistence barcodes
is stable and locally injective (and in fact locally bilipschitz). In this context, local injectivity
means that a fixed Reeb graph Rf has a distinct barcode from any nearby Reeb graphs;
however, there may be a pair of Reeb graphs near Rf with the same barcode. Put another
way, the operation of assigning a barcode to a Reeb graph is not fully injective on the small
open ball described in the theorem – comparisons are only allowed with the fixed reference
Reeb graph at the center of the ball. In fact, generically speaking, there exist arbitrarily
close pairs of Reeb graphs with the same persistence diagram.


Dictionary

An explicit dictionary between geometric features in a Reeb graph and the points in its
extended persistence diagram is given in [BGW14]. We recall it here.


   A downfork is a point v ∈ X together with a pair of adjacent directions along which f
is decreasing (if v is a leaf vertex, one direction suffices), and an upfork is a point u ∈ X
together with a pair of adjacent directions along which f is increasing (if u is a leaf vertex,
one direction suffices here too). We distinguish between two types of downforks: ordinary
downforks, where the two adjacent directions sit in different connected components of the
sublevel set f^{−1}((−∞, f (v)]), and essential downforks, where they are in the same connected
component. There is a similar dichotomy for upforks.

   An ordinary downfork v is paired with two local minima of f in the following way. The
two adjacent directions sit in distinct connected components C1 and C2 , and for (X, f ) of
Morse type these have unique minima x1 and x2 . Suppose that f (x1 ) < f (x2 ). Then this
pairing corresponds to the point (f (x2 ), f (v)) in the zero-dimensional persistence diagram,
as the connected component born at height f (x2 ) merges into the one born at f (x1 ) at
height f (v). An identical procedure for (X, −f ) will see a pairing of ordinary upforks with
maxima that give rise to intervals in the relative part of the persistent homology.

   Lastly, there is a pairing of essential downforks with essential upforks. An essential
downfork at v corresponds to a point of the form (f (v), ·) in the extended part of the
persistence diagram. Consider the set S of paths γ in the sublevel set f^{−1}((−∞, f (v)])
that terminate at either end in the two downfork directions adjacent to v. To each such path
γ we can associate the quantity min f (γ), and so consider the subset S′ ⊂ S of paths which
maximize the quantity min f (γ). For each such path, the minimum value of f is achieved
at an essential upfork w, and f (w) is the death time of the point (f (v), ·). This pairing
of v with w is not exclusive, in the sense that there may be different paths γ maximizing
min f (γ), and their corresponding upforks w may be distinct. However, the resulting interval
is well-defined, as all the upforks w have the same f -value. Moreover, every essential upfork
w shows up in such a pairing, which can be found by considering w as a downfork in (X, −f )
and applying the pairing procedure described above.


   Thus we have the following dictionary. Note that the dictionary does not promise a
canonical or unique assignment of a downfork or upfork to an interval in the persistence
diagram, nor vice versa. Rather, it is meant to assert that the existence of certain intervals
guarantees the existence of certain downforks or upforks, and vice versa.


                  Dictionary for Extended Persistence of Reeb Graphs

    Reeb Graph               Persistence Diagram

    Ordinary downfork v      Point of the form (·, f (v)) in ordinary H0 persistence.
    Ordinary upfork w        Point of the form (·, f (w)) in relative H1 persistence.
    Essential downfork v     Point of the form (f (v), ·) in extended H1 persistence.
    Essential upfork w       Point of the form (·, f (w)) in extended H1 persistence.

Lemma 2.2.29. Given a metric graph G and a point p ∈ G, consider the Reeb graph Rf
associated to the function fp (·) = dG (p, ·). Any upfork in Rf distinct from p is necessarily
a vertex of valence at least three. Taken together with the prior dictionary, this implies that
nonzero death times in the persistence diagram of Rf correspond to the distances from p to
vertices of valence at least three.

Proof. Let x ≠ p be a point in G which is not a vertex of valence at least three. Then there
are at most two adjacent directions to x in G. It is impossible for such a point to be an
upfork, as at least one of the directions adjacent to it is the initial segment of a geodesic
from x to p. This leaves at most one direction along which the distance to p (and hence the
value of f ) can be increasing.

Remark 2.2.30. Consider again the setting of Lemma 2.2.29, in which the Reeb graph is
constructed using a function of the form dG (p, ·). Although nonzero death times correspond
to distances from vertices of valence at least three, a birth time need not correspond to
the distance to any vertex. Consider for example a graph G with three vertices v1 , v2 , v3 at
pairwise distance 1, forming an equilateral triangle. The only point in the one-dimensional
persistence diagram ΨG (v1 ) corresponds to the loop generated by the full graph, and is born
at distance 1.5, the radius of G at v1 . In the Reeb graph ΦG (p), this is the distance from p
to the downfork point halfway between v2 and v3 .
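
The value 1.5 in the triangle example can be checked by a short computation. Parametrizing the edge from v2 to v3 by t ∈ [0, 1] (the parametrization and the symbol x_t are introduced here only for illustration), a point on that edge is reached from v1 either through v2 or through v3:

```latex
d_G(v_1, x_t) \;=\; \min\{\, 1 + t,\; 1 + (1 - t) \,\} \;=\; 1 + \min\{t,\, 1 - t\},
\qquad
\max_{t \in [0,1]} d_G(v_1, x_t) \;=\; 1 + \tfrac{1}{2} \;=\; \tfrac{3}{2}
\quad \text{at } t = \tfrac{1}{2}.
```

The maximum is attained at the midpoint of the edge, which is not a vertex of G, so the birth time 1.5 is not the distance from v1 to any vertex.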

Lemma 2.2.31. Given a graph G and a point p ∈ G, consider the Reeb graph Rf associated
to the function fp (·) = dG (p, ·). The smallest nonzero death time in ΨG (p) is the distance
from p to the closest vertex of valence at least three.

Proof. This is immediate from Lemma 2.2.29.


2.2.6     Euler Calculus

This technical section can be skipped by those readers focusing on the results of the thesis.

Definition 2.2.32. For a topological space X with finitely many nonzero Betti numbers,
the Euler Characteristic χ(X) is defined to be the alternating sum of these Betti numbers:

\[ \chi(X) = \sum_{n=0}^{\infty} (-1)^n \beta_n(X). \]
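
For a finite simplicial complex, the Euler-Poincaré formula states that the alternating sum of Betti numbers equals the alternating count of simplices by dimension, which gives a quick way to compute χ without computing homology. The sketch below uses that simplex-count form rather than the Betti-number definition itself; the complex is assumed to be given by a list of simplices as vertex tuples, and the function names are hypothetical.

```python
from itertools import combinations

def closure(simplices):
    """All nonempty faces of the given simplices, as a set of frozensets."""
    faces = set()
    for s in simplices:
        verts = tuple(s)
        for k in range(1, len(verts) + 1):
            for f in combinations(verts, k):
                faces.add(frozenset(f))
    return faces

def euler_characteristic(simplices):
    """Alternating count of simplices: a face with k vertices has dimension k - 1."""
    return sum((-1) ** (len(f) - 1) for f in closure(simplices))

hollow_triangle = [(0, 1), (1, 2), (0, 2)]   # a circle: 3 vertices, 3 edges
filled_triangle = [(0, 1, 2)]                # a disk: one extra 2-simplex
print(euler_characteristic(hollow_triangle), euler_characteristic(filled_triangle))
```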


   Let X be a real analytic manifold, and write CF (X) for the space of constructible func-
tions on X. These are Z-valued functions whose level sets are subanalytic and form a locally
finite family.

Definition 2.2.33. For a function φ ∈ CF (X), we define its Euler integral to be

\[ \int_X \varphi(x)\, d\chi \;=\; \sum_{m \in \mathbb{Z}} m \, \chi(\{x \in X \mid \varphi(x) = m\}). \]

Definition 2.2.34. A morphism f : X → Y of real analytic manifolds induces a pullback
map f ∗ : CF (Y ) → CF (X) defined by (f ∗ φ)(x) = φ(f (x)) for φ ∈ CF (Y ).

Definition 2.2.35. A morphism f : X → Y of real analytic manifolds induces a pushforward
map f_∗ : CF (X) → CF (Y ) defined by (f_∗ φ)(y) = ∫_X φ 1_{f^{−1}(y)} dχ for φ ∈ CF (X).


   These operations, taken together, allow us to define the following topological transform.

Definition 2.2.36. Let S ⊂ X × Y be a locally closed subanalytic subset of the product of
two real analytic manifolds. Let πX and πY be the projections from X × Y onto each of its
factors. The Radon transform with respect to S is the group homomorphism R_S : CF (X) →
CF (Y ) defined by R_S (φ) = (π_Y)_∗ [(π_X)^∗(φ) 1_S ] for φ ∈ CF (X).

   Schapira [Sch95] provides the following inversion theorem.

Theorem 2.2.37 (Thm. 3.1 in [Sch95]). Let S ⊂ X × Y and S′ ⊂ Y × X define a pair of
Radon transforms R_S : CF (X) → CF (Y ) and R_{S′} : CF (Y ) → CF (X). Denoting by S̄ and
S̄′ the closures of these subsets, suppose that the projections π_Y : S̄ → Y and π_X : S̄′ → X
are proper. Suppose further that there exist χ1 , χ2 ∈ Z such that, for any x, x′ ∈ X, the fibers
S_x = {y ∈ Y : (x, y) ∈ S} and S′_{x′} = {y ∈ Y : (y, x′) ∈ S′} satisfy the following criterion:

\[
\chi(S_x \cap S'_{x'}) =
\begin{cases}
\chi_1 & \text{if } x = x', \\
\chi_2 & \text{if } x \neq x'.
\end{cases}
\]

   Then for all φ ∈ CF (X),

\[
(R_{S'} \circ R_S)(\varphi) \;=\; (\chi_1 - \chi_2)\, \varphi \;+\; \chi_2 \left( \int_X \varphi \, d\chi \right) 1_X .
\]

   In particular, if χ1 ≠ χ2 then the scaling term in (R_{S′} ∘ R_S ) is constant and nonzero.
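
When χ1 ≠ χ2, the identity in Theorem 2.2.37 can be rearranged to express φ in terms of its double transform. This is only an algebraic restatement of the displayed formula, not an additional claim from [Sch95]:

```latex
\varphi \;=\; \frac{1}{\chi_1 - \chi_2}
\left[ (R_{S'} \circ R_S)(\varphi) \;-\; \chi_2 \left( \int_X \varphi \, d\chi \right) 1_X \right],
\qquad \chi_1 \neq \chi_2 .
```

In this form, φ is determined by R_S (φ) together with the single integer ∫_X φ dχ.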
                                                                    CHAPTER 3


Prior Work


The focus of this thesis is on inverse problems for topological transforms of intrinsic shapes.
In this section, we review some of the work done on other topological transforms, with an
emphasis on the persistent homology transform.


Multi-Cover Persistence

In [EO18], Edelsbrunner and Osang consider a generalization of Čech persistence. For a
point cloud P ⊂ Rd , a real parameter r ∈ R≥0 , and a natural number n ∈ N>0 , they define
the subset Xn,r ⊂ Rd as follows: x ∈ Xn,r iff the ball B(x, r) contains at least n points of
P . This gives rise to a family of filtrations on Rd , either by fixing n and varying r or fixing
r and varying n. The focus of [EO18] is on the computational aspects of these invariants.
However, it is not hard to see that the histogram of distances in P can be recovered from these


persistence diagrams. The work of [BK04] shows that this generically suffices to determine
P . Moreover, the work of Dokmanic et al. in [DPRV15] demonstrates how to compute P
in practice (cf. the section titled “unlabeled distances”).


The Persistent Homology Transform

Our second topological transform is the persistent homology transform (PHT), defined by Turner
et al. in [TMB14]. The input to the PHT is a subanalytic compact subset M of Rd . For
every direction v ∈ Sd−1 , one considers the function fv : M → R, given by fv (x) = v · x (see
Figure 3.1). The output of the PHT is then the map PHT (M ) : Sd−1 → k-Mod, sending a
unit vector v to the persistence module PH(M, fv ).


                Figure 3.1: The map fv (the shape M and a direction v on the sphere Sd−1 are labeled).


   Another, simplified invariant considered in [TMB14] is the Euler Characteristic transform
(ECT). This is similar to the PHT, but instead of recording the sublevel-set persistence of
the functions fv , one computes their Euler Characteristic curves:


                            EC(fv )(t) = χ({x ∈ M | fv (x) ≤ t})


   If one writes {R → Z} for the space of integer-valued functions on the real line, then this
transform is the map ECT (M ) : Sd−1 → {R → Z} that sends a unit vector v to the function
EC(fv ). The codomain of the map ECT (M ) lives in a Hilbert space, making it amenable
to methods in classical statistics and machine learning. Indeed, Turner et al. show how
to use the ECT to turn a set of meshes into a likelihood model on the space of embedded
simplicial complexes. More precisely, they prove that, for d = 2, 3, both of these transforms
are injective, and hence provide sufficient statistics for probability measures on the space
of linearly embedded simplicial complexes. Moreover, they provide an explicit algorithm to
reconstruct M from P HT (M ).
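
For a finite, linearly embedded simplicial complex, the Euler curve in a fixed direction can be computed by counting simplices. The sketch below is illustrative only and is not the algorithm of [TMB14]; the function name euler_curve and the input format (a vertex array plus a list of index tuples, with every face listed explicitly) are assumptions made for this example. It relies on the standard fact that the sublevel set of a height function on such a complex deformation retracts onto the subcomplex spanned by the vertices at or below the threshold.

```python
import numpy as np

def euler_curve(vertices, simplices, v, t):
    """Euler characteristic of the sublevel set {x in M : x . v <= t}.

    vertices:  (n, d) array-like of vertex coordinates.
    simplices: iterable of tuples of vertex indices; every face must be
               listed, including each vertex as a 1-tuple.
    """
    heights = np.asarray(vertices, dtype=float) @ np.asarray(v, dtype=float)
    chi = 0
    for s in simplices:
        # A simplex is counted once its highest vertex is at or below t.
        if max(heights[i] for i in s) <= t:
            chi += (-1) ** (len(s) - 1)
    return chi

# Hollow triangle (topologically a circle) in the plane.
verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
simps = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
direction = (1.0, 0.0)

# One edge is visible at t = 0, and the full circle at t = 1.
values = [euler_curve(verts, simps, direction, t) for t in (-1.0, 0.0, 1.0)]
```

Sampling this function over a grid of thresholds t and a finite set of directions v gives a discretized version of ECT (M ).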


   Recent work of Ghrist et al. [GLM18] and, independently, of Curry et al. [CMT18], using
ideas of Schapira [Sch95], demonstrates the injectivity of the ECT in all dimensions, and
for the larger class of subanalytic compact sets. Because the Euler Characteristic curve of
the functions fv can be derived from their persistence module, this, in turn, implies the
injectivity of the PHT. These proofs of injectivity use the theory of constructible functions
and Euler-Radon transforms, circumventing the involved, constructive arguments used in
[TMB14]. In particular, they use Schapira’s inversion theorem, Theorem 2.2.37 in Section
2.2.6. To take advantage of this theorem, [GLM18] define a Radon transform that can be
computed using the ECT, and then find an appropriate “inverse” Radon transform.


   Let X = Rd and Y = AffGrd , the affine Grassmannian of hyperplanes in Rd . Let S ⊂ X × Y
be the set of pairs (x, W ), where the point x sits on the hyperplane W . Letting 1M be the
indicator function of a bounded subanalytic subset M ⊂ Rd , and π1 and π2 the projections
of X × Y onto X and Y respectively, we compute:


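\[
\begin{aligned}
(R_S 1_M)(W) &= (\pi_2)_* \left[ (\pi_1^* 1_M)\, 1_S \right](W) \\
&= \int_{(x,W) \in S} (\pi_1^* 1_M) \, d\chi \\
&= \int_{x \in M \cap W} d\chi \\
&= \chi(M \cap W)
\end{aligned}
\]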


   To see that χ(M ∩ W ) can be computed from the ECT, let W be defined by some unit
vector v and scalar t, i.e. W = {x : x · v = t}. Then, using the inclusion-exclusion property
of the Euler characteristic:


\[
\begin{aligned}
\chi(M \cap W) &= \chi(\{x \in M : x \cdot v = t\}) \\
&= \chi(\{x \in M : x \cdot v \le t\} \cap \{x \in M : x \cdot (-v) \le -t\}) \\
&= \chi(\{x \in M : x \cdot v \le t\}) + \chi(\{x \in M : x \cdot (-v) \le -t\}) - \chi(M) \\
&= ECT(M)(v, t) + ECT(M)(-v, -t) - ECT(M)(v, \infty),
\end{aligned}
\]

where ECT (M )(v, ∞) is defined to be the limit of ECT (M )(v, t) as t → +∞, which converges to χ(M ) when
M is bounded.
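
The inclusion-exclusion identity above can be sanity-checked on a small example. The following sketch assumes M is a linearly embedded plane graph, given as a vertex list and an edge list; the helper names ect_graph and slice_euler are hypothetical. For a graph, the Euler characteristic of a sublevel set is the number of vertices at or below the threshold minus the number of edges lying entirely at or below it.

```python
import numpy as np

def ect_graph(vertices, edges, v, t):
    """Euler characteristic of {x in M : x . v <= t} for a linear graph M."""
    h = np.asarray(vertices, dtype=float) @ np.asarray(v, dtype=float)
    n_vertices = int(np.sum(h <= t))
    n_edges = sum(1 for a, b in edges if max(h[a], h[b]) <= t)
    return n_vertices - n_edges

def slice_euler(vertices, edges, v, t):
    """chi(M ∩ W) for W = {x : x . v = t}, assembled from three ECT values."""
    minus_v = -np.asarray(v, dtype=float)
    chi_m = len(vertices) - len(edges)  # the limiting value ECT(M)(v, +infinity)
    return (ect_graph(vertices, edges, v, t)
            + ect_graph(vertices, edges, minus_v, -t)
            - chi_m)

# Boundary of the unit square.
square_v = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_e = [(0, 1), (1, 2), (2, 3), (3, 0)]

# The vertical line x = 0.5 meets the square in two points.
two_points = slice_euler(square_v, square_e, (1.0, 0.0), 0.5)
# The line x = 0 meets the square in a closed edge, which is contractible.
one_edge = slice_euler(square_v, square_e, (1.0, 0.0), 0.0)
```

In both cases the value returned agrees with a direct count of the Euler characteristic of the slice.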


   Thus, if the Radon transform RS is injective, so is the ECT: if ECT (M ) = ECT (M′ )
for a pair of subanalytic subsets M, M′ ⊂ Rd , then RS 1M = RS 1M′ . What remains to be
shown, then, is that RS is indeed injective. We take S′ ⊂ Y × X to consist of pairs (W, x)
where x lies on the hyperplane W . To apply Theorem 2.2.37, we consider the intersection of
fibers in S and S′ . For a fixed x ∈ X, Sx = S′x ⊂ Y is the set of hyperplanes passing through
x, which is homeomorphic to the projective space RP d−1 , and so has Euler characteristic

\[ \chi_1 = \chi(S_x \cap S'_x) = \chi(\mathbb{RP}^{d-1}) = \tfrac{1}{2}\left(1 + (-1)^{d-1}\right) \]

   For a pair of distinct points x ≠ x′ , the intersection of fibers Sx ∩ S′x′ ⊂ Y consists of all
hyperplanes containing both of these points, a subset homeomorphic to RP d−2 . Thus

\[ \chi_2 = \chi(S_x \cap S'_{x'}) = \chi(\mathbb{RP}^{d-2}) = \tfrac{1}{2}\left(1 + (-1)^{d-2}\right) \]

   By Theorem 2.2.37,

\[ (R_{S'} \circ R_S)(1_M) = (-1)^{d-1}\, 1_M + \tfrac{1}{2}\left(1 + (-1)^{d-2}\right) \chi(M)\, 1_{\mathbb{R}^d} \]

   Thus, if RS 1M = RS 1M′ , then composing with RS′ , applying the above formula, and
rearranging terms, we obtain:

\[ (-1)^{d-1}\left(1_M - 1_{M'}\right) = \tfrac{1}{2}\left(1 + (-1)^{d-2}\right)\left(\chi(M') - \chi(M)\right) 1_{\mathbb{R}^d} \]

   The right-hand side is a constant function, and so the left-hand side must be too. Since M and
M′ are bounded, the difference 1M − 1M′ vanishes outside a bounded set, so it can only be constant
if it is identically zero. Hence 1M = 1M′ and M = M′ , demonstrating injectivity.
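
The coefficients used in this argument can be checked mechanically. The short sketch below only evaluates the formula for the Euler characteristic of real projective space quoted above and confirms that the scaling term χ1 − χ2 equals (−1)^(d−1) for several values of d; the function names are illustrative.

```python
def chi_rp(n):
    """Euler characteristic of real projective n-space: 1 if n is even, 0 if n is odd."""
    return (1 + (-1) ** n) // 2

def schapira_coefficients(d):
    """Return (chi1 - chi2, chi2) for affine hyperplanes in R^d, with d >= 2."""
    chi1 = chi_rp(d - 1)  # hyperplanes through a single point
    chi2 = chi_rp(d - 2)  # hyperplanes through two distinct points
    return chi1 - chi2, chi2

# Tabulate the scaling and constant terms for d = 2, ..., 7.
table = {d: schapira_coefficients(d) for d in range(2, 8)}
```

The scaling term is never zero, which is the hypothesis needed to invert the transform, while the constant term is nonzero only when d is even.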


How many directions suffice?

The injectivity results of [TMB14, GLM18] require us to compute the PHT or ECT for every
vector on the sphere Sd−1 . Thus it is natural to ask if injectivity can be obtained with only
finitely many directions. We should clarify that we are not asking for finitely many fixed
directions to distinguish an infinite family of shapes. Rather, we would like to know if the
identity of a given subanalytic set S can be inferred by computing and comparing the PHT
or ECT along a finite sequence of directions, with these directions being chosen in real time.
There are two positive results in this vein, both restricted to the case of simplicial complexes,
rather than arbitrary subanalytic sets.


   The first result is that of [BFM+18], specifically for the case of planar graphs. They
demonstrate how to use three directions on the circle S1 to determine the location of the
vertices of a planar graph S. The first two direction vectors are (1, 0) and (0, 1), and the
third direction can be computed using the persistence modules derived from the first two. If
S has n vertices, this vertex-localizing algorithm runs in O(n log n) time. Once the locations
of the vertices are identified, one tests for the existence of an edge between pairs of vertices
by using another three persistence modules (the directions of which are derived from the
locations of the vertices). This pair-wise checking for edges introduces a quadratic term into
the running time:

Theorem 3.0.1 (Thm. 11 in [BFM+18]). Let M be a linear plane graph with n vertices.
The vertices, edges, and exact embedding of M can be determined using persistence modules
along O(n^2) different directions.

   The second result, proved in [CMT18], applies to finite, linearly embedded simplicial
complexes S ⊂ Rd for any dimension d. However, their bound on the number of directions is
not simply a function of the number of vertices in S, but also of its geometry. In particular,
it depends on the following three constants.

  • d – the embedding dimension.

  • δ – a constant with the following property: for any vertex x ∈ M there is a ball B of
     radius δ in the sphere Sd−1 , such that for all v ∈ B the Euler curve of fv changes
     value at t = v · x. If one works with the PHT instead of the ECT, the analogous
     requirement is that the persistent homology coming from fv has an off-diagonal point
     with birth or death value v · x. These conditions ensure that the vertex x is observable
     for the ECT or PHT on a simple set of directions of positive measure. Put geometrically,
     they ensure that M is not “too flat” around any vertex.

  • k – the maximum number of critical values of fv over all v ∈ Sd−1 ,
     i.e. values at which the Euler characteristic of a sublevel set changes (assuming this
     quantity is finite). If one works with the PHT instead of the ECT, one considers
     homological critical values instead, where the homology of a sublevel set changes.

   They show the following finiteness result:

Theorem 3.0.2 (Thm. 7.1 in [CMT18]). For either the ECT or the PHT, let M ⊂ Rd be
a linearly embedded simplicial complex, with appropriate constants δ, k as in the prior de-
scription. Then there is a constant ∆(d, δ, k) such that M can be determined using ∆(d, δ, k)
directions of the chosen transform.

   The proof of this theorem is a multi-part algorithm, where the data computed at each step
is passed forward as input to the next step. To begin, they show that, for a fixed d, an upper
bound on k and a lower bound on δ provide a bound on the total number of vertices in M
(Lemma 7.4 in [CMT18]). They then show that, given any sufficiently large collection of δ-nets
on the sphere, the resulting set of directions can be used to determine the location of the
vertices in M (Proposition 7.1 in [CMT18]). With the location of the vertices identified, one
defines the following hyperplane arrangement in Rd :

\[ W(V) = \bigcup_{(v_1, v_2) \in V \times V - \Delta} (v_1 - v_2)^{\perp}, \]

where V is the vertex set of M , and where ∆ is the diagonal in V × V . That is, W (V ) is the
union of all the hyperplanes in Rd orthogonal to the differences of pairs of distinct vertices in
V . The connected components of Sd−1 ∩ (Rd \ W (V )) are the (d − 1)-dimensional strata of
the stratification of the sphere induced by W (V ). The crucial observation to be made is that
any two directions in the same top-dimensional stratum induce the same ordering on the
simplices of M . Thus, given the ECT or PHT for any one direction in a stratum, it is possible
to parametrize the ECT or PHT for all the other directions, provided the locations of the
vertices V are known (Lemma 5.3, Proposition 5.2 in [CMT18]). Thus, after identifying the
set V in the prior step, computing the hyperplane arrangement W (V ), and picking a test
direction in each top-dimensional stratum, one has enough data to deduce the ECT or PHT
on all of Sd−1 ∩ (Rd \ W (V )), and, by continuity, on the entire sphere Sd−1 . Since the ECT
or PHT on the full sphere determines the simplicial complex M by prior injectivity results,
we can ultimately deduce M itself. The total number of directions needed in this procedure
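is

\[ \Delta(d, \delta, k) = \left( (d-1)k \left[ \left( \frac{2\delta}{\sin(\delta)} \right)^{d-1} + 1 \right] \right) \left( 1 + \frac{2}{\delta} \right)^{\delta} + O\!\left( \frac{dk}{\delta^{d-1}} \right)^{2d} \]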
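
The observation that directions in the same top-dimensional stratum induce the same ordering can be illustrated numerically. The sketch below is not the procedure of [CMT18]; it assumes the vertex locations are already known, and the helpers stratum_signature and vertex_order are hypothetical names. A stratum is identified by the signs of v · (v1 − v2) over all pairs of distinct vertices, and the directions need not be normalized since only signs matter.

```python
import itertools
import numpy as np

def stratum_signature(vertices, v):
    """Signs of v . (p_i - p_j) over all vertex pairs with i < j.

    A zero entry means v lies on one of the hyperplanes making up W(V).
    """
    pts = np.asarray(vertices, dtype=float)
    v = np.asarray(v, dtype=float)
    return tuple(
        int(np.sign(v @ (pts[i] - pts[j])))
        for i, j in itertools.combinations(range(len(pts)), 2)
    )

def vertex_order(vertices, v):
    """Vertex indices sorted by height in the direction v."""
    heights = np.asarray(vertices, dtype=float) @ np.asarray(v, dtype=float)
    return tuple(int(i) for i in np.argsort(heights, kind="stable"))

V = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)]
a, b, c = (1.0, 0.1), (1.0, 0.2), (0.0, 1.0)

# a and b lie in the same stratum; c lies in a different one.
same_stratum = stratum_signature(V, a) == stratum_signature(V, b)
same_order = vertex_order(V, a) == vertex_order(V, b)
```

Since the order in which vertices appear determines the order in which simplices enter the filtration, one test direction per stratum suffices once V is known.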

     The proofs in both [BFM+18] and [CMT18] rely heavily on the simplicial complex structure
of M , and there are presently no finiteness results known for more general subanalytic
sets.


Sample ECT Code

The author of this thesis maintains a small GitHub repository with simple, unoptimized Python
code for computing and comparing Euler Characteristic Transforms of 2D images [Sol18].
The code samples the ECT along a finite set of directions for each image, and sets the
distance between images to be the sum of the L2 distances between smoothed Euler curves
in matching directions. The choice and number of directions, the smoothing parameter, and
the resulting classifier all have an impact on the prediction accuracy, although this is not yet
well understood on a theoretical level. See Figure 3.2.
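
The pipeline described above can be sketched as follows. This is not the code of [Sol18]; it is a minimal illustration under several assumptions: the image is binary, each foreground pixel is treated as a closed unit square with (row, column) coordinates, thresholds are uniformly spaced, and smoothing is a moving average. All function names are hypothetical.

```python
import numpy as np

def cubical_cells(image):
    """Vertices, edges, and squares of the union of closed foreground pixels."""
    verts, edges, squares = set(), set(), set()
    for r, c in zip(*np.nonzero(image)):
        r, c = int(r), int(c)
        k = [(r, c), (r + 1, c), (r, c + 1), (r + 1, c + 1)]
        verts.update(k)
        # Edges are stored with a fixed orientation so shared edges coincide.
        edges.update({(k[0], k[1]), (k[0], k[2]), (k[1], k[3]), (k[2], k[3])})
        squares.add(tuple(k))
    return verts, edges, squares

def image_euler_curve(image, v, ts):
    """Euler curve of a binary image in direction v, sampled at thresholds ts."""
    v = np.asarray(v, dtype=float)
    ts = np.asarray(ts, dtype=float)
    curve = np.zeros(len(ts))
    for sign, cells in zip((1, -1, 1), cubical_cells(image)):
        for cell in cells:
            pts = np.array(cell, dtype=float).reshape(-1, 2)
            entry = (pts @ v).max()  # height at which the whole cell is present
            curve += sign * (ts >= entry)
    return curve

def smooth(curve, width):
    """Moving-average smoothing via convolution with a box kernel."""
    return np.convolve(curve, np.ones(width) / width, mode="same")

def ect_distance(img_a, img_b, directions, ts, width=5):
    """Sum over directions of discretized L2 distances between smoothed curves."""
    dt = ts[1] - ts[0]
    total = 0.0
    for v in directions:
        diff = (smooth(image_euler_curve(img_a, v, ts), width)
                - smooth(image_euler_curve(img_b, v, ts), width))
        total += float(np.sqrt(np.sum(diff ** 2) * dt))
    return total

ts = np.linspace(-1.0, 7.0, 81)
ring = np.ones((3, 3), dtype=int)
ring[1, 1] = 0                      # a ring of eight pixels around a hole
full = np.ones((3, 3), dtype=int)   # a solid 3 x 3 block
dirs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
```

With these inputs, ect_distance(ring, full, dirs, ts) is positive, reflecting the hole in the ring, while the distance from an image to itself is zero.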


Figure 3.2: Left: greyscale image of a handwritten letter in the Devanagari alphabet, used in many North
Indian languages. Right: the ECT of the above letter, taken in the direction v = ⟨1, 1⟩. Superimposed on
the ECT is a smoothed version, obtained via convolution. Images reproduced from [Sol18].
                                                                CHAPTER 4


The Intrinsic Persistent Homology Transform


In Chapter 3, we considered injective topological transforms for extrinsic, embedded shapes.
In the next two sections, we consider ways of defining topological transforms for intrinsic
shapes.


4.1       Motivation

Let (X, dX ) be a general metric space. Because X does not come with a Euclidean embed-
ding, one cannot talk about filtering X along a direction. One might try to remedy this by
choosing an embedding for X, but this cannot be done, in general, without distorting the
geometry of X. More importantly, the choice of an embedding imposes unnecessary addi-
tional structure on X, and we may not want our invariant to depend on the choice of our
embedding. Indeed, the persistent homology transform is not invariant under translations
of our embedded shape, let alone more complex isometries.


   The alternative approach is to find intrinsic analogues of the height functions fv of Sec-
tion 3, using only the topological and geometric data of (X, dX ). The choice adopted in this
chapter is to consider functions of the form fx = dX (x, ·), where x ∈ X is a fixed basepoint.
This gives a family of functions parameterized by the space X itself. The stability theory of
such invariants was investigated by Carrière, Oudot, and Ovsjanikov in [COO15].


   In general, the inverse problem for this topological transform appears very challenging to
study. The results in this chapter focus on the special case of metric graphs, which, despite
their relative simplicity, are complex enough to model many real-world data sets. Later on
in this thesis, in Chapter 5.1, we consider variants of this transform whose inverse problems
are more tractable.


   The original work contained in this chapter is a result of collaboration between the author
and Steve Oudot.


4.2        The IPHT and IECT

In this section, we review and generalize some constructions and results of Dey et al.
in [DSW15], from a metric geometry perspective.

Definition 4.2.1. A pointed metric graph (G, x) is a pair consisting of a metric graph G and a
choice of basepoint x ∈ G. We will use the notation PointedMGraphs to refer to the space of such
objects.

Definition 4.2.2. Let (G, x) be a pointed metric graph. We will write Φ(G, x) to denote
the Reeb graph given by the function fx : G → R defined by fx (y) = dG (x, y). If we fix a
compact metric graph G then we can view Φ as a function solely of this basepoint, writing
ΦG (x) = Φ(G, x).

   The maps Φ and ΦG are identical in nature but differ in their domain: the former is
defined on the entire space of pointed graphs, whereas the latter is defined on a fixed graph
G. Now, given a Reeb graph we can compute its extended persistence diagram, and we
denote this operation by ExDg. Composing Φ and ΦG with ExDg produces our second pair
of maps.

Definition 4.2.3. Let (G, x) be a pointed metric graph. Define Ψ(G, x) = ExDg ◦Φ(G, x)
to be the extended persistence diagram of the Reeb graph associated to (G, x). Similarly,
for a fixed metric graph G, we define ΨG (x) = ExDg ◦ΦG (x).

   Lastly, we can turn our extended persistence diagrams into Euler curves.

Definition 4.2.4. Let (G, dG ) be a compact metric graph, p ∈ G a basepoint, and let ΨG (p)
be the extended persistence diagram of the pair (G, fp ), with barcodes B0 and B1 in degrees
0 and 1 respectively. We define χ(G, p) = χG (p) to be the following alternating sum of
indicator functions of intervals:

\[ \chi_G(p) = \sum_{I \in B_0} 1_I - \sum_{I \in B_1} 1_I \]


   The function χG (p) is an element of ZR . Viewing the latter as a subspace of the
real-valued functions on the real line, we can equip it with the Lα norm for any α ∈ [1, ∞].
Note that χG (p), as a finite sum or difference of indicator functions of bounded intervals,
is always integrable.
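
The alternating sum in Definition 4.2.4 is straightforward to evaluate on a grid. The following sketch assumes each bar is given as a pair (start, end) with start ≤ end and is treated as a closed interval; both conventions, and the function name, are assumptions made for illustration rather than part of the definition.

```python
import numpy as np

def euler_curve_from_barcodes(bars0, bars1, ts):
    """Evaluate sum_{I in B0} 1_I - sum_{I in B1} 1_I at the sample points ts."""
    ts = np.asarray(ts, dtype=float)
    curve = np.zeros_like(ts)
    for sign, bars in ((1, bars0), (-1, bars1)):
        for start, end in bars:
            # Indicator function of the closed interval [start, end].
            curve += sign * ((ts >= start) & (ts <= end))
    return curve

# One bar in degree 0 and one in degree 1.
B0 = [(0.0, 2.0)]
B1 = [(1.0, 2.0)]
samples = euler_curve_from_barcodes(B0, B1, [0.5, 1.5, 3.0])
```

Here the curve equals 1 where only the degree-0 bar is alive, 0 where both bars are alive, and 0 outside both intervals.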

Remark 4.2.5. Note that χG (p) combines the information of the degree-one Betti curves for
both fp and −fp .

   The following diagram introduces the spaces and maps involved in our constructions; G
is any fixed compact metric graph.
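[Diagram: PointedMGraphs --Φ--> Reeb --ExDg--> Barcodes --> ZR , with composites
Ψ : PointedMGraphs → Barcodes and χ : PointedMGraphs → ZR . For a fixed graph G, the maps
ΦG : G → Reeb, ΨG : G → Barcodes, and χG : G → ZR are the restrictions to basepoints in G.]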


   The following results demonstrate that ΦG , ΨG , and χG are Lipschitz and that similar
graphs produce similar Reeb graphs, barcodes, and extended Euler curves.

Lemma 4.2.6. Fix a compact metric graph G. Let p, q ∈ G be any two basepoints. Then
dF D (ΦG (p), ΦG (q)) ≤ dG (p, q) and dB (ΨG (p), ΨG (q)) ≤ dG (p, q).

Proof. The L∞ distance between the two distance functions dG (p, ·) and dG (q, ·) is bounded by
dG (p, q) by the triangle inequality. Thus the first claim can be seen by setting
φ and ψ equal to the identity map in the definition of the functional distortion distance, and
the second follows from Theorem 2.2.20.
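
The triangle-inequality step in this proof can be checked numerically on a small weighted graph. The sketch below is a simplification: it compares the two distance functions only at the vertices, so it computes a lower bound for the L∞ distance, and the edge lengths are made up for illustration.

```python
import itertools

def all_pairs_distances(n, weighted_edges):
    """Floyd-Warshall shortest-path distances between the n vertices."""
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for a, b, length in weighted_edges:
        dist[a][b] = dist[b][a] = min(dist[a][b], length)
    # k is the outermost loop variable, as Floyd-Warshall requires.
    for k, i, j in itertools.product(range(n), repeat=3):
        dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    return dist

def sup_gap(dist, p, q):
    """max over vertices y of |d(p, y) - d(q, y)|."""
    return max(abs(dist[p][y] - dist[q][y]) for y in range(len(dist)))

# A 4-cycle with one chord; the lengths are arbitrary.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.5), (3, 0, 0.5), (0, 2, 2.2)]
d = all_pairs_distances(4, edges)

# The gap between d(p, .) and d(q, .) never exceeds d(p, q).
holds = all(sup_gap(d, p, q) <= d[p][q] + 1e-12
            for p in range(4) for q in range(4))
```

Taking y = q shows that the bound is attained, so the Lipschitz constant 1 cannot be improved in general.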

Theorem 4.2.7 ([DSW15], §3). Let G, G′ be a pair of compact metric graphs, and let M be
a correspondence between them realizing the Gromov-Hausdorff distance δ = dGH (G, G′ ). If
p ∈ G and p′ ∈ G′ are a pair of points with (p, p′ ) ∈ M then the two Reeb graphs ΦG (p) and
ΦG′ (p′ ) are within 6δ of each other in the functional distortion distance, and the resulting
barcodes ΨG (p) and ΨG′ (p′ ) are within 18δ of each other in the bottleneck distance.

Lemma 4.2.8. Let G = (V, E, dG ) and G′ = (V′ , E′ , dG′ ) be a pair of compact metric graphs.
Define

\[ N = \max\{\deg(G) - 2|V| + 2,\ \deg(G') - 2|V'| + 2\} \]

Then for all p ∈ G, p′ ∈ G′ , α ∈ [1, ∞),

\[ \|\chi_G(p) - \chi_{G'}(p')\|_\alpha \le 2N \left( 2\, d_B(\Psi_G(p), \Psi_{G'}(p')) \right)^{1/\alpha} \]


Proof. We first claim that ΨG (p) has at most deg(G) − 2|V | + 2 bars. To see this, recall
from Section 2.2.5 that every nonzero death time in ΨG (p) comes from an upfork at a vertex
v ∈ G. The maximum number of death times contributed by a given vertex is (deg(v) − 2),
as at least one direction adjacent to v is part of a geodesic from v to p (assuming p ≠ v),
and k upfork directions gives k − 1 death times. Summing this quantity over the vertices of
G gives deg(G) − 2|V |. The remaining zero death times come from p itself. If p is a vertex,
then we have to add back one of the death times we omitted in degree one, as well as the
death time in degree 0. If p is not a vertex, and hence has degree two, then there are again
two zero death times, one in degree one and the other in degree zero.


   We thus see that B = ΨG (p) and B′ = ΨG′ (p′ ) have at most N bars apiece. Suppose
that dB (B, B′ ) = ε. This implies that one can transform B into B′ by:

  • Deleting intervals of length less than 2ε.

  • Adding intervals of length less than 2ε.

  • Extending or shrinking the endpoints of intervals in B by at most ε on each side.

As B′ has at most N bars, the number of new or adjusted intervals (the second and third
bullet points) is at most N . Likewise, the number of deleted intervals (the first bullet point)
is at most N . This implies that there is a collection of sets S1 , · · · , Sn , n ≤ 2N , with
corresponding indicator functions hi = 1Si and integers ni ∈ {0, 1} such that

\[ \chi_G(p) - \chi_{G'}(p') = \sum_{i=1}^{n} (-1)^{n_i} h_i, \qquad |S_i| \le 2\varepsilon \]

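   Using the Minkowski inequality,

\[
\begin{aligned}
\|\chi_G(p) - \chi_{G'}(p')\|_\alpha &= \Big\| \sum_{i=1}^{n} (-1)^{n_i} h_i \Big\|_\alpha \\
&\le \sum_{i=1}^{n} \|h_i\|_\alpha \\
&\le \sum_{i=1}^{n} (2\varepsilon)^{1/\alpha} \\
&\le 2N (2\varepsilon)^{1/\alpha}
\end{aligned}
\]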


Corollary 4.2.9. For a fixed compact metric graph G, let p, q ∈ G be any two basepoints.
Then for α ∈ [1, ∞),

\[ \|\chi_G(p) - \chi_G(q)\|_\alpha \le 2\left(\deg(G) - 2|V| + 2\right) \left(2\, d_G(p, q)\right)^{1/\alpha} \]

When α = 1, we see that χG is Lipschitz.

Corollary 4.2.10. Let G, G′ be a pair of compact metric graphs, and let M be a correspondence
between them realizing the Gromov-Hausdorff distance δ = dGH (G, G′ ). Let p ∈ G and
p′ ∈ G′ be a pair of points with (p, p′ ) ∈ M. Define

\[ N = \max\{\deg(G) - 2|V| + 2,\ \deg(G') - 2|V'| + 2\} \]

Then

\[ \|\chi_G(p) - \chi_{G'}(p')\|_\alpha \le 2N (36\delta)^{1/\alpha} \]

   The next step in our construction is to consider the collection of Reeb graphs produced by
varying the basepoint along a fixed compact metric graph. The set of extended persistence
diagrams for these Reeb graphs is our intended object of study. For a topological space X,
the notation C(X) refers to the set of compact subsets of X.

Definition 4.2.11. For a fixed compact metric graph G, define the intrinsic Reeb transform
of G to be the collection of Reeb graphs IRT (G) = {ΦG (x) | x ∈ G}. The resulting set
of barcodes IP HT (G) = {ΨG (x) | x ∈ G} will be called the intrinsic persistent homology
transform of G. Topologize ZR using any of the equivalent Lα metrics for α ∈ [1, ∞). The
corresponding set of extended Euler curves IECT (G) = {χG (x) | x ∈ G} will be called the
intrinsic Euler characteristic transform.
[Diagram: MGraphs --IRT--> C(Reeb) --ExDg--> C(Barcodes) --χ--> C(ZR ), with composites
IPHT : MGraphs → C(Barcodes) and IECT : MGraphs → C(ZR ).]


   If one starts instead with a metric measure graph (G, µ), then the maps ΦG and ΨG can
be used to push forward the measure on G to measures on the spaces Reeb and Barcodes
respectively. For a measurable space Y , the notation P(Y ) refers to the space of Borel
probability measures on Y .

Definition 4.2.12. For a fixed compact metric measure graph (G, µ), define the pushfor-
ward measures IRM T (G, µ) = (ΦG )∗ (µ) ∈ P(Reeb) and IP HM T (G, µ) = (ΨG )∗ (µ) ∈
P(Barcodes) as the intrinsic Reeb measure transform and intrinsic persistent homology
measure transform respectively.
[Diagram: MMGraphs --IRMT--> P(Reeb) --ExDg--> P(Barcodes), with composite
IPHMT : MMGraphs → P(Barcodes).]

   Our focus will mainly be on the maps IP HT, IECT and IP HM T ; the maps IRT and
IRM T are defined for the sake of completeness but are of little independent interest, as
we will see that the original metric graph can be recovered from any of its associated Reeb
graphs (Remark 4.5.4), so nothing is gained from taking infinitely many.


   The following lemma demonstrates that the intrinsic persistent homology transform of a
metric graph has a natural metric structure. Note that this is always the case for the IECT,
as the Lα distance is a proper metric.

Lemma 4.2.13. Let G be a metric graph. The bottleneck distance dB restricted to IP HT (G)
is always a true metric.

Proof. We distinguish two cases: either G is a graph consisting of a single point or it is
not. If G is a single point, IPHT (G) is a single barcode consisting of the point (0, 0) on
the diagonal, and dB restricts to give the trivial metric on this space. Otherwise, if G is not
the one-point graph, the dictionary of Section 2.2.5 implies that none of the Reeb graphs it
produces have barcodes with points on the diagonal, and as mentioned in Section 2.2.1 the
bottleneck distance is a metric if our barcodes avoid the diagonal.

   Our three transforms can be used to define pseudometrics on the space of metric graphs
and metric measure graphs respectively, as in [DSW15].

Definition 4.2.14. Let G, H be a pair of compact metric graphs. Define the persistence
distortion pseudometric dP D (G, H) = d_H^B (IPHT (G), IPHT (H)) to be the Hausdorff
distance between their corresponding subsets of Barcode space. For α ∈ [1, ∞), define
the α-Euler distortion pseudometric d^α_ED (G, H) = d_H^{L^α} (IECT (G), IECT (H)). Lastly, if
(G, µG ) and (H, µH ) are a pair of metric measure graphs, we define the measured persistence
distortion pseudometric using the ∞-Wasserstein metric, dM P D ((G, µG ), (H, µH )) =
dW,∞ (IPHMT (G, µG ), IPHMT (H, µH )).
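
In practice, IPHT (G) would be approximated by the barcodes of finitely many basepoints, and the Hausdorff distance between two such finite samples can be computed directly from pairwise distances. The sketch below is generic: the argument dist stands in for the bottleneck distance, and the toy metric used in the example (a sup-norm distance between single bars) is an assumption chosen only to keep the example self-contained.

```python
def hausdorff_distance(A, B, dist):
    """Hausdorff distance between finite nonempty collections A and B under dist."""
    a_to_b = max(min(dist(a, b) for b in B) for a in A)
    b_to_a = max(min(dist(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)

def toy_bar_distance(x, y):
    """Stand-in metric: sup-norm distance between two (birth, death) pairs."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

sample_G = [(0.0, 1.0), (0.0, 3.0)]
sample_H = [(0.0, 1.0)]
d_pd = hausdorff_distance(sample_G, sample_H, toy_bar_distance)
```

The second element of sample_G has no nearby counterpart in sample_H, which is what drives the distance away from zero.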


4.3     Stability Results

The following stability theorem suggests that the persistence distortion can provide a weak
lower bound to the intractable Gromov-Hausdorff distance, and is Theorem 3 in [DSW15],
with the constant changed from 6 to 18 as per the comment following Theorem 4, since we
are using 1-dimensional persistence in addition to 0-dimensional.

Theorem 4.3.1 ([DSW15]). For a pair of metric graphs G, H, dP D (G, H) ≤ 18 dGH (G, H).

   Applying Corollary 4.2.10, we obtain the following, analogous theorem for the IECT.

Theorem 4.3.2. For a pair of metric graphs G = (VG , EG ) and H = (VH , EH ), define

\[ N = \max\{\deg(G) - 2|V_G| + 2,\ \deg(H) - 2|V_H| + 2\} \]

and take α ∈ [1, ∞). Then

\[ d^{\alpha}_{ED}(G, H) \le 2N \left( 36\, d_{GH}(G, H) \right)^{1/\alpha} \]

   Lastly, we have the following result for the measured persistence distortion.

Theorem 4.3.3. For a pair of (full-support) metric measure graphs (G, µG ) and (H, µH ),


                    dM P D ((G, µG ), (H, µH )) ≤ 18 D∞ ((G, µG ), (H, µH ))


   The rest of this section is devoted to the proof of Theorem 4.3.3.

Proof. The strategy of this proof is to show that a measure coupling between a pair of
metric-measure graphs gives rise to a low-distortion correspondence between their supports,
which pushes forward to a low-distortion correspondence between subsets of barcode space.
We then show that this correspondence between subsets of barcode space is dense in the
support of the pushforward of the measure coupling, so that the pushforward measures ad-
mit a coupling with support very close to the diagonal.


   Let G, H ∈ MMGraphs such that D∞ (G, H) = δ. For a given ε > 0, let π be a measure
coupling of µG and µH for which J∞ (π) < δ + ε. As we have seen in Section ??, the support
of a measure coupling is always a correspondence, and since µG and µH have full support,
we know that supp(π) is a correspondence between G = supp(µG ) and H = supp(µH ).
Moreover, since dGH (G, H) = (1/2) inf_R ‖ΓG,H ‖L∞ (R×R) , we can deduce that dGH (G, H) < δ + ε.
Using the proof of Theorem 4.3.2 from [DSW15], if (x, y) ∈ supp(π) is a pair of points
paired by the correspondence then dB (Ψ(G, x), Ψ(H, y)) ≤ 18(δ + ε). Now consider the
pushforward measures µ̃G = (ΨG )∗ (µG ), µ̃H = (ΨH )∗ (µH ), and π̃ = (ΨG × ΨH )∗ (π) ∈
P (Barcodes × Barcodes).

Claim 4.3.4. π̃ is a measure coupling of µ̃G and µ̃H .

Proof. Let us show that π̃ has these measures as marginals. Let S ⊂ Barcodes be a
measurable subset of the barcode space. Then

\[
\begin{aligned}
\tilde{\pi}(S \times \mathrm{Barcodes}) &= \pi\left((\Psi_G \times \Psi_H)^{-1}(S \times \mathrm{Barcodes})\right) \\
&= \pi\left(\Psi_G^{-1}(S) \times H\right) \\
&= \mu_G\left(\Psi_G^{-1}(S)\right) \\
&= \tilde{\mu}_G(S)
\end{aligned}
\]

A symmetric argument shows that the other marginal is µ̃H .
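
The marginal computation above has a simple discrete analogue. In the sketch below, G and H are replaced by finite point sets, the coupling π by a matrix with nonnegative entries summing to one, and the maps ΨG and ΨH by label arrays assigning each point the index of its barcode; all of these are assumptions made for illustration.

```python
import numpy as np

def pushforward(weights, labels, n_labels):
    """Push a discrete measure forward along the map i -> labels[i]."""
    return np.bincount(labels, weights=weights, minlength=n_labels)

def pushforward_coupling(pi, labels_g, labels_h, n_g, n_h):
    """Push a coupling matrix forward along (i, j) -> (labels_g[i], labels_h[j])."""
    out = np.zeros((n_g, n_h))
    for i, a in enumerate(labels_g):
        for j, b in enumerate(labels_h):
            out[a, b] += pi[i, j]
    return out

# A coupling between a 3-point space and a 2-point space.
pi = np.array([[0.2, 0.1],
               [0.1, 0.3],
               [0.3, 0.0]])
mu_g, mu_h = pi.sum(axis=1), pi.sum(axis=0)

labels_g = [0, 0, 1]   # the first two points of G share a barcode
labels_h = [0, 1]
pushed = pushforward_coupling(pi, labels_g, labels_h, 2, 2)

# The marginals of the pushed-forward coupling are the pushed-forward marginals.
first_ok = np.allclose(pushed.sum(axis=1), pushforward(mu_g, labels_g, 2))
second_ok = np.allclose(pushed.sum(axis=0), pushforward(mu_h, labels_h, 2))
```

Both checks succeed for any choice of labels, mirroring the fact that the argument uses nothing about ΨG and ΨH beyond measurability.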

   Next, we claim that the image of the support of π is dense in the support of π̃. This is
a general fact about measures and pushforwards.

Claim 4.3.5. Let P be a Polish space, and T any topological space, and let f : P → T be
continuous. If π is a probability measure on P , then f (supp(π)) is dense in supp(f∗ (π)).

Proof. Firstly, take x ∈ supp(π), and let V be any open neighborhood of f (x). Then (f∗ (π))(V ) =
π(f −1 (V )), and since f −1 (V ) is an open neighborhood of x (which is a point in the
support of π), f −1 (V ) has positive measure. Thus every neighborhood of f (x) has positive
measure. This shows the inclusion

\[ f(\mathrm{supp}(\pi)) \subseteq \mathrm{supp}(f_*(\pi)) \]


   Next, take y ∈ supp(f∗ (π)), and let V be an open set containing y. Then (f∗ π)(V ) > 0.
Since π is a probability measure on a Polish space, it is Radon, and hence any subset
of P \ supp(π) has measure zero. Hence f −1 (V ) must intersect the support of π. Let
x ∈ f −1 (V ) ∩ supp(π). Then f (x) ∈ V ∩ f (supp(π)). Thus, every neighborhood V of y
                                                                                                    65


meets f (supp(π)), proving density.

   Now take a pair $(b_1, b_2) \in \mathrm{supp}(\tilde\pi)$. By density, there is an arbitrarily close pair
$(b_1', b_2') \in (\Psi_G \times \Psi_H)(\mathrm{supp}(\pi))$, corresponding to a pair $(x, y) \in \mathrm{supp}(\pi)$ for which $b_1'$ is the barcode
associated to the pointed graph $(G, x)$, and similarly for $b_2'$ and $(H, y)$. As we have shown,
the distance between the barcodes $b_1'$ and $b_2'$ is at most $18(\delta + \epsilon)$. By the triangle inequality,
and taking limits, the same is true of the pair $(b_1, b_2)$. Finally, letting $\epsilon$ go to zero completes
the proof.


4.4      Injectivity Results

Although these topological transforms are richer than single barcodes, there still exist pairs of
graphs G and H which are not isometric but for which IPHT(G) = IPHT(H), IECT(G) =
IECT(H), and IPHMT(G) = IPHMT(H). The following is a particularly simple example;
others can be found in [DSW15].

Counterexample 4.4.1. In Figure 4.1, the lengths of the small branches are
all equal, as are the lengths of the middle-sized branches, and finally both central edges have
the same length too. For every middle-sized branch in G there is a corresponding branch in
H with the same number of small branches, not necessarily on the same side. The barcodes
for points on matching branches are the same. Similarly the barcodes for points along the
central edges of G and H agree.


   This demonstrates that the IPHT (and hence IECT) is not injective on the space MGraphs.
In fact, the above counterexample can be embedded in any graph G, in the following sense.

Figure 4.1: The graphs G (left) and H (right). G and H are not isomorphic, but have the same image
under all three of our topological transforms.


Counterexample 4.4.2. For a given graph G with edge E, we can glue one of the two
trees from Counterexample 4.4.1 at its center. The two resulting graphs, G1 and G2 , have
the same intrinsic persistent homology transform. Moreover, we are also free to scale down
our trees before we glue them to G, and hence G1 and G2 can be taken to be as close to G
as one likes in the Gromov-Hausdorff metric.


Figure 4.2: The same edge on a given graph G, with one of the two counterexample trees glued along its
center.


   This counterexample is summarized in the following proposition.

Proposition 4.4.3. Every open GH-ball in MGraphs contains a pair of non-isometric
graphs with the same intrinsic persistent homology transform.

   Contrast this with the following pair of results, which assert that there is a subset of
MGraphs which is dense in the GH-topology, and on which our topological transforms
are injective, up to isometry. That is, two graphs in this subset have the same topological
transform iff they are isometric (even though their combinatorial structures might differ).

Theorem 4.4.4. The IPHT and IECT are injective up to isometry when restricted to the
sets {G ∈ MGraphs | ΨG injective} and {G ∈ MGraphs | χG injective} respectively,
noting that the latter contains the former. Moreover, the IPHMT is injective up to measure-
preserving isometry on the set {(G, µ) ∈ MGMraphs | ΨG injective}.

Proposition 4.4.5. The set {G ∈ MGraphs | ΨG injective} is GH-dense in MGraphs.

   Combining this proposition with the following result from [BBI01] demonstrates that any
compact length space can be approximated by graphs in {G ∈ MGraphs | ΨG injective}.
This suggests that we can study the structure of more complex geodesic spaces by understanding
the intrinsic persistent homology transforms of approximating graphs.

Proposition 4.4.6 ([BBI01],7.5.5). Every compact length space can be obtained as a Gromov-
Hausdorff limit of finite graphs.

   However, Proposition 4.4.3 implies that injectivity (even up to isometry) on a generic
(open, dense) subset of MGraphs is impossible in the GH topology. Our approach is therefore
two-fold: first, to derive a local injectivity result in the GH topology (Theorem
4.4.7); second, to find a natural, application-driven topology on MGraphs for which
there does exist a generic subset on which our topological transforms are injective, up to
isometry.


   Our local injectivity results assert that for every metric graph G, there exists a GH-ball
of radius $\epsilon(G)$, centered at G, such that no metric graph H in this ball, distinct from G,
has the same intrinsic persistent homology transform as G. This does not mean that the
intrinsic persistent homology transform is injective on this ball, and indeed Counterexample
4.4.2 demonstrates that this is not the case.

Theorem 4.4.7. IPHT is locally injective in the following sense: $\forall G \in$ MGraphs there
exists a constant $\epsilon(G) > 0$ such that $\forall G' \in$ MGraphs with $0 < d_{GH}(G, G') < \epsilon(G)$ we have
$d_{PD}(G, G') > 0$.

   Although the Gromov-Hausdorff topology is too fine to admit further injectivity results,
these can be formulated in the fibered topology.

Theorem 4.4.8.A. There is a subset U ⊂ MGraphs containing {G ∈ MGraphs |
ΨG injective} and {G ∈ MGraphs | χG injective} which is open and dense in the fibered
topology, and such that the IPHT and IECT are injective on U , up to isometry.

Theorem 4.4.8.B. Let MGraphs∗ be the subset of MGraphs consisting of graphs whose
underlying combinatorial graph has (i) no topological self-loops and (ii) at least three vertices
of valence not equal to two. Then the intrinsic persistent homology measure transform is
injective on U ∩ MGraphs∗ , up to measure-preserving isometry.

Remark 4.4.9. By considering metric graphs up to isometry (and not only isometric iso-
morphism), and working with the resulting, coarser quotient of the fibered topology, it is
possible to remove the “up to isometry” qualification in the prior result. It is a basic result
in topology that our dense set U descends to a dense set in any quotient topology, and it is
not hard to verify that, in our case, it continues to be open.

   The necessity of restricting oneself to MGraphs∗ for Theorem 4.4.8.B can be seen in the
following counterexample.

Counterexample 4.4.10. Let (G, µ) be a metric-measure graph homeomorphic to an in-
terval, and let f : G → G be the isometry exchanging its leaves. Let S ⊂ G be a measurable
subset for which S ∩ f (S) = ∅. Then (ΨG )∗ (µ)(ΨG (S)) = µ(S) + µ(f (S)), as S and f (S)
are mapped to the same subset of barcode space by ΨG . The resulting measure on barcode
space is thus obtained by symmetrizing µ with respect to the automorphism f , and since
there are many distinct measures with the same such symmetrization, this procedure cannot
be reversed and IPHMT will not be injective.
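
The loss of information in this counterexample can be seen in a finite toy model. The sketch below assumes a discretization of the interval into five points and uses the orbit of a point under the flip as a stand-in for its barcode; the measures `mu1` and `mu2` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as F

N = 5  # hypothetical discretization: points 0, 1, 2, 3, 4 along the interval


def flip(x):
    """The isometry exchanging the two leaves of the interval."""
    return N - 1 - x


def orbit(x):
    """Stand-in for the barcode at x: x and flip(x) receive the same label."""
    return min(x, flip(x))


def pushforward(mu):
    """Push mu forward along the orbit map (the symmetrization of mu)."""
    out = defaultdict(F)
    for x, mass in enumerate(mu):
        out[orbit(x)] += mass
    return dict(out)


# Two hypothetical probability measures of full support.
mu1 = [F(3, 10), F(3, 10), F(2, 10), F(1, 10), F(1, 10)]
mu2 = [F(3, 10), F(1, 10), F(2, 10), F(3, 10), F(1, 10)]

# The only isometries of the interval are the identity and the flip.
mu1_flipped = [mu1[flip(x)] for x in range(N)]
```

Here `mu1` and `mu2` are not related by either isometry of the interval, yet they have the same pushforward, so no procedure can recover the measure from its symmetrization.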

   Theorem 4.4.8.A has the following immediate consequence.

Corollary 4.4.11. There is a fibered-topology-generic subset U of MGraphs such that dP D
and dαED are true metrics on the isometry equivalence classes of U .


4.5     Overview of the Proofs from Section 4.4

In this section, we demonstrate how to deduce Theorems 4.4.4 and 4.4.8 from a collection
of technical lemmata. The proofs of these lemmata, as well as the remaining results from
the prior sections, are located in the appendices. A list at the end of this section directs
the reader to the appropriate appendix and section for each result. We have organized the
thesis in this way so as to maintain the readability of the text while still emphasizing those
proofs which are most instructive.


   The proof of Theorem 4.4.4 is based on the following three lemmata.

Lemma 4.5.1. Let $(G, d_G)$ be a metric graph that is not a circle. For every $p \in G$ there
exists $\epsilon(p) > 0$ such that for all $q \in G$ with $d_G(p, q) < \epsilon(p)$, we have

\[ d_G(p, q) = d_B(\Psi_G(p), \Psi_G(q)). \]


Lemma 4.5.2. Let $(G, d_G)$ be a metric graph with vertex set $V$ (omitting vertices of valence
two). Suppose that G is not a circle. Define $\theta_G : G \to \mathbb{R}$ as follows:

\[ \theta_G(p) = \min_{v \in V} d_G(p, v). \]

Then $\theta_G$ is a continuous function, and for all $p \in G$ there exists $\epsilon(p) > 0$ such that for all $q \in G$
with $d_G(p, q) < \epsilon(p)$ we have
\[ d_G(p, q) = |\theta_G(p) - \theta_G(q)|. \]
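
As an illustration of the statement, the following sketch checks the identity on the simplest metric graph, an interval whose two leaves are its only vertices of valence different from two. The interval length, the sample grid, and the helper `local_radius` (one admissible choice of $\epsilon(p)$ for this particular graph) are assumptions made for the example.

```python
from fractions import Fraction as F

LENGTH = F(10)              # hypothetical length of the interval
VERTICES = (F(0), LENGTH)   # the two leaves


def d(p, q):
    """Intrinsic metric on the interval [0, LENGTH]."""
    return abs(p - q)


def theta(p):
    """Distance from p to the nearest vertex of valence != 2."""
    return min(d(p, v) for v in VERTICES)


def local_radius(p):
    """One valid epsilon(p) here: stay on one side of the midpoint."""
    mid = LENGTH / 2
    return LENGTH if p == mid else d(p, mid)


def identity_holds_near(p, grid):
    """Check d(p, q) == |theta(p) - theta(q)| for sampled q within epsilon(p)."""
    eps = local_radius(p)
    return all(d(p, q) == abs(theta(p) - theta(q)) for q in grid if d(p, q) < eps)


GRID = [F(k, 4) for k in range(41)]  # sample points 0, 1/4, ..., 10
```

The identity is only local: for the points 4 and 6, which lie on opposite sides of the midpoint, $\theta$ takes the same value although the points are at distance 2.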

Lemma 4.5.3. Let $(G, d_G)$ be a metric graph that is not a circle. Define $s_G : IECT(G) \to \mathbb{R}$
as follows: for $\gamma \in IECT(G)$, $s_G(\gamma)$ is the smallest point of discontinuity for $\gamma$. Then for
all but finitely many $p \in G$, $s_G(\chi_G(p)) = \theta_G(p)$. Suppose further that $\chi_G$ is injective. If
one defines $\tilde s_G(\gamma) = \lim_{n} s_G(\gamma_n)$, using any sequence $\gamma_n \in IECT(G) \setminus \{\gamma\}$ converging to $\gamma$,
then $\tilde s_G(\chi_G(p)) = \theta_G(p)$ for all $p \in G$.

Proof of Theorem 4.4.4. Lemma 4.2.6 implies that $\Psi_G$ is continuous, and Lemma 4.4.2 implies
the same for $\chi_G$. We know that a continuous bijection from a compact space to a
Hausdorff space is a homeomorphism. Thus, when $\Psi_G$ or $\chi_G$ is injective, $(IPHT(G), d_B)$
or $(IECT(G), L^\alpha)$, respectively, is homeomorphic, as a topological space, to $(G, d_G)$.


   For the remainder of the proof, we focus on the IPHT, as the proof is virtually identical
for the IECT. Let $(IPHT(G), \hat d_B)$ be the intrinsic path metric space derived from the metric
space $(IPHT(G), d_B)$.¹ We identify G with $IPHT(G)$ via the map $\Psi_G$, considering $d_G$, $d_B$,
and $\hat d_B$ as metrics on G. Since $\Psi_G$ is a homeomorphism, the class of $d_B$-continuous paths is
the same as the class of $d_G$-continuous paths. We claim that $(IPHT(G), \hat d_B)$ is isometric to
$(G, d_G)$, so that we may recover G from $IPHT(G)$.


       Let $\gamma : I \to G$ be a $d_G$-continuous path, and $P = \{0 = t_0, \cdots, t_n = 1\}$ a partition.
Let $\ell_{G,P}(\gamma)$ and $\ell_{B,P}(\gamma)$ denote the lengths of $\gamma$ in $d_G$ and $d_B$ with respect to this partition.
We claim that $P$ admits a refinement $P \subseteq P' = \{0 = r_0, \cdots, r_m = 1\}$ for which
$\ell_{G,P'}(\gamma) = \ell_{B,P'}(\gamma)$. As the length of a path in a metric space is the supremum of its lengths
with respect to partitions, taken over the set of all possible partitions, and refining a partition
can only increase the length, this implies that $\gamma$ has the same length in both $d_G$ and $d_B$.
Since $\hat d_B$ is the intrinsic metric defined using $d_G$-continuous
paths and $d_G$ is an intrinsic metric, this will imply that $\hat d_B(p, q) = d_G(p, q)$ for all $p, q \in G$.


       For each time $t \in I$ and corresponding point $\gamma(t) \in G$, there is a constant $\epsilon_t$ witnessing
the validity of Lemma 4.5.1 for $\gamma(t)$. Let $U_t$ be the open $d_G$-neighborhood of $\gamma(t)$ of radius
$\epsilon_t$. Since $\gamma : I \to G$ is continuous, there is a constant $\delta_t$ such that $\gamma((t - \delta_t, t + \delta_t)) \subseteq U_t$.
Let $V_t = (t - \delta_t/2, t + \delta_t/2)$. The sets $V_t$ form an open cover of $I$, and hence by compactness
a finite subcover exists, corresponding to a collection of times $\Omega = \{r_1, \cdots, r_k\}$. Let us
augment $\Omega$ with the times in $P$ to produce our refinement $P'$. Note that if $t, t' \in P'$ are
two consecutive times, the triangle inequality implies $t' - t < \max\{\delta_t, \delta_{t'}\}$, so that either
$\gamma(t') \in \gamma((t - \delta_t, t + \delta_t)) \subseteq U_t$ or $\gamma(t) \in \gamma((t' - \delta_{t'}, t' + \delta_{t'})) \subseteq U_{t'}$. Direct application of Lemma
4.5.1 then yields

\[
\ell_{G,P'}(\gamma) = \sum_{i=0}^{m-1} d_G(\gamma(r_i), \gamma(r_{i+1})) = \sum_{i=0}^{m-1} d_B(\gamma(r_i), \gamma(r_{i+1})) = \ell_{B,P'}(\gamma),
\]

   where $r_0 < \cdots < r_m$ denote the points of $P'$, completing the proof that $d_G = \hat d_B$. For the map $IPHMT(G)$, we can first obtain
$IPHT(G)$ by taking the support of the pushforward measure, since we are by assumption
working with measures of full support. As we have just seen, we can then obtain the underlying
metric graph G. The measure of a Borel subset $S \subseteq G$ is then equal to $(\Psi_G)_*(\mu)(\Psi_G(S))$.

   ¹ To be precise, an intrinsic path metric is always defined in reference to a class of admissible paths. Here
we are considering all $d_B$-continuous paths.


   Lastly, when working with the IECT, we replace $\hat d_B$ with the intrinsic metric defined using
the length element $d\tilde s_G$. Lemma 4.5.3 implies this is the same as $\widehat{d\theta}_G$. Lemma 4.5.2, and an
argument identical to the one above, then imply that $\widehat{d\theta}_G = d_G$. Thus we can reconstruct
$(G, d_G)$ from $IECT(G)$ in the event that $\chi_G$ is injective.
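
The refinement argument in the proof above can be illustrated numerically. In the following sketch, the capped metric `d_capped` is a hypothetical stand-in for $d_B$: it agrees with the intrinsic metric only at small distances, which is the property guaranteed by Lemma 4.5.1. It is not the bottleneck distance, and the path and partitions are chosen only for the example.

```python
from fractions import Fraction as F


def d_intrinsic(p, q):
    """Intrinsic metric on the interval [0, 10]."""
    return abs(p - q)


def d_capped(p, q):
    """A metric that agrees with d_intrinsic only at distances below 1."""
    return min(abs(p - q), F(1))


def gamma(t):
    """Constant-speed path from 0 to 10, parametrized by t in [0, 1]."""
    return 10 * t


def partition_length(metric, times):
    """Length of gamma with respect to a partition, measured in `metric`."""
    return sum(metric(gamma(s), gamma(t)) for s, t in zip(times, times[1:]))


def refine(times, mesh):
    """Add points so that consecutive times are at most `mesh` apart."""
    out = [times[0]]
    for s, t in zip(times, times[1:]):
        pieces = -((s - t) // mesh)  # ceiling of (t - s) / mesh
        out.extend(s + (t - s) * F(k, pieces) for k in range(1, pieces + 1))
    return out


coarse = [F(0), F(1, 2), F(1)]
fine = refine(coarse, F(1, 20))
```

On the coarse partition the two metrics assign different lengths to the path, but on the refinement, whose consecutive points are within the radius where the metrics agree, the lengths coincide. Taking the supremum over partitions therefore gives the same path length for both metrics.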


Remark 4.5.4. Let $(G, p)$ be a pointed metric graph with associated Reeb graph $\Phi_G(p)$ and
height function $f_p : \Phi_G(p) \to \mathbb{R}$. A similar argument, combined with Proposition ??, implies
that $(\Phi_G(p), \widehat{df}_p)$ is isometric to $(G, d_G)$.

   The following proposition allows us to leverage our geometric results in proving the
genericity statements of Theorem 4.4.8.

Proposition 4.5.5. Let MGraphs∗ be the set of metric graphs with (i) no topological self-
loops, and (ii) at least three vertices of valence distinct from two. Let G ∈ MGraphs∗ .
If the edge lengths of G are linearly independent over Z, then ΨG and χG are injective.

   Let $\Omega_{\mathbb{Z}}$ be the subset of $\Omega$ consisting of pairs $(X, \vec v)$, with the edge weights in $\vec v$ being
linearly dependent over $\mathbb{Z}$ (this condition is respected by the action of $\mathrm{Aut}(X)$, which only
permutes the entries of $\vec v$). With the map $p : \Omega \to$ MGraphs as above, sending a pair
$(X, \vec v)$ to the associated shortest-path metric graph, it is clear that $U = p(\Omega_{\mathbb{Z}}^{C})$ is dense and
open in MGraphs in the fibered topology. Proposition 4.5.5 tells us that MGraphs∗ $\cap\, U \subset$
$\{G \in$ MGraphs∗ $\mid \Psi_G$ and $\chi_G$ injective$\}$. This, taken together with Theorem 4.4.4, proves
Theorem 4.4.8.B.


   For Theorem 4.4.8.A, we must consider the implications of topological self-loops or two or
fewer vertices of valence distinct from two. Any metric graph with a topological self-loop ad-
mits an isometric automorphism that flips the loop. Moreover, certain combinatorial graphs
with fewer than three vertices of valence not equal to two, such as a pair of vertices con-
nected by multiple edges, admit isometric automorphisms flipping those vertices regardless of
the metric structure chosen. These isometric automorphisms are obstructions to injectivity
of ΨG (and hence χG ). Proposition 4.5.5 states that, setting aside these combinatorial
obstructions, ΨG is generically injective, up to isometry. Intuitively, Proposition 4.5.5 is a
generalization of the result that random metric graphs have trivial automorphism groups.
However, our result is strictly stronger, as it is possible for ΨG to fail to be injective even if
Aut(G) is trivial, as illustrated in Figure 4.3.


   To complete the proof of Theorem 4.4.8.A, we show that, even when G has topological
self-loops or few vertices of valence not equal to two, it is possible to reconstruct the isometry
type of G from IP HT (G) or IECT (G), when the edge lengths of G are linearly independent
over Z, as any failures of injectivity of ΨG and χG are of a particularly simple type. This
analysis is carried out in Appendix D.

Figure 4.3: The basepoints p and q produce identical barcodes, despite the graph having a trivial
automorphism group.


   Detailed proofs are contained in the following sections. The sections are ordered by
logical implication and not the order in which their results appear above.


   • Section A.1 proves Theorem 4.4.7, that the IPHT is locally injective.

   • Section A.2 proves Lemmata 4.5.1, 4.5.2, and 4.5.3.

   • Section A.3 proves Proposition 4.4.5, showing GH-density of ΨG -injective graphs.

   • The proof of Proposition 4.5.5 is split into two parts. The first part, injectivity of
      ΨG , can be found in Appendix B. The second part, injectivity of χG , can be found in
      Appendix C.

   • Appendix D discusses the case of topological self-loops and two or fewer vertices of
      valence not equal to two.
                                                                   CHAPTER 5


The Distance Kernel Transform


5.1      Motivation

In this chapter, we introduce a number of new topological transforms. These are not unre-
lated to the topological transforms of the prior chapter, and are in fact motivated by some
of the limitations of the IPHT and the IECT, namely:

  • Metric graphs enjoy the following geometric property: for any basepoint p and second
      point x, there is a constant $\epsilon(x) > 0$ such that if $d(x, y) \le \epsilon(x)$ then $d(x, y) =
      |d(p, x) - d(p, y)|$. What this means is that the distance-from-the-basepoint function
      $d(p, \cdot)$ encodes the local geometry of our space. This is true for metric graphs because,
      for any radius r, the sphere $\partial B(p, r)$ is always a finite set of points. In a higher-
      dimensional space, or a general metric space, this will generally fail to be the case, and
      the geometric content of distance-to-the-basepoint functions is much more limited.
      This situation can be improved by considering the distance to a set of n basepoints,
      but then the dimension of the parameter space blows up, and the resulting invariant
      is practically uncomputable.

  • We have seen that nearby basepoints produce similar persistence diagrams and ex-
      tended Euler curves. Thus, the IPHT and IECT contain a lot of redundant information.
      This can be partially avoided by cleverly choosing the basepoints whose persistence
      diagrams will be computed. However, there is no straightforward way to optimize this
      procedure.

  • The IPHT and IECT rely on the topological complexity of the underlying space. That
      is, because distance-to-the-basepoint functions are convex, their sublevel sets tend to
      have relatively little homology. For example, the ball of radius r < π around a point
      on the unit sphere is always contractible.

   The approach adopted in this chapter is motivated by the technique of PCA (Principal
Component Analysis). The map $T$ sending a point $p$ to the distance function $d(p, \cdot)$ isometrically
embeds a metric space $X$ into $L^2(X)$. Thus, we can identify $X$ with $T(X)$, a subset
of the Hilbert space $L^2(X)$, and consider which distance functions explain the most variance
in $T(X)$. In other words, we seek basepoints $\{p_1, \cdots, p_k\} \subset X$ such that the projection of
$T(X)$ onto the span $\langle T(p_1), \cdots, T(p_k) \rangle$ minimizes the change in $L^2$-norm. Ideally, these basepoints
will produce distinct and informative persistence diagrams.


   From a functional-analytic perspective, however, it makes no sense to limit ourselves
to projecting along vectors of the form T (p). Indeed, we may preserve more geometry by
projecting along other vectors in L2 (X). Moreover, readers familiar with PCA will recall
that it has two formulations: one as maximizing explained variance, and the other as solving
an eigenvalue problem. The corresponding integral operator on $L^2(X)$ has a very simple
kernel: the distance function $d : X \times X \to \mathbb{R}$. The eigenfunctions of this operator
will then replace the distance-to-the-basepoint functions considered earlier. This has the
following advantages:

  • The kernel of a self-adjoint integral operator can be recovered from its eigenfunctions
      and eigenvalues (this is a consequence of the Spectral Theorem). Since our kernel is
      the distance function itself, our eigenfunctions necessarily contain all the geometric
      data of our space.

  • These eigenfunctions are designed to maximize the “explained geometry” of our space.
      Thus, there is relatively little redundant information.

  • Unlike our distance functions, these eigenfunctions are not generally convex, so their
      sublevel sets are rich in topological content.

   In the following sections, we justify the claims above, explore the geometric content of
these eigenfunctions, and investigate the stability and injectivity of the resulting topological
transforms.


5.2       The Distance Kernel

Let $(X, d_X, \mu_X)$ be a compact metric measure space. We define the following operator on
$L^2(X)$:
\[ (D_X f)(x) = \int_X f(y)\, d_X(x, y)\, d\mu_X(y). \]


Proposition 5.2.1. DX is a self-adjoint operator.

Proof. By convention, $\mu_X$ is Radon. Since $X$ is compact, this implies that $\mu_X(X) < \infty$, and
hence $(X, \mu_X)$ is $\sigma$-finite. We can thus apply Fubini’s theorem, and the symmetry of the
distance function $d_X$, to observe that, for two square-integrable functions $f$ and $g$,

\[
\begin{aligned}
\langle D_X f, g \rangle &= \int_X \left( \int_X f(y)\, d_X(x, y)\, d\mu_X(y) \right) g(x)\, d\mu_X(x) \\
&= \int_X \int_X f(y) g(x)\, d_X(x, y)\, d\mu_X(x)\, d\mu_X(y) \\
&= \int_X f(y) \left( \int_X g(x)\, d_X(y, x)\, d\mu_X(x) \right) d\mu_X(y) \\
&= \langle f, D_X g \rangle,
\end{aligned}
\]

demonstrating self-adjointness.
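
The self-adjointness computation has an exact finite analogue. The sketch below assumes a finite weighted sample in place of $(X, d_X, \mu_X)$, so that integrals become weighted sums; the points, weights, and test functions are hypothetical.

```python
from fractions import Fraction as F

points = [F(0), F(1), F(3), F(7)]               # hypothetical sample on a line
weights = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]  # hypothetical measure mu_X


def dist(x, y):
    return abs(x - y)


def apply_D(f):
    """(D_X f)(x_i) = sum_j f(x_j) d(x_i, x_j) mu_j."""
    return [
        sum(fj * dist(xi, xj) * wj for fj, xj, wj in zip(f, points, weights))
        for xi in points
    ]


def inner(f, g):
    """Weighted L^2 inner product <f, g> = sum_i f_i g_i mu_i."""
    return sum(fi * gi * wi for fi, gi, wi in zip(f, g, weights))


f = [F(1), F(-2), F(5), F(0)]
g = [F(3), F(1), F(-1), F(4)]
```

Note that the matrix with entries $d(x_i, x_j)\mu_j$ is not symmetric when the weights differ. Self-adjointness holds with respect to the weighted inner product, as in the proposition.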

Proposition 5.2.2. DX is a compact operator.

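Proof. Let $f_n \in L^2(X)$ be a bounded sequence of functions, $\|f_n\|_\infty \le C$ for all $n$.
   For $d_X(x, x') \le \epsilon$ and all $n$,

\[
\begin{aligned}
|D_X f_n(x) - D_X f_n(x')| &= \left| \int_X \big( d_X(x, y) f_n(y) - d_X(x', y) f_n(y) \big)\, d\mu_X(y) \right| \\
&\le \int_X |d_X(x, y) - d_X(x', y)|\, |f_n(y)|\, d\mu_X(y) \\
&\le \epsilon \cdot C\, \mathrm{Vol}(X).
\end{aligned}
\]

   Thus $D_X f_n$ is an equicontinuous family of functions on $X$. It is also uniformly bounded,
since $|D_X f_n(x)| \le \mathrm{diam}(X) \cdot C\, \mathrm{Vol}(X)$ for all $x$ and $n$. So, by the Arzelà–Ascoli
theorem, it contains a uniformly convergent, and hence $L^2$-convergent, subsequence. This
demonstrates compactness.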

Corollary 5.2.3. The spectral theorem for compact, self-adjoint operators on a Hilbert space
implies that L2 (X) admits a finite or countably infinite orthonormal basis consisting of eigen-
functions φi of DX , with eigenvalues λi that go to zero as i goes to infinity.

Convention 5.2.4. The spectral theorem asserts the existence of the eigenfunctions φi , but
does not guarantee their uniqueness. Indeed, the choice is never unique. If the eigenvalue
λi has geometric multiplicity one, one has two choices: {φi , −φi }. If the eigenvalue has
geometric multiplicity greater than one, there are infinitely many choices. In what follows,
we will make the generic assumption that all the eigenvalues have multiplicity one. In
resolving the ambiguity of the choice of sign, we have two options. In [?], Note 1, the
authors suggest fixing an arbitrary function $f$ for which $\langle f, \phi_i \rangle \neq 0$ $\forall i$, and asserting that
$\langle f, \phi_i \rangle > 0$ $\forall i$. In order to maintain consistency of the sign convention, we would like $f$ to
be canonically defined. For example, one might set $f$ to be the constant function $f(x) = 1$,
which does not depend on the representation of the data. The limitation of this approach
is that it is hard to see why, even generically speaking, $\langle \phi_i, 1 \rangle \neq 0$ $\forall i$. For this reason, we
adopt the convention that $\langle \phi_i, |\phi_i| \rangle > 0$ $\forall i$. Let us see why, generically speaking, this dot
product is nonzero. Let $X_i^+ = \{x \in X \mid \phi_i(x) > 0\}$ and $X_i^- = \{x \in X \mid \phi_i(x) < 0\}$. If
$\langle \phi_i, |\phi_i| \rangle = 0$, we have

\[ \int_{X_i^+} \phi_i^2(x)\, d\mu_X(x) - \int_{X_i^-} \phi_i^2(x)\, d\mu_X(x) = 0. \]

At the same time, because $\langle \phi_i, \phi_i \rangle = 1$, we have

\[ \int_{X_i^+} \phi_i^2(x)\, d\mu_X(x) + \int_{X_i^-} \phi_i^2(x)\, d\mu_X(x) = 1. \]

Taking these two equations together, we end up with the non-generic condition that

\[ \int_{X_i^+} \phi_i^2(x)\, d\mu_X(x) = \frac{1}{2}. \]
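
The sign convention $\langle \phi_i, |\phi_i| \rangle > 0$ can be applied mechanically once an eigenfunction is sampled. The following is a minimal sketch, assuming a finite weighted sample; the function name `fix_sign` and the tolerance `tol` are hypothetical choices for illustration.

```python
def fix_sign(phi, weights, tol=1e-12):
    """Return phi or -phi, whichever satisfies <phi, |phi|> > 0.

    phi holds sampled values of an eigenfunction and weights the masses of
    the sample points. Raises ValueError in the non-generic case where the
    pairing vanishes (up to tol), since the convention is then undefined.
    """
    pairing = sum(v * abs(v) * w for v, w in zip(phi, weights))
    if abs(pairing) <= tol:
        raise ValueError("non-generic eigenfunction: <phi, |phi|> vanishes")
    return list(phi) if pairing > 0 else [-v for v in phi]
```

For example, with two points of mass 1/2 each, the vector `[-2.0, 1.0]` is flipped to `[2.0, -1.0]`, while `[1.0, -1.0]` is rejected because its positive and negative parts carry equal squared mass.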


5.3      The Distance Kernel Embedding

In this section, we consider the Euclidean embedding provided by the spectrum of the oper-
ator DX associated to a compact metric measure space (X, dX , µX ). Following Convention
5.2.4, we generically assume that our eigenspaces are of multiplicity one, and that we have
a coherent convention for breaking the ±-symmetry and picking “positive” eigenfunctions.

Definition 5.3.1. Let $D_X : L^2(X) \to L^2(X)$ be the operator defined in Section 5.2, with
corresponding orthonormal system of eigenfunctions and eigenvalues $(\phi_i, \lambda_i)$. For an eigenfunction
$\phi_i$, define $\alpha_i = \sqrt{\lambda_i}\, \phi_i : X \to \mathbb{C}$. By convention, when $\lambda_i$ is negative, we take the
square root with positive imaginary part. For a point $p \in X$, with associated distance-to-$p$
function $d_p$, we have

\[
\langle d_p, \phi_i \rangle_{L^2} = \int_X d_X(p, x) \phi_i(x)\, d\mu_X(x) = (D_X \phi_i)(p) = \lambda_i \phi_i(p) = \sqrt{\lambda_i}\, \alpha_i(p).
\]

   Thus $d_p$ has the following eigenfunction expansion, which converges in $L^2$.

\[
d_p = \sum_{i=1}^{\infty} \langle d_p, \phi_i \rangle_{L^2}\, \phi_i = \sum_{i=1}^{\infty} \alpha_i(p)\, \alpha_i
\]
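
For a finite weighted sample the expansion is a finite sum and can be verified numerically. The sketch below is an illustration under stated assumptions: the points and weights are hypothetical, and the eigenpairs are computed with a small cyclic Jacobi routine (`jacobi_eigh`, a textbook method written here only to keep the example self-contained; any symmetric eigensolver would serve).

```python
import cmath
import math


def jacobi_eigh(a, sweeps=100, tol=1e-24):
    """Eigenpairs of a real symmetric matrix via cyclic Jacobi rotations.

    Returns (eigenvalues, v), where column i of v is the i-th eigenvector.
    """
    n = len(a)
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                num, den = 2 * a[p][q], a[q][q] - a[p][p]
                if den < 0:
                    num, den = -num, -den
                t = 0.5 * math.atan2(num, den)  # rotation angle zeroing a[p][q]
                c, s = math.cos(t), math.sin(t)
                for m in (a, v):  # right-multiply a and v by the rotation
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # left-multiply a by the transposed rotation
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    return [a[i][i] for i in range(n)], v


points = [0.0, 1.0, 3.0, 7.0]   # hypothetical sample on a line
weights = [0.4, 0.3, 0.2, 0.1]  # hypothetical measure mu_X
n = len(points)
dist = [[abs(x - y) for y in points] for x in points]

# S_kl = sqrt(mu_k) d_kl sqrt(mu_l) is symmetric with the eigenvalues of D_X.
sym = [[math.sqrt(weights[k] * weights[l]) * dist[k][l] for l in range(n)]
       for k in range(n)]
lam, u = jacobi_eigh(sym)

# phi_i(x_k) = u_k^(i) / sqrt(mu_k) gives an orthonormal system in L^2(mu_X).
phi = [[u[k][i] / math.sqrt(weights[k]) for k in range(n)] for i in range(n)]

# alpha_i = sqrt(lambda_i) phi_i; for negative real input cmath.sqrt returns
# the root with positive imaginary part, matching the stated convention.
alpha = [[cmath.sqrt(lam[i]) * phi[i][k] for k in range(n)] for i in range(n)]


def expansion(k, l):
    """sum_i alpha_i(x_k) alpha_i(x_l), which should recover d(x_k, x_l)."""
    return sum(alpha[i][k] * alpha[i][l] for i in range(n))
```

Each product $\alpha_i(p)\alpha_i(x)$ equals $\lambda_i \phi_i(p) \phi_i(x)$, so the imaginary units coming from negative eigenvalues multiply to real numbers and `expansion(k, l)` agrees with `dist[k][l]` up to floating-point error.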

   We now demonstrate that the eigenfunctions φi corresponding to nonzero eigenvalues are
Lipschitz.

Definition 5.3.2. Let $D_X : L^2(X) \to L^2(X)$ be the operator defined in Section 5.2, with
corresponding orthonormal system of eigenfunctions and eigenvalues $(\phi_i, \lambda_i)$. Define $\beta_i =
\lambda_i \phi_i : X \to \mathbb{R}$.

Lemma 5.3.3. The function $\beta_i$ is $\sqrt{\mathrm{Vol}(X)}$-Lipschitz. Hence, if $\lambda_i \neq 0$, $\phi_i$ is $\big(\sqrt{\mathrm{Vol}(X)}/|\lambda_i|\big)$-
Lipschitz.

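Proof. Let $p, q \in X$ with $d_X(p, q) \le \epsilon$. Then by the fact that $\beta_i = D_X \phi_i$ and Cauchy-
Schwarz,

\[
\begin{aligned}
|\beta_i(p) - \beta_i(q)|^2 &= \big| (D_X \phi_i)(p) - (D_X \phi_i)(q) \big|^2 \\
&= \left| \int_X \big( d_X(p, x) - d_X(q, x) \big) \phi_i(x)\, d\mu_X(x) \right|^2 \\
&\le \int_X \big( \underbrace{d_X(p, x) - d_X(q, x)}_{|\cdot|\, \le\, d_X(p,q)\, \le\, \epsilon} \big)^2\, d\mu_X(x) \cdot \underbrace{\int_X \phi_i^2(x)\, d\mu_X(x)}_{=1} \\
&\le \epsilon^2\, \mathrm{Vol}(X).
\end{aligned}
\]

Thus, $|\beta_i(p) - \beta_i(q)| \le \epsilon \sqrt{\mathrm{Vol}(X)}$, so $\beta_i$ is $\sqrt{\mathrm{Vol}(X)}$-Lipschitz.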

   We also have the following regularity result.

Lemma 5.3.4. Any eigenfunction φ of the distance kernel operator DX with nonzero eigen-
value is smooth.

Proof. The operator D is convolution with the distance kernel. It is a standard result in Rie-
mannian geometry (cf. [Iva]) that the distance function on a Riemannian manifold is smooth
a.e., in particular away from the pairs of conjugate points on our manifold. Convolving a
bounded function with a smooth (or smooth a.e.) function gives a smooth function. Since
φ is a scaled convolution of itself with the distance kernel, it must be smooth.

   We now consider the functions αi as coordinates on the space X.

Definition 5.3.5. For n ≥ 1, we define Φn : X → Cn to be the map sending a point p ∈ X
to (α1 (p), · · · , αn (p)) ∈ Cn . Setting n = ∞ gives us a map Φ : X → C∞ . We define Ψn and
Ψ similarly, using β in place of α.

Definition 5.3.6. For a metric measure space $(X, d_X, \mu_X)$, we define the distance kernel
transform $DKT(X)$ to be the image of the map $\Phi : X \to \mathbb{C}^\infty$, and the truncated distance
kernel transform $DKT_k(X)$ to be the image of the map $\Phi_k : X \to \mathbb{C}^k$.

   From now on, for any topological space X we will write ΣX to be its Borel σ-algebra, and
we will call (X, ΣX , µX ) strictly positive if the measure of any nonempty open set is strictly
positive. The following lemma demonstrates, under mild conditions on the measure, that
the DKT is injective.

Lemma 5.3.7. Let (X, dX , µX ) be a compact, strictly positive metric measure space. Then
the map Φ : X → C∞ is injective.

Proof. Suppose that there are p ≠ q ∈ X such that Φ(p) = Φ(q). This implies that Ψ(p) =
Ψ(q), so that βi (p) = βi (q) ∀i. Let dp and dq be the distance functions associated to p and
q respectively. We know that

\[
\langle d_p, \phi_i \rangle = \int_X d_X(p,x)\,\phi_i(x)\, d\mu_X(x) = (D_X \phi_i)(p) = \beta_i(p)
\]

Thus, using the L2 -convergence of our eigenfunction expansion,

\[
\Big\| d_p - \sum_{i=1}^{n} \beta_i(p)\phi_i \Big\|_{L^2} = \Big\| d_p - \sum_{i=1}^{n} \langle d_p, \phi_i \rangle \phi_i \Big\|_{L^2} \to 0
\]

Similarly,

\[
\Big\| d_q - \sum_{i=1}^{n} \beta_i(q)\phi_i \Big\|_{L^2} = \Big\| d_q - \sum_{i=1}^{n} \langle d_q, \phi_i \rangle \phi_i \Big\|_{L^2} \to 0
\]

   Since

\[
\sum_{i=1}^{n} \beta_i(p)\phi_i = \sum_{i=1}^{n} \beta_i(q)\phi_i
\]

   we may apply the triangle inequality and take limits to conclude that ‖dp − dq‖L2 = 0.
Let r = dX (p, q)/3 > 0, and let U be the open ball around p of radius r. The function
|dp − dq | is bounded below by r on U , and since U is not empty (it contains p), it has strictly
positive measure. This then implies ‖dp − dq‖L2 > 0, a contradiction. Thus Φ(p) ≠ Φ(q)
for p ≠ q.
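
   The step "apply the triangle inequality and take limits" in the proof above can be spelled out as follows. Here S_n(p) is shorthand introduced only for this note (it does not appear in the original) for the partial sum \sum_{i=1}^{n} \beta_i(p)\phi_i:

\[
\| d_p - d_q \|_{L^2} \le \| d_p - S_n(p) \|_{L^2} + \| S_n(p) - S_n(q) \|_{L^2} + \| S_n(q) - d_q \|_{L^2}
\]

The middle term vanishes for every n because βi (p) = βi (q) for all i, and the two outer terms tend to 0 as n → ∞, so the left-hand side, which does not depend on n, must be 0.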

Remark 5.3.8. Bates [Bat14] demonstrated that finitely many Laplacian eigenfunctions
suffice to (not necessarily isometrically) embed a Riemannian manifold in Euclidean space,
and that the maximal embedding dimension depends on the dimension, injectivity radius,
Ricci curvature, and volume of the manifold. By contrast, Lemma 5.3.7 provides an embedding
with infinitely many eigenfunctions, a much weaker result. However, it holds in greater
generality, applying to any compact, strictly positive metric measure space.

   As opposed to Laplacian eigenfunctions, it is not clear that finitely many distance kernel
eigenfunctions suffice for an injective embedding. However, one can, in certain settings, get a
coarse injectivity: if Φn (p) = Φn (q) for p, q ∈ X, then dX (p, q) ≤ ε(n), where limn→∞ ε(n) = 0.
Let us first show this in the setting of Riemannian manifolds, equipped with the volume
measure, for which we will need the following lower bound on the volume of balls in X.

Proposition 5.3.9 ([Cro80], Prop. 14). For every dimension n, there exists a constant
Cn such that if M is an n-dimensional Riemannian manifold with injectivity radius R, and
r ≤ R/2, then Vol B(x, r) ≥ Cn r^n for all x ∈ M .

Lemma 5.3.10. Let X be a compact k-dimensional Riemannian manifold with positive
injectivity radius R > 0, and let r ≤ R/2. Then there exists a threshold T (r, k) > 0 such that,
if x, y ∈ X are points with associated distance functions dx and dy , and if ‖dx − dy‖L2 ≤ T ,
then dX (x, y) ≤ 3r.

Proof. Suppose that dX (x, y) > 3r. Then the balls of radius r about x and y do not overlap,
and for all z ∈ B = B(x, r) ⊔ B(y, r), |dX (x, z) − dX (y, z)| ≥ r. By Proposition 5.3.9, each
of the two balls has volume at least Ck r^k , so we can deduce that

\[
\| d_x - d_y \|_{L^2}^2 \ge \int_B (d_x - d_y)^2 \, d\mu_X \ge 2 C_k r^{k+2}
\]

Thus

\[
\| d_x - d_y \|_{L^2} \ge \sqrt{2 C_k}\; r^{(k/2)+1}
\]

   Setting T (r, k) = √(2Ck ) r^{(k/2)+1} completes the proof.
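
   For completeness, the two estimates used in this proof can be derived as follows. This is a sketch, and V(r) is shorthand introduced only here for the lower bound on the volume of a ball of radius r supplied by Proposition 5.3.9. For z ∈ B(x, r), the triangle inequality and dX (x, y) > 3r give

\[
d_X(y,z) \ge d_X(x,y) - d_X(x,z) > 3r - r = 2r, \qquad \text{so} \qquad d_X(y,z) - d_X(x,z) > r,
\]

and the same argument with the roles of x and y exchanged handles z ∈ B(y, r). Integrating over the disjoint union then yields

\[
\int_B (d_x - d_y)^2 \, d\mu_X \ge r^2 \big( \operatorname{Vol} B(x,r) + \operatorname{Vol} B(y,r) \big) \ge 2\, r^2\, V(r).
\]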

Corollary 5.3.11. Let M be a complete k-dimensional Riemannian manifold with positive
injectivity radius R. For every r ≤ R/2 there is a natural number N = N (k, r) such that if
Φn (x) = Φn (y) for n ≥ N then dX (x, y) ≤ 3r.

   To generalize the above result to metric measure spaces, we need a lower bound on the
volume of balls, and the notion of an (a, b)-standard measure is appropriate for that.

Definition 5.3.12. Let a, b > 0 be positive real numbers. A metric measure space (X, dX , µX )
is called (a, b)-standard if there is a threshold r > 0 such that ∀s ≤ r and ∀x ∈ X,


\[
\operatorname{Vol} B(x, s) \ge a s^b
\]


   The constant a can be interpreted as bounding the curvature of X, whereas the constant
b is related to the dimension of X.

   The proof of the following Lemma is identical to that of Lemma 5.3.10.

Lemma 5.3.13. Let (X, dX , µX ) be a compact (a, b)-standard metric measure space with
threshold parameter r > 0. For every s ≤ r there exists a threshold T (s, a, b) such that, if
x, y ∈ X are points with associated distance functions dx and dy , and if ‖dx − dy‖L2 ≤ T ,
then dX (x, y) ≤ 3s.

Proof. Suppose that dX (x, y) > 3s. Then the balls of radius s about x and y do not overlap,
and for all z ∈ B = B(x, s) ⊔ B(y, s), |dX (x, z) − dX (y, z)| ≥ s. We can deduce that

\[
\| d_x - d_y \|_{L^2}^2 \ge \int_B (d_x - d_y)^2 \, d\mu_X \ge 2 (a s^b) s^2 = 2 a s^{b+2}
\]

Thus

\[
\| d_x - d_y \|_{L^2} \ge \sqrt{2a}\; s^{(b/2)+1}
\]

   Setting T (s, a, b) = √(2a) s^{(b/2)+1} completes the proof.

   And we obtain a corresponding corollary.

Corollary 5.3.14. Let (X, dX , µX ) be a compact (a, b)-standard metric measure space with
threshold parameter r. For every s ≤ r there is a natural number N = N (s, a, b) such that if
Φn (x) = Φn (y) for n ≥ N then dX (x, y) ≤ 3s.


5.4      Stability and Inverse Results

In this section, we explore the geometric content of the distance kernel embedding.

Observation 5.4.1. Lemma 5.3.7 states that Φ is an injection when (X, dX , µX ) is a strictly
positive metric measure space. That is, if Φ(X, dX , µX ) = Φ(Y, dY , µY ) for a pair of strictly
positive metric measure spaces, we know that there is a bijection from X to Y preserving all
the eigenfunctions. Thus, in the following lemma, we will fix an underlying space and allow
the metric and measure to vary.

Lemma 5.4.2. Fix a set X. Let µ1 and µ2 be strictly positive measures on X, with µ1 ab-
solutely continuous with respect to µ2 , and d1 and d2 metrics on X making X1 = (X, d1 , µ1 )
and X2 = (X, d2 , µ2 ) metric measure spaces. Let D1 and D2 be the resulting integral op-
erators. If Φ(X1 ) = Φ(X2 ), then d1 = d2 . If, furthermore, the Radon-Nikodym derivative
dµ1 /dµ2 is continuous, then µ1 = µ2 .

Proof. The distance functions d1 , d2 : X × X → R have the eigenfunction expansion:

\[
\sum_{i=1}^{\infty} \alpha_i^X(x_1)\, \alpha_i^X(x_2)
\]

   This converges in L2 (µ1 ⊗ µ1 ) to d1 and in L2 (µ2 ⊗ µ2 ) to d2 . Let us denote by Sn the
partial sums of this expansion:

\[
S_n = \sum_{i=1}^{n} \alpha_i^X(x_1)\, \alpha_i^X(x_2)
\]


   By standard measure theory arguments, we can extract a subsequence S_{n_k} that
converges to d1 pointwise on (X × X) \ N1 , where (µ1 ⊗ µ1 )(N1 ) = 0. We can then extract
a further subsequence S_{n_{k_j}} that converges pointwise to d2 on ((X × X) \ N1 ) \ N2 , where
(µ2 ⊗ µ2 )(N2 ) = 0. Since µ1 is absolutely continuous with respect to µ2 , if we set N = N1 ∪ N2
then (µ1 ⊗ µ1 )(N ) = 0. Since µ1 is strictly positive, the set N cannot contain any nonempty
open set, and hence its complement N^c is dense in X × X. We see then that d1 = d2 on a
dense subset of X × X, so that they are equal everywhere. From now on, we will write dX to
denote these two metrics.


   Next, let f = dµ1 /dµ2 be the Radon-Nikodym derivative of µ1 with respect to µ2 . We
claim that, since D1 and D2 have the same eigenfunctions and eigenvalues, D1 (g) = D2 (g)
for any square-integrable function g. To see this, let ci = ⟨g, φi⟩L2 (µ2 ) . We know that
Σ ci φi converges to g in L2 (µ2 ), i.e. if we let gn = Σ_{i=1}^{n} ci φi then

\[
\int_X |g_n(x) - g(x)|^2 \, d\mu_2(x) \to 0
\]

Since X is compact and f continuous, |f | ≤ M for some constant M . Then

\[
\int_X |g_n(x) - g(x)|^2 \, d\mu_1(x) = \int_X |g_n(x) - g(x)|^2 f(x) \, d\mu_2(x) \le M \int_X |g_n(x) - g(x)|^2 \, d\mu_2(x) \to 0
\]


     Thus the sequence gn converges to g in L2 (µ1 ) as well. This implies that Di (gn ) converges
uniformly to Di (g), for i = 1, 2, as by Cauchy-Schwarz we have, ∀x ∈ X,

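\[
\begin{aligned}
D_i(g_n - g)(x) &= \int_X d_X(x,y)\,(g_n - g)(y)\, d\mu_i(y) \\
&\le \| d_X(x,y) \|_{L^2(\mu_i)} \, \| g_n - g \|_{L^2(\mu_i)} \\
&\le \operatorname{diam}(X) \sqrt{\operatorname{Vol}_{\mu_i}(X)}\; \| g_n - g \|_{L^2(\mu_i)}
\end{aligned}
\]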


     By construction, D1 (gn ) = D2 (gn ) for all n. As these two, identical sequences converge
uniformly to both D1 (g) and D2 (g), we conclude D1 (g) = D2 (g).


     Finally, suppose that µ1 ≠ µ2 , so that the Radon-Nikodym derivative f is not identically
equal to 1. Since f is continuous, we can find a point x ∈ X, a constant C > 0, and a radius
ε > 0 such that (without loss of generality, swapping the roles of µ1 and µ2 if necessary)
µ2 (B(x, r))/µ1 (B(x, r)) ≥ C > 1 for all r ≤ ε. Let N be a large natural number for which

\[
|1 - C| > \frac{1 + C}{N}
\]

     Let z ∈ X be a point at distance δ = dX (x, z) from x. We will assume δ/N is less than
ε; otherwise we can take N even larger. Let U = B(x, δ/N ). By the triangle inequality,
∀y ∈ X,

\[
| d_X(y,z) - \delta | = | d_X(y,z) - d_X(x,z) | \le d_X(x,y)
\]

Thus |dX (y, z) − δ| ≤ δ/N for y ∈ U . Now, consider the indicator function 1U . By definition,

\[
D_1(1_U)(z) = \int_U d_X(y,z)\, d\mu_1(y)
\]

We see, then, that

\[
| D_1(1_U)(z) - \delta \mu_1(U) | \le (\delta/N)\, \mu_1(U)
\]

Similarly,

\[
| D_2(1_U)(z) - \delta \mu_2(U) | \le (\delta/N)\, \mu_2(U)
\]

As we have shown in the prior paragraph, D1 (1U ) = D2 (1U ). Thus D1 (1U )(z) = D2 (1U )(z),
and so an application of the triangle inequality, followed by division by δ, gives

\[
| \mu_1(U) - \mu_2(U) | \le \frac{1}{N} \big( \mu_1(U) + \mu_2(U) \big)
\]

   Dividing both sides by µ1 (U ), we obtain

\[
|1 - C| \le \frac{1 + C}{N}
\]

   which is impossible. Thus µ1 = µ2 .


   We have the following immediate corollary.

Corollary 5.4.3. The DKT is injective on the space of Riemannian manifolds.

   For finite metric measure spaces, Lemma 5.4.2 only requires a finite-dimensional embed-
ding.

Corollary 5.4.4. Let X = (X, dX , µX ) and Y = (Y, dY , µY ) be a pair of finite metric
measure spaces, with |X|, |Y | ≤ n. If Φn (X) = Φn (Y ), then X and Y are isomorphic as
metric measure spaces.

Proof. This follows from Lemma 5.4.2 by noting that all the "higher" eigenfunctions vanish,
and hence are equal.

   These injectivity results tell us that distinct spaces have distinct embeddings. How-
ever, we would like to assert something stronger: spaces with similar embeddings are also
geometrically similar.


Finite Metric Measure Spaces

Let us start by studying the discrete (zero-dimensional) case. We have a metric measure space
(X, dX , µX ) with |X| = n. Define the matrix Dij = dX (xi , xj )µ(xj ), so that if f : X → R is
a function, and v ∈ Rn is the vector vi = f (xi ), then (Df )(xj ) = (Dv)j . Define the matrix
Qij = δij µ(xi ), giving rise to the inner product Q(v, w) = v T Qw.

Claim 5.4.5. The operator D is self-adjoint with respect to Q.

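Proof. Let v, w ∈ Rn . Compute:

\[
\begin{aligned}
\langle Dv, w \rangle_Q &= \sum_i (Dv)_i \, w_i \, \mu(x_i) \\
&= \sum_i \Big( \sum_j v_j D_{ij} \Big) w_i \, \mu(x_i) \\
&= \sum_{i,j} d(x_i, x_j)\, v_j w_i \, \mu(x_j) \mu(x_i) \\
&= \sum_j \Big( \sum_i d(x_j, x_i)\, \mu(x_i)\, w_i \Big) v_j \, \mu(x_j) \\
&= \sum_j \Big( \sum_i D_{ji} w_i \Big) v_j \, \mu(x_j) \\
&= \sum_j (Dw)_j \, v_j \, \mu(x_j) \\
&= \langle v, Dw \rangle_Q
\end{aligned}
\]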
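
   As a numerical sanity check of Claim 5.4.5, the following sketch builds D and Q for a small three-point space and compares ⟨Dv, w⟩Q with ⟨v, Dw⟩Q. The distances and weights are arbitrary illustrative choices, not taken from the text, and the variable names are hypothetical. The sketch also illustrates that D itself need not be a symmetric matrix; what is symmetric is the product QD.

```python
import numpy as np

# Illustrative three-point space (assumed values, not from the text):
# points on a line at positions 0, 1, 3 with unequal weights.
dist = np.array([[0.0, 1.0, 3.0],
                 [1.0, 0.0, 2.0],
                 [3.0, 2.0, 0.0]])
mu = np.array([1.0, 2.0, 3.0])

D = dist * mu[None, :]   # D_ij = d(x_i, x_j) * mu(x_j)
Q = np.diag(mu)          # Q_ij = delta_ij * mu(x_i)

def q_inner(v, w):
    """Weighted inner product <v, w>_Q = v^T Q w."""
    return v @ Q @ w

rng = np.random.default_rng(0)
v, w = rng.standard_normal(3), rng.standard_normal(3)

lhs = q_inner(D @ v, w)   # <Dv, w>_Q
rhs = q_inner(v, D @ w)   # <v, Dw>_Q
print(lhs, rhs)
```

Self-adjointness with respect to Q is equivalent to QD being symmetric, which gives a second, vector-free way to check the claim.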


   Thus, by the spectral theorem, D has real eigenvalues and a basis of Q-orthonormal
real eigenvectors. We will generically assume that D has distinct eigenvalues λ1 , λ2 , · · · , λn ,
ordered by decreasing absolute value, and associated eigenvectors e1 , · · · , en . Let A be the
n × n matrix whose ith column is ei . Q-orthonormality of our eigenbasis means that

\[
A^T Q A = I
\]

   A little matrix algebra gives

\[
A A^T = Q^{-1}
\]

    Denoting the ith row of the matrix A by ri , this tells us that

\[
\| r_i \| = 1 / \sqrt{\mu_X(x_i)}. \tag{5.1}
\]

    We define functions αi (xj ) = √λi (ei )j , giving rise to the embedding Φ = (α1 , · · · , αn ). If
V is the diagonal matrix whose (ii)th entry is √λi , then Φ maps xi to the ith row of AV .


    We now show how to recover the geometry of X from its embedding. For vectors v, w ∈
Cn , define the following bilinear form

\[
[v, w] = \sum_{i=1}^{n} v_i w_i \in \mathbb{C}
\]

This form is symmetric but not an inner product. For xi ∈ X, let di : X → R be the “distance
to xi ” function, thought of as a vector in Rn ⊂ Cn . Observe that

\[
d_i = \sum_{l=1}^{n} \langle e_l, d_i \rangle\, e_l = \sum_{l=1}^{n} \lambda_l\, (e_l)_i\, e_l
\]

Hence

\[
d(x_i, x_j) = (d_i)_j = \sum_{l=1}^{n} \big( \sqrt{\lambda_l}\, e_l \big)_i \big( \sqrt{\lambda_l}\, e_l \big)_j = [\Phi(x_i), \Phi(x_j)]
\]

    If we truncate our embedding, we obtain an approximation of the distance function. The
following bound follows from the Cauchy-Schwarz inequality and Equation 5.1:

\[
\big| d(x_i, x_j) - [\Phi_k(x_i), \Phi_k(x_j)] \big| = \Big| \sum_{l=k+1}^{n} \lambda_l\, (e_l)_i\, (e_l)_j \Big| \le \lambda_{k+1} \| r_i \| \| r_j \| = \frac{\lambda_{k+1}}{\sqrt{\mu_X(x_i)\, \mu_X(x_j)}} \tag{5.2}
\]
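
   The reconstruction formula and the truncation bound can be checked numerically. The sketch below is one possible implementation, not the author's: it obtains Q-orthonormal eigenvectors of D by diagonalizing the symmetric matrix Q^{1/2} d Q^{1/2} (an implementation choice made here), uses a random planar point cloud with random weights as assumed test data, and takes the absolute value of the (k+1)st eigenvalue in the bound because eigenvalues may be negative. The function name `dkt_embedding` is hypothetical.

```python
import numpy as np

def dkt_embedding(dist, mu):
    """Eigenvalues of D (by decreasing |lambda|) and the embedding matrix Phi.

    Row i of Phi is Phi(x_i). Complex entries arise from negative eigenvalues.
    """
    s = np.sqrt(mu)
    # S = Q^{1/2} d Q^{1/2} is symmetric and has the same eigenvalues as D = d Q.
    S = s[:, None] * dist * s[None, :]
    lam, U = np.linalg.eigh(S)
    order = np.argsort(-np.abs(lam))
    lam, U = lam[order], U[:, order]
    A = U / s[:, None]                       # columns: Q-orthonormal eigenvectors of D
    Phi = A * np.sqrt(lam.astype(complex))   # scale column l by sqrt(lambda_l)
    return lam, Phi

rng = np.random.default_rng(0)
pts = rng.random((6, 2))                     # assumed sample: 6 points in the unit square
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
mu = rng.uniform(0.5, 2.0, size=6)           # assumed positive weights

lam, Phi = dkt_embedding(dist, mu)
full = Phi @ Phi.T                           # [Phi(x_i), Phi(x_j)], no conjugation

k = 3
approx = Phi[:, :k] @ Phi[:, :k].T           # truncated bilinear form
error = np.abs(dist - approx)
bound = np.abs(lam[k]) / np.sqrt(np.outer(mu, mu))   # lam[k] is the (k+1)st eigenvalue
print(error.max(), bound.max())
```

Note that the bilinear form is computed with a plain transpose rather than a conjugate transpose, matching the definition of [v, w].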

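Example 1. Let X consist of two points, with d(x1 , x2 ) = 1. Let µ(x1 ) = 1 and µ(x2 ) = 4.
Our matrix is then

\[
D = \begin{pmatrix} 0 & 4 \\ 1 & 0 \end{pmatrix}
\]

The eigenvalues of this matrix are ±2. Let

\[
e_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[4pt] \frac{1}{2\sqrt{2}} \end{pmatrix}
\qquad
e_2 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\[4pt] \frac{-1}{2\sqrt{2}} \end{pmatrix}
\]

   These are eigenvectors with eigenvalue +2 and −2 respectively. If we define the inner
product matrix

\[
Q = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}
\]

Observe that

\[
\begin{aligned}
e_1^T Q e_1 = e_2^T Q e_2 &= 1 \cdot \frac{1}{2} + 4 \cdot \frac{1}{8} = 1 \\
e_1^T Q e_2 = e_2^T Q e_1 &= 1 \cdot \frac{1}{2} - 4 \cdot \frac{1}{8} = 0
\end{aligned}
\]

   so that e1 and e2 are Q-orthonormal. We have then that

\[
\begin{aligned}
\Phi(x_1) &= \Big\langle \frac{\sqrt{2}}{\sqrt{2}}, \frac{\sqrt{-2}}{\sqrt{2}} \Big\rangle = \langle 1, i \rangle \\
\Phi(x_2) &= \Big\langle \frac{\sqrt{2}}{2\sqrt{2}}, \frac{-\sqrt{-2}}{2\sqrt{2}} \Big\rangle = \Big\langle \frac{1}{2}, \frac{-i}{2} \Big\rangle
\end{aligned}
\]

   and

\[
\begin{aligned}
d_X(x_1, x_2) &= [\Phi(x_1), \Phi(x_2)] = \frac{1}{2} + \frac{1}{2} = 1 \\
d_X(x_1, x_1) &= [\Phi(x_1), \Phi(x_1)] = 1 - 1 = 0 \\
d_X(x_2, x_2) &= [\Phi(x_2), \Phi(x_2)] = \frac{1}{4} - \frac{1}{4} = 0
\end{aligned}
\]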
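
   The arithmetic of Example 1 can be replayed in a few lines. The sketch below hard-codes the matrix, eigenvectors, and eigenvalues stated in the example and checks the eigenvector equations, Q-orthonormality, and the recovered distances; the variable names are illustrative only.

```python
import numpy as np

# Data of Example 1: d(x1, x2) = 1, mu(x1) = 1, mu(x2) = 4.
D = np.array([[0.0, 4.0],
              [1.0, 0.0]])
Q = np.diag([1.0, 4.0])

# Eigenvectors and eigenvalues as stated in the example.
e1 = np.array([1 / np.sqrt(2), 1 / (2 * np.sqrt(2))])
e2 = np.array([1 / np.sqrt(2), -1 / (2 * np.sqrt(2))])
lam = np.array([2.0, -2.0])

A = np.column_stack([e1, e2])            # i-th column is e_i
Phi = A * np.sqrt(lam.astype(complex))   # row i is Phi(x_i); sqrt(-2) = i*sqrt(2)

gram = Phi @ Phi.T                       # entries are [Phi(x_i), Phi(x_j)]
print(Phi)
print(gram.real)
```

Because the second eigenvalue is negative, the second coordinate of each embedded point is purely imaginary, which is why the form [·, ·] rather than a Hermitian inner product recovers the distances.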

   To relate distance in the embedding space to distances between metric spaces, we need
two technical lemmata.

Lemma 5.4.6. The bilinear form [, ] enjoys the following Cauchy-Schwarz inequality:


                                             |[v, w]| ≤ |v||w|


Proof. Let v = (c1 , · · · , ck ) ∈ Ck and w = (d1 , · · · , dk ) ∈ Ck . By definition,

\[
[v, w] = \sum_{i=1}^{k} c_i d_i
\]

Using the triangle inequality for complex numbers,

\[
|[v, w]| = \Big| \sum_{i=1}^{k} c_i d_i \Big| \le \sum_{i=1}^{k} |c_i| |d_i| = \langle \tilde{v}, \tilde{w} \rangle
\]

where ṽ, w̃ ∈ Rk are obtained from v and w by taking component-wise absolute values. Note
that

\[
\begin{aligned}
|v|^2 &= \sum_{i=1}^{k} c_i \bar{c}_i = \sum_{i=1}^{k} |c_i|^2 = |\tilde{v}|^2 \\
|w|^2 &= \sum_{i=1}^{k} d_i \bar{d}_i = \sum_{i=1}^{k} |d_i|^2 = |\tilde{w}|^2
\end{aligned}
\]

   Thus v and ṽ have the same magnitude, as do w and w̃. To complete the proof, we apply
the ordinary Cauchy-Schwarz inequality to ṽ and w̃,

\[
\langle \tilde{v}, \tilde{w} \rangle \le |\tilde{v}| |\tilde{w}| = |v| |w|
\]



Lemma 5.4.7. Let v1 , v2 , w1 , w2 ∈ Ck be vectors such that ‖v1 − w1‖L2 , ‖v2 − w2‖L2 ≤ ε.
Then

\[
|[v_1, v_2] - [w_1, w_2]| \le \epsilon \min(|v_1| + |v_2|, |w_1| + |w_2|) + \epsilon^2
\]

Proof. Without loss of generality, the minimum min(|v1 | + |v2 |, |w1 | + |w2 |) = |v1 | + |v2 |. By
bilinearity,


            [w1 , w2 ] = [v1 , v2 ] + [v1 , (w2 − v2 )] + [(w1 − v1 ), v2 ] + [(w1 − v1 ), (w2 − v2 )]


Thus,


         |[v1 , v2 ] − [w1 , w2 ]| ≤ |[v1 , (w2 − v2 )]| + |[(w1 − v1 ), v2 ]| + |[(w1 − v1 ), (w2 − v2 )]|


The proof then follows by applying Cauchy-Schwarz.
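
   Spelled out, the last step applies Lemma 5.4.6 to each of the three terms. Writing ε for the common bound on ‖v1 − w1‖ and ‖v2 − w2‖ assumed in the statement of the lemma, one gets

\[
|[v_1, w_2 - v_2]| \le |v_1|\, \epsilon, \qquad |[w_1 - v_1, v_2]| \le \epsilon\, |v_2|, \qquad |[w_1 - v_1, w_2 - v_2]| \le \epsilon^2,
\]

so the right-hand side of the previous display is at most ε(|v1 | + |v2 |) + ε², which is the claimed bound under the normalization chosen at the start of the proof.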

   To state our next theorem, we need one final definition.

Definition 5.4.8. For (X, dX , µX ) a finite metric measure space, and θ ≥ 0 a real parameter,
define
                                         Xθ = {x ∈ X | µX (x) ≥ θ}

The set Xθ inherits a metric and measure from X.

   We now have:

Theorem 5.4.9. Let (X, dX , µX ) and (Y, dY , µY ) be finite metric measure spaces, with
eigenvalues {λi } and {νi }. Let k ≤ |X|, |Y |, and suppose that d_H^{L^2}(Φk (X), Φk (Y )) ≤ ε. Then for
any θ ≥ 0,

\[
d_{GH}(X_\theta, Y_\theta) \le 2\epsilon \max\big( \sqrt{|\lambda_1|}, \sqrt{|\nu_1|} \big) + \epsilon^2 + \frac{\lambda_{k+1} + \nu_{k+1}}{\theta}
\]

Proof. Suppose that the Hausdorff distance between Φk (X) and Φk (Y ) is realized by a
pairing R ⊂ |X| × |Y |. This induces a pairing on X × Y . Suppose that x is paired with y
and x′ is paired with y′ . Write v1 = Φk (x), v2 = Φk (x′ ), w1 = Φk (y), and w2 = Φk (y′ ). Then
Equation 5.2 tells us that |d(x, x′ ) − [v1 , v2 ]| ≤ λk+1 /θ and |d(y, y′ ) − [w1 , w2 ]| ≤ νk+1 /θ. The
norms of v1 and v2 are at most √|λ1 |, and similarly the norms of w1 and w2 are at most
√|ν1 |. The proof then follows from the triangle inequality and the prior lemma.

Observation 5.4.10. When k ≥ |X|, λk+1 = 0, and similarly for Y .

Observation 5.4.11. Note that in the case of uniform unit atomic measures, i.e. µX (xi ) =
1 = µY (yj ) ∀i, j, X1 = X and Y1 = Y . More generally, Xθ∗ = X for θ∗ = minx∈X µX (x).

Remark 5.4.12. The prior theorem is stated in terms of Gromov-Hausdorff distances, and
does not take into account the measure. This is because scaling the measure does not
affect the embedding: the eigenvalues change, but so does the inner product Q used to
enforce orthonormality. These two forces cancel out perfectly, and the embedding vectors
are preserved. Thus, we can say that our embedding uses the measure to prioritize certain
distance functions on our space, but it does not record the measure itself.
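
   The cancellation described in this remark can be observed numerically. The following sketch rescales the measure of a small three-point space by a constant c and compares the resulting eigenvalues and embeddings. The space, the weights, the value of c, and the helper name `dkt_embedding` are all illustrative assumptions. Since eigenvectors are only determined up to sign, the comparison is made on entrywise absolute values.

```python
import numpy as np

def dkt_embedding(dist, mu):
    """Eigenvalues of D (by decreasing |lambda|) and the embedding matrix Phi."""
    s = np.sqrt(mu)
    S = s[:, None] * dist * s[None, :]       # symmetric, same spectrum as D = d Q
    lam, U = np.linalg.eigh(S)
    order = np.argsort(-np.abs(lam))
    lam, U = lam[order], U[:, order]
    A = U / s[:, None]                       # Q-orthonormal eigenvectors of D
    return lam, A * np.sqrt(lam.astype(complex))

# Assumed example: three points on a line with unequal weights.
dist = np.array([[0.0, 1.0, 3.0],
                 [1.0, 0.0, 2.0],
                 [3.0, 2.0, 0.0]])
mu = np.array([1.0, 2.0, 3.0])
c = 5.0                                      # arbitrary positive scaling factor

lam, Phi = dkt_embedding(dist, mu)
lam_c, Phi_c = dkt_embedding(dist, c * mu)

eigenvalues_scale = np.allclose(lam_c, c * lam)
embedding_preserved = np.allclose(np.abs(Phi_c), np.abs(Phi))
print(eigenvalues_scale, embedding_preserved)
```

The eigenvalues pick up the factor c while each Q-orthonormal eigenvector shrinks by √c, so the products √λi (ei )j are unchanged.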

     The quality of the estimate in Theorem 5.4.9 depends on the decay of the eigenvalues of
the operators DX and DY . Though this seems difficult to estimate in general, we do have
the following, simple bound on the top eigenvalue.

Lemma 5.4.13. For any compact metric measure space (X, dX , µX ), λ1 ≤ diam(X) vol(X),
and this bound is asymptotically sharp.

Proof. Let f ∈ L2 (X) be a function whose absolute value attains a (not necessarily unique)
maximum at p. Observe that

\[
|(D_X f)(p)| = \Big| \int_X d_X(p,x)\, f(x)\, dx \Big| \le \int_X d_X(p,x)\, |f(x)|\, dx \le \operatorname{diam}(X) \operatorname{vol}(X)\, |f(p)|
\]


Taking f = φ1 completes the proof.


   To demonstrate asymptotic sharpness, let X consist of n points, each at distance d > 0 from
all other points, and each with measure 1/n, so that X has unit measure. If f is any nonzero
constant function, then DX (f )/f = d · (n − 1)/n, which converges to d as n → ∞.
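
   The sharpness example is easy to reproduce. The sketch below builds the operator for n equidistant points with uniform measure 1/n and computes its largest eigenvalue, which can then be compared with the bound diam(X) vol(X) = d. The function name and the sampled values of n are illustrative choices.

```python
import numpy as np

def top_eigenvalue_equidistant(n, d=1.0):
    """Largest eigenvalue of D for n points at mutual distance d, each of measure 1/n."""
    dist = d * (np.ones((n, n)) - np.eye(n))
    mu = np.full(n, 1.0 / n)
    D = dist * mu[None, :]            # D_ij = d(x_i, x_j) * mu(x_j)
    # With a uniform measure D is a symmetric matrix, so eigvalsh applies.
    return np.linalg.eigvalsh(D)[-1]  # eigvalsh returns eigenvalues in ascending order

d = 1.0
tops = {n: top_eigenvalue_equidistant(n, d) for n in (2, 10, 100)}
for n, top in tops.items():
    # Compare with the closed form d * (n - 1) / n and the bound diam * vol = d.
    print(n, top, d * (n - 1) / n)
```

The constant vector is the top eigenvector here, so the computed value agrees with d · (n − 1)/n and approaches the bound d from below as n grows.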


General Metric Measure Spaces

To obtain an analogue of Theorem 5.4.9 in the general setting, we make use of Corollary
5.6.2.

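Theorem 5.4.14. Let (X, dX , µX ) and (Y, dY , µY ) be doubling metric measure spaces, with
eigenvalues {λi } and {νi }. Fix k ∈ N and δ > 0, and suppose that d_H^{L^2}(Φk (X), Φk (Y )) ≤ ε.
Then there exists an Nk such that

\[
d_{GH}(X, Y) \le 2(\epsilon + 2\delta) \max\big( \sqrt{|\lambda_1 + \delta|}, \sqrt{|\nu_1 + \delta|} \big) + (\epsilon + 2\delta)^2 + N_k (\lambda_{k+1} + \nu_{k+1} + 2\delta) + 2\delta
\]

Proof. By Corollary 5.6.2, we know that there exists N = Nk large enough such that we can
find discrete samples $\hat{X}_N \subset X$ and $\hat{Y}_N \subset Y$ with

\[
\begin{aligned}
d_H(X, \hat{X}_N) &\le \delta \\
d_H(Y, \hat{Y}_N) &\le \delta \\
d_H(\Phi_k(X), \hat{\Phi}(\hat{X}_N)) &\le \delta \\
d_H(\Phi_k(Y), \hat{\Phi}(\hat{Y}_N)) &\le \delta \\
|\lambda_i - \hat{\lambda}_i| &\le \delta \quad \forall i = 1, \cdots, k \\
|\nu_i - \hat{\nu}_i| &\le \delta \quad \forall i = 1, \cdots, k
\end{aligned}
\]

   An application of the triangle inequality demonstrates that

\[
d_H(\hat{\Phi}(\hat{X}_N), \hat{\Phi}(\hat{Y}_N)) \le \epsilon + 2\delta
\]

An application of Theorem 5.4.9, together with the observation that the smallest measure
of any point in the empirical distributions $\hat{X}_N$ and $\hat{Y}_N$ is (1/N ), gives us that

\[
d_{GH}(\hat{X}_N, \hat{Y}_N) \le 2(\epsilon + 2\delta) \max\big( \sqrt{|\hat{\lambda}_1|}, \sqrt{|\hat{\nu}_1|} \big) + (\epsilon + 2\delta)^2 + N (\hat{\lambda}_{k+1} + \hat{\nu}_{k+1})
\]

   Noting that $\hat{\lambda}_i \le \lambda_i + \delta$ and $\hat{\nu}_i \le \nu_i + \delta$ for i = 1, · · · , k, we have

\[
d_{GH}(\hat{X}_N, \hat{Y}_N) \le 2(\epsilon + 2\delta) \max\big( \sqrt{|\lambda_1 + \delta|}, \sqrt{|\nu_1 + \delta|} \big) + (\epsilon + 2\delta)^2 + N (\lambda_{k+1} + \nu_{k+1} + 2\delta)
\]

Lastly, a final application of the triangle inequality tells us that

\[
d_{GH}(X, Y) \le d_{GH}(X, \hat{X}_N) + d_{GH}(\hat{X}_N, \hat{Y}_N) + d_{GH}(\hat{Y}_N, Y) \le d_{GH}(\hat{X}_N, \hat{Y}_N) + 2\delta
\]

   From which we conclude that

\[
d_{GH}(X, Y) \le 2(\epsilon + 2\delta) \max\big( \sqrt{|\lambda_1 + \delta|}, \sqrt{|\nu_1 + \delta|} \big) + (\epsilon + 2\delta)^2 + N (\lambda_{k+1} + \nu_{k+1} + 2\delta) + 2\delta
\]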


5.5          Topological Kernel Transforms

In this section, we introduce a large family of topological transforms. In order to do so, we
need to know that our eigenfunctions actually have persistence diagrams, which is not the
case for any function on a topological space (e.g. if the associated persistence module is not
pfd).

Proposition 5.5.1. Let (X, dX , µX ) be a compact metric measure space homeomorphic to the
geometric realization of a finite simplicial complex. Then for any finite linear combination
f = Σ_{i=1}^{n} ci φi of eigenfunctions of DX with nonzero eigenvalue, the persistence diagram
PH(X, f ) exists.

Proof. Lemma 5.3.4 states that the eigenfunctions of DX are smooth, and hence continuous.
Since f is a finite linear combination of continuous functions, it, too, is continuous. Lastly, we
appeal to Theorem 2.22 in [CDSGO16], which asserts the existence of persistence diagrams
of continuous functions on geometric realizations of finite simplicial complexes.

   In addition to persistent homology, we would also like to compute the Betti and Euler
curves of the pair (X, f ). Justifying the existence of these invariants requires introducing
some more ideas from the theory of persistent homology.

Definition 5.5.2. For a point x in a persistence diagram, define per(x) to be the distance
from x to the diagonal. For a real-valued function f : X → R on a triangulable, compact
metric space X, exponent q > 0, and threshold parameter t ≥ 0, we define:


\[
\mathrm{Pers}_q(f, t) = \sum_{\mathrm{per}(x) > t} \mathrm{per}(x)^q
\]


This is the sum of the qth powers of the persistence of points in P H(X, f ) with persistence
greater than t. When t = 0, this quantity is called the degree-q total persistence of f .
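
As a rough illustration of Definition 5.5.2, the following sketch computes the thresholded total persistence of a diagram given as a list of (birth, death) pairs. The function names and the example diagram are hypothetical, and the sketch assumes the distance to the diagonal is taken in the sup norm, so that per(x) is half the length of the corresponding interval.

```python
from typing import Iterable, Tuple


def persistence(point: Tuple[float, float]) -> float:
    """Persistence of a diagram point (birth, death).

    Assumption: distance to the diagonal is measured in the sup norm,
    which gives per = (death - birth) / 2.
    """
    birth, death = point
    return abs(death - birth) / 2.0


def total_persistence(diagram: Iterable[Tuple[float, float]], q: float, t: float = 0.0) -> float:
    """Sum of per(x)**q over diagram points with per(x) > t."""
    points = list(diagram)
    return sum(persistence(x) ** q for x in points if persistence(x) > t)


# Hypothetical diagram with persistences 1, 0.5 and 2.
example_diagram = [(0.0, 2.0), (1.0, 2.0), (0.0, 4.0)]
pers_all = total_persistence(example_diagram, q=2)
pers_thresholded = total_persistence(example_diagram, q=2, t=0.5)
```

With t = 0.5 the point of persistence exactly 0.5 is excluded, since the sum runs over per(x) > t.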

   Bounding the degree-q total persistence of P H(X, f ) necessitates placing some
restrictions on X and f .

Definition 5.5.3. A triangulable metric space X has polynomial combinatorial complexity
if there are constants C0 and M such that, for any radius parameter r > 0, there exists a
triangulation T of X where every simplex has diameter at most r, and T has at most $C_0 / r^M$
simplices.

   In [CSEHM10], the authors note that the bilipschitz image of an M -dimensional
Euclidean simplicial complex will always be of polynomial combinatorial complexity. They also
prove that Lipschitz functions on such spaces have finite degree-q total persistence, for q
sufficiently large.

Theorem 5.5.4 ([CSEHM10], §2.3). Let f : X → R be a Lipschitz function on a triangulable
metric space X of (C0 , M )-polynomial combinatorial complexity. Let q = M + δ for some
constant δ > 0. Then we have the following bound on total degree-q persistence:

\[
\mathrm{Pers}_q(f, 0) \le C_0 \, \mathrm{Lip}(f)^M \, \mathrm{Amp}(f)^\delta \cdot \left( 1 + \frac{M + \delta}{\delta} \right)
\]

where Lip(f ) is the Lipschitz constant of f and Amp(f ) = max f − min f .
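
To get a feel for how the right-hand side of Theorem 5.5.4 behaves, the sketch below simply evaluates it. The function name and the sample inputs are hypothetical; nothing here computes persistence, it only evaluates the stated bound.

```python
def total_persistence_bound(c0: float, lip: float, amp: float, m: float, delta: float) -> float:
    """Evaluate C0 * Lip^M * Amp^delta * (1 + (M + delta) / delta)."""
    if delta <= 0:
        raise ValueError("delta must be strictly positive")
    return c0 * lip ** m * amp ** delta * (1.0 + (m + delta) / delta)


# Hypothetical inputs: the bound grows without limit as delta shrinks to 0.
bound_delta_one = total_persistence_bound(c0=1.0, lip=1.0, amp=1.0, m=2, delta=1.0)
bound_delta_small = total_persistence_bound(c0=1.0, lip=1.0, amp=1.0, m=2, delta=0.01)
```

The factor (M + δ)/δ is what forces q to be strictly larger than M in this estimate.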

   This motivates the following definition:

Definition 5.5.5 ([CSEHM10],§2.3). A metric space X implies bounded degree-q total per-
sistence if there is a constant CX such that Persq (f, 0) ≤ CX for any real-valued Lipschitz
function f with Lip(f ) ≤ 1.


   Note that if X implies bounded degree-q total persistence, then $\mathrm{Pers}_q(f, 0) \le \mathrm{Lip}(f)^q C_X$ for
any Lipschitz function f , as scaling a function only changes the persistence by scaling the
endpoints of the intervals in its barcode.
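
The scaling step can be written out explicitly. The following is a short sketch, assuming $L = \mathrm{Lip}(f) > 0$ and writing $g = f/L$ (a name introduced here only for illustration), so that $\mathrm{Lip}(g) = 1$ and each point of P H(X, f ) is the image of a point of P H(X, g) under scaling by $L$:

```latex
\[
\mathrm{Pers}_q(f, 0)
= \sum_{y \in PH(X, g)} \big( L \, \mathrm{per}(y) \big)^q
= L^q \, \mathrm{Pers}_q(g, 0)
\le \mathrm{Lip}(f)^q \, C_X .
\]
```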

   Before returning to the study of Betti and Euler curves, we need one more technical
result.

Theorem 5.5.6 ([CSEHM10], Wasserstein Stability, §3). Let X be a triangulable, compact
metric space implying bounded degree-q total persistence, and let f, g : X → R be two tame,
Lipschitz functions. Then we have the following bound on the Wasserstein-p distance between
their persistence diagrams:

\[
W_p(f, g) \le C^{1/p} \, \|f - g\|_\infty^{1 - q/p}
\]

for all p ≥ q, where $C = C_X \max\{\mathrm{Lip}(f)^q, \mathrm{Lip}(g)^q\}$.

   We now prove existence of Betti and Euler curves for eigenfunctions with nonzero eigen-
value.

Proposition 5.5.7. Let (X, dX , µX ) be a compact metric measure space. For any homological
degree k ≥ 0, and any finite linear combination $f = \sum_{i=1}^{n} c_i \phi_i$ of eigenfunctions of DX
with nonzero eigenvalue, we (tentatively) define the degree-k Betti curve to be the sum of the
indicator functions of the intervals in the degree-k persistent homology of (X, f ):

\[
\beta_k(X, f) = \sum_{I \in PH_k(X, f)} \mathbf{1}_I
\]


The Euler curve is then (tentatively) defined to be the alternating sum of these Betti curves:

\[
\chi(X, f) = \sum_{k=0}^{\infty} (-1)^k \beta_k(X, f)
\]


Suppose now that X is homeomorphic to the geometric realization of a finite simplicial
complex, and implies bounded degree-q total persistence. Let p = 1/q. Then for any homological
degree k, the sum defining βk (X, f ) converges in Lp . Under the same hypothesis, the sum
defining χ(X, f ) is finite, so that the Euler curve is likewise shown to exist as a function in
Lp .
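
For finite barcodes, the tentative definitions above can be evaluated directly. The sketch below is only an illustration under stated assumptions: barcodes are finite lists of (birth, death) pairs, intervals are treated as half-open [birth, death), and the names `betti_curve`, `euler_curve`, and the example barcodes are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]


def betti_curve(intervals: Sequence[Interval], ts: Sequence[float]) -> List[int]:
    """Sum of indicator functions of the intervals, evaluated at each t in ts.

    Assumption: each interval is half-open, [birth, death).
    """
    return [sum(1 for (b, d) in intervals if b <= t < d) for t in ts]


def euler_curve(barcodes: Dict[int, Sequence[Interval]], ts: Sequence[float]) -> List[int]:
    """Alternating sum of Betti curves over the homological degrees present."""
    curve = [0] * len(ts)
    for degree, intervals in barcodes.items():
        sign = -1 if degree % 2 else 1
        for j, value in enumerate(betti_curve(intervals, ts)):
            curve[j] += sign * value
    return curve


# Hypothetical barcodes in degrees 0 and 1.
barcodes = {0: [(0.0, 3.0), (1.0, 2.0)], 1: [(1.5, 2.5)]}
grid = [0.5, 1.25, 1.75, 2.25, 2.75]
beta0 = betti_curve(barcodes[0], grid)
chi = euler_curve(barcodes, grid)
```

The convergence question addressed by the proposition only arises when a barcode has infinitely many intervals, which this finite sketch does not model.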

Proof. This proof makes use of Lemma 5.6.6, that eigenfunctions of DX with nonzero eigen-
value are Lipschitz. We would like to show that the potentially infinite sum of indicator
functions arising in the definition of βk (X, f ) converges in Lp for 0 < p < ∞. The Weier-
strass M-test guarantees convergence if

\[
\sum_{I \in PH_k(X, f)} \|\mathbf{1}_I\|_{L^p} < \infty
\]


       Note that the length of an interval I in a barcode is twice the persistence of the corre-
sponding point in the associated persistence diagram. Thus we have:


\[
\|\mathbf{1}_I\|_{L^p} = \mathrm{len}(I)^{1/p} = (2 \, \mathrm{per}(I))^{1/p}
\]


       We can thus rewrite the above sum

\[
\sum_{I \in PH_k(X, f)} \|\mathbf{1}_I\|_{L^p} = 2^{1/p} \sum_{x \in PH_k(X, f)} \mathrm{per}(x)^{1/p}
\]

   By choice of p, Lipschitzness of f , and the fact that X implies bounded degree-q total
persistence, we can appeal to Theorem 5.5.4 to assert


\[
\sum_{x \in PH_k(X, f)} \mathrm{per}(x)^{1/p} \le \mathrm{Pers}_q(f, 0) < \infty
\]


   Thus, we have shown that βk (X, f ) exists and is well-defined.


   To demonstrate that the sum defining χ(X, f ) is finite, we merely note that, as X is
homeomorphic to the geometric realization of a finite simplicial complex, there is a finite
degree K such that no subset of X has nontrivial homology in degree k ≥ K. Thus we can
write:
\[
\chi(X, f) = \sum_{k=0}^{K} (-1)^k \beta_k(X, f)
\]


   We now define our transforms, implicitly assuming that all our eigenspaces have
multiplicity one, that our scheme for identifying “positive” eigenfunctions does not fail, and
that our metric measure spaces imply bounded degree-q total persistence and are homeomorphic
to the geometric realization of a finite simplicial complex.

Definition 5.5.8. Let (X, dX , µX ) be a compact metric measure space, and $D : L^2(X) \to L^2(X)$
a compact, self-adjoint operator with spectrum (φi , λi ). Suppose further that the
eigenfunctions φi are continuous, and define the coordinate maps $\alpha_i(x) = \sqrt{\lambda_i} \, \phi_i(x)$, giving
rise to the infinite-dimensional coordinatization Φ = (α1 , α2 , · · · ) and its finite-dimensional
truncations Φn . Although the maps αi are complex-valued, we will now think of them as
being valued in $\mathbb{R}^2$. Thus, the target of the Φn is $\mathbb{R}^{2n}$.


   Let $c_{00}$ be the space of real sequences with finitely many nonzero components. Equip $c_{00}$
with the $\ell^2$ metric, and let $S^\infty_{00} \subset C^\infty$ be the unit ball, which is not compact.


   For n finite, the embedded persistence kernel transform $E\text{-}PK_DT_n(X)$ is the PHT
applied to $DKT_n(X)$; it is defined on $S^{2n-1}$ and takes values in Barcodes.


   The intrinsic persistence kernel transform $I\text{-}PK_DT(X)$ is the map from $S^\infty_{00}$ to the space
of persistence diagrams that takes $v = (v_1^R, v_1^I, v_2^R, v_2^I, \cdots) \in S^\infty_{00}$ to the persistence diagram
of the pair

\[
\left( X, \; \sum_{i=1}^{\infty} \sum_{\ell \in \{R, I\}} v_i^\ell \sqrt{\lambda_i} \, \phi_i^\ell \right)
\]

where $\phi_i^R$ and $\phi_i^I$ are the real and imaginary parts of $\phi_i$, respectively. For a natural number
n, define $I\text{-}PK_DT_n$ to be the corresponding restriction to $S^{2n-1}$.
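
On a finite sample, the filter function attached to a direction v can be assembled as follows. This is a minimal sketch, not the thesis's implementation: it assumes the coordinates $\alpha_i$ are already available as a complex array, and it reads each summand as the real or imaginary part of $\alpha_i$ (which agrees with the displayed formula when $\lambda_i > 0$). The name `intrinsic_filter` and the example values are hypothetical.

```python
import numpy as np


def intrinsic_filter(alpha: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Real-valued filter function for a direction v.

    alpha: complex array of shape (m, n), alpha[j, i] = alpha_i(x_j).
    v: real array of shape (2n,), ordered (v1R, v1I, v2R, v2I, ...).
    Assumption: each summand pairs v with the real or imaginary part of alpha_i.
    """
    m, n = alpha.shape
    v = np.asarray(v, dtype=float)
    if v.shape != (2 * n,):
        raise ValueError("v must have length 2n")
    v_real, v_imag = v[0::2], v[1::2]
    return alpha.real @ v_real + alpha.imag @ v_imag


# Hypothetical coordinates at two sample points, with n = 2.
alpha_example = np.array([[1.0 + 2.0j, 0.5j], [-1.0 + 0.0j, 2.0 - 1.0j]])
direction = np.array([1.0, 0.0, 0.0, 1.0])  # picks Re(alpha_1) + Im(alpha_2)
filter_values = intrinsic_filter(alpha_example, direction)
```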


Definition 5.5.9. Using Euler curves in place of persistent homology gives rise to the
embedded and intrinsic Euler kernel transforms, E − EKD T and I − EKD T .

Remark 5.5.10. In general, these Euler curves may be well-defined even when the persistence
diagrams are not. See Section 3 of [CGR12] for an introduction to the theory of O-minimal
structures and the various settings in which Euler characteristics are well-defined. This will
be the case, for example, when our manifold X is real analytic and its distance function
is a.e. analytic, which will imply that the eigenfunctions are too (by the same convolution
argument as above).

Remark 5.5.11. Note that if Φn is injective then the embedded and intrinsic kernel trans-
forms are equivalent. Otherwise, they will contain different information, as Φn (X) will
be homeomorphic to the quotient of X obtained by identifying points $x, x' \in X$ when
$\phi_i(x) = \phi_i(x')$ for all $i = 0, \dots, n - 1$.


Stability and Inverse Results

Although the E − P KD T and I − P KD T , and the Euler curve variants, seem similar, they
each enjoy distinct and complementary properties. Throughout this section, we will restrict
ourselves to those metric measure spaces for which our topological transforms are defined.

Proposition 5.5.12. Suppose that X is a compact metric measure space. The I − P KD Tn
is continuous on S2n−1 .

Proof. Observe that our finitely many coordinate functions α1 , · · · , αn are continuous func-
tions on a compact space, and hence bounded by some constant M . Thus if v = (v1R , v1I , v2R , v2I , · · · , vnR , vnI )
and w = (w1R , w1I , w2R , w2I , · · · , wnR , wnI ) are two direction vectors, and we define

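\[
f = \sum_{i=1}^{n} \sum_{\ell \in \{R, I\}} v_i^\ell \sqrt{\lambda_i} \, \phi_i^\ell
\]

\[
g = \sum_{i=1}^{n} \sum_{\ell \in \{R, I\}} w_i^\ell \sqrt{\lambda_i} \, \phi_i^\ell
\]

Then:

\[
\|f - g\|_\infty \le M \, \|v - w\|_{L^1}
\]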

Thus the $L^\infty$ distance between f and g varies continuously with the $L^1$ distance between
their corresponding vectors. Continuity of the $I\text{-}PK_DT_n$ then follows from Theorem 2.2.18,
which asserts that the barcode distance between P H(X, f ) and P H(X, g) varies continuously
with the $L^\infty$ distance between f and g.
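
The key inequality in this proof can be checked numerically on a finite sample. The sketch below uses randomly generated stand-ins for the 2n real coordinate values at each sample point (not eigenfunctions of any actual operator), and the sup norm is taken over the sample only; all names are hypothetical.

```python
import random


def sup_norm_gap(coords, v, w):
    """max over sample points of |f - g|, where f and g are the linear
    combinations of the coordinate values with weights v and w."""
    return max(
        abs(sum((vi - wi) * c for vi, wi, c in zip(v, w, row)))
        for row in coords
    )


rng = random.Random(0)
num_points, dim = 50, 6  # dim plays the role of 2n

# Stand-in coordinate values; row j lists the 2n real coordinates at point j.
coords = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_points)]
M = max(abs(c) for row in coords for c in row)  # uniform bound on the coordinates

v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

lhs = sup_norm_gap(coords, v, w)
rhs = M * sum(abs(vi - wi) for vi, wi in zip(v, w))
```

Here `lhs` never exceeds `rhs`, which is the Hölder-type estimate used above.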


Theorem 5.5.13. Let X and Y be compact, strictly positive metric measure spaces. There
exists a function gX,Y : N → R+ with the following property: if E − P KD Tn (X) = E −
P KD Tn (Y ) or E − EKD Tn (X) = E − EKD Tn (Y ) then


\[
d_{GH}(X, Y) \le g_{X,Y}(n)
\]


Proof. This follows from Theorem 5.4.14 and the injectivity of the PHT and ECT.

Observation 5.5.14. Had we used the eigenfunctions of the Laplacian instead of those of
the distance kernel operator, the above result would be phrased in terms of the diffusion
geometry of the spaces X and Y , rather than their original metrics.


Sampling and Computation

In practice, we propose the following computational pipeline:

  • Starting with a metric measure space (X, dX , µX ), pick an approximation parameter
     $\epsilon > 0$.

  • Sample X to obtain an $\epsilon$-net $\hat{X}$.

  • By taking the measure of the Voronoi regions of the sampled points, one can put a
     measure $\mu_{\hat{X}}$ on $\hat{X}$ so that $(\hat{X}, \mu_{\hat{X}})$ approximates (X, µX ). Alternatively, one can take
     the empirical distribution on $\hat{X}$.

  • Compute the eigenfunctions/eigenvectors of the discretized operator $\hat{D}$ on $\hat{X}$.

  • Using a proximity parameter r > 0, build the scale-r α-complex K on the points $\hat{X}$.

  • Linear combinations of eigenfunctions on $\hat{X}$ induce lower-star filtrations on K, whose
     persistence diagrams or Euler curves we can compute.

  • Lemma 5.3.10 implies that $\Phi_N$ is injective on $\hat{X}$ for N sufficiently large, as there is a
     lower bound on the distance between points in $\hat{X}$. One can then appeal to Theorem
     5.5.13 to assert that if Y is another space with sampling $\hat{Y}$, and N is large enough so
     that $\Phi_N$ is also injective on $\hat{Y}$, then $I\text{-}PKT_N(\hat{X}) = I\text{-}PKT_N(\hat{Y})$ implies that $\hat{X}$
     and $\hat{Y}$, and therefore also X and Y , are Gromov-Hausdorff close.
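
The first steps of this pipeline can be sketched in a few lines of NumPy. This is an illustration under several assumptions that are not taken from the text: the sample carries the empirical measure, eigenpairs are ordered by decreasing absolute eigenvalue, and an r-neighborhood graph stands in for the α-complex (so the Euler curve only counts vertices and edges). The function names are hypothetical.

```python
import numpy as np


def distance_kernel_coordinates(points, n_coords):
    """Empirical distance kernel coordinates of a Euclidean point cloud."""
    pts = np.asarray(points, dtype=float)
    m = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Discretized operator: (D f)(x_j) = (1/m) * sum_i d(x_j, x_i) f(x_i).
    eigvals, eigvecs = np.linalg.eigh(dist / m)
    # Assumption: keep the eigenpairs of largest absolute eigenvalue.
    order = np.argsort(-np.abs(eigvals))[:n_coords]
    lam = eigvals[order]
    phi = eigvecs[:, order] * np.sqrt(m)  # unit norm for the empirical measure
    alpha = phi * np.sqrt(lam.astype(complex))  # imaginary when lam < 0
    return dist, lam, alpha


def lower_star_euler_curve(dist, f, r, thresholds):
    """Vertices minus edges of the sublevel sets of the lower-star
    filtration induced by f on the r-neighborhood graph."""
    rows, cols = np.triu_indices(len(f), k=1)
    keep = dist[rows, cols] <= r
    edge_values = np.maximum(f[rows[keep]], f[cols[keep]])
    return np.array([np.sum(f <= t) - np.sum(edge_values <= t) for t in thresholds])


# Hypothetical input: 40 random points on the unit circle.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=40)
cloud = np.column_stack([np.cos(angles), np.sin(angles)])

dist, lam, alpha = distance_kernel_coordinates(cloud, n_coords=3)
v = np.ones(6) / np.sqrt(6.0)  # one direction on the unit sphere in R^6
f = alpha.real @ v[0::2] + alpha.imag @ v[1::2]
thresholds = np.linspace(f.min(), f.max(), 20)
curve = lower_star_euler_curve(dist, f, r=0.5, thresholds=thresholds)
```

A faithful implementation would replace the neighborhood graph with an α-complex from a computational topology library and compute persistence diagrams in all degrees.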


Future Work

Moving forward, we hope to expand this line of research in the following ways:

  • Demonstrate continuity of the I − EKD Tn , generalizing Proposition 5.5.12.

  • Identify spaces X and Y for which the function gXY considered in Theorem 5.5.13
      enjoys the property limn gXY (n) = 0.

  • Identify spaces X for which the coordinatization Φn is an embedding.

  • Conduct experiments to test the efficacy of these topological transforms and the metrics
      they induce.


5.6     Metric Stability and Operator Perturbation

In this section we prove that, up to an arbitrarily small perturbation of the distance kernel
embedding, it is sufficient to look at finite approximations of metric spaces in order to study
the spectral properties of the distance kernel operator.

     Koltchinskii and Giné [KG00], and Koltchinskii [Kol98] have established the convergence
of spectra and eigenprojections of random empirical operators approximating a Hilbert-
Schmidt operator. We define some notions in order to state their results, and connect them to
our setting. In the following, let (X, µ) be a probability space, and h a symmetric measurable
kernel h : X × X → R that is square integrable, has trivial diagonal, and defines a Hilbert-
Schmidt integral operator H : L2 (X, µ) → L2 (X, µ), i.e.,

\[
\int |h(x, y)|^2 \, d(\mu \times \mu)(x, y) < +\infty, \qquad \forall x \in X, \ h(x, x) = 0, \qquad Hf(x) := \int h(x, y) f(y) \, d\mu(y).
\]


Let $\hat{X}_n := \{x_1, \dots, x_n\}$ be i.i.d. points of X sampled from µ, defining a probability space
$(\hat{X}_n, \mu_n)$ with uniform probability $\mu_n(x_i) = 1/n$ for all i. Let $\hat{H}_n : L^2(\hat{X}_n, \mu_n) \to L^2(\hat{X}_n, \mu_n)$
be the associated empirical operator, i.e.,

\[
\hat{H}_n f(x) = \int h(x, y) f(y) \, d\mu_n(y) = \frac{1}{n} \sum_{i} h(x, x_i) f(x_i).
\]

Denote by λ(H) and $\lambda(\hat{H}_n)$ the ordered spectra of the respective operators, indexed over
$\mathbb{Z}^* = \mathbb{Z} \setminus \{0\}$ such that


                                 λ1 ≥ λ2 ≥ . . . ≥ 0 ≥ . . . ≥ λ−2 ≥ λ−1 .
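
The empirical operator and its ordered spectrum are easy to form numerically. The sketch below is an illustration only: the helper name `empirical_spectrum`, the sample, and the choice of kernel h(x, y) = |x − y| are assumptions made for the example.

```python
import numpy as np


def empirical_spectrum(sample, kernel):
    """Spectrum of the empirical operator with matrix (1/n) * h(x_i, x_j).

    Returns the nonnegative eigenvalues in decreasing order
    (lambda_1 >= lambda_2 >= ...) and the negative ones in increasing
    order (lambda_{-1} <= lambda_{-2} <= ...).
    """
    n = len(sample)
    gram = np.array([[kernel(x, y) for y in sample] for x in sample], dtype=float)
    eigenvalues = np.linalg.eigvalsh(gram / n)
    positive = np.sort(eigenvalues[eigenvalues >= 0])[::-1]
    negative = np.sort(eigenvalues[eigenvalues < 0])
    return positive, negative


# Hypothetical example: uniform sample of [0, 1] with the distance kernel.
rng = np.random.default_rng(1)
sample = rng.uniform(0.0, 1.0, size=30)
positive, negative = empirical_spectrum(sample, lambda x, y: abs(x - y))
```

Because the kernel has trivial diagonal, the matrix has zero trace, so the positive and negative parts of the spectrum cancel in sum.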


For a function f : X → R, we denote by $\tilde{f}$ its restriction $\tilde{f} : \hat{X}_n \to \mathbb{R}$.

     A µ-Glivenko-Cantelli class of functions F ⊂ L2 (X) is a set of functions that satisfy

\[
\sup_{f \in \mathcal{F}} \left| \int f(x) \, d(\mu - \mu_n)(x) \right| \to 0 \text{ as } n \to +\infty \text{ almost surely.}
\]


     We state the convergence theorem:

Theorem 5.6.1 (Koltchinskii and Giné [KG00], Koltchinskii [Kol98]). With the above
notation,

\[
\sum_{i \in \mathbb{Z}^*} |\lambda_i(H) - \lambda_i(\hat{H}_n)|^2 \to 0 \text{ as } n \to +\infty \text{ almost surely.}
\]

Additionally, let F be a class of measurable functions on X with a square integrable envelope
F ∈ L2 (X, µ), i.e. |f (x)| ≤ F (x) for all x ∈ X and f ∈ F, such that, for all i ∈ Z∗ ,


                            Fφi := {f φi : f ∈ F} is µ-Glivenko-Cantelli.


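Then, for λ an eigenvalue of H of multiplicity m at distance at least 2ε > 0 from other
eigenvalues σ(H) − {λ} and 0, we have

\[
\sup_{f, g \in \mathcal{F}} \left| \langle P^\varepsilon_\lambda(\hat{H}_n) \tilde{f}, \tilde{g} \rangle_{L^2(\hat{X}_n, \mu_n)} - \langle P_\lambda(H) f, g \rangle_{L^2(X, \mu)} \right| \to 0 \text{ as } n \to +\infty \text{ almost surely,}
\]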


where $P_\lambda(H) : L^2(X, \mu) \to L^2(X, \mu)$ is the projection on the m-dimensional space spanned
by the eigenfunctions of H of eigenvalue λ, and $P^\varepsilon_\lambda(\hat{H}_n) : L^2(\hat{X}_n, \mu_n) \to L^2(\hat{X}_n, \mu_n)$ is the
projection on the space spanned by all eigenfunctions of $\hat{H}_n$ of eigenvalues in the interval
$[\lambda - \varepsilon, \lambda + \varepsilon]$.

   We connect this theorem to our setting. Let (X, d, µ) be a metric measure space, with
0 < µ(X) < +∞, and Φk : X → Rk the distance kernel embedding in dimension k, induced
by the distance kernel operator D := DX (defined in Section 5.3), assuming all eigenvalues
of D are distinct.
   Any finite subsample $\hat{X}_n = \{x_1, \dots, x_n\} \subset X$ induces a finite metric measure space
$(\hat{X}_n, d, \mu_n)$, where d is inherited from X by restriction, and $\mu_n$ is the uniform measure
$\mu_n(x_i) = \mu(X)/n$. We denote by $\hat{\Phi}_k : \hat{X}_n \to \mathbb{R}^k$ the distance kernel embedding for the
distance kernel operator $\hat{D} := D_{\hat{X}_n}$ defined on the discrete metric measure space. Denote by
$\{(\lambda_i, \phi_i)\}_{i \in \mathbb{Z}^*}$ and by $\{(\hat{\lambda}_i, \hat{\phi}_i)\}_{i \in \mathbb{Z}^*}$ the spectra of D and $\hat{D}$ respectively.

   In this section, we prove the following corollary of Theorem 5.6.1:

Corollary 5.6.2. For any compact metric measure space (X, d, µ) with doubling measure,
bounded volume and diameter, and any ε > 0, there exists a finite subset $\hat{X}_N \subset X$, for
N = N (ε) large enough, such that

\[
d_H(\Phi_k(X), \hat{\Phi}_k(\hat{X}_N)) \le \varepsilon.
\]


   We first prove some lemmas. Define the class of functions
$\mathcal{F} := \{d(x, \cdot) : x \in X\} \cup \{\phi_i\}_{i=1,\dots,k}$ from X to R.

Lemma 5.6.3. If diam(X) < +∞ and 0 < µ(X) < +∞, the family F has a square
integrable envelope.

Proof. Naturally, the functions d(x, ·) are bounded by diam(X) < +∞ for all x ∈ X. Any
eigenfunction φi , for 1 ≤ i ≤ k, has $L^2$-norm 1, and consequently admits a point $x_0$ for which
$|\phi_i(x_0)| \le 1/\sqrt{\mu(X)}$. Because φi is 1/|λi |-Lipschitz, the function is bounded by
$1/\sqrt{\mu(X)} + \mathrm{diam}(X)/|\lambda_i|$. Finally, the constant function
$x \mapsto \max\{\mathrm{diam}(X), 1/\sqrt{\mu(X)} + \mathrm{diam}(X)/|\lambda_k|\}$
is square integrable and is an envelope for the family F.

Lemma 5.6.4. If X is compact, diam(X) < +∞, 0 < µ(X) < +∞, and µ is a doubling
measure, the Hausdorff distance $d_H(X, \hat{X}_n) \to 0$ as n → +∞ almost surely.

Proof. Fix ε > 0 arbitrarily small. Because X is compact, it can be covered by finitely many
balls of radius ε. Because diam(X) < +∞, 0 < µ(X) < +∞, and µ is doubling, the measure
of these balls is uniformly bounded below. Consequently, the probability that any such ball
remains empty when picking larger samples i.i.d. goes to zero. Consequently, for any ε, $\hat{X}_n$
is almost surely ε-dense in X when n → +∞.
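
A small simulation illustrates the statement of Lemma 5.6.4 in the simplest case. The sketch assumes X is the unit interval with Lebesgue measure, approximates X by a fine grid, and uses nested prefixes of a single i.i.d. sequence; all names are hypothetical.

```python
import random


def one_sided_hausdorff(grid, sample):
    """max over grid points of the distance to the nearest sample point.

    Since the sample is a subset of X, this equals the Hausdorff distance
    between the sample and X, up to the resolution of the grid.
    """
    return max(min(abs(g - s) for s in sample) for g in grid)


rng = random.Random(0)
draws = [rng.random() for _ in range(400)]  # i.i.d. uniform points of [0, 1]
grid = [i / 200 for i in range(201)]         # grid standing in for X

sizes = [5, 50, 400]
gaps = [one_sided_hausdorff(grid, draws[:n]) for n in sizes]
```

Because the samples are nested, the computed distances can only decrease as n grows.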

Lemma 5.6.5. The 1-Wasserstein distance W1 (µ, µn ) → 0 as n → +∞ almost surely.

Proof. This is standard in statistics and follows from the fact that the empirical measure
weakly converges to µ almost surely [Var58] and that W1 metrizes weak convergence [Vil09].


Lemma 5.6.6. If f is c1 -Lipschitz and g is c2 -Lipschitz, then f g is (||f ||∞ c2 + ||g||∞ c1 )-
Lipschitz.

Proof. By an elementary computation, for any x, y in the domain:


            |f (x)g(x) − f (y)g(y)| ≤ |f (x)g(x) − f (y)g(x)| + |f (y)g(x) − f (y)g(y)|
                                               ≤ ||g||∞ |f (x) − f (y)| + ||f ||∞ |g(x) − g(y)|.
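
A quick numerical sanity check of Lemma 5.6.6, taking f = sin and g = cos as an assumed example (both 1-Lipschitz and bounded by 1, so the lemma gives the constant 2). The helper `lipschitz_ratio` is hypothetical and only inspects pairs of points on a finite grid.

```python
import math


def lipschitz_ratio(func, xs):
    """Largest difference quotient |func(x) - func(y)| / |x - y| over pairs in xs."""
    return max(
        abs(func(x) - func(y)) / abs(x - y)
        for i, x in enumerate(xs)
        for y in xs[:i]
    )


xs = [i * 0.05 for i in range(127)]  # grid covering roughly [0, 2*pi]

# Lemma bound with ||f||_inf = ||g||_inf = 1 and c1 = c2 = 1.
bound = 1.0 * 1.0 + 1.0 * 1.0
observed = lipschitz_ratio(lambda x: math.sin(x) * math.cos(x), xs)
```

The observed ratio stays below the bound; the bound is not sharp here, since sin(x)cos(x) = sin(2x)/2 is in fact 1-Lipschitz.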


Lemma 5.6.7. For any eigenfunction φi , 1 ≤ i ≤ k, the family Fφi is µ-Glivenko-Cantelli.

Proof. Because the functions d(·, x) and φi are ck -Lipschitz, for a uniform constant ck ≤
max{1, 1/|λk |}, and uniformly bounded (see proof of Lemma 5.6.3), by Lemma 5.6.6 their
pairwise products are c-Lipschitz for a uniform constant c < +∞.
     Using the formula for W1 from Kantorovich-Rubinstein duality, and Lemma 5.6.5, we
have, for f ∈ F :

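\[
\left| \int f(x) \phi_i(x) \, d(\mu - \mu_n)(x) \right| \le c \cdot \sup \left\{ \left| \int h \, d(\mu - \mu_n) \right| : h \text{ is 1-Lipschitz} \right\} = c \cdot W_1(\mu, \mu_n) \to 0 \text{ a.s.}
\]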
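
The Kantorovich-Rubinstein estimate used here can be illustrated on the real line, where the 1-Wasserstein distance between two empirical measures with equally many atoms is the mean absolute difference of the sorted samples. The sketch below compares two empirical measures rather than µ and µn, and all names and sample parameters are hypothetical.

```python
import math
import random


def wasserstein_1d(a, b):
    """W1 between uniform empirical measures on equal-size real samples."""
    if len(a) != len(b):
        raise ValueError("samples must have the same size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def integral_gap(h, a, b):
    """|integral of h against (empirical measure of a minus that of b)|."""
    return abs(sum(h(x) for x in a) / len(a) - sum(h(y) for y in b) / len(b))


rng = random.Random(0)
sample_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
sample_b = [rng.gauss(0.3, 1.0) for _ in range(200)]

w1 = wasserstein_1d(sample_a, sample_b)
# Each test function is 1-Lipschitz, so its gap is at most W1.
gap_identity = integral_gap(lambda x: x, sample_a, sample_b)
gap_abs = integral_gap(abs, sample_a, sample_b)
gap_sin = integral_gap(math.sin, sample_a, sample_b)
# A c-Lipschitz function (here c = 3) has gap at most c * W1.
gap_scaled = integral_gap(lambda x: 3.0 * math.sin(x), sample_a, sample_b)
```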


     We finally prove the corollary:

Proof of Corollary 5.6.2. Because D is compact, it can be expressed as

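\[
Df = \sum_{i \in \mathbb{Z}^*} \lambda_i \langle f, \phi_i \rangle \phi_i .
\]

We consider the truncated operator $D_k f := \sum_{i=1}^{k} \lambda_i \langle f, \phi_i \rangle \phi_i$ that admits finite spectrum
$\{(\lambda_i, \phi_i)\}_{i=1,\dots,k}$ and defines the same distance kernel embedding Φk for X.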
     Let $\hat{X}_n$ be an arbitrary subsample of X. Pick any x ∈ X, and any $\hat{x} \in \hat{X}_n$. Because φi
is 1/|λi |-Lipschitz, the distance between $\Phi_k(x)$ and $\Phi_k(\hat{x})$ is bounded by:

\[
\|\Phi_k(x) - \Phi_k(\hat{x})\|_\infty \le \max_{j=1,\dots,k} \sqrt{|\lambda_j|} \, |\phi_j(x) - \phi_j(\hat{x})| \le \max_{j=1,\dots,k} \frac{d(x, \hat{x})}{\sqrt{|\lambda_j|}} .
\]

     Consequently,

\[
d_H(\Phi_k(X), \Phi_k(\hat{X}_n)) \le \frac{d_H(X, \hat{X}_n)}{\sqrt{|\lambda_k|}} \to 0 \text{ as } n \to +\infty \text{ almost surely,} \tag{5.3}
\]

by virtue of Lemma 5.6.4.
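     Assume all eigenvalues λ1 , . . . , λk of DX have multiplicity 1 and are not 0. Define
$\varepsilon_1 = \frac{1}{2} \min_{i=1,\dots,k} \{|\lambda_i - \lambda_{i+1}|, |\lambda_i|\} > 0$, smaller than the eigenvalue gaps. The following projections
onto eigenspaces are onto 1-dimensional spaces,

\[
\left| \langle P^{\varepsilon_1}_{\lambda_i}(\hat{D}) \tilde{f}, \tilde{g} \rangle_{L^2(\hat{X}_n)} - \langle P_{\lambda_i}(D) f, g \rangle_{L^2(X)} \right|
= \left| \langle \hat{\phi}_i, \tilde{f} \rangle_{L^2(\hat{X}_n)} \langle \hat{\phi}_i, \tilde{g} \rangle_{L^2(\hat{X}_n)} - \langle \phi_i, f \rangle_{L^2(X)} \langle \phi_i, g \rangle_{L^2(X)} \right| .
\]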

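     Taking first f = φi and $g = d(\hat{x}, \cdot)$, for any $\hat{x} \in \hat{X}_n$, then f = g = φi , we apply
Theorem 5.6.1, using Lemmas 5.6.3 and 5.6.7, to get:

\[
\text{by Definition 5.3.1,} \quad \left| \langle \hat{\phi}_i, \tilde{\phi}_i \rangle \cdot \hat{\lambda}_i \hat{\phi}_i(\hat{x}) - \lambda_i \phi_i(\hat{x}) \right| \to 0 \text{ as } n \to +\infty \text{ almost surely,}
\]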

and

\[
\left| \langle \hat{\phi}_i, \tilde{\phi}_i \rangle^2 - 1 \right| \to 0 \text{ as } n \to +\infty \text{ almost surely.}
\]

Note that there is an ambiguity in the sign of $\hat{\phi}_i$. We always select the eigenfunction $\hat{\phi}_i$
such that $\langle \hat{\phi}_i, \tilde{\phi}_i \rangle \ge 0$. Consequently, together with the convergence of eigenvalues of
Theorem 5.6.1,

\[
d_H(\Phi_k(\hat{X}_n), \hat{\Phi}_k(\hat{X}_n)) \to 0 \text{ as } n \to +\infty \text{ almost surely.} \tag{5.4}
\]
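
The sign convention for the empirical eigenfunctions amounts to a one-line check. The sketch below assumes both functions are given by their values on the sample and uses uniform weights 1/n for the inner product (a positive rescaling of the weights does not affect the sign); the name `align_sign` is hypothetical.

```python
def align_sign(phi_hat, phi_tilde):
    """Return phi_hat or its negative, whichever has nonnegative inner
    product with phi_tilde under uniform weights on the sample."""
    inner = sum(a * b for a, b in zip(phi_hat, phi_tilde)) / len(phi_hat)
    return [-a for a in phi_hat] if inner < 0 else list(phi_hat)


# Hypothetical values of an empirical eigenvector and a restricted eigenfunction.
flipped = align_sign([-1.0, -2.0, 0.5], [1.0, 2.0, -0.5])
unchanged = align_sign([1.0, 2.0, -0.5], [1.0, 2.0, -0.5])
```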


   In conclusion, by virtue of Equations (5.3) and (5.4), for any ε > 0, there exists N large
enough such that the probability that both $d_H(\Phi_k(X), \Phi_k(\hat{X}_N)) \le \varepsilon/2$ and
$d_H(\Phi_k(\hat{X}_N), \hat{\Phi}_k(\hat{X}_N)) \le \varepsilon/2$ are true for an i.i.d. sample $\hat{X}_N \subset X$ of size N is strictly
positive. With the triangle inequality for $d_H$, this guarantees the existence of a subsample
$\hat{X}_N$ such that

\[
d_H(\Phi_k(X), \hat{\Phi}_k(\hat{X}_N)) \le \varepsilon.
\]
                                                                  CHAPTER 6


Conclusion


The premise of this thesis is that the flexibility of persistent homology allows for the con-
struction of discriminative and interpretable topological transforms. Over the course of the
prior chapters, we have explored two ways of constructing such transforms when the data
consists of intrinsic metric objects. We have seen that the content of our topological trans-
forms is related to the family of functions used to define them. The transforms defined
in this thesis are based on the metric data of the input shapes, and so the corresponding
assertions of injectivity are framed in terms of Gromov-Hausdorff distances. We have also
seen that our topological transforms enjoy various stability results, and are computationally
feasible to approximate. Taken together, these results suggest the utility of these transforms
in shape comparison, alignment, and analysis.


   There are still a number of technical and computational challenges to overcome in
applying these transforms to real-world data sets. For the IPHT and IECT, one must identify
which subsets of basepoints are most informative. For the various spectral transforms of
Chapter 5.1, one must likewise make a judicious choice of vectors in $S^{n-1}$ for computing
persistence diagrams and Euler curves. For either set of transforms, there is also the question
of which metric on C(Barcodes) or C(ZR ) is most effective.


   Finally, the author would like to draw the reader’s attention to the variety of mathe-
matical ideas needed in studying topological transforms. One feels confident in asserting
that future progress in this field will rely on the collaboration of topologists and geometers,
analysts and statisticians, and a host of other specialities in mathematics and the applied
sciences.
Appendices


                                                                APPENDIX A


Proofs for the IPHT and IECT


A.1       Proof of Theorem 4.4.7

We will make use of a local injectivity result for Reeb graphs from [CO17] referenced in the
background as Theorem 2.2.28, setting K = 1/22.


   Let us first deal with the exceptional case when G is a circle. We claim that if $G'$ is any
other graph in the space MGraphs then $d_{PD}(G, G') > 0$. Observe first that the barcode
transform of a circle of circumference c is the barcode consisting of the single interval (0, c/2), and
thus if $G'$ is a circle of a different circumference it cannot produce the same barcode. On
the other hand, if $G'$ is not a circle then Lemma 4.5.1 implies that $IPHT(G')$ is not a single
point, and hence it too cannot equal IPHT(G).


   Next, let G ∈ MGraphs be any graph which is not a circle, and take any basepoint
x ∈ G, giving rise to a barcode ΨG (x). We will exhibit a constant $\epsilon(G, x) > 0$ such that
if $G'$ is another metric graph with $0 < d(G, G') < \epsilon(G, x)$ then $IPHT(G')$ omits the
barcode ΨG (x). Then since ΨG and $\Psi_{G'}$ are continuous by Lemma 4.2.6, both IPHT(G) and
$IPHT(G')$ are compact subsets of Barcode space, and hence if they are not equal their
Hausdorff distance is strictly positive: i.e. $d_{PD}(G, G') > 0$.


   Firstly, let a be the minimal distance between successive critical values for the Reeb
graph ΦG(x). Observe that Lemma 4.5.1 implies that the basepoints which produce the
same barcode as x are isolated, and hence, by compactness of G, the set of such basepoints,
written Sx, is finite. Let 0 < r < a/32. Let Ωr = Nr(Sx) be the union of open neighborhoods
of radius r around points in Sx. Then G \ Ωr is compact, and no point in this complement
produces a barcode identical to ΨG(x); continuity of ΨG then implies that these barcodes are
bounded away from ΨG(x) by some constant δr > 0.


   Next, let 0 < ε < min(δr/36, a/192). Suppose that G′ is another metric graph with
0 < dGH(G, G′) < ε, and take any x′ ∈ G′. We will show that ΨG′(x′) ≠ ΨG(x). To see
this, let M be any correspondence realizing the Gromov-Hausdorff distance between these
two graphs, δ = dGH(G, G′); such a correspondence exists by the compactness of our graphs.
Two cases emerge.


   Case 1: The correspondence M pairs x′ with a point p ∈ Ωr, i.e. a point at distance less
than r from some point q ∈ Sx.


   Case 2: The point x′ is paired with a point p ∈ G \ Ωr.


   Let us first deal with case 1, and let q ∈ Sx be a closest point to p, with d(p, q) < r.
Since ΨG(x) = ΨG(q), it will suffice to show that ΨG′(x′) ≠ ΨG(q). We have seen in Lemma
4.2.6 that the Reeb graphs ΦG(q) and ΦG(p) are within r of each other in the FD distance.
Moreover, because p and x′ have been matched in M, Theorem 4.2.7 implies that their Reeb
graphs are within

\[ 6\delta < 6\epsilon < 6 \times \frac{a}{192} = \frac{a}{32} \]

of each other. Thus, applying the triangle inequality together with r < a/32,

\[ d_{FD}(\Phi_G(q), \Phi_{G'}(x')) < 2 \times \frac{a}{32} = \frac{a}{16}. \]

   Hence, applying Theorem 2.2.28, we see that these two Reeb graphs can only produce
identical barcodes if they are equal to each other. In that case, Corollary A.2.2 tells us
that one can always recover the original metric graph from any of its Reeb graphs, so that
G ≅ G′, contradicting our assumption that dGH(G, G′) > 0.


   Now let us consider case 2. By Theorem 4.2.7, the barcodes ΨG(p) and ΨG′(x′) are within

\[ 18\delta < 18\epsilon < 18 \times \frac{\delta_r}{36} = \frac{\delta_r}{2} \]

of each other in the Bottleneck distance. But, as p ∈ G \ Ωr, the barcode ΨG(p) is
bounded away from ΨG(x) by δr. Applying the triangle inequality again, we find that
dB(ΨG(x), ΨG′(x′)) > δr/2 > 0. □
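
   For reference, the constants used in the two cases can be collected in one display. This is only a restatement of the inequalities appearing in the proof, written in LaTeX, and adds no new argument; it reads "bounded away by δr" as a lower bound of δr on the bottleneck distance.

```latex
\begin{align*}
  & 0 < r < \tfrac{a}{32}, \qquad
    0 < \delta = d_{GH}(G,G') < \epsilon < \min\!\left(\tfrac{\delta_r}{36},\, \tfrac{a}{192}\right), \\[4pt]
  \text{Case 1:}\quad
  & d_{FD}\big(\Phi_G(q),\Phi_{G'}(x')\big)
    \le d_{FD}\big(\Phi_G(q),\Phi_G(p)\big) + d_{FD}\big(\Phi_G(p),\Phi_{G'}(x')\big)
    < r + 6\delta < \tfrac{a}{32} + \tfrac{a}{32} = \tfrac{a}{16}, \\[4pt]
  \text{Case 2:}\quad
  & d_B\big(\Psi_G(x),\Psi_{G'}(x')\big)
    \ge d_B\big(\Psi_G(x),\Psi_G(p)\big) - d_B\big(\Psi_G(p),\Psi_{G'}(x')\big)
    > \delta_r - \tfrac{\delta_r}{2} = \tfrac{\delta_r}{2}.
\end{align*}
```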


A.2       Proofs of Lemmata 4.5.1, 4.5.2, and 4.5.3

We will make use of the following result.

Proposition A.2.1. Let (G, x) be a pointed, connected, and compact metric graph, and fix
y ∈ G. There is a constant ε = ε(x, y), depending on x and y, with the property that if z is
a third point with d(y, z) < ε, then d(y, z) = |d(x, y) − d(x, z)|. In other words, the distance
between y and z can be written as the difference of their distances to x.

Proof. Consider the depiction of a metric graph in Figure A.1, where x ∈ G is the basepoint.
For each direction incident to y, there is either a geodesic from y to x that starts out along
that direction, or there is not. In the former situation, moving along that edge (which has
some positive length) will bring one closer to x, and the geodesic from y to x is exactly
the concatenation of the geodesic from y to z and the geodesic from z to x. In the latter
situation, even if there are paths from y to x that depart along that edge and do not return
to y, they will necessarily be longer than the geodesic distance from y to x, and so for
z sufficiently close to y (in that non-geodesic direction) the shortest path to x will need to
pass through y. Thus, the geodesic from x to z will be the concatenation of the geodesic
from x to y and the geodesic from y to z.
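
   The statement can be checked by hand on the simplest graph with a cycle. The sketch below is an illustration of ours, not part of the original argument: it models a circle of circumference 1 by the parameter interval [0, 1), uses exact rational arithmetic, and takes as a candidate ε(x, y) the smaller of the distances from y to x and from y to the antipode of x. The helper names `circle_dist` and `identity_holds` are hypothetical.

```python
from fractions import Fraction

def circle_dist(s, t, c):
    """Intrinsic distance between parameters s and t on a circle of circumference c."""
    gap = abs(s - t) % c
    return min(gap, c - gap)

def identity_holds(x, y, z, c):
    """Check d(y, z) == |d(x, y) - d(x, z)| exactly."""
    return circle_dist(y, z, c) == abs(circle_dist(x, y, c) - circle_dist(x, z, c))

c = Fraction(1)
x = Fraction(0)
y = Fraction(2, 5)  # the antipode of x sits at 1/2, so y is 1/10 away from it

# Candidate epsilon(x, y) for this particular y (an assumption made for the example).
eps = min(circle_dist(y, x, c), circle_dist(y, x + c / 2, c))

# Points z within 9/100 < eps of y, on either side of y.
near = [y + Fraction(k, 100) for k in range(-9, 10)]
all_near_ok = all(identity_holds(x, y, z, c) for z in near)

# A point z beyond the antipode of x, at distance 15/100 > eps from y.
far_ok = identity_holds(x, y, y + Fraction(15, 100), c)
```

   Here the identity holds for every sampled z within ε of y and fails once z passes the antipode of x, which is the behaviour the proposition describes: the constant ε genuinely depends on both x and y.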


   This has the following useful Corollary.

Corollary A.2.2. If (G, p) and (H, q) are pointed graphs with Φ(G, p) = Φ(H, q) then (G, p)
is isometric to (H, q).

                            Figure A.1: A metric graph with basepoint x.


Proof. Let g : Φ(G, p) → Φ(H, q) be the homeomorphism preserving the height functions
fp = dG(p, ·) and fq = dH(q, ·) respectively. Let $\hat{d}_{f_p}$ and $\hat{d}_{f_q}$ be the intrinsic metrics on
Φ(G, p) and Φ(H, q) defined using the differentials of these height functions. It is clear
that g is an isometry between our Reeb graphs when equipped with these metrics. However,
Proposition A.2.1 implies that $\hat{d}_{f_p} = d_G$ and $\hat{d}_{f_q} = d_H$, so g is an isometry between the
original graphs.


Proof of Lemma 4.5.1

Fix a basepoint p ∈ G, and take any other basepoint q. Theorem 2.2.20 implies that
dB (ΨG (p), ΨG (q)) ≤ dG (p, q).


   To prove the reverse inequality, we let δ > 0 be smaller than all the pairwise distances
between points in ΨG(p), not counting multiplicity, as well as the distances between those
points and the diagonal. Suppose dG(p, q) < δ/2, and suppose further that we can find
points (a, b) ∈ ΨG(p) and (c, d) ∈ ΨG(q) with d∞((a, b), (c, d)) = dG(p, q) < δ/2. Any optimal
matching between these barcodes has cost at most dG(p, q) < δ/2 and thus will have to
pair (c, d) with (a, b), as pairing (c, d) with the diagonal or any other point in ΨG(p) violates
the choice of δ, forcing (a, b) too close to the diagonal or the other points in ΨG(p). Thus we
will have shown dB(ΨG(p), ΨG(q)) ≥ dG(p, q).


   As G is not a circle, it contains vertices of valence at least three. Let v1 be the closest
such vertex to p. The vertex v1 gives rise to an upfork in ΦG (p) unless v1 has valence exactly
three and is the base of a self-loop with antipodal point p (this is the only way for v1 to be a
downfork). In that case, the vertex w adjacent to v1 (ignoring degree-two vertices) is either
an upfork or a leaf vertex. Thus ΦG (p) contains either an upfork vertex or a downfork leaf
vertex.


   Suppose that ΦG(p) contains an upfork vertex v, giving rise to the point (a, dG(p, v)) in
ΨG(p). For q sufficiently close to p, v is still an upfork vertex, and by Proposition A.2.1 we
can assert that |dG(p, v) − dG(q, v)| = dG(p, q) when dG(p, q) < ε(v, p). The birth time a is
associated with a downfork u at height d(p, u). Let γ be the loop in G whose maximum distance
from p is achieved at u and whose minimum is achieved at v. For q sufficiently close to p, the
maximum height along γ is achieved at some point u′ with |dG(q, u′) − dG(p, u)| ≤ dG(p, q)
and dG(u, u′) ≤ dG(p, q). Moreover, one of the downfork directions at u passes through u′,
and vice versa. The minimum height along γ must be achieved at one of its finitely many
vertices, so if q is close enough to p, the triangle inequality implies that the minimum
distance is still achieved at v. Thus v is a candidate for being paired with u′. Indeed, it is not
hard to see that the sets of candidates for u and u′ are the same finite set of upforks, so as
long as q is close enough to p, u′ will be paired with v (again, a consequence of the triangle
inequality). We conclude that ΨG(q) contains the point (d(q, u′), d(q, v)).


   Similarly, when ΦG (p) contains the downfork leaf e, we know from Proposition A.2.1
that for q close enough to p, |dG (p, e) − dG (q, e)| = dG (p, q), and it is immediate that e is a
downfork leaf for q as well. As in the prior case, for q close enough to p, the corresponding
upfork vertex v is the same for q as for p, and |dG (p, v) − dG (q, v)| ≤ dG (p, q). We conclude
that ΨG (q) contains the point (d(q, e), d(q, v)).


   We have seen that, for some r(p) > 0, and for dG(p, q) < r(p), we can always find a point
in ΨG(p) with a matching point in ΨG(q) at distance exactly dG(p, q) away. Setting
ε(p) = min{r(p), δ/2} completes the proof.


Proof of Lemma 4.5.2

This is a direct application of Proposition A.2.1. Let V∗ = {v1, · · · , vk} be the closest
vertices to p in V (there may be many), with d = d(p, v) for v ∈ V∗. Let δ > 0 be such
that d(p, w) ≥ d + δ for w ∉ V∗. For each v ∈ V∗, let ε(v, p) be as guaranteed by the
proposition. Set ε ≤ min{ε(v1, p), · · · , ε(vk, p), δ/2}. Then for q ∈ G with 0 < d(p, q) < ε, the
closest vertices to q are a subset of V∗, and for all v ∈ V∗, |d(p, v) − d(q, v)| = d(p, q). Thus
|θG(p) − θG(q)| = d(p, q).


Proof of Lemma 4.5.3

Suppose that p is not a vertex, so that θG (p) > 0. Since G is not a graph, no points in the
persistence diagram have birth or death time greater than zero but less than θG (p). Thus,
it is at θG (p) that the first nonzero birth or death time appears. This gives rise to the first
discontinuity in the extended Euler curve unless there is cancellation, i.e. the first nonzero
birth time is equal to the first nonzero death time.


   The smallest nonzero death time in ΨG (p) is the distance from p to the closest vertex
v of valence at least three, unless v has valence three, and is the base of a self loop with
p as its antipodal point (this is the only way for v to be a downfork as opposed to an upfork).


   Suppose that p is not the antipodal point of a self-loop. If the distance from p to its
closest vertex (of valence at least three) is equal to the smallest birth time in ΨG(p), it must
be the case that p sits in the middle of a leaf edge.


   Thus, if p is not a vertex, the antipodal point of a self-loop, or the midpoint of a leaf edge,
θG(p) = sG(p). There are only finitely many such points, which we denote P. Suppose now
that χG is injective, and hence a homeomorphism onto its image. Every γ ∈ IECT(G)
then corresponds to a unique point p, i.e. γ = χG(p), and γn → γ precisely if pn → p. Since
P is discrete, the sequence eventually avoids P. Thus sG(γn) = θG(pn) for n sufficiently
large. Since θG is continuous, this implies s̃G(χ) = θG(p).


A.3       Proof of Proposition 4.4.5

Given a compact metric graph G with vertex set V, we define a cactus approximation of G as
follows. Let S ⊂ G be a finite set of points containing V. Define ω(S) = max_{s,s′∈S} dG(s, s′),
δ(S) = min_{s≠s′∈S} dG(s, s′), and δ(V) = min_{v≠v′∈V} dG(v, v′). Next, let α : S → R+ be a
function which is zero on the leaves of G. We will produce a new graph by attaching to
each point p ∈ S an interval Ip of length α(p). The resulting graph H will be denoted
CactusS,α(G). The idea is illustrated in Figure A.2. The points in S are drawn in red, and
to each one we have attached a new interval, which we will call a thorn in the remainder of
the proof. Note that the thorns attached to leaves in G have length zero, and hence do not
add anything to the graph.


                           Figure A.2: The cactus approximation of G.


   It is clear that if α(p) ≤ ε for all p ∈ S, the Gromov-Hausdorff distance between G and
any CactusS,α(G) is at most ε. The proof of Proposition 4.4.5 follows from Lemma A.3.2.

Definition A.3.1. Let G be a compact graph. The injectivity radius of G, denoted inj(G),
is half the length of the shortest closed curve on G (i.e. half the length of the systole of G).

Lemma A.3.2. Let G be any compact metric graph. Let S ⊂ G be a finite subset and take
0 < α < δ(S)/2. Suppose that 2ω(S) + α < inj(G), ω(S) < δ(V ), and that α : S → R+ is
injective and nonzero on S 0 = S \ {leaves in G}. Then if H = CactusS,α (G), ΨH is injective.

Proof. We claim that the location of a point x ∈ H can be determined from its barcode.


Determining if a point x lies on a thorn from ΨH (x). First of all, we can determine
from ΨH (x) whether or not x sits on a thorn. If x is on a thorn Ip , the distance from x to
the tip of this thorn is less than α. The distance from x to the tip of any other thorn is
at least δ(S) > α. Moreover, since α < inj(G), the ball around x of radius α contains no
loops in G. This implies that the smallest positive birth time of a one-dimensional interval
in ΨH (x) is the distance from x to the tip of Ip , and the corresponding interval dies at 0.
The smallest nonzero death time in ΨH (x) is necessarily the distance from x to the base of Ip ,
which is an upfork in ΦH (x).


   Let x ∈ H be a point not sitting on a thorn. Suppose first that x ∈ H sits on an interior
edge of H, i.e. neither boundary vertex of this edge is a leaf. As before, the α-ball around
x contains no cycles in G, and moreover since α < δ(S) it contains no leaf of G, so if ΨH(x)
contains an interval born before α it corresponds to the distance from x to the tip of a thorn,
and such an interval dies at the distance from x to the base of that thorn, which is nonzero.
Suppose next that x ∈ H sits on a leaf edge. An additional possibility beyond the prior case is
that there is an interval in ΨH(x) born before α that corresponds to the distance from x
to a leaf in G. In that case the distance from x to the other vertex on its edge is at least
δ(S) − α > 2α − α = α, and this is the first nonzero death time in ΨH(x).


   Thus we have seen that the barcodes associated to points on thorns are distinct from
those that are not on thorns, and hence we can identify whether a point x ∈ H lies on a
thorn from its barcode.


Determining the location of a point x from ΨH (x). When x lies on a thorn Ip , we
have seen that ΨH (x) records its distance from the tip and base of that thorn. From this
information, the length α(p) of the thorn can also be derived. Since α is injective on S 0 , we
can identify which thorn this is, and where on it x sits.


   Next, suppose that x does not sit on a thorn. If x lies on an interior edge, with endpoints
p1 , p2 ∈ S, we claim that we can deduce what these points p1 and p2 are from ΨH (x), as well
as the distance from these points to x. This uniquely identifies the point x, as ω(S) < inj(G)
implies that there is only one point equidistant between p1 and p2 . To do this, observe that
the ball of radius ω(S) + α around x contains the thorns on either end of the edge. Since
ω(S) + α < inj(G), this ball is a tree and does not contain any loops in G. Thus the intervals
in ΨH (x) born before ω(S)+α correspond to the distances from x to the tips of these thorns,
and the death times of the corresponding intervals are the distances from x to the base of
those thorns. The barcode may also contain intervals corresponding to the distances from x
to leaves in G, but such intervals die at zero and cannot be confused with intervals coming
from thorns. From this information, one can further deduce the length of those thorns, and
hence the thorns themselves and their corresponding bases in S. This uniquely identifies the
point x ∈ H.


   Lastly, suppose x sits on a leaf edge, with non-leaf endpoint p. Let q1, · · · , qn be all the
vertices in S adjacent to p. Since δ(S) < ω(S), it is not possible for two distinct leaves in
G to be adjacent to the same vertex in S, and hence all of the qi are in S′. As before, we
claim that we can identify the points p, q1, · · · , qn, as well as the distances from these points
to x, from ΨH(x). This will uniquely identify the point x. To do this, observe that the ball
of radius 2ω(S) + α around x contains the thorns Ip, Iq1, · · · , Iqn. As before, we can identify
which thorns these are from ΨH(x), and subsequently deduce the distances from their bases
to x.
                                                                 APPENDIX B


Proof of Proposition 4.5.5 for ΨG


Note: in the following proof, only the birth and death times are used, not the
pairing.


   To simplify the following proof, we will remove any vertices of degree two and merge
the adjacent edges; this may introduce multiple edges between vertices, which is fine for our
purposes. This will produce a new metric graph G′ (as the combinatorial type is different),
but it is clear that ΨG is injective if and only if ΨG′ is. Note that if the edge lengths of G do
not satisfy any nontrivial linear equalities over Z, then neither will the edge lengths of this
new graph. Thus we assume from the beginning that every vertex in G either has valence
greater than or equal to three or has valence equal to one (a leaf vertex).


Remark B.0.1. Any bijective isometry between graphs sends a vertex of valence distinct from
two to another vertex of valence distinct from two (and of the same valence). Thus, such
vertices are part of the metric data of a graph. The combinatorial data of degree-two vertices
is not used in the following proof, and hence the result still applies if our metric graphs are
considered equivalent up to isometry (and not just metric isomorphism).

   Before we proceed with the case analysis, we introduce some useful technical results.


B.0.1      Technical Results

Lemma B.0.2. Let G be any compact metric graph, possibly with self-loops. For any
basepoint p ∈ G, it is possible to deduce the valence of p from ΨG(p), where the valence of a
non-vertex point is considered to be 2. Indeed, the valence of p is one more than the number of
intervals with death time zero in the one-dimensional part of ΨG(p).

Proof. For r sufficiently small, the ball of radius r around p is isometric to a disjoint union of
intervals of the form [0, r], identified at the origin, where the number of intervals is precisely
the valence of p. In the relative part of our filtration, we compute the homology of this space
after identifying its boundary, a space homotopy equivalent to a wedge of circles, where the
number of circles is now the valence of p minus one. The rank of this homology group is
then equal to the valence of p minus one, until r = 0 and the homology groups vanish. Thus
the number of intervals with death time zero in the one-dimensional part of ΨG (p) is the
valence of p minus one.
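
   The rank used in the proof can be double-checked with an Euler characteristic count. This is a routine verification added for illustration; the symbols k (the valence of p) and Q (the quotient space) are our notation, not the text's.

```latex
% Ball of radius r around p (r small): k segments [0, r] glued at the origin.
% Collapsing the k boundary points to a single point gives a connected graph Q
% with 2 vertices (the origin and the collapsed boundary) and k edges.
\[
  \operatorname{rank} H_1(Q) \;=\; 1 - \chi(Q) \;=\; 1 - (2 - k) \;=\; k - 1 .
\]
% Examples: k = 1 (a leaf) gives rank 0; k = 2 (a non-vertex point) gives rank 1.
```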

Corollary B.0.3. If p and q are two basepoints with different valences, then ΨG(p) ≠ ΨG(q).
In particular, in our setting where there are no vertices of valence two, it is impossible for a
vertex and a non-vertex to produce the same persistence diagram.

Lemma B.0.4. For any basepoint p ∈ G, and any vertex v ∈ G not equal to p, v is either
an upfork, a downfork, or both, in ΦG (p). If v is a leaf vertex it is necessarily a downfork.

Proof. If v is a leaf, then the edge to which it is adjacent is by necessity the initial segment
of a geodesic from v to p, making v a downfork. Otherwise, v has valence at least three, so
either two or more directions adjacent to v are the initial segments of geodesics from v to p,
or at most one is. In the former case, v is a downfork. In the latter case, v is an upfork. If v
has valence at least four it is possible for it to be both an upfork and a downfork, depending
on how many of the adjacent directions to v are the initial segments of geodesics from v to
p.
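
   The dichotomy in this proof can be made concrete on a small example. The following is a minimal sketch under simplifying assumptions of ours: the basepoint is a vertex, there are no self-loops, edge lengths are integers so that equality tests are exact, and a direction at v along an edge to a neighbour u counts as "downward" exactly when d(p, u) plus the edge length equals d(p, v). The graph `EDGES` and the function names are hypothetical.

```python
import heapq
from collections import defaultdict

def distances_from(edges, source):
    """Dijkstra shortest-path distances on an undirected weighted edge list."""
    adj = defaultdict(list)
    for x, y, w in edges:
        adj[x].append((y, w))
        adj[y].append((x, w))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, w in adj[x]:
            nd = d + w
            if y not in dist or nd < dist[y]:
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return dist

def classify_vertices(edges, p):
    """Label each vertex other than p as an upfork and/or downfork relative to p."""
    dist = distances_from(edges, p)
    down = defaultdict(int)  # directions that start a geodesic toward p
    up = defaultdict(int)    # all remaining directions
    for x, y, w in edges:
        for v, other in ((x, y), (y, x)):
            if dist[other] + w == dist[v]:
                down[v] += 1
            else:
                up[v] += 1
    kinds = {}
    for v in dist:
        if v == p:
            continue
        labels = set()
        if down[v] + up[v] == 1 or down[v] >= 2:
            labels.add("downfork")  # leaves, or two or more geodesic directions
        if up[v] >= 2:
            labels.add("upfork")
        kinds[v] = labels
    return kinds

# Hypothetical graph: v is reached from p by two geodesics of length 4.
EDGES = [("p", "l0", 1), ("p", "u", 1), ("u", "l1", 1),
         ("u", "v", 3), ("p", "v", 4), ("v", "l2", 1)]
kinds = classify_vertices(EDGES, "p")
```

   In this example u is an upfork, v is a downfork, and the leaves are downforks. Note that v is a downfork only because the edge lengths satisfy 4 = 1 + 3, in line with the later observation that a non-leaf downfork vertex forces a linear equality among edge lengths.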

Lemma B.0.5. Assume G is not a circle and has no vertices of valence two. For a non-
vertex point p ∈ G, let d1 be the smallest nonzero death time in ΨG (p). Then there ex-
ist a pair of vertices v1 and v2 of valence at least three such that, for every other vertex
x ∈ G, either (i) d(p, x) = d1 + d(v1 , x) and the geodesic from p to x passes through v1 , (ii)
d(p, x) = L(Ep ) − d1 + d(v2 , x) and the geodesic from p to x passes through v2 , or (iii) Ep
is a self-loop based at x, p sits midway on this loop, and x has valence three, as in Figure B.1.


          Figure B.1: The smallest nonzero death time in ΨG (p) is d(p, w) rather than d(p, v).


     Case (iii) does not apply if x is not the base of a self-loop or is an upfork vertex (i.e.
giving rise to a death time) in ΦG (p). We can generally take v1 and v2 to be the endpoints of
Ep at distance d1 and L(Ep ) − d1 respectively, unless Ep is a self-loop with vertex x of valence
three, and p sits midway on this loop, in which case v1 = v2 will be the vertex adjacent to x,
see Figure B.1.

Proof. If Ep has distinct endpoints, then d1 is necessarily the distance from p to the closest
endpoint, and we can take v1 to be this endpoint and v2 to be the other endpoint. Since
every geodesic starting at p must exit via an endpoint of Ep , one of (i) or (ii) must hold.


    If Ep is a self-loop with vertex x, then since G is not a circle, x either has valence three
or valence at least four. In either event, if p is not in the middle of the loop, d1 = d(p, x) and
we can set v1 = x and v2 equal to whatever vertex we like, as (i) always holds. If p is in the
middle of the loop, then x is still an upfork for p so long as the valence of x is at least four,
in which case we can still set v1 = x. Lastly, if x has valence three, then d1 is the distance
from p to the vertex adjacent to x, which we denote v1 , and indeed all geodesics from p to
vertices distinct from x must pass through v1 .


    The following lemma shows how to relate a birth time in ΨG (p) to a sum of edge lengths
in G.

Lemma B.0.6. Let G be any metric graph, p ∈ G a basepoint, and (b, ·) a point in the
one-dimensional part of the persistence diagram ΨG(p). If p is a vertex, then either (i) 2b is the
length of a simple cycle in G containing p, (ii) b is equal to the length of a geodesic from p to a
leaf vertex e, or (iii) b is equal to $\frac{1}{2}\sum_{i \in U} L(E_i)$, where U is a multi-set of edges in which at
least one edge appears exactly once. If p is a non-vertex, let d1, v1 and v2 be as in Lemma
B.0.5. Then either (i) 2b is the length of a simple cycle in G containing p, (ii) (b − d1) is the
length of a geodesic from v1 to a leaf vertex e, (iii) (b − L(Ep) + d1) is the length of a geodesic
from v2 to a leaf vertex e, (iv) (b − d1) is equal to $\frac{1}{2}\sum_{i \in U} L(E_i)$, where U is a multi-set of
edges in which at least one edge appears exactly once, or (v) (b − L(Ep) + d1) is equal to
$\frac{1}{2}\sum_{i \in U} L(E_i)$, where U is a multi-set of edges in which at least one edge appears exactly once.


   The significance of asserting that an edge appears exactly once in a multi-set U is that
the sum $\frac{1}{2}\sum_{i \in U} L(E_i)$ will include some terms with fractional coefficients. In those cases
when the birth time does not correspond to a simple cycle, the geodesics and sums of edges
appearing do not contain Ep.

Proof. The point (b, ·) corresponds to a downfork q in the Reeb graph ΦG (p) at distance b
from p. The downfork q may either be a leaf vertex or a point (not necessarily a vertex) of
valence at least two. We will assume that the geodesics from p to q pass through v1 or v2
(cases (i) and (ii) in Lemma B.0.5), as if Ep is a self-loop with q as its base, then 2b is the
length of this self-loop, and the result has been demonstrated.


   Suppose first that q is a leaf vertex. If p is a vertex then b is the sum of the lengths of
the edges along this geodesic: see the left-hand side of Figure B.2. If p is not a vertex, then
the geodesic from p to q passes through v1 or v2 , so either b − d1 or b − L(Ep ) + d1 is the
sum of the length of the edges along the geodesic from q to this vertex: see the right-hand
side of Figure B.2 (a similar picture applies when Ep is an edge loop). Note that when p is
a non-vertex, the edge Ep does not appear in the edges in the sum.

                      Figure B.2: The edges appearing in the sum are drawn in red.

   Suppose next that q is not a leaf vertex. As q is a downfork, there are at least two
geodesics from q to p. These geodesics either (a) first meet at p, or (b) meet before arriving at
p. In scenario (a), the sum of the two geodesics is a simple cycle of length 2b. See Figure
B.3. Note that this scenario encapsulates the case when Ep is a self-loop based at q.

                   Figure B.3: The simple cycle of length 2b is drawn in red.


   In scenario (b), if p is a vertex, then 2b is the sum of edge lengths as shown on the left-hand
side of Figure B.4. If p is a non-vertex, then either 2(b − d1 ) or 2(b − L(Ep ) + d1 ) is the
sum of edge lengths, as shown on the right-hand side of Figure B.4 (a similar picture applies
when v1 is not an endpoint of Ep , i.e. the third case of Lemma B.0.5). Observe that the
edge(s) containing q appears only once in this sum. Dividing both sides by 2 demonstrates
the remaining cases of the lemma. Note that, as before, when p is a non-vertex, the edge Ep
does not appear in the edges in the sum.
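
   As a concrete instance of scenario (b), consider a hypothetical "lollipop" graph, which is our own illustration rather than an example from the text: a leaf edge E_1 of length s joining the leaf vertex p to a vertex w of valence three, together with a self-loop E_2 of length c based at w. The only downfork in Φ_G(p) is the point q on the loop antipodal to w, and the two geodesics from q to p merge at w before reaching p.

```latex
\[
  b \;=\; d(p,q) \;=\; s + \tfrac{c}{2},
  \qquad
  2b \;=\; 2\,L(E_1) + L(E_2),
  \qquad
  b \;=\; \tfrac{1}{2} \sum_{i \in U} L(E_i),
  \quad U = \{E_1,\, E_1,\, E_2\}.
\]
```

   The edge E_2 containing q appears exactly once in U, so its length enters b with the fractional coefficient 1/2, while E_1 appears twice and contributes with coefficient 1.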

Figure B.4: The edges appearing in the sum are drawn in red or blue. Red edges appear once in the sum
and blue edges appear twice.


Remark B.0.7. Note that cases (ii) and (iv) in Lemma B.0.6 can be combined by asserting
that 2(b − d1) is the sum of the lengths of edges in G (with the same edge potentially
appearing more than once), without distinguishing whether those edges lie along a geodesic
from v1 to a leaf vertex or whether they run along a cycle in G. The same is true for cases (iii)
and (v). Though we will often combine these cases in this chapter, the distinction between
them will become important in Chapter C.


B.0.2      Case Analysis

In light of Corollary B.0.3, we can split our casework into two parts: comparing vertices on
the one hand, and comparing non-vertices on the other hand. In either case, our strategy will
be the same. We will assume that distinct basepoints produce the same persistence diagram
and then deduce the existence of a nontrivial linear equality in Z among edge lengths, thus
violating the hypothesis of the theorem.
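
   To make the hypothesis concrete: a nontrivial linear equality over Z among edge lengths L_1, ..., L_m is a choice of integers n_1, ..., n_m, not all zero, with n_1 L_1 + ... + n_m L_m = 0. The sketch below is a numerical heuristic of ours, not a procedure from the text. It searches coefficients in a bounded box and tests the sum against a floating-point tolerance, so it can exhibit a relation but cannot certify that none exists. The function name `find_integer_relation` and the parameters `bound` and `tol` are hypothetical.

```python
from itertools import product
import math

def find_integer_relation(lengths, bound=2, tol=1e-9):
    """Search for integers n_i, not all zero, with |n_i| <= bound and
    sum(n_i * lengths[i]) equal to 0 up to the tolerance tol.

    Returns the first coefficient tuple found, or None.
    """
    for coeffs in product(range(-bound, bound + 1), repeat=len(lengths)):
        if any(coeffs) and abs(sum(n * L for n, L in zip(coeffs, lengths))) < tol:
            return coeffs
    return None

# Two edges of equal length: L1 - L2 = 0 is a nontrivial relation.
equal_edges = find_integer_relation([1.5, 1.5, 4.0])

# The lengths 1, sqrt(2), sqrt(3) are linearly independent over Q,
# so no relation exists, whatever the coefficient bound.
generic = find_integer_relation([1.0, math.sqrt(2), math.sqrt(3)], bound=3)
```

   The first call returns some nonzero coefficient tuple, reflecting the simplest kind of equality used in the case analysis (two edges of the same length), while the second returns None.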

Proposition B.0.8. If v, w are distinct vertices in G, and ΨG (v) = ΨG (w), then there is a
nontrivial linear equality among edge lengths of G.

Proof. Suppose first that one of v or w has valence one, so that the other does as well by
Lemma B.0.2. Then v and w are both leaf vertices, sitting on edges [v, v′] and [w, w′]. By
Lemma 2.2.29, the smallest nonzero death time in the persistence diagram of v or w is the
distance to the closest vertex in G of valence at least three. Since G is connected and does
not consist of a single edge, both v′ and w′ have valence at least three. Thus the smallest
nonzero death times in ΨG(v) and ΨG(w) are the lengths of [v, v′] and [w, w′] respectively.
ΨG(v) = ΨG(w) then implies that these two edges have the same length, a nontrivial linear
equality.


   Suppose next that v and w are distinct vertices of valence three or more. Note that in
a connected graph containing at least two distinct vertices of valence at least three, and in
which there are no vertices of valence two, every vertex is adjacent to a vertex of valence at
least three. Moreover, observe that every path from v or w to a vertex of valence at least
three can only hit vertices of valence at least three along the way, as there are no vertices of
valence two and a vertex of valence one is a dead-end. Let v′ and w′ be the closest vertices
to v and w respectively with valence at least three. By the prior observation, v′ and w′ must
be adjacent to v and w respectively. This tells us that the smallest nonzero death time in
ΨG(v) or ΨG(w) is the length of the edge [v, v′] or [w, w′]. The equality ΨG(v) = ΨG(w) then
implies that these edges have the same length, a nontrivial linear equality unless v′ = w and
w′ = v.


   If v′ = w and w′ = v, then v and w are the closest vertices of valence at least three to
each other, and in fact are the unique closest such vertices, otherwise a nontrivial linear
equality among edge lengths has occurred. Let u be the closest vertex to the edge [v, w]
among all vertices distinct from v and w. If there is more than one such closest vertex, or
if u is equidistant from v and w, we will have found a nontrivial linear equality among edge
lengths. Otherwise, there is a unique closest vertex u, and it is strictly closer to one of v or
w, say v. By Lemma B.0.4, two possibilities emerge: either u is an upfork (and potentially
also a downfork, which is irrelevant) in ΦG(v), or it is not an upfork, and hence must be a
downfork. The latter case, that u is a downfork, implies the existence of distinct geodesics
from u to v, and hence a nontrivial linear equality among edge lengths. We claim that the
former case is impossible: if u is an upfork, the dictionary of 2.2.5 tells us that the distance
d(u, v) is a death time in ΨG(v). Since v and w are the unique closest vertices to each other,
d(u, v), d(u, w) > d(v, w). We have chosen u so that d(v, w) < d(u, v) < d(x, w) for any
vertex x ≠ u, v of valence at least three. Thus, by Lemma 2.2.29, which tells us that death
times in ΨG(w) correspond to distances from w to vertices x of valence at least three, we see
that there is no such vertex x for which d(w, x) = d(u, v), and hence the death time d(u, v)
in ΨG(v) cannot be matched by anything in ΨG(w), so their barcodes cannot be equal.


   For the remainder of the proof, we will want to show the analogous result for non-vertex
points. It will be useful to distinguish three kinds of basepoints p ∈ G. The first kind is
when Ep is a leaf edge, i.e. one of its endpoints has valence one.

                      Figure B.5: Case A. [Figure labels: Ep, v, p.]


   The second kind is when p sits on an interior edge Ep , and both endpoint vertices of Ep
are upforks in ΦG (p).

Figure B.6: Case B. Note that p is not a vertex, but the edge Ep is drawn as it appears in the Reeb graph
ΦG (p), with height proportional to the distance from p. [Figure labels: w, v, Ep, p.]


   In the third kind, p again sits on an interior edge Ep , but whereas the closer vertex v is an
upfork the further one w is not, and hence must be a downfork by Lemma B.0.4. This means
that there are at least two geodesics from w to p starting along different edges adjacent to
w. We claim that one of these geodesics is the subsegment of Ep from w to p, as in the
picture below. If not, both geodesics pass through v, and we find that there are two distinct
geodesics from w to v, giving a nontrivial linear equality among edge lengths, which we have
assumed is not the case.

                      Figure B.7: Case C. [Figure labels: w, Ep, v, p.]


   One quick remark before we continue our proof is that there are two very simple pieces of
geometric data that can be read off a persistence diagram.

Remark B.0.9. Since our graphs have no self-loops, Lemma B.0.5 implies that the smallest
nonzero death time in ΨG (p) is the distance from p to the closest vertex of valence at least
three. We will often denote this distance by d1 in the following proofs. Secondly, for any
point p ∈ G, the zero-dimensional part of the persistence diagram ΨG (p) contains a single
point (0, D), where D is the radius of the metric space at p – the furthest distance from p
to another point in G.
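
As a rough illustration of the two quantities in Remark B.0.9, the following Python sketch computes the radius D and the distance d1 for a basepoint on a small metric graph. The edge-list representation, the function names, the reserved vertex name "p", and the example graph are all assumptions made for this illustration; the sketch only computes distances in the graph and does not compute the persistence diagram ΨG (p) itself.

```python
import heapq

def distances_from_basepoint(edges, edge_index, offset):
    """Shortest-path distances from a point p placed on edges[edge_index],
    at distance `offset` from that edge's first endpoint (0 < offset < length).
    Assumes no vertex of the graph is already named "p"."""
    u, v, length = edges[edge_index]
    # Split the chosen edge at p so that p becomes an ordinary vertex.
    split = [e for i, e in enumerate(edges) if i != edge_index]
    split += [(u, "p", offset), ("p", v, length - offset)]
    adjacency = {}
    for a, b, w in split:
        adjacency.setdefault(a, []).append((b, w))
        adjacency.setdefault(b, []).append((a, w))
    dist = {"p": 0.0}
    heap = [(0.0, "p")]
    while heap:  # Dijkstra's algorithm
        d, a = heapq.heappop(heap)
        if d > dist[a]:
            continue
        for b, w in adjacency[a]:
            if d + w < dist.get(b, float("inf")):
                dist[b] = d + w
                heapq.heappush(heap, (d + w, b))
    return dist, split

def radius_and_d1(edges, edge_index, offset):
    dist, split = distances_from_basepoint(edges, edge_index, offset)
    # On an edge of length w whose endpoints are at distances a and b from p,
    # the point furthest from p is at distance (a + b + w) / 2.
    radius = max((dist[a] + dist[b] + w) / 2 for a, b, w in split)
    # Valences are counted in the original graph, before splitting at p.
    valence = {}
    for a, b, _ in edges:
        valence[a] = valence.get(a, 0) + 1
        valence[b] = valence.get(b, 0) + 1
    d1 = min(dist[x] for x, k in valence.items() if k >= 3)
    return radius, d1

# Hypothetical example: a tripod with centre c and legs of length 1, 2, 3;
# p sits on the longest leg, at distance 1 from c.
edges = [("c", "x", 1.0), ("c", "y", 2.0), ("c", "z", 3.0)]
radius, d1 = radius_and_d1(edges, edge_index=2, offset=1.0)
print(radius, d1)  # 3.0 1.0
```

Here the furthest point from p is the leaf y, at distance 3, and the closest vertex of valence at least three is c, at distance 1.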

Definition B.0.10. For an edge E in a metric graph (G, dG ), L(E) will denote the length
of E.

     Moving on, the following lemma will be useful in the case analysis to come.

Lemma B.0.11. Let p, q be two non-vertex points in G sitting on the same non-boundary
edge E. If ΨG (p) = ΨG (q) then there is a nontrivial linear equality among edge lengths of
G.

Proof. Consider Figure B.8. The closest vertex of valence at least three to p is v: otherwise
it would be w, and since d(q, w) < d(p, w) we could deduce that ΨG (p) ≠ ΨG (q) from
Lemma 2.2.31. Similarly, the closest vertex of valence at least three to q is w. From Lemma
2.2.31, we know that d1 = d(p, v) = d(q, w). Note that p must be strictly closer to v than q is,
and q must be strictly closer to w than p is, for if, say, there is a geodesic from q to v of length
≤ d(p, v), then this geodesic passes through w, meaning that d1 = d(q, w) < d(p, v) = d1 ,
a contradiction.
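
Figure B.8: When E = Ep = Eq . Note that ΨG (p) = ΨG (q) and Lemma 2.2.31 imply that p, q are
equidistant to their closest vertices of valence at least three, which are v and w respectively.
[Figure labels: the points v, p, q, w in this order along E, with the segments from v to p and
from q to w both marked d1 , and a further vertex u.]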


     Take u to be a closest vertex to either v or w among the other vertices in G, and suppose
without loss of generality that it is closer to v. If there is more than one choice for u, or if it
is equidistant from v and w, then we have a nontrivial linear equality among edge lengths, so
otherwise we may assume that d(u, v) < d(u, w) and d(u, v) < d(x, v), d(x, w) for any fourth
vertex x of valence at least three. If u is an upfork in the Reeb graph ΦG (p), then by Lemma
2.2.29 it gives rise to a death time in ΨG (p) strictly greater than d1 , and strictly smaller than
any other death time in ΨG (q), so that ΨG (p) ≠ ΨG (q), contrary to our hypothesis. Thus u
is a downfork, so that there are two geodesics from u to p, and either both geodesics pass
through v or one passes through w. It is impossible for one to pass through w, as any path
from u to p passing through w has length strictly longer than L(E) + d1 , and hence cannot
be a geodesic. Thus both pass through v, and hence there are distinct geodesics from
u to v, implying some nontrivial linear equality among edge lengths.

   The following two propositions demonstrate the result of Proposition B.0.8 for non-
vertices. The first proposition deals with pairs of basepoints in distinct cases (among the
cases A, B, and C, as defined above), and the second proposition deals with pairs of basepoints
of the same case. It is important to note that our persistence diagrams come labelled by
dimension but do not tell us if a point comes from ordinary, relative, or extended persistence.
Some of the following casework emerges as a result of this ambiguity.


B.0.3      Comparing Distinct Cases

Proposition B.0.12. If v, w are distinct non-vertex points in G of distinct cases, and
ΨG (v) = ΨG (w), then there is a nontrivial linear equality among edge lengths of G.

Proof. Our proof consists of three parts, according to which pair of cases our points belong
to: (1) cases A and B, (2) cases B and C, and (3) cases A and C. In all these cases,
ΨG (v) = ΨG (w) and Remark B.0.9 imply that the distances from v and w to their closest
vertices of valence at least three are equal; we denote this common distance by d1 .


   (1) Cases A and B:


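Figure B.9: Cases A and B. [Figure labels: p on the edge Ep , with adjacent segments of length
d2 and d1 ; q on the edge Eq , whose endpoints are w and v, with adjacent segments of length r2 and d1 .]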


   Suppose p is of case A and q is of case B, with the edge lengths as above. We know that


(B.1)                                      d1 + d2 = L(Ep )


(B.2)                                      d1 + r2 = L(Eq )


   Clearly, Ep ≠ Eq as one is a leaf edge and the other is not.


   By hypothesis, r2 is a death time in ΨG (q), corresponding in ΨG (p) to the distance from
p to a vertex of valence at least three. Since the geodesic from p to any such vertex passes
through the segment of Ep of length d1 , we can deduce that r2 − d1 is the sum of the lengths
of edges in G, where none of these edges is Ep by construction, and none of them are Eq
either, as r2 − d1 < r2 + d1 = L(Eq ). Thus there is a set of edges U along a geodesic between
vertices in G, omitting Ep and Eq , for which

(B.3)                         r_2 - d_1 = \sum_{i \in U} L(E_i)


   Lastly, ΨG (p) contains the death time d2 . This cannot be a point in zero-dimensional
persistence, i.e. the furthest point from p cannot be the leaf vertex at distance d2 . Indeed,
the point q is necessarily further away than p from this leaf vertex, so ΨG (q) would contain
a larger zero-dimensional death time, making ΨG (p) = ΨG (q) impossible. Thus d2
is a death time in one-dimensional persistence for ΨG (p), and hence also for
ΨG (q). By Lemma B.0.6, three possibilities emerge: either 2d2 is the length of a simple
cycle in G, or 2(d2 − d1 ) is the sum of edges in G, omitting Eq , or 2(d2 − r2 ) is the sum
of edges in G, omitting Eq (here we have combined cases (ii) and (iv), as well as (iii) and (v)).


   In the first case, taking 2(B.1) + (B.3) - (B.2) gives

                    2d_2 = 2L(E_p) + \sum_{i \in U} L(E_i) - L(E_q)


   Now, this should equal the length of a simple cycle in G. However, Ep cannot appear
among the edges of this cycle, as it is a leaf edge. This implies a nontrivial linear equality
among edge lengths.


   In the second case, we have that

(B.4)                         d_2 - d_1 = \frac{1}{2} \sum_{i \in S} L(E_i)

where this sum of edges omits Eq . Taking (B.2) - (B.1) - (B.3) + (B.4), we obtain

          0 = L(E_q) - L(E_p) + \frac{1}{2} \sum_{i \in S} L(E_i) - \sum_{i \in U} L(E_i)


   Multiplying both sides by two gives a nontrivial linear equality among edge lengths. Indeed,
the right side cannot cancel out because neither collection U nor S contains Eq .


   In the third case, we have that

(B.5)                         d_2 - r_2 = \frac{1}{2} \sum_{i \in S} L(E_i)

and again this sum of edges omits Eq . Taking (B.2) - (B.1) + (B.5) gives

                    0 = L(E_q) - L(E_p) + \frac{1}{2} \sum_{i \in S} L(E_i)

   Multiplying by two, as before, produces a nontrivial linear equality among edge lengths.
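
The three linear combinations used for Cases A and B can be checked mechanically. The sketch below is only a bookkeeping aid, not part of the proof: each equation is stored as a pair of coefficient dictionaries (left side, right side), and the symbol names (such as sum_U for the sum over U) and the helper combine are hypothetical names chosen here. Note that sum_S stands for a different collection S in (B.4) and in (B.5).

```python
from fractions import Fraction

def combine(*terms):
    """Return the sum of coeff * equation over (coeff, equation) pairs.
    An equation is a pair (lhs, rhs) of {symbol: coefficient} dictionaries."""
    lhs, rhs = {}, {}
    for coeff, (left, right) in terms:
        for symbol, c in left.items():
            lhs[symbol] = lhs.get(symbol, 0) + coeff * c
        for symbol, c in right.items():
            rhs[symbol] = rhs.get(symbol, 0) + coeff * c
    # Drop symbols whose coefficients cancelled.
    strip = lambda d: {s: c for s, c in d.items() if c != 0}
    return strip(lhs), strip(rhs)

half = Fraction(1, 2)
B1 = ({"d1": 1, "d2": 1}, {"L(Ep)": 1})    # d1 + d2 = L(Ep)
B2 = ({"d1": 1, "r2": 1}, {"L(Eq)": 1})    # d1 + r2 = L(Eq)
B3 = ({"r2": 1, "d1": -1}, {"sum_U": 1})   # r2 - d1 = sum over U
B4 = ({"d2": 1, "d1": -1}, {"sum_S": half})  # d2 - d1 = (1/2) sum over S
B5 = ({"d2": 1, "r2": -1}, {"sum_S": half})  # d2 - r2 = (1/2) sum over S

case1 = combine((2, B1), (1, B3), (-1, B2))
case2 = combine((1, B2), (-1, B1), (-1, B3), (1, B4))
case3 = combine((1, B2), (-1, B1), (1, B5))

# First case: the left side reduces to 2*d2.
assert case1 == ({"d2": 2}, {"L(Ep)": 2, "sum_U": 1, "L(Eq)": -1})
# Second and third cases: the left sides cancel entirely.
assert case2 == ({}, {"L(Eq)": 1, "L(Ep)": -1, "sum_U": -1, "sum_S": half})
assert case3 == ({}, {"L(Eq)": 1, "L(Ep)": -1, "sum_S": half})
```

In each case the surviving right-hand side is exactly the displayed relation among edge lengths; whether that relation is nontrivial still depends on the arguments in the text about which edges the collections U and S omit.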


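   (2) Cases B and C:

Figure B.10: Cases B and C. [Figure labels: vertices w1 , v1 , w2 , v2 ; p on the edge Ep , with adjacent
segments of length d2 and d1 ; q on the edge Eq , with adjacent segments of length r2 and d1 .]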


   In this case, we see that


(B.6)                                  d1 + d2 = L(Ep )


(B.7)                                 d1 + r2 = L(Eq )


   To start, we may assume that Ep ≠ Eq by appealing to Lemma B.0.11. As before, d2 is a death
time in ΨG (p), corresponding in ΨG (q) to the distance from q to a vertex of valence three or
greater. Without loss of generality this geodesic passes through v2 , as by the hypotheses of
case C any geodesic passing through w2 can be re-routed to pass through v2 and have the
same length. We see that

(B.8)                         d_2 - d_1 = \sum_{i \in U} L(E_i)


where this collection of edges omits Ep and Eq . We can also see that the birth time r2 shows
up in the one-dimensional persistence of ΨG (q), and hence also in ΨG (p). As before, we end
up with three possibilities. If there is a simple cycle in G of length 2r2 , then taking 2(B.7)
+ (B.8) - (B.6) gives
                    2r_2 = 2L(E_q) + \sum_{i \in U} L(E_i) - L(E_p)

This should equal the length of a simple cycle in G. However, in such a cycle each edge
shows up once, whereas the right-hand side above has Eq appear twice, so that we must
have a nontrivial linear equality among edge lengths.

     Otherwise, Lemma B.0.6 guarantees that either

(B.9)                         r_2 - d_1 = \frac{1}{2} \sum_{i \in S} L(E_i)

or

(B.10)                        r_2 - d_2 = \frac{1}{2} \sum_{i \in S} L(E_i)

In either case, the sum of edges omits Eq , and the resulting analysis is the same as when
comparing cases A and B.


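     (3) Cases A and C:

Figure B.11: Cases A and C. [Figure labels: p with adjacent segments of length d2 and d1 ; q with
adjacent segments of length r2 and d1 ; vertices w and v; edge label Ep .]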


     Firstly, we see that


(B.11)                                d1 + d2 = L(Ep )


     Observe next that


(B.12)                        L(Eq ) = r2 + d1 = d(w, v) + d1

   so that


(B.13)                              d1 = L(Eq ) − d(w, v)


   Now, d2 is a birth time for p, so ΨG (p) = ΨG (q) implies that it is a birth time for q. By
Lemma B.0.6, and the fact that any geodesic passing from q to w can be rerouted to pass
through v, we see that


(B.14)                        d_2 - d_1 = \frac{1}{2} \sum_{i \in U} L(E_i)

for some multi-set of edges. Then (B.11) − 2(B.13) − (B.14) gives

          0 = L(E_p) - 2L(E_q) + 2d(w, v) - \frac{1}{2} \sum_{i \in U} L(E_i)

   Since Ep ≠ Eq , this is a nontrivial linear equality among edge lengths.


B.0.4     Comparing Identical Cases

Proposition B.0.13. If v, w are distinct non-vertex points in G of the same kind, and
ΨG (v) = ΨG (w), then there is a nontrivial linear equality among edge lengths of G.

Proof. Cases A and A:

Figure B.12: Case A and Case A. [Figure labels: p with adjacent segments of length d2 and d1 ;
q with adjacent segments of length r2 and d1 .]


   Suppose p and q are both of case A. Note that for every d ∈ [0, L(Ep )], the edge Ep
contains a unique point x at distance d from its unique vertex of valence at least three,
and d will necessarily be the smallest nonzero death time in ΨG (x) by Lemma B.0.5;
the same holds for Eq . If E = Ep = Eq then, again by Lemma B.0.5, both p and q would be the
same distance d from the non-leaf vertex of E, implying p = q. Thus, if p ≠ q, we may
deduce Ep ≠ Eq .


   As d2 and r2 are birth times for p and q respectively, ΨG (p) = ΨG (q) implies that d2 is
a birth time for q and r2 is a birth time for p. If d2 = r2 then L(Ep ) = L(Eq ), a nontrivial
linear equality. Otherwise, we have

(B.15)                        d_2 - d_1 = \frac{1}{2} \sum_{i \in U} L(E_i)


(B.16)                        r_2 - d_1 = \frac{1}{2} \sum_{i \in S} L(E_i)

   Additionally, we have


(B.17)                                  d1 + d2 = L(Ep )

(B.18)                                 d1 + r2 = L(Eq )


   Then −(B.15) + (B.16) + (B.17) − (B.18) gives

     0 = -\frac{1}{2} \sum_{i \in U} L(E_i) + \frac{1}{2} \sum_{i \in S} L(E_i) + L(E_p) - L(E_q)

   Since \frac{1}{2} \sum_{i \in U} L(E_i) = d_2 - d_1 < d_2 + d_1 = L(E_p), the multi-set of edges
indexed by U does not contain Ep . As Ep ≠ Eq , the only way for the right-hand side to cancel
out is if a nontrivial linear equality among edge lengths has occurred.


   Cases B and B:


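Figure B.13: Case B and Case B. [Figure labels: p on the edge Ep , with adjacent segments of
length d2 and d1 ; q on the edge Eq , with adjacent segments of length r2 and d1 .]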


   We observe that


(B.19)                                    d1 + d2 = L(Ep )


(B.20)                                    d1 + r2 = L(Eq )


   Using Lemma B.0.11, we assume without loss of generality that Ep ≠ Eq . If d2 = r2
then L(Ep ) = L(Eq ), a nontrivial linear equality. Thus let us assume, again without loss of
generality, that d2 < r2 . Since d2 is a death time for p, and hence corresponds to the distance
from q to a vertex, and since the corresponding geodesic must pass along the segment of
length d1 adjacent to q due to length considerations, we deduce further that

(B.21)                        d_2 - d_1 = \sum_{i \in S} L(E_i)


Since d2 − d1 < L(Ep ) and d2 − d1 < d2 < r2 < L(Eq ), none of the edges present in the sum
can be Ep or Eq .


   On the other hand, since r2 is a death time for q, this must correspond to a geodesic
from p to a vertex in G, passing either (i) through the segment of length d1 or (ii) through
the segment of length d2 . In (i), we have that

(B.22)                        r_2 - d_1 = \sum_{i \in T} L(E_i)


By construction, none of the edges in this sum are Ep , as the geodesic of length r2 starts at
the non-vertex point p on Ep , and subtracting away d1 removes the length of the segment
from p to a boundary vertex of Ep . Moreover, r2 − d1 < L(Eq ), so this sum of edges omits
Eq as well. Taking (B.19) - (B.20) - (B.21) + (B.22) gives


          0 = L(E_p) - L(E_q) - \sum_{i \in S} L(E_i) + \sum_{i \in T} L(E_i)

   a nontrivial linear equality among edge lengths.

   In (ii), we have

(B.23)                        r_2 - d_2 = \sum_{i \in T} L(E_i)


the sum of edge lengths not including Ep or Eq (by the same argument as in the prior case).
Taking (B.23) + (B.19) - (B.20) gives


                    0 = \sum_{i \in T} L(E_i) + L(E_p) - L(E_q)

   again a nontrivial linear equality among edge lengths.
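
For the reader who wishes to verify the two combinations used in Cases B and B, the following worked check spells out how the left-hand sides cancel. It uses only equations (B.19) through (B.23) as stated above and introduces no new assumptions.

```latex
\begin{align*}
\text{(i)}\quad
 &\underbrace{(d_1 + d_2)}_{\text{(B.19)}}
  - \underbrace{(d_1 + r_2)}_{\text{(B.20)}}
  - \underbrace{(d_2 - d_1)}_{\text{(B.21)}}
  + \underbrace{(r_2 - d_1)}_{\text{(B.22)}} = 0, \\
 &\text{so}\quad
  0 = L(E_p) - L(E_q) - \sum_{i \in S} L(E_i) + \sum_{i \in T} L(E_i); \\[1ex]
\text{(ii)}\quad
 &\underbrace{(r_2 - d_2)}_{\text{(B.23)}}
  + \underbrace{(d_1 + d_2)}_{\text{(B.19)}}
  - \underbrace{(d_1 + r_2)}_{\text{(B.20)}} = 0, \\
 &\text{so}\quad
  0 = \sum_{i \in T} L(E_i) + L(E_p) - L(E_q).
\end{align*}
```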


   Cases C and C:


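Figure B.14: Case C and Case C. [Figure labels: p on the edge Ep , with adjacent segments of
length d2 and d1 ; q on the edge Eq , with adjacent segments of length r2 and d1 .]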


   By Lemma B.0.11, we may assume Ep ≠ Eq .


   As d2 and r2 are birth times for p and q respectively, ΨG (p) = ΨG (q) implies that d2 is
a birth time for q and r2 is a birth time for p. If d2 = r2 then L(Ep ) = L(Eq ), a nontrivial
linear equality. Otherwise, we have

(B.24)                        d_2 - d_1 = \frac{1}{2} \sum_{i \in U} L(E_i)


(B.25)                        r_2 - d_1 = \frac{1}{2} \sum_{i \in S} L(E_i)

   Additionally, we have


(B.26)                                 d1 + d2 = L(Ep )


(B.27)                                 d1 + r2 = L(Eq )


   Then −(B.24) + (B.25) + (B.26) − (B.27) gives

     0 = -\frac{1}{2} \sum_{i \in U} L(E_i) + \frac{1}{2} \sum_{i \in S} L(E_i) + L(E_p) - L(E_q)

   Since \frac{1}{2} \sum_{i \in U} L(E_i) = d_2 - d_1 < d_2 + d_1 = L(E_p), the multi-set of edges
indexed by U does not contain Ep . As Ep ≠ Eq , the only way for the right-hand side to cancel
out is if a nontrivial linear equality among edge lengths has occurred.


   Proposition 4.5.5 now follows from Propositions B.0.8, B.0.12, B.0.13, and Corollary
B.0.3.
                                                                APPENDIX C


Proof of Proposition 4.5.5 for χG


Let G be a metric graph with no self-loops and at least three vertices of valence distinct
from two. We have shown that ΨG (p) = ΨG (q) implies the existence of a nontrivial linear
equality among the edge lengths of G. Moreover, we see from the proof in Chapter B that
the pairing between birth and death times is not used, and hence it suffices to assume
B(ΨG (p)) = B(ΨG (q)) and D(ΨG (p)) = D(ΨG (q)), where B(·) and D(·) are the multisets
of birth and death times, respectively. This almost implies the same injectivity result for
the Euler curves χG , as the death and birth times of ΨG show up as points of left- and
right-discontinuity in χG . However, it is possible for intervals in the barcode to line up so as
to cancel out points of discontinuity, so that certain birth and death times cannot be read
from χG . For example, the barcode in Figure C.1 gives a constant Euler curve.



              Figure C.1: A non-constant barcode giving rise to a constant Euler curve.


   Our aim is therefore to show the following proposition:

Proposition C.0.1. Let G be a metric graph that is not a figure-eight, but which may have
self-loops, and let p, q ∈ G be two points with B(ΨG (p)) ≠ B(ΨG (q)) or D(ΨG (p)) ≠ D(ΨG (q)).
If χG (p) = χG (q), then there is a nontrivial linear equality among the edge lengths of G.


   When G is a figure-eight, this is still true except for the pair of points p, q ∈ G that sit
in the middle of each loop.

   The base of our proof is to show that this cancellation of endpoints gives rise to nontrivial
linear equalities among the edge lengths of G. As in the prior proof, it will be useful to
distinguish different cases.


C.0.1      The three cases

We now outline the three cases needed for our proof.


   Case (I): The point p is in the middle of a self-loop or the middle of a leaf edge, as in
Figure C.2.

Figure C.2: Case (I). [Two panels; figure labels: v2 , p, v1 in the first panel and p, v1 in the second.]


   Case (II): The Reeb graph ΦG (p) contains a downfork non-leaf vertex y, Ep is an interior
edge that is not a self-loop, and the two geodesics from y to p enter through distinct sides of
Ep . See Figure C.3. Moreover, there is no cancellation in χG (p).

                      Figure C.3: Case (II). [Figure labels: y, v2 , p, v1 .]


   Case (III): ΨG (p) exhibits cancellation but the smallest nonzero death time, d1 , is not
cancelled out. See Figure C.4.

Figure C.4: Case (III). Note that the second birth time b2 and the second death time d2 cancel out, but
d1 does not cancel out with b1 . [Barcode ΨG (p) with axis marks 0, d1 , d2 and b1 , b2 , b3 .]


C.0.2      Technical Results

In this section, we give important definitions and prove some useful lemmas. The following
proof will also make heavy use of Lemmas B.0.5 and B.0.6 from Chapter B.

Corollary C.0.2 (of Lemma B.0.5). If the smallest nonzero death time d1 in ΨG (p) is not
the distance from p to its closest vertex v1 of valence at least three, then p is in Case (I).
Thus, in case (III), we can assume that d1 is the distance d(p, v1 ).

   In the next definition and lemma, we see how a cancellation event gives rise to an equation
among the edge lengths of G.

Definition C.0.3. We will say that χG (p) exhibits cancellation if a birth time is equal to a
death time in ΨG (p), so that the corresponding intervals overlap and the endpoints cannot
be recovered in the extended Euler curve.
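
To make Definition C.0.3 concrete, here is a toy Python sketch. It assumes, purely for illustration, that each bar is a half-open interval [b, d) and that the Euler curve at time t is the alternating count (−1)^dim of the bars alive at t; the conventions for the extended Euler curve used in this thesis may differ, and the bar lists below are made-up examples rather than the barcode of any particular graph. The function names are hypothetical.

```python
def euler_curve(bars, t):
    """Alternating count of the bars alive at time t.
    Each bar is (dim, birth, death), treated as the half-open interval [birth, death)."""
    return sum((-1) ** dim for dim, birth, death in bars if birth <= t < death)

def discontinuities(bars, eps=1e-9):
    """Bar endpoints at which the curve actually jumps.
    eps must be smaller than the gap between distinct endpoints."""
    endpoints = sorted({x for _, b, d in bars for x in (b, d)})
    return [x for x in endpoints
            if euler_curve(bars, x) != euler_curve(bars, x - eps)]

# A death time equal to a birth time (both at t = 1): the endpoint is hidden.
cancelling = [(0, 0.0, 1.0), (0, 1.0, 2.0)]
# No coincidence of endpoints: every endpoint is visible.
generic = [(0, 0.0, 1.0), (0, 1.5, 2.0)]

print(discontinuities(cancelling))  # [0.0, 2.0]
print(discontinuities(generic))     # [0.0, 1.0, 1.5, 2.0]
```

In the first list the curve is constant on [0, 2), so the shared endpoint t = 1 cannot be read off from the curve, which is the phenomenon the definition isolates.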

Lemma C.0.4. Let p ∈ G be a non-vertex point, and let d1 , v1 , v2 be as in Lemma B.0.5.
Suppose that there is cancellation in χG (p). Then there exists a non-leaf vertex x ∈ G for
which one of the following ten cases must hold:

   1. d_1 = L(\text{simple cycle})/2 - d(v_1, x).

   2. d(v_1, x) = d(v_1, e) for some leaf vertex e.

   3. d(v_1, x) = \frac{1}{2} \sum_{i \in U} L(E_i) for a multi-set of edges U with at least one edge
      appearing once.

   4. 2d_1 = L(E_p) + d(v_2, e) - d(v_1, x) for a leaf vertex e.

   5. 2d_1 = L(E_p) + \frac{1}{2} \sum_{i \in U} L(E_i) - d(v_1, x) for a multi-set of edges U with at
      least one edge appearing once.

   6. d_1 = L(E_p) + d(v_2, x) - L(\text{simple cycle})/2.

   7. 2d_1 = L(E_p) + d(v_2, x) - d(v_1, e) for some leaf vertex e.

   8. 2d_1 = L(E_p) + d(v_2, x) - \frac{1}{2} \sum_{i \in U} L(E_i) for a multi-set of edges U with at
      least one edge appearing once.

   9. d(v_2, x) = d(v_2, e) for some leaf vertex e.

  10. d(v_2, x) = \frac{1}{2} \sum_{i \in U} L(E_i) for a multi-set of edges U with at least one edge
      appearing once, as well as an edge containing v2 .

   Moreover, if v1 = v2 , i.e. Ep is a self-loop, then only cases (1) through (3) are possible.
If Ep is a leaf edge, then only cases (2) through (4) can hold, and case (4) becomes 2d1 +
d(v1 , x) − L(Ep ) = 0.

Proof. A cancellation event occurs when a death time is equal to a birth time in ΨG (p).
This death time corresponds to the distance from p to a vertex x, and d(p, x) is either equal
to d1 + d(v1 , x) or L(Ep ) − d1 + d(v2 , x). There are five possibilities for the associated birth
time, as per Lemma B.0.6. The ten cases emerge as per Figure C.5.
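
  Case   Death time d                     Birth time b                                      Equation from b = d
  (1)    d_1 + d(v_1, x)                  L(\gamma)/2                                       d_1 = L(\gamma)/2 - d(v_1, x)
  (2)    d_1 + d(v_1, x)                  d_1 + d(v_1, e)                                   d(v_1, x) = d(v_1, e)
  (3)    d_1 + d(v_1, x)                  d_1 + \frac{1}{2} \sum L(E_i)                     d(v_1, x) = \frac{1}{2} \sum L(E_i)
  (4)    d_1 + d(v_1, x)                  L(E_p) - d_1 + d(v_2, e)                          2d_1 = L(E_p) + d(v_2, e) - d(v_1, x)
  (5)    d_1 + d(v_1, x)                  L(E_p) - d_1 + \frac{1}{2} \sum L(E_i)            2d_1 = L(E_p) - d(v_1, x) + \frac{1}{2} \sum L(E_i)
  (6)    L(E_p) - d_1 + d(v_2, x)         L(\gamma)/2                                       d_1 = L(E_p) + d(v_2, x) - L(\gamma)/2
  (7)    L(E_p) - d_1 + d(v_2, x)         d_1 + d(v_1, e)                                   2d_1 = L(E_p) + d(v_2, x) - d(v_1, e)
  (8)    L(E_p) - d_1 + d(v_2, x)         d_1 + \frac{1}{2} \sum L(E_i)                     2d_1 = L(E_p) + d(v_2, x) - \frac{1}{2} \sum L(E_i)
  (9)    L(E_p) - d_1 + d(v_2, x)         L(E_p) - d_1 + d(v_2, e)                          d(v_2, x) = d(v_2, e)
  (10)   L(E_p) - d_1 + d(v_2, x)         L(E_p) - d_1 + \frac{1}{2} \sum L(E_i)            d(v_2, x) = \frac{1}{2} \sum L(E_i)

                      Figure C.5: The ten possibilities resulting from a cancellation.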
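
Each entry of Figure C.5 is obtained by setting a death time equal to a birth time and rearranging. As a worked instance, using only the two forms of the death time from the proof and the birth-time forms quoted from Lemma B.0.6, cases (4) and (8) arise as follows.

```latex
\begin{align*}
\text{Case (4):}\quad
 d_1 + d(v_1, x) &= L(E_p) - d_1 + d(v_2, e) \\
 \Longrightarrow\quad 2d_1 &= L(E_p) + d(v_2, e) - d(v_1, x); \\[1ex]
\text{Case (8):}\quad
 L(E_p) - d_1 + d(v_2, x) &= d_1 + \frac{1}{2} \sum_{i \in U} L(E_i) \\
 \Longrightarrow\quad 2d_1 &= L(E_p) + d(v_2, x) - \frac{1}{2} \sum_{i \in U} L(E_i).
\end{align*}
```

When both times are measured through the same side of Ep , as in cases (2), (3), (9), and (10), the terms involving d1 and L(Ep ) cancel and only a relation between distances from v1 or v2 remains.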


   When Ep is a self-loop, any geodesic from p to another vertex must pass through the segment
from p to v1 = v2 of length d1 . When Ep is a leaf edge, p does not sit on any self-loops, but
the leaf vertex generates a birth time, and v2 = e.

   In the next definition and lemma, we explore the implication of having the smallest
nonzero death time cancel out in the Euler curve.

Definition C.0.5. We will say that a birth or death time in ΨG (p) is undetectable in χG (p)
if it is not a point of discontinuity for χG (p). This is more extreme than cancellation, as a
birth or death time that appears with multiplicity needs to cancel out a number of times to
become undetectable.

Lemma C.0.6. Let p ∈ G be a non-vertex point on a graph G which is not a circle. Let d1
be the smallest nonzero death time in ΨG (p). If d1 is undetectable in χG (p) then either (i) p
sits on the middle of a leaf edge, or (ii) p sits on the middle of a self-loop whose vertex has
valence exactly four.

Proof. If p sits on an internal edge, the ball of radius d1 around p contains no downforks,
and hence there are no birth times before or equal to d1 . If p sits on a leaf edge, then the
only possible downfork is the leaf vertex; this appears at time d1 precisely when p is the
midpoint of the edge. If p sits on a self-loop, we consider two possibilities: either p is in the
middle of the self-loop or it is not. If p is in the middle of the self-loop and the loop vertex
has valence exactly four, then the loop vertex is both an upfork and a downfork and we have
cancellation. If p is in the middle of the self-loop and the loop vertex has valence more than
four, the death time L(Ep )/2 shows up with multiplicity at least two and hence is not canceled
out entirely. If p is in the middle of the self-loop and the loop vertex has valence three, then
d1 is the distance from p to the vertex adjacent to the loop vertex, which is strictly larger than
the smallest birth time. If p is not in the middle of the loop then the loop vertex is an upfork
at distance d1 , strictly smaller than any birth time. See Figure C.6.


Figure C.6: The two possible scenarios in which the smallest nonzero death time in ΨG (p) is undetectable
in χG (p).


   In the following definition and lemma, we show that ΦG (p) has a downfork vertex whenever
ΨG (p) has fewer than the maximal number of bars.

Definition C.0.7. For a metric graph G = (V, E), let N(G) = Σ_{v∈V} (val(v) − 2).


Lemma C.0.8. For any point p ∈ G, ΨG (p) has at most N (G) + val(p) − 1 bars. If ΨG (p)
has strictly fewer than N (G) + val(p) − 1 bars, there is a vertex v ∈ G which is a downfork
for p.

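Proof. Every vertex v can give rise to at most (val(v) − 2) death times in ΨG(p), with equality
precisely when there is a unique geodesic from p to v. This accounts for all points in ΨG(p)
except those that die at zero, of which there are (val(p) − 1). Hence, if ΨG(p) has strictly
fewer than N(G) + val(p) − 1 bars, some vertex v is reached by more than one geodesic from p,
and such a vertex is a downfork for p.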


C.0.3           Reduction to the three cases

Remark C.0.9. Note that death times at zero cannot be canceled out, so it is still always
possible to determine the valence of a point p from χG (p). Thus we can deal with vertex
and non-vertex points separately.

   The following lemma shows that, generically speaking, cancellation cannot occur for
vertex basepoints. This demonstrates Proposition C.0.1 for p and q vertices.

Lemma C.0.10. If p is a vertex, and there is a cancellation of a birth time and a death time in
χG (p), then there is a nontrivial linear equality among the edge lengths of G.

Proof. Suppose that d > 0 is both a birth and death time for p. By Lemma B.0.6, either
(i) 2d is the length of a simple cycle in G, (ii) d is the distance from p to a leaf vertex, or
(iii) d = (1/2) Σ_{i∈U} L(Ei), where some edge shows up only once in the sum. However, as d is a
death time, d = d(p, x) for some vertex x of valence at least three. Any of the three cases
then gives a nontrivial linear equality.

   Now, let us assume p and q are not vertices. The hypothesis of Proposition C.0.1 implies
that at least one of χG (p) or χG (q) exhibits cancellation. Let us explore the various possible
scenarios:


  • It is possible that the smallest death times in ΨG (p) and ΨG (q) cancel out in their
      Euler curves. Lemma C.0.6 then implies that both p and q belong to Case (I).

  • The smallest nonzero death time has cancelled out for p and not q, but q exhibits
      cancellation. Then p is in Case (I) and q is in Case (III).

  • The smallest nonzero death time has cancelled out for p and not q, and q doesn’t
      exhibit cancellation. Then the total number of bars in ΨG(q) is equal to the total
      number of left-discontinuities in χG(q), which by hypothesis is equal to the same count
      for χG(p), which is less than the total number of bars of ΨG(p), due to the cancellation.
      Thus ΨG(q) has fewer than the maximal number of bars. By Lemma C.0.8, p is in Case (I)
      and q is in Case (II).

  • The smallest nonzero death time has not cancelled out for either p or q, but both
      exhibit cancellation. Then both p and q are in Case (III).

  • The smallest nonzero death time has not cancelled out for either p or q, and only one,
      say p, exhibits cancellation. As we have seen, this implies that p is in Case (III) and q
      is in Case (II).


   We thus see that it suffices to consider the cases when p and q are one of the three cases
listed above, and moreover we can ignore the case when both are in Case (II), as this is not
relevant for our proof.

Remark C.0.11. It is important to note that if both χG(p) and χG (q) exhibit cancellation,
but the cancellation happens at the same birth/death time, no loss of discriminatory power
has occurred in passing from the multi-sets of birth and death times to the extended Euler
curve.


C.1      Cases (I) and (I)

In this case, p sits in the middle of a leaf-edge or self-loop Ep , with non-leaf vertex or loop
vertex v1 at distance L(Ep )/2. The same is true for q and the vertex w1 at distance L(Eq )/2.
See Figure C.7.


                     Figure C.7: The two possibilities for p and q, respectively.


   Suppose v1 = w1 . Then either both, one, or neither of Ep and Eq are leaf edges, see
Figure C.8.

                      Figure C.8: The four distinct possibilities when v1 = w1 .


   If both p and q sit on leaf edges, with leaf vertices v2 and w2, then p and q share, and are
equidistant from, all upforks and downforks in G besides v1, v2 and w2. After cancellation
of the smallest death time with the smallest birth time, the only potential difference between
ΨG(p) and ΨG(q) will be that L(Ep)/2 + L(Eq) is a birth time for p in ΨG(p), corresponding to
w2, and L(Eq)/2 + L(Ep) is a birth time for q in ΨG(q), corresponding to v2. As every death
time in ΨG(p) is of the form L(Ep)/2 + d(v1, x) for some non-leaf vertex x, and every death
time in ΨG(q) is of the form L(Eq)/2 + d(v1, y) for some non-leaf vertex y, these cannot cancel
out in χG(p) or χG(q) without giving rise to a nontrivial linear equality. Thus χG(p) = χG(q)
implies L(Ep)/2 + L(Eq) = L(Eq)/2 + L(Ep), so that L(Ep) = L(Eq), a nontrivial linear
equality. If p lies on a self-loop and q on a leaf edge, then a similar proof applies, with the birth
time L(Ep)/2 + L(Eq) in ΨG(p) and (L(Ep) + L(Eq))/2 in ΨG(q). If q lies on a self-loop
and p on a leaf edge, we can again apply the same argument, switching the roles of p and
q. Lastly, when v1 = w1 and p and q both lie on self-loops, we know by Lemma C.0.6 that
this vertex has valence four, and hence the graph consists simply of two loops, that is, a figure eight.
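
   For the sub-case above in which both p and q sit on leaf edges, the final simplification can be
written out as follows. This is only a sketch of the algebra in LaTeX notation; it uses no symbols
beyond L(Ep) and L(Eq) from the text.

```latex
% p and q are midpoints of the leaf edges E_p and E_q, which share the vertex v_1 = w_1.
% Equality of the two remaining birth times:
\begin{align*}
\tfrac{1}{2}L(E_p) + L(E_q) &= \tfrac{1}{2}L(E_q) + L(E_p) \\
\tfrac{1}{2}L(E_q) &= \tfrac{1}{2}L(E_p) \\
L(E_q) &= L(E_p)
\end{align*}
```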

   We now suppose that v1 ≠ w1. Relative to the point p, the vertex w1 can generically
be assumed to be an upfork, as otherwise there would be multiple geodesics from v1 to w1.
The same is true for q and v1. Thus d(p, w1) is a death time in ΨG(p), and d(q, v1) is a
death time in ΨG(q). There are two possibilities: (i) at least one of these death times cancels
out in χG, or (ii) neither does. In case (i), we may suppose without loss of generality that the
cancellation happens in χG(p). This means that the death time D = d(p, w1) = L(Ep)/2 +
d(v1, w1) is also a birth time. By considering Lemma B.0.6, we see that either (a) 2D is
the length of a simple cycle on which p sits, or D − L(Ep)/2 = d(v1, w1) is either (b) equal
to the distance from v1 to a leaf vertex e or (c) equal to (1/2) Σ_{i∈U} L(Ei), where this sum
contains terms with non-integer coefficient. Scenario (a) is impossible, as the only simple cycle
on which p can sit is the self-loop Ep, and the equality L(Ep) + 2d(v1, w1) = 2D = L(Ep) would
imply d(v1, w1) = 0. In scenario (b), d(v1, w1) = d(v1, e), a nontrivial linear equality, as only
the right-hand side contains a leaf edge. In scenario (c), d(v1, w1) = (1/2) Σ_{i∈U} L(Ei), another
nontrivial linear equality, as only the right-hand side contains a term with fractional coefficient.
In case (ii), χG(p) = χG(q) implies that the death time D is also a death time for q, so
that L(Ep)/2 + d(v1, w1) = L(Eq)/2 + d(w1, y) for some vertex y. With some algebraic
manipulation, we deduce L(Ep) − L(Eq) = 2(d(w1, y) − d(v1, w1)). This is a nontrivial linear
equality, as the geodesic from v1 to w1 does not contain Eq, so the negative terms on the
LHS are not equal to those on the RHS.
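
   The algebraic manipulation in case (ii) above can be sketched as follows, in LaTeX notation.
Only the symbols already appearing in the text are used; the intermediate line is our own
bookkeeping.

```latex
% Case (ii): the death time D for p is also a death time for q.
\begin{align*}
\tfrac{1}{2}L(E_p) + d(v_1, w_1) &= \tfrac{1}{2}L(E_q) + d(w_1, y) \\
L(E_p) + 2\,d(v_1, w_1) &= L(E_q) + 2\,d(w_1, y) \\
L(E_p) - L(E_q) &= 2\bigl(d(w_1, y) - d(v_1, w_1)\bigr)
\end{align*}
```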

Remark C.1.1. Note that when G is a figure eight, the points p and q at the middle of each
loop have the same extended Euler curve.


C.2      Cases (I) and (II)

In this case, p sits in the middle of a leaf-edge or self-loop Ep , with non-leaf vertex or loop
vertex v1 at distance L(Ep )/2. For q, which sits on the interior, non-loop edge Eq and is
distance d1 away from the closest vertex w1 , there is a downfork vertex y such that


(C.1)                     d1 + d(w1 , y) = d(q, y) = L(Eq ) − d1 + d(w2 , y)


   See Figure C.9.


                 Figure C.9: The point p is in Case (I) and the point q is in Case (II).


   Note, firstly, that Ep ≠ Eq. Next, we know that, since there is no cancellation in χG(q),
and since ΨG(q) contains the birth time d(q, y), then so does χG(q), and hence χG(p). By
Lemma B.0.6, and considering the position of p, we see that there are three possibilities:
(1) 2d(q, y) is the length of a simple cycle on which p sits, (2) d(q, y) − L(Ep)/2 = d(v1, e)
for some leaf vertex e, or (3) d(q, y) − L(Ep)/2 = (1/2) Σ_{i∈U} L(Ei), where this multiset of
edge lengths contains an edge that appears exactly once.


   In case (1), the only simple cycle on which p can sit is Ep itself, and this only in the case
when Ep is a self-loop. From Equation (C.1) we can then deduce the following equations:


                              d1 + d(w1 , y) = d(q, y) = L(Ep )/2


                       d1 − L(Eq ) − d(w2 , y) = −d(q, y) = −L(Ep )/2

   Subtracting the second equation from the first gives


                             d(w1 , y) + d(w2 , y) + L(Eq ) = L(Ep )


which is clearly a nontrivial linear equality among edge lengths, as the LHS does not contain
the edge Ep .
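
   As a check on case (1), the subtraction can be carried out term by term. The following is a
sketch in LaTeX notation, using only the two displayed equations above.

```latex
% Case (1): E_p is a self-loop and 2 d(q, y) = L(E_p).
\begin{align*}
d_1 + d(w_1, y) &= \tfrac{1}{2}L(E_p) \\
d_1 - L(E_q) - d(w_2, y) &= -\tfrac{1}{2}L(E_p)
\end{align*}
% Subtracting the second line from the first cancels d_1:
\begin{align*}
\bigl(d_1 + d(w_1, y)\bigr) - \bigl(d_1 - L(E_q) - d(w_2, y)\bigr)
  &= \tfrac{1}{2}L(E_p) + \tfrac{1}{2}L(E_p) \\
d(w_1, y) + d(w_2, y) + L(E_q) &= L(E_p)
\end{align*}
```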


   In case (2), the same algebra produces


                      d(w1 , y) + d(w2 , y) + L(Eq ) = L(Ep ) + 2d(v1 , e)


And in case (3), we obtain

                      d(w1 , y) + d(w2 , y) + L(Eq ) = L(Ep ) + Σ_{i∈U} L(Ei )


   As with case (1), these are also nontrivial linear equalities, as the edge Ep (which is either
a self-loop or leaf edge) shows up on the RHS and not the LHS.
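
   The equalities in cases (2) and (3) come from adding the two expressions for d(q, y) in
Equation (C.1). A sketch of that step in LaTeX notation, under the case hypotheses stated
above:

```latex
% Adding the two expressions for d(q, y) in (C.1) cancels d_1:
\begin{align*}
2\,d(q, y) &= \bigl(d_1 + d(w_1, y)\bigr) + \bigl(L(E_q) - d_1 + d(w_2, y)\bigr) \\
           &= d(w_1, y) + d(w_2, y) + L(E_q)
\end{align*}
% Case (2): d(q, y) = L(E_p)/2 + d(v_1, e)
\begin{align*}
d(w_1, y) + d(w_2, y) + L(E_q) &= L(E_p) + 2\,d(v_1, e)
\end{align*}
% Case (3): d(q, y) = L(E_p)/2 + (1/2) \sum_{i \in U} L(E_i)
\begin{align*}
d(w_1, y) + d(w_2, y) + L(E_q) &= L(E_p) + \sum_{i \in U} L(E_i)
\end{align*}
```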


C.3       Cases (I) and (III)

In this case, p sits in the middle of a leaf-edge or self-loop Ep , with non-leaf vertex or loop
vertex v1 at distance L(Ep)/2. As for q, χG(q) exhibits cancellation, but the smallest nonzero
death time d1 is not cancelled.


   Since χG(p) = χG(q), the death time d1 in χG(q) must also show up in χG(p), and hence,
for some non-leaf vertex x,


(C.2)                              d1 = L(Ep )/2 + d(v1 , x)


   By Lemma C.0.3, the cancellation in χG (q) gives us one of ten possible cases. Cases
(2,3,9,10) are all immediate nontrivial linear equalities, so we may focus on the rest. The
point q sits on the edge Eq = [w1 , w2 ], at distance d1 from w1 (it is possible that w1 = w2 ,
i.e. that Eq is a loop).


   In case (1), there is a simple cycle γ containing q and a non-leaf vertex y for which
d1 = L(γ)/2 − d(w1 , y). Then, using Equation C.2, we obtain


                           L(γ)/2 − d(w1 , y) = L(Ep )/2 + d(v1 , x)


This is a nontrivial linear equality unless Ep = γ, as otherwise the fractional terms on either
side are not equal. But if q sits on Ep and is distinct from p, d1 < L(Ep )/2, contradicting
Equation C.2.


   In case (4), 2d1 = L(Eq) + d(w2, e) − d(w1, y) for some non-leaf vertex y and a leaf vertex
e. Using Equation C.2, we get


                       L(Ep ) + 2d(v1 , x) + d(w1 , y) − L(Eq ) = d(w2 , e)


Recall that, in case (4), we may assume w1 ≠ w2. Furthermore, in this case, the geodesic
from q to e passes through w2 , and so does not pass through w1 . Then this is a nontrivial
linear equality, as there is nothing on the left-hand side to cancel out the edge adjacent to
w1 in the d(w1 , y) term.
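
   The displayed equality for case (4) follows by doubling Equation C.2 and equating the two
expressions for 2d1. A sketch in LaTeX notation:

```latex
% Equation (C.2), doubled, and the case (4) expression for 2 d_1:
\begin{align*}
2\,d_1 &= L(E_p) + 2\,d(v_1, x) \\
2\,d_1 &= L(E_q) + d(w_2, e) - d(w_1, y)
\end{align*}
% Equating the right-hand sides and isolating d(w_2, e):
\begin{align*}
L(E_p) + 2\,d(v_1, x) &= L(E_q) + d(w_2, e) - d(w_1, y) \\
L(E_p) + 2\,d(v_1, x) + d(w_1, y) - L(E_q) &= d(w_2, e)
\end{align*}
```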


   In case (5), we derive

                     L(Ep ) + 2d(v1 , x) + d(w1 , y) − L(Eq ) = (1/2) Σ_{i∈U} L(Ei )

This is a nontrivial linear equality, as the right-hand side contains a term with coefficient
1/2 and the left-hand side does not.


   In case (6), d1 = L(Eq ) + d(w2 , y) − L(γ)/2. Together with Equation C.2, we obtain


                      L(Ep )/2 + d(v1 , x) = L(Eq ) + d(w2 , y) − L(γ)/2


This is an immediate nontrivial linear equality, as the signs of the fractional terms on either
side are different.
   In case (7), we derive


                       L(Eq ) − L(Ep ) − 2d(v1 , x) + d(w2 , y) = d(w2 , e)


Recall that in Case (7) we may assume that Eq is not a leaf edge. Thus only the right-hand
side has a leaf edge appearing with a positive sign, and so we have a nontrivial linear equality.


   Finally, in case (8), we derive

                      L(Eq ) − L(Ep ) − 2d(v1 , x) + d(w2 , y) = (1/2) Σ_{i∈U} L(Ei )

This is a nontrivial linear equality, as the right-hand side has a term with coefficient 1/2,
and the left-hand side does not.


C.4       Cases (II) and (III)

In this case, we assume χG (p) exhibits cancellation, but the smallest nonzero death time d1 has
not cancelled out. The equation χG (p) = χG (q), together with the fact q has no cancellation,
implies that the smallest nonzero death time in ΨG (q) is also d1 . Moreover, for q, which sits
on the interior, non-loop edge Eq and is distance d1 away from the closest vertex w1 , there
is a downfork vertex y such that


                        d1 + d(w1 , y) = d(q, y) = L(Eq ) − d1 + d(w2 , y)


From which we can deduce that


(C.3)                          2d1 = L(Eq ) + d(w2 , y) − d(w1 , y)
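
   Equation (C.3) is obtained by collecting the d1 terms in the preceding display. A one-step
sketch in LaTeX notation:

```latex
% Two expressions for d(q, y), one through each endpoint of E_q:
\begin{align*}
d_1 + d(w_1, y) &= L(E_q) - d_1 + d(w_2, y) \\
2\,d_1 &= L(E_q) + d(w_2, y) - d(w_1, y)
\end{align*}
```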


   See Figure C.10.

                              Figure C.10: The point q is in Case (II).


   By Corollary C.0.2, we may assume that d1 is the distance from p to the closest vertex
v1 of valence at least three. By Lemma C.0.4, there are ten possible scenarios to consider.
Cases (2,3,9,10) are nontrivial linear equalities, so we may focus on the rest.


   Let’s consider case (1),
                                     d1 = L(γ)/2 − d(v1 , x)

for some simple cycle γ containing p, and some non-leaf vertex x. Applying Equation C.3,
we obtain
                       L(Eq ) + d(w2 , y) − d(w1 , y) = L(γ) − 2d(v1 , x)

Unless this is a nontrivial linear equality, the edge Eq and the geodesic from w2 to y must
be a subset of γ, which contains all the positive terms on the RHS. Now, we can rewrite this
equation as
                       L(γ) − L(Eq ) − d(w2 , y) = 2d(v1 , x) − d(w1 , y)

The edges appearing on the LHS are the remaining edges in γ, and each edge appears with a
(+1) coefficient. This is true for the RHS as well precisely if the geodesic from v1 to x is the
same as the geodesic from w1 to y. Thus, we see that γ is precisely the cycle appearing in
Figure C.10. Now, either v1 = w1 and x = y or v1 = y and w1 = x. In the former case, note
that there is precisely one point on γ at distance d1 from w1 = v1 for which the geodesic to
y = x passes through w1 = v1, so that in this case we are forced to conclude p = q. In the
latter case, the situation is as in Figure C.11. That is, p is antipodal to w1 = x on γ, just
as q is antipodal to v1 = y. Moreover, since we saw that the geodesic from p to x passing
through v1 is contained in γ, x is a downfork for p. Since x is also an upfork, it has valence
at least four. This implies that the death time d1 appears with multiplicity at least two in
ΨG (q), and hence χG (q), since there is no cancellation. As χG (p) = χG (q), the same death
time has multiplicity at least two in ΨG (p), forcing v1 = y to have valence at least four. As
we can generically assume that there are unique geodesics from y to w1 or w2 , this forces y to
be an upfork for q. But then y is both an upfork and a downfork, so that q has cancellation,
contrary to our hypothesis.


Figure C.11: The points p and q are the same distance d1 from v1 and w1 respectively. The vertex
x = v2 = w1 is a downfork for p, and the vertex y = v1 = w2 is a downfork for q. Both x and y must have
valence at least four.

   In case (4), we have
                                 2d1 = L(Ep ) + d(v2 , e) − d(v1 , x)

Together with Equation C.3, we obtain


                    L(Eq ) + d(w2 , y) − d(w1 , y) = L(Ep ) + d(v2 , e) − d(v1 , x)


This is an obvious nontrivial linear equality, as the LHS contains no leaf edges.


   In case (5), we have

                               2d1 = L(Ep ) + (1/2) Σ_{i∈U} L(Ei ) − d(v1 , x)

so that we obtain

                L(Eq ) + d(w2 , y) − d(w1 , y) = L(Ep ) − d(v1 , x) + (1/2) Σ_{i∈U} L(Ei )

   This is also an obvious nontrivial linear equality, as the LHS does not contain any term
with fractional coefficient.


   In case (6), we have
                                  d1 = L(Ep ) + d(v2 , x) − L(γ)/2

so that we obtain


                    L(Eq ) + d(w2 , y) − d(w1 , y) = 2L(Ep ) + 2d(v2 , x) − L(γ)

   We can rewrite this as


                   L(γ) = 2L(Ep ) + 2d(v2 , x) + d(w1 , y) − L(Eq ) − d(w2 , y)


   Unless this is a nontrivial linear equality, we can conclude that the geodesic from w1 to y
is contained in γ, as it is not cancelled out on the RHS. Moreover, the RHS will have terms
with (+2) or (−1) coefficient unless the edges appearing in the terms L(Ep) + d(v2, x) are
exactly the same as those appearing in L(Eq) + d(w2, y). However, the edges in L(Ep) + d(v2, x)
form a path from v1 to x, and the edges in L(Eq ) + d(w2 , y) form a path from w1 to y, and
we end up with the same two possibilities as in Case (1), i.e. either v1 = w1 and x = y or
v1 = y and w1 = x, both of which gave us a contradiction.


   Cases (7) and (8) are very similar to cases (4) and (5).


C.5       Cases (III) and (III)

Lemma C.0.4 implies that there are ten cases to consider for each of p and q, and hence one
hundred total cases! Luckily, that number is quickly reduced. Cases (2,3,9,10) are immediate
nontrivial linear equalities, so there are only six cases apiece, and thirty-six total. To dismiss
more cases, note that our cases come in pairs:

  • Cases (1) and (6) relate 2d1 to the length of a simple cycle, and in particular the
      expression contains no leaf edges and no non-integer coefficients.

  • Cases (4) and (7) involve a sum of edge lengths that contain a leaf edge but no non-
      integer coefficients.

  • Cases (5) and (8) contain an expression with non-integer coefficients but no leaf edge
        (as these cases can be ruled out when Ep or Eq is a self-loop or leaf edge).

Thus, for example, setting the expression for 2d1 arising in Case (1) equal to the expression
for 2d1 arising in Case (5) gives an immediate nontrivial linear equality, as only one side of
the equation would contain a non-integer coefficient. Following this reasoning, it suffices to
compare Cases (1) and (6) with themselves and each other, and similarly for Cases (4) and
(7) and (5) and (8). We end up with a respectable nine cases to check.
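
   The count of nine can be tallied explicitly. The following LaTeX sketch is our own bookkeeping;
the class labels A, B, C are names introduced here for illustration and do not appear elsewhere
in the text.

```latex
% Hypothetical labels for the three pairs of cases described above:
%   A = {(1),(6)},  B = {(4),(7)},  C = {(5),(8)}.
% Comparisons between different classes are immediate nontrivial linear equalities,
% so only unordered comparisons within a single class remain.
\begin{align*}
\text{within } A &: \ (1,1),\ (1,6),\ (6,6) \\
\text{within } B &: \ (4,4),\ (4,7),\ (7,7) \\
\text{within } C &: \ (5,5),\ (5,8),\ (8,8) \\
\text{total} &= 3 + 3 + 3 = 9 \quad \text{out of } 6 \cdot 6 = 36 \text{ ordered pairs.}
\end{align*}
```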


   Case (1) and (1): There are vertices x and y of valence at least three, and simple cycles
γ1 and γ2 , containing p and q respectively, for which


                            L(γ1 )/2 − d(v1 , x) = L(γ2 )/2 − d(w1 , y)

from which we deduce that


(C.4)                        L(γ1 ) − L(γ2 ) = 2d(v1 , x) − 2d(w1 , y)


   Cases (1) and (6): With γ1 , γ2 , x, and y as before, we have


                        L(γ1 ) + L(γ2 ) = 2L(Eq ) + 2d(w2 , y) + 2d(v1 , x)


   As before, we assume L(γ1 ) ≠ L(γ2 ), so that the cancellations do not happen at the
same time. This implies that the sum of edges appearing on the LHS contains terms with
coefficient (+1), which is impossible for the RHS, giving a nontrivial linear equality.
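
   The displayed equality for Cases (1) and (6) can be recovered from the two expressions for
2d1. A sketch in LaTeX notation, assuming as in the text that p is in Case (1) and q is in
Case (6):

```latex
% Case (1) for p and Case (6) for q, each doubled:
\begin{align*}
2\,d_1 &= L(\gamma_1) - 2\,d(v_1, x) \\
2\,d_1 &= 2\,L(E_q) + 2\,d(w_2, y) - L(\gamma_2)
\end{align*}
% Equating and moving the cycle lengths to the left:
\begin{align*}
L(\gamma_1) + L(\gamma_2) &= 2\,L(E_q) + 2\,d(w_2, y) + 2\,d(v_1, x)
\end{align*}
```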


   Cases (6) and (6): With γ1 , γ2 , x, and y as before, we have


                   L(γ1 ) − L(γ2 ) = 2L(Ep ) − 2L(Eq ) + 2d(v2 , x) − 2d(w2 , y)


   And as we may again assume L(γ1 ) ≠ L(γ2 ) we obtain a nontrivial linear equality.
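
   For Cases (6) and (6), the same bookkeeping applies with both points in Case (6). A sketch in
LaTeX notation:

```latex
% Case (6) for p and for q, each doubled:
\begin{align*}
2\,d_1 &= 2\,L(E_p) + 2\,d(v_2, x) - L(\gamma_1) \\
2\,d_1 &= 2\,L(E_q) + 2\,d(w_2, y) - L(\gamma_2)
\end{align*}
% Equating and collecting the cycle lengths on the left:
\begin{align*}
L(\gamma_1) - L(\gamma_2) &= 2\,L(E_p) - 2\,L(E_q) + 2\,d(v_2, x) - 2\,d(w_2, y)
\end{align*}
```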


   Case (4) and (4): There are vertices e1 , e2 , x, and y such that


                  L(Ep ) + d(v2 , e1 ) − d(v1 , x) = L(Eq ) + d(w2 , e2 ) − d(w1 , y)


   Now, we can assume that the geodesic from v2 to e1 does not intersect the geodesic
from v1 to x, as otherwise we can reroute the geodesics so they both go through the same
end of Ep , giving us case (2) or (9), both of which are nontrivial linear equalities. Let
us assume the same for Eq . Now, if we consider the edges appearing in the expression
L(Ep ) + d(v2 , e1 ) − d(v1 , x) we see that they lie along a path from e1 to x, as in Figure C.12.
The cancellation occurs because p sits midway between e1 and x on this path. Now, the edges
in L(Eq ) + d(w2 , e2 ) − d(w1 , y) also sit on a path, this time from e2 to y. Unless a nontrivial
linear equality has occurred, this must be the same as the path from e1 to x, in particular
forcing e1 = e2 and x = y. Moving along this path from e1 = e2 to x = y, one finds that
the sign of the terms in the sum L(Ep ) + d(v2 , e1 ) − d(v1 , x) or L(Eq ) + d(w2 , e2 ) − d(w1 , y)
changes at the vertex v1 or w1 , which therefore must be equal. The vertex encountered
before v1 = w1 is then v2 = w2 , so that Ep = Eq , and hence p = q.
   Case (4) and (7): Here we find


                  L(Ep ) + d(v2 , e1 ) − d(v1 , x) = L(Eq ) + d(w2 , y) − d(w1 , e2 )

  Figure C.12: The point p sits midway on a path from e1 to x, and e1 and x are equidistant from p.


   The LHS has its only leaf edge appearing with sign (+1) and the RHS has its only leaf
edge appearing with sign (−1), so we have a nontrivial linear equality.


   Case (7) and (7): Here we have


                   L(Ep ) + d(v2 , x) − d(v1 , e1 ) = L(Eq ) + d(w2 , y) − d(w1 , e2 )


   This is the same as cases (4) and (4), with the only exceptions being that the positions
of v1 and v2 , as well as w1 and w2 , are switched in Figure C.12, and the signs of the edges
(besides for Ep and Eq ) are flipped.


   Case (5) and (5): Here we have

        L(Ep ) + (1/2) Σ_{i∈U} L(Ei ) − d(v1 , x) = L(Eq ) + (1/2) Σ_{i∈S} L(Ei ) − d(w1 , y)

   The edges appearing in L(Ep) + (1/2) Σ_{i∈U} L(Ei) − d(v1, x) can be assumed to be as in
Figure C.13, as if the geodesics leaving either end of Ep intersect, we can reroute them to leave
the same end, producing cases (3) or (10), which are nontrivial linear equalities. The same
is true for the edges in L(Eq) + (1/2) Σ_{i∈S} L(Ei) − d(w1, y). Now, as we saw when comparing
cases (4) and (4), either a nontrivial linear equality has occurred or the collection of edges
appearing for each equation must be the same. By considering where the sign of the edges
changes we can conclude that Ep = Eq, v1 = w1, and v2 = w2, so that p = q.


Figure C.13: The edges showing up in L(Ep) + (1/2) Σ_{i∈U} L(Ei) − d(v1, x). The edges in blue are
those appearing in the sum Σ_{i∈U} L(Ei). These edges have a positive sign in the sum, as does Ep.
The remaining edges have a negative sign.


   In case (5) and (8), we have:

        L(Ep ) + (1/2) Σ_{i∈U} L(Ei ) − d(v1 , x) = L(Eq ) + d(w2 , y) − (1/2) Σ_{i∈S} L(Ei )

   This is a nontrivial linear equality, as the fractional coefficients appearing on the LHS
are positive, whereas those appearing on the RHS are negative.


   In case (8) and (8), we have:

        L(Ep ) + d(v2 , x) − (1/2) Σ_{i∈U} L(Ei ) = L(Eq ) + d(w2 , y) − (1/2) Σ_{i∈S} L(Ei )

   This is the same as cases (5) and (5), with the only exceptions being that the positions
of v1 and v2 , as well as w1 and w2 , are switched in Figure C.13, and the signs of the edges
(besides for Ep and Eq ) are flipped.
                                                                APPENDIX D


The Case of Topological Self-Loops and Few
Vertices


We have noted that certain combinatorial types, such as graphs X with topological self-loops,
force isometric automorphisms regardless of the geometry chosen. Thus we cannot hope for
ΨX to be injective for any choice of edge lengths on X. Still, we claim that it is generically
possible, in these cases, to reconstruct a metric graph G from IPHT(G) or IECT(G), up
to isometry. As in the prior chapters, we remove all vertices of valence two, merging their
adjacent edges into a single, longer edge. We will also maintain the convention to label the
smallest nonzero death time in ΨG(p) by d1, noting that when ΨG(p) = ΨG(q) they must
have the same smallest nonzero death time.

Lemma D.0.1. Let G be a metric graph with a topological self-loop C but at least three
vertices of valence at least three. Suppose that there exists a pair of points p, q ∈ G, with p ∈ C
and q ∉ C, such that ΨG(p) = ΨG(q). Then there is a nontrivial Z-linear equality among
the edge lengths of G.

Proof. By Lemma B.0.2, p and q are either both vertices or both non-vertices. If they are
both vertices, we can use the same proof as in Proposition B.0.8, which does not rely on
excluding self-loops.


   Next, let us suppose that p and q are non-vertices. The point p sits on a self-loop,
but the point q may not. Indeed, there are four possibilities for the point q, corresponding
to the three cases A, B, and C of Chapter B, and the fourth case D, when q sits on a self-loop.


   Case A:


                                     Figure D.1: Cases (A) and (D).


   The point q sits on a leaf edge Eq .


   We will first want to consider the exceptional case when the base vertex u of the self-loop
has valence three, and x is antipodal to u, as this is the only situation when the distance
from p to its closest vertex of valence at least three (i.e. u) is not a death time. When
this happens, the smallest nonzero death time for p is strictly bigger than the smallest birth
time. ΨG (p) = ΨG (q) implies this is also true for q, which implies that q is strictly closer to
the leaf vertex of its edge than the non-leaf vertex. Note that the sum of the smallest birth
and death times in ΨG (p) is L(C) + L(E), where C is the cycle on which p sits and E is
the other edge adjacent to u. The same sum for ΨG (q) gives L(Eq ). If ΨG (p) = ΨG (q) then
L(C) + L(E) = L(Eq ), a nontrivial linear equality.


   Moving on, we can assume that the distance d1 = d(p, u) is the smallest nonzero death
time in ΨG(p), as in Figure D.1. Consider the point x that is on the other side of the loop
Ep from u (we may have x = p). This produces a downfork in ΦG(q), and hence a point in
ΨG(q) with birth time d1 + d(v, u) + L(Ep)/2. Since ΨG(p) = ΨG(q), there is a matching
birth time in ΨG(p). This is strictly larger than L(Ep)/2, and hence the geodesic to the
corresponding downfork passes through u. By Lemma B.0.6, this birth time is equal to
$d_1 + \frac{1}{2}\sum_{i \in U} L(E_i)$ for some collection U of edges in G omitting Ep. This
implies $d(v, u) + L(E_p)/2 = \frac{1}{2}\sum_{i \in U} L(E_i)$, a nontrivial linear equality
among edge lengths.
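
   The cancellation behind the last step of Case A can be written out as follows. This is only a
sketch that restates the two expressions for the same birth time given above. The closing remark
on why the equality is nontrivial is our reading of the argument and rests on the assumption
stated in the comment.

```latex
% Case A: the same birth time, computed once from q and once from p.
\begin{align*}
  d_1 + d(v,u) + \tfrac{1}{2}L(E_p) &= d_1 + \tfrac{1}{2}\sum_{i \in U} L(E_i)
    && \text{(birth time in } \Psi_G(q) \text{ and in } \Psi_G(p)\text{)} \\
  d(v,u) + \tfrac{1}{2}L(E_p) &= \tfrac{1}{2}\sum_{i \in U} L(E_i)
    && \text{(cancel } d_1\text{)} \\
  2\,d(v,u) + L(E_p) &= \sum_{i \in U} L(E_i)
    && \text{(multiply by 2)}
\end{align*}
% Assumption for the remark below: d(v,u) is the length of a shortest path between
% the vertices v and u, hence a sum of edge lengths that does not use the loop E_p.
```

   Under that assumption, Ep appears with coefficient 1 on the left and, since U omits Ep, with
coefficient 0 on the right, so the equality cannot be trivial.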


   Case B:


                     [Figure with labels u, v, w, x, p, q, Ep, Eq, d1, r2]

                                  Figure D.2: Cases (B) and (D).


   The point q sits on an internal edge Eq whose endpoints are both upforks in the Reeb
graph ΦG (q).


   As before, we first consider the exceptional case in which the base vertex u of
the self-loop has valence three and p is antipodal to u. Recall that this implies that the
smallest nonzero death time in ΨG(p) is strictly larger than its smallest birth time, which
cannot be the case for the point q, so that ΨG(p) ≠ ΨG(q).


   Moving on, we can assume that the distance d1 = d(p, u) is the smallest nonzero death
time in ΨG (p), as in Figure D.2. We have that


(D.1)                                  r2 + d1 = L(Eq )


Moreover, since r2 shows up as a death time in ΨG (q), it must also be a death time in ΨG (p),
corresponding by the dictionary of Section 2.2.5 to the distance from p to an upfork in ΦG (p).
The geodesic realizing this distance passes through u, and hence

(D.2)                          $r_2 - d_1 = \sum_{i \in U} L(E_i)$


where the collection U of edges omits Ep by construction and has a total length too short to
contain Eq .


   The one-dimensional persistence diagram in ΨG (p) contains the point (L(Ep )/2, ·). By
Lemma ??, we have a trichotomy. The first possibility is that L(Ep )/2 is half the length of
a simple cycle on which q sits. This implies a nontrivial linear equality as this simple cycle
cannot consist of Ep . The second possibility is that

(D.3)                    $L(E_p)/2 - d_1 = \frac{1}{2}\sum_{i \in S} L(E_i)$

with the sum of edges on the right side omitting Eq. Taking (D.1) - (D.2) + 2(D.3) gives

               $L(E_p) = L(E_q) + \sum_{i \in S} L(E_i) - \sum_{i \in U} L(E_i)$,


a nontrivial linear equality. In the third possibility, we have

(D.4)                    $L(E_p)/2 - r_2 = \frac{1}{2}\sum_{i \in S} L(E_i)$.

Note that this set S is distinct from the one in (D.3). Taking (D.1) + (D.2) + 2(D.4) gives

               $L(E_p) = L(E_q) + \sum_{i \in U} L(E_i) + \sum_{i \in S} L(E_i)$,


a nontrivial linear equality.
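
   For readers who want to check the two combinations used in Case B, the eliminations can be
spelled out as below. This is a sketch that uses only equations (D.1) through (D.4) as stated
above and introduces no new quantities.

```latex
% Second possibility: (D.1) - (D.2) + 2(D.3), which eliminates r_2 and d_1.
\begin{align*}
  (r_2 + d_1) - (r_2 - d_1) &= L(E_q) - \sum_{i \in U} L(E_i)
    && \Longrightarrow\quad 2d_1 = L(E_q) - \sum_{i \in U} L(E_i) \\
  L(E_p) - 2d_1 &= \sum_{i \in S} L(E_i)
    && \text{(twice (D.3))} \\
  L(E_p) &= L(E_q) + \sum_{i \in S} L(E_i) - \sum_{i \in U} L(E_i)
\end{align*}
% Third possibility: (D.1) + (D.2) + 2(D.4), with S now the set appearing in (D.4).
\begin{align*}
  (r_2 + d_1) + (r_2 - d_1) &= L(E_q) + \sum_{i \in U} L(E_i)
    && \Longrightarrow\quad 2r_2 = L(E_q) + \sum_{i \in U} L(E_i) \\
  L(E_p) - 2r_2 &= \sum_{i \in S} L(E_i)
    && \text{(twice (D.4))} \\
  L(E_p) &= L(E_q) + \sum_{i \in U} L(E_i) + \sum_{i \in S} L(E_i)
\end{align*}
```

   In both cases the auxiliary distances r2 and d1 drop out, leaving a relation purely among
edge lengths.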


   Case C:


                     [Figure with labels u, v, w, x, p, q, Ep, Eq, d1, r2]

                                     Figure D.3: Cases (C) and (D).


   The point q sits on an internal edge, with the closer endpoint being a downfork and the
further endpoint being an upfork in the Reeb graph ΦG (q).

   As before, we first consider the exceptional case in which the base vertex u of
the self-loop has valence three and p is antipodal to u. Recall that this implies that the
smallest nonzero death time in ΨG(p) is strictly larger than its smallest birth time, which
cannot be the case for the point q, so that ΨG(p) ≠ ΨG(q).


   Moving on, we can assume that the distance d1 = d(p, u) is the smallest nonzero death
time in ΨG(p), as in Figure D.3. Note that, generically speaking, the segment of Eq from w to
q must be a geodesic; otherwise there are two distinct geodesics from w to q that pass through
v, and hence two distinct geodesics from w to v, which gives a nontrivial linear equality. Thus
one of the geodesics from w to q consists of a segment of Eq and the other passes through v, and
hence any geodesic starting at q and passing through w can be rerouted to a geodesic passing
through v.


   Consider the point x that is on the other side of the loop Ep from u. This produces a
downfork in ΦG(q), and hence a point in ΨG(q) with birth time d(q, u) + L(Ep)/2. As we
may assume that the geodesic from q to this downfork can be rerouted through v, we have
d(q, u) + L(Ep)/2 = d1 + d(v, u) + L(Ep)/2. Since ΨG(p) = ΨG(q), there is a matching
birth time in ΨG(p). This is strictly larger than L(Ep)/2, and hence the geodesic to the
corresponding downfork passes through u. By Lemma ??, this birth time is equal to
$d_1 + \frac{1}{2}\sum_{i \in U} L(E_i)$ for some collection U of edges in G omitting Ep. This
implies $d(v, u) + L(E_p)/2 = \frac{1}{2}\sum_{i \in U} L(E_i)$, a nontrivial linear equality
among edge lengths.


   Case D:

                     [Figure with labels u, v, x, y, p, q, Ep, Eq, d1]

                                        Figure D.4: Cases (D) and (D).


   Once again, we first consider the exceptional case in which the base vertex
u of the self-loop has valence three and p is antipodal to u. Recall that this implies
that the smallest nonzero death time in ΨG(p) is strictly larger than its smallest birth
time, which implies that q is in the same situation as p, antipodal to the base vertex v
of the self-loop on which it lies. Writing E1 and E2 for the other edges adjacent to u and v
respectively, ΨG(p) = ΨG(q) implies that these barcodes have the same smallest nonzero
death time, so that d(p, u) + L(E1) = d(q, v) + L(E2). Multiplying both sides by 2, we obtain
L(Ep) + 2L(E1) = L(Eq) + 2L(E2), a nontrivial linear equality.
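
   The doubling step in the exceptional case can be made explicit as follows. This is a sketch
in which L(E_1) and L(E_2) denote the lengths of the edges E1 and E2 from the paragraph above,
and the first line records that an antipodal point lies half a loop length from the base vertex.

```latex
% Exceptional case of Case D: p and q are antipodal to u and v on their loops.
\begin{align*}
  d(p,u) = \tfrac{1}{2}L(E_p), \qquad d(q,v) &= \tfrac{1}{2}L(E_q)
    && \text{(antipodal points on the loops)} \\
  \tfrac{1}{2}L(E_p) + L(E_1) &= \tfrac{1}{2}L(E_q) + L(E_2)
    && \text{(equal smallest nonzero death times)} \\
  L(E_p) + 2L(E_1) &= L(E_q) + 2L(E_2)
    && \text{(multiply by 2)}
\end{align*}
```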


   Moving on, we can assume that the distance d1 = d(p, u) = d(q, v) is the smallest nonzero
death time in ΨG(p) and ΨG(q), as in Figure D.4. The point q sits on a self-loop distinct
from the one on which p lies. Since ΨG(p) contains the birth time L(Ep)/2, this must also
appear as a birth time in ΨG(q). By Lemma ??, this is equal to (i) L(Eq)/2 (as Eq is the
only cycle containing Eq), (ii) d1 + d(v, e), or (iii) $d_1 + \frac{1}{2}\sum_{i \in U} L(E_i)$.
By an identical analysis, L(Eq)/2 must equal (i') L(Ep)/2, (ii') d1 + d(u, e′), or
(iii') $d_1 + \frac{1}{2}\sum_{i \in S} L(E_i)$. One can easily see that (i) or (i') provide a
nontrivial linear equality, so these can be ruled out. In case (ii) and (ii'), we can subtract
the latter equation from the former to produce


                             L(Ep)/2 − L(Eq)/2 = d(v, e) − d(u, e′).


   This is a nontrivial linear equality, as only the right hand side contains leaf edges. In
case (iii) and (iii'), we can similarly derive

        $L(E_p)/2 - L(E_q)/2 = \frac{1}{2}\sum_{i \in U} L(E_i) - \frac{1}{2}\sum_{i \in S} L(E_i)$.

This is a nontrivial linear equality unless U just contains Ep and Eq. But then equations (iii)
and (iii') imply that d1 = 0, contrary to hypothesis. Lastly, the two cross-cases, (ii,iii') and
(iii,ii'), also give rise to nontrivial linear equalities, as the equalities derived from
subtracting equations contain leaf edges on only one side.
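
   For reference, the four subtractions used in Case D can be collected in one place. This is a
sketch under the labelling of the text, with e and e′ as above. The factor 1/2 in (iii') is an
inference from the displayed difference for cases (iii) and (iii'), and the two cross-case lines
are our own expansion of the final sentence.

```latex
% Inputs: the candidate values of L(E_p)/2 and L(E_q)/2 listed in the text.
%   (ii)   L(E_p)/2 = d_1 + d(v,e)                   (ii')  L(E_q)/2 = d_1 + d(u,e')
%   (iii)  L(E_p)/2 = d_1 + (1/2) sum_{U} L(E_i)     (iii') L(E_q)/2 = d_1 + (1/2) sum_{S} L(E_i)
% In every pairing the common term d_1 cancels on subtraction.
\begin{align*}
  \text{(ii)} - \text{(ii')}:   &\quad \tfrac{1}{2}L(E_p) - \tfrac{1}{2}L(E_q)
      = d(v,e) - d(u,e') \\
  \text{(iii)} - \text{(iii')}: &\quad \tfrac{1}{2}L(E_p) - \tfrac{1}{2}L(E_q)
      = \tfrac{1}{2}\sum_{i \in U} L(E_i) - \tfrac{1}{2}\sum_{i \in S} L(E_i) \\
  \text{(ii)} - \text{(iii')}:  &\quad \tfrac{1}{2}L(E_p) - \tfrac{1}{2}L(E_q)
      = d(v,e) - \tfrac{1}{2}\sum_{i \in S} L(E_i) \\
  \text{(iii)} - \text{(ii')}:  &\quad \tfrac{1}{2}L(E_p) - \tfrac{1}{2}L(E_q)
      = \tfrac{1}{2}\sum_{i \in U} L(E_i) - d(u,e')
\end{align*}
```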

   Noting that the prior lemma uses only the birth and death times in ΨG , and not their
pairing, and in light of Proposition C.0.1 in Chapter C, we have the following Corollary.

Corollary D.0.2. Let G be a metric graph with a topological self-loop C that is not a
figure-eight. Suppose that there exist pairs of points p, q ∈ G, with p ∈ C and q ∉ C, such that
χG(p) = χG(q). Then there is a nontrivial Z-linear equality among the edge lengths of G.

   The upshot of the prior lemma is that, generically speaking, the failure of injectivity
introduced by topological self-loops is local, occurring on the loop itself. Indeed, by Lemma
2.2.31, the pairs of points on a topological self-loop that produce identical persistence
diagrams or Euler curves are precisely those exchanged by the unique automorphism flipping
the loop (this presumes, of course, that G is not a circle). Figure D.5 demonstrates how
a self-loop in G becomes a leaf edge in IPHT(G) and IECT(G). To reconstruct G, we first
detect that a failure of injectivity has occurred: the newly created leaf vertex in IPHT(G)
or IECT(G) cannot have come from a leaf vertex in G, as we can read the valence of the
point from its barcode or Euler curve. Our assumption that the edge lengths of our graphs
satisfy no nontrivial Z-linear equalities implies that this failure of injectivity must come from
a topological self-loop, and so we know how to reinsert the loop when reconstructing G. The
length of the loop is twice the intrinsic length of the leaf edge in IPHT(G) or IECT(G).
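
   The folding of a self-loop onto a leaf edge can be sketched in coordinates. The notation
below (the parametrization c, the flip σ, and the length ℓ) is hypothetical and introduced only
for illustration. The sketch restates the identification described above and does not prove
the length relationship.

```latex
% Assumed notation: c(t), 0 <= t <= L(C), is an arc-length parametrization of the
% self-loop C with c(0) = c(L(C)) = u, the base vertex of the loop.
\begin{align*}
  \sigma\bigl(c(t)\bigr) &= c\bigl(L(C) - t\bigr)
    && \text{(the automorphism flipping the loop)} \\
  \Psi_G\bigl(c(t)\bigr) &= \Psi_G\bigl(c(L(C) - t)\bigr)
    && \text{(flipped points have identical diagrams)} \\
  \Psi_G(C) &= \bigl\{\, \Psi_G(c(t)) : 0 \le t \le \tfrac{1}{2}L(C) \,\bigr\}
    && \text{(half of the loop already covers the image)}
\end{align*}
% The midpoint c(L(C)/2) is fixed by the flip and becomes the tip of the new leaf edge.
% Writing \ell for the intrinsic length of that leaf edge, the rule in the text reads
% L(C) = 2\ell.
```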


                     [Figure with two panels: G and IPHT(G) = IECT(G)]

Figure D.5: A self-loop in G becomes a leaf edge in IPHT(G) or IECT(G). The barcode or Euler curve
at the end of this leaf indicates that it corresponds to a point of valence two, alerting us that the original
graph contained a self-loop.


    The last possibility to consider is when there are fewer than three vertices in G of valence
distinct from two. If there are no such vertices then G is a circle, whose persistence diagram
is a single point. This point records the circumference of the circle. It is an easy consequence
of Lemma 4.5.1 that the only graphs producing one-point persistence diagrams are circles,
and the same can be said for Euler curves by considering Lemma 4.5.3. Moving on, the
only graphs with a single vertex of valence not equal to two are those arising as wedges of
circles. The proof of Lemma D.0.1 demonstrates that the IPHT of such a graph is a wedge of
thorns, and by the method discussed above it is possible to detect this failure of injectivity
and reconstruct the graph as well. This also covers the case of Euler curves unless our graph
is a figure-eight. In that case, IECT(G) is actually homeomorphic to a circle (see Figure
D.6). This is a distinct topological type from any graph as yet considered, so when we see
this in IECT(G) we know that G is a figure-eight. To recover the lengths of the loops,
we use two observations. First, the base of the loops has valence four, and this can be
identified in IECT(G) (the black point in the figure). Second, the midpoints of the loops (the
points in green in the figure) have the unique property (among valence-two points) of having no
nonzero death times. Thus, we can identify their image in IECT(G). These two points on IECT(G)
divide the circle into two arcs. The intrinsic length (using either $\hat{d}_B$ or
$\hat{\tilde{d}}_s$) of each arc is then half of the length of the corresponding loop in G.


                     [Figure with two panels: G and IECT(G)]

Figure D.6: A figure-eight graph becomes a circle after applying IECT(G). However, the image of the
base of the loops as well as the midpoints of the loops can be identified in IECT(G), and from this we can
reconstruct G.


    Next, the graphs with precisely two vertices of valence not equal to two are of the fol-
lowing form: two vertices v and w connected by a single edge or multiple edges, with some
topological self-loops attached at v and w, see Figure D.7. The results of Chapter B imply
that the only failures of injectivity happen either at the self-loops or for pairs of points on the
same edge connecting v and w. Indeed, of the various proofs presented in Chapter B, only
Proposition B.0.8 (injectivity for vertices) and Lemma B.0.11 (injectivity for non-vertices on
the same edge) require the existence of a third vertex. We claim that, when G has topological
self-loops, ΨG is generically injective on the edges connecting v and w. For suppose that p
and q are two points on an edge E connecting v and w, with ΨG (p) = ΨG (q). We know from
Lemma 2.2.31 that p and q are at the same distance d1 from their closest vertices of valence
at least three. Suppose, without loss of generality, that the points p, q are arranged so that
one encounters first p and then q when traveling from v to w along E, as in Figure D.7. We
claim that a shortest geodesic from p to one of the vertices v, w must be the subsegment of
E connecting p to v. Otherwise, the shortest geodesic is the subsegment of E connecting p
to w, and hence, as there is a shorter sub-geodesic connecting q to w, we find that p and q
cannot be equidistant from their closest vertices of valence at least three. Symmetrically, a
shortest geodesic from q to one of v, w must be the subsegment of E connecting q and w.
Thus d(p, v) = d(q, w) = d1 . We claim further that p is strictly closer to v than w, as if
d(p, v) = d(p, w), the geodesic from p to w cannot pass through v and passes instead through
q, so that d1 = d(p, v) = d(p, w) > d(q, w) = d1 , a contradiction. Similarly, q is strictly closer
to w than v.

                     [Figure with labels v, w, p, q, z, d1, E, E′, E″, C, C′]

                        Figure D.7: When G has two vertices and self-loops.


   Suppose that at least one of v or w has an attached topological self-loop. Consider the
shortest such loop (as two equal-length loops imply a nontrivial linear equality); without
loss of generality, it is attached at v, and let us denote the loop edge by C. Then ΨG(p)
has a one-dimensional point born at d1 + L(C)/2, and the same must be true of ΨG(q). We
now have four possibilities: the corresponding downfork in ΦG(q) occurs either (i) at v, (ii)
along a self-loop attached at v, (iii) along a self-loop attached at w, or (iv) at a downfork z
along an edge E′ connecting v and w. Case (i) implies that d1 + L(C)/2 is equal to d(q, v).
Moreover, if v is a downfork for q, then symmetry (i.e. the vertex-flipping automorphism
of the subgraph of G obtained by removing self-loops) implies that w is a downfork for p
(as the downfork directions cannot be along self-loops, which move us further away from p
and q), so that there is a geodesic from p to w passing along an edge E″ connecting v and
w that is distinct from E. Thus d1 + L(C)/2 = d(q, v) = d(p, w) = d1 + L(E″), which is
a nontrivial linear equality. Case (ii) is impossible as we have chosen C to be the shortest
possible loop edge, and q is strictly further from the base of the loop than p. Case (iii)
implies that d1 + L(C)/2 = d1 + L(C′)/2, where C′ is a loop edge attached at w, giving a
nontrivial linear equality. Lastly, case (iv) implies that
$d_1 + L(C)/2 = \frac{1}{2}(L(E) + L(E'))$, another nontrivial linear equality.


   Thus, when G has topological self-loops, the failures of injectivity for ΨG along those
loops are the only failures of injectivity, and we can reconstruct G as before. By the same
arguments as earlier, this extends to the IECT. The only other possibility is that G consists
of a pair of vertices joined by one or multiple edges, as on the left side of Figure D.8. The
shape of the resulting intrinsic persistent homology transform, as seen on the right side of
Figure D.8, tells us that a failure of injectivity has occurred, as the barcode corresponding
to the tip of the blue thorn tells us that the corresponding point has valence two. While the
right-hand side looks identical to the IPHT of a wedge of loops, the barcodes themselves are
different. Indeed, the valence of the central vertex (in red) is equal to the number of thorns
in this case, whereas for a wedge of loops the valence is twice the number of thorns. Since this
valence information can also be read from the IECT, we can also distinguish the IECT of these
graphs from those of the wedge of loops. To recover the lengths of the edges in G, we
note that the intrinsic length (using either $\hat{d}_B$ or $\hat{\tilde{d}}_s$) of each thorn
is half the length of its corresponding edge.


                     [Figure with two panels; labels: L (left), L/2 (right)]

               Figure D.8: The image of a two-vertex graph under the IPHT or IECT.


   Thus we have seen that, under our generic assumptions on the edge lengths of G, in
all these cases when ΨG or χG fails to be injective, it is possible to detect that this has
happened, that each resulting scenario can only arise in a unique way, and that there is a
procedure to reconstruct G from IPHT(G) or IECT(G), up to isometry.
Bibliography


[Bat14]     Jonathan Bates. The embedding dimension of Laplacian eigenfunction maps.
            Applied and Computational Harmonic Analysis, 37(3):516–530, 2014.

[BBI01]     Dmitri Burago, Yuri Burago, and Sergei Ivanov. A course in metric geometry,
            volume 33. American Mathematical Society Providence, 2001.

[BFM+ 18]   Robin Lynne Belton, Brittany Terese Fasy, Rostik Mertz, Samuel Micka,
            David L Millman, Daniel Salinas, Anna Schenfisch, Jordan Schupbach, and Lu-
            cia Williams. Learning simplicial complexes from persistence diagrams. arXiv
            preprint arXiv:1805.10716, 2018.

[BGW14]     Ulrich Bauer, Xiaoyin Ge, and Yusu Wang. Measuring distance between reeb
            graphs. In Proceedings of the thirtieth annual symposium on Computational
            geometry, page 464. ACM, 2014.

[BK04]      Mireille Boutin and Gregor Kemper. On reconstructing n-point configurations
            from the distribution of distances or areas. Advances in Applied Mathematics,
            32(4):709–735, 2004.


                                          189
                                                                                       190

[BL14]      Ulrich Bauer and Michael Lesnick. Induced matchings of barcodes and the
            algebraic stability of persistence. In Proceedings of the thirtieth annual sym-
            posium on Computational geometry, page 355. ACM, 2014.

[CB15]      William Crawley-Boevey. Decomposition of pointwise finite-dimensional per-
            sistence modules. Journal of Algebra and its Applications, 14(05):1550066,
            2015.

[CCSG+ 09a] Frédéric Chazal, David Cohen-Steiner, Marc Glisse, Leonidas J Guibas, and
            Steve Y Oudot. Proximity of persistence modules and their diagrams. In
            Proceedings of the twenty-fifth annual symposium on Computational geometry,
            pages 237–246. ACM, 2009.

[CCSG+ 09b] Frédéric Chazal, David Cohen-Steiner, Leonidas J Guibas, Facundo Mémoli,
            and Steve Y Oudot. Gromov-hausdorff stable signatures for shapes using per-
            sistence. In Computer Graphics Forum, volume 28, pages 1393–1403. Wiley
            Online Library, 2009.

[CDSGO12] Frédéric Chazal, Vin De Silva, Marc Glisse, and Steve Oudot. The structure
            and stability of persistence modules. arXiv preprint arXiv:1207.3674, 2012.

[CDSGO16] Frédéric Chazal, Vin De Silva, Marc Glisse, and Steve Oudot. The structure
            and stability of persistence modules. Springer, 2016.

[CDSO14]    Frédéric Chazal, Vin De Silva, and Steve Oudot. Persistence stability for
            geometric complexes. Geometriae Dedicata, 173(1):193–214, 2014.
                                                                                       191

[CGR12]     Justin Curry, Robert Ghrist, and Michael Robinson. Euler calculus with ap-
            plications to signals and sensing. In Proceedings of Symposia in Applied Math-
            ematics, volume 70, pages 75–146, 2012.

[CMT18]     Justin Curry, Sayan Mukherjee, and Katharine Turner. How many directions
            determine a shape and other sufficiency results for two topological transforms.
            arXiv preprint arXiv:1805.09782, 2018.

[CO17]      Mathieu Carrière and Steve Oudot. Local equivalence and intrinsic metrics
            between reeb graphs. arXiv preprint arXiv:1703.02901, 2017.

[COO15]     Mathieu Carriere, Steve Oudot, and Maks Ovsjanikov. Local signatures using
            persistence diagrams. 2015.

[Cro80]     Christopher B Croke. Some isoperimetric inequalities and eigenvalue estimates.
            In Annales scientifiques de l’École Normale Supérieure, volume 13, pages 419–
            435, 1980.

[CSEH05]    David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Stability of
            persistence diagrams. In Proceedings of the twenty-first annual symposium on
            Computational geometry, pages 263–271. ACM, 2005.

[CSEHM10] David Cohen-Steiner, Herbert Edelsbrunner, John Harer, and Yuriy Mileyko.
            Lipschitz functions have l p-stable persistence. Foundations of computational
            mathematics, 10(2):127–139, 2010.

[Cur17]     Justin Curry.      The fiber of the persistence map.           arXiv preprint
            arXiv:1706.06059, 2017.
                                                                                       192

[DPRV15]    Ivan Dokmanic, Reza Parhizkar, Juri Ranieri, and Martin Vetterli. Euclidean
            distance matrices: essential theory, algorithms, and applications. IEEE Signal
            Processing Magazine, 32(6):12–30, 2015.

[DSW15]     Tamal K Dey, Dayu Shi, and Yusu Wang. Comparing graphs via persistence
            distortion. arXiv preprint arXiv:1503.07414, 2015.

[EO18]      Herbert Edelsbrunner and Georg Osang. The multi-cover persistence of eu-
            clidean balls. In 34th International Symposium on Computational Geometry
            (SoCG 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.

[GCPZ06]    Joachim Giesen, Frédéric Cazals, Mark Pauly, and Afra Zomorodian. The
            conformal alpha shape filtration. The Visual Computer, 22(8):531–540, 2006.

[GGP+ 18]   Ellen Gasparovic, Maria Gommel, Emilie Purvine, Radmila Sazdanovic, Bei
            Wang, Yusu Wang, and Lori Ziegelmeier. A complete characterization of the
            one-dimensional intrinsic čech persistence diagrams for metric graphs. In Re-
            search in Computational Topology, pages 33–56. Springer, 2018.

[GLM18]     Robert Ghrist, Rachel Levanger, and Huy Mai. Persistent homology and euler
            integral transforms. arXiv preprint arXiv:1804.04740, 2018.

[IIT16]     Alexander Ivanov, Stavros Iliadis, and Alexey Tuzhilin. Realizations of gromov-
            hausdorff distance. arXiv preprint arXiv:1603.08850, 2016.

[Iva]       Sergei Ivanov. Smoothness of distance function in riemannian manifolds. Math-
            Overflow. URL:https://mathoverflow.net/q/21316 (version: 2010-04-14).

[KG00]      Vladimir I. Koltchinskii and Evarist Giné. Random matrix approximation of
            spectra of integral operators. Bernoulli, 6(1):113–167, 2000.
                                                                                   193

[Kol98]   Vladimir I. Koltchinskii. Asymptotics of spectral projections of some random
          matrices approximating integral operators. In Ernst Eberlein, Marjorie Hahn,
          and Michel Talagrand, editors, High Dimensional Probability, pages 191–227,
          Basel, 1998. Birkhäuser Basel.

[Les15]   Michael Lesnick. The theory of the interleaving distance on multidimensional
          persistence modules. Foundations of Computational Mathematics, 15(3):613–
          650, 2015.

[Mém11]   Facundo Mémoli. Gromov–Wasserstein distances and the metric approach to
          object matching. Foundations of computational mathematics, 11(4):417–487,
          2011.

[OS17]    Steve Oudot and Elchanan Solomon. Barcode embeddings for metric graphs.
          arXiv:1712.03630, 2017.

[OS18]    Steve Oudot and Elchanan Solomon. Inverse problems in topological persis-
          tence. arXiv preprint arXiv:1810.10813, 2018.

[Sch95]   Pierre Schapira. Tomography of constructible functions. In International Sym-
          posium on Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes,
          pages 427–435. Springer, 1995.

[Sol18]   Yitzchak Elchanan Solomon.         Euler curves.      https://github.com/
          IsaacSolomon/EulerCurves, 2018.

[TMB14]   Katharine Turner, Sayan Mukherjee, and Doug M Boyer. Persistent homology
          transform for modeling shapes and surfaces. Information and Inference: A
          Journal of the IMA, 3(4):310–344, 2014.
                                                                                   194

[Var58]   V. S. Varadarajan. On the convergence of sample probability distributions.
          Sankhy¯a: The Indian Journal of Statistics (1933-1960), 19(1/2):23–26, 1958.

[Vil09]   Cédric Villani. Optimal Transport. Springer, 2009.