Potential landscape perspectives on roaming: Insights on formaldehyde from geodesic
paths
by
D. Vale Cofer-Shabica
Sc. B., Brown University, 2009
A dissertation submitted in partial fulfillment of the
requirements for the Degree of Doctor of Philosophy
in the Department of Chemistry at Brown University
Providence, Rhode Island
May 2018
© Copyright 2017, 2018 by D. Vale Cofer-Shabica
This dissertation by D. Vale Cofer-Shabica is accepted in its present form by
the Department of Chemistry as satisfying the dissertation requirement
for the degree of Doctor of Philosophy.
Date
Richard M. Stratt, Director
Recommended to the Graduate Council
Date
Peter M. Weber, Reader
Date
Brenda M. Rubenstein, Reader
Approved by the Graduate Council
Date
Andrew G. Campbell
Dean of the Graduate School
iii
Vita
Dylan Vale Cofer-Shabica was born in Miami, Florida, in January of 1987. In addition to
understanding and helping others to understand new things, he loves to bike, to cook,
and to make music—most often by singing. He aspires to leave the world a better place
than he found it.
Education
Ph.D. Physical Chemistry Brown University, Providence, RI, 2012–2018
Sc.B. Chemical Physics Brown University, Providence, RI, 2005–2009
Publications, Talks, & Posters
D. V. Cofer-Shabica, R. M. Stratt. What is special about how roaming chemical reactions
traverse their potential surfaces? Differences in geodesic paths between roaming and
non-roaming events. The Journal of Chemical Physics 2017, 146, 214303
J. M. Budarz, M. P. Minitti, D. V. Cofer-Shabica, B. Stankus, A. Kirrander, J. B. Hastings,
P. M. Weber. Observation of femtosecond molecular dynamics via pump-probe gas
phase x-ray scattering. Journal of Physics B: Atomic Molecular and Optical Physics
2016, 49
D. V. Cofer-Shabica, R. M. Stratt, The geometries of potential energy landscapes imply
dynamical signatures for roaming reactions, American Chemical Society, 250th
National Meeting, Boston, MA, 2015, PHYS 554
M. P. Minitti, J. M. Budarz, A. Kirrander, J. Robinson, T. J. Lane, D. Ratner, K. Saita,
T. Northey, B. Stankus, V. Cofer-Shabica, J. Hastings, P. M. Weber. Toward structural
iv
femtosecond chemical dynamics: Imaging chemistry in space and time. Faraday
Discussions 2014, 171, 81–91
D. V. Cofer-Shabica, Wandering Molecules, Brown University, Research Matters, Provi-
dence, RI, 2014. Invited Talk. https://www.youtube.com/watch?v=X3xyMP9EAco
Awards
Brown University Department of Chemistry Providence RI
Elaine Chase Award for Leadership and Service 2017
William T. King Prize for Teaching 2014
Teaching Fellow 2013, 2014, 2015
Service
Brown University Providence, RI
Diversity and Inclusion Action Committee 2016–2018
WE Teach STEM Discussion Group 2015–2018
Stand Up for Graduate Student Employees 2013–2017
Blackstone Academy High School Pawtucket, RI
Exhibition Night Judge 2013–2018
Teaching
Graduate Teaching Assistant 2012–2015
Brown University Providence, RI
Math Teacher 2009–2011
Blackstone Academy Charter School Pawtucket, RI
The Metropolitan Regional Career and Technical Center Providence, RI
Affiliations
Member of the American Chemical Society 2015–Present
v
Acknowledgments
The completion of my doctorate marks the end of my second degree at Brown—the
university for which I moved to and made a home of Providence, Rhode Island. This
place is the only in which I have lived as an adult and, as such, it has shaped me
greatly. More, even, than Brown or Providence, I have been shaped, mentored, loved,
helped-along, and cared-for by many people. Without them I would not be writing nor
would I have had the ability to complete my degree. To all those named below and to
those who I do not name, I extend my great and heartfelt thanks.
My work with Richard Stratt made my graduate education. He drove me to render
half-baked ideas concrete and to consider carefully the linkages in the chains of my
arguments. Rich shared his commitments to teaching and, I hope, passed on some
of the wisdom he has accumulated about how to tell a scientific story. How rare and
how lucky to have a mentor who cares as much about word-choice and flow as I. I am
grateful beyond measure for all that he has taught me.
All the members of Rich’s group contributed to my education—providing me with
ideas and with places to try out new thoughts. I am particularly indebted to Layne
Frechette for his friendship and for his ideas during the early parts of my project.
The other members of my committee all played important parts in my development
as scientist and teacher. Peter Weber took me in and employed me when I was wounded
and wondering what I would do with myself. Brenda Rubenstein has been a much
appreciated advocate and ally in the department. Matthew Zimmt listened and acted
to make space for me when I told him of teaching assignments, “We can do better.”
Even though he passed away the year before I completed my degree, Jimmie Doll was a
force throughout my education: he taught my first course in chemistry as well as my
last three. I am grateful for his encouragement and his stories and his gentleness when I
vi
asked him about results that were not even wrong.
The staff of the Chemistry department has done so much to make my decade-plus
stay productive and warm. Ken Talbot and Randy Goulet gave my bike a place in the
machine shop even after I became a theorist. Rose Barreira, Sheila Quigley, and Lynn
Rossi all helped me put out fires in my personal and academic life. Margaret Doll, Carol
Defeciani, and David Blair shepherded our cluster of computers. Ana Maria Fortes
made sure that our offices were clean. I am especially grateful for my friendship with
Eric Ferrara.
My relationships with all the members of Stand Up for Graduate Student Employees
(SUGSE) have given me meaning and purpose during my time at Brown. Struggling with
you to make Brown a better place to live and work for everyone has been an honor and
privilege and taught me more about how to be in community and to work for change
than I could have imagined.
I want to thank Kyle Trenshaw for his constant support and friendship, and for
founding and inviting me to participate in WE-Teach-STEM—a much needed space for
discussing the patriarchy and other systems of domination in the academy. I learned so
much by listening.
Christopher Brook and Carolyn Trombi literally kept me in fighting shape and helped
me hold to what I pass off as sanity in our bi-weekly Tae Kwon Do class. Your friendship
has sustained me through many storms of the heart and mind.
The Providence Sacred Harp singers have given me community to call home and
a place to sing. My soul soars with our voices. In addition to sharing songs, Jesse
Polhemus, Laura Hodges, and Mike Richards have become close friends and confidants
by sharing of themselves.
Tara Mulder and I loved each other; she supported me and made sure I completed
my degree when I thought that I might leave. Cassie Houtz is an unexpected joy who
fell into my life and made it possible to finish by feeding me bodily and emotionally.
I am immensely thankful for my family; without their encouragement I would be
with nothing. My parents, Nancy and Stephen Cofer-Shabica, have supported me more
than I can express since I left home. Knowing that I have their confidence is one of
my deepest sources of strength. Molly Shabica and Rachel Shabica are each amazing
sisters who taught me much and cheered me along the way. In the last three years of
vii
my degree, my relationship with Betsy Cofer-Shabica, my youngest sister, grew-up and
became one of mutual care. Thank you for always being ready for a phone-cry and for
being a conduit to my playful and silly self.
This dissertation is dedicated to my grandparents. Grandpa was an organic chemist
and a story-teller who fed my early love of science and of art. Grandma taught me to
bake and gave me the voice with which I sing. Granny spoiled me rotten and knew my
favorite dessert even as the penultimate of thirteen grandchildren. Granddaddy was a
professor of geology, whose sharp mind continues to inspire and who has encouraged
me to think broadly about how to play a part in the healing of the world. Thank you.
viii
Contents
List of Tables xii
List of Figures xiii
1 Introduction 1
1.1 Organization of this document . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Roaming and formaldehyde . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Roaming observed in a broad class of systems . . . . . . . . . . 5
1.2.2 Previous theoretical efforts towards explaining roaming . . . . 7
1.3 Geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 An idea from condensed systems . . . . . . . . . . . . . . . . . 10
1.3.2 Why geodesics for roaming formaldehyde? . . . . . . . . . . . 11
2 Methods 13
2.1 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.1 Formaldehyde . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.2 The potential energy landscape ensemble . . . . . . . . . . . . 16
2.2 Geodesic paths on a molecular energy surface . . . . . . . . . . . . . 17
2.2.1 Center of mass conservation . . . . . . . . . . . . . . . . . . . 22
2.2.2 Geodesic parameters . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.3 Trapped at a fixed point in 2 dimensions . . . . . . . . . . . . 34
2.3 Defining and sampling regions of interest . . . . . . . . . . . . . . . . 36
2.3.1 Monte Carlo for microcanonical reactants . . . . . . . . . . . . 37
2.3.2 Molecular dynamics for products . . . . . . . . . . . . . . . . 38
2.3.3 Monte Carlo random walk for the roaming region . . . . . . . . 39
ix
2.3.4 Geodesic itineraries . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4 Analyzing geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.1 Total lengths . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.2 Diatomic Vibration . . . . . . . . . . . . . . . . . . . . . . . . 44
2.4.3 Diatomic Rotation . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.4.4 Boundary fraction . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.5 Estimating geodesic lengths . . . . . . . . . . . . . . . . . . . . . . . . 47
2.5.1 Total Kinematic Length . . . . . . . . . . . . . . . . . . . . . . 47
2.5.2 Vibrational Kinematic Length . . . . . . . . . . . . . . . . . . . 49
2.6 A search for saddles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.6.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.7 Numerical methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.7.1 With respect to integration . . . . . . . . . . . . . . . . . . . . 52
2.7.2 Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.7.3 Hessian matrices & vibrational frequencies . . . . . . . . . . . 54
2.8 Computing the entropy of a sampled distribution . . . . . . . . . . . . 58
3 Results and Discussion 60
3.1 An image of the formaldehyde potential . . . . . . . . . . . . . . . . . 61
3.2 Locating a 2nd order saddle on the Bowman energy surface . . . . . . 63
3.2.1 Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.2.2 Search results . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.2.3 No “roaming saddle” to be found . . . . . . . . . . . . . . . . 66
3.3 Observations from molecular dynamics . . . . . . . . . . . . . . . . . 69
3.3.1 Bird’s nest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.3.2 Molecular dynamics branching ratios . . . . . . . . . . . . . . 74
3.3.3 An estimate of the energy-dependent incubation period . . . . 78
3.3.4 Existence of high JCO roamers . . . . . . . . . . . . . . . . . . 81
3.4 Geodesics are consistent with estimate . . . . . . . . . . . . . . . . . . 84
3.5 Geodesics are inherent dynamics . . . . . . . . . . . . . . . . . . . . 86
3.6 Observations from geodesics . . . . . . . . . . . . . . . . . . . . . . . 90
3.6.1 Implications of the bird’s nest . . . . . . . . . . . . . . . . . . 91
3.6.2 Key observations at a single energy . . . . . . . . . . . . . . . 92
x
3.6.3 The null effect of dynamical endpoint correlation . . . . . . . . 98
3.6.4 Trends with increasing landscape energy . . . . . . . . . . . . 100
3.6.5 Convergence of entropy of distributions with energy . . . . . . 108
3.7 Roaming as an entropically mediated phenomenon . . . . . . . . . . . 112
4 Concluding remarks 113
4.1 New methods, new observations . . . . . . . . . . . . . . . . . . . . . 113
4.2 Future directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Bibliography 116
A Derivation of path-integral form of the equation for free diffusion 125
A.1 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
A.2 Unconditional minimization . . . . . . . . . . . . . . . . . . . . . . . 129
B Two more ways of preserving the center of mass during escape steps 133
B.1 Swatting a fly without moving: Follow the inverse-mass-weighted gradient134
B.2 Laser-cannon: A general projection . . . . . . . . . . . . . . . . . . . 137
C Microcanonical sampling for the anharmonic oscillator 143
C.1 Normal Mode Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . 143
C.2 Construction of Normal Mode Coordinates . . . . . . . . . . . . . . . 145
C.3 Corrections for Anharmonicity and Selection of Total Angular Momentum147
D Monte Carlo sampling the landscape ensemble 150
D.1 General requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
D.2 Implementation specifics . . . . . . . . . . . . . . . . . . . . . . . . . 152
E Integrating externally-supplied FORTRAN code 157
E.1 Form of switching functions . . . . . . . . . . . . . . . . . . . . . . . 164
? Parts of this dissertation are reproduced from D. V. Cofer-Shabica, R. M. Stratt. What
is special about how roaming chemical reactions traverse their potential surfaces?
Differences in geodesic paths between roaming and non-roaming events. The Journal
of Chemical Physics 2017, 146, 214303 with the permission of AIP Publishing.
xi
List of Tables
2.1 Atomic masses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Comparison of step sizes in geodesic algorithm . . . . . . . . . . . . . 32
2.3 Microcanonical and landscape energies . . . . . . . . . . . . . . . . . 33
2.4 Roaming region constants . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1 Properties of stationary points on the Bowman surface . . . . . . . . . 65
3.2 Coordinates of second order saddle on Bowman’s surface . . . . . . . 67
3.3 Coordinates of global minimum on Bowman’s surface . . . . . . . . . 67
3.4 Coordinates of cis-HCOH minimum on Bowman’s surface . . . . . . . 67
3.5 Coordinates of trans-HCOH minimum on Bowman’s surface . . . . . . 68
3.6 Coordinates of transition state for cis-trans isomerization . . . . . . . . 68
3.7 Coordinates of H2 CO −−→ trans-HCOH transition state . . . . . . . . 68
3.8 Coordinates of tight transition state on Bowman’s surface . . . . . . . . 68
3.9 Coordinates of proposed roaming saddle . . . . . . . . . . . . . . . . 69
3.10 Dispositions of molecular dynamics trajectories with energy . . . . . . 75
3.11 Fraction of roaming trajectories with high JCO . . . . . . . . . . . . . . 83
3.12 Success rate of computed geodesics by channel and energy . . . . . . 91
3.13 Comparison to dynamically correlated (paired) geodesics . . . . . . . . 99
3.14 Mean total kinematic length for geodesics by channel and energy . . . 103
3.15 Portion of identically zero boundary fraction vs. energy . . . . . . . . 107
3.16 Mean boundary fraction vs. energy . . . . . . . . . . . . . . . . . . . 107
3.17 Entropy of total length distributions vs. energy . . . . . . . . . . . . . . 108
D.1 Energy-specific acceptance ratios and seed configurations . . . . . . . 156
xii
List of Figures
1.1 Schematic representation of roaming . . . . . . . . . . . . . . . . . . . 3
1.2 Geodesics on a simple potential energy surface . . . . . . . . . . . . . 9
1.3 Geodesics on a simple potential energy surface . . . . . . . . . . . . . 11
2.1 Energy scales for formaldehyde energy surface . . . . . . . . . . . . . 15
2.2 The potential energy landscape ensemble . . . . . . . . . . . . . . . . 17
2.3 Free and escape steps wind around boundaries . . . . . . . . . . . . . 20
2.4 Müller-Brown potential . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.5 Fixed point on the Müller-Brown potential . . . . . . . . . . . . . . . . 36
2.6 Hessian parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.1 A projection of the potential surface . . . . . . . . . . . . . . . . . . . 62
3.2 Orientation to the H2 CO surface . . . . . . . . . . . . . . . . . . . . . 71
3.3 Six roaming trajectories . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.4 Molecular dynamics channels (reactive) . . . . . . . . . . . . . . . . . 76
3.5 Molecular dynamics channels . . . . . . . . . . . . . . . . . . . . . . 77
3.6 Rates of reaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.7 Failure of statistical distribution of reaction times . . . . . . . . . . . . 80
3.8 A “cold” shoulder in the JCO distribution . . . . . . . . . . . . . . . . 82
3.9 Roamers exhibiting high JCO . . . . . . . . . . . . . . . . . . . . . . . 84
3.10 Distribution of lengths over full geodesic paths . . . . . . . . . . . . . 85
3.11 Inherent dynamics pictograph . . . . . . . . . . . . . . . . . . . . . . 87
3.12 Geodesic as inherent dynamics of formaldehyde . . . . . . . . . . . . 89
3.13 Total length distributions . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.14 Component length distributions . . . . . . . . . . . . . . . . . . . . . 95
xiii
3.15 Boundary fraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
3.16 Total length distributions in the high-energy limit . . . . . . . . . . . . 101
3.17 Convergence of total length distributions . . . . . . . . . . . . . . . . 102
3.18 Component length distributions in the high-energy limit . . . . . . . . 104
3.19 Boundary fraction in the high-energy limit . . . . . . . . . . . . . . . . 105
3.20 Convergence of distribution of boundary fractions . . . . . . . . . . . 106
3.21 Entropy of total length distributions . . . . . . . . . . . . . . . . . . . 110
3.22 Entropy of component length distributions . . . . . . . . . . . . . . . . 111
A.1 Discretizing a Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
B.1 A projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
xiv
CHAPTER 1
INTRODUCTION
Roaming is a novel reaction mechanism [6] characterized by fragments that take long
excursions, far from the reacting center before returning and dissociating. This strange
dynamical feature turns out to occur in a large number of systems [7] and has generated
considerable experimental and theoretical effort around the detection and explanation
of the phenomenon in the last decade. Formaldehyde, H2 CO—being one of the simplest
systems to exhibit roaming and being the first known to do so—provides us with an
attractive arena within which to understand this novel dynamical phenomenon.
The geodesic formalism [8] is a tool to understand the inherent dynamics of a
classical system by characterizing the most efficient pathways through the energy
landscape. The inherent dynamics of a system are the dominant, most important modes
of motion—instead of describing the fine details, the inherent dynamics give the overall
behaviors and sequencing of events. Likewise, geodesics, rather than being concerned
with the enumeration of minima and transition states between them, provide a global
view of the potential surface.
As theorists, one of our primary tasks is to tell stories that provide durable intuition
to those who would understand or reason about the systems we study. Oftentimes, the
plots of those stories are driven by experimental observations and/or insights gained
through analytic equations or numeric calculations. The work presented here is another
attempt to tell the story of roaming as it manifests in formaldehyde. Experiments by
van Zee and co-workers [9] first hinted at a hole in our understanding of formaldehyde’s
photodissociation dynamics; extant theories of reaction dynamics, such as transition
1
2
state theory [10], neither account for their observations nor enable us to reason about
the roaming mechanism. We use the theoretical framework provided by geodesics to
make a new window into the inherent dynamics of roaming in formaldehyde.
1.1 Organization of this document
In the thesis that follows, we make a thorough investigation of roaming in the context
of formaldehyde using geodesics as our primary lens. The present chapter resumes by
providing background on roaming in the context of formaldehyde and other notable
systems. We also survey the various theoretical thrusts towards understanding roaming
in general. Moving on, we then discuss the group’s development of geodesics as a tool
for understanding the inherent dynamics of condensed systems and make an argument
for why geodesics in a small-molecule system might be applicable to the problem of
roaming.
Chapter 2 details our methods of study and analysis, beginning with the models we
use and assumptions we make about formaldehyde and roaming. We then move to a
description of geodesics in the context of a small molecule, discussing new wrinkles and
our resolutions to them. In Chapter 2, we also describe the molecular dynamics and
Monte Carlo methods used to sample regions of the formaldehyde potential. The chapter
concludes with a description of several of the numerical methods used throughout our
studies.
In Chapter 3, we present and discuss our major results. We begin by looking through
the frame of stationary points and molecular dynamics, but quickly exhaust their utility
and move on to geodesics. Our discussion first takes up a single energy, as in our
paper [1], and then moves to an energy-dependent investigation. Our observations lead
us to conclude that the roaming mechanism is an entropically mediated phenomenon
in formaldehyde. We discuss the implications of this conclusion.
The final chapter, 4, is summative. We recapitulate the major findings as well as the
technical developments that enable our study. We also make suggestions about future
directions for research.
3
Figure 1.1: Schematic representation of roaming
Roaming is a route to products that involves a fragment taking long excursions from
a dissociating parent before returning and abstracting another component (often, but
not exclusively, an atom). In the case of formaldehyde, roaming provides another
route to close-shelled molecular products when excited near the radical dissociation
threshold. A single hydrogen may “roam” up to several angstroms away from the formyl
parent for several picoseconds before returning and abstracting another hydrogen:
H2 CO −−→ HCO···H −−→ H2 + CO.
1.2 Roaming and formaldehyde
Because of its relative simplicity, ease of handling, and relevance to fundamental
reaction mechanics, formaldehyde, H2 CO, has served as a test bed for theorists and
experimentalists interested in fragmentation dynamics for many years; indeed by the
early 1980s, formaldehyde was already considered a well beaten horse [11]. The
great amount of scrutiny to which the four-atom molecule has been subjected makes it
perhaps surprising that a totally unknown reaction mechanism lay hidden amongst its
dynamics1 for so long. In the mid 1990s, van Zee and co-workers presaged roaming after
they observed a strange shift as a function of energy in the distribution of the product
rotational states of CO when photolyzing H2 CO under collisionless conditions near
1 We will address the following point later in this chapter, but it is worth indicating now that in most
of the experiments treating formaldehyde, H2 CO is excited to its first electronic state, S1 , which then
non-radiatively decays to the ground state, S0 , where vibrationally hot, close-shelled products are
formed.
4
the radical dissociation threshold [9]. Dissociation via a slightly energetically higher
vibrational state, yielded a fraction of substantially slower CO rotation. The “cold
shoulder” that appeared in the distribution of JCO product states when exciting near the
radical threshold suggested to them the opening of a second channel to closed-shell
molecular products.
It was not until over a decade later, in a joint theoretical-experimental effort, that
Townsend and colleagues [6] revealed how bizarre the origin of the low-energy bump in
van Zee’s distribution was. Near the radical dissociation threshold, one of the hydrogens
may be ejected from the parent, but not actually depart. This roaming hydrogen may
linger several angstrom’s away for up to several picoseconds before returning and
abstracting another hydrogen. Figure 1.1 shows this schematically. In departing by
this route, the H2 is typically highly vibrationally excited and, as a consequence of
energy conservation, the CO is left with much less internal energy. Because the excited
vibronic states of CO are not heavily populated, most of the energy comes from rotation;
hence the greater population in low JCO states.
The authors of that first paper characterizing roaming [6] immediately recognized
that the phenomenon was not at all mediated by the well-known tight transition state
to molecular products2 . Aside from the great strangeness of the mechanism, a major
piece of evidence was that molecular dynamics studies initiating trajectories from the
transition state to molecular products revealed no evidence of roaming [13, 14]. If
roaming did proceed via the transition state, surely these studies would have encountered
the mechanism.
A note about energy surfaces and scales: After excitation to the first excited elec-
tronic state, H2 CO decomposes into either molecular (scheme 1.1) or radical products
(scheme 1.3). Molecular products are formed on the electronic ground state, S0 , and
radical products on the S0 and T1 (first excited triplet) states [15]. Roaming to molecular
products (scheme 1.2) occurs on the S0 surface [6]. Roaming, however, is not the
primary channel for the formation of molecular products in formaldehyde. Estimates
range from 15% to 40% as a function of energy [16]; but, roaming remains an ill-defined
2 We refer to a “tight” transition state in the sense that it involves the formation of an activated complex
with an energetic barrier [12].
5
process and these estimates should be taken as order-of-magnitude only. Experimen-
tally, the onset of roaming is somewhere between 100 cm–1 and 200 cm–1 below the
the radical dissociation threshold [17]. The relevant experimental energy scales3 for
processes on the S0 surface are given here:
H2 CO −−→ H2 + CO (direct) 27720 cm–1 [9] (1.1)
−−→ HCO···H −−→ H2 + CO (roaming) ≈ 30200 cm–1 [17] (1.2)
−−→ H · + HCO · (radical) 30328 cm–1 [18] (1.3)
Townsend and colleagues [6] demonstrated that roaming in formaldehyde occurs on
a single ground electronic state under classical dynamics and microcanonical initial
conditions with total angular momentum of 0. Therefore, even though quantum effects
surely play a role in dynamics of energy transfer leading up to roaming, roaming can
be understood from a purely classical perspective. This classical viewpoint will be our
frame of reference through the investigation.
1.2.1 Roaming observed in a broad class of systems
While roaming was first observed and characterized in formaldehyde, researchers
quickly sought and found many other systems where roaming occurs [7]; within the
first decade of observation, over 20 systems were reported to roam [19]. Many early
investigations identified roaming by multi-component product state distributions as in
formaldehyde. Out of all the systems, a collection of characteristics of roaming fragments
came together: large amplitude, wandering motion on a flat region of potential and time-
scale separation between that motion and the other vibrations of the system. Roaming
has been reported in a wide range of unimolecular as well as bimolecular reactions.
Here we briefly describe a few of the systems exhibiting roaming to give a sense of the
great plurality of systems that roam.
Acetaldehyde, CH3 HCO—which is simply formaldehyde with a hydrogen replaced
by was a methyl group, CH3 —was a natural second place to find roaming [20]. Here the
relatively massive methyl group roams away from the formyl before abstracting another
3 After
this chapter, our investigation will be exclusively based on a high-accuracy, full-configuration-
space, model potential for formaldehyde. The experimental values given here are connected to
energies on that surface by the addition of the zero point energy; see Section 2.1.1.
6
hydrogen (though rarely a hydrogen will roam and abstract the methyl). Roaming seems
to be the dominant channel for formation of molecular products in acetaldehyde and
can account for as much as 80% of the flux [21].
Roaming on multiple electronic surfaces was first reported on the D0 and D1 surfaces
of NO3 [22, 23, 24]. These authors also report that the NO3 −−→ NO + O2 reaction
proceeds exclusively by roaming [23].
It seems that many roaming systems also feature radical-radical abstraction path-
ways [7] and at least one author has speculated that wherever there exists such a
barrierless route to abstraction leading to excited, closed-shell products, roaming will be
present too [25]. This idea was indeed borne out in a large number of H + Cn H2n+1
alkane reactions [26].
Even though roaming can be described in purely classical terms, much speculation
has been made about the quantum manifestations of roaming. There are not enough
studies to address this point in generality, but the roaming mediated reaction, H +
MgH −−→ Mg + H2 offers at least one finding: Just below break-up energy of the
complex, the vibrational wave functions of bound states seemed to be highly delocal-
ized [27]. This is not at all surprising given the great variety of roaming pathways seen
in other systems [17].
“Roaming” or something like it has even been reported in the liquid phase [28].
Mereshchenko et al. claim that CHBr3 exhibits a roaming-mediated isomerization to
BrHCBr – Br while solvated. In this system, “roaming” occurs on a 100 fs time-scale, with
large amplitude motion and on a flat region of the potential energy surface. While 100
fs is an order of magnitude faster than roaming in most of the other systems described
here, the other properties: flat potential and unpredictable wandering are certainly
consistent.
Roaming has been presented as a failure of transition state theory and a challenge
to our understanding of reaction dynamics in general [7, 19]. Indeed roaming does
present such an opportunity and has rightly inspired considerable interest in a theoreti-
cal framework for understanding the mechanism—the present work is no exception.
However, we would be remiss if we failed to point out that roaming is only another
example in a long history of systems the dynamics of which are not well understood in
terms of barriers and extrema [25, 29]. We suspect that a broader perspective on the
7
energy landscape is necessary to provide insight into the dynamics of these systems.
1.2.2 Previous theoretical efforts towards explaining roaming
Previous attempts to provide a theoretical explanation for roaming can be broadly
divided into two camps based on whether they take a local or global perspective on the
configuration- or phase-space. Local descriptions often rely on the identification of a
reaction coordinate and/or a dividing surface and then make some kind of transition state
calculation to compute a rate or branching ratio. Global methods seek to apprehend
the large scale (phase- or configuration-space) structures that guide the dynamics of
roaming. In this section we will give a brief exposition of the major ideas advanced in
each camp.
“Roaming transition states”
Klippenstein, Harding, and colleagues have done considerable work to advance a
statistical, transition state theory of roaming [30]. They were the first group to identify
a “roaming” transition state, or a saddle point resembling roaming, in formaldehyde
and acetaldehyde [31]. In both cases, the saddles were located 30–70 cm–1 below
the radical dissociation threshold. They present a multi-path transition state theory
with transition states constructed to take into account dynamical effects as well as
the features of the energy surface. Their predictions compare favorably to reduced
dimension trajectories in acetaldehyde [32] at energies less than 350 cm–1 above
the radical asymptote, but suffer substantially above this limit, where roaming is still
prevalent [21, 33].
In the same family of theories, Andrews, Kable and Jordan [34] present a two param-
eter model coupled to a variational transition state theory calculation that successfully
fits roaming vs. radical branching ratios over a narrow range of energies for H2 CO,
CH3 HCO, and NO3 . Unfortunately, they are not able to speak to the portion of the
molecular channel arising from the tight transition state or provide much insight into
how to think about the mechanism.
The Bowman group has also weighed in on the subject of roaming saddles [35]
and contends that, at least in the aldehydes studied, the saddles are insufficiently
8
energetically separated from the tight transition state to be treated independently in a
statistical theory (this judgment is disputed [36]). They also have observed [19] that in all
attempts to define intrinsic reaction coordinates [37] by way of roaming saddles [38, 39],
the dynamics show massive deviations from these paths, euphemistically called “large
amplitude motion” in the vicinity of the saddles. Such deviations are perhaps at the root
of the terminally frustrated attempts at a statistical theory for the prediction of branching
ratios.
Phase space structures governing roaming dynamics
In a recent series of papers [40, 41, 42, 43, 44], the Bristol-Cornell-Crete collaboration
offers a rather different perspective on roaming in the language of non-linear dynamical
systems theory. In a variety of model systems, the authors identify phase-space structures
that guide the dynamics of roaming. In their scheme, roaming trajectories are given
their bounds by phase-space dividing surface constructed by computing the normally
hyperbolic invariant manifolds of the space [41]. They also identify periodic orbits—
which function like attractors—and suggest that the origins of roaming behavior are in a
family of periodic orbits within what they identify as the roaming region of phase-space.
Of particular interest is the authors’ observation of the fractal nature of phase-space
bands of initial conditions leading to different products, which suggests that the subspace
leading to roaming is highly variegated [42].
The collaboration also identifies these periodic orbits and unstable manifolds in
a reduced-dimensional model of formaldehyde [43]. They propose that there exists
a particular periodic orbit associated with roaming that bounds all roaming (but not
radical) trajectories and guides them back to molecular products [43, 44]. While the
results they achieve do much to address the “whys” of roaming, their approach is
unfortunately limited to two degree of freedom systems because of technical (though
not insurmountable) hurdles to computing normally hyperbolic invariant manifolds in
larger systems.
9
Figure 1.2: Geodesics on a simple potential energy surface
In the figures above, R and P correspond respectively to the reactant and product wells
of the Müller-Brown potential [45]. A steepest-descent reaction path connecting them
is shown in the squares (). We also show geodesic paths, circles (
), between the
reactants and products (right) and points labeled A and B (left). In addition to showing
how the geodesic path differs from the reaction path, the figure on the right also shows
a geodesic in several stages of refinement using the algorithms discussed in Section 2.2.
Figure modified from [8].
1.3 Geodesics
Similar in spirit to the global methods employed by the Bristol-Cornell-Crete collabora-
tion, our geodesic formalism [8, 46] is a means for studying the large-scale structures
of a dynamical system. We seek to understand the inherent dynamics—the most im-
portant components of the dynamics, smoothed and stripped of noise. Geodesics are
decidedly not concerned with the basins, transition states, or stationary points of a
potential surface, but with the properties of the major thoroughfares by which a system
traverses configuration space4 . Figure 1.2 shows the contrast between a geodesic path
and a steepest-descent reaction path connecting the basins.
We will formalize our definition of geodesics later, in Chapter 2, but looking ahead,
our notion of shortest paths (geodesics) as most important paths comes from a formal
4 Thatgeodesics are constructed entirely in the space of configurations rather than the phase-space
of the system is another distinction between our methods and those of the Bristol-Cornell-Crete
collaboration. We are still able to pose fruitful questions of the potential surface so long as we probe
the properties of paths rather than of points.
10
rendering of a solution to the Fokker-Planck equation for free diffusion as a path integral:
Z R ,t " Z t 2 #
f –1 dR
G R0 → Rf , t = e D R(τ ) exp (1.4)
e dτ
e e R0 ,0 e 4D 0 dτ
e
G R0 → Rf , t is the probability density for diffusing from R0 to Rf in a time t and
D R(τ ) indicates an integral over the space of all possible paths, R(τ ). D is the diffusion
e e e e
constant. The insight comes from the integral term in the exponential. When diffusion
e e
is slow, D is small and the only paths that will contribute are those that minimize the
2
integral over dR/dτ , i.e., those that minimize the force-free action. Those same
paths are also the shortest paths, or geodesics of the system. There is a quantitative
e
connection between the rate of diffusion in a system and the length of its geodesics; as
they elongate, the dynamics slow down [8].
Geodesic theory is related conceptually and in aim to transition path theory [47], a
scheme for understanding rare-events in potential systems by deriving average trajectory
flows. The techniques provide a prescription for constructing, among other things,
the “current” of reactive trajectories between regions of configuration space. The flow
lines of such currents are filtered objects, with unimportant motions washed out, much
like our geodesics and notion of inherent dynamics. In contrast, however, where the
methods of transition path thory are difficult to apply to systems with many degrees of
freedom, there are simple algorithms [8] for computing geodesics in systems with 2 to
1000s of degrees of freedom.
1.3.1 An idea from condensed systems
The initial ideas for geodesics and our notions of most efficient paths were developed
in the context of condensed systems [8, 46, 49, 50, 51, 52]. Geodesics have enabled
a number of remarkable successes including quantitative prediction of the diffusion-
percolation transition [8], revealing the underlying dynamics of preferential solvation in
a binary mixture of competing solvents [49], and, more recently, providing insight into
the glass transition in hard spheres [51].
What all of these phenomena have in common is that they manifest in systems where
small variations in their energy landscapes are unimportant. Indeed, in the extreme case
of hard spheres, the potential has no fine variation—it only takes on two values: 0 or ∞.
11
Figure 1.3: Geodesics on a simple potential energy surface
Geodesic path (solid red) between endpoints (filled, black circles) in a simple model of
a liquid. In this two-dimensional example, it is apparent how geodesics are like the
shortest-path solutions to a maze. Figure modified from [48].
Even in the case of ordinary liquids at room temperature, the well-depths of most of
the minima are far below the few kB T per degree of freedom (∼ 200 cm–1 ) available
to the system. Rather than being mediated by gathering enough energy into a special
coordinate and surmounting a barrier, these processes are driven by the system’s ability
to navigate a disordered landscape (see Figure 1.3). These entropic considerations
require us to probe the global structure of energy landscape instead of searching for
special stationary points.
1.3.2 Why geodesics for roaming formaldehyde?
Much like the perspective taken by the members of the Bristol-Cornell-Crete collabora-
tion [43, 44], we are firmly of the view that understanding the peculiar dynamics of
roaming will not come from the properties of stationary points on the energy surface.
We also have a different goal than many of the other theoretical projects in Section 1.2.2.
Rather than fiddling with branching ratios, we seek to understand the essential properties
12
of the formaldehyde energy landscape that give rise to roaming.
In deploying geodesics to understand roaming, we implicitly assume that the minor
variations in the potential are not central to the dynamics—indeed, we take the minima
to be irrelevant. Our perspective is in stark contrast to the basin-driven lens offered
by Wigner’s transitions state theory [10] and even Wales’ network-of-basins view on
energy landscapes [53].
We hypothesize that such disregard for local features of the surface is not only
appropriate, but essential because photolysis injects such enormous amounts of energy
into the system. For many regions of the landscape, fluctuations in the potential will
be small in comparison to the kinetic energy—the system will skate high above any
dips or rises. Given this great energetic excess, it would be impossible to pin down
the backbone of the dynamics in terms of just minima and barriers. In the particular
case of roaming, the potential in the region of the landscape where roaming occurs
is quite flat and the roaming hydrogen strays far from the reacting center. The global
exploration of the landscape suggested by its wanderings is just the kind of problem to
which geodesics are suited.
Another way of distinguishing our perspective is by thinking in terms of low-energy
versus high-energy limits on the energy landscape. In the low-energy limit, it makes
sense to prioritize stationary points because excursions away from them are rare. In
contrast, in the high-energy limit, the system’s dynamics will be driven by global
connectivity, which renders the details of stationary points unimportant. In the case of
roaming formaldehyde, we are clearly in the high energy limit and wide exploration of
the space is called for; the holistic mode of analysis offered by geodesics, insensitive to
the details of the potential, fits the problem well.
In the next chapter, we develop the tools to unambiguously specify geodesics on a
small-molecule potential surface. We discuss the landscape ensemble, the space within
which geodesics are defined, in Section 2.1.2, and then give a description of how they
are computed in Section 2.2.
CHAPTER 2
METHODS
In this chapter we describe our models and the assumptions that underlie them. We then
proceed to a description of our algorithms, sampling procedures, and other methods.
The initial sections deal with computing geodesics on a small-molecule potential energy
surface. This task is related to but different from previous work done by the group in
several key ways. The remainder treat the search for a saddle on such a surface, numeric
methods, and a technique for estimating the entropy of an unknown distribution.
2.1 Models
2.1.1 Formaldehyde
To treat roaming in the photodissociation of Formaldehyde (H2 CO), we use a single
classical, ground-state potential surface. This is reasonable at the energy scales of
interest because formaldehyde dissociates on the ground state after excitation to and
relaxation from the first excited singlet state [15].
The potential surface we use was generously provided to us by Joel Bowman, of
Emory University and is described in detail by Zhang et al. [54]. A summary of the
development and features of the surface follows. The surface is global in that it covers
the space of all configurations at energies from the ground state all the way up to triple
fragmentation (H2 CO −−→ H + H + CO). In addition to explicitly treating molecular
dissociation, the authors constructed it with an eye towards studying the abstraction
13
14
reaction H + HCO −−→ H2 + CO; this because of its importance to roaming. The
surface is assembled from independently constructed fits to 6 different regions of
configuration space in internal coordinates. Regions expected to be well described by
single reference methods were computed with coupled clusters [CCSD(T)]. Where these
were expected to fail, multi-reference, configuration-interaction (MR-CI) with complete
active space, self-consistent field (CASSCF) methods were used (with 6 active orbitals
and 6 active electrons). Roughly 170,000 single point energy calculations went into the
fits, which were then joined by quintic switching functions1 . Fits were validated against
the ab initio calculations, adjusted, and generally found to be in quite good agreement.
The authors note that the surface is suitable for dynamics studies, but not for those
demanding “spectroscopic accuracy” because of generally modest energetic deviations
(there was, however, a large, 3.23%, one at a single stationary point) and discrepancies
in computed frequencies. Likewise, the surface is sufficient for our purposes.
The most important property of the surface is that it exhibits roaming; indeed it
is this surface that first contributed to the discovery of roaming by Townsend and co-
workers [6]. As part of our validation of the potential, we implemented the molecular
dynamics scheme described in Section 2.3.2 and successfully reproduced the product
distributions reported in their paper.
That roaming is exhibited on this surface is a point not to be minimized or underes-
timated. As our investigation targets the inherent dynamics of roaming and photodis-
sociation in formaldehyde, we do not need a perfectly accurate potential. We require
only one that has the essential features that give rise to roaming. Classical treatment
of the Bowman surface meets this criteria. While quantum effects may play a role in
the photodissociation of formaldehyde (and they certainly do guide energy flow and
deposition through the electronic states), we know that roaming manifests classically
on this surface and so we pursue a purely classical treatment.
For the purposes of making contact with experimental energies, the zero-point
energy of formaldehyde must be added to the energy of any excitation laser. Zhang
and co-workers give a harmonic estimate of the zero-point energy [55] as:
Ezpe = 5844 cm–1 (2.1)
The potential is defined in a 6-degree-of-freedom space of inter-atomic distances, but
1 The published form deviates from that used in their source code; see Appendix E.1.
15
Figure 2.1: Energy scales for formaldehyde energy surface
This scale is for the fitted energy surface used in our investigations. Plotted also is the
experimental limit of detection for roaming. All energies are given in wavenumbers.
Figure modified from [54].
all of our simulations (and path-finding routines) are carried out in the 12-dimensional,
Cartesian lab frame. A detailed description of the way we interface with the FORTRAN
code provided by the Bowman group is in Appendix E.
The relevant energy scales for the problem and the energy surface are given in
figure 2.1. Our investigations probed energies from 35000 cm–1 to 45000 cm–1 ,
corresponding to excitation wavelengths2 between 349 nm and 255 nm; see Table 2.3
for all energies used. This range starts just above the radical dissociation threshold
(Erad –1
µ > 33240 cm ) allows us to compare the three channels of interest: direct
dissociation to molecular products, roaming dissociation to molecular products, and
radical dissociation. Roaming is not observed experimentally above 41060 cm–1 [17].
A natural system of units for the study of phenomena in the energy range of electronic
excitations is that of Hartree atomic units and we use them throughout our investigation.
The values of the electron’s rest mass, me , its charge, e, and the reduced Planck constant,
¯h, are all set to unity. The fundamental unit of length is the Bohr radius, a0 = 0.529 Å;
2 The upper end of our energy range is non-physical as the same electronic pathways are not accessed
at this wavelength. We can, however, make useful comparison between molecular dynamics and
geodesic results on this surface even at experimentally inaccessible energies.
16
Element Mass / me *
H 1837.15
C 21874.66
O 29156.95
Table 2.1: Atomic masses
* me : rest mass of the electron
the unit of energy is the Hartree, EH = 27.211 eV = 2.1947 × 105 cm–1 ; and time
passes in units of ¯h/EH = 0.0242 fs. The masses of the atoms relevant to the simulation
are given in Table 2.1.
2.1.2 The potential energy landscape ensemble
What we call the inherent dynamics of a system can be thought of as the system’s
classical dynamics stripped of noise (e.g. high frequency vibrations or rotations) that
are not essential to the motions of a system. They describe the dominant thoroughfares
of configuration space—the major pathways followed during a chemical or physical
transformation. In a sense, the inherent dynamics are the most efficient routes through
configuration space. To understand the nature of roaming, we take formaldehyde as
a model system and examine the topology and connectivity of its energy landscape.
There are certainly many ways one could study the space of allowed paths, but a
conceptually simple route is a binary partition of configuration space into allowed
and disallowed regions: those where the high energy of photo-excitation makes the
variegation of the potential irrelevant and those where the potential is so large as to
be practically inaccessible. We do this formally by introducing the potential energy
landscape ensemble [46].
The landscape ensemble is a statistical mechanical ensemble like the canonical
or microcanonical ensembles. However, instead of a fixed temperature or energy, its
thermodynamic variable of control is a fixed maximum potential energy, called the
landscape energy, EL . Within this ensemble, all configurations such that the potential is
less than the landscape energy,
V(R) ≤ EL (2.2)
e
17
Figure 2.2: The potential energy landscape ensemble
The landscape energy, EL , provides the thermodynamic variable of control in the
landscape ensemble. This fixed maximum potential energy plays a similar role to the
microcanonical energy or the canonical temperature. In the figure, clear valleys are
accessible, V(R) ≤ EL , while the shaded mountains are forbidden. Figure from [46].
e
are admitted with equal probability; see Figure 2.2. Equiprobability is a result of the same
entropy maximization arguments that lead to equiprobable states in the microcanon-
ical ensemble. In the thermodynamic limit, the landscape ensemble gives the same
configuration space probability densities any other ensemble (e.g. micro-canonical,
canonical) gives. In some ways, this binary division is the simplest partitioning that one
could make given a space with a scalar field on it3 . Such a divided space gives us a
frame within which to locate the most efficient paths.
2.2 Geodesic paths on a molecular energy surface
The group’s initial development of the ideas behind efficient paths occurred in the
context of diffusing liquids [8]. Writing the solution for the Fokker-Planck equation for
free diffusion as a path integral4 provides our guiding principle.
Z R ,t Z t
–1
f
G R0 → Rf , t = e D R(τ ) exp T R(τ ) dτ (2.3)
e e R0 ,0 e 4D 0 e
e
3 Rather,it is the simplest partitioning of a scalar field that leaves us with any chance of the space
remaining connected. We could have also demanded V(R) = EL , but this space would have volume
of zero measure and would surely be hopelessly disconnected e for all but the simplest potentials.
4 See Appendix A for a full derivation.
18
Here, G R0 → Rf , t is the probability density for diffusing from R0 to Rf in a time t
and D R(τ ) indicates an integral over the space of all possible paths, R(τ ), respecting
e e e e
whatever boundary conditions we impose. T is the kinetic energy and D is the diffusion
e e
constant.
We examined and were then motivated by eq. 2.3 with the goal of understanding
the inherent dynamics or most efficient paths of systems under study. There are many
ways that we could formulate a notion of most efficient path, but an expedient one
comes from combining eq. 2.3 with the potential energy landscape ensemble discussed
in Section 2.1.2.
The landscape ensemble partitions configuration space into allowed and dis-allowed
regions and provides boundary conditions for eq. 2.3. When diffusion is slow (D is
small), the paths that dominate are those that minimize the force-free action in the
exponential. Z t
S R(τ ) = dτ T(R(τ )) (2.4)
e 0 e
We require that we this functional be minimized in the space of paths respecting the
landscape criterion, where V(R(τ )) ≤ EL for all τ . These same paths minimizing the
action are also shortest paths or geodesics, which minimize the kinematic length (to be
e
introduced in Section 2.4.1):
Z t
`= dτ [2T]1/2 (2.5)
0
Note the dimension of T is [Energy] and so the integral in eq. 2.5 has dimension of
[Length] × [Mass]1/2 . The paths that minimize the functional are therefore technically
mass-weighted shortest paths. We will confront this distinction in Section 2.2.1.
From the landscape perspective, the dynamics of a system are not classical paths in
phase space, but much simpler objects, geodesic paths in configuration space. These
geodesics provide information about the connectivity and structure of the energy
landscape. This is our main result: that the inherent dynamics are given by geodesics
in landscape ensemble.
Casting the diffusion equation in the path-integral manner of eq. 2.3 is so powerful
because it frees us from having to solve what is otherwise a horribly complicated and
specific problem—diffusion over a potential is a different (possibly insoluble) differential
equation for every potential. We integrate over all possible paths, composed of many
19
infinitesimally short steps. Because the potential does not vary dramatically over such
short scales, we can use the equation for free diffusion. But, the impact of the potential
does not go away; its presence is encoded in the, perhaps quite stringent, boundary
conditions imposed by the landscape criterion, V(R) ≤ EL .
The condition in eq. 2.4 has form of a non-holonomic constraint.5 Finding solutions
e
to problems with non-holonomic constraints is often difficult, but the form of such
solutions is specified by the Kuhn-Tucker theorem [56]. The theorem holds that the
desired solution will be the union of paths that unconditionally minimize the functional
in eq. 2.4 and those where strict equality holds. The Kuhn-Tucker theorem points to
the need for two components of an algorithm for computing geodesics: free-steps that
unconditionally minimize the path length and steps that trace along the boundary of
the obstacles defined by V(R) = EL contours. For an N atom system, the kinetic energy
can be written in terms of the lab-frame, Cartesian coordinates for the centers of each
e
atom:
1 ˙T ˙
T= R MR (2.6)
2e e
Where RT = (x1 , y1 , z1 , x2 , y2 , . . . , zN ) is the total configuration-space vector with R˙ its
time derivative, and M is the mass matrix, given by:
e e
(M)iµ,jν = δi,j δµ,ν mi i, j = 1 . . . N ; µ, ν = x, y, z (2.7)
With mi is the mass of the ith particle in the N-particle system.
The path between a target vector, Rf , and a starting position, R0 , that unconditionally
minimizes eq. 2.4 in time t for such a kinetic energy is then the straight-line path given
e e
by: τ τ
R(τ ) = 1 – R + R (2.8)
e t e0 t ef
Discretizing this path is straight-forward6 :
R – Ri
Ri+1 = Ri + δs e f e i (2.9)
e e |Rf – R |
e e
5A holonomic constraint is one that can be expressed as an equality relationship between a constant
and a function of the parameters of its domain, e.g. coordinates and time, as in: 0 = f(q1 , q2 , . . . , t).
6 Consider R(τ + ∆τ )
e
20
Figure 2.3: Free and escape steps wind around boundaries
From top to bottom: A free step causes system to enter forbidden region, which it then
escapes to the boundary. The repeated action of free steps into the forbidden region and
escape steps out trace the boundary of the obstacle in accordance with the Kuhn-Tucker
theorem. Figure from [8].
where δs is a step length, chosen to be small relative to the features of the potential7 .
This is the same procedure used previously by the group [8]. After each step we check to
make sure that the landscape criterion, V(R) ≤ EL , is satisfied. If it is, we accept the step.
If it is not, we are in a forbidden region and must escape. The repeated combination of
e
the free, straight-line, steps and escape steps will cause the geodesic to wind around
the obstacles; see Figure 2.3.
Typically, the group has used a Newton-Raphson root search to find the nearest
solution to V(R) = EL .
Rn+1 = Rn – ζ ∇V Rn (2.10)
e
e e e e
7 See Section 2.2.2.
21
where
V(Rn ) – EL
ζ = e 2 (2.11)
∇V Rn
e e
By repeated application of eq. 2.10, we would take steps following the gradient to the
edge of the V(R) = EL boundary. However, in the present case, there is an additional
complication. Formaldehyde is a system that has heterogeneous masses and so moves
e
along the gradient (as the above, naïve Newton-Raphson step would take) will displace
the center of mass. This problem had never been encountered and its exposition is
treated alone in Section 2.2.1. Resolution is found in a projected Newton-Raphson root
search that prevents center of mass perturbation via a uniform back-translation. Escape
steps are then found by iteratively following the projected gradient until the landscape
criterion is satisfied:
Rn+1 = Rn – ζ (1 – M) · ∇V Rn (2.12)
e e e e
where ζ is defined as before in eq. 2.11 and the projection, (1 – M), is a 3N × 3N
matrix, to be derived in Section 2.2.1.
Geodesics between endpoints are therefore found on the formaldehyde surface by
applying eq. 2.9 and, as necessary, eq. 2.12 until the system is within 2δs of the target.
Optimization While the geodesics found by the deterministic portion of our algorithm
satisfy the Kuhn-Tucker theorem, they are not necessarily shortest paths—the theorem
establishes necessary but not sufficient conditions of the form of the result. We therefore
apply a simple optimization scheme to reduce the lengths. During the optimization
phase, we construct geodesics using the previously developed algorithm and so all the
properties of those geodesics hold here as well. The optimization round proceeds as
follows:
1. A random configuration along the R0 → Rf path is selected: Ri , not one of the
end-points
e e e
2. Ri is perturbed:
0
Ri = Ri ± δp (1 – M) · Eˆ
e
e e
Where the sign is chosen randomly between + and – with equal probability,
δp = δs is the length of a small, perturbative step, and Eˆ is a configuration-space
22
unit vector along a randomly chosen, atomic, lab-frame axis. (1 – M) is the same
projection we use to enforce a constant center of mass (see Section 2.2.1).
0
3. If V(Ri ) > EL , the configuration is forbidden; a new Ri is then chosen and the
0 0
process repeated until V(Ri ) ≤ EL and Ri is in the allowed region.
e e
e e
0 0
4. Two new geodesics, R0 → Ri and Ri → Rf , are computed.
e e e e
0
5. If the total R0 → Ri → Rf path is shorter than the original, the new path replaces
the old; otherwise the old path is retained.
e e e
The above procedure is repeated many times until the path length is converged; explicit
termination criteria are given in Section 2.2.2.
2.2.1 Center of mass conservation
In systems with heterogeneous masses, a gradient-following Newton-Raphson root
search erroneously introduces center of mass translation. Here we explore the ramifica-
tions of this and derive a projected root search that does not. This topic is also taken
up in Appendix B, where two other methods are described. The results derived in this
section allow us to construct geodesics that are guaranteed not to displace the center
of mass while traversing configuration space. This is of particular importance to the
present work, where the center of mass motion contributed a non-trivial excess to the
total kinematic length. We considered enforcing other conservation properties, such as
the Eckart conditions for rotations [57]. But the molecular-dynamics-derived endpoints
do not obey these conditions8 and so there is no way to impose them on the geodesics.
Notation
In the analysis of molecular systems with N atoms in d physical dimensions, it is
sometimes desirable to picture the system as specified by a single vector in the Nd-
dimensional configuration space and at other times as N spatial vectors. We use the
following definitions to transition between representations.
8 nor should they, though they do conserve angular momentum.
23
• Let ~xi ∈ Rd be the column vector specifying the spatial coordinates of the ith
atom; i = 1 . . . N.
• Let R ∈ RdN be the column vector representing the total configuration of the
system; RT = (~xT1 , ~xT2 , . . . , ~xTN )
e
e
• Let mi be the mass of the ith atom.
We can extract spatial vectors from the configuration space vector by defining the
d × dN matrix, C ~ such that
ei ~ ·R
~xi = C (2.13)
ei e
For instance, in 3 spatial dimensions,
1
~ = 1
C
e1
1 ...
In general, ↔
~
C i j = δi,j 1 , j = 1...N (2.14)
e
↔
Where 1 is the d × d unit matrix. In the other direction we have:
N
~ T · ~xi
X
R= C (2.15)
e i=1 e i
We compute the center of mass of the system as follows:
N
1 X
~xcm = mi~xi
Mtotal
i=1
N
1 X
~ ·R
= mi C
Mtotal ei e (2.16)
i=1
N
1 X ~
= mi C · R
Mtotal ei
i=1 e
~ ·R
≡M
e e
24
PN ~ is a d × dN dimensional,
where Mtotal = i=1 miis the total mass of the system and M
block-diagonal matrix composed of N ordered, d × d blocks with mi (i = 1 . . . N) along
e
the diagonals.
~ = mi 1
↔
M , i = 1...N (2.17)
e i Mtotal
Should the Center of Mass be Conserved?
Our intuition is that center of mass translation should increase the length of a path
between two regions of configuration space which have the same center of mass; this
is indeed the case as proved below.
The length we seek to minimize (eq. 2.5) is
√
Z
` = dτ 2T(τ ) (2.18)
where T(τ ) is the kinetic energy as a function of progress along the path, defined
(equivalently to eq. 2.6) as:
d~xi 2
N
X
2T(τ ) = mi (2.19)
dτ
i
Discretizing the integral in eq. 2.18 (see eq. 2.117 in Section 2.7.1) gives the contribution
from a single small step, v
uN
m ∆~x 2
uX
∆` = t i i (2.20)
i
where ∆~xi is the displacement of the ith center in a time ∆τ . Suppose the {∆~xi } are
decomposed into 2 components:
∆~xi = ∆~xi0 + ∆~x (2.21)
where ∆~x corresponds to net center of mass translation and the ∆~xi0 preserve the
center of mass. That is:
N
mi ∆~xi0
X
~0 = (2.22)
i
25
Inserting eq. 2.21 into eq. 2.20 yields:
v
uN
2
mi ∆~xi0 + ∆~x
uX
∆` = t
i
v
u
uXN N N
2
mi ∆~xi0 + ∆~x 2 mi ∆~xi0 · ~x (2.23)
X X
=t mi +
u
i i i
v
uN
2
mi ∆~xi0 + M∆~x 2
uX
=t
i
where the last term of the middle line is 0 by eq. 2.22. Identifying,
v
uN
2
mi ∆~xi0
uX
∆`0 = t (2.24)
i
as the length in the absence of center of mass translation, we then have:
q
∆` = ∆`20 + M∆~x 2 > ∆`0 (2.25)
Thus completing the proof that net center of mass translation increases the kinematic
length.
Is the Center of Mass Conserved?
Having established that to be “shortest”, paths through configuration space must preserve
the center of mass in real space, we turn to the algorithm to compute geodesics in the
potential energy landscape ensemble presented in the previous section. There are two
components of the algorithm, which treat “free” and “escape” steps respectively. We
now analyze each in turn.
Free Steps We give eq. 2.9 for computing free (straight-line) steps between the current
position, Rt , and the target, Rf :
e e
R – Rt
Rt+1 = Rt + δR e f e t (2.26)
e e Rf – R
e e
26
Collapsing the scalars into the constant γ yields:
Rt+1 = Rt + γ Rf – Rt (2.27)
e e e e
We can impose the requirement that our boundaries have the same center of mass
(recall eq. 2.16) by demanding:
~ · R0 = M
M ~ ·R (2.28)
f
e e e e
And this is indeed the case for endpoints generated by molecular dynamics. We then
ask if R1 , computed by eq. 2.27 will have the same center of mass as R0 .
e e
?
~ · R0 = M ~ · R1
M
(2.29)
~ · R0 + γ R – R0
e e e he i
=M f
e e e e
Canceling the common term and the scalar leaves:
~ · R – R0 ,
~0 = M (2.30)
f
e e e
which is a restatement of the limits on our boundary conditions, eq. 2.28. This proves,
therefore, that the first step does not perturb the center of mass. By induction, we
conclude:
~ · R0 = M
M ~ · Rt = M
~ ·R (2.31)
f
e e e e e e
for all t. And so the center of mass is not perturbed by the free steps of the algorithm.
Escape Steps When the system enters a region where the potential is greater than
the landscape energy, V(R) > EL , An escape step is required. Usually this entails a
Newton-Raphson root search—iteratively following the gradient until the landscape
e
energy criterion is satisfied:
V(Rn ) – EL
Rn+1 = Rn– e 2 · ∇V Rn (2.32)
∇V Rn
e e e e
e e
Such steps do no preserve the center of mass; here we show how. As in the case of the
free steps, we collapse all the scalars into a constant term, ζ, leaving:
Rn+1 = Rn – ζ ∇V Rn (2.33)
e e e e
27
Again our question is, Does the iterated map preserve the center of mass? We
encode this as (recall eq. 2.16):
~ · Rn =? ~
M M · Rn+1
(2.34)
e e e e !
? ~
= M · Rn – ζ ∇V n
e e e R
e
Canceling the common term and the scalar leaves us with the following expression:
? ~
~0 = (2.35)
M · ∇ V R n
e e e
which doesn’t look too good.
Does it Hold? Proving that eq. 2.35 holds in general would be quite difficult9 , but
invalidating it requires only one counter-example. All the problems treated by the
group’s analysis of geodesics to date, involve translationally invariant potentials, so
we confine our attention to a particularly simple one: the pair-potential between two
mutually interacting centers, labeled 1 and 2, in 3 dimensions. The form of our potential
is then:
V = u(~x1 – ~x2 ) (2.36)
Taking ~x = ~x1 – ~x2 We compute the derivative in eq. 2.35, arriving at:
∂u
+ ∂~x
∇V =
(2.37)
– ∂u
e
∂~x
Inserting eq. 2.37 into eq. 2.35 gives
~
∂u ∂u
M · ∇ V n = m1 – m2
e e R ∂~x ∂~x
∂u (2.38)
e
= m1 – m2
∂~x
?
= ~0
Showing that eq. 2.35 fails for m1 6= m2 . Therefore, a simple, gradient-following
root search does not preserve the center of mass with heterogeneous masses, which
9 and would require mistakes, as it does not.
28
erroneously lengthens the path. On the flip-side, eq. 2.35 does hold when the masses
are the same. It can be shown that eq. 2.35 holds for a sum of such pair potentials with
equal masses and therefore the group’s previous results are safe from this issue.
Enforcing Center of Mass Conservation Analytically
In this section, we derive a transformation to displacements in configuration space,
which leaves them unmodified except for removing any center of mass translation they
would have introduced. We effectively accomplish this via a uniform back-translation.
This general result is then verified. The method presented here is the one used in our
work though is only one of several. Two others, of staggeringly different complexity,
are presented in Appendix B.
Suppose we have a general displacement in configuration space, ∆R, and would
like to remove center of mass motion. We therefore seek P such that:
f
e
~0 = M~ · ∆R + P (2.39)
e f e
This leads to the under-determined relation:
~ · P = –M
M ~ · ∆R (2.40)
e e e f
There are many solutions to this equation (including P = –∆R, which wouldn’t help
us much!), but we can choose the following convenient approach to constructing a
e f
useful solution: Let P be the configuration space vector which displaces each center by
the same quantity such that eq. 2.40 is satisfied. This quantity, of course, will be the
e
negative of the net center of mass displacement introduced by ∆R and is given by:
f
~ cm = –M
∆x ~ · ∆R (2.41)
e f
We can build P from ∆x ~ cm using eq. 2.15 as follows:
e
N
~ T · ∆x
~ cm
X
P= C i (2.42)
i=1
e e
~ T,
Defining the summed matrices as C
e
T ↔
~
C =1 , i = 1...N (2.43)
e i
29
we can write
~T · M
P = –C ~ · ∆R (2.44)
e e e f
Further, defining
T
M=C
~ ·M
~ (2.45)
e e
we can write
P = –M · ∆R (2.46)
e f
So the configuration space representation of ∆R, which includes no center of mass
translation is:
f
∆R? = ∆R + P = (1 – M) · ∆R (2.47)
f f e f
Where 1 is the dN × dN identity.
The dN × dN matrix M has a band structure defined by:
mj ↔
Mi,j = 1 , i, j = 1 . . . N (2.48)
Mtotal
For instance, in three spatial dimensions, d = 3, we have:
m1 0 0 m2 0 0 mN 0 0
0 m1 0 0 m2 0 · · · 0 mN 0
0 0 m1 0 0 m2 0 0 mN
1 0 0 m2 0 0
m
1 ..
M= (2.49)
0 m 1 0 0 m 2 0 .
Mtotal
0 0 m1 0 0 m2
.. ..
. .
Armed with this result we can re-write the equation for escape steps (eqs. 2.32
and 2.33) as:
Rn+1 = Rn – ζ (1 – M) · ∇V Rn (2.50)
e e e e
where
V(Rn ) – EL
ζ = e 2 (2.51)
∇V Rn
e e
as before.
30
Verification
We would like to verify that our expression (eq. 2.50) conserves the center of mass. We
proceed as before, using our new result.
? ~
~ · Rn =
M M · Rn+1
(2.52)
e e e e !
? ~
=M · Rn – ζ (1 – M) · ∇V n
e e e R
e
After, rearranging and canceling the common term as well as the scalar, we have:
? ~
~ · ∇V n = M (2.53)
M R M · · ∇ V n
R
e e e e e e
Inserting the definition for M (eq. 2.45) gives:
? ~T · M
~ · ∇V n = ~ ·C ~ · ∇V n (2.54)
M R M R
e e e e e e e e
which can only hold if
~ =M
M ~ ·C ~T · M
~ (2.55)
~T
e e e e
We can see that this does indeed hold by inserting the component definitions of C
~ (eq. 2.17):
(eq. 2.43) and M
e
e
N
~T · M ~ T · M
h i
~ ·C ~ = ~ C ~
X
M M
i i
e e e i e e e
N
X mi ↔ ↔
~
= 1 · 1 · M
Mtotal
i e
(2.56)
N
1
X
↔
= 1 ~
mi · M
Mtotal
i e
h↔ i
= 1 ·M ~
~T · M
e
~ ·C
M ~ =M ~
e e e e
What of the Newton-Raphson Root Search? From the above, eq. 2.50 clearly pre-
serves the center of mass. However, the point of eq. 2.32 was a root search to ensure
V(R) = EL . Does the sequence of projected Newton-Raphson steps retain the property
e
31
that they locate the V(R) = EL boundary of the forbidden region? Yes it has; the con-
struction of M began with P in eq. 2.42. In which we specified displacing all centers by
e
the same amount, ∆x~ cm . This amounts to a net translation, under which our potential is
e
invariant. In Appendix B.2, we take up the general problem of projected root searches.
If our problem contained a field and the potential was not translationally invariant or if
the boundary configurations did not have the same center of mass, all of this would go
out the window. In particular if it were not the case that the potential was translationally
invariant, we would not be able to rationalize there being something “wrong” with
center of mass translation in the geodesic as we did earlier.
2.2.2 Geodesic parameters
With respect to step sizes, we take steps during the optimization phase of length equal
to the steps during the deterministic phase, δp = δs = 5 × 10–2 a0 = 2.645 × 10–2 Å.
We chose this value because we expected variations in the potential to be small on this
scale and smaller values markedly increased computation time without substantively
increasing the length of the path found. Lengthening of the path corresponds to skipping
fewer obstacles and/or more closely tracing curves—thereby more accurately exploring
the landscape. See Table 2.2 for a comparison of lengths over a range of step sizes.
With regards to termination criteria, we aggressively optimize of the length of the
paths: requiring 103 trials resulting in no shortening. This means that on average,
optimizing a single geodesic requires computing order 104 paths. In most cases, the
optimization phase offers modest (5%) reductions in length of the computed geodesics,
however there are some extreme cases where it yields length reductions of order one
half.
There are also times when the relative positions of endpoints prevent our algorithm
from finding a connecting path between them10 —it is quite possible that one does
not even exist.Pairs of endpoints were excluded from analysis when the algorithm
failed to find a path within a “reasonable” (orders of magnitude greater than typical)
amount of time. Specifically, if the algorithm required more than 102 more steps than
10 For details and rates of failure, see Table 3.12 on page 91.
32
δs /a0 Computed Path Length / a0 Execution Time / s
5 × 10–1 16.6827 0.06
5 × 10–2 17.2255 0.33
5 × 10–3 17.2570 3.51
5 × 10–4 17.2620 35.81
Table 2.2: Comparison of step sizes in geodesic algorithm
Comparison of Euclidean path lengths (without optimization) and execution times to
compute geodesics for different step sizes on the formaldehyde potential with EL =
34928.7 cm–1 . In this example, the Cartesian distance between the products and
reactants is 15.923 a0 . An attempt was made with δs = 5 × 10–5 , but it did not
terminate on its own within 1000 seconds. We postulate that this is because the step
size was smaller than the scale used in computing the gradient (see Section 2.7.2),
which rendered the algorithm inaccurate.
an unobstructed path would require, the search was abandoned:
$ %
R – R0
2
imax = 10 · e e f
(2.57)
δs
where b·c is the floor function. We also set the cutoff at 103 , but this served only to
lengthen execution time and did not appreciably decrease the rate of failure.
Selecting a landscape energy for a small molecule system
Another concern of importance is the selection of the landscape energy. In the group’s
previous work, which has focused on macroscopic systems, the average value of the
potential over the course of a simulation was used [8, 49]. Indeed, in the thermodynamic
limit, hVi = EL exactly. In the present case, we are not in the thermodynamic limit
and such a solution would be of little use. The system spends much of its time in
the reactive well, where the majority of the internal energy is kinetic. The moment of
passage through or around any transition state is but fleeting and will not meaningfully
contribute to the average.
Another technique is simply to select some probable, “thermodynamic” value for the
energy. This is what we did in our first paper treating formaldehyde [1]: the landscape
energy was taken from the peak of the distribution of values of the potential while a
hydrogen was far-removed from the formyl over many molecular dynamics simulations.
33
Eµ / cm–1 excitation wavelength / nm
34500 349.0
∗35000 343.0
35500 337.2
∗36000 331.6
36223 329.2
37000 321.0
∗37600 314.9
37688 314.0
38000 311.0
38688 304.5
∗38814 303.3
∗41010 284.4
∗45000 255.4
Table 2.3: Microcanonical and landscape energies
These energies were selected to span the range of energies over which roaming manifests
in formaldehyde and to allow maximal overlap with the studies of Houston et al. [58].
Energies above were used for sampling reactant well and for molecular dynamics studies.
An experimental realization of these energies would correspond a laser with energy
Eµ – Ezpe ; Ezpe = 5844 cm–1 , see Section 2.1.1. An asterisk (*) indicates energies for
which geodesics were computed. The horizontal line shows the approximate limit at
which roaming is observed experimentally (41060 cm–1 [17]).
While this is fine, it is still somewhat arbitrary in that it depends on the choices
for “far-removed”. It could also conceivably forbid paths that are indeed classically
energetically accessible on the energy surface. In small systems, with few degrees of
freedom, it is the partitioning of energy into different degrees of freedom that controls
the particulars of the dynamics. The key observation is this: the microcanonical energy,
Eµ , already sets a limit on the maximum potential energy, V(R) ≤ Eµ . For rare processes,
such as roaming, energy may need to be partitioned among the various degrees of
e
freedom very specifically. By taking the landscape energy equal to the microcanonical
energy, EL = Eµ , all such partitionings and transformations between them which are
classically accessible, are accessible to the geodesics we construct in the landscape
ensemble. Desiring such properties, this is just what we do.
For this work we compute geodesics at the energies given in Table 2.3; see Figure 2.1
for comparison to the relevant energies in formaldehyde. While all the major conclusions
34
described in Chapter 3 rest upon geodesics constructed with EL = Eµ , we do, at
times, appeal to analyses conducted for our previous work [1]. In these cases, Eµ =
37600 cm–1 and EL = 34928.7 cm–1 . In that earlier work, we also repeated our
analysis of direct and roaming geodesics with the landscape energy set slightly below
the radical dissociation threshold11 , EL = 33238.7 cm–1 < 33240 cm–1 = Erad
µ . We
did not repeat the sampling procedure for reactants or products described below in
Section 2.3, but did for the roaming region. Our results and conclusions were virtually
identical for this lower bound on the landscape energy.
2.2.3 Trapped at a fixed point in 2 dimensions
As part of getting up to speed with the group’s methods, we re-implemented Wang
and Stratt’s early work on geodesics [8]. In the process we encountered a previously
unreported failure mode of the geodesic algorithm described in Section 2.2. The failure
is only of practical concern in systems with two degrees of freedom and presumably
lay undetected because the group typically treats problems 1 to 3 orders of magnitude
larger.
The Müller-Brown potential [45] is a 2-dimensional model surface that captures the
key features of typical chemically-relevant energy landscapes. It has two major basins,
an intermediate, and two saddle points—these correspond to reactant, product, inter-
mediate, and transition states between them. Formed as a superposition of Gaussians,
it is simple to compute and the surface is often used as test case for new optimization
or path-finding algorithms. So it was used by our group. The potential is shown in
figure 2.4.
The major surprise while implementing the algorithm came during the testing phase.
When finding geodesics from the upper (“reactant”) basin to the lower (“product”)
basin, the program became stuck in an infinite loop for landscape energies below
-34.06. In the opposite direction, the same thing happened at an even higher energy:
-29.93. Both of these energies are comfortably above the highest saddle connecting the
basins, -40.7, so there should have left plenty of space for a transition.
The error was not in the implementation, it is a feature of the algorithm when
11 Radical geodesics cannot be computed with landscape energy below the threshold for radical
dissociation.
35
Figure 2.4: Müller-Brown potential
In the arbitrary units of the original paper, this depiction spans an energy range of
roughly -150 to +150. Basins have depths of -146.7, -80.8, and -108.1 (diagonally from
upper left) and saddles are at energies of -40.7 and -72.3 (diagonally from upper left).
operating in two dimensions. The failure is more likely to occur the longer the algorithm
is confined to a basin with concave boundaries. En route to escape, the trial point, Rt ,
winds around the edge of the basin. As it does, the unit vector pointing at the target
e
smoothly rotates in one direction. At the same time, the direction of steepest descent,
the unit gradient, rotates smoothly in the opposite direction. If the particle does not
escape the basin in time, at some iteration, the two are parallel and point directly at
each other. Then the algorithm encounters a fixed point, and will forever take steps
back-and-forth in the same location: Rt+1 = Rt ; see Figure 2.5. The fixed point is
en
stable because following the gradient in the Newton-Raphson search (eq. 2.12) on one
e
side and the trial step towards the target (eq. 2.9) on the other will drive the system
back towards the problem location and thus the errors associated with computing in
finite precision will be of no help in escape.
The problem manifests for two degrees of freedom, but not for more because the
rotating unit vectors trace out the unit circle, S1 . Given long enough time and if they
rotate in “opposite directions”, they are guaranteed to be coincident. On unit spheres
of greater dimension, there is more than one angular degree of freedom and the target
and gradient vectors may wander without encountering each other. Thus for systems
36
Figure 2.5: Fixed point on the Müller-Brown potential
The highlighted contour is at the landscape energy. The initial path of the algorithm is
shown in black. The the dashed lines point to the product, RProduct , from the current
position, R(t) . The gradient is shown with arrows. A fixed point occurs at the when
e
the vectorepointing at the product is parallel to the gradient:
Rf – Rt k ∇V R Rt
e e e e
with n > 2 degrees of freedom, the probability of coincidence on Sn–1 is infinitesimal
and the problem does not occur.
2.3 Defining and sampling regions of interest
Our algorithm generates paths that give us information about the connectivity between
regions of configuration space. Rather than being concerned with the fine details of
minima, we sample the volumes of configuration space corresponding to the species
whose inter-conversion we wish to study: reactants, molecular and radical products,
and roaming intermediates. As geodesics are paths between points, we need a way to
sample these volumes of configurations space.
The following section details our methods of drawing points from the regions
37
of configuration space we explore. To compute geodesics, we select endpoints by
randomly drawing points from the appropriate distributions. Some of our sampling
techniques produce points in phase space (including kinetic degrees of freedom). As
our algorithm only requires points in configuration space as input, we simply project
out the momenta when necessary.
2.3.1 Monte Carlo for microcanonical reactants
To microcanonically sample region of the reactants we use the method of normal mode
approximation with anharmonic corrections described by Peslherbe and coworkers [59];
we describe the high-points here and in more detail in Appendix C. First the potential
is approximated by normal modes constructed from a diagonalization of the Hessian
matrix at the global minimum. Energy is then deposited in these modes according the
microcanonical distribution for a collection of harmonic oscillators with total energy,
Eµ . In our studies, Eµ takes on the values given in Table 2.3 on page 33.
As we desire to fix total angular momentum to 0, excess angular momentum is then
removed by subtracting the net instantaneous rotation of the molecule. First, we form
the spurious angular momentum (this is simply the total angular momentum of the
molecule),
N
~js =
X
~ri × mi~vi (2.58)
i=1
Where the sum is over atomic centers and ~ri , ~vi , and mi give the Cartesian position,
velocity, and mass of the ith atom. To correct for this term we add ~j = –~js to the
molecule by computing the vector angular velocity:
ω = I–1~j (2.59)
where I–1 is the inverse of the moment of inertia tensor [60]. We then add this angular
velocity to each center:
~vi0 = ~vi + ω × ~ri (2.60)
As we deposit energy in the normal modes, there will automatically be no net
translational momentum. Finally, to correct for anharmonicity of the potential, the
coordinates are iteratively displaced and the momenta scaled until the phase space
point reflects the desired total microcanonical energy (see Appendix C.3).
38
Using this method, 4 × 104 such phase space points were generated and then used
as inputs to molecular dynamics described in the next section. They were also used
as initial configurations for our algorithm for constructing geodesics on the energy
landscape. Of note: only those configurations, whose phase space points lead to
molecular products were used as input to the geodesic algorithm. We have no reason
to believe that such dynamical correlation would impact our results; see Section 3.6.3.
2.3.2 Molecular dynamics for products
To generate configurations corresponding to molecular and radical products, the reactant
phase space points are propagated via molecular dynamics. At each time step forces
act along the negative of the gradient. A 4th order Runge-Kutta integrator [61, equation
25.5.10] implemented in the GNU Scientific Library [62] advances the state of the
system according to Hamilton’s equations for the potential. This point in phase space is
propagated forward in steps of at most12 5 ·¯h/Eh = 0.121 fs until one of the following
occurs:
1. The center of mass distance between H2 and CO is greater than 12 Bohr (12 a0 =
6.350 Å), indicating dissociation (molecular or radical), or
2. 12.1 ps of simulation time (5×105¯h/Eh ) have elapsed without the above occurring,
indicating no reaction.
During the trajectory, the state of the system is recorded every 100¯h/EH = 2.42 fs.
These termination conditions and time step choices are commensurate with those used
by Bowman and coworkers [55] when working on the same surface.
Once a reactive trajectory has been identified, the H – H bond distance is probed to
assign the trajectory to radical or molecular products. We assign molecular products
for bond distances less than 6 a0 (3.18 Å) and radical products otherwise. We do not
address the triple-fragmentation channel as it is energetically inaccessible.
For each of the energies listed in Table 2.3 on page 33, 4 × 104 such trajectories
were simulated; their final dispositions are discussed in Section 3.3. The final states of
12 Ifthe implementing library deems the accuracy (estimated through step-doubling [63]) with the given
time-step insufficient, the time-step is reduced by 20%. This is repeated to a minimum of 0.84¯h/Eh
at which point the trajectory is discarded as a failure.
39
the reactive trajectories were then used as endpoints for geodesics13 .
2.3.3 Monte Carlo random walk for the roaming region
One of the difficulties associated with studying roaming is unambiguously deciding
when it happens. Given the high vibrational excitation necessary to elicit roaming
in H2 CO, C – H bonds may often be quite stretched during a molecular dynamics
trajectory and a hydrogen might often “look like” it is or is about to roam. The situation
is in little danger of improving as there is no consensus in the literature defining roaming.
Our notion of roaming is based on two components. The first is a static, configuration
space requirement. Like other workers [30, 42], we make use of a roaming region,
which is described below. The second is a dynamical feature of a trajectory, namely
that after excitation, the roaming hydrogen visits this region and then returns to abstract
the other H from the formyl and to dissociate to close-shelled molecular products.
Because roaming is a rare event, the molecular dynamics studies described in
the previous section do not produce enough configurations to adequately sample the
roaming region at a given energy. In this section, we also describe the random walk
Monte Carlo method used to sample the roaming region in the landscape ensemble.
Defining the roaming region
A succinct definition for the roaming region remained elusive for some time. Key to our
effort adding a force-based constraint to the otherwise geometric considerations. The
force criterion allows us to ensure that a dissociating hydrogen is indeed on a potential
energy plateau (small forces if and only if small gradients), characteristic of roaming
dynamics. For the geometric criteria, we err on the side of over-constraint so as to avoid
mis-classifying states of radical dissociation as roaming.
A configuration-space point is in the roaming region per our definition if the following
criteria are satisfied; see Table 2.4 for constants:
13 While the ends of all reactive trajectories were used as endpoints for geodesics, only those initial
phase space points that resulted in molecular products were used as start-points. We discuss the
impacts of dynamical correlation in detail in Section 3.6.3, but suffice it to say that we expect this
particular choice to have no impact on our results.
40
1. The restoring force on the roaming hydrogen is of small magnitude14 :
FH(0) < Fmax (2.61)
2. The hydrogens are separated by more than a minimum and less than a maximum
distance:
dmin < ~rHH(0) < dmax (2.62)
The roaming hydrogen, designated H(0) , is uniquely identified in configuration space
as the hydrogen with the greatest Euclidean separation from the total center of mass.
We define the restoring force on the roaming hydrogen as:
~ (0) V(R)
FH(0) = –ˆrH(0) –CM · ∇ (2.63)
H
e
where
~ (0) = ∂ ∂ ∂
∇ H , , (2.64)
∂xH(0) ∂yH(0) ∂zH(0)
and
ˆrH(0) –CM = ~rCM – ~rH(0) /
~rCM – ~rH(0)
(2.65)
As usual, V(R) is the formaldehyde potential energy function and ~rH(0) and ~rCM are the
real-space coordinates of the roaming hydrogen and total center of mass respectively.
e
One may notice that under the above definitions, attractive forces are positive and
repulsive forces are negative (eq. 2.65 results in this inversion of convention). We hope
that our choice allows us to express the bounds on the force more intuitively as it is
most important the restoring force be small in magnitude14 .
We place a maximum on the force because in roaming trajectories, there are
large, persistent H – HCO separations, which require small forces on the wayfaring
hydrogen. Bounds on hydrogen-separation are required because the force criterion
alone would mis-classify equilibrium and transition state configurations, clearly not
representative of roaming, as such. We use the hydrogen separation as opposed to
H – HCO separation because large hydrogen-formyl separation can also correspond to
14 Ifthe restoring force is to be attractive, it must be small otherwise the hydrogen is not on a plateau
and will plunge back into the reactive center. For simplicity, we place no bound on the extent to which
the force is repulsive; the geometric and energetic constraints already cover this case.
41
Parameter Value
Fmax -103 cm–1 /Å
dmin 3.10 Å
dmax 4.76 Å
Table 2.4: Roaming region constants
dissociation to molecular products15 . The upper bound excludes radical dissociation,
which otherwise comprises a majority of the space.
One benefit of the criteria we employ is the ability to distinguish a wide variety of
roaming behavior. Had we simply relied on a dynamical signature like “low rotational
quantum number after dissociation”, e.g. JCO ≤ 16, we would have missed trajectories
that manifestly roam with JCO much higher; see Section 3.3.4 for more.
Sampling the roaming region
To sample the roaming region we use a Monte Carlo random-walk technique adapted
from Brady and coworkers [64], which was originally designed to facilitate micro-
canonical sampling of Lennard-Jones clusters. Here, we use it to sample configurations
in the potential energy landscape ensemble for EL with the additional requirement that
the configurations be in the roaming region described above. The details are given in
Appendix D.
Our probability density is a function which equals 1 if the trial configuration is in
the roaming region and meets the landscape criterion and otherwise equals 0. Because
the codomain of this density is binary, the acceptance probability is as well—there are
no marginally better or worse moves, only allowed and disallowed.
To generate trial moves for the random walk, we displace a single random atom,
j, along a random coordinate, by a random amount, ∆ ~j =µ ˆj (ξ – 0.5)δx . Where ξ is a
uniformly distributed, random variable on [0, 1), δx is a scaling factor and µ
ˆj ∈ {ˆx, yˆ, zˆ}
is along a randomly selected Cartesian axis of the jth atom. After each such step, an
additional move to preserve the center of mass is made by moving all atoms by the
15 Consider,for distantly separated H2 and CO, the relative centers of mass of a single hydrogen and a
formally defined “formyl” composed of the other hydrogen and the CO.
42
same, appropriate amount. Thus the total move for each atom i is:
mj
∆~ri = µˆj (ξ – 0.5)δx δi,j – ; i = 1...N (2.66)
M
where M = N i=1 mi is the total mass of the system.
P
To select δx , we computed several “burn-in” sequences of 103 steps at EL =
37600 cm–1 during which δx was scaled up and down in 5% steps until we achieved an
acceptance ratio between 0.1 and 0.9. During the data collection phase, δx was pinned
to its final value of 0.27562 a0 , which yielded acceptance ratios between 0.80 and 0.95.
To seed the random walk we took configurations satisfying both the landscape and
roaming criteria from molecular dynamics trajectories that went to molecular products
(roaming trajectories); this yielded between 200 and 9000 configurations at each energy.
Using each configuration as a different starting position, we generated random walks
of length 105 steps, recording a configuration every 100 moves. This gave a list of
105 –107 points in the roaming region, which were then shuffled and used as inputs to
the geodesic algorithm. For energy specific details, see Table D.1.
2.3.4 Geodesic itineraries
Once points in each of the regions were generated at each of the landscape energies,
we construct itineraries for the geodesics as follows:
• A reactant configuration is drawn from the those phase space points microcanon-
ically sampled from the reactant well as described in Section 2.3.1 that later
dissociates to closed shell products.
• A radical product and a molecular product endpoints are each selected from the
products of the molecular dynamics simulation described in Section 2.3.2
• An intermediate, roaming point, is drawn from points sampled via a random walk
over the roaming region and the landscape ensemble as described in Section 2.3.3.
Geodesics are then computed as described in Section 2.2 between points as given here:
direct: reactant → molecular
product
S
roaming: reactant → roaming
point
roaming
point → molecular
product
43
radical
radical: reactant → product
The union (∪) specifies that roaming geodesics are the concatenation of two geodesics.
Tens of thousands of unique routes are thus planned and executed at each energy.
2.4 Analyzing geodesics
Having computed some hundreds of thousands of geodesics, we would like to analyze
them in aggregate. Our goal is to extract the key features of the inherent dynamics of
the of each of the dissociation channels—and in particular to understand the genesis of
roaming.
2.4.1 Total lengths
Given that the geodesics are characterized by a minimum kinematic length, measuring
the lengths of the paths is a natural place to start. The total kinematic length of a path
gives us information about the total amount of motion required along the geodesic.
This length is defined as follows:
Z t
`Total = dτ [2 · T(τ )]1/2 (2.67)
0
where T is the kinetic energy, τ is a progress variable on [0, t], specifying the limits
of integration are over the entire path. To find the total kinematic length, we insert
T
T = 12 R˙ MR˙ and have:
Z t h i1/2
T
e e
`Total = dτ R˙ MR˙ (2.68)
0 e e
dR
where R˙ = dτe is the rate of change of the configuration space vector R and M is the
diagonal mass matrix:
e e
m
1
m1
m1
M= (2.69)
m2
...
mN
44
or
(M)iµ,jν = δi,j δµ,ν mi i, j = 1 . . . N ; µ, ν = x, y, z (2.70)
Where mi is the mass of the ith particle in the N-particle system. (Recall Table 2.1 for
the relevant masses.) In Section 2.7, we will discuss the finite difference expressions
used to calculate the integral in eq. 2.68. The reduction provided by eq. 2.113 yields,
for n steps:
n–1 h i1/2
T
δRi MδRi
X
`Total = (2.71)
i=0
e e
where δRi = Ri+1 – Ri . The expression for the length is simply the mass weighted sum
of the steps along the path.
e e e
As the kinetic energy can be partitioned between many degrees of freedom, we can
interrogate kinematic lengths corresponding to a subset of them by constructing the
appropriate component of the kinetic energy (e.g.: rotation or vibration) and integrating
as before. In doing so we can probe the contributions of different kinds of motion to the
inherent dynamics. In general, we can construct arbitrary kinematic lengths provided
that we can construct a kinetic energy, Tα , such that:
X
H= Tα + V
α
Tα = Tα (R, R˙ )
e e
Though the significance of such lengths will be limited to our ability to interpret them
physically. In the case of the diatomic rotations and vibrations we treat, the motions are
well-defined and physically significant even if the species are not chemically isolated.
2.4.2 Diatomic Vibration
If we wish to understand the vibrating motions of a pair of atoms in our system, we can
construct the vibrational kinematic length. The vibrational kinetic energy for 2 particles,
α and β is:
1 2
TVib = µr˙ (2.72)
2
m ·m
Where r = k~rk =
~rα – ~rβ
is the separation between the centers and µ = mαα+mβ is
β
the reduced mass. In analogy with eq. 2.67, we insert eq. 2.72 into the expression for
45
the length (eq. 2.67) and have:
Z t
h i1/2
`Vib = dτ µr˙ 2 (2.73)
0
Z t
√
= µ dτ kr˙ k (2.74)
0
The resulting sum is:
n–1
√ X
i
`Vib = µ
δr
(2.75)
i=0
where δri = ri+1 – ri .
2.4.3 Diatomic Rotation
We can perform a similar calculation for the rotational degrees of freedom. The rotational
kinetic energy for a pair of particles, α and β, is:
1 2
TRot = I~ω (2.76)
2
1 2 dˆr 2
= µr (2.77)
2 dτ
Where ω~ is the angular velocity, I is the moment of inertia, and r and µ are defined as
2
~ 2 = dτ
above. The unit orientation vector is ˆr = ~rr and it is true that ω dˆr [60]. Again,
we insert the expression for kinetic energy into eq. 2.67 and have:
#1/2
dˆr 2
Z t "
√
`Rot = µ dτ r2 (2.78)
0 dτ
Now we compute:
~r
dˆr d
= (2.79)
dτ dτ k~rk
~r˙ ~r d
= – k~rk (2.80)
r r2 dτ !
~r˙ ~r ~r · ~r˙
= – (2.81)
r r2 r
46
With some re-arrangement, we arrive at a more compact form:
!
ˆ
dr ~ ˙
r ~
r˙
= – ˆr ˆr · (2.82)
dτ r r
1 ˙
h i
= ~r – ˆr ˆr · ~r˙ (2.83)
r
˙~r
= · 1 – ˆrˆr (2.84)
r
where 1 is the identity and we have used the dyadic identity ~x(~x · ~y) = ~y · ~x~x.
dˆr in hand, we can compute the argument to the square root in our integral:
With dτ
2 " #T " #
dˆr ~˙
r ~˙
r
r2 = r2 · I – ˆrˆr · I – ˆrˆr (2.85)
·
dτ r r
T T
= ~r˙ · I – ˆrˆr · I – ˆrˆr · ~r˙ (2.86)
T
= ~r˙ · I – ˆrˆr · ~r˙ (2.87)
The tensor terms in the penultimate step can be reduced by simple distribution and the
identity: (ˆxˆx)2 = ˆxˆx, which holds for all unit vectors ˆx; see eq. B.21 for derivation. The
reduction also follows from the idempotency of projections.
Algebra out of the way, we can write a final expression for the rotational kinematic
length:
Z t 1/2
T
√
`Rot = dτ ~r˙ · I – ˆrˆr · ~r˙ (2.88)
µ
0
which is then reduced to the sum:
n–1 i1/2
√ X h iT i i
i
`Rot = µ δr · I – r r · δr
~ ˆ ˆ ~ (2.89)
i=0
where δ~r i = ~r i+1 – ~r i and ˆr i = ~r i /r i .
2.4.4 Boundary fraction
In the geodesic approach, the potential energy surface enters into the dynamics only as a
constraint on otherwise straight-line paths. It manifests as obstacles in the configuration
space, which the geodesics then trace around. In our previous work, studying systems
47
with many hundreds of degrees of freedom, we found that the geodesics hugged these
obstacles for the near-entirety (>99%) of the path [8]. Formaldehyde, with only six
internal degrees of freedom, is sufficiently low-dimensional that the obstacles are sparse
and geodesics ricochet between them. By measuring the fraction of path spent along
the boundaries of these obstacles, we can understand how tortured and constrained
the routes taken by geodesics are.
We define the boundary-fraction of a geodesic, hΘi, as the length-fraction of the
path that is very near the landscape energy, EL :
]1/2 Θ[R(τ ); EL ]
Rt
dτ [2T
hΘi = 0 R t total (2.90)
[2T ] 1/2
e
0 dτ total
where:
1 EL – V(R) < δEL
Θ[R; EL ] = e (2.91)
e 0 otherwise
for suitably small δ = 10–4 , Θ is finite only for configurations “on the boundary”. With
δ = 10–5 our results are indistinguishable. While we used the total kinematic length as
our weighting function, we could have easily used any of the other kinetic energies:
diatomic rotation or vibration. When computing the boundary fraction in this other
way, the distributions are nearly to within statistical noise.
2.5 Estimating geodesic lengths
To verify our calculations it is desirable to make an analytic estimate of the sizes of the
kinematic lengths involved. Here we use a simple model to make such an estimate.
These estimates should be considered lower bounds for a given pair of endpoints
because they assume no potential obstacles—that the landscape energy is effectively
infinite.
2.5.1 Total Kinematic Length
We conceive of our system as having only 2 components: a mass of mH2 = 2 · mH
located at ~rH2 and a mass of mCO = mC + mO located at ~rCO . Further we will content
ourselves to deal only with the translational motions of these fragments and therefore
48
restrict ourselves to a line. Finally, we assume that all motion is barrier-free—that no
potential obstacles intervene, or that for a straight-line path, V(R) ≤ EL .
Are these reasonable simplifications? They are if the contribution to the kinematic
e
length from translation is much greater than from internal rotation or vibration of H2
and CO. In particular, we consider only direct dissociation paths, which are relatively
unhindered.
Recalling our expression for kinematic length (eq. 2.67), the required kinetic energy
of the simplified system is:
2T = mH2~r˙H22 + mCO~r˙CO
2 (2.92)
If we assume constant velocity during dissociation and a total time t, we can write:
∆~rH2 2 ∆~rCO 2
2T = mH2 + mCO
(2.93)
t t
where ∆~rCO and ∆~rH2 are the total distances traveled by CO and H2 respectively.
Fixing our total momentum to zero (implying zero net motion of the center of mass)
allows us to write:
0 = ~pH2 + ~pCO (2.94)
which implies:
∆~rCO ∆~rH2
mCO = –mH2 (2.95)
t t
and therefore:
mH2
∆~rCO = –∆~rH2 (2.96)
mCO
Inserting this relation into our expression for 2T, eq. 2.93 yields:
∆~rH2 2 mH2
2T = mH2 1+ (2.97)
t mCO
Now we only need an expression for ∆~rH2 , which we can find in terms of ∆~rH2 –CO ,
the change in the center of mass separation of H2 and CO.
∆~rH2 –CO = ~rH2 – ~rCO f – ~rH2 – ~rCO i (2.98)
= ∆~rH2 – ∆~rCO (2.99)
mH2
= ∆~rH2 1 + (2.100)
mCO
49
where we have used eq. 2.96 to eliminate ∆~rCO . Using eq. 2.100, we eliminate ∆~rH2
from 2T in eq. 2.97, which yields:
∆~rH2 –CO 2 mH2
2T =
m
(2.101)
t 1 + H2 mCO
2
∆~rH2 –CO
= µH2 –CO (2.102)
t
where µH2 –CO is the reduced mass for H2 and CO, defined in the usual way. Now our
expression for the total length is:
Z t
!
∆~rH –CO
√
2
`Total = dτ µH2 –CO (2.103)
0 t
which is readily integrated to yield:
√
`Total =
∆~rH2 –CO
µH2 –CO (2.104)
Using the appropriate masses from table 2.1 gives µH2 –CO = 3427.5 me . We
estimate
∆~rH –CO
to be of order 10 a0 because the molecular dynamics trajectories
2
used to produce endpoints for the geodesics were terminated when the centers were
separated by 12 a0 . Taken together, we estimate the lower bound on total kinematic
length to be in the range of:
√
`Total & 585 a0 me (2.105)
2.5.2 Vibrational Kinematic Length
An identical analysis can be preformed for vibrational degrees of freedom. Using the
expression for TVib from eq. 2.72 we have:
2T = µr˙ 2 (2.106)
Again using the assumption of constant velocity yields:
∆r
r˙ = (2.107)
t
Which we can combine with the expression for vibrational kinematic length, eq. 2.73:
Z t
k∆rk √
`Vib = dτ µ (2.108)
0 t
50
The expression is readily integrated to give:
√
`Vib = k∆rk µ (2.109)
That this result for diatomic vibration is quite similar to the estimate we made for
total kinematic length, eq. 2.104, is no accident. The previous analysis of the total
kinematic length effectively treated formaldehyde as single large dissociating diatomic.
Such a simplification was reasonable because we expect the internal motions to be of
secondary importance relative to the overall motion of the dissociating fragments.
In the specific case of H2 vibration, we have µ = 918.56 me . We posit k∆~r k—the
total straight-line motion of the vibrating H2 —is of order 1 a0 and therefore estimate
the vibrational length as:
√
`Vib & 30 a0 me (2.110)
2.6 A search for saddles
The existence of a S0 /S1 conical intersection for formaldehyde was first reported by
Araujo et al. in 2008 [65] and brought to our attention by one of Harding and Kilp-
penstein’s papers [36] investigating the putative “roaming” saddle. We were curious if
this conical intersection would be evident in the fitted ground state potential surface
for formaldehyde provided to us by Joel Bowman [54]. If it were, we would expect
to find a second order saddle (a saddle with 2 “downhill” directions or 2 negative
force constants) in its vicinity. The saddle does indeed exist; in this section, we detail
the methods of its finding and of our unsuccessful search for the roaming saddle on
the Bowman potential. We describe the characterization of the saddles by frequency
analysis in Section 2.7.3.
2.6.1 Methods
There a variety of methods for locating saddles of arbitrary order on potential energy
surfaces. The more sophisticated schemes such as Miller’s “eigenvector following” [66]
rely on inverting the Hessian matrix. This poses difficulties when working in Cartesian
coordinates due to the zero-frequency modes associated with translation. These can be
worked-around using projection operators [67], but complicates the matter further.
51
A method for finding saddles, which in general is bad—but in certain, limited cases
may be appropriate—is to simply minimize the square of the gradient of the potential.
2
This works at all because ∇V(R) = 0 for all stationary points of a potential, V(R). This
2
is a bad idea for large, global searches because there may be local minima of ∇V(R) ,
e e e
which interfere with locating the global extrema, all identically 0. In our case, the idea
e e
isn’t terrible because we expect to start with a good guess for the location of the saddle.
It even turns out to work.
Minimization Function & Algorithm
The object of investigation is the square of the gradient of the potential function provided
by Joel Bowman, but we lack an analytic form of this fit or its gradient. To compute the
gradient, we proceed via finite-differencing as described in Section 2.7.2.
2
To minimize such a finite-differenced function as ∇V(R) , it is wise to avoid a
method requiring a further gradient. This avoids accumulation of errors by multiple
e e
numeric differencing schemes. Nelder & Mead’s simplex algorithm [68] is a good,
general purpose algorithm which does not require a gradient. We employ the faithful
implementation of their algorithm in the GNU Scientific Library [62].
While the simplex method of minimization is performant and was effective, after
the fact, I noticed the following, perhaps-useful relation. Some straightforward vector
algebra gives the gradient of the square of the gradient of a function as the simple
product of the gradient and the Hessian:
2
h i
∇ (∇V) = ∇ ∇V · ∇V = 2∇∇V · ∇V (2.111)
e e e e e ee e
One can see that ∇∇V is the Hessian by interpreting ∇∇ as a dyad. Of course this
requires estimating the Hessian at each step, which isn’t cheap. And while this is
ee ee
better than twice applying a finite difference, it is still the (noisier) product of two noisy
(finite-differenced) objects.
Points of departure
To locate the second order saddle associated with Araujo’s conical intersection, we
applied the simplex minimization scheme described in the previous section starting
from the published coordinates (given in Table 3.2). To search for the roaming saddle,
52
we generated 105 starting coordinates by taking randomly oriented steps of random
length uniformly distributed between 0 and 3 a0 away from the proposed location of
the roaming saddle (given in Table 3.9) and then applied the same search to the given
location and the 105 other trial points. Section 3.2 details our findings and reports on
the newly located saddle.
2.7 Numerical methods
As is typical of numeric calculations on machines with finite precision, comparison
operations were made to within specified tolerances. In all cases the condition for
equality was less than or equal to one part in 106 . Double precision (64-bit) floats
were used throughout. Vector operations were performed using the BLAS linear algebra
subroutines via the GNU Scientific Library [62] in the C language with double precision
on an x86_64 machine. Data analysis was implemented in a mixture of C, Python
(with NumPy [69]), and Haskell [70] (using the hmatrix package for vector operations).
Wherever required, random numbers were produced using the MT19937 variant [71]
of the Mersenne Twister algorithm implemented in the GNU Scientific Library. Parallel
data processing was aided by the use of GNU Parallel [72]. A considerable amount of
the source code has been placed in the Brown University Digital Repository [73]. The
remainder of this section gives the details of the finite-difference calculus employed
throughout the study.
2.7.1 With respect to integration
Computing properties of the geodesic often requires computing integrals over functions
of the kinetic energy. For example, the total kinematic length, `total , in terms of the total
kinetic energy, Ttotal , is:
Z t
1/2
`total = dτ 2Ttotal (2.112)
0
where τ is an arbitrary progress variable. Because of the algorithm we use to compute
the geodesics (Section 2.2), we have configurations of the system, ~x, for unevenly spaced
53
{τi }. All such integrals were performed using the following Riemann sum:
Z τn h i Xn–1 h i
~ ~˙
dτ f x(τ ), x(τ ) ≈ ~ ~˙
δτ i f xi , xi (2.113)
τ0 i=0
d~x ≈ (~x
where δτ i = τi+1 – τi and ~x˙ i = dτ i+1 – ~xi )/(τi+1 – τi ) = δ~xi /δτ i is the rate of
change of the configuration space vector ~x computed via simple forward difference.
The geometric picture of this simple technique corresponds to summing the distances
between adjacent configuration space points with the appropriate mass weighting.
While higher order techniques for integration and the required derivatives exist, we
have directly verified that the errors introduced by this method are several orders of
magnitude smaller than the statistical noise associated with averaging over many paths.
Additionally, these “higher order” methods loose the simple geometric significance of
summing over mass-weighted path lengths.
Given that there is no physical content in τ , properties of the geodesics should be
independent of it. In all integrals for the kinematic length, δτ i drops out. We show this
T
explicitly in the case of the total kinematic length. Insert Ttotal = 12 R˙ MR˙ and eq. 2.112
becomes:
e e
Z i1/2
T
h
`total = dτ R˙ MR˙ (2.114)
e e
where M is the diagonal mass matrix given in eq. 2.69. Expressing our integral (eq. 2.114)
as a sum (via eq. 2.113) yields:
n–1 i1/2
T
X h
`Total = δτ i R˙ MR˙ (2.115)
i=0 ei ei
1/2
T
n–1 δR δR
X i i
= δτ i e M e (2.116)
δτ i δτ i
i=0
n–1 h i1/2
T
X
= δRi MδRi (2.117)
i=0 e e
And we are indeed left with an expression independent of τ .
2.7.2 Gradients
While the potential described in Section 2.1.1 has an analytic form, extracting it from
the source code would have been prohibitively tedious. We therefore treat it as a black
54
box and compute gradients numerically. Partial derivatives are computed along each of
the 12 Cartesian, lab-frame coordinates defining the configuration space. We use the
5-point formula from Abramowitz and Stegun [61, eq. 25.3.6]:
1
∂f
h5
= – 8f + 8f – + O (2.118)
f f
∂xi ~x 12 –2 –1 +2 +2
where fj = f ~x + jhˆxi . We take the inter-abscissa spacing, h = 10–4 a0 .
2.7.3 Hessian matrices & vibrational frequencies
While Abramowitz and Stegun give a 5th order expression for ∂ 2 F/∂x2 , the off diagonal
terms are not treated with the same level of precision. David Eberly gives a clear
description of finite differencing schemes and their derivation for uni- and multi-variant
functions [74], which was followed to construct the approximation we use here. The
key result is that repeated application of orthogonal finite differencing methods (e.g.,
first along x then y) give rise to expressions where the coefficients are the outer products
of the coefficients for each approximation.
2
∂ 2F 4 1 X
1,1 F
= O(h ) + C ij (2.119)
∂x∂y 144h2 ij
i,j=–2
where Fi,j = F(x + ih, y + jh) and the coefficients are given by:
1 –8 0 8 –1
–8 64 0 –64 8
1,1
C = 0 0 0 0 0 (2.120)
8 –64 0 64 –8
–1 8 0 –8 1
Indexes on C1,1 are relative to the center. As hinted, the coefficients for this mixed
derivative expression arise from the outer product of those for a first derivative: C1,1 =
C1 ⊗ C1 where C1 = [1, –8, 0, 8, –1]. And indeed, to 5th order,
e e e
2
dF 5 1 X 1
= O(h ) + C Fi
dx 12h e i
i=–2
55
with Fi = F(x + ih).
The diagonal elements of the Hessian are taken from Abramowitz and Stegun,
eq. 25.3.24 [61]:
2
∂ 2F 4) + 1 C 2 Fi
X
2
= O(h (2.121)
∂x 12h e i
i=–2
with Fi = F(x + ih, y) and C2 = [–1, 16, –30, 16, –1].
To calculate frequencies, we diagonalize the mass weighted Hessian given by,
e
∂ 2V 1
, k, l = 1 . . . N α, β = x, y, z (2.122)
∂rkα ∂rlβ m1/2 m1/2
k l
where rkα and rlβ are the lab-frame, Cartesian coordinates of the N-atom system.
The eigenvalues are the mass-weighted force constants, ki , and can be converted to
frequencies in wavenumbers, ν˜i , by the relation
p
ki
ν˜i =
2πc
where c is the speed of light. For formaldehyde, there are six vibrational modes and
so the six frequencies with largest magnitude are assigned to vibrational modes. We
assign the lowest three to net translation and the intervening three to net rotation.
We use the 4th order expressions for all elements of the Hessian matrix given in
eqs. 2.119 and 2.121. A step-size of h = 7 × 10–3 was empirically determined to
minimize the magnitude of translational and rotational modes (see Figure 2.6). If we
were trying to be still more careful, we would choose h such that it could be represented
exactly in base-2 (and therefore exactly representable on a binary computer). The
implementation of these methods was verified against analytic calculations for Morse
and harmonic oscillators with a variety of frequencies.
56
Switching functions?
To stitch together the multi-reference and single-reference portions of their fit to the
formaldehyde potential, the Bowman group employed switching functions of the form16 :
1 x≤0
w(x) = 1 – 10x3 + 15x4 – 6x5 0 < x < 1
0
1≤x
where x is a dimensionless variable corresponding to the scaled distance of the farthest
hydrogen from its nearest neighbor (see [54] for details). At x = 0, 1, w(x) is only
differentiable to second order and so higher derivatives of V are discontinuous across
the hyper-planes defined by x = 0 and x = 1. Accordingly, approximations to derivatives
of V relying on such continuities (e.g. eqs. 2.119 and 2.121) will fail if these planes
intersect the sampling region of the approximation, a hyper-cube of side length 4h
centered on the point where the derivative is to be approximated. The regions where
we compute frequencies are not expected to be near the boundaries of the switching
function. If we were computing the hessian in the roaming region (perhaps in the
vicinity of a roaming saddle), we would need to be more careful.
In one instance, computed frequencies differed slightly (< 2.5%) from the published
values (see Section 3.2.1). To rule out interference from the switching function, we
repeated the calculation evaluating the elements of the Hessian by a 2nd order approxi-
mation from Abramowitz and Stegun [61] and found the computed frequencies to be
within 0.01% of the first.
16 See Appendix E.1 for a note on a mistake in their published version.
57
104
103
102
101
100
Vibration
10−1
Rotation
Translation
10−2 −5
10 10−4 10−3 10−2 10−1
+994.85
+0.05
0.0
-0.05
-0.10
-0.15
10−5 10−4 10−3 10−2 10−1
Figure 2.6: Hessian parameters
Results from computing normal modes at the second order saddle via the Hessian,
eqs. 2.119 and 2.121. Shown are geometric averages of the vibrational, rotational, and
translational normal modes (in wavenumbers) as a function of abscissa-spacing, h in
a0 . The vertical line shows the selected value of h = 7.00 × 10–3 a0 , which is used
throughout the investigation. The lower panel is a very tight zoom on the vibrational
modes and shows that the vibrational modes are converged and stabilized for this value
of h. Recall, from the upper panel, that the vibrational modes are grossly well-behaved
over the whole range of values.
58
2.8 Computing the entropy of a sampled distribution
Part of our analysis will involve computing the entropy of continuous distributions of
unknown form, but from which we can draw samples. The entropy of a continuous
distribution is given by [75]:
Z
S/kB = – dxρ(x) ln [ρ(x)] (2.123)
Supposing that one had no access to the distribution, ρ(x), but could sample it, there
are a variety of ways that one could imagine constructing an estimate of S/kB from the
samples {xi }. One general approach is to construct a model of the distribution (via
histogramming or kernel density estimation) and then to compute a numeric integral
for eq. 2.123. This is attractive because it yields the intermediate representation of
the distribution itself. However, these methods require tuning of parameters (bin or
kernel width) and the computed entropy is sensitive to them. The estimate will also be
sensitive to the particulars of the numeric integration employed.
An alternative approach, which directly constructs the entropy from the samples,
is given by Henri Theil [76]. Given a set of n observations of real numbers, {xi }, it
is possible to construct the distribution satisfying a maximum entropy principle for
the observations. That is, the distribution with maximum entropy consistent with the
observations. Qualitatively, this maximum entropy distribution will be composed of
many intervals of uniform density. The entropy of this maximum entropy distribution
can then be had analytically as:
n
2 – 2 ln 2 1 X h n i+1 i–1 i
S0 /kB = + ln x –x (2.124)
n n 2
i=1
where x0 = x1 , xn+1 = xn and x1 < x2 < · · · < xi < · · · < xn is the ordered sequence
(or order statistics) of the set {xi }. The method is only appropriate for samples drawn
from a continuous distribution.
The Theil method, along with numeric integration over histograms and kernel density
estimates were compared to the analytic entropies for densely sampled Gaussian and
uniform distributions of many thousands of widths. The method given in eq. 2.124 had
lower relative error in all cases and is the method we use to compute the entropies
of length distributions throughout. It should be noted that eq. 2.124 takes the log of
59
differences of adjacent order statistics. If any two samples are identical, the result will be
divergent17 . We use double precision numbers throughout this work and with 64 bits
of precision, the probability of violating the strict inequality requirement is infinitesimal.
For any dimensioned distribution, ρ(x) will require normalization in the ln of
eq. 2.123. The change of dimension will introduce an unknown constant offset in the
entropy. This requirement is related to the reason that there is no absolute entropy scale
classically—without quantized observables, there is no absolute “smallest” division
and therefore no way to count up states. As a consequence, the absolute entropies
themselves will not be meaningful, but differences will be.
17 Suchdivergence, of course, is consistent with the maximum entropy principle used to construct the
estimate.
CHAPTER 3
RESULTS AND DISCUSSION
In this chapter we describe the results of our investigations into the dynamics of roaming
on the formaldehyde potential energy surface. The chapter is split sharply between
two different perspectives: a local view grounded in classical trajectories and stationary
points, and a global view, aiming to reveal the inherent dynamics in terms of geodesics
and their properties.
The first section reports on our search for several stationary points of the surface.
We successfully locate a previously unreported second order saddle. However, we
could not locate another, previously reported stationary point—the “roaming saddle”.
We discuss the implications of their respective presence and absence. We then move
on to observations provided by molecular dynamics techniques, which serve as a
point of departure and comparison for understanding the global information carried by
geodesics.
In Section 3.4, we begin our discussion of geodesics by validating some of their
basic properties and showing how the information they contain aligns with our notion
of inherent dynamics. The geodesics themselves provide novel insight to the roaming
problem on the formaldehyde surface. Our major result has to do with the entropy asso-
ciated with the each of the dynamical channels available to the fragmenting molecule.
From the relative entropy of these channels we are able to predict the decline in the
incidence of roaming at high energy.
60
61
3.1 An image of the formaldehyde potential
To orient ourselves to the space, in Figure 3.1, we plot a slice of the potential surface
along with several stationary points. We constructed this particular projection in terms
of a set of Jacobi coordinates: {rH–H , rH2 –CO , rC–O , θ1 , θ2 , φ} where rH–H , rH2 –CO , and
rC–O are H – H, H2 – CO, and C – O separation respectively, θ1 is the angle between the
vectors ~rH–H and ~rH2 –CO , θ2 is the angle between the vectors ~rC–O and ~rH2 –CO , and φ
is the out-of-plane torsional angle. To compute the image shown, we varied rH–H and
rH2 –CO and fixed the other coordinates at their equilibrium values: rC–O = 2.2883 a0 ,
θ1 = π/2, θ2 = π, φ = 0.
Note that because of our choices for the projection, some features of the surface are
not visible. For instance, this particular “shadow” of the 6-dimensional object shows
neither the cis nor trans isomers of HCOH even though they are present on the surface.
For a similar reason, it appears that some of the locations of the stationary points do
not match up with the potential. The tight transition state to molecular products, ,
is displaced slightly to the left of the saddle clearly visible on the surface. In two
dimensions, saddles of degree two appear as the peaks of hills, and the shadow of the
second order saddle we locate in Section 3.2, is in fact the hill to the right of its marker,
. The Global minimum,
, appears accurate to the potential contours because all
coordinates are fixed to the global minimum at that point. The putative roaming saddle,
4, is located in an extremely flat region—the spacing between contours is an order of
magnitude closer there than on the rest of the surface.
62
12
420
10 435
00
00
Roaming
500 Region
00
8
rH−H / a0
00
420
6
4 40000
35000
25000
2 15000
Bird’s 5000
Nest
0
0 2 4 6 8 10 12
rH2 −CO / a0
Figure 3.1: A projection of the potential surface
Vertical and horizontal axes are respectively H – H and H2 – CO separation in atomic
units. The circle,
, is the global minimum of the surface. The square, , is the tight
transition state to molecular products from the global minimum. The triangle, 4, is
the location of the putative roaming saddle. The diamond, , is the newly-located
saddle of degree two associated with the S0 /S1 conical intersection; see Section 3.2.
The colors of the contours do not change much in the upper right quadrant; this is
intentional because the potential is so flat in this region. The roaming region is defined
in Section 2.3.3; the bird’s nest will be defined and discussed in Section 3.3.1.
63
3.2 Locating a 2nd order saddle on the Bowman energy
surface
In Section 2.6, we describe the simplex-minimization-based technique to search for
saddles on the Bowman potential energy surface. Our targets are two saddles that had
not been previously located on the potential:
1. We posit the existence of and search for a saddle of degree two in the vicinity of
the S0 /S1 conical intersection1 first reported by Araujo and co-workers [65].
2. We also search for the so-called “roaming saddle”, reported by Harding et al. [31],
which some believe (see Section 1.2.2) governs the rates of roaming reaction in
the usual fashion of saddles in transition state theory. If such a saddle is essential
to roaming, we expect to find one on the Bowman surface, upon which roaming
certainly manifests.
We seek the “roaming saddle” because of the significance it takes in some theories of
roaming (again, see Section 1.2.2). If it is not present on the Bowman surface, such
a feature cannot be essential to our understanding. We search for the second order
saddle because, if it exists, we expect it to form part of ridge bounding the bird’s nest
(see Section 3.3.1).
3.2.1 Verification
We are fortunate to have available for comparison, literature values for the properties
several stationary points on the Bowman surface. The description of the surface from
Zhang et al. [54] lists stable minima and first order saddles with their energies and
normal frequencies. For each of the stationary points given in their paper, we confirm
the location by way of the minimization scheme described in Section 2.6 and we
give published and computed values for frequencies and energy. As measures of
1 Conical intersections break the degeneracy of the intersecting states along two coordinates [77].
When constructing a high-fidelity adiabatic energy surface for the lower state, the intersection may
manifest as a stationary point with two imaginary frequencies (two “down hill” directions). In a 1- or
2- degree-of-freedom system, such a feature would appear as a maximum, but in our case, there are 4
other internal degrees of freedom. These will have real frequency and so the stationary point is a saddle.
See [36] for more in the case of formaldehyde.
64
accuracy, we also give the magnitude of the gradient and the root sum squared atomic
displacements between our initial guess and the final location, for each stationary point.
We define,
v
u iµ 2
uX iµ
∆R = R – R0 = t R – R0 µ = {x, y, z}, i = 1 . . . N (3.1)
i,µ
e e
where R0 is the published configuration from which the search was initiated and R is
the point we located. ∆R thereby provides a measure of the distance from the location
e e
of the start of the search to the stationary point which was found.
Table 3.1 gives the named stationary points (St.P) along with computed and pub-
lished [54] values for comparison. In each pair of rows, the upper row holds the
computed results and the lower row holds the published values (in parentheses). In all
instances where computed frequencies differ from those published (excepting for TS3 ,
see caption), we suspect it is because of rounding; had we simply truncated the values,
they would be identical.
While the frequency deviations for the transition state to molecular products are
small (< 2.5%), we were still perplexed and investigated two possible sources of error.
It is possible that the choice for inter-abscissa spacing, h = 7 × 10–3 , was improper for
TS3 on the potential surface. However, we discarded this possibility after constructing
graphs similar to those in Figure 2.6 to test for convergence—the frequencies at TS3
were even better converged with our choice of h than for the second order saddle. It is
also possible that there would be interference from the switching function (see page 56)
if the hyper-volume within which the function was evaluated contained a discontinuity
in the 4th derivative (as exists at the boundaries). We also dispensed with this theory
after using a 2nd order approximation to evaluate the elements of the Hessian; the
computed frequencies were within 0.01% of the first. We suspect that the frequencies
we provide in Table 3.1 are correct for the surface and the published values are in error.
3.2.2 Search results
Beginning our search from the location of the S0 /S1 conical intersection reported
2
by Araujo et al. [65] and minimizing ∇V(R) , our simplex minimization method
unambiguously locates a second order saddle point. We converge such that the vertices
e e
Stationary Point V(R)/cm–1 ∇V(R) /(cm–1 /Å) ∆R/Å {˜
νi }/cm–1
e e e
Global Min 3.36 × 10–6 2.05 × 10–3 7.62 × 10–6 1165, 1337, 1624, 1807, 2811, 2950
(0.00) (1164, 1337, 1623, 1806, 2810, 2949)
trans-HOCH 17983.37 0.595 × 10–3 3.66 × 10–4 1108, 1282, 1369, 1715, 2927, 4099
(17983.37) (1108, 1281, 1369, 1714, 2927, 4098)
cis-HOCH 19517.27 1.14 × 10–3 1.12 × 10–3 1221, 1343, 1440, 1609, 2741, 3839
(19517.27) (1221, 1342, 1440, 1608, 2741, 3839)
TS1 27335.32 1.63 × 10–3 5.82 × 10–4 1074i, 967, 1208, 1465, 2542, 3531
(27335.33) (1074i, 968, 1208, 1465, 2541, 3531)
TS2 30140.00 1.17 × 10–3 1.02 × 10–3 2068i, 726, 1260, 1454, 2599, 2993
(30140.00) (2068i, 725, 1259, 1454, 2598, 2992)
TS3 30583.66 0.778 × 10–3 7.30 × 10–4 1825i, 612, 659, 1357, 1864, 2994
(30583.66) (1824i, 598, 652, 1351, 1861, 3039)
2nd order saddle 33777.20 2.42 × 10–3 0.560 541.2i, 1165.3i, 313.0, 935.9, 1862.0, 2818.5
Table 3.1: Properties of stationary points on the Bowman surface
Comparison of various published [54] and computed properties (see text) of stationary points on Bowman’s surface. The
first row, are our results. The second rows give are published values within parentheses. The configurations of the stationary
points were published [54] in terms of angles and bond lengths to a precision of one-tenth degree and one-ten-thousandth
Angstrom. TS1 is between the cis and trans isomers; TS2 is between the global minimum and the trans isomer; and TS3 is
between the global minimum and molecular products, the “tight” transition state. TS3 is the only example of departure
from the expected values;
deviations larger than 1% are bolded. Energies all agree with published values to 6 significant
figures. We give ∇V(R) as a measure of the accuracy with which we have located the stationary points; it is quite good in
all cases. The location
e eof the conical intersection from which we found the second order saddle was published in [65]; this
stationary point is the only for which the search distance, ∆R (eq. 3.1), is non-trivial.
65
66
of the simplex are within, 10–14 a0 , of their centroid2 . At the saddle, the gradient’s
magnitude is: ∇V(R) = 5.83 × 10–9 Eh /a0 = 2.42 × 10–3 cm–1 /Å. This value is
consistent with our findings for other, previously published stationary points on the
e e
surface (see Table 3.1). The presence of the second order saddle in the vicinity of a
conical intersection is inherently a multi-electronic-surface result that manifests under
the use of the Born-Oppenheimer approximation in the construction of our adiabatic
potential surface. It is non-trivial that the Bowman group’s fit [54] is able to capture
the feature. The saddle’s energy, gradient, and frequencies are listed in Table 3.1,
which shows there are indeed two imaginary frequencies. The other six normal modes,
corresponding to net -translation and -rotation, all have frequency of magnitude less
than 1.0 cm–1 /Å, which is of the same magnitued as the rotational and translational
modes for the other, previously published stationary points. The coordinates of the
second order saddle are given in Table 3.2 along with those of the conical intersection.
The coordinate system is chosen such that all of the free variables are 0. We place C
at the origin, O along the y axis, and orient such that HCO is in the x – y plane. The
hydrogen excluded from the formyl is the farthest away from the total center of mass.
Distances are given in Bohr. Coordinates for all other stationary points of interest are
given in the same coordinate system in Tables 3.3–3.9.
3.2.3 No “roaming saddle” to be found
Using the same methods, we searched the Bowman surface extensively for the reported
“roaming saddle” [31]. We initiated searches from the proposed location [35] of the
roaming saddle and perturbed its position by taking random steps in a 12-dimensional
ball of up to several a0 from the starting position (see page 51). However, even in trials
of 105 such searches, we were unable to converge the gradient to less than 102 cm–1 /Å,
which we deemed insufficiently flat—indeed, the gradients at the other stationary points
we located have magnitude 5 orders of magnitude smaller. Such searches also suffered
in the following ways:
• Of those points with “shallowly” sloping gradient, none unambiguously possessed
2 Themeasure of vertex displacement from centroid is an indicator of the degree of convergence of
our algorithm—in this case, quite high relative to features of any molecular energy landscape. This
value is consistent with those we found for the other stationary points on the surface.
67
2nd order saddle
x y z
~rH(1) = 1.672984 –1.508401 3.683196
~rO = 0.0 2.222303 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.485300 –1.472777 0.0
S0 /S1 conical intersection
x y z
~rH(1) = 0.935657 –1.296518 3.332712
~rO = 0.0 2.257969 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.238146 –1.679863 0.0
Table 3.2: Coordinates of second order saddle on Bowman’s surface
Given first are the coordinates of the saddle, which we located located on Bowman’s
surface [54]. Below, are the published coordinates [65] (rotated and translated into the
same reference frame) of the conical intersection from which the search for the saddle
was initiated.
x y z
~rH(1) = –1.793899 –1.124608 0.0
~rO = 0.0 2.288201 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.793899 –1.124608 0.0
Table 3.3: Coordinates of global minimum on Bowman’s surface
Location refined and coordinates transformed from the published configuration [54].
x y z
~rH(1) = 2.030988 –0.502804 0.0
~rO = 0.0 2.513517 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.726049 3.153715 0.0
Table 3.4: Coordinates of cis-HCOH minimum on Bowman’s surface
Location refined and coordinates transformed from the published configuration [54].
68
x y z
~rH(1) = –2.070234 –0.327514 0.0
~rO = 0.0 2.513931 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.747948 2.997965 0.0
Table 3.5: Coordinates of trans-HCOH minimum on Bowman’s surface
Location refined and coordinates transformed from the published configuration [54].
x y z
~rH(1) = 0.208602 –0.724582 2.013021
~rO = 0.0 2.510365 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.516407 3.562945 0.0
Table 3.6: Coordinates of transition state for cis-trans isomerization
Also on the Bowman Surface; in their papers, referred to as TS1 . Location refined and
coordinates transformed from the published configuration [54].
x y z
~rH(1) = –1.893152 –0.852520 0.0
~rO = 0.0 2.490956 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.920528 1.378915 0.0
Table 3.7: Coordinates of H2 CO −−→ trans-HCOH transition state
Also on the Bowman surface; intheir papers, referred to as TS2 . Location refined and
coordinates transformed from the published configuration [54].
x y z
~rH(1) = 2.909270 –1.080449 0.0
~rO = 0.0 2.212928 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 0.638885 –1.953390 0.0
Table 3.8: Coordinates of tight transition state on Bowman’s surface
In Bowman’s papers, referred to as TS3 ; location refined and coordinates transformed
from the published configuration [54].
69
x y z
~rH(1) = 2.331282 –0.475906 6.482761
~rO = 0.0 2.235156 0.0
~rC = 0.0 0.0 0.0
~rH(2) = 1.720165 –1.206485 0.0
Table 3.9: Coordinates of proposed roaming saddle
Coordinates transformed from the published configuration [35]; saddle first reported by
Harding and co-workers [31]. No such saddle could be located on Bowman’s surface.
a single imaginary vibrational frequency (most often there were 2 imaginary
frequencies in the top 6 modes).
• Further, the vibrational and rotational modes were frequently difficult to separate,
there being 4–5 modes of similar magnitude (of order 10 cm–1 ; this was the case
over several orders of magnitude of h when calculating the Hessian). It is unclear
if these were indeed zero-frequency modes, which would have indicated that the
algorithm had located a ridge.
It is possible that more sophisticated methods such as eigen-vector following [66] would
succeed in locating such a roaming saddle. It is also possible, and this is our conjecture,
that the Bowman surface simply does not capture this feature. If nothing else, it is clear
that roaming does not require a well-defined saddle. Attempts to define a roaming
transition state3 in terms of such a feature are dubious at best.
3.3 Observations from molecular dynamics
In this section, we discuss a the results of the molecular dynamics simulations described
in Chapter 2. Before we begin, Figure 3.2 provides an instructive map of space we
explore. Formaldehyde has 6 intramolecular degrees of freedom4 , but this particular 2-
dimensional projection allows us to see distinctions between each of the three channels
3 “Transition
state” used here in the sense of a minimum-flux dividing surface in configuration or
phase space for an RRKM rate calculation [78].
4 Recallthat even though there are 3 real-space coordinates for each of the 4 atoms, the potential is
invariant under net rotation (about 3 axes) and translation (along 3 axes). The system therefore has
3 × 4 – 3 – 3 = 6 intramolecular degrees of freedom.
70
in which we are interested: direct dissociation, radical dissociation, and roaming-
mediated dissociation to molecular products. We show three trajectories, one from
each pathway. The coordinate along the abscissa is separation between the H2 center-of-
mass and CO center-of-mass. Motion to the right of the figure indicates the separation of
H2 from CO; one can indeed see that the direct dissociation (dotted red) and roaming
(solid black) pathways terminate in the lower right, as separated molecular products.
The ordinate gives the separation between the two hydrogens in the system. The radical
dissociation trajectory (dashed blue) moves from reactants up the axis, indicating H – H
separation, just as we expect. In addition to the trajectories, we show several actual
and proposed stationary points of the surface; see caption.
The roaming region depicted in the upper half of the figure is the region defined in
eqs. 2.62 and 2.61. The horizontal lines follow trivially from the bounds on Hydrogen
separation in eq. 2.62. The curve bounding the right side of the region comes from the
force constraint (eq. 2.61). We obtain this boundary by projecting the configurations
from the Monte Carlo random walk for sampling the roaming region (described in
Section 2.3.3) into the rH–H –rH2 –CO space and drawing a bounding edge.
The figure immediately reveals some of the key features of configuration space and
distinctions of the roaming trajectories. The roaming trajectory (solid black) spends
a long time with the loosely associated hydrogen far from the formyl parent. This is
consistent with the findings of other workers, who observe that roaming hydrogens may
spend up to several picoseconds a distance of several Angstroms from the parent [17].
To get an idea of what an extraordinarily long time this is—and to what extent the
relevant timescales differin roaming—consider that the typical vibrational period of the
H – H or H – C stretch is of the order of 10 femtoseconds—two orders of magnitude
faster. The many wiggles in the black curve while in the roaming region are variations
in H – H separation caused by H – C vibrations in the formyl!
We can also see the distinction in product H2 vibrational excitation for the roaming
and direct pathways. By observing the H – H oscillations once H2 has dissociated from
CO (along the lower right), we see how much more vibrationally excited is the roaming
H2 than the directly dissociating H2 as evidenced in the amplitude of the oscillations.
Vibrationally hot H2 is in line with energy-conservation requirements for rotaionally
cold CO.
71
12
10
Roaming
Region
8
rH−H / a0
6
4
2 Bird’s
Nest
0
0 2 4 6 8 10 12
rH2 −CO / a0
Figure 3.2: Orientation to the H2 CO surface
Vertical and horizontal axes are respectively H – H and H2 – CO separation in atomic
units. The circle,
, is the global minimum of the surface. The square, , is the tight
transition state to molecular products from the global minimum, TS3 . The triangle, 4, is
the location of the putative roaming saddle. The diamond, , is the newly-located saddle
of degree two associated with the S0 /S1 conical intersection; see Section 3.2. A typical
directly dissociating trajectory is shown in dotted red. A typical radical dissociation
trajectory is shown in dashed blue. A typical roaming trajectory is shown in solid black.
In this and all other figures with axes of dimension length, the axes are measured in
atomic units. The trajectories shown were computed with a microcanonical energy,
Eµ = 37600 cm–1 measured from the global minimum (as will be all Eµ ).
72
Though the roaming trajectory appears to pass quite close to the proposed location
of a roaming saddle (4), recall that this figure projects flat 4 degrees of freedom and the
roaming trajectory may pass quite far “above” or “below” the region of configuration
space where, for other potentials, a saddle is reported. To show the diversity of roaming
behavior and emphasize that not all roaming trajectories pass by the proposed roaming
saddle, we offer Figure 3.3, which shows six randomly selected roaming paths, 3 three
of which avoid the roaming saddle’s location entirely.
As an aside, this particular projection may be of general utility for visualizing roaming
reactions, many of which have the same general scheme:
ABC −−→ AB···C −−→ A + BC (3.2)
Where ···C indicates the roaming species. In the case of formaldehyde A is CO and B
and C are both H. For the case of acetaldehyde (CH3 HCO), all is the same except that
C is now the methyl group, CH3 . The projection we use in Figures 3.2 and 3.3 is the
A – BC center of mass separation along one axis and B – C center of mass separation
along the other. We will make use of the general scheme in eq. 3.2 again when we
visualize geodesics in Section 3.5.
3.3.1 Bird’s nest
The region in the lower right of both of Figures 3.2 and 3.3, the “Bird’s Nest”, turns
out to be key to our analysis. It is so named for the jumbled (nest like?) appearance
of the traces of the trajectories around the reactant well. We defined the extent of the
bird’s nest to be mutually exclusive with the roaming region, which bounds the bird’s
nest from above rH–H < 5.867 a0 . The right bound is the was selected to create the
smallest region which still contained the bulk of the pre-dissociation (and pre-roaming)
configurations rH2 –CO < 3.420 a0 . This selection was made without regard for the
stationary points identified on the surface5 , but we note that the edge of the bird’s nest
as defined runs along the ridge, which connects the first order saddle corresponding to
the transition state to molecular products and the second order saddle associated with
the S0 /S1 conical intersection. The ridge is visible in the projection of the potential
shown in Figure 3.1. We will discuss its significance to geodesic analysis in Section 3.6.1
5 In fact, we defined the bird’s nest prior to identifying the relevant stationary point.
73
12
10
Roaming
Region
8
rH−H / a0
6
4
2 Bird’s
Nest
0
0 2 4 6 8 10 12
rH2 −CO / a0
Figure 3.3: Six roaming trajectories
To drive home the point that not all roaming trajectories pass through the neighborhood
of the roaming saddle, we here show six randomly selected roaming trajectories. They
show a diversity of behavior, but all have in common that they enter the roaming region
and then return to dissociate to molecular products. Each trajectory was computed with
a microcanonical energy, Eµ = 37600 cm–1 .
74
3.3.2 Molecular dynamics branching ratios
Though our analysis is not concerned with detailed prediction of branching ratios for the
H2 CO dissociation reaction, we can estimate them for the classical surface under study.
Table 3.10 gives the energy-dependent results of molecular the molecular dynamics
runs. The ratios are also plotted in Figures 3.4 and 3.5; the former shows only reactive
trajectories and the latter shows all trajectories. Recall that these plots and figures should
not be taken as representative of what formaldehyde would do in an experiment as we
perform no semi-classical filtering or selection of trajectories6 , but rather of the classical
dynamics of the Bowman potential surface (see Section 2.1.1). We are concerned with
the dynamics of roaming—a phenomenon which manifests purely classically.
Figure 3.4 tells the bulk of the story as it shows ratios normalized with respect to
trajectories that succeeded in reacting. All trajectories are initiated at an energy above
the radical dissociation threshold (Erad –1
µ > 33240 cm ). Even though it is accessible, the
radical channel is not a dominant component of the flux until energies of 38814 cm–1
and higher, when it achieves parity with the direct dissociation channel. Of particular
interest to us is the behavior of the roaming channel. Lower energies show a rocky rise
in the fraction of roaming trajectories and then a decline that begins at 38814 cm–1 .
We should note that above 41060 cm–1 , roaming is not observed experimentally [17].
It is likely that above this energy, the same electronic pathway that—at lower energies—
allows the deposition of enormous amouts of vibrational energy into the S0 , ground
state surface is no longer accessible [17]. That we use the results above 41060 cm–1
for comparison to geodesic results is sensible because we are seeking to understand
the way the inherent dynammics of this surface shift with energy.
All of these observations come with a caveat that for lower energies, they are biased
towards fast processes. We know that each trajectory has sufficient energy to dissociate,
but given that we stop integration after roughly 12 ps, not all do. We do not know the
relative time scales on which reactions through each channel occur, but we certainly
suspect that they are different, especially at high energies (see how much slower is
roaming in Figure 3.6).
6 Ourpurely classical treatment does not discard states for violating zero-point energy or other “semi-
quantum” concerns; see [58].
75
Energy / cm–1 unsuccessful no reaction direct radical roaming
34500 184 38441 1097 260 18
35000 326 36967 1825 813 69
35500 507 34637 2943 1806 107
36000 778 31231 4533 3283 175
36223 943 29295 5304 4226 232
37000 1331 22329 8549 7415 376
37600 ∗1 ∗18253 11211 10016 519
37688 1814 15750 11340 10553 543
38000 1963 13101 12747 11590 599
38688 2279 8042 15161 13761 757
38814 2322 7359 15462 14071 786
41010 2544 993 18549 17040 874
45000 2782 24 17671 18711 812
Table 3.10: Dispositions of molecular dynamics trajectories with energy
For each energy, we give the final disposition of the 40000 molecular dynamics trajec-
tories we computed on the formaldehyde energy surface (details given in Sections 2.3.1
and 2.3.2). Unsuccessful trajectories were deemed insufficiently accurate for the time
step at some point during integration and discarded. Trajectories are classified as de-
scribed in Section 2.3.2. Except for the energy column, in units of wavenumbers, all
figures give absolute number of trajectories.
*: Trajectories at 37600 cm–1 were computed without properly detecting consistency
failure and so some “unsuccessful” trajectories are erroneously categorized as “no
reaction”. Assuming a failure rate equal to that of 37688 cm–1 , the error will skew
branching ratios down by no more than 5% of the expected value.
76
100
Fraction of reactive MD trajectories
Direct
Radical
10−1
Roaming / mol. products
Roaming / total products
10−2
35000 37500 40000 42500 45000
Energy / cm-1
Figure 3.4: Molecular dynamics channels (reactive)
Final states of reactive molecular dynamics trajectories. Trajectories not reacting within
12.1 ps are excluded from the figure and the relative abundance of each channel is shown
(see Figure 3.5 for all trajectories). The magenta curve, “Roaming / mol. products”, shows
the fraction of molecular products that arose from roaming: roaming/(roaming + direct).
The black curve gives roaming as a fraction of all products: roaming/(roaming +direct+
radical). Error bars giving the two standard-deviation estimate for the proportions plotted
would be smaller than the symbols used. Vertical lines indicate energies for which
geodesics were computed.
77
100
Direct
Fraction of total MD trajectories
Radical
10−1 No reaction
Roaming
10−2
10−3
10−4
35000 37500 40000 42500 45000
Energy / cm-1
Figure 3.5: Molecular dynamics channels
Disposition of molecular dynamics trajectories as a function of excitation energy. In con-
trast to Figure 3.4, the figure includes trajectories that do not react (“No reaction”), but
does not include those trajectories that failed for numerical reasons (see Table 3.10). Er-
ror bars giving the two standard-deviation estimate for the proportions plotted would be
smaller than the symbols used for all points except the roaming fraction at 34500 cm–1 ,
which is 4.5 · 10–4 ± 2.1 · 10–4 . The vertical lines indicate the energies for which
geodesics were also computed.
78
3.3.3 An estimate of the energy-dependent incubation period
We were interested in computing the times to reaction for each of the dissociation
channels: direct, radical, and roaming. To determine this “incubation period”, we
use a scheme similar to Houston and coworkers [58]: counting frames of the stored
configurations in reverse until the criteria for dissociation no longer hold. We define
our criteria in terms of the same space we visualized in Figure 3.2. For trajectories going
to radical products, we take the last frame (counted in reverse) for which rH––H > 9 a0 .
This corresponds to just after the last time the system exits the roaming region in terms of
hydrogen separation. For trajectories going to molecular products (directly or by way of
roaming) we take the last frame, again counted in reverse, for which
~rCO – ~rH
> 7 a0 .
2
This is well past the moment that the trajectory exits the bird’s nest, but the delay is
necessary to avoid mis-classifying roaming trajectories as already dissociated. From
Figure 3.3, it can be seen that we would have made such an error for roaming trajectories
if we had simply taken the limits of the bird’s nest to demark the moment of dissociation.
To treat the large number of trajectories that did not dissociate, we computed the
time by averaging the rate (inverse reaction time). Those trajectories that did not
dissociate required effectively infinite time therefore contribute a rate of 0 to the mean.
In Figure 3.6 we plot the inverse of the mean of inverses, that is the time. The incubation
period for each channel at each energy is computed as:
N
1 X –1
t–1
incubation = ti (3.3)
N
i
where the {ti } are the set of N reaction times as defined in the previous paragraph.
The major salient observation is that roaming is consistently a slower event than
direct or radical dissociation. At present we have not been able to quantitatively connect
this fact to the analysis afforded by geodesics, but it is an area for further study.
A technique we tried, but found to be unsuccessful for estimating the rate of reaction
is instructive in its failure. Houston and coworkers [58] estimated the total rate of
formaldehyde decomposition by fitting the distribution of lifetimes, t, to the distribution:
ρ(t) ∝ exp(–kt) (3.4)
79
106
Radical
Direct
Incubation period / h¯ /EH
Roaming
105
104
103
35000 37500 40000 42500 45000
Energy / cm-1
Figure 3.6: Rates of reaction
Incubation periods in atomic units for each reaction channel as a function of excitation
energy. We treat non-reactive trajectories as having effectively 0 rate of reaction when
contributing to the mean.
80
direct channel, 38000 cm–1 roaming channel, 36223 cm–1
Figure 3.7: Failure of statistical distribution of reaction times
In both figures, the ordinate is natural log of the number of trajectories as a function
of lifetime in atomic units (along the abscissa). If eq. 3.4 were a good description of
the rates, then the slope of the line of best fit would be –k, the rate constant for the
mechanism. The plot on the left is for the direct channel at 38000 cm–1 and shows the
prediction succeeding quite well. The figure on the right is for the roaming channel
at 36223 cm–1 and is showcases one of the many failure modes of the channel; this
instance is particularly spectacular as it predicts a negative rate, which is nonsense.
Such a distribution results from the simple assumptions that [79]:
• The system is ergodic—that all states with sufficient energy will eventually react.
• All vibrational states have equal probability of reacting per unit time; i.e. vibra-
tional energy redistribution is fast relative to the reaction.
While this method yielded credible fits at most energies for the direct and radical
channels, it was not suitable for roaming, even at energies for which there were many
samples available. The failure points to the non-trivial consequences roaming has for
theories of unimolecular dissociation. See Figure 3.7.
81
3.3.4 Existence of high JCO roamers
In formaldehyde, the spectroscopic signature that first hinted at the existence of roam-
ing was the presence of a low-energy, “cold”, shoulder in the product CO rotational
distribution [9]. In retrospect, this comes as no surprise. One would certainly imagine
that, given the long excursion taken by the roaming hydrogen, its plunging reentry
and abstraction of the other hydrogen would leave the product H2 highly vibrationally
excited. By necessity of energy conservation, the CO would be less rotationally excited
than if the dissociation had proceeded via the tight transition state to molecular products,
which features far less dramatic hydrogen-hydrogen separation. This “cold shoulder”
has been used in the identification of roaming in other systems [7, 19] and it is tempting
to assume that the phenomenon and the observation are inseparably coupled.
In the present case, we find just such a “cold shoulder” in the JCO distribution for
products of molecular dynamics. We estimate JCO via the crude quantization:
~
LCO
≈ ¯h JCO (3.5)
where ~LCO is the moment of inertia of CO about its center of mass. We then take
j
k
JCO =
~LCO
/¯h (3.6)
where b·c is the floor function. This analysis is certainly sufficient as a heuristic and
for order-of-magnitude observations, but not for a detailed analysis that invoked small
differences. The results of such an analysis from three representative energies are shown
in Figure 3.8.
Our approximation represents a small deviation from full expression:
~L2 = JCO JCO + 1 ¯h
CO
Defining ~L∗CO ≡ ~LCO /¯h, a better semi-classical expression would be:
1
q
JCO = ~ ∗2
LCO + 1/4 –
2
Better, but only marginally better, especially for large ~L∗CO , which we have here. Indeed
for ~L∗ = 10, the fractional deviation is only 5% between our approximation and the
CO
exact result prior to applying the floor. A more rigorous analysis would also demand a
82
Figure 3.8: A “cold” shoulder in the JCO distribution
Histogram of JCO made by combining the molecular products from three energies:
37000 cm–1 , 37600 cm–1 , and 38000 cm–1 , corresponding to a 330 nm photolysis
laser. Molecular dissociation accounted for 23414 of 119997 trajectories (39999 at
each energy). The black curve is symmetric about its mean and serves as a guide to
the eye. The low-energy shoulder to the distribution is often taken as an indicator of
roaming. We cannot tell from the above figure, but there do exist roaming trajectories
giving rise to product states with JCO > 30.
83
Energy / cm–1 35000 36000 37000 38814 41010 45000
JCO ≥ 30 0.188 0.259 0.309 0.377 0.398 0.266
JCO ≥ 60 0 0.0058 0.0133 0.0115 0.0206 0.0185
Table 3.11: Fraction of roaming trajectories with high JCO
The table above gives the fraction of roaming trajectories with JCO ≥ 30 and JCO ≥ 60
for several energies. For the given energies, at least 15% of the roaming trajectories
hide in the symmetric, JCO ≥ 30 portion of the distribution comparable to Figure 3.8.
Up to 2% of the roaming trajectories have JCO ≥ 60.
sampling scheme that properly treated the quantum nature of the system; here we use
JCO simply as a proxy for the magnitude of the rate of angular rotation.
Seeing the low-energy component of the distribution in Figure 3.8 was one of our
first global indicators that the surface supplied to us functioned as it should. When we
examined the trajectories corresponding to low JCO , we observed that they manifestly
roamed. Our early definitions of roaming even exploited this, calling all trajectories
with JCO ≤ 15 “roaming” and all those with JCO ≥ 30 “directly dissociating”. The
excluded middle was an acknowledgment of the inherent the ambiguity in the definition
of roaming though was of little consequence as we were not seeking branching ratios
at that stage.
However, we discovered that this simple classification doesn’t tell the whole story.
In Section 2.3.3, we give a definition of roaming that relies only on passage through
and return from a special region of configuration space. The trajectories satisfying the
our definition based on the “roaming region” do indeed “look” like roaming, but some
exhibit decidedly higher JCO . As seen in Figure 3.9, there are roaming trajectories with
JCO even higher than 60! See Table 3.11 for the explicit fractions. This surprising and
complicating fact must be borne in mind when describing and defining roaming or we
risk missing large components of the distribution of roaming dynamics. Later, we will
see that roaming is indeed enabled by just such a wide distribution in features.
84
Figure 3.9: Roamers exhibiting high JCO
Distribution of product JCO for roaming trajectories at all microcanonical energies, Eµ ,
studied. Each bin along the abscissa is a range of 5 quanta, e.g. 0 corresponds to 0–4,
20 to 20–24, etc. While slow rotation of the CO fragment has been regarded as a
hallmark of roaming behavior, clearly this simplistic, energy-conservation story does
not capture the range over which energy is deposited.
3.4 Geodesics are consistent with estimate
In Section 2.5, we established a method to estimate geodesic total and vibrational
lengths. Though we deliberately do not typically analyze geodesics over their entire
length (see the discussion to come of the “bird’s nest” in Section 3.6.1), our calculations
yield accurate estimates for lower-bounds on both the vibrational and total length of
the entire path. In Figure 3.10, we show distributions of the total and H2 vibratonal
length computed at EL = 37600 cm–1 .
From eq. 2.105, we have an estimate of total length, `Total :
√
`Total & 585 a0 me (3.7)
and the H2 vibrational length, `H2 –Vib , as:
√
`H2 –Vib & 30 a0 me (3.8)
Comparing to Figure 3.10, we see that both estimates are indeed near the lower bound
√
of their respective distributions. Quantitatively, `Total = 585 a0 me corresponds to the
√
0.166th percentile of ρ(`Total ) and `H2 –Vib = 30 a0 me corresponds to the 0.067th
85
×10−3 ×10−3
Probability Density ×a0 me
Probability Density ×a0 me
10
√
√
2.5
8
2.0
1.5 6
1.0 4
0.5 2
0.0 0
0 200 400 600 800 0 40 80 120 160 √ 200 240
√ 10001200
ℓTotal / a0 me ℓH2 vib / a0 me
8
Probability Density
6
4
2
0
0.05 0.10 0.15 0.20 0.25 0.30
ℓH2 vib / ℓTotal
Figure 3.10: Distribution of lengths over full geodesic paths
Distributions composed of 19321 direct geodesics computed at EL = 37600 cm–1 .
Lengths calculated as described in Sections 2.4.1 and 2.4.2 over the full extent of the
paths. The lower-middle distribution is of the ratios of the total and vibrational length
for each geodesic. The dashed, vertical lines give our (quite accurate) estimates for
the lower bounds of each distribution. Note that our estimate for `H2 vib /`Total is not
intended to be a lower bound as we take the low estimates for both distributions rather
than a lower bound for the numerator and an upper bound for the denominator.
86
percentile of ρ(`H2 –Vib ). These extremes makes sense because we ignored many of
the motions involved; modulo our physically reasonable guesses for
∆~rH –CO
and
2
k∆~r k, the estimates should be lower bounds! Combining the two results allows us to
estimate the vibrational fraction to be:
`H2 –Vib /`Total ≈ 30/585 = 0.05 (3.9)
which is also within the tail of the distribution shown in Figure 3.10, the 0.756th
percentile.
3.5 Geodesics are inherent dynamics
In Section 2.1.2 we defined the inherent dynamics of a system as a filtered view of its
classical dynamics—revealing the essential motions stripped away of noise. A sketch of
the relationship between both such dynamics is shown in Figure 3.11. We then make
the case (in Section 2.2) that geodesics in the landscape ensemble are these inherent
dynamics.
While the group has been computing geodesics and using them to understand the
inherent dynamics of systems under study for over a decade, the geodesics themselves
have resisted direct visualization except in the two-degree of freedom case [48]. The
systems treated in the past simply have too many degrees of freedom to allow their
rendering after the fashion of Figure 3.11. Unlike simple liquids, which have many
hundreds or thousands of degrees of freedom, formaldehyde has only 6—still non-trivial,
but perhaps geodesics on H2 CO are amenable to visualization. The general scheme for
roaming reactions given in eq. 3.2 offers three suggestive coordinates. Recall that we
have ABC forming A + BC by way of roaming C. Specifically CO is A and the two H
are B and C. Forming coordinates from the inter-center-of-mass distances for each of the
ABC centers gives three coordinates: AB, BC, CA, which we can plot simultaneously.
Figure 3.12 shows a single roaming molecular dynamics trajectory overlaid with
two geodesics, one that proceeds directly from reactants to products and another that
is constrained to pass through one of the points along the trajectory in the roaming
region—a geodesic that roams. The degree of similarity between Figure 3.11 (a cartoon)
and Figure 3.12 (real data from a physical surface) and is remarkable—the geodesics
87
Figure 3.11: Inherent dynamics pictograph
We sketch the inherent dynamics (dotted black curve) of a process connecting (R)eactants
to (P)roducts and show the relationship to the classical dynamics (solid red). The classical
curve shows the oscillations of the molecular dynamics trajectories within the reactive
tube of possible paths (solid black, bounding curves). The Inherent dynamics are
the smooth, dotted path that washes out the non-essential features of the molecular
dynamics.
88
clearly match our model of the inherent dynamics. The roaming geodesic, in blue, exits
the bird’s nest with the trajectory and then swings towards the products. Even the direct
geodesic, which is not constrained to pass through the roaming region, captures the
essence of the trajectory after dissociation. This image, or one like it was the inspiration
for the the naming of the “bird’s nest” (see Section 3.3.1). It is visible as the dark scribble
in the center and results from pre-dissociation vibration of the molecule.
While this picture is evocative, we will not dwell on images like it. As we discuss
in Section 3.6.3, there is naught to be gained by restricting ourselves to dynamically
correlated7 endpoints as we do here. Indeed we can more widely explore the energy
landscape when freed from such constraints. Figure 3.12 is useful for confirming our
notion of what geodesics represent, but the power in using geodesics lies in the way
they globally sample the energy landscape and comes from the inferences we can draw
by studying them in aggregate.
7 Thisdynamical correlation refers to the fact that each of the boundary configurations are connected
by a single classical trajectory.
89
14
12
10
8
CA
6
4
2
0
15
10
AB 5
1 0
4 3 2
0 7 6 5
BC
Figure 3.12: Geodesic as inherent dynamics of formaldehyde
The black curve is a classically computed molecular dynamics trajectory, which roams;
Eµ = 37600 cm–1 . The geodesics, direct (in red) and roaming (in blue), were computed
as part of our previous study [1] with EL = 34928.7 cm–1 . Incidentally, this is very
nearly equal to the value of the potential at the point in the roaming region. The red
curve is an unconstrained geodesic between the starting and ending points of the
molecular dynamics trajectory. The blue curve is the geodesic between the same points,
constrained to pass through a point on the trajectory in the roaming region. The axes are
defined in terms of the center of mass separations between components of the general
ABC system defined in eq. 3.2. The BC axis is H – H separation and the AB and CA
axes are CO – H separation for each hydrogen, B and C.
90
3.6 Observations from geodesics
Having established that the geodesics we compute are reasonable in their magnitude
and that, in at least in some sense, they match our mental model of inherent dynamics,
we can interrogate them to understand the dissociation dynamics of formaldehyde.
Recall that geodesics are global measures of the surface that do not depend on the
fine details of minima and transition states; rather they give us information about the
connectivity and structure of the large-scale features of the energy surface.
For each of the six landscape energies we attempted to compute at least 12000
geodesics of each class (see Table 3.12). On our in-house cluster (ted), which mostly
runs Intel Xeon E5440 processors at 2.83 GHz, the average time to compute a single,
optimized geodesic is approximately 1 core-hour8 . Nodes have at least 512 MB of
memory per core and computation is always processor- rather than memory-bound.
The geodesics computed for the present study represent on the order of a quarter of a
million core-hours of computational effort.
Table 3.12 gives the rate of success for our algorithm in computing the different
classes of geodesics with energy. The broad observation is that direct and radical
geodesics are generally successfully completed at all energies. Roaming geodesics, on
the other hand vary in the their rate of completion: 69–76% at low energies and not
approaching 100% until the upper end of our scale. The algorithm will fail if there is no
path connecting the endpoints which respects the landscape criterion. The prevalence
of failures, then suggests9 that at the lower energies under study, configuration space is
disconnected—and that the dynamics are not ergodic. This is a non-trivial observation:
surely the dynamics are not ergodic at energies so low that the system is confined to the
reactive well. However, even the lower end of our energy range is nearly 2000 cm–1
above the radical dissociation threshold, Erad –1
µ = 33240 cm . At these, relatively high,
energies non-ergodicity is surprising.
We must draw this inference because the landscape energy is equal to the micro-
canonical energy, EL = Eµ . By choosing the landscape energy, EL , to be equal to the
microcanonical energy, Eµ , we have the property that all configurations that result
8 Using more modern hardware, such as the cluster at Brown’s Center for Computation and Visualiza-
tion improved this time by a rough factor of two.
9 provided that our algorithm is free from more-fundamental sources of failure
91
Energy / cm–1 direct radical roaming
35000 11998 11993 8256
36000 12000 12000 8352
37600 ∗19321 12000 8816
38814 12000 12000 9135
41010 11998 12000 11949
45000 12000 11997 11712
Table 3.12: Success rate of computed geodesics by channel and energy
Here we give the number of successfully computed geodesics by channel and energy.
*: For all channel/energy combinations except direct geodesics at 37600 cm–1 , exactly
12000 geodesics were attempted. For direct geodesics at 37600 cm–1 , 19321 were
attempted; all were completed.
from classical trajectories computed at Eµ are members of the corresponding landscape
ensemble at EL = Eµ . Therefore, there cannot be regions of the surface explored by
molecular dynamics or by the random walk sampling the roaming region which are
energetically permitted in the landscape ensemble. This disconnection is consistent
with our observation, in Section 3.6.3, that the roaming geodesics for dynamically
correlated boundary points have a broader distribution than those that are not. The
disconnect also points back to the failure of a simple statistical theory to capture the rate
of reaction for roaming events in Section 3.3.3; indeed one of the required assumptions
for the statistical theory was a connected, ergodic space.
3.6.1 Implications of the bird’s nest
The next property of the geodesics we will examine is their length. This is natural as
geodesics are defined in terms of a minimization condition on their length and the
length gives information about the relative amounts of motion involved in the different
channels. Throughout this chapter, we’ve been hinting at the significance of the region
about the reactive well, which we call the bird’s nest. One of the striking observations
revealed by Figure 3.2 is how different are trajectories from each channel. All go their
own way and have little overlap outside of the region labeled bird’s nest. That the
trajectories spend much time together in this tangle of pathways (resembling, to the
author’s eyes, the nest of a bird) is key to our analysis.
92
By construction, the geodesics for different channels go to manifestly different
regions of configuration space—indeed we demand that roaming geodesics detour
before heading towards molecular products. It would then be no surprise if the lengths
were different. To make a just comparison, we therefore confine our attention to a
special region of overlap—the bird’s nest. We see that the system oscillates there, about
the global minimum, for some time no matter the products or route taken to them.
This is the incubation period during which energy is redistributed among the different
degrees of freedom and the ultimate trajectory of the system is determined. The bird’s
nest provides a region where we can ask the same questions of different classes of paths
and make equitable comparison between them. This method does not require us to
identify a reaction coordinate, but only to locate a region where all trajectories spend
significant time before diverging.
To use the bird’s nest when analyzing geodesics, we identify as tb the moment when
a geodesic first exits the bird’s nest and then measure the length (or other property) of
the geodesic only until departure. For example, we modify the form of the total length
as given in eq. 2.67 to be:
Z t
b
`Total = dτ [2 · T(τ )]1/2 (3.10)
0
where the limits of integration have changed to only include the portion of the geodesic
in the bird’s nest. We make similar modification to our expressions for the component
kinematic lengths (eqs. 2.73 and 2.88) and for the boundary fraction (eqs. 2.90 and 3.11).
Given that trajectories for each channel overlap strongly in the bird’s nest, there is
no reason so expect the geodesics for different channels to have markedly different
properties there. Any differences we do observe then, should be attributed to the unique
dynamical features associated with each pathway.
3.6.2 Key observations at a single energy
We can use geodesics to understand the differences between the dynamical pathways
available to the formaldehyde molecule as it dissociates. The distributions of total
lengths of geodesics give us information about the relative distances the system has to
traverse to transform from one species into another. By partitioning the kinetic energy
into other, physically interesting degrees of freedom such as diatomic vibration or
93
rotation we can monitor the role that those motions have in the reaction as well. For
example, the distribution of `H2 vib. along geodesics gives an indication of the amount
of H2 stretching required within the bird’s nest along this minimum-motion path. We
will also examine the amount of time spent encountering obstacles, the boundary
fraction, which indicates how constrained are the paths taken by geodesics.
We will come to the energy-dependent properties of geodesics in Section 3.6.4, but
it is instructive and will allow us to orient our thinking, to first consider a single, extremal
landscape energy. In this section, we take up geodesics10 with EL = 35000 cm–1 , an
energy soon after the onset of roaming in formaldehyde to see what they can tell us
about how and why roaming occurs.
Broad roaming distribution in length at low energy
Figure 3.13 serves as our principle point of departure, showing the distributions of
total kinematic lengths lengths in the bird’s nest for each reaction channel. Here, even
when confined exclusively to the bird’s nest, we see a striking difference between the
roaming paths and either the direct or radical paths. While the distributions for direct
and radical paths are highly peaked, the roaming distribution is smeared across the
abscissa. This broad distribution of minimal transit lengths points to a plurality of routes
to the roaming region.
We are able to take a more detailed look at the distributions shown in Figure 3.13
by examining the component kinematic lengths in Figure 3.14. The internal degrees of
freedom plotted here show the same motif played out in each partitioning of the kinetic
energy. Direct and radical path-length distributions are sharply peaked near zero and
the roaming path distributions are considerably broader with longer tails.
These component length distributions do not give us information about the relative
product yields in formaldehyde photodissociation. Rather, the component length
distributions yield information about the extent to which kinetic energy is focused in
into different degrees of freedom during the incubation period prior to dissociation.
For example, it is remarkable that there is more H2 stretching for geodesics heading to
10 Though the distributions we examine in this chapter are quite similar to those in our paper [1], they
are computed with a single landscape and geodesics energy, EL = Eµ = 35000 cm–1 . In that paper,
the landscape energy was lower than the microcanonical excitation energy, EL = 34928.7 cm–1 <
Eµ = 37000 cm–1 .
94
Probability Density × a0 me × 103
12
Radical
10
Direct
√
8 Roaming
6
4
2
0
150 300 450 600 750
√
35000 cm-1 : Length / a0 me
Figure 3.13: Total length distributions
Total kinematic length probability distributions for geodesics computed at EL =
35000 cm–1 within the bird’s nest. We have no reason to suspect that, when confined
to this region, the geodesics for each channel would be different from each other.
However, the distributions for geodesics directly between reactants and molecular
products (direct) and from reactants to radical products (radical) are substantively more
peaked than for those that go to molecular products by way of the roaming region
(roaming). Though they appear nearly indistinguishable to those distributions in our
recent paper [1] (where we used EL = 34928.7 cm–1 unequal to Eµ = 37000 cm–1 ) the
distributions in the figure were calculated with landscape and microcanonical energies
equal, EL = Eµ = 35000 cm–1 .
95
Probability Density × a0 me × 103
15 CO CO
Rotation Vibration
10
√
5
0
H2 H2
15
Rotation Vibration
10
5
0
0 100 200 300 0 100 200 √ 300
35000 cm-1 : Length / a0 me
Figure 3.14: Component length distributions
Probability distributions for component kinematic lengths computed at EL =
35000 cm–1 . The figure provides a more detailed look at Figure 3.13. As there,
direct geodesics are in dotted red, radical geodesics are in dashed blue, and roaming
geodesics are in solid black. Again, the geodesics are only measured while in the bird’s
nest around the reactive well. As an aside, even though the chemically distinct species
indicated by our labels (H2 , CO) do not meaningfully exist, we can still measure motion
in the corresponding degrees of freedom, e.g.: H – H stretching or change in C – O orien-
tation. As in Figure 3.13 for total kinematic length, the roaming geodesics have a much
broader range of motions available to them. Though they appear nearly indistinguish-
able to those distributions in our recent paper [1] (where we used EL = 34928.7 cm–1
unequal to Eµ = 37000 cm–1 ) the distributions in the figure were calculated with
landscape and microcanonical energies equal, EL = Eµ = 35000 cm–1 .
96
the roaming region than to radical products. This suggests that roaming is not simply
aborted radical dissociation.
Even though geodesics cannot see into the future, they reveal sharp distinction
between the bird’s-nest-dynamics for the roaming channel and for direct or radical
dissociation. The distributions in Figures 3.14 and 3.13 indicate that of the three
channels, only roaming admits a wide array of partitionings of kinetic energy. In
contrast, the direct and radical channels have narrowly defined requirements on the
pre-dissociation distribution of kinetic energy.
Boundary fraction
We can understand why the roaming channel has this broad distribution of kinetic
energies through distribution of boundary fractions, hΘi. Recall, from eq. 2.90 in
Section 2.4.4 that we define the boundary fraction as:
R tb
dτ [2TTotal ]1/2 Θ[R(τ ); EL ]
hΘi = 0 (3.11)
1/2
Rt
0 dτ [2TTotal ]
e
where tb is the moment of exit from the bird’s nest and where Θ[R; EL ] is finite for the
portions of the geodesic, R(τ ) encountering an obstacle:
e
e
1 E – V(R) < δE
L L
Θ[R; EL ] = e
e 0 otherwise
We take the fractional deviation from the landscape energy, δ = 10–4 ; reducing δ to
10–5 produces no changes to the distribution.
In Figure 3.15 we find another stark difference between roaming and the direct or
radical pathways. All three classes of pathway have a peak in the density at zero; i.e.
for each channel, there are at least some geodesics that escape from the bird’s nest
via the straight-line, no-obstacles path. However, is much more pronounced for direct
and radical paths than for roaming. Of the radical distribution, 56% of the density
is identically 0, as is 74% for direct dissociation11 . In contrast, less than 14% of the
roaming geodesics escape the bird’s nest without encountering an obstacle. Aside from
this binary observation, we see that the distribution for roaming geodesics is heavily
11 See Table 3.15 for these figures as a function of landscape energy.
97
3.0
Direct
2.5
Roaming
Probability Density
2.0 Radical
1.5
1.0
0.5
0.0
0.25 0.50 0.75 1.00
35000 cm-1 : Boundary Fraction, hΘi
Figure 3.15: Boundary fraction
Probability distributions for the fraction of length spent tracing boundaries of the
potential surface within the bird’s nest for geodesics computed at EL = 35000 cm–1 .
The distributions for all three channels include some geodesics that make no contact
with obstacles before exiting the bird’s nest. This delta function contribution is shown
here with finite width because of our binning. In contrast to direct and radical geodesics,
the majority of the density for roaming geodesics is finite, and skewed towards contacting
obstacles half of the time. Indeed the mean boundary fraction for roaming geodesics
is 0.45, whereas for radical and direct geodesics the figures are respectively 0.15
and 0.057 (see Table 3.16). Though they appear nearly indistinguishable to those
distributions in our recent paper [1] (where we used EL = 34928.7 cm–1 unequal to
Eµ = 37000 cm–1 ) the distributions in the figure were calculated with landscape and
microcanonical energies equal, EL = Eµ = 35000 cm–1 .
98
weighted towards higher boundary fraction: the most probable being hΘi = 0.70±0.05.
Winding around obstacles in this way is certainly consistent with the longer paths
evidenced in Figure 3.13. It is important to note that the obstacles encountered by
roaming geodesics are regions of quite high potential energy; they must be of the
order of the landscape energy, which, at 35000 cm–1 , is already larger than the radical
dissociation threshold.
Taken together, the observation of broad length distributions and obstacle-lined
pathways to roaming offer a picture of the potential energy landscape where roaming
occurs. By comparison to radical or direct dissociation, the paths are tortured and
constrained. These obstacles give rise to a broad, unfocused distribution kinematic
lengths. The distinction between paths leading to direct or radical dissociation and
those leading to roaming is how much more narrowly constrained are the former than
the wandering latter. The relative freedom afforded roaming paths at this energy points
to an entropic advantage for the channel. We will explore this notion of an entropic
advantage quantitatively in Section 3.6.5.
3.6.3 The null effect of dynamical endpoint correlation
The results in this section were computed in conjunction with our earlier paper [1] and
correspond to a different set of geodesics than the rest described in this chapter. The
major difference is that, unlike everywhere else, the microcanonical and landscape
energies are not the same. Rather the landscape energy, EL = 34928.7 cm–1 , is
considerably lower than the microcanonical energy, Eµ = 37600 cm–1 . In that
paper [1] and for this subsection, we are nevertheless able to draw useful conclusions
about the nature of geodesics on the formaldehyde energy surface. If, in this section, the
numbers of molecular dynamics trajectories or computed geodesics seem unfamiliar,
this is also why.
We compute geodesics as described in Section 2.3.4: from reactants to molecular
and radical products, perhaps by way of the roaming region. In all cases, we uses
endpoints that are not dynamically correlated—that are not connected by classical
trajectories12 . We do this to increase the number of geodesics we can compute from a
12 Whileall the boundary- (and intermediate roaming-) points are computed via classical trajectories,
the boundaries for a given geodesic do not necessarily come from the same classical trajectory.
99
unpaired paired
92 ± 36 (49,060)
94 ± 36 (5,965)
direct
95 ± 37 (1,491)
roaming 358 ± 146 (36,608) 224 ± 117 (275)
Table 3.13: Comparison to dynamically correlated (paired) geodesics
Total lengths are reported in units of a0 m1/2
e as mean ± standard-deviation with the
number of samples (in parentheses). In keeping with the rest of our analysis, lengths
are computed only within the bird’s nest.
given number of trajectories (N trajectories yield N2 /2 endpoint pairs).
To verify that that our results are unaffected by this randomization, we also compute
direct and roaming geodesics between “paired” endpoints—configuration space points
linked by a single molecular dynamics trajectory. For direct geodesics we select 6,000
of the 7,595 directly dissociating trajectories13 at Eµ = 37600 cm–1 , and attempt to
compute geodesics between their starting and ending configurations. Our algorithm
successfully locates geodesics for 5,965 of them. In the case of roaming trajectories—
which are rare events—the requirement of dynamical connection drastically reduces
our sampling statistics14 . Of the 381 roaming trajectories13 at Eµ = 37600 cm–1 ,
292 have points in the roaming region that also satisfy the landscape criterion, EL =
34928.7 cm–1 . For each roaming trajectory, we randomly select a single configuration
in the roaming region and satisfying the landscape criterion and then pair this with the
starting and ending configurations of the trajectory. The results of theses calculations
are summarized in Table 3.13 in the column for paired geodesics.
To determine if the breadth of the paired, direct distribution is an artifact of our
statistics, we select a random subset of 1/4 of the geodesics and repeat the measurement.
If the breadth is statistical, we expect the standard deviation to double. However, we
find that the mean and variance are essentially identical, suggesting that the origin of
the width is indeed physical.
Our primary finding is that dynamical correlation has no impact on the distribution
13 Thenumber of trajectories is different than that reported in Table 3.10 because this is a different set
of molecular dynamics trajectories; see above.
14 And for this reason do we go through all the trouble described in Section 2.3.3 to sample the roaming
region via a random walk.
100
of total length for direct geodesics. The data suggest that there is perhaps a modest
impact for the distribution of total lengths for roaming geodesics. That the dynamically
correlated (paired) endpoints produce a broader distribution of roaming lengths than the
unpaired endpoints may be tied to in the degree to which the space is connected (see
Section 3.6). Our algorithm successfully finds roaming geodesics between uncorrelated
endpoints less often than for correlated endpoints. The success rate for paired geodesics
here is 94%, while we would expect closer to 70% for the same energy with unpaired
endpoints (see Table 3.12). It is likely that all sub-spaces of the roaming region are not
uniformly connected.
The possibility of disconnect aside, qualitatively, we see that the distribution of
lengths of roaming geodesics is much broader than for direct geodesics even when
computed between dynamically linked endpoints. This is the essential observation
upon which our analyses rest and provided that we are consistent in our methods, we
expect the impacts of dynamical correlation to be minimal.
3.6.4 Trends with increasing landscape energy
At low energies, we observe that the geodesics corresponding to roaming pathways
have a broad distribution of kinematic lengths. Direct and radical geodesics seem much
more narrowly constrained and have sharply peaked distributions in kinematic length.
We conclude then that roaming has an entropic advantage over the other dissociative
channels at low energy. In this section we explore what happens as energy increases.
Figure 3.16 gives the distributions for the total kinematic length for geodesics in the
bird’s nest at the upper end of our energy scale, 45000 cm–1 (in analogy to Figure 3.13,
which shows low-energy results). The roaming distribution at high energy could not
be more changed from its low-energy form. It has narrowed and the mean kinematic
length has shortened considerably. The distribution of kinematic lengths for roaming
geodesics at 45000 cm–1 more closely resembles the radical distribution at all other
sampled energies than it resembles the roaming distribution at 35000 cm–1 . It seems
that the singular aspect of the roaming pathway has diminished at this high energy.
In Figure 3.17, we see distributions of kinematic length for each of the three channels
at all of the energies under study. The distributions for energies intervening between
35000 cm–1 and 45000 cm–1 effectively interpolate between the extremal distributions.
101
Probability Density × a0 me × 103
12
Roaming
10
Radical
√
8 Direct
6
4
2
0
150 300 450 600 750
√
45000 cm-1 : Length / a0 me
Figure 3.16: Total length distributions in the high-energy limit
Probability distributions for total kinematic lengths of geodesics calculated at EL =
45000 cm–1 within the bird’s nest. The figure is a high-energy analog of Figure 3.13,
which was for EL = 35000 cm–1 . In stark contrast to the low-energy figure, here we
see the distribution of roaming kinematic collapsed towards those of radical and direct
dissociation.
The primary observation is that with increase in landscape energy, the roaming distribu-
tions of lengths converge towards the distribution of radical lengths. The distributions
for the radical and direct pathways are largely unchanged. It appears as though the
entropic advantage possessed by the roaming channel at low energy is lost as the energy
increases. The information in the figure is recapitulated in Table 3.14, which gives the
mean and standard deviation of the distributions by energy and channel. Though it is
not easy to discern from Figure 3.17, the tabulated lengths show that the mean length
of the direct geodesics does indeed decrease with rising landscape energy.
In the component length distributions (Figure 3.18), we see much the same high
102
Figure 3.17: Convergence of total length distributions
Total kinematic length distributions as a function of energy for geodesics while confined
to the bird’s nest. The distributions of lengths of direct geodesics are shown in dotted
red. The dashed blue curves give the radical distributions. Roaming distributions are
the black fields with solid, yellow bounds. While the direct and radical distributions are
relatively static with increasing energy, the roaming distribution undergoes a dramatic
shift towards a narrower distribution with smaller mean. The collapse towards the
radical length distribution signals that the roaming channel no longer has the entropic
advantage it did at lower energies.
103
Energy / cm–1 direct radical roaming
35000 95.79 ± 38.47 149.2 ± 79.05 300.0 ± 132.4
36000 93.37 ± 39.35 148.9 ± 78.24 283.2 ± 131.1
37600 86.29 ± 36.49 141.8 ± 73.58 283.1 ± 133.7
38814 89.89 ± 39.21 144.1 ± 73.81 273.7 ± 135.3
41010 89.80 ± 40.18 141.2 ± 72.64 220.2 ± 124.7
45000 86.64 ± 38.83 126.9 ± 69.56 158.1 ± 91.4
Table 3.14: Mean total kinematic length for geodesics by channel and energy
We give the mean and standard deviation for all the distributions of total kinematic
length given in Figures 3.17 (and 3.13 and 3.13). As there, these are lengths for the
geodesics while in the bird’s nest. Kinematic lengths are reported as mean ± standard-
deviation in units of a0 m1/2
e . From the table as from the figures, we see the pronounced
contraction of the distribution of roaming kinetic lengths with increasing energy.
energy trend. The roaming distributions have moved toward the radical distribution. For
the H2 degrees of freedom, the roaming and radical distributions are nearly coincident.
While the overlap for the CO degrees of freedom is less pronounced, comparison with
the distribution at 35000 cm–1 , (Figure 3.14) shows that the reorganization is still quite
dramatic.
The picture coming together is that at high energies, roaming dynamics during the
predissociation incubation period mirror those of radical dissociation, particularly in
the terms of the motions of the hydrogen atoms. We can again look to the distribution
of boundary fractions to understand the way that the energy surface influences these
dynamics. Figure 3.19 shows the high energy result in detail and Figure 3.20 shows
all the energies in our study. At 45000 cm–1 , the distribution of roaming boundary
fractions shows marked depression, a trend also visible in Table 3.16. From the figures
and from Table 3.15 we also see that much of the density has moved into the delta
peak at hΘi = 0.
The roaming boundary fraction density at high energy, most closely resembles the
distribution for the radical boundary fraction at 35000 cm–1 (Figure 3.15). The large
obstacles that broadened the roaming length distributions at lower energies are gone
and have lost their influence.
104
Probability Density × a0 me × 103
15 CO CO
Rotation Vibration
10
√
5
0
H2 H2
15
Rotation Vibration
10
5
0
0 100 200 300 0 100 200 √ 300
45000 cm-1 : Length / a0 me
Figure 3.18: Component length distributions in the high-energy limit
Offering a more detailed view of Figure 3.16, we show the component kinematic lengths
for geodesics computed at EL = 45000 cm–1 in the bird’s nest. Direct geodesics are
in dotted red, radical geodesics are in dashed blue, and roaming geodesics are in
solid black. Comparing the roaming distributions above to their low-energy analogs in
Figure 3.14, we see the same dramatic change that was evident in the distribution of
total kinematic lengths in Figure 3.17. Here, as in Figure 3.16, the roaming distributions
now bear striking similarity (rather than difference) to the radical distribution.
105
3.0
Direct
2.5
Radical
Probability Density
2.0 Roaming
1.5
1.0
0.5
0.0
0.25 0.50 0.75 1.00
45000 cm-1 : Boundary Fraction, hΘi
Figure 3.19: Boundary fraction in the high-energy limit
Probability distributions for the fraction of length spent tracing boundaries of the
potential surface within the bird’s nest for geodesics computed at EL = 45000 cm–1 .
In this high-energy limit, the contrast evident in Figure 3.15, between the roaming
distribution and those of the direct and radical channels, has faded. The fraction of each
distribution contributing to the delta function at hΘi = 0 has grown for all channels, but
by the largest amount for the roaming channel. The portion of the density identically 0
for roaming geodesics has climbed to 0.48; for radical and direct geodesics the figures
are respectively 0.76 and 0.84 (see Table 3.15).
106
Figure 3.20: Convergence of distribution of boundary fractions
Boundary fraction distributions as a function of energy for geodesics while confined
to the bird’s nest. The distributions for direct geodesics are shown in dotted red. The
dashed blue curves give the radical distributions. Roaming distributions are the black
fields with solid, yellow bounds. Similarly to the total length distributions shown in
Figure 3.17, the direct and radical distributions change little as energy increases, but
the roaming distribution essentially inverts it’s center: over the energy range we study,
the mean boundary fraction for roaming shifts from hΘi = 0.45 at EL = 35000 cm–1
to hΘi = 0.17 at EL = 45000 cm–1 . At higher landscape energies, there is a paucity
of large obstacles for the roaming geodesics to impact as they make their way to the
roaming region. The high-energy landscape encountered by roaming paths is more
similar to that of the radical and direct paths than it is different.
107
Energy / cm–1 direct radical roaming
35000 0.741 0.556 0.134
36000 0.766 0.540 0.177
37600 0.828 0.587 0.182
38814 0.799 0.593 0.210
41010 0.810 0.640 0.293
45000 0.843 0.758 0.476
Table 3.15: Portion of identically zero boundary fraction vs. energy
As shown in Figures 3.15, 3.19, and 3.20, all of the distributions of boundary fraction,
hΘi, have some non-trivial portion of the density identically 0. That is, at all energies, for
all channels, some geodesics exit the bird’s nest without encountering an obstacle. The
table above gives the proportion of each distribution that makes up the delta function
contribution at hΘi = 0 by channel and energy.
Energy / cm–1 direct radical roaming
35000 0.0572 0.149 0.446
36000 0.0539 0.155 0.391
37600 0.0383 0.134 0.375
38814 0.0450 0.131 0.351
41010 0.0438 0.111 0.284
45000 0.0369 0.0707 0.165
Table 3.16: Mean boundary fraction vs. energy
For the boundary fractions shown in Figures 3.15, 3.19, and 3.20, we give the means
of the distributions as a function of energy by channel. Evident are modest and barely
perceptible shifts for the radical and direct channels as well as the transformation in the
boundary fraction distribution for roaming geodesics.
108
Energy / cm–1 Sdirect /kB Sradical /kB Sroaming /kB
35000 4.6448 5.3179 5.9962
36000 4.6744 5.3312 5.9507
37600 4.6439 5.2765 5.9977
38814 4.7565 5.2986 5.9996
41010 4.7763 5.2935 5.8328
45000 4.7739 5.2168 5.4787
Table 3.17: Entropy of total length distributions vs. energy
3.6.5 Convergence of entropy of distributions with energy
We suspect that the broad, roaming distributions in kinematic length are of higher
entropy than the sharply peaked distributions for the direct or radical channels. Recall
that the entropy of a continuous distribution, ρ(`), is given by (eq. 2.123):
Z
S/kB = – d`ρ(`) log [ρ(`)] (3.12)
Recall also, for dimensioned ρ(`), there is always an undetermined and constant off-
set, which render the absolute magnitudes of entropy without meaning, though the
differences do convey information. Following the method in Section 2.8 to estimate the
entropy of a sampled distribution, we compute and compare the entropies of the length
distributions as a function of energy. Figures 3.21 and 3.22 give the results of such
analysis for total length and component distributions within the bird’s nest respectively;
Table 3.17 gives the numeric results for the total lengths.
The qualitative observations of the previous section are largely borne out quantita-
tively. Firstly, the roaming distributions for total length, have entropy of substantively
larger absolute magnitude than the distributions for direct or radical dissociation. This is
consistent with our notion of roaming having some kind of entropic advantage. We also
see that the roaming channel has the greatest shift in entropy. Over the 10000 cm–1
that span our energies of study, roaming entropy drops by roughly 0.5 entropy units.
Over the same energy window, the direct and radical channels have much smaller
change: on the order of 0.1 units. The magnitude of the change for the roaming channel
is substantial. We also see the roaming entropy converging towards that of the radical
distribution for total length. In the component distributions for the CO and H2 degrees
109
of freedom, we see much the same story with higher roaming entropy, and entropies
for the roaming channel converging towards that of the radical channel.
110
6.00
Roaming
5.75
5.50
Radical
S/kB
5.25
5.00
4.75 Direct
35000 37500 40000 42500 45000
Energy / cm-1
Figure 3.21: Entropy of total length distributions
Absolute magnitude of entropy of total kinematic length distributions of geodesics in
the bird’s nest. Here we see that the entropy of the roaming distributions is greater
than for either of the other channels. We also see that change in magnitude is much
larger for roaming than for the direct pathway or the radical paths. As the entropy of
the roaming distribution decreases with energy, we see that the entropy of the roaming
distribution does indeed approach that of the radical distribution.
111
6
5
4
3
2 CO Rotation CO Vibration
S/kB
6
5
4 H2 Rotation
3
2 H2 Vibration
35 40 45 35 40 45
3
EL / 10 cm -1
Figure 3.22: Entropy of component length distributions
Entropies for distributions of component lengths for the three channels while in the
bird’s nest: roaming in solid black, radical in dashed blue, and direct in dotted red.
Distributions are formed for the lengths, `α as described in Sections 2.4.2 and 2.4.3.
Again, we use the component kinematic lengths to provide more detail to the picture
from the total kinematic length. As in Figure 3.21, we see that the absolute magnitude
of the entropy of the roaming distributions is higher than for geodesics for either the
direct or radical channels. We also see that in the CO as well as H2 degrees of freedom,
entropy of the roaming distributions approaching that of the radical channel.
The distinction between the CO and H2 degrees of freedom is particularly of note—such
a divergence is usually not present. The CO entropies tell much the same story as the
entropy of the total distributions. However, in the H2 components, the direct pathway
is widely separated from the radical and roaming channels. We also see decline in
the H2 entropies for the direct pathway at low energies before leveling off; roaming
entropies, in contrast, are roughly constant at low energies before declining at higher
energies.
112
3.7 Roaming as an entropically mediated phenomenon
The foregoing observations, based on the statistical properties of geodesics on the
formaldehyde energy surface lead us to conclude that, at least in formaldehyde, roaming
is an entropically mediated phenomenon.
At lower energies, below 38814 cm–1 , dynamical effects dominate—the competition
between the radical and direct channels is evident in this energy range in Figure 3.4.
Once the dynamical effects have stabilized, statisitical effects become dominant. At
energies of 38814 cm–1 and highter, the branghing ratios for the direct and radical
channels are roughly constant, but roaming begins a decline, roughly halving in fre-
quency up to 45000 cm–1 . Just as the incidence of roaming drops from 38814 cm–1 ,
so too does the entropy of the distribution of lengths of roaming geodesics decline; see
Figure 3.21.
The entropy of distributions of geodesic lengths in the bird’s nest predict the direction
of change in the branching ratio to roaming. This is a remarkable result. Even though the
geodesics have no knowledge of the future course the system will take, their distribution
through the bird’s nest reveals the relative change in frequency of roaming.
CHAPTER 4
CONCLUDING REMARKS
At the outset, our goal in this work was to contribute to the understanding of the
story of roaming in formaldehyde. We take the perspective that such understanding
means apprehending the dynamical peculiarities that give rise to roaming and comes
from characterizing the inherent dynamics of the formaldehyde system. The geodesic
formalism enables us to globally probe the potential energy landscape of formaldehyde.
Through this window, we see that roaming dynamics are of greater entropy—are far
less constrained—than the pathways giving rise to direct or radical dissociation. Our
observation may shine some light on why roaming seems to be prevalent in many
systems. From the landscape perspective, all that is required for roaming-like (broadly
distributed) geodesics is a sufficiently flat, asymptotic region of the potential in a system
of high enough dimensionality that arriving there is an entropic (navigational) challenge.
It is then not at all surprising that roaming is a recurrent phenomenon.
4.1 New methods, new observations
We have provided evidence that roaming in formaldehyde is an entropically mediated
mechanism. The roaming system has a far greater range of dynamical options than when
dissociating via either the direct pathway to molecular products or by the radical channel.
This entropic advantage correlates with the incidence of roaming in formaldehyde.
Indeed, as the entropy difference between distributions of geodesic lengths for the
radical and roaming channels decreases with energy, the frequency of roaming in the
113
114
system also decreases.
In our work, we also expand the range of dynamical systems to be treated with
geodesics to include small molecules. We derive highly accurate lower bounds on
the lengths of these small-molecule geodesics. We also visualize geodesic paths such
that their correspondence to the inherent dynamics is clear. Our study is the first time
geodesics have been used in a system with heterogeneous masses. We confront and
resolve the attendant problems associated with center of mass conservation and present
two other general methods for doing so.
4.2 Future directions
Clearly the first angle of attack for extending our work involves quantitatively connecting
the entropic relationships we have established to other observables. It may be that the
mutual information between the distributions of roaming, radical, and direct geodesic
lengths can be tied to the relative incubation periods or branching ratios for each channel.
Such a connection would be of interest, not just to those theorists and experimentalists
desiring to calculate these properties, but also to those wishing to understand the ways
that geodesics carry information about the dynamics of small-molecule systems. The
theory of large deviations [80] or even ideas of surprisal analysis [81] may be fruitful
sources of direction.
Another important avenue of inquiry is repeating some of this work in other roaming
systems. Acetaldehyde, CH3 HCO, is an obvious target because of its structural similarity
to formaldehyde and because a global potential surface has already been developed [33].
As an aside, such a potential is not required as our methods are amenable to “on-the-fly”
calculations of potentials and gradients from first principles.
While the notion of roaming as aborted radical dissociation is not useful for un-
derstanding the dynamical influences at play, it does make a useful point about the
energetics involved: roaming fragments have insufficient translational energy to es-
cape the parent. Surely, during the incubation period prior to dissociation, energy is
distributed throughout different degrees of freedom of formaldehyde. This internal
energy redistribution is part of what controls whether or not the system will roam. By
centering our analysis in the bird’s nest, we have focused our attention squarely on
115
this critical period. One might wonder what geodesics can say about the organization
of energy flow during this time. Even though geodesics are static objects and exist in
configuration-space, where there is no notion of time or speed, it may be that changes
in their direction are correlated with energy transfer. Some hope for this thought is
provided by Miller’s work on the Reaction Path Hamiltonian [67], where the curvature
of an intrinsic reaction coordinate appears as a term in the kinetic energy.
In recent years, there has been some hand-wringing [82, 83] over the notion of “post
transition state bifurcations”—instances where multiple, chemically distinct products
arise after traversing what can be identified only as a single transition state. The authors
are surely sensible in pointing out that dynamical factors and not just potential barriers
and basins are influencing the outcomes of these reactions. However, they may be
premature in suggesting that the apparent futility of studying the structure of the surface
leaves us no tools other than trajectory simulations. As we have seen, even though
individual geodesics cannot forecast the future, by globally characterizing the H2 CO
surface and analyzing their properties while confined to the bird’s nest we are still able
to discriminate between the reactive channels, which are yet-to-be-chosen. The entire
history and future of a classical trajectory are specified by a single point in phase space
and knowledge of the potential. Because geodesics offer a global perspective on that
potential, they may be well positioned to shed light on what happens after a transition
state by looking before it.
Bibliography
[1] D. V. Cofer-Shabica, R. M. Stratt. What is special about how roaming chemical
reactions traverse their potential surfaces? Differences in geodesic paths between
roaming and non-roaming events. The Journal of Chemical Physics 2017, 146,
214303.
[2] J. M. Budarz, M. P. Minitti, D. V. Cofer-Shabica, B. Stankus, A. Kirrander, J. B.
Hastings, P. M. Weber. Observation of femtosecond molecular dynamics via
pump-probe gas phase x-ray scattering. Journal of Physics B: Atomic Molecular
and Optical Physics 2016, 49.
[3] D. V. Cofer-Shabica, R. M. Stratt, The geometries of potential energy landscapes
imply dynamical signatures for roaming reactions, American Chemical Society,
250th National Meeting, Boston, MA, 2015, PHYS 554.
[4] M. P. Minitti, J. M. Budarz, A. Kirrander, J. Robinson, T. J. Lane, D. Ratner, K. Saita,
T. Northey, B. Stankus, V. Cofer-Shabica, J. Hastings, P. M. Weber. Toward structural
femtosecond chemical dynamics: Imaging chemistry in space and time. Faraday
Discussions 2014, 171, 81–91.
[5] D. V. Cofer-Shabica, Wandering Molecules, Brown University, Research Mat-
ters, Providence, RI, 2014. Invited Talk. https://www.youtube.com/watch?v=
X3xyMP9EAco.
[6] D. Townsend, S. A. Lahankar, S. K. Lee, S. D. Chambreau, A. G. Suits, X. Zhang,
J. Rheinecker, L. B. Harding, J. M. Bowman. The Roaming Atom: Straying from the
Reaction Path in Formaldehyde Decomposition. Science 2004, 306, 1158–1161.
116
117
[7] J. M. Bowman, B. C. Shepler. Roaming Radicals. Annu. Rev. Phys. Chem. 2011,
62, 531–553.
[8] C. Wang, R. M. Stratt. Global perspectives on the energy landscapes of liquids,
supercooled liquids, and glassy systems: Geodesic pathways through the potential
energy landscape. The Journal of Chemical Physics 2007, 127, 224504.
[9] R. D. van Zee, M. F. Foltz, C. B. Moore. Evidence for a second molecular channel
in the fragmentation of formaldehyde. J. Chem. Phys. 1993, 99, 1664–1673.
[10] E. Wigner. The transition state method. Trans. Faraday Soc. 1938, 34, 29–41.
[11] W. M. Gelbart, M. L. Elert, D. F. Heller. Photodissociation of the formaldehyde
molecule: does it or doesn’t it? Chemical Reviews 1980, 80, 403–416.
[12] K.-M. Weitzel. On the distinction between tight and loose transition states in
unimolecular dissociations. Chemical Physics Letters 1991, 186, 490 – 494.
[13] X. Li, J. M. Millam, H. B. Schlegel. Ab initio molecular dynamics studies of
the photodissociation of formaldehyde, H2 CO −−→ H2 + CO: Direct classical
trajectory calculations by MP2 and density functional theory. The Journal of
Chemical Physics 2000, 113, 10062–10067.
[14] J. Rheinecker, X. Zhang, J. Bowman. Quasiclassical trajectory studies of the dy-
namics of H2CO on a global ab initio-based potential energy surface. Molecular
Physics 2005, 103, 1067–1074.
[15] P. Zhang, S. Maeda, K. Morokuma, B. J. Braams. Photochemical reactions of the low-
lying excited states of formaldehyde: T1/S0 intersystem crossings, characteristics of
the S1 and T1 potential energy surfaces, and a global T1 potential energy surface.
J. Chem. Phys. 2009, 130, 114304.
[16] S. A. Lahankar, S. D. Chambreau, X. Zhang, J. M. Bowman, A. G. Suits. Energy
dependence of the roaming atom pathway in formaldehyde decomposition. The
Journal of Chemical Physics 2007, 126, 044314.
118
[17] S. A. Lahankar, V. Goncharov, F. Suits, J. D. Farnum, J. M. Bowman, A. G. Suits.
Further aspects of the roaming mechanism in formaldehyde dissociation. Chem.
Phys. 2008, 347, 288–299.
[18] A. C. Terentis, S. E. Waugh, G. F. Metha, S. H. Kable. HCO (N,Ka,Kc,J) distributions
from near-threshold photolysis of H2CO (J,Ka,Kc). The Journal of Chemical Physics
1998, 108, 3187–3198.
[19] J. M. Bowman. Roaming. Mol. Phys. 2014, 112, 2516–2528.
[20] P. L. Houston, S. H. Kable. Photodissociation of acetaldehyde as a second example
of the roaming mechanism. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 16079–
16082.
[21] B. R. Heazlewood, M. J. T. Jordan, S. H. Kable, T. M. Selby, D. L. Osborn, B. C.
Shepler, B. J. Braams, J. M. Bowman. Roaming is the dominant mechanism for
molecular products in acetaldehyde photodissociation. Proceedings of the Na-
tional Academy of Sciences USA 2008, 105, 12719–12724.
[22] M. P. Grubb, M. L. Warter, A. G. Suits, S. W. North. Evidence of Roaming Dynamics
and Multiple Channels for Molecular Elimination in NO3 Photolysis. J. Phys. Chem.
Lett. 2010, 1, 2455–2458.
[23] M. P. Grubb, M. L. Warter, H. Xiao, S. Maeda, K. Morokuma, S. W. North. No
Straight Path: Roaming in Both Ground- and Excited-State Photolytic Channels of
NO3 −−→ NO + O2 . Science 2012, 335, 1075–1078.
[24] B. Fu, J. M. Bowman, H. Xiao, S. Maeda, K. Morokuma. Quasiclassical Trajectory
Studies of the Photodissociation Dynamics of NO3 from the D-0 and D-1 Potential
Energy Surfaces. J. Chem. Theory Comput. 2013, 9, 893–900.
[25] D. L. Osborn. Exploring Multiple Reaction Paths to a Single Product Channel. Adv.
Chem. Phys. 2008, 138, 213–265.
[26] L. B. Harding, S. J. Klippenstein. Roaming Radical Pathways for the Decomposition
of Alkanes. J. Phys. Chem. Lett. 2010, 1, 3016–3020.
119
[27] A. Li, J. Li, H. Guo. Quantum Manifestation of Roaming in H + MgH −−→ Mg +
H2 : The Birth of Roaming Resonances. J. Phys. Chem. A 2013, 117, 5052–5060.
[28] A. S. Mereshchenko, E. V. Butaeva, V. A. Borin, A. Eyzips, A. N. Tarnovsky. Roaming-
mediated ultrafast isomerization of geminal tri-bromides in the gas and liquid
phases. Nature Chemistry 2015, 7, 562–568.
[29] B. Carpenter. Intramolecular Dynamics for the Organic Chemist. Accts. Chem.
Res. 1992, 25, 520–528.
[30] S. J. Klippenstein, Y. Georgievskii, L. B. Harding. Statistical Theory for the Kinetics
and Dynamics of Roaming Reactions. J. Phys. Chem. A 2011, 115, 14370–14381.
[31] L. B. Harding, S. J. Klippenstein, A. W. Jasper. Ab initio methods for reactive
potential surfaces. Phys. Chem. Chem. Phys. 2007, 9, 4055–4070.
[32] L. B. Harding, Y. Georgievskii, S. J. Klippenstein. Roaming Radical Kinetics in the
Decomposition of Acetaldehyde. J. Phys. Chem. A 2010, 114, 765–777.
[33] B. C. Shepler, B. J. Braams, J. M. Bowman. “Roaming” Dynamics in CH3CHO
Photodissociation Revealed on a Global Potential Energy Surface. J. Phys. Chem.
A 2008, 112, 9344–9351.
[34] D. U. Andrews, S. H. Kable, M. J. T. Jordan. A Phase Space Theory for Roaming
Reactions. J. Phys. Chem. A 2013, 117, 7631–7642.
[35] B. C. Shepler, Y. Han, J. M. Bowman. Are Roaming and Conventional Saddle
Points for H2CO and CH3CHO Dissociation to Molecular Products Isolated from
Each Other? J. Phys. Chem. Lett. 2011, 2, 834–838.
[36] L. B. Harding, S. J. Klippenstein, A. W. Jasper. Separability of Tight and Roaming
Pathways to Molecular Decomposition. J. Phys. Chem. A 2012, 116, 6967–6982.
[37] K. Fukui. Formulation of the reaction coordinate. J. Phys. Chem. 1970, 74, 4161–
4163.
[38] H. Xiao, S. Maeda, K. Morokuma. Global ab Initio Potential Energy Surfaces for
Low-Lying Doublet States of NO3. J. Chem. Theory Comput. 2012, 8, 2600–2605.
120
[39] Z. Homayoon, J. M. Bowman. Quasiclassical Trajectory Study of CH3NO2 De-
composition via Roaming Mediated isomerization Using a Global Potential Energy
Surface. J. Phys. Chem. A 2013, 117, 11665–11672.
[40] H. Waalkens, S. Wiggins. Geometrical models of the phase space structures
governing reaction dynamics. Regular and Chaotic Dynamics 2010, 15, 1–39.
[41] F. A. Mauguière, P. Collins, G. S. Ezra, S. C. Farantos, S. Wiggins. Multiple transition
states and roaming in ion-molecule reactions: A phase space perspective. Chem.
Phys. Lett. 2014, 592, 282–287.
[42] F. A. Mauguière, P. Collins, G. S. Ezra, S. C. Farantos, S. Wiggins. Roaming dynam-
ics in ion-molecule reactions: Phase space reaction pathways and geometrical
interpretation. J. Chem. Phys. 2014, 140.
[43] F. A. Mauguière, P. Collins, Z. C. Kramer, B. K. Carpenter, G. S. Ezra, S. C. Faran-
tos, S. Wiggins. Phase Space Structures Explain Hydrogen Atom Roaming in
Formaldehyde Decomposition. The Journal of Physical Chemistry Letters 2015, 6,
4123–4128.
[44] F. A. Mauguière, P. Collins, Z. C. Kramer, B. K. Carpenter, G. S. Ezra, S. C. Farantos,
S. Wiggins. Roaming: A Phase Space Perspective. Annual Review of Physical
Chemistry 2017, 68, 499–524.
[45] K. Müller, L. D. Brown. Location of saddle points and minimum energy paths by a
constrained simplex optimization procedure. Theoretica chimica acta 1979, 53,
75–93.
[46] C. Wang, R. M. Stratt. Global perspectives on the energy landscapes of liquids,
supercooled liquids, and glassy systems: The potential energy landscape ensemble.
The Journal of Chemical Physics 2007, 127, 224503.
[47] W. E, E. Vanden-Eijnden. Transition-Path Theory and Path-Finding Algorithms for
the Study of Rare Events. Annu. Rev. Phys. Chem. 2010, 61, 391–420.
[48] C. N. Nguyen, J. I. Isaacson, K. B. Shimmyo, A. Chen, R. M. Stratt. How dominant
is the most efficient pathway through the potential energy landscape of a slowly
diffusing disordered system? The Journal of Chemical Physics 2012, 136, 184504.
121
[49] C. N. Nguyen, R. M. Stratt. Preferential solvation dynamics in liquids: How
geodesic pathways through the potential energy landscape reveal mechanistic
details about solute relaxation in liquids. The Journal of Chemical Physics 2010,
133, 124503.
[50] D. Jacobson, R. M. Stratt. The inherent dynamics of a molecular liquid: Geodesic
pathways through the potential energy landscape of a liquid of linear molecules.
The Journal of Chemical Physics 2014, 140, 174503.
[51] Q. Ma, R. M. Stratt. Potential energy landscape and inherent dynamics of a hard-
sphere fluid. Phys. Rev. E 2014, 90, 042314.
[52] L. Frechette, R. M. Stratt. The inherent dynamics of isotropic- and nematic-phase
liquid crystals. The Journal of Chemical Physics 2016, 144, 234505.
[53] D. Wales, Energy Landscapes: Applications to Clusters, Biomolecules and Glasses,
of Cambridge Molecular Science, Cambridge University Press, 2003.
[54] X. Zhang, S. Zou, L. B. Harding, J. M. Bowman. A Global ab Initio Potential
Energy Surface for Formaldehyde. The Journal of Physical Chemistry A 2004, 108,
8980–8986.
[55] X. Zhang, J. L. Rheinecker, J. M. Bowman. Quasiclassical trajectory study of
formaldehyde unimolecular dissociation: H2CO → H2+CO, H+HCO. The Jour-
nal of Chemical Physics 2005, 122, 114313.
[56] H. W. Kuhn, A. W. Tucker, Nonlinear Programming in Proceedings of the Second
Berkeley Symposium on Mathematical Statistics and Probability, University of
California Press, Berkeley, California, pp. 481–492.
[57] C. Eckart. Some Studies Concerning Rotating Axes and Polyatomic Molecules.
Phys. Rev. 1935, 47, 552–558.
[58] P. L. Houston, R. Conte, J. M. Bowman. Roaming Under the Microscope: Trajectory
Study of Formaldehyde Dissociation. The Journal of Physical Chemistry A 2016,
120, 5103–5114.
122
[59] G. H. Peslherbe, H. Wang, W. L. Hase, Monte Carlo Sampling for Classical Trajec-
tory Simulations, Vol. 105 of Advances in Chemical Physics, John Wiley & Sons,
Inc., 1999, pp. 171–201.
[60] H. Goldstein, Classical Mechanics, of Addison-Wesley series in physics, Addison-
Wesley Publishing Company, Reading, MA, USA, 1980.
[61] M. Abramowitz, I. Stegun, Handbook of Mathematical Functions: With Formulas,
Graphs, and Mathematical Tables, of Applied mathematics series, Dover Publica-
tions, 1964.
[62] M. Galassi, J. Davies, J. Theiler, B. Gough, G. Jungman, P. Alken, M. Booth, F. Rossi,
GNU Scientific Library Reference Manual 3rd ed., Network Theory, Ltd., 2009,
http://www.gnu.org/software/gsl.
[63] U. M. Ascher, L. R. Petzold, Computer Methods for Ordinary Differential Equa-
tions and Differential-Algebraic Equations, SIAM: Society for Industrial and Ap-
plied Mathematics, 1998.
[64] J. W. Brady, J. D. Doll, D. L. Thompson. Cluster dynamics: A classical trajectory
study of A*n → An-1+A. The Journal of Chemical Physics 1981, 74, 1026–1028.
[65] M. Araujo, B. Lasorne, M. J. Bearpark, M. A. Robb. The photochemistry of formalde-
hyde: Internal conversion and molecular dissociation in a single step? Journal of
Physical Chemistry A 2008, 112, 7489–7491.
[66] C. Cerjan, W. Miller. On finding transition-states. Journal of Chemical Physics
1981, 75, 2800–2806.
[67] W. H. Miller, N. C. Handy, J. E. Adams. Reaction path Hamiltonian for polyatomic
molecules. The Journal of Chemical Physics 1980, 72, 99–112.
[68] J. Nelder, R. Mead. A simplex-method for function minimization. Computer Journal
1965, 7, 308–313.
[69] S. van der Walt, S. C. Colbert, G. Varoquaux. The NumPy Array: A Structure for
Efficient Numerical Computation. Computing in Science & Engineering 2011, 13,
22–30.
123
[70] S. Peyton Jones, et al.. The Haskell 98 Language and Libraries: The Revised Report.
Journal of Functional Programming 2003, 13, 0–255.
[71] M. Matsumoto, T. Nishimura. Mersenne twister: a 623-dimensionally equidis-
tributed uniform pseudo-random number generator. ACM Transactions on Mod-
eling and Computer Simulation 1998, 8, 3–30.
[72] O. Tange. GNU Parallel - The Command-Line Power Tool. ;login: The USENIX
Magazine 2011, 36, 42–47.
[73] D. V. Cofer-Shabica, Source code from ‘Potential landscape perspectives on roam-
ing: Insights on formaldehyde from geodesics paths’, C & Haskell source code,
Brown University, 2018, doi:10.7301/Z0FT8JJ6.
[74] D. Eberly, Derivative Approximation by Finite Differences, Tech. Rep., Geo-
metric Tools, LLC, 2001, https://www.geometrictools.com/Documentation/
FiniteDifferences.pdf.
[75] T. Cover, J. Thomas, Elements of Information Theory, Wiley, 2006.
[76] H. Theil. The entropy of the maximum entropy distribution. Economics Letters
1980, 5, 145 – 148.
[77] D. R. Yarkony. Nuclear dynamics near conical intersections in the adiabatic repre-
sentation: I. The effects of local topography on interstate transitions. The Journal
of Chemical Physics 2001, 114, 2601–2613.
[78] D. G. Truhlar, B. C. Garrett. Variational transition-state theory. Accounts of Chem-
ical Research 1980, 13, 440–448.
[79] W. L. Hase, Dynamics of Unimolecular Reactions, Vol. 2 of Modern Theoretical
Chemistry, Plenum Press, 1976, chapter 3.
[80] H. Touchette. The large deviation approach to statistical mechanics. Physics Re-
ports 2009, 478, 1–69.
[81] R. D. Levine, R. B. Bernstein, Thermodynamic Approach to Collision Processes,
Vol. 2 of Modern Theoretical Chemistry, Plenum Press, 1976, chapter 7.
124
[82] D. H. Ess, S. E. Wheeler, R. G. Iafe, L. Xu, N. Çelebi-Ölçüm, K. N. Houk. Bifur-
cations on Potential Energy Surfaces of Organic Reactions. Angewandte Chemie
International Edition 2008, 47, 7592–7601.
[83] Y. J. Hong, D. J. Tantillo. Biosynthetic consequences of multiple sequential post-
transition-state bifurcations. Nature Chemistry 2014, 6, 104–111.
[84] F. W. Wiegel, Introduction to Path-integral Methods in Physics and Polymer Sci-
ence, World Scientific Publishing Co Pte Ltd, Singapore, 1986.
[85] J. G. Simmonds, A Brief on tensor analysis, of Undergraduate texts in mathematics,
Springer-Verlag, New York, NY, USA, 1982.
[86] G. Golub, C. Van Loan, Matrix Computations, of Matrix Computations, Johns
Hopkins University Press, 2012.
[87] W. K. Hastings. Monte Carlo Sampling Methods Using Markov Chains and Their
Applications. Biometrika 1970, 57, 97–109.
[88] Information Technology Laboratory, Secure Hash Standard (SHS), Tech. Rep. FIPS
PUB 180-4, National institute for Standards and Technology, Gaithersburg, MD
20899-8900, 2012, http://csrc.nist.gov/publications/fips/fips180-4/
fips-180-4.pdf.
APPENDIX A
DERIVATION OF PATH-INTEGRAL
FORM OF THE EQUATION FOR
FREE DIFFUSION
We present a derivation and discussion of Wang and Stratt’s path-integral expression for
the diffusion propagator [8], which suggests geodesics on energy landscapes may be a
way of getting at the inherent dynamics of a chemical system. The propagator is given
as: Z R,t " Z t 2 #
1 dR
G R0 → R, t = e D R(τ ) exp – e dτ , (A.1)
e e R0 ,0 e 4D 0 dτ
e
which specifies the probability of transitioning from state R0 to R in time t for a system
under free diffusion (no forces) as an integral over the space of all paths, R(τ ). D is the
e e
diffusion constant and R is the configuration-space vector specifying the state of the
e
system. This appendix contains two parts: a derivation of the above expression and
e
then a review of the calculus of variations to derive the unconditionally-minimized,
straight-line portion of the geodesics.
125
126
A.1 Derivation
This section closely follows the first chapter of Weigel’s excellent introduction to path
integrals [84]. In the absence of external forces1 , diffusion is a process characterized by
the following dynamical relation:
d
G(R, t) = D∇2 G(R, t) , (A.2)
dt e e
where G(R, t) is the probability of finding a system in configuration R at time t and D is
the diffusion constant, which governs the intrinsic rate of diffusion.
e e
With the initial condition G(R, t = 0) = δ(R – R0 ), one can find the solution for a
Cartesian system of α degrees of freedom to be:
e e e
2
" #
(R – R 0 )
G(R0 → R, t) = (4πDt)–α/2 exp – e e , (A.3)
e e 4Dt
as can be verified by substitution.
Suppose we are interested in a particular path2 between R0 and R. We could
represent such a path by discretizing it over (N + 1) bits of time, each of duration ε,
e e
such that: (N + 1)ε = t. For notational ease, we define,
ti = ε · i
Ri = R t i ,
e e
giving the discrete time and system configuration along the path. We also have tN+1 = t.
This procedure is illustrated in figure A.1.
To compute the probability of tracing such a path, we rely on the independence
of the probabilities of taking each such step. The probability of the path is then the
probability of propagating from R0 to R1 in time ε multiplied by the probability of
e e
1 Studying free diffusion allows us to treat complicated systems from another perspective. We could
solve the diffusion equation in the presence of a complicated potential, but this is neither easy nor
transferable. Free diffusion is much more straight-forward. However, we still don’t get a free lunch:
the potential will impose stringent boundary conditions in a yet unspecified manner.
2 This may seem an odd question. Why are we creating more work if we have solved already the
diffusion equation? The answer lies in the formal expression to which we arrive if we study the
solution as a function of paths.
127
Figure A.1: Discretizing a Path
Note that while this figure traces the time evolution of a single spatial coordinate, paths
of arbitrary dimension are amenable to this decomposition. Figure modified from [84].
propagating from R1 to R2 in time ε and so on. By inserting eq. A.3 for each step, we
have:
e e
YN
G(R0 → R, t; {R1 , R2 . . . . RN }) = G(Ri → Ri+1 , ε)
e e e e e i=0 e e
– Ri ) 2
iN+1 N (R
i+1
= (4πDε)–α/2
h X
exp – e , (A.4)
4Dε
e
i=0
which gives the probability of diffusing from R0 to R in a time t via the N ordered points
{R1 , R2 . . . . RN }. We could recover our original expression (eq. A.3) by integrating3
e e
eq. A.4 over the domain of each Ri :
e e e
e
Z Z Z
G(R0 → R, t) = dR1 dR2 . . . dRN G(R0 → R, t; {R1 , R2 . . . . RN }) (A.5)
e e e e e e e e e e
Suppose we increased N until the path appeared smooth; in the continuous limit,
3 Recall the Gaussian integral: Z +∞ r
–ax2 π
dx e =
–∞ a
128
N → ∞ and ε → 0. Focusing for the moment on the exponential, we can write4 :
2
N
X i+1(R – R i )
lim exp – e = (A.6)
4Dε
e
ε→0
N→∞ i=0
!2
1 X N R i+1 – R i
lim exp – ε = (A.7)
4D ε
e e
ε→0
N→∞ i=0
Z t dR !2
1
exp – dτ , (A.8)
4D 0 dτ
e
where R(τ ) is a continuous function on [0, t] specifying the path. Now we can write:
e
G(R0 → R, t) =
dR 2
(A.9)
e e !
Z t
iN+1 Z 1
Z Z
–α/2
h
lim (4πDε) dR1 dR2 . . . dRN exp – dτ
4D 0 dτ
e
ε→0
N→∞ e e e
To express this more compactly, we define the following operator5 :
Z R,t Z Z Z
e D[R(τ )] ≡ lim (4πDε)–α(N+1)/2 dR1 dR2 . . . dRN , (A.10)
R0 ,0 e ε→0 e e e
e N→∞
which explicitly specifies the boundary values: R0 at t = 0 and R at time t. This is a
path integral. With this notation in hand, we arrive at our original equation:
e e
Z R,t " Z t 2 #
1 dR
G(R0 → R, t) = e D[R(τ )] exp – e dτ , (A.11)
e e R0 ,0 e 4D 0 dτ
e
4 Using the definition of a derivative:
df f(t + ε) – f(t)
= lim .
dt ε→0 ε
And the approximation for an integral:
N
X Z b
lim f (a + i∆τ ) ∆τ ≈ dτ f(τ ) ,
N→∞ a
i=0
where ∆τ = b–a N .
5 Nota bene, this “operator” is really just notational shorthand for the procedure we followed to arrive
here. Take care that any subsequent manipulations respect with the discretization—they must!
129
which expresses the diffusion propagator as an integral over the space of all possible
paths connecting the boundary values.
From eq. A.11, we can observe that when diffusion is slow, D is small and therefore
the paths that will dominate must minimize the integral within the exponential6 . Those
paths will be the shortest paths. The key observation is the following: the extremization
of any quantity of the form: Z
I= dτ f(τ ) (A.12)
can also be effected by extremizing7 :
Z
I0 = dτ H f(τ ) (A.13)
where H[x] is a strictly positive function with non-negative derivative, defined on the
range of f(τ ). Thus, minimizing
dR 2
Z t
dτ (A.14)
dτ
e
0
√
can be achieved by taking H[x] = x and minimizing
s
Z t 2
dR
dτ (A.15)
dτ
e
0
Bringing the differential into the root, eq. A.15 reduces to
Z tq Z t
dR · dR = (A.16)
dR
0 e e 0 e
which is clearly the length of the path.
A.2 Unconditional minimization
In this section, we follow Goldstein [60] to find the form of the (shortest) paths, which
unconditionally minimize the action. Again, we will work in Cartesian coordinates for
6 That is, the dominant paths will obey the classical mechanical principle of least action:
Z
δS R(τ ) = 0, S R(τ ) = dτ (2T) .
e e
From here it is a simple extension to systems with unequal masses and/or non-Cartesian bases.
7 Consider the chain rule with the variational derivative.
130
an otherwise unconstrained system. See Jacobson and Stratt [50] for work dealing with
free rotations and Frechette and Stratt [52] for a case where a constraint is applied to
the free steps.
Using calculus to find the extrema of a function, for instance h(x), is a familiar
operation. By equating the derivative of h(x) with 0, we locate values of x for which h(x)
is unchanging or stationary; these locations often correspond to local extrema of the
function. The problem treated by the calculus of variations is analogous. We seek to
extremize an object that takes a function (in our case, a path) as its argument. That is,
we find some path such that the object is stationary with respect to variations in that
function.
From eq. A.11, we seek to minimize an integral over kinetic energy, which in general
is a function of coordinates and their derivatives. Working in one dimension8 , we seek
to find the path R(τ ) that extremizes:
Z t
J= dτ f R, R˙ (A.17)
0
That is, we seek R(τ ) such that J is unchanged for infinitesimal variations in R. We can
represent the paths neighboring the correct (extremal) path, R(τ ) by adding an arbitrary
function, η(τ ) in proportion to a small variable parameter, α:
R(τ , α) = R(τ ) + αη(τ ) (A.18)
As we are interested in R such that it connects R(0) to R(t), we require that:
η(0) = η(t) = 0 (A.19)
J in eq. A.17 is now a function of α as well:
Z t
J(α) = dτ f R(τ , α), R(τ
˙ , α) (A.20)
0
We can encode the requirement for stationary J in the following familiar form9 :
dJ
0= (A.21)
dα α=0
8 This procedure extends to systems with an arbitrary number of degrees of freedom.
9 We require that the derivative be evaluated at α = 0 because this corresponds to the unperturbed
path we seek.
131
We can bring the α derivative inside the integral in eq. A.20 and applying the chain
rule, we have: !
Z t
dJ ∂f ∂R ∂f ∂ R˙
= dτ +
dα 0 ∂R ∂α ∂ R˙ ∂α
From eq. A.18, we have that ∂α ∂R = η and ∂ R˙ = η,
˙ the arbitrary function of τ and its
∂α
time derivative. This leaves:
Z t
dJ ∂f ∂f
= dτ η+ η˙ (A.22)
dα 0 ∂R ∂ R˙
Consider the second term in the integral:
Z t
∂f
dτ η˙
0 ∂ R˙
This expression is susceptible by integration by parts 10 Re-arranging leaves us with:
∂f t
Z t Z t
∂f d ∂f
dτ η˙ = η – dτ η
0 ∂ R˙ ∂ R˙ 0 0 dτ ∂ R˙
Because of the bounds on η, η(0) = η(t) = 0, the surface terms vanish! Inserting back
into eq. A.22 and factoring out η gives:
Z t
dJ ∂f d ∂f
= dτ – η=0 (A.23)
dα α=0
0 ∂R dτ ∂ R˙
where evaluating the derivative at α = 0 gives R and R˙ as functions of τ alone. Since η
is an arbitrary function, the integral is stationary only if
∂f d ∂f
– =0 (A.24)
∂R dτ ∂ R˙
which gives the conditions on R to minimize J11 .
10 Recall the general form,
Z b Z b
udv = uv|ba – vdu
a a
And identify:
∂f d ∂f
u= du = dτ
∂ R˙ dτ ∂ R˙
dv = ηdτ
˙ v=η
11 Notice that inserting the Lagrangian for f gives S = J, the action, and yields Lagrange’s equations of
motion.
132
Minimizing The Force-Free Action
2
dR
From eq. A.11, we have f = dτe
in eq. A.24. Proceeding, we find that the path
which minimizes eq. A.11 is specified by the differential equation:
¨=0
R (A.25)
e
which, when integrated and solved for its boundary values, yields the equation for a
straight line (and the geodesic!):
τ τ
R(τ ) = R(0) 1 – + R(t) (A.26)
e e t e t
this amounts to linearly interpolating between R(0) and R(t) in time t for each of the 3N
coordinates. To actually compute such paths we discretize12 eq. A.26 as follows:
e e
R – Ri
Ri+1 = Ri + δs e f e i (A.27)
e e |Rf – R |
e e
Where Rf = R(t) and δR is a small step-size. Small here is relative to features of the
landscape. In practice, δR should be as large as it can be such that the lengths of
e e
computed geodesics are still converged to a maximum; see discussion in Section 2.2.2.
12 Consider R(τ + ∆) – R(τ ).
e e
APPENDIX B
T WO MORE WAYS OF PRESERVING
THE CENTER OF MASS DURING
ESCAPE STEPS
Vanilla Newton-Raphson
When the procedure for free steps described in Section 2.2 (eq. 2.9) moves the system
such that V(R(τ )) > EL , we need a way back to the boundary. That is, we seek the nearest
solution to V(R(τ )) = EL . A classic strategy for solving this problem is a Newton-Raphson
e
root search.
e
Here, we expand the function of interest to first order and run downhill:
V(R) ≈ V(R0 ) + ∇V · (R – R0 ) = EL (B.1)
e e e R0 e e
e
Where R0 was the first offending step in eq. A.27. We make the guess that heading
downhill via the steepest path will be optimal; this suggests a solution of the form:
e
(R – R0 ) = C ∇V R ≡ ∆R (B.2)
e e e e0 e
133
134
Which immediately yields the constant1 :
EL – V(R0 )
C= e2
∇V R0
e e
This procedure is then iterated until the V(R) = EL boundary is found. It may look
like eq. B.2 renders our problem solved, but as we prove in Section 2.2.1, this naïve
e
root search fails to preserve the center of mass and spuriously lengthens geodesics
computed with it. In this appendix, we present two wildly different methods of coping
with the problem. The first is the essence of simplicity, though limited in its generality.
By comparison, the second technique is a swatting a fly with a laser cannon, but
demonstrates a useful and general principle.
B.1 Swatting a fly without moving: Follow the inverse-
mass-weighted gradient
Never let it be said that a little physical intuition won’t go a long way. It turns out that the
techniques discussed in Section 2.2.1 and to be discussed in Section B.2 solve a much
harder problem than is necessary—they will both project our center-of-mass perturbing
motion from any configuration-space displacement. They are included because of their
generality and because the result in Section 2.2.1 is what we actually implemented in
our investigations of geodesics on formaldehyde.
Unfortunately physical insight doesn’t always occur prior to pages of vector algebra
or hundreds of thousands of core-hours of computer time2 . However, the result may
still be useful in the future and so we provide a pedagogical exposition here; if we’re
not careful, the lead-up will take longer than the proof.
Our task is to follow a gradient downhill without perturbing the center of mass. That
sounds awfully similar to the way that classical mechanics works; surely the general
purpose projection operators are doing more than is required. If we re-write Newton’s
1 The solution should be familiar from Section 2.2, eq. 2.11, from which we have ζ = –C.
2 which is not to say that we could have avoided all those hours if the insight had struck earlier. See
the notes on performance at the end of Appendix E on page 164; this section does not address the
bottle-neck.
135
second law for Cartesian coordinates in terms of the influence of a potential [60], we
have:
F=M·a (B.3)
e e
= –∇V (B.4)
e
Where F is the force experience by the system due to the potential V, a is the system’s
acceleration and M is the diagonal N × N mass matrix defined in eq. 2.69:
e e
(M)iµ,jν = δi,j δµ,ν mi i, j = 1 . . . N ; µ, ν = x, y, z (B.5)
In the absence of external fields, classical mechanics never changes the center of mass3 .
Suppose a dynamical system is momentarily at rest; in what direction does it first
move? Instantaneously, it moves in the direction of its acceleration (later this becomes
complicated by the fact that it is already moving). That acceleration is:
a = –M–1 ∇V (B.6)
e e
where M–1 is simply the diagonal inverse-mass matrix given by:
1
M–1
= δi,j δµ,ν (B.7)
iµ,jν mi
Equation B.6 suggests that we can escape center-of-mass perturbation simply following
the inverse-mass-weighted gradient.
Could this really be? In Section 2.2.1, we arrived at eq. 2.35 right before showing
that the “vanilla” Newton-Raphson root search displaced the center of mass:
? ~
~0 = (B.8)
M · ∇ V R n
e e e
where M~ is the 3 × 3N dimensional, block-diagonal matrix composed of N ordered,
3 × 3 blocks with mi /Mtotal along the diagonals.
e
~ = mi 1 , i = 1 . . . N
↔
M (B.9)
e i Mtotal
3 This is a restatement of Newton’s first law.
136
If, instead of following the gradient, we followed the inverse-mass-weighted gradient
suggested by eq. B.6, we would require that:
~ · M–1 · ∇V
~0 = M (B.10)
n
e e R
e
= ~1 · ∇V n (B.11)
e e R
e
where ~1 is the 3 × 3N dimensional, block-diagonal matrix composed of N ordered,
3 × 3 identity blocks4 .
e
↔
~1 = 1 , i = 1 . . . N (B.12)
e i
Equation B.11 is a sum along each coordinate, {x, y, z}, and can be rewritten as:
N
? X
0= ∇V Rn ; µ = x, y, z (B.13)
i
e e µ
Which can be cast in a more familiar form as the requirement that the sum of forces in
an isolated system vanish:
N
= ~Fi
X
0= (B.14)
i
This is proven in many classical mechanics texts [60] and follows from ~F12 = –~F21 .
Namely that the force exerted on one object by another is equal and opposite to the
force exerted by the second on the first—Newton’s third law. A Newton-Raphson root
search based around eq. B.6 will therefore preserve the center of mass of the system.
Solving eq. B.1 with the suggested guess, ∆R = C · M–1 · ∇V R yields:
e e e0
EL – V R
C= (B.15)
∇V R · M–1 · ∇V R
e
e e0 e e0
And so our center-of-mass respecting analogue of the steepest-descent Newton-Raphson
search takes steps like:
EL – V R
∆R = · M–1 · ∇V R (B.16)
∇V R · M–1 · ∇V R
e e0
e
e e0 e e0
e
4 We T ~ .
saw this matrix earlier in eq. 2.43 under the name C
e
137
B.2 Laser-cannon: A general projection
Suppose we modify eq. B.1 to respect a constraint? After all eq. B.2 was based on a
guess about following the gradient. If, instead, we chose to follow a path constrained to
be parallel to some other direction, n
ˆ , eq. B.2 would have the following form:
(R – R0 ) = Cn
ˆ (B.17)
e e
and inserting into eq. B.1 would yield:
V(R) ≈ V(R0 ) + C ∇V · n
ˆ = EL (B.18)
e e e R0
e
Solving for C yields:
EL – V(R0 )
C= e
∇ V R · n
ˆ
e e0
While this is well an good, it doesn’t quite solve our problem. We need to be able
to impose a constraint that is orthogonal to some directions, those of center of mass
motion. What if we follow the path of steepest descent subject to the constraint that
it is orthogonal to some configuration-space vector which would move the center of
mass, ξ?ˆ
To construct such a step we take a brief detour to discuss dyads. The following
treatment uses the notation of real-space vectors, but is applicable to vectors of arbitrary
dimension (such as our configuration space vectors). This treatment is reprised and
expanded in [85].
A dyad (or dyadic) is a 2nd order tensor formed from the direct product of 2 vectors.
The dyadic ~u~v is defined such that the action of this tensor on a vector w
~ is:
~u~v(w
~ ) = ~u(~v · w
~) (B.19)
The Cartesian components are (~u~v)α,γ = uα vγ or
u v u v u v
1 1 1 2 1 3
~u~v = u2 v1 u2 v2 u2 v3
u3 v1 u3 v2 u3 v3
138
Given a unit vector, u
ˆ , the definition in eq. B.19, makes it easy to construct an operator
which projects another vector onto u
ˆ . For example:
ˆu
u ˆ (w
~)=u
ˆ (u ~ ) = Projuˆ w
ˆ·w ~ (B.20)
gives the projection of w
~ on u
ˆ . See Figure B.1 for a geometric interpretation.
Figure B.1: A projection
Geometric realization of the projection of w
~ onto u
ˆ . Modified from [85].
A useful property of projection operators is their idempotency. From the geometric
significance, it must follow that repeat application of any projection, is equivalent to
single application. We show that this holds for u
ˆuˆ as follows:
(u
ˆuˆ )2 = u
ˆuˆu
ˆuˆ (B.21)
=u
ˆ (u ˆ )u
ˆ·u ˆ (B.22)
=u
ˆ (1)u
ˆ (B.23)
=u
ˆuˆ (B.24)
By induction, we can reason that (u
ˆuˆ )n = u
ˆuˆ for all n ≥ 1.
In addition to projection on u
ˆ , we can construct the projection off of (or orthogonal
to) u
ˆ . Such a projection is given by:
1 – uˆuˆ (B.25)
The orthogonality of the two projections is easily shown:
(1 – u
ˆuˆ )u
ˆuˆ (B.26)
=u ˆ– u ˆ 2 (B.27)
ˆu ˆu
=u
ˆuˆ–u
ˆuˆ (B.28)
=0 (B.29)
139
Armed with such a projection operator, the analogy to eq. B.17 for an orthogonal
downhill direction is clear. We now take a step down and project out all motion along
ˆ
ξ:
(R – R0 ) = C 1 – ξ ξ · ∇V R
ˆˆ (B.30)
e e e e e0
This, we again insert into eq. B.1, yielding:
V(R) ≈ V(R0 ) + C ∇V · 1 – ξ ξ · ∇V = EL
ˆˆ (B.31)
e e e R0 e e R0
e e
Solving for C yields:
EL – V(R0 )
C= e
∇V R · 1 – ξˆξˆ · ∇V R
e e0 e e e0
To enforce a stationary center of mass, we can introduce a coordinate transformation
to separate center of mass motion from internal coordinates:
PN
m ~R N
~Rc.m. ≡ Pj=1 j j ≡ 1
X
N
mj~rj (B.32)
m M
j=1 j j=1
where mj is the mass of the jth particle. The internal coordinates are then:
~rj ≡ ~Rj – ~Rc.m. (B.33)
Note that our system of 3N degrees of freedom is now specified with 3N+3 coordinates,
which have the following 3 constraints:
N
mj ~Rj = 0
X
(B.34)
j=1
one for each spatial dimension. We encode these constraints as vector operations in
configuration space as:
ξµ ≡ δµ,ν mj (B.35)
e jν
=⇒ ξ µ · R = 0 ∀ µ ∈ {x, y, z} (B.36)
e e
140
What does ξ µ look like? By way of example:
e
m1
0
0
m2
0
ξx =
0
.
e
..
mN
0
0
The magnitude of ξ µ is the root of the sum of the squares of the masses:
e
v
N
u
uX
ξ µ = t m2j (B.37)
u
e j=1
and so the unit vector pointing along ξ µ is:
e
δµ,ν mj
ξˆµ = qP (B.38)
jν N 2
j=1 mj
Constructing the corresponding dyad, we have:
mj mk
ˆ ˆ
ξµ ξµ = PN δ δ (B.39)
jγ,kα 2 µ,γ µ,α
j=1 mj
and the projection operator which eliminates center of mass motion along the µ direction
is then:
mj mk
1 – ξˆµ ξˆµ = δγ,α δj,k – δµ,γ δµ,α PN
2
(B.40)
jγ,kα
e j=1 mj
mj mk
= δγ,α δj,k – δµ,γ δµ,α PN (B.41)
2
m
j=1 j
141
Where we have made use of:
1 = δγ,α δj,k
e jγ,kα
and note that in eq. B.40
δγ,α δµ,γ δµ,α = δµ,γ δµ,α
as the term is finite only for α = γ = µ with µ in the range of α and γ.
To eliminate center of mass motion along each Cartesian direction, we form the
product of all of them:
Y
1–P ≡ 1 – ξˆµ ξˆµ (B.42)
e e µ e
This product reduces to a sum using the following observation:
ˆ ˆ ˆ ˆ ˆ ˆ
ξµ ξµ ξν ξν = ξµ ξµ δµ,ν
which is true because ξˆµ and ξˆν are orthogonal. This leaves:
!
1–P = 1–
X
ξˆµ ξˆµ (B.43)
e e e µ
Inserting eq. B.39 and using the same identities we did for eq. B.40, we arrive at:
mm
1 – P jγ,kα = δγ,α δj,k – PNj k 2 (B.44)
e e j=1 jm
With M2 ≡
PN 2 the projection defined by eq. B.44 has the following symmetric
j=1 mj
structure in 3 spatial dimensions:
m21
m1 m2
1– M – M
2 2
m21
m m 2
1– M – M 1 · · ·
2 2
m21
m1 m2
1 – –
M2 M2
2
1 – P = – m1 m2 m
1– M 2
(B.45)
M 2 2
2
e e
m m
1 2 m
– 1 – 2
M2 M2
m22
m1 m2
– M 1– M
2 2
.. ..
. .
142
Returning to the Newton-Raphson problem, we proceed in analogy to eq. B.30 and
posit that the appropriate solution is to follow the gradient while projecting out center
of mass motion:
(R – R0 ) = C 1 – P · ∇V R (B.46)
e e e e e e0
This, we again insert into eq. B.1, yielding:
V(R) ≈ V(R0 ) + C ∇V · 1 – P · ∇V = EL (B.47)
e e e R0 e e e R0
e e
Solving for C yields:
EL – V(R0 )
C=
∇ V R · 1 – P · ∇ V R
e
e e0 e e e e0
Which we insert back into eq. B.46 to find:
V R0 – EL 0
(R – R0 ) = – e 2 ∇ V R (B.48)
0 e e0
∇ V R0
e e
e e
where we have defined ∇0 ≡ 1 – P · ∇ as the center-of-mass invariant gradient
operator and made use of the idempotent property of projections.
e e e e
Applying eq. B.48 iteratively leads to the relation (compare to eq. 2.50):
V Rn – EL 0
Rn+1 = Rn – e 2 ∇ V R (B.49)
0 e en
∇ V Rn
e e
e e
where R0 is the first R(t) with V R(t) > EL from eq. A.27. This procedure will escape
the forbidden region while preserving the center of mass.
e e e
APPENDIX C
MICROCANONICAL SAMPLING
FOR THE ANHARMONIC
OSCILLATOR
This section follows and is expanded from the work of Peslherbe, Wang, and Hase [59].
C.1 Normal Mode Hamiltonian
In this section we give a description of the normal mode Hamiltonian and demonstrate
how to microcanonically sample its degrees of freedom.
For a system of n (separable) harmonic oscillators, the Hamiltonian of the system is
given as:
n n
1X 2
Pi + ωi2 Q2i
X
H= Ei = (C.1)
2
i=1 i=1
where the ωi are the frequencies of the oscillators and the normal coordinates, {Qi , Pi },
are related to the Cartesian coordinates and momenta, xi and pi , by the transformations:
p
Pi = √ i
mi
√
Qi = mi (xi – x0 i ) (C.2)
where ~x0 is the equilibrium position of the oscillator.
143
144
For such a system, the distribution function to microcanonically sample phase space
is an exactly obtainable object. We show its development in the next section.
Assignment of Energies to Normal Modes
Equivalent to choosing N values for Qi and Pi , we can assign to the N modes, N energies
and N phases. In this section we treat the energies. Supposing that we wish to inject a
total microcanonical energy E0 into the N normal modes, the probability that mode
1 has energy E1 is proportional to the number of ways the energy (E0 – E1 ) can be
distributed to the remaining (N – 1) modes. This is given by the density of states for the
oscillators. We then have:
(E – E1 )N–1–1
P(E1 ) = R E 0 (C.3)
0 N–1–1 dE
0 (E0 – E1 ) 1
We can extrapolate to the general expression by noticing that the number of states is
proportional to the remaining energy raised to the power of one less than the number
of remaining oscillators. With the denominator ensuring normalization we have:
Pi–1 N–1–i
(E0 – Ei – j=1 Ej )
P(Ei ) = Pi–1 (C.4)
R E0 – j=1 Ej Pi–1 N–1–i
0 (E0 – Ei – j=1 Ej ) dEi
Sampling equation C.4 directly would be a quite difficult. However the cumulative
distribution function allows sampling an arbitrary probability distribution as follows:
Z x
C(x) = P(x0 )dx0 = Ri (C.5)
–∞
Where Ri ∈ [0, 1) is a uniformly distributed random variable. For this energy distribution
specifically:
Z Ei
C(Ei ) = P(E0i )dE0i = Ri (C.6)
0
Integrating eqn. C.4 and solving for Ei yields:
i–1
Ej 1 – R1/(N–i)
X
E i = E 0 – i (C.7)
j=1
Equation C.7 can be used do assign the first (N – 1) modes. The Nth mode is assigned
EN = E0 – N–1j=1 Ej with probability 1.
P
145
Assignment of Random Phases to Normal Modes
Having assigned the energies of the normal modes, we assign to each normal mode, a
random phase. We uniformly sample the angle, by choosing a new random variable,
Ri ∈ [0, 1), for each mode and taking:
√
2Ei
Q0i =+ cos 2πRi (C.8)
ωi
0
Pi = – 2Ei sin 2πRi (C.9)
p
Where Q0i and Pi0 are the initial trial coordinate and momentum for the ith normal
mode. In this way we have specified the 2N coordinates of phase space with 2N random
variables.
We can verify that this yields the target energy by simple substitution into the
normal-mode Hamiltonian, eqn. C.1:
n
1X 2
H= Pi + ωi2 Q2i (C.10)
2
i=1
√
1
n h
i2 2Ei 2
+ ωi2
X
= – 2Ei sin 2πRi cos 2πRi (C.11)
p
2 ωi
i=1
n
1X
2Ei sin2 2πRi + cos2 2πRi
h i
= (C.12)
2
i=1
n
X
= Ei (C.13)
i=1
=H (C.14)
C.2 Construction of Normal Mode Coordinates
In this section we construct a transformation to normal coordinates for an arbitrary
potential. We proceed by approximating the Hamiltonian in the vicinity of a potential
minimum as a collection of harmonic oscillators. Then we diagonalize the approx-
imated Hamiltonian, finding the normal modes and their frequencies. Though the
normal modes will be an approximation for an anharmonic potential, the coordinate
transformation is exact.
146
In the neighborhood of a potential minimum1 , ~x0 , the Hamiltonian for an arbitrary
potential with N degrees of freedom can be brought into the form of the normal mode
Hamiltonian, eqn. C.1. We make a Taylor series expansion of the potential and truncate
all terms of greater than quadratic order:
N
1 X ∂ 2 V
V(~x) ≈ V|~x0 + xi – x0 i xj – x0 j (C.15)
2 ∂xi ∂xj
i6=j ~x0
∂V
Here we use the fact that the potential has a minimum at ~x0 to guarantee ∂x = 0 ∀ i,
i ~x0
which eliminates the first order terms.
Suitable shifting of the zero of the potential eliminates the zeroth order (constant)
term and application of equation C.2 transforms to mass weighted coordinates. To
eliminate the cross terms, we diagonalize the hessian matrix, H, given by:
2 2 2
" #
∂ V ∂ V ∂xi ∂xj ∂ V –1/2
Hij = = = mi mj (C.16)
∂Qi ∂Qj ∂xi ∂xj ∂Qi ∂Qj ∂xi ∂xj
0 ~x0 ~x0
We discuss the evaluation of such an object in Section 2.7.3.
Diagonalization [86, Chapter 8.2] yields a matrix, Ωij = δij ωi2 , containing the
frequencies of the normal modes and the eigenvectors (modes) associated with them.
The matrix L, whose columns are the eigenvectors, allows transformation from normal
coordinates back to Cartesian coordinates as in equation C.18.
L ≡ ~e1 , ~e2 , ~e3 , . . . (C.17)
where ~ei is the column eigenvector associated with the frequency ωi . In general (for
systems with N > 2) there will be 6 normal modes with frequencies approaching 0
corresponding to net translation in three dimensions and net rotation about three axes
(see also Section 2.7.3).
With these new coordinates and frequencies in hand, we follow the procedure of
section C.1 in the assignment of energies and phases.
1 More generally, this procedure can be applied anywhere ∇V = 0, whether a minimum, a maximum,
or a saddle point.
147
Transformation Back to Cartesian Coordinates
Once the normal mode phases and energies have been assigned (see sections C.1
and C.1), it is a simple matter to transform back to Cartesian coordinates.
1
~
~x =~x0 + M 2 LQ (C.18)
1
~p =M 2 L~P (C.19)
1 1
where Mij2 = δij mi2 is the matrix giving the masses associated with each coordinate and
~ and ~P are vectors giving the previously assigned normal coordinates.
Q
C.3 Corrections for Anharmonicity and Selection of Total
Angular Momentum
The normal modes only represent an approximation for our true Hamiltonian and
therefore there arises a spurious angular momentum and a deviation from the target
energy, E0 . To correct for this, we iteratively apply the following transformation until
convergence2 is reached.
1. Form the spurious angular momentum (this is simply the total angular momentum
of the molecule),
N
~js =
X
~ri × mi~vi (C.20)
i=1
Where the sum is over atomic centers and ~ri , ~vi , and mi give the Cartesian position,
velocity, and mass of the ith atom. To correct for this term we add ~j = –~js to the
molecule3 by computing the vector angular velocity:
ω = I–1~j (C.21)
where I–1 is the inverse of the moment of inertia tensor and adding this angular
velocity to each center.
~vi0 = ~vi + ω × ~ri (C.22)
2 We place the following constraints on total energy, angular momentum, and center of mass translation:
| E–E ~ ~ ~ ~˙ –8
E0 | < δE where E = H(x, p), |j| < δj, and |rcm | < δ rcm ; all tolerances are of the order of 10 .
0 ˙
3 This procedure could be extended to select arbitrary angular momentum by adding the term ~j = ~j0 –~js
where ~j0 is the desired angular momentum.
148
2. To correct the total energy, let E = H(~x, ~p) and scale the coordinates and momenta
(velocities would be fine too) as:
E0 (1/2)
~x0 =~x0 + (~x – ~x0 ) (C.23)
E
(1/2)
E
~p0 =~p 0 (C.24)
E
I have observed that this procedure can yield a fixed point (an infinitely repeated
sequence) for certain input parameters.
Explanation of Anharmonic Correction
The algorithm presented in the previous section may appear opaque, but this need not
be the case.
Step 1 simply treats the molecule as a rigid body. Equation C.20 is the expression
for the total angular momentum and equation C.21 is an inversion of another standard
expression for the angular momentum of a rigid body
~j = Iω (C.25)
Where I is the moment of inertia tensor [60].
As for step 2, it is exact to the extent that the Hamiltonian is harmonic. Consider
a system with a single harmonic degree of freedom (eqn. C.1 for N = 1) and target
total energy E0 . For convenience, we choose a shifted coordinate system such that the
minimum of the harmonic oscillator is at the origin. This has the effect of setting ~x0 = ~0
in eqn. C.23.
Our present coordinates, Q, P yield an energy E:
H(Q, P) = E 6= E0 (C.26)
and we desire that the transformed coordinates, Q0 , P0 yield the energy E0 :
?
H(Q0 , P0 ) = E0 (C.27)
149
Proceeding by substitution of the transformed coordinates into the Hamiltonian, we
have:
1 02 2
P + ω 2 Q0
H(Q0 , P0 ) = (C.28)
2
1 2 E0
2 2 E0
= P +ω Q (C.29)
2 E E
1 2
E0
P + ω 2 Q2
h i
= · (C.30)
E 2
E0
= ·E (C.31)
E
= E0 (C.32)
Which is as intended.
APPENDIX D
MONTE CARLO SAMPLING THE
LANDSCAPE ENSEMBLE
The first part of this appendix closely follows [64], whose methods we imitate. We seek
to generate a set of points which samples the roaming region (see Section 2.3.3) of the
formaldehyde potential energy surface in the landscape ensemble (Section 2.1.2). Such
sets at the appropriate landscape energies are the buckets from which the intermediate,
roaming points, are drawn when constructing roaming geodesics (Section 2.3.4).
Metropolis-Hastings sampling [87] is a Monte Carlo procedure for evaluating en-
semble averages of the form:
dxN P(x)f x
R
hfi = R Ne e (D.1)
dx P(x)
where f is a property of the N-dimensional space determined by x and P(x) is the
e
probability of a given state, x. The Metropolis algorithm is of particular use when P(x)
e e
is difficult to sample directly or to normalize. Its output is a set of states, xi , such that
e e
the average in eq. D.1 can be replaced by:
e
1X
hfi = (D.2)
f xi
N
i
e
The set, xi , is generated by a biased random walk in the space which compares
the relative probabilities, P, for the current and succeeding states. In this way the
e
probability of being in a particular state at a particular step only depends explicitly on
the prior step; such a sequence is called a “Markov chain”.
150
151
D.1 General requirements
Given an initial state, x0 , a (potentially un-normalized) probability distribution, P(x),
0
and a propagator for generating trial moves, F : xn → xn+1 , then the sequence of the
e e
random walk is given recursively as follows:
e e
x0 = x0 (choosen or otherwise given)
e e 0
P(x )
F(xn ) , with probability: min 1, en+1
P(xn )
(D.3)
xn+1 = e e
, otherwise
x
n
e
e
The probabilistic component of eq. D.3 can be handled by comparing a uniform,
random variable on [0, 1), ξ, to the target probability, p, and accepting if ξ < p. If this is
confusing, think of the limiting cases: p = 0 and p = 1.
F, the propagator can take many forms, but here, we use a center-of-mass preserving
variant of the one used in Brady et al. [64]. To generate trial moves for the random
walk, we displace a single random atom, j, along a random coordinate, by a random
amount, ∆~j =µ ˆj (ξ – 0.5)δx . Where ξ is a uniformly distributed, random variable on
[0, 1), δx is a scaling factor and µ
ˆj ∈ {ˆx, yˆ, zˆ} is along a randomly selected Cartesian axis
of the jth atom. After each such step, an additional move to preserve the center of mass
is made by moving all atoms by the same, appropriate amount. Thus the total move for
each atom i is:
mj
∆~ri = µ
ˆj (ξ – 0.5)δx δi,j – ; i = 1...N (D.4)
M
PN
where M = i=1 mi is the total mass of the system. We encode this move through the
equation:
F(x) = x + ∆R (D.5)
e e e
where ∆R = (∆~r1 , ∆~r2 . . . ∆~rN ) is the configuration-space vector of all single-atom
displacements defined in eq. D.4.
e
There is a bit of lore surrounding Markov chains, which indicates that an acceptance
ratio (i.e., the fraction of accepted trial steps) of 50% is “best”. Conversations with
Jimmie Doll (one of the authors of [64]) indicated that this is very rough and that a
better guide is perhaps “bigger than 10% and less than 90%”. One can adjust the
152
scaling factor, δx , to achieve the desired acceptance ratio during a trial phase. However,
once sampling for the set in eq. D.2, δx must remain fixed or the chain will fail to satisfy
detailed balance and the intended distribution will not be sampled. We describe a
naïve, but effective implementation of a scheme to pick δx in section a later section on
page 154.
D.2 Implementation specifics
We now give the specifics of our implementation of the foregoing procedure to sample
points from the roaming region of the formaldehyde potential energy surface.
A Probability-like Function
In the first section we stated that we could use un-normalized probability distributions.
This is because the distribution, P only appears in eq. D.3 in ratio with itself. This is
quite convenient because it obviates the need to compute what would be—in this case
and many others—a complicated, many-dimensional configurational integral.
Our definition of the roaming region is explained in detail in Section 2.3.3, but
recall that the structure of the region is defined in terms of requirements that
1. The restoring force on the roaming hydrogen is of small magnitude:
FH(0) < Fmax (D.6)
2. The hydrogens are separated by more than a minimum and less than a maximum
distance:
dmin < ~rHH(0) < dmax (D.7)
Hydrogen-hydrogen separation is given by ~rHH(0) and we define the restoring force as:
~ (0) V(R)
FH(0) = –ˆrH(0) –CM · ∇ (D.8)
H
e
where
~ (0) = ∂ ∂ ∂
∇ H , , (D.9)
∂xH(0) ∂yH(0) ∂zH(0)
153
and
ˆrH(0) –CM = ~rCM – ~rH(0) /
~rCM – ~rH(0)
(D.10)
These criteria can be formally encoded in a function as follows:
Proaming (R; Fmax , dmin , dmax ) ∝ Θ Fmax – FH(0) (D.11)
e
·Θ
~rHH(0)
– dmin (D.12)
·Θ dmax –
~rHH(0)
(D.13)
where Θ is the step function:
1, x>0
Θ(x) = (D.14)
0, otherwise
To sample the potential energy landscape ensemble ([46] and Section 2.1.2), we
can use a similar scheme:
Ppele (R; EL ) ∝ Θ EL – V(R) (D.15)
e e
where EL is the landscape energy.
Combining equation D.15 with D.11:
P(R) = Ppele (R; EL ) × Proaming (R; Fmax , dmin , dmax ) (D.16)
e e e
allows us to sample the portion of the roaming region that is also in the landscape
ensemble for a given EL .
It should be noted that since our combined function has binary outputs, the proba-
bilities in eq. D.3 will always be 0 or 1. This is a special case of the more general class
of problems than the Metropolis-Hastings algorithm is capable of handling—within the
allowed region, we seek uniform sampling rather than a variable density.
Put another way, because we have no notion of “sort of roaming” or “sort of in the
potential energy landscape ensemble”, the random walk will never do something “sort
of bad” i.e., will never take “uphill” steps. When Brady and co-workers[64] implemented
this scheme, it was to sample the energy shell of the microcanonical ensemble. While
there is no notion of “sort of the right energy” in the microcanonical ensemble, they
used a pre-limit form of the delta function for their probability, which allowed their
walker to meander back and forth across the energy shell.
154
Picking a Scaling Factor
For the linear propagator described in section D.1, we would like to optimize the scaling
factor, δx , such that the acceptance ratio falls within the desired bounds.
The binary probability function could introduce some added complications, but we
avoid them under the following assumption:
• The set to be sampled is dense; that is, points in the set are arbitrarily close to
others in the set
This condition allows us to assume that for sufficiently small steps, moves will always
fall within the acceptance region. This is important because it allows us to take the
δx → 0 limit as yielding an acceptance ratio of 1. In the case of a binary probability
function, the other limit, δ → ∞, implies an acceptance ratio equal to the value hP(~x)i.
In the case of the roaming region of formaldehyde, this average is 0 because the volume
of the roaming region is finite1 .
It would be convenient to assume that the set is connected; that is, it is possible
to translate between any two points in the set while remaining in the set. This would
actually be a stronger requirement than the first assumption and would allow us to
assume that the entire set is reachable from any x0 . But, in the potential energy
landscape ensemble (or microcanonical ensemble), we have no guarantees that such
e
an assumption will hold2 .
The general scheme for selecting δx is this:
1. Guess an initial value for δx
2. Compute a Markov chain of length k while keeping track of the acceptance ratio
3. If the ratio is larger than desired, scale by (1 + α) to increase the length. Likewise,
if the ratio is smaller than desired, scale by (1 – α) to decrease the step.
4. Go-to 2 and repeat M times.
1 orat least grows more slowly than the volume of the allowed regions of the landscape ensemble
with EL .
2 Indeed, we strongly suspect that it does not; see Section 3.6.
155
If α is too large, the results will be unstable as δx oscillates in and out of the acceptable
range. Too small and convergence will take a very long time. Similar observations
can be made about k. We find acceptable results with α = 0.05 and k = 1000 at
EL = 37600 cm–1 Using this plan for picking δx We achieved an acceptance ratio of
89.5% with a scaling factor of δx = 0.27562 a0 after 1.1 × 106 total steps. This step
size was then used for sampling the roaming region at all energies. All parameters used
to select δx are given in the table below.
Parameter Value
δx (guess) 0.01 a0
k 1000
target acceptance ratio 10% to 90%
α 0.05
M 1100
δx (final) 0.27562 a0
acceptance ratio 89.5%
The Formaldehyde Potential
Given that we have no reason to expect that the roaming region is self-connected, we
should initiate our random walk from as many x0 as possible. To seed the random
walk we took configurations satisfying both the landscape and roaming criteria from
e
molecular dynamics trajectories (see Section 2.3.2) that went to molecular products
(roaming trajectories). These criteria yielded 200–9000 points in the roaming region
for each landscape energy, EL (see Table D.1 for details). Using each of the points as
a different value for x0 , we generated Markov chains of length 105 , recording values
every 100 moves. These points were then shuffled and used as intermediate roaming
e
points in the geodesic path-finding algorithm; see Section 2.3.4
156
EL / cm–1 acceptance ratio seed configurations
34500 0.827227 248
∗35000 0.849454 2071
35500 0.865258 2588
∗36000 0.875344 3379
36223 0.878148 4040
37000 0.889417 6025
∗37600 0.895281 7627
37688 0.895886 7149
38000 0.898983 7168
38688 0.903714 8884
∗38814 0.903394 8723
∗41010 0.904021 7319
∗45000 0.942785 6432
Table D.1: Energy-specific acceptance ratios and seed configurations
We give acceptance ratios and the number of seed configurations for all energies for
which we sampled the roaming region. For all walks, δx = 0.27562 a0 was fixed as
determined at EL = 37600 cm–1 . For each seed configuration, we computed walks of
105 steps, recording a configuration every 100 moves. An asterisk (*) indicates energies
for which geodesics were actually computed.
APPENDIX E
INTEGRATING
EXTERNALLY- SUPPLIED FORTRAN
CODE
Joel Bowman’s group at Emory University provided us with the global potential energy
surface (PES) for formaldehyde [54]. Briefly, the function was formed by splicing together
several fits to two sets of high level (CCSD(T)/aug-cc-PVTZ and MR-CI/aug-cc-PVTZ)
ab initio calculations. Bowman transmitted the function as a set of “zipped” FORTRAN
source code files. When decompressed, the files represent about 5000 lines of code.
For future checks of the integrity of these files, their sha256 checksums [88] are given
here:
% sha256sum *.f
f708238435ffaa0ec30088327440d4ac6fcfdb3fe55d9e67ff08fc6e50525437 bonds.f
82f074e9c3d350e76c8ca152425b4c37258914bb6e1f12fd9014a9437f259cf0 getpot.f
3eea2fd4a3a5c52df4e803460700debfd00e59638653eb846c63839c03133f22 h2co-ccsdt.f
181db4ab7fe3aec03eb0ebaa6a3c397755d1b2ba6aceea31db744b5a9e10c39b h2co-mrci.f
d4d01e238d90375e4e43fc0756e5a552c4cd15d66db3821693bd8b624d0c63a8 main.f
%
As the provided code was written in FORTRAN and the project implemented in C,
some effort had to be expended to interpolate between the two. Primarily, we want to
be able to call the subroutine Getpot, contained in the file getpot.f.
157
158
This function and relevant declarations are given below. The declarations are taken
from the file main.f, which calls Getpot.
REAL(8), DIMENSION(36,3) :: xx
REAL(8), DIMENSION(36) :: xmass
REAL(8) :: v,v1,v2,s
INTEGER :: natom
subroutine Getpot (natom,xmass,xx,v,v1,v2,s)
This snippet declares two double precision arrays of dimension 36 × 3 and 36 × 1
called xx and xmass respectively as well as 4 double precision floating point values
and an integer. The subroutine Getpot takes these variables as parameters. A call to
getpot reads the contents of xx, xmass, and natom then sets the values of v, v1, v2,
and s. The variable v contains the value of the potential at the position specified by xx;
the other variables can be used internally.
The following comment in the file h2co-mrci.f:
c note h1 and h2 are not equivalent, it is assumed that
c h2 is further away from the other atoms than h1
suggest that the two hydrogens used internally are not equivalent. Indeed, one sees
differences on the order of a few percent when permuting the hydrogens if one is far
removed from the formyl. Our best-guess interpretation of the comment is that the
hydrogen most removed from the total center of mass should be input to Bowman’s
function as ‘h2’. We respect this stricture by including a check on the condition in our
wrapper around Bowman’s code. When it was not satisfied, we swap the hydrogens in
the external function call while preserving our internal representation. In this way, the
hydrogens are indeed identical to our wrapped function.
The two arrays are of length 36 because the code was written with reuse in mind.
The natom parameter is used to specify the portion of the arrays used. The code accepts
positions in Cartesian coordinates (in the order H1 , O, C, H2 ), which are then converted
to internal, rotationally and translationally invariant coordinates. The 4 atom system has
12 Cartesian coordinates and 12 – 6 = 6 internal coordinates. The internal coordinate
159
system in the Bowman-group code is the set of all inter-atomic distances1 .
The sources were compiled to object files using the command gfortran -O3 -c
bonds.f getpot.f h2co-ccsdt.f h2co-mrci.f. The flag -O3 indicates the highest
level of compiler optimizations and the flag -c indicates that the files should be compiled,
but not linked.
Once compiled, the object files can be inspected with the command nm, e.g.:
% nm --defined-only getpot.o
0000000000000000 T getpot_
0000000000000010 d rech.1042
0000000000000008 d rehh.1049
0000000000000000 d reoh.1050
%
Here, we can see that the function included in the file getpot.o is getpot_ (indi-
cated by T). Note the added underscore; the naming convention is compiler-dependent2 .
This is the name of the function we call from C code.
To call this function in a C source file, we declare an external, void function using
the name found above:
extern void EXTGETPOT(const int * natom, const double xmass[36]
,double xx[3][36], double * v, double * v1
,double * v2, double * s);
Several things are noteworthy. First we constant, EXTGETPOT, so that the name of the
FORTRAN function can be specified at compile time (for gfortran, from the GNU com-
piler collection, the appropriate call is getpot_). Second, all variables in FORTRAN are
passed by reference. Therefore C variables that are not already pointers (arrays are fine)
should be passed as pointers. Finally, FORTRAN’s memory layout for multidimensional
arrays is the transpose of C’s. For this reason, xx is declared as 3 × 36 instead of 36 × 3.
With an external function declared, it can be used in code! The following function
takes a gsl_vector* describing the position of all centers and returns the potential
1 Were there more than 4 centers, this set would over-determine the system in 3D. N(N–1) > 3N – 6
2
for N > 4.
2 The xl compilers used on IBM Blue Gene systems, for instance do not add the underscore.
160
energy of the configuration via a call to EXTGETPOT (perhaps getpot_) while respecting
difference between h1 and h2. Comments in the Bowman group’s code suggest that
the return value is in wavenumbers, but it is not (we verified this by comparison to
their published values [54]). The value is returned in atomic units or Hartrees (1 Ha =
EH = 2.19474 × 105 cm–1 ; see also Section 2.1.1).
double potential_formaldehyde(const gsl_vector * r){
assert(r->size == 12);
static const int natom = 4;
static const double xmass[36]={
1837.15 // H
,29156.95 // O
,21874.66 // C
,1837.15 // H
}; //will initialize the first 4 compnents and leave the others
//double xx[3][36];
static double v = 0, v1 = 0, v2 = 0, s = 0;
static double xx[3][36]; //transpose of fortran's 36x3
for (size_t coor = 0; coor < 3; coor++){
for (size_t atom = 0; atom < 4; atom++){
xx[coor][atom]=gsl_vector_get(r,(atom*3)+coor);
}
}
if (HOCHgetCMDist(r,formaldehydeOffset.H1) >
HOCHgetCMDist(r,formaldehydeOffset.H2)){
double temp;
const int H1 = 0;
const int H2 = 3;
161
for (size_t coor = 0; coor < 3; coor++){
temp = xx[coor][H1];
xx[coor][H1] = xx[coor][H2];
xx[coor][H2] = temp;
}
}
EXTGETPOT(&natom, xmass, xx, &v, &v1, &v2, &s);
return v;
}
The following constants and helpers are also defined in a separate header file.
#ifndef EXTGETPOT
#define EXTGETPOT getpot_
#endif
const int formaldehydeNumCenters = 4;
const size_t formaldehydeNumDimensions = 12;
const double formaldehydeMassTotal = 54705.91;
const struct offset_t formaldehydeOffset = {
.H1 = 0,
.O = 3,
.C = 6,
.H2 = 9
};
const struct mass_t formaldehydeMass = {
.H1 = 1837.15,
162
.O = 29156.95,
.C = 21874.66,
.H2 = 1837.15
};
const double formaldehydeMassArray[4] = { 1837.15,
29156.95,
21874.66,
1837.15};
const size_t formaldehydeOffsetArray[4] = {0,3,6,9};
Using the constants defined above, we compute the Hydrogen to center-of-mass
separation in the following function:
/*
gives the distance of the offset-specified atom from the total
center of mass.
*/
double HOCHgetCMDist(const gsl_vector *r, size_t offset){
const size_t space = 3;
double cm_array[space];
gsl_vector_view cm = gsl_vector_view_array(cm_array, space);
gsl_vector_set_zero(&cm.vector);
for (int i = 0; i < formaldehydeNumCenters; i++){
gsl_vector_const_view v = gsl_vector_const_subvector(
r,formaldehydeOffsetArray[i],space);
gsl_blas_daxpy(
formaldehydeMassArray[i] / formaldehydeMassTotal,
&v.vector, &cm.vector);
}
163
gsl_vector_const_view atom = gsl_vector_const_subvector(
r, offset, space);
gsl_vector_sub(&cm.vector, &atom.vector);
return gsl_blas_dnrm2(&cm.vector);
}
The C file is compiled to object code in the usually way:
% gcc -std=c99 -Werror -pedantic -Wall -W -Wmissing-prototypes \
-Wstrict-prototypes -Wconversion -Wshadow -Wpointer-arith -Wcast-qual \
-Wcast-align -Wwrite-strings -Wnested-externs -fshort-enums -fno-common \
-g -O2 -DHAVE_INLINE -I/share/apps/include -c potential_formaldehyde.c
%
A number of errors are checked by enabling the compiler warnings and is recom-
mended by the GNU Scientific Library (GSL) project [62]. Note the -I flag, which
specifies the location of the GSL headers; these may be in an alternate location. As
of this writing, the version installed on ted is 1.14; version 1.99 is installed under
/home/vale/toor/.
With C and FORTRAN object files in hand, the program can be linked:
% gcc bonds.o getpot.o h2co-ccsdt.o h2co-mrci.o \
potential_formaldehyde.o # ... other C objects follow \
-lgsl -lm -lgfortran -lgslcblas -L/share/apps/lib \
%
The last line of flags link the GSL, C math, and GCC FORTRAN libraries respectively.
The flag -lgslblas specifies using the GSL Basic Linear Algebra Subprograms (BLAS)
library. Another, optimized library could also be used. We attempted using the TACC
implementation, GotoBLAS23 , using the flags -lgoto2 -L/share/apps/GotoBLAS2,
but performance was substantively harmed4 . The location of the GSL library must be
3 See https://www.tacc.utexas.edu/tacc-projects/gotoblas2.
4 Note to future (student) readers, it might be worth trying the ATLAS implementation too.
164
specified with an -L flag.
Two quick notes about performance: Upon profiling the code we used to compute
geodesics, we found that approximately 95% of the program’s effort was spent in
the externally supplied FORTRAN code. This does not imply that the Bowman code is
inefficient, but just that fitting the formaldehyde potential is computationally demanding.
One result of this observation is that the program to find geodesics could have been
written in just about any reasonably performant language and we would have retained
the same order-of-magnitude performance as long as it was able to efficiently call the
Bowman code. Also, modern compilers are good things to use as compiler technology
has improved and can generate faster code. The version of gcc on ted is ancient—4.4
came out nearly a decade ago. By installing newer versions (e.g. 6.2) we were able to
achieve an order of magnitude speed-up in pure C code written for analysis.
E.1 Form of switching functions
The switching functions employed by Bowman’s group in constructing the fit are of the
form given below:
1 x≤0
w(x) = 1 – 10x3 + 15x4 – 6x5 0