Directly comparing GW150914 with numerical solutions of Einstein’s equations for binary black hole coalescence B. P. Abbott et al.* (LIGO Scientific Collaboration and Virgo Collaboration) (Received 10 June 2016; published 14 September 2016) We compare GW150914 directly to simulations of coalescing binary black holes in full general relativity, including several performed specifically to reproduce this event. Our calculations go beyond existing semianalyticmodels, because for all simulations—including sources with two independent, precessing spins —we perform comparisons which account for all the spin-weighted quadrupolar modes, and separately which account for all the quadrupolar and octopolar modes. Consistent with the posterior distributions reported by Abbott et al. [Phys. Rev. Lett. 116, 241102 (2016)] (at the 90% credible level), we find the data are compatible with a wide range of nonprecessing and precessing simulations. Follow-up simulations performed using previously estimated binary parameters most resemble the data, even when all quadrupolar and octopolar modes are included. Comparisons including only the quadrupolar modes constrain the total redshifted mass Mz ∈ ½64 M⊙ − 82 M⊙�, mass ratio 1=q ¼ m2=m1 ∈ ½0.6; 1�, and effective aligned spin χeff ∈ ½−0.3; 0.2�, where χeff ¼ ðS1=m1 þ S2=m2Þ · L̂=M. Including both quadrupolar and octopolar modes, we find the mass ratio is even more tightly constrained. Even accounting for precession, simulations with extreme mass ratios and effective spins are highly inconsistent with the data, at any mass. Several nonprecessing and precessing simulations with similar mass ratio and χeff are consistent with the data. Though correlated, the components’ spins (both in magnitude and directions) are not significantly constrained by the data: the data is consistent with simulations with component spin magnitudes a1;2 up to at least 0.8, with random orientations. Further detailed follow-up calculations are needed to determine if the data contain aweak imprint from transverse (precessing) spins. For nonprecessing binaries, interpolating between simulations, we reconstruct a posterior distribution consistent with previous results. The final black hole’s redshifted mass is consistent with Mf;z in the range 64.0 M⊙ − 73.5 M⊙ and the final black hole’s dimensionless spin parameter is consistent with af ¼ 0.62–0.73. As our approach invokes no intermediate approximations to general relativity and can strongly reject binaries whose radiation is inconsistent with the data, our analysis provides a valuable complement to Abbott et al. [Phys. Rev. Lett. 116, 241102 (2016)]. DOI: 10.1103/PhysRevD.94.064035 I. INTRODUCTION On September 14, 2015 09∶50∶45 UTC, gravitational waveswere observed in coincidence by the twin instruments of the Laser Interferometer Gravitational-wave Observatory (LIGO) located at Hanford, Washington, and Livingston, Louisiana, in the USA, an event known as GW150914 [1]. LVC-detect [1], LVC-PE [2], LVC-TestGR [3], and LVC- Burst [4] demonstrated consistency between GW150914 and selected individual predictions for a binary black hole coalescence, derived using numerical solutions of Einstein’s equations for general relativity. LVC-PE [2] described a systematic, Bayesianmethod to reconstruct the properties of the coalescing binary, by comparing the data with the expected gravitational wave signature from binary black hole coalescence [5], evaluated using state-of-the-art semi- analytic approximations to its dynamics and radiation [6–8]. In this paper, we present an alternative method of recon- structing the binary parameters of GW150914, without using the semianalytic waveform models employed in LVC-PE [2]. Instead, we compare the data directly with the most physically complete and generic predictions of general relativity: com- puter simulations of binary black hole coalescence in full nonlineargeneral relativity (henceforth referred toasnumerical relativity, or NR). Although the semianalytic models are calibrated to NR simulations, even the best available models only imperfectly reproduce the predictions of numerical relativity, on a mode-by-mode basis [9,10]. Furthermore, typical implementations of these models, such as those used inLVC-PE [2], consider only thedominant spherical-harmonic mode of the waveform (in a corotating frame). For all NR simulations considered here—including sources with two independent, precessing spins—we perform comparisons that account for all the quadrupolar spherical-harmonic waveform modes, and separately comparisons that account for all the quadrupolar and octopolar spherical-harmonic modes. The principal approach introduced in this paper is different from LVC-PE [2], which inferred the properties of GW150914 by adopting analytic waveform models. Qualitatively speaking, these models interpolate the outgoing gravitational wave strain (waveforms) between the well-characterized results of numerical relativity, as*Full author list given at the end of the article. PHYSICAL REVIEW D 94, 064035 (2016) 2470-0010=2016=94(6)=064035(30) 064035-1 © 2016 American Physical Society http://dx.doi.org/10.1103/PhysRevD.94.064035 http://dx.doi.org/10.1103/PhysRevD.94.064035 http://dx.doi.org/10.1103/PhysRevD.94.064035 http://dx.doi.org/10.1103/PhysRevD.94.064035 provided by a sparse grid of simulations. These interpolated or analytic waveforms are used to generate a continuous posterior distribution over the binary’s parameters. By contrast, in this study, we compare numerical relativity to the data first, evaluating a single scalar quantity (the marginalized likelihood) on the grid of binary parameters prescribed and provided by all availableNR simulations.We then construct an approximation to the marginalized like- lihood that interpolates betweenNR simulations with differ- ent parameters. To the extent that the likelihood is a simpler function of parameters than thewaveforms, thismethodmay require fewer NR simulations and fewer modeling assump- tions. Moreover, the interpolant for the likelihood needs to be accurate only near its peak value, and not everywhere in parameter space. A similar study was conducted on GW150914 using a subset of numerical relativity wave- forms directly against reconstructed waveforms [4]; the results reported here are consistent but more thorough. Despite using an analysis that has few features or code in common with the methods employed in LVC-PE [2], we arrive at similar conclusions regarding the parameters of the progenitor black holes and the final remnant, although we extract slightly more information about the binary mass ratio by using higher-order modes. Thus, we provide independent corroboration of the results of LVC-PE [2], strengthening our confidence in both the employed stat- istical methods and waveform models. This paper is organized as follows. Section II provides an extended motivation for and summary of this investigation. Section III A reviews the history of numerical relativity and introducesnotation to characterize simulatedbinaries and their radiation. Section III B describes the simulations used in this work. Section III C briefly describes the method used to compare simulations to the data; seePE+NR-Methods [11] for further details. Section III D describes the implications of using NR simulations that include only a small number of gravitational-wave cycles. Section III F relates this investiga- tion to prior work. Section IV describes our results on the pre- coalescence parameters. We provide a ranking of simulations asmeasured by a simplemeasure of fit (peakmarginalized log likelihood). When possible, we provide an approximate posterior distribution over all intrinsic parameters. Using both our simple ranking and approximate posterior distributions, wedrawconclusions about the range of source parameters that are consistent with the data. SectionV describes our results on the post-coalescence state. Our statements rely on the final black hole (BH) masses and spins derived from the full NR simulations used. We summarize our results in Sec. VI. In Appendix A, we summarize the simulations used in this work and their accuracy, referring to the original literature for complete details. II. MOTIVATION FOR THIS STUDY This paper presents an alternative analysis of GW150914 and an alternative determination of its intrinsic parameters. The methods used here differ from those in LVC-PE [2] in two important ways. First, the statistical analysis here is performed in a manner different than and independent of the one in LVC-PE [2]. Second, the gravitational waveform models used in LVC-PE [2] are analytic approximations of particular functional forms, with coefficients calibrated to match selected NR simulations; in contrast, here we directly use waveforms from NR simulations. Despite these differences, our conclusions largely corroborate the quan- titative results found in LVC-PE [2]. Our study also addresses key challenges associated with gravitational wave parameter estimation for black hole binaries with total mass M > 50 M⊙. In this mass regime, LIGO is sensitive to the last few dozens of cycles of coalescence, a strongly nonlinear epoch that is the most difficult to approximatewith analytic (or semianalytic) wave- formmodels [6,12–14]. Figure 1 illustrates the dynamics and expected detector response in this regime, for a source like GW150914. For these last dozen cycles, existing analytic waveform models have only incomplete descriptions of precession, lack higher-order spherical-harmonic modes, and do not fully account for strong nonlinearities. Preliminary investigations have shown that inferences about the sourcedrawnusing these existinganalytic approximations can be slightly or significantly biased [9,14–16]. Systematic studies are underway to assess how these approximations influence our best estimates of a candidate binary’s param- eters. At present, we can only summarize the rationale for these investigations, not their results. To provide three concrete examples of omitted physics, first and foremost, even the most sophisticated models for binary black hole coalescence [6] do not yet account for the asymmetries [16] responsible for the largest gravitational-wave recoil kicks -2 -1 0 1 2 -0.2 -0.15 -0.1 -0.05 0 0.05 nwodgniR regreMlaripsnI S tr ai n (1 0-2 1 ) Time (seconds) H1 estimated strain, incident H1 estimated strain, band-passed FIG. 1. Simulated waveform: Predicted strain in H1 for a source with parameters q ¼ 1.22, χ1;z ¼ 0.33, χ2;z ¼ 0.44, simulated in full general relativity; compare to Fig. 2 in LVC-detect [1]. The gray line shows the idealized strain response hðtÞ ¼ FþhþðtÞ þ F×h×ðtÞ, while the solid black line shows the whitened strain response, using the same noise power spectrum as LVC-detect [1]. B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-2 [17,18]. Second, the analytic waveform models adopted in LVC-PE [2] adopted simple spin treatments (e.g., a binary with aligned spins, or a binarywith single effective precessing spin) that cannot capture the full spin dynamics [19–21]. A single precessing spin is often a good approximation, particularly for unequal masses where one spin dominates the angular momentum budget [8,19,22–24]. However, for appropriate comparable-mass sources, two-spin effects are known to be observationally accessible [16,25] and cannot be fully captured by a single spin. Finally, LVC-PE [2] and LVC- TestGR [3] made an additional approximation to connect the inferred properties of the binaryblack holewith the final black hole state [26,27]. The analysis presented in this work does not rely on these approximations: observational data are directly compared against awide range ofNRsimulations and the final blackholeproperties are extracted directly fromthese simulations, without recourse to estimated relationships between the initial and final state. By circumventing these approximations, our analysis can corroborate conclusions about selected physical properties ofGW150914presented in those papers. Despite the apparent simplicity of GW150914, we find that a range of binary black hole masses and spins, including strongly precessing systems with significant misaligned black hole spins [28], are reasonably consistent with the data. The reason the data cannot distinguish between sources with qualitatively different dynamics is a consequence of both the orientation of the source relative to the line of sight and the time scale of GW150914. First, if the line of sight is along or opposite the total angular momentum vector of the source, even the most strongly precessing black hole binary emits a weakly modulated inspiral signal, which lacks unambiguous signatures of precession and is easily mistaken for a nonprecessing binary [25,29]. Second, because GW150914 has a large total mass, very little of the inspiral lies in LIGO’s frequency band, so the signal is short, with few orbital cycles and even fewer precession cycles prior to or during coalescence. The short duration of GW150914 provides few opportunities for the dynamics of two precessing spins to introduce distinctive amplitude and phase modulations into its gravitational wave inspiral signal [19]. Although the orientation of the binary and the short duration of the signal make it difficult to extract spin information from the inspiral, comparable-mass binaries with large spins can have exceptionally rich dynamics with waveform signatures that extend into the late inspiral and the strong-field merger phase [8,30]. By utilizing full NR waveforms instead of the single-spin (precessing) and double-spin (nonprecessing) models applied in LVC-PE [2], the approach described here provides an independent opportunity to extract additional insight from the data, or to independently corroborate the results of LVC-PE [2]. Our study employs a simple method to carry out our Bayesian calculations: for each NR simulation, we evaluate the marginalized likelihood of the simulation parameters given the data. The likelihood is evaluated via an adaptive Monte Carlo integrator. This method provides a quantita- tive ranking of simulations; with judicious interpolation in parameter space, the method also allows us to identify candidate parameters for follow-up numerical relativity simulations. To estimate parameters of GW150914, we can simply select the subset of simulations and masses that have a marginalized likelihood greater than an observatio- nally motivated threshold (i.e., large enough to contribute significantly to the posterior). Even better, with a modest approximation to fill the gaps between NR simulations, we can reproduce and corroborate the results in LVC-PE [2] with a completely independent method. We explicitly construct an approximation to the likelihood that interpo- lates between simulations of precessing binaries, and demonstrate its validity and utility. It is well known that the choice of prior may influence conclusions of Bayesian studies when the data do not strongly constrain the relevant parameters. For example, the results of LVC-PE [2] suggest that GW150914 had low to moderate spins, but this is due partly to the conventional prior used in LVC-PE [2] and earlier studies [5]. This prior is uniform in spin magnitude, and therefore unfavorable to the most dynamically interesting possibilities: comparable- mass binaries with two large, dynamically significant precessing spins [25]. In contrast, by directly considering the (marginalized) likelihood, the results of our study are independent of the choice of prior. For example, we find here that GW150914 is consistent with two large, dynami- cally significant spins. Finally, our efforts to identify even subtle hints of spin precession are motivated by the astrophysical opportunities afforded by spin measurements with GW150914; see LVC- Astro [31]. Using the geometric spin prior adopted in LVC- PE [2], the data from GW150914 are just as consistent with an origin from a nonprecessing or precessing binary, as long as the sum of the components of the spins parallel to the orbital angular momentum L is nearly zero. If the binary black hole formed from isolated stellar evolution, one could reasonably expect all angular momenta to be strictly and positively aligned at coalescence; see LVC- Astro [31] and [32]. Hence, if we believe GW150914 formed from an isolated binary, our data would suggest black holes are born with small spins: a1 ¼ jS1j=m2 1 ≤ 0.22 and a2 ≤ 0.25, where Si andmi are the black hole spins and masses (LVC-PE [2]). If these strictly aligned isolated evolution formation scenarios are true, then a low black hole spin constrains the relevant accretion, angular momen- tum transport, and tidal interaction processes in the progenitor binary; cf. Refs. [32–34]. On the other hand, the data are equally consistent with a strongly precessing black hole binary with large component spins, formed in a densely interacting stellar cluster (LVC-Astro [31]). Measurements of the binary black holes’ transverse spins DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-3 will therefore provide vital clues as to the processes that formed GW150914. In this work we use numerical relativity to check for any evidence for or against spin precession that might otherwise be obscured by model systematics. Like LVC-PE [2], we find results consistent with but with no strong support for precessing spins. III. METHODS A. Numerical relativity simulations of binary black hole coalescence The first attempts to solve the field equations of general relativity numerically began in the 1960s, by Hahn and Lindquist [35], followed with some success by Smarr [36,37]. In the 1990s, a large collaboration of US universities worked together to solve the “Grand Challenge” of evolving binary black holes [38–40]. In 2005, three groups [41–43] developed two completely independent techniques that pro- duced the first collisions of orbiting black holes. The first technique [41] involved the use of generalized harmonic coordinates and the excision of the black hole horizons, while the second technique [42,43], dubbed the “moving punctures approach,” used singularity-avoiding slices of the black hole spacetimes. Since then, the field has seen an explosion of activity and improvements in methods and capabilities; see, e.g., Refs. [44–47]. Multiple approaches have been pursued and validated against one another [10,48,49]. Binaries can now be evolved in wide orbits [50], at high mass ratios up to 100∶1 [51,52], with near-maximal black hole spin [53–55], and for many orbits before coalescence [28,56]. At sufficiently large separations, despite small gauge and frame ambiguities, the orbital and spin dynamics evaluated using numerical relativity agrees with post-Newtonian calculations [6,57–60]. The stringent phase and amplitude needs of gravitational wave detectionandparameter estimationprompted thedevelopment of revised standards for waveform accuracy [61,62]. Several projects have employed numerical relativity-generated wave- forms to assess gravitational-wave detection and parameter estimation strategies [12,63–67]. These results have been used to calibrate models for the leading-order radiation emitted from binary black hole coalescence [6,8,9,13,58,68,69]. Our study builds on this past decade’s experience with modeling the observationally relevant dynamics and radiation from comparable-mass coalescing black holes. In this andmostNRwork, the initial data for a simulationof coalescing binaries are characterized by the properties and initial orbit of its two component black holes: by initial black holes masses m1, m2 and spins S1, S2, specified in a quasicircular orbit such that the (coordinate) orbital angular momentum is alignedwith the ẑ axis and the initial separation is along the x̂ axis. In this work, we characterize these simulations by the dimensionless mass ratio q ¼ m1=m2 (adopting the convention m1 ≥ m2), the dimensionless spin parameters χi ¼ Si=m2 i , and an initial dimensionless orbital frequency Mω0. For each simulation, the orientation-depen- dent gravitational wave strain hðt; r; n̂Þ at large distances can be efficiently characterized by a (spin-weighted) spherical harmonic decomposition of hðt; r; n̂Þ as hðt; r; n̂Þ ¼P l≥2 P l m¼−l hlmðt; rÞ−2Ylmðn̂Þ. To a good first appro- ximation, only a few terms in this series are necessary to characterize the observationally accessible radiation in any direction [15,70–73]. For example, when a binary is widely separated, only two terms dominate this sum: ðl; jmjÞ ¼ ð2; 2Þ. Conversely, however, several terms (modes) are required for even nonprecessing binaries, viewed along a generic line of sight; more are needed to capture the radiation from precessing binaries. For nonprecessing sources with a well-defined axis of symmetry, individual modes ðl; mÞ have distinctive char- acters, and can be easily isolated numerically and compared with analytic predictions. For precessing sources, however, rotation mixes modes with the same l. To apply our procedure self-consistently to both nonprecessing and precessing sources, we include all modes ðl; mÞ with the same l. However, at the start of each simulation, the ðl; mÞ mode oscillates atm times the orbital frequency. Form > 3, scaling our simulations to the inferred mass of the source, this initial mode frequency is often well above 30 Hz, the minimum frequency we adopt in this work for parameter estimation. We therefore cannot safely and self-consistently compare all modes with l > 3 to the data using numerical relativity alone: an approximation would be required to go to higher order (i.e., hybridizing each NR simulation with an analytic approximation at early times). Therefore, in this paper, we use all five of the l ¼ 2 modes to draw conclusions about GW150914, which is necessary and sufficient to capture the leading-order influence of any orbital precession. To incorporate the effect of higher harmonics, we repeat our calculations, using all of the l ≤ 3modes. We defer a careful treatment of higher-order modes and the m ¼ 0 modes to PE+NR- Methods [11] and subsequent work. B. Numerical relativity simulations used Our study makes use of 1139 distinct simulations of binary black hole quasicircular inspiral and coalescence. Table II summarizes the salient features of this set, namely, the mass ratio and initial spins for the simulations used here, all initially in a quasicircular orbit with orbital separation along the x̂ axis and velocities along �ŷ. The RIT group provided 394 simulations. The simulations include binaries with a wide range of mass ratios, as well as a wide range of black hole angular momentum (spin) magni- tudes and directions [18,28,74–76], including a simulation with large transverse spins and several spin precession cycles which fits GW150914 well [28], as described below. The SXS group has provided both a publicly available catalog of coalescing black hole binary mergers [59], a new catalog of nonprecessing simulations [77], and selected supplementary B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-4 simulations described below. Currently extended to 310 members in the form used here, this catalog includes many high-precision zero- and aligned-spin sources, selected precessing systems, and simulations including extremely high black hole spin. The Georgia Tech group (GT) provided 406 simulations; see Refs. [16] and [78] for further details. This extensive archive covers a wide range of spin magni- tudes and orientations, including several systematic one- and two-parameter families. The Cardiff-UIB group provided 29 simulations, all specifically produced to follow up GW150914 via a high-dimensional grid stencil, performed via the bifunctional adaptive mesh (BAM) code [79,80]. These four sets of simulations explore the model space near the event in a well-controlled fashion. In addition to previously reported simulations, several groups performed new simulations (108 in total) designed to reproduce the parameters of the event, some of which were applied to our analysis. These simulations are denoted in Table II and our other reports by an asterisk (�). These follow-up simulations include three independent simulations of the same param- eters drawn from the distributions in LVC-PE [2], from RIT, SXS, and GT, allowing us to assess our systematic error. These simulations were reported in LVC-detect [1] and LVC-Burst [4], and are indicated by (þ) in our tables. The simulations used here have either been published previously, or were produced using one of these well-tested procedures to evolve binary black holes, operating in familiar circumstances. For reference, in Appendix A, we outline the four groups’ previously established methods and results. For this application, we trust these simulations’ accuracy, based on their past track record of good perfor- mance and internal validation studies. By incorporating simulations of identical physics provided by different groups and at different resolutions, our methods provide a direct corroboration: simulations with similar physics produce similar results; see the discussion in Sec. IVD. C. Directly comparing NR with data For each simulation, each choice of seven extrinsic parameters θ (four spacetime coordinates for the coales- cence event; three Euler angles for the binary’s orientation relative to the Earth), and each choice for the redshifted total binary mass Mz ¼ ð1þ zÞM, we can predict the response hk of both of the k ¼ 1, 2 LIGO instruments to the implied gravitational wave signal. Using λ to denote the combination of redshifted mass Mz and the numerical relativity simulation parameters needed to uniquely specify the binary’s dynamics, we can therefore evaluate the likelihood of the data given the noise: lnLðλ; θÞ ¼ − 1 2 X k hhkðλ; θÞ − dkjhkðλ; θÞ − dkik − hdkjdkik; ð1Þ where hk is the predicted response of the kth detector due to a source with parameters λ, θ, dk are the detector data in instrument k, and hajbik ≡ R ∞ −∞ 2df ~aðfÞ� ~bðfÞ=Sh;kðjfjÞ is an inner product implied by the kth detector’s noise power spectrum Sh;kðfÞ; see, e.g., Ref. [81] for more details. In practice, as discussed in the next section, we adopt a low-frequency cutoff flow, so all inner products are modified to hajbik ≡ 2 Z jfj>flow df ~aðfÞ� ~bðfÞ Sh;kðjfjÞ : ð2Þ Except for an overall normalization constant and a different choice for the low-frequency cutoff, our expression agrees with Eq. (1) in LVC-PE [2]. The joint posterior probability of λ, θ follows from Bayes’ theorem: ppostðλ; θÞ ¼ Lðλ; θÞpðθÞpðλÞR dλdθLðλ; θÞpðλÞpðθÞ ; ð3Þ where pðθÞ and pðλÞ are priors on the (independent) variables θ, λ.1 For each λ—that is, for each simulation and each redshifted mass Mz—we evaluate the margin- alized likelihood LmargðλÞ≡ Z Lðλ; θÞpðθÞdθ ð4Þ via direct Monte Carlo integration, where pðθÞ is uniform in 4-volume and source orientation [81].2 The marginalized likelihood measures the similarity between the data and a source with parameters λ and enters naturally into full Bayesian posterior calculations. In terms of the margin- alized likelihood and some assumed prior pðλÞ on intrinsic parameters like masses and spins, the posterior distribution for intrinsic parameters is ppostðλÞ ¼ LmargðλÞpðλÞR dλLmargðλÞpðλÞ : ð5Þ If we can evaluate Lmarg on a sufficiently dense grid of intrinsic parameters, Eq. (5) implies that we can reconstruct the full posterior parameter distribution via interpolation or other local approximations. This reconstruction needs to accurately reproduce Lmarg only near its peak value; for example, if LmargðλÞ can be approximated by a d- dimensional Gaussian, then we anticipate that only configurations λ with lnLmax=LmargðλÞ > χ2d;ϵ=2 ð6Þ 1For simplicity, we assume all BH-BH binaries are equally likely anywhere in the Universe, at any orientation relative to the detector. Future direct observations may favor a correlated distribution, including BH formation at higher masses at large redshift. 2Our choice for pðθÞ differs only superficially from that adopted in LVC-PE [2], by adopting a narrower prior on the geocentric event time. Here, we allow �0.05 s around the time reported by the online analysis; LVC-PE [2] allowed �0.1 s. DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-5 contribute to the posterior distribution at the 1 − ϵ credible interval, where χ2d;ϵ is the inverse-χ2 distribution. Based on the similarity of our distribution to a suitably parametrized multidimensional Gaussian, we anticipate that only the region of parameter space with lnLmax − lnLmargðλÞ≲ 6.7 can potentially impact our conclusions regarding the 90% credible level for d ¼ 8 (i.e., two masses and two precessing spins); for d ¼ 4, more relevant to the most strongly accessible parameters (i.e., two masses and two aligned spins), the corresponding interval is lnLmax − lnLmargðλÞ ≲ 4. Each NR simulation corresponds to a particular value of seven of the intrinsic parameters (themass ratio and the three components of each spin vector) but can be scaled to an arbitrary value of the total redshifted mass Mz. Therefore each NR simulation represents a one-parameter family of points in the eight-dimensional parameter space of all possible values of λ. For each simulation, we evaluate the marginalized log likelihood versus redshifted mass lnLmargðMzÞ on an array of masses, adaptively exploring each one-parameter family to cover the interval lnLmax − lnLmargðλÞ < 10. To avoid systematic bias intro- duced by interpolation or fitting, our principal results are simply these tabulated function values, explored almost continuously in mass Mz and discretely, as our fixed simulation archive permits, in other parameters. The set of intrinsic parameters VC ≡ fλ∶ lnLmarg > Cg above a cutoff C identifies a subset of binary configurations whose gravitational wave emission is consistent with the data.3 Though this approach provides a powerfully model- independent approach to gravitational-wave parameter esti- mation, as described above it is restricted to the discrete grid of NR simulation values. Fortunately, the brevity and simplicity of the signal—only a few chirping and little- modulated cycles—requires the posterior distribution to be broad and smooth, extending overmany numerical relativity simulations’ parameters. This allows us to go beyond comparisons on a discrete grid of NR simulations, and instead interpolate between simulations to reconstruct the entire distribution. To establish a sense of scale, we can use a simple order- of-magnitude calculation for lnLmarg. The signal-to-noise ratio ρ and peak likelihood of any assumed signal are related: ρ ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2maxθ lnL p . Even at the best intrinsic parameters λ, the marginalized log-likelihood lnLmarg will be well below the peak value maxθ lnL, because only a small fraction of extrinsic parameters θ have support from the data [83]. Using GW150914’s previously reported signal amplitude [ρ ¼ 23.5–26.8], its extrinsic parameters and their uncertainty (LVC-PE [2]), and our prior pðθÞ, we expect the peak value of lnLmarg to be of order 240–330. The interval of lnLmarg selected by Eq. (6) is a small fraction of the full range of lnLmarg, identifying a narrow range of parameters λ which are consistent with the data. Our analysis of this event, as well as synthetic data, suggests that lnLmarg is often well approximated by simple low-order series in intrinsic parameters λ. This simple behavior is most apparent versus the total mass Mz. Figure 2 shows examples of the marginalized log likelihood evaluated using two of our most promising simulation candidates: they are well approximated by a quadratic over the entire observationally interesting range. We approxi- mate lnLmargðMzÞ as a second-order Taylor series, lnLmargðMzÞ≃ lnL − 1 2 ΓMMðMz −Mz;�Þ2; ð7Þ where the constants lnL, Mz;�, and ΓMM represent the largest value of lnLmarg, the redshifted mass at which this maximum occurs, and the second derivative at the peak value. Even in (rare) cases when a locally quadratic approximation slightly breaks down, we still use lnL to FIG. 2. Likelihood versus mass: Examples: Raw Monte Carlo estimates for lnLmargðMzÞ versus Mz for two nonprecessing binaries: SXS:BBH:305 (blue) and d0_D10.52_q1 .3333_a-0.25_n100 (red). To guide the eye, for each simulation we also overplot a local quadratic fit to the results near each peak. Results were evaluated with fmin ¼ 30 Hz; compare to Table III. Error bars reflect the standard Monte Carlo estimate of the integral standard deviation, multiplied by 2.57 in the log to increase contrast (i.e., the nominal 99% credible interval, assuming the relative Monte Carlo errors are normally distributed). To guide the eye, a shaded region indicates the interval of lnLmarg selected by our ansatz given a credible interval 90% and a peak value of lnLmarg of 273; see Sec. III and Eq. (6). 3While this approach works for multidimensional Gaussians, it can break down in coordinate systems where the prior is particularly significant (e.g., diverges; see the grid-based method in Ref. [82]) or where the likelihood has strong features (e.g., corners and tails) in multiple dimensions. For example, a like- lihood constant on a sphere plus thin, long spines (e.g., the shape of a sea urchin) will have little posterior support on the spines, but each of the spines would be selected by a likelihood cut of the kind used here. As our calculations below demonstrate, margin- alization over extrinsic parameters eliminates most complexity in the likelihood: our function is smooth, dominated by a handful of parameters, without corners or narrow tails. B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-6 denote our estimate of the peak of lnLmargðMzÞ.4 As a means of efficiently communicating trends in the quality of fit versus intrinsic parameters, the two quantities lnL and Mz;� are reported in Table III. Motivated by the success of this approximation, in Sec. IV Bwealso supply a quadratic approximation to lnLmarg near its peak, under the restrictive approximation that all angular momenta are parallel, using information from only non- precessing simulations. Using this quadratic approximation, we can numerically estimate lnLmarg and hence the posterior [Eq. (5)] for arbitrary aligned-spin binaries. For any coordinate transformation z ¼ ZðλÞ, we can use suitable supplementary coordinates and direct numerical quadrature to determine the marginal posterior density ppostðzÞ ¼ R ppostðλÞδðz − ZðλÞÞ. As shown below, this procedure yields results comparable to LVC-PE [2] for nonprecessing binaries. D. Are there sufficiently many and long NR simulations? Because of finite computational resources, NR simula- tions of binary black holes cannot include an arbitrary number of orbits before merger. Instead, they start at some finite initial orbital frequency. While many NR simulations follow enough binary orbits to be compared with GW150914 over the entire LIGO frequency band, some NR simulations miss some early time information. Therefore, in this section we describe a simple approxi- mation (a low-frequency cutoff) we apply to ensure that simulations with similar physics (but different initial orbital frequencies) lead to similar results. At the time of GW150914, the instruments had relatively poor sensitivity to frequencies below 30 Hz and almost no sensitivity below 20 Hz. For this reason, the interpretations adopted in LVC-PE [2] adopted a low-frequency cutoff of 20 Hz. Because of the large number of cycles accumulated at low frequencies, a straightforward Fisher matrix estimate [84,85] suggests these low frequencies (20–30 Hz) provide a nontrivial amount of information, particularly about the binary’s total mass. Equivalently, using the techniques described in this paper, the function lnLmargðMzÞ will have a slightly higher and narrower peak when including all frequencies than when truncating the signal to only include frequencies above 30 Hz; see PE+NR-Methods [11]. Because of limited computational resources, relatively few simulations start in a sufficiently wide orbit such that, for Mz ¼ 70 M⊙, their radiation in the most significant harmonics of the orbital frequency will be at or below the lowest frequency (20 Hz) adopted in LVC-PE [2]. If fmin is the low-frequency cutoff, Mω0=m≲ 0.02ðMz=70 M⊙Þ× ðfmin=20 HzÞð2=mÞ, where Mω0 is the initial orbital frequency of the simulation reported in Table II, can be safely used to analyze a signal containing a significant contribution from themth harmonic of the orbital frequency. Figure 3 shows examples of the strain in LIGO-Hanford, predicted using simulations of different intrinsic duration, superimposed with lines approximately corresponding to different gravitational wave frequencies. To facilitate an apples-to-apples comparison incorporating the widest range of available simulations, in this workwe principally report on comparisons calculated byadopting a low-frequency cutoff of 30 Hz; see, e.g., Table III. (We also briefly report on comparisons performed using a low-frequency cutoff of 10 Hz.) As we describe in subsequent sections, while this choice of 30 Hz slightly degrades our ability to make subtle distinctions between different precessing configurations, it does not dramatically impair our ability to reconstruct parameters of the event, given other significant degeneracies. Even this generous low-frequency cutoff is not perfectly safe: for each simulation, a minimum mass exists at which the starting gravitational wave frequency is 30 Hz or larger. In the plots and numerical results reported here, we have eliminated simulation and mass choices that correspond to scaling an NR simulation to a starting frequency above 30 Hz. The inclusion or suppression of these configurations does not significantly change our principal results. FIG. 3. Best-fit detector response: A plot of the detector response (strain) hðtÞ ¼ FþhþðtÞ þ F×h×ðtÞ evaluated at the LIGO-Hanford detector, similar to Fig. 2 in LVC-detect [1], Fig. 6 in LVC-PE [2], and Fig. 2 in LVC-TestGR [3], evaluated using two of the best-fitting numerical relativity simulations and total redshifted masses reported in Table III. The redshifted masses Mz and extrinsic parameters θ necessary to evaluate the detector response have been identified using the methods used in this work andPE+NR-Methods [11], using all l ¼ 2modes, a low-frequency cutoff of 30 Hz, and omitting the impact of calibration uncertainty. For comparison, the shaded region shows the 95% credible region for the waveform reported in LVC-PE [2], an analysis which accounts for calibration uncertainties and includes frequencies down to 20 Hz but approximates the radiation and omits higher harmonics [e.g, the ð2;�1Þmodes]. To guide the eye, two vertical lines indicate the approximate time at which the signal crosses these two gravitational wave frequency thresholds. 4We find similar results using more sophisticated nonpara- metric interpolation schemes. The results reported in Table III use one-dimensional Gaussian process interpolation to determine the peak value. DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-7 This paper uses enough NR simulations to adequately sample the four-dimensional space of nonprecessing spins, particularly for comparable masses. As described below, this high simulation density ensures we can reliably approximate the marginalized likelihood lnLmarg for non- precessing systems. On the other hand, the eight-dimen- sional parameter space of precessing binaries is much more sparsely explored by the simulations available to us. But because the reconstructed gravitational wave signals in LVC-detect [1] and LVC-PE [2] exhibit little to no modulation, we expect that the remaining four parameters must have at best a subtle effect on the signal: the likelihood and posterior distribution should depend only weakly on any additional subdominant parameters. Having identified dominant trends using nonprecessing simulations, we can use controlled sequences of simulations with similar parameters to determine the residual impact of transverse spins. Even if the marginalized likelihood cannot be safely approximated in general, a simulation’s value of lnL provides insight into the parameters of the event. Motivated by the parameters reported in LVC-PE [2] and our results in Table III, several follow-up simulations were performed to reproduce GW150914. These simulations are responsible for most of the best-fitting aligned-spin results reported in Table III. E. Impact of instrumental uncertainty For simplicity, our analysis does not automatically account for instrumental uncertainty (i.e., in the detector noise power spectrum or instrument calibration), as do the methods in LVC-PE [2]. LVC-PE [2] suggests that, for the intrinsic parameters λ of interest here, the impact of these systematic instrumental uncertainty effects are relatively small. We have repeated our analysis using two versions of the instrumental calibration; we find no significant change in our results. F. Comparison with other methods LVC-Burst [4] reported on direct comparisons between radiation extracted fromNR simulations and nonparametri- cally reconstructed estimates of the gravitational wave signal; see, e.g., their Fig. 12. Their comparisons quickly identifiedmasses,mass ratios, and spins thatwere consistent with the data. Our study, which attempts a fully Bayesian direct comparison between the data and the multimodal predictions of NR, produces results consistent with those of LVC-Burst [4]; see, e.g., Fig. 4 described below. FIG. 4. Mass, mass ratio, and effective spin are constrained and correlated: Colors represent the marginalized log likelihood as a function of redshifted total mass Mz, mass ratio q and effective spin parameter χeff. Each point represents an NR simulation and a particular Mz. Points with 265.8 < lnLmarg < 268.6 are shown in light gray, those with lnLmarg > 268.6 are shown in black, and those with lnLmarg < 265.8 are shown according to the color scale on the right (points with lnLmarg < 172 have been suppressed to increase contrast). Marginalized likelihoods are computed using flow ¼ 30 Hz, using all l ¼ 2 modes, and without correcting for (small) Monte Carlo integral uncertainties. These figures include both nonprecessing and precessing simulations. For comparison, the black, blue, and green contours show estimated 90% credible intervals, calculated assuming that the binary’s spins and orbital angular momentum are parallel. The solid black contour corresponds to the 90% credible interval reported in LVC-PE [2], assuming spin-orbit alignment; the solid blue contour shows the corresponding 90% interval reported using the semianalytic precessing model (IMRP) in LVC-PE [2]; the solid green curve shows the 90% credible intervals derived using a quadratic fit to lnLmarg for nonprecessing simulations using l ¼ 2modes; and the dashed green curve shows the 90% credible intervals derived using lnLmarg from nonprecessing simulations, calculated using all modes with l ≤ 3; see Sec. IV B for details. Unlike our calculations, the black and blue contours from LVC-PE [2] account for calibration uncertainty and use a low-frequency cutoff of 20 Hz. Left panel: Comparison forMz, χeff . This figure demonstrates the strong correlation between the total redshifted mass and spin. Right panel: Comparison for q, χeff . This figure is consistent with the similar but simpler analysis reported in LVC-Burst [4]; see, e.g., their Fig. 12. B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-8 LVC-PE [2] performed Bayesian inference on the data using semianalytic models for the gravitational waves from a coalescing compact binary. We directly compare our posterior distribution with that of LVC-PE [2] for the special case of aligned spins. Differences between these posterior distributions can be due to many factors: our choice of starting frequency is slightly higher (30 versus 20 Hz), our approach does not account for calibration uncertainty, and of course we employ NR instead of a semianalytic waveform model. To isolate the effects of NR, we have repeated our analysis but with the same non- precessing waveform model used in LVC-PE [2] rather than with NR waveforms. Using the same input waveforms, our method and that of LVC-PE [2] produce very similar results; see PE+NR-Methods [11] for details. To isolate the effects of the low-frequency cutoff, we performed the nonprecessing analysis reported in LVC-PE [2] with a low-frequency cutoff of 30 Hz; we found results similar to LVC-PE [2]. IV. RESULTS I: PRE-COALESCENCE PARAMETERS We present two types of results. For generic, precessing NR simulations, we evaluate the marginalized likelihood of source parameters given the data, but because the param- eter-space coverage of NR simulations is so sparse, we do not attempt to construct an interpolant for the likelihood as a function of source parameters. For nonprecessing sources, we construct such an interpolant, and we compare with the results of LVC-PE [2]. Using the computed likelihoods, we quantify whether the data are consistent with or favor a precessing source. A. Results for generic sources, without interpolation Because the available generic NR simulations represent only a sparse sampling of the parameter space, for generic sources we adopt a conservative approach: we rely only on our estimates of the marginalized likelihood lnLmarg, and FIG. 5. Distributions agree (nonprecessing case): Comparison between the posterior distributions reported in LVC-PE [2] for nonprecessing binaries (solid) and the posterior distributions implied by a leading-order approximation to lnLmarg [Eq. (9)] derived using l ≤ 2 (dotted) and l ≤ 3 (dashed). Left panel: m1;z (black) and m2;z (red). Center panel: Mass ratio 1=q ¼ m2;z=m1;z. The data increasingly favor comparable-mass binaries as higher-order harmonics are included in the analysis. Right panel: Aligned effective spin χeff . The noticeable differences between our χeff distributions and the solid curve are also apparent in Figs. 7 and 4: our analysis favors a slightly higher effective spin. FIG. 6. Likelihood versus spins: Nonprecessing: Maximum likelihood lnL (colors, according to the color bar on the right) as a function of spins χ1;z, χ2;z for different choices of mass ratio 1=q, computed using all l ¼ 2 modes. Each point represents a nonprecessing NR simulation from Table III. To increase contrast, simulations with lnL < 171 have been suppressed. Only simulations with fstart < 30 Hz are included. Dashed lines and labels indicate contours of constant χeff . The left two panels show that for mass ratio q≃ 1, the marginalized likelihood is approximately constant on lines of constant χeff . For more asymmetric binaries (q ¼ 2), the marginalized likelihood is no longer constant on lines of constant χeff . Along lines of constant χeff and q, lnL decreases versus χ2;z DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-9 we do not interpolate the likelihood between intrinsic parameters, nor do we account for Monte Carlo uncertainty in each numerical estimate of lnLmarg. Using the inverse χ2 distribution, we identify two thresholds in lnLmarg using Eq. (6): one (our preferred choice) obtained by adopting d ¼ 4 observationally accessible parameters, and the other adopting d ¼ 8.5 Both thresholds on lnLmarg are derived using (a) our target credible interval (90%) and (b) the peak log likelihood attained over all simulations (Table III). Below, we find that the peak log likelihood over all simulations is lnLmarg ¼ 272.5; as a result, these two thresholds are lnLmarg ¼ 268.6 and lnLmarg ¼ 265.8, for d ¼ 4 and d ¼ 8, respectively. The configurations of masses and intrinsic parameters that pass either of these two thresholds are deemed consistent with the data. In subsequent figures, we will color these two classes of configurations in black (those configurations with lnLmarg > 268.6) and gray (those configurations with lnLmarg > 265.8). We use this set of points in parameter space to bound (below) the range of parameters consistent with the data. For the progenitor black hole parameters, our results using l ¼ 2 modes are summarized in Figs. 4 and 8 (for generic sources), as well as by Figs. 6 and 7 (for non- precessing sources). For comparison, these figures also include the results obtained in LVC-PE [2], using approx- imations appropriate for nonprecessing (black curves) and simply [19] precessing (blue curves) binaries. The first column of Table I shows the one-dimensional range inferred for each parameter by our threshold-based method, using l ¼ 2 modes only. Before describing our results, we first demonstrate why our strategy is effective: Figs. 4, 6, 7, and 8 show that the likelihood is smooth and slowly varying, dominated by a few key parameters. As seen in the right panel of Fig. 4, even our large NR array is relatively sparse. However, as the color scale on this and other figures indicate, the marginalized likelihood varies smoothly with parameters, over a range of more than e100. The simplicity of lnLmarg is most apparent using controlled one- and two-parameter subspaces; for example, Fig. 6 shows that lnL [i.e., the peak of lnLmargðMzÞ] varies smoothly as a function of χ1;z, χ2;z for nonprecessing binaries of different mass ratios q ¼ m1=m2. Targeted NR simulations have corroborated the simple dependence of the likelihood seen here. Despite employing simulations with two strongly precessing spins and including higher harmonics, two factors which have been previously shown to be able to break degeneracies [25,86–90], Table III reveals that simulations with the same values of q and χeff almost always have similar values of lnL. In other words, these two simple parameters explain most of the variation in L, even when L changes by up to a factor of e100. Finally and critically, simulations with similar physics produce very similar results. By adopting flow ¼ 30 Hz and thereby largely standardizing simulation duration, we find similar values of lnLwhen comparing the data to simulations performed by different groups with similar (or even identical) parameters. Our results and that of LVC-PE [2] constrain the progenitor binary’s redshifted mass, mass ratio, and aligned effective spin χeff ; see Table I. The effective spin is defined as [91,92] χeff ¼ ðS1=m1 þ S2=m2Þ · L̂=M: ð8Þ For example, the color scale in Fig. 4 provides a graphical representation of lnL versus χeff ; large values of jχeff j (only possible for spin-aligned systems) are inconsistent with the FIG. 7. Aligned spin components not well constrained (aligned only shown): Colors represent the marginalized log likelihood as a function of the aligned spin components χ1;z and χ2;z. Each point represents an NR simulation; only nonprecessing simu- lations are included. Points with 265.8 < lnL < 268.6 are shown in light gray, those with lnL > 268.6 are shown in black, and those with lnL < 265.8 are shown according to the color scale on the right (points with lnLmarg < 172 have been suppressed to increase contrast). [The quantity lnL is the maximum value of lnLmarg with respect to mass; see Eq. (7).] Consistent with our other results, flow ¼ 30 Hz. For comparison, the solid black contours show the 90% credible intervals derived in LVC-PE [2], assuming spin-orbit alignment and omitting corrections for waveform systematics. The solid and dashed green contours are the nominal 90% credible interval derived using an approxi- mation to our data for lnLmarg, assuming both spins are exactly parallel to the orbital angular momentum, for l ¼ 2 (solid) and l ¼ 3 (dashed), respectively; see Sec. IV B for more details. 5The second choice (d ¼ 8) would be appropriate if the posterior was well approximated by an eight-dimensional Gaus- sian. The first choice (d ¼ 4) is motivated by past parameter estimation studies when the posterior distribution principally constrains the component masses and aligned spins. B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-10 data. The agreement between our results and LVC-PE [2] persists despite using a much larger simulation set than those used to calibrate the models used in LVC-PE [2], and despite employing simulations with black hole spins that are both precessing and with magnitude significantly out- side the range χ < 0.5 − 0.8 for which these models were calibrated [12,27,93]. The three parameters Mz, q, and χeff are well known to have a strong and tightly correlated impact on the gravi- tational wave signal and hence on implied posterior distributions [83–85,94,95]. Since general relativity is scale free, the total redshifted binary mass Mz sets the character- istic physical time scale of the coalescence. Due to strong spin-orbit coupling, aligned spins (χeff > 0) extend the temporal duration of the inspiral [19] and coalescence of the two black holes (e.g., the hangup effect [96]); aligned spins also increase the final black hole spin and hence extend the duration of the post-merger quasinormal ring- down [97]. More extreme mass ratios extend the duration of the pre-merger phase while dramatically diminishing the amplitude and frequency of post-merger oscillations [68,69,98,99]. As noted above, the data tightly constrain one of these combinations (e.g., the total redshifted mass at fixed simulation parameters). Hence, our ability to con- strain any individual parameter Mz, q, or χeff is limited not by the accuracy to which Mz is determined for each simulation (i.e., the width 1= ffiffiffiffiffiffiffiffiffiffi ΓMM p ), but rather by differences between simulations (i.e., trends in lnL versus χeff , q) which break the degeneracy between these tightly correlated parameters. Simulations with a variety of physics fit the data, including strongly precessing systems. In Table III, several simulations with large transverse spins but nearly zero net aligned spin fit the data almost as well as the best-fitting nonprecessing simulations (e.g., SXS:BBH:3; RIT simu- lation D10_q0.75_a-0.8_xi0_n100). As described below, Table V shows that these and other long precessing simulations fit even better when more low-frequency content is included. The correspondence between our results and those presented in LVC-PE [2] merits further reflection: by construction our fiducial analysis (Table III) omitted non- trivial early time information (i.e., f < 30 Hz) which, for each simulation, more tightly constrains the range of masses that could be consistent with the data. In fact, as we show below, strong degeneracies in the gravitational FIG. 8. Large spins possible: Colors represent the marginalized log likelihood as a function of ð−1Þiþ1jχi × L̂j and χi · L̂, where i ¼ 1, 2 index the first and second black hole, evaluated using each simulation’s initial conditions (Table II); compare to the left panel of Fig. 5 in LVC-PE [2]. Each simulation therefore appears twice in this figure: once on the left and once on the right. Spin magnitudes and directions refer to the initial configuration of each NR simulation, not to properties at a fixed reference frequency as in LVC-PE [2]. Points with 265.8 < lnL < 268.6 [cf. Eq. (6)] are shown in light gray, those with lnLmarg > 268.6 are shown in black, and those with lnLmarg < 265.8 are shown according to the color scale on the right. [The quantity lnL is the maximum value of lnLmarg with respect to mass; see Eq. (7).] While this figure was evaluated using l ¼ 2 modes only, the corresponding figure for l ≤ 3 modes is effectively indistinguishable. This diagram demonstrates that both black holes could have large dimensionless spin χ. The solid black circle represents the Kerr limit jχj ¼ 1; to guide the eye, the dashed circles show jχ1;2j ¼ 0.2, 0.4, 0.6, 0.8. For comparison, the blue contour shows the corresponding 90% credible interval reported in LVC-PE [2], using spin configurations at 20 Hz. The structure in this contour (e.g., the absence of support near the axis) should not be over-interpreted: similar structure arises when reconstructing the parameters of synthetic nonprecessing sources. Left panel: All simulations are included. Right panel: To increase contrast only simulations with q < 2 and χeff ∈ ½−0.5; 0.2� are shown; these limits are chosen to be consistent with the two-dimensional posterior in q, χeff shown in the right panel of Fig. 4. DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-11 wave signal between mass, mass ratio, and spin imply that our ability to break these degeneracies dominates our reconstruction of source parameters. Omitting information from low frequencies marginally reduces our ability to identify the range of masses that are consistent with the data for one simulation; however, this omission does not impair our ability to draw conclusions overall, after accounting for uncertain spins and mass ratio. Directly comparable to Fig. 12 in LVC-Burst [4], the right panel of Fig. 4 provides a visual representation of one key correlation between q and χeff : only a narrow range of mass ratios and aligned effective spin χeff are consistent with the data. This range includes both nonprecessing and precessing simulations. Most other parameters have a subdominant effect. For example, restricting attention to nonprecessing binaries for clarity, the data do not strongly discriminate between systems with similar χeff but different χ1;z, χ2;z; see, e.g., Fig. 6. B. Results for nonprecessing sources, including interpolation Both LVC-PE [2] and our highly ranked simulations in Table III demonstrate that binary black holes with non- precessing spins can reproduce GW150914. Only four parameters characterize a nonprecessing binary: the two component masses m1, m2 and the components of each BH’s dimensionless spin χi projected perpendicular to the orbital plane (χ1;z, χ2;z). Nonprecessing binary black hole coalescences have been extensively simulated [46]; see, e.g., Table II. Several models have been developed to reproduce the leading-order gravitational wave emission [the ðl; jmjÞ ¼ ð2; 2Þ modes] [6,8,13,58,68,69]; one, the SEOBNRv2 model [93], was adopted as a fiducial reference by LVC-PE [2]. While this model has not been calibrated to NR for large values of χeff and q [12], it has been shown to accurately reproduce the (2,2) mode from binaries with comparable mass and low spins [9,12,62]. Because of degeneracies, data from GW150914 do not easily distinguish between different points in parameter space that have the same values ofMz, q, χeff ; in particular, it is difficult to individually measure χ1;z and χ2;z when q≃ 1, χeff ≃ 0 and χ1;z ≃ −χ2;z (see, e.g., Ref. [95]). Because GW150914 has comparable masses and is ori- ented face-off with respect to the line of sight, even including higher-order modes in the gravitational wave- form (which we do in our approach here but is not done for the analytic waveform models) does not strongly break these degeneracies and allows us to distinguish individ- ual spins. By stitching together our fits for lnLmargðMzÞ and reconstructing the relevant parts of the likelihood for all masses and aligned spins, we can estimate the full posterior distribution for Mz, q, χ1;z, χ2;z using Eq. (5). Due to inevitable systematic modeling errors in the fit, as described below, this approximation may not have the statistical purity of the method presented in LVC-PE [2]: any credible intervals or deductions drawn from it should be interpreted with judicious skepticism. On the other hand, this method enables the reader to recalculate the posterior distribution using any prior pðλÞ, including astrophysically motivated choices. Fitting to nonprecessing simulations, we find lnLmarg for lnLmarg > 262 is reason- ably well approximated by a quadratic function of the intrinsic parameters Mz ¼ ðm1;zm2;zÞ3=5=ðm1;z þm2;zÞ1=5, TABLE I. Constraints on Mz, q, χeff : Constraints on selected parameters of GW150914 derived by directly comparing the data to numerical relativity simulations. The first column reports the extreme values of each parameter consistent with lnLmarg > 268.6 [Eq. (6), with d ¼ 4], corresponding to the black points shown in Figs. 4, 7, and 10. These are computed using all the l ¼ 2modes of the NR waveforms. Because these extreme values are evaluated only on a sparse discrete grid of NR simulations, this procedure can underestimate the extent of the allowed range of each parameter. The second column reports the 90% credible interval derived by fitting lnLmarg versus these parameters for nonprecessing binaries, to enable interpolation between points on the discrete grid in λ; see Sec. IV B for details. The third column is the union of the two intervals. For comparison, the fourth column provides the interval reported in LVC-PE [2], including precession and systematics. The remaining three columns show our results derived using all l ≤ 3 modes. � � � NR grid Aligned fit Overall Previously NR grid (l ≤ 3) Aligned fit (l ≤ 3) Overall (l ≤ 3) Detector-frame initial total mass MzðM⊙Þ 65.6–77.7 67.2–77.2 65–77.7 66–75 67.1–76.8 67.2.3–77.3 67.1–77.3 Detector-frame m1;zðM⊙Þ 35–45 35–45 35–45 35–45 34.5–43.9 35–45 34.5–45 Detector-frame m2;zðM⊙Þ 27–36 27–36.7 27–36.7 27–36 30–37.5 28–37 28–37.5 Mass ratio 1=q 0.66–1 0.62–1 0.62–1 0.62–0.98 0.67–1 0.69–1 0.67–1 Effective spin χeff −0.3–0.2 −0.2–0.1 −0.3–0.2 −0.24–0.09 −0.24–0.1 −0.2–0.1 −0.24–0.1 Spin 1 a1 0–0.8 0.03–0.80 0–0.8 0.0–0.8 0–0.8 0.03–0.83 0–0.83 Spin 2 a2 0–0.8 0.07–0.91 0–0.91 0.0–0.9 0–0.8 0.11–0.92 0–0.92 Final total mass Mf;zðM⊙Þ 64.0–73.5 � � � 64.0–73.5 63–71 64.2–72.9 64.2–72.9 Final spin af 0.62–0.73 0.62–0.73 0.60–0.72 0.62–0.73 0.62–0.73 B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-12 η ¼ ðm1;zm2;zÞ=ðm1;z þm2;zÞ2, δ ¼ ðm1;z −m2;zÞ=Mz, χeff , and χ− ≡ ðm1;zχ1 −m2;zχ2Þ · L̂=Mz: lnLmarg ≃ 268.4 − 1 2 ðλ − λ�ÞaΓabðλ − λ�Þb − Γχ−δχ−δ ð9aÞ where the indices a, b run over the variablesMz, η, χeff , χ−. In this expression, λa represents the vector (Mz, η, χeff , χ−), λ�a corresponds to the vector (Mz ¼ 31.76 M⊙, η ¼ 0.255, χeff ¼ −0.037, χ− ¼ 0) of parameters which maximize lnLmarg, and Γ is a matrix (indexed by Mz, η, χeff , χ−, δ) with numerical values Γ ¼ 2 6666664 3.75 −224.2 −52.0 0 0 −224.2 22697.2 2692 0 0 −52.0 2692: 846.9 0 0 0 0 0 2.57 −16.3 0 0 0 −16.3 0 3 7777775 : ð9bÞ Here we retain many significant digits to account for structure inΓ, which is nearly singular. Equation (9) respects exchange symmetry m1;z; χ1 ↔ m2;z; χ2. Our results do not sensitively depend on the value of Γχ−;χ− , indicating that this quantity is not strongly constrained by the data. Conversely, the posteriors do depend on Γχ−;δ. As the contrast between the first term in Eq. (9) and the data Table III makes immediately apparent, this coarse approximation can differ from the simulated results by of order 1.7 in the log (rms residual). This reflects the combined impact of Monte Carlo error, systematic error caused by too few orbits in some simulations, and systematic errors caused by sparse place- ment of NR simulations and nonquadratic behavior of lnLmarg with respect to parameters. Repeating our calcu- lation while including all the l ≤ 3 modes (Table IV), we find the same functional form as Eq. (9), but with a different vector λð3Þ�;a¼ðMz¼38.1M⊙;η≃0.32;χeff¼0.11;χ−¼0Þ, and a different matrix Γð3Þ ¼ 2 6666664 3.746 −235.5 −51.5 0 0 −235.5 17970 2941 0 0 −51.5 2941 833.2 0 0 0 0 0 0.57 −12.57 0 0 0 −12.57 0 3 7777775 : ð10Þ We label λð3Þ and Γð3Þ with a superscript “3” to distinguish this result from the corresponding result using only l ¼ 2 modes shown in Eq. (9). For nonprecessing sources, using Eq. (5) and a uniform prior in χ1;z, χ2;z and the two component masses, we can evaluate the marginal posterior probability pðzÞ for any intrinsic parameter(s) z. The two-dimensional marginal posterior probability is shown as a green solid (l ¼ 2) and dashed (l ≤ 3) line in Figs. 4 and 7. Both the l ¼ 2 and l ≤ 3 two-dimensional distributions are in reasonable agreement with the posterior distributions reported in LVC-PE [2] for nonprecessing binaries, shown as a black curve in these figures. These two-dimensional distributions are also consistent with the distribution of simulations with lnLmarg > 268.6 (i.e., black points). Additionally, Fig. 5 shows several one-dimensional marginal probability dis- tributions (m1;z, m2;z, q, χeff ), shown as dotted (l ¼ 2) or dashed lines (l ≤ 3); for comparison, the solid line shows the corresponding distribution from LVC-PE [2] for non- precessing binaries. Despite broad qualitative agreement, these comparisons highlight several differences between our results and LVC- PE [2], and between results including l ¼ 2 modes and those including all l ≤ 3 modes. For example, Fig. 4 shows that the distribution in Mz, q, χeff , computed using our method (solid green lines and black points) is slightly different than the corresponding distributions in LVC-PE [2]. As seen in this figure and in Fig. 7, the posterior distribution in LVC-PE [2] includes binaries with low effective spin, outside the support of the distributions reported here. These differences are directly reflected in the marginal posterior pðχeffÞ (right panel of Fig. 5) and in Table I. Our results for the component spins χ1;z, χ2;z, the effective spin χeff , the total mass Mz, and the mass of the more massive object m1;z do not change significantly when l ¼ 3 modes are included. The mass ratio distribution pðqÞ is also slightly different from LVC-PE [2] when l ¼ 3 modes are included; see Fig. 5. Compared to prior work, this analysis favors comparable-mass binaries when higher modes are included; see, e.g., the center panel of Fig. 5. The differences between the results reported here and LVC-PE [2] should be considered in context: not only does our study employ numerical relativity without analytic waveform models, but it also adopts a slightly different starting frequency, omits any direct treatment of calibration uncertainty, and employs a quadratic approximation to the likelihood. That said, comparisons conducted under similar limitations and using real data, differing only in the underlying waveform model, reproduce results from LALINFERENCE; see PE+NR-Methods [11] for details. By assuming the binaries are strictly aligned but permit- ting generic spin magnitudes, our analysis (and that in LVC-PE [2]) neglects prior information that could be used to significantly influence the posterior spin distributions. For example, the part of the posterior in the bottom right quadrant of Fig. 7 is unstable to large angle precession [101]: if a comparable-mass binary formed at large sepa- ration with χ1;z > 0 and χ2;z < 0, it could not remain DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-13 aligned during the last few orbits. Likewise, the astro- physical scenarios most likely to produce strictly aligned binaries—isolated binary evolution—are most likely to result in both χ1;z, χ2;z > 0: both spins would be strictly and positively aligned (see, e.g, Ref. [102]). In that case, only the top right quadrant of Fig. 7 would be relevant. Using the analytic tools provided here, the reader can regenerate approximate posterior distributions employing any prior assumptions, including these two considerations. C. Transverse and precessing spins Figure 8 shows the maximum likelihood for the available NR simulations, plotted as a function of the magnitude of the aligned and transverse spin components. The figure shows that there are both precessing and nonprecessing simulations that have large likelihoods (black points), indicating that many precessing simulations are as con- sistent with the data as nonprecessing simulations. Moreover, simulations with large precessing spins are consistent with GW150914: many configurations have χeff ≃ 0 but large spins on one or both BHs in the binary. Keeping in mind the limited range of simulations available, the magnitude and direction of either BHs’ spin cannot be significantly constrained by our method. Not all precessing simulations with suitable q, χeff are consistent with GW150914; some have values of lnL that are not within 10 of the peak (see the right panel of Fig. 8). The marginal log likelihood lnL depends on the transverse spins, not just the dominant parameters (q, χeff , Mz). As a concrete illustration, Fig. 9 shows that the marginalized log likelihood depends on the specific direction of the trans- verse spin, in the plane perpendicular to the angular momentum axis. Specifically, this figure compares the peak marginalized log likelihood (lnL) calculated for each simulation with the value of lnL predicted from our fit to nonprecessing binaries. For precessing binaries, lnL is neither in perfect agreement with the nonprecessing pre- diction, nor independent of rotations of the initial spins about the initial orbital angular momentum by an angle ϕ. While the transverse spins do influence the likelihood, slightly, the data do not favor any particular precessing configurations. No precessing simulations had margin- alized likelihoods that were both significant overall and significantly above the value we predicted assuming aligned spins. In other words, the data do not seem to favor precessing systems, when analyzed using only information above 30 Hz. Our inability to determine the most likely transverse spin components is expected, given both our self-imposed restrictions (flow ¼ 30 Hz) and the a priori effects of geometry. For example, the lack of apparent modulation in the signal reported in LVC-detect [1] and LVC-Burst [4] points to an orientation with J parallel to the line of sight, along which precession-induced modulations are highly suppressed. In addition, the high mass and hence extremely short observationally accessible signal above 10 Hz pro- vides relatively few cycles with which to extract this information. The time scales involved are particularly unfavorable to attempts to extract precession-induced modulation from the pre-merger signal: the pre-coalescence precession rate for these sources is low [Ωp ≃ ð2þ 3m2= m1ÞJ=2r3 ≃ 2π × 1 Hzðf=40 HzÞ5=3 for this system, where J is the magnitude of the total orbital angular momentum and we assume J ≃ L; see Ref. [19]], implying at best two pre-merger precession cycles could be acces- sible from the early signal; see LVC-PE [2]. As with the total binary mass, spin information will be accessible at lower frequencies (i.e., between 10–30 Hz); however, our fiducial analysis using flow ¼ 30 Hz is not well suited to extract it. For a suitably oriented source, the strongly nonlinear merger phase can in principle encode significant informa- tion about the coalescing binary’s precessing spins. Qualitatively speaking, this information is encoded in the relative amplitude and phase of subdominant quasi- linear perturbations, causing the radiation from the final black hole to appear to precess [16,30]. This information also influences the final black hole mass and spin. The model used in LVC-PE [2] adopted a geometric ansatz to incorporate these effects at leading order, using a lower- dimensional effective model for a single precessing spin. However, in this work, despite including higher modes and having direct access to as-yet unmodeled effects, our analysis shows no significant difference from the previ- ously reported conclusions regarding the transverse spin distribution. The low-frequency content of GW150914 may contain some further signature consistent with two precessing FIG. 9. Transverse spins can influence the marginal likelihood: Δ lnL, the difference between the computed lnL of a precessing simulation [see Eq. (7)] and the estimated value of lnL from our fit to nonprecessing simulations, plotted as a function of an angle ϕ, for two one-parameter families of simulations whose initial conditions differ only by a rotation of the initial spins through an angle ϕ around the initial angular momentum axis. The color scale indicates the value of χeff . The simulations shown have a single nonzero spin a1 ¼ 0.8 with q ¼ 2, from Ref. [18]. B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-14 spins. Simultaneously with this work, an analysis has been performed using semianalytic models that can fully capture both spins’ dynamics (LVC-SEOBNRv3 [103]). Within the context of this study, Table V shows an analysis without an artificially imposed low-frequency cutoff. As expected, the best-fitting long simulations seen in our previous report fit equally well and agree. Notably, however, we find an increase in the marginalized likelihood for precessing simulations like SXS:BBH:308 and D21.5_q1_a0.2 _0.8_th104.4775_n100. More broadly, when we include low-frequency content, many precessing simula- tions that previously had not fit the high-frequency content as well become more significant. However, to extract low- frequency content reliably, we will need to both hybridize these precessing simulations and interpolate the likelihood as a function of both precessing spins. These further investigations are beyond the scope of the present study. D. Uncertainties from simulations In the above discussion, we have not described or propagated any systematic errors associated with the underlying NR simulations, because the statistical uncer- tainties are much larger than these systematic errors for this analysis. As a concrete example, the shaded area in Fig. 3 represents the statistical error in a nonprecessing analysis, transformed into uncertainty in strain. Any systematic errors in the hðtÞ produced by NR simulations would need to be on the order of those shown in Fig. 3 in order to significantly distort our interpretation of GW150914. In contrast, the self-consistency checks and convergence tests of many NR groups demonstrate errors in hðtÞ that are a few to several orders of magnitude smaller than the statistical uncertainty represented in Fig. 3; see Appendix A for details, and Ref. [10] for one salient example. In short, previous convergence studies and Fig. 3 suggest that statistical errors will dominate the residual impact of systematic uncertainty in each simulation. Several calculations bear out this broad conclusion. Our detailed results in Table III, available electronically, quan- titatively suggest that simulations with similar physics produce similar results. Since our table explicitly includes results from different NR groups with different approaches and codes, this demonstrates by selected examples that our results are independent of implementation, algorithm, initial data, waveform extraction procedure, resolution, etc. In the context of the method used in this work, PE +NR-Methods [11] will provide a detailed analysis of the impact of several sources of numerical error, notably numerical truncation error and waveform extraction error. A related study [14], where template-based parameter estimation was carried out on synthetic data derived from numerical relativity, also found that simulation systematic uncertainty negligibly impacts the derived posterior dis- tribution. In selected studies like Ref. [10] and LVC-detect [1], different NR groups’ simulations that reproduce GW150914 agree with one another extremely well (e.g., mismatch less than 10−3 despite completely different analytical and numerical methods for initial data, evolution, and waveform extraction). This large body of evidence is somewhat anecdotal: we do not have a resolution study or detailed, ready-to-use procedure to assess all possible sources of numerical error available for every simulation in our archive. However, these and related results provide a quantitative rationale for trusting these simulations’ accuracy, based on their past track record of good performance under similar circum- stances. As subsequent work will illustrate (e.g., PE+NR- Methods [11] and Ref. [14]), a detailed error budget and propagation analysis should not significantly alter the conclusions of this analysis; we therefore omit them. On a related note, the analysis reported in LVC-PE [2] employed two semianalytic models which had been tuned to numerical relativity, using some of the simulations employed here. So there may be concern that our agree- ment with LVC-PE [2] is due to the use of 41 NR simulations in common. However, these few specific model-calibration simulations are not critical to our analy- sis or interpretation; for example, we employ many other simulations with similar parameters. To demonstrate the complete independence of this work from the analysis performed in LVC-PE [2], we have repeated our calcu- lations after removing these few model-calibration simu- lations and we find no significant difference in our posterior distributions. V. RESULTS II: STRONG-FIELD PROPERTIES AND POST-COALESCENCE PARAMETERS The numerical relativity simulations listed in Table II have been previously used to develop accurate models for the final black hole mass and spin [26,27,104,105]. The relationships developed in Ref. [26] for nonprecessing binaries were used in LVC-PE [2] and LVC-TestGR [3] to infer the final black hole mass and spin, based on the pre- coalescence spins. By construction, this approximation neglects the impact of transverse spins. Both this work and that in LVC-PE [2] have shown that GW150914 is consistent both with nonprecessing and precessing pre- coalescence spins. When large, these spins are well known to significantly impact the final black hole mass and spin [18,76,106–109]. With direct access to both an accurate multimodal waveform for generic precessing systems and the final black hole state, the method applied in this work is uniquely well equipped to identify the final black hole mass and spin. Figure 10 shows our results. Rather than approximate a posterior distribution—a significant chal- lenge in eight dimensions—we simply report sets of points Mf;z; af corresponding to simulations and initial redshifted masses Mz so lnLmargðMzÞ is greater than some cutoff. When we include only nonprecessing simulations, we find DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-15 results consistent with the reported values in LVC-PE [2] and LVC-TestGR [3]. While many simulations listed in Table III have some transverse spin, many also have zero transverse spin, so overall the transverse spin distribution of our simulations is more concentrated towards zero than the prior adopted in LVC-PE [2]. Given the excellent agree- ment between our results and LVC-PE [2] for pre-coales- cence parameters, particularly in the subset of spin-aligned binaries, we cannot identify any nonzero difference for final parameters that is introduced by our methodology (e.g. our restriction to flow ¼ 30 Hz). VI. CONCLUSIONS Using a full Bayesian parameter estimation technique, we directly compared GW150914 with a large set of binary black hole simulations produced using full numerical relativity. Our comparisons employed physics and radiation content (l ≤ 3 modes) not available or only partially captured by the two semianalytic models used in LVC- PE [2]. Using our completely independent approach, we nonetheless arrived at results similar to those of LVC-PE [2]. Comparisons including only the dominant modes (all l ≤ 2) constrain the total redshifted mass Mz [64–82 M⊙], mass ratio 1=q ¼ m2=mq ∈ ½0.6; 1�, and effective aligned spin χeff ∈ ½−0.3; 0.2�. Including l ¼ 3 modes, we found that the mass ratio is even more tightly constrained. Both nonprecessing and precessing simulations fit the data; no compelling evidence exists for or against a precessing origin. Even accounting for precession, simulations with extreme mass ratios and effective spins are highly incon- sistent with the data, at any mass. Several nonprecessing and precessing simulations with similar mass ratio and χeff are consistent with the data. Though correlated, the component spins (both in magnitude and direction) are not significantly constrained by the data: the data are consistent with simulations with component spin magni- tudes a1;2 up to at least 0.8, with random orientations. This paper also provides the first concrete illustration, using real gravitational wave data, of several methods to aid the interpretation of gravitational wave observations using numerical relativity. First and foremost, this method dem- onstrates that the marginalized likelihood can be efficiently evaluated on a grid [81,82]. Straightforward reconstruc- tions (e.g., fits, interpolation) allow us to reconstruct the posterior at low cost. Further, NR simulations are suffi- ciently dense, and the marginal log-likelihood lnLmarg sufficiently simple, that lnLmarg can be effectively approxi- mated using available catalogs of NR simulations. Second, we provided and employed a simple but effective approxi- mation to the marginalized likelihood. A particularly efficient way to communicate results, this data product enables further investigations, including the impact of the prior on our conclusions, the ability to incorporate the spin- precession instability into our posterior [101], and anything involving conditional distributions, which are trivially produced using the fit. Third, this investigation has dem- onstrated the critical role that numerical relativity can play in data analysis while simultaneously illuminating a path forward in the era of frequent detections. We demonstrate that NR results can be directly applied to data analysis, without intervening approximations. In the future, while low-frequency sensitivity will improve, so will our ability to effectively hybridize these simulations, so this approach will remain valuable even when very long signal models are required to reproduce the data. Targeted follow-up can be performed guided by lnL, our measure of overall fit (maximizing lnLmarg over mass). Fourth, as described in PE+NR-Methods [11], this method provides a direct and unambiguous method to assess the relative impact of higher harmonics, waveform extraction, and modeling uncertainty on a point-by-point basis. Investigations using this tech- nique will provide a valuable complement to parallel studies with LALINFERENCE [14]. As noted in LVC-Astro [31], the inferred spin magni- tudes and misalignments provide unique and distinctive clues to the astrophysical origin of GW150914. Notably, strongly misaligned spins require a violent origin, either through exceptionally dynamic stellar processes or a cluster origin. Our analysis cannot definitively support or rule out such an origin. We recommend further analysis of GW150914 with improved models for binary inspiral and coalescence, whether derived semianalytically or via hybridization and/or interpolation of pure numerical rela- tivity. For example, LVC-SEOBNRv3 [103] reported FIG. 10. Final redshifted mass and spin: The final redshifted black hole massesMf;z and spins af. Each point represents an NR simulation; both nonprecessing and precessing simulations are included. Points with 265.8 < lnLmarg < 268.6 are shown in light gray, those with lnLmarg > 268.6 are shown in black, and those with lnLmarg < 265.8 are shown according to the color scale on the right (points with lnLmarg < 172 have been sup- pressed to increase contrast). For comparison, the solid black curve shows the 90% credible interval on Mf;z and af derived in LVC-PE [2] and LVC-TestGR [3] using a spin-aligned model; the blue curve shows the corresponding result derived from a single- spin precessing (IMRP) model. B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-16 marginally tighter constraints on (two) precessing spins, by comparing GW150914 against a model for the emitted radiation including the very early inspiral, which by necessity NR simulations must omit. Combined with this method, we further anticipate that a large-scale simulation campaign in full numerical relativity to explore simulations comparable to GW150914 could allow us to extract more insight into its nature. PE+NR-Methods [11] will provide further details on and examples with the method employed in this work. ACKNOWLEDGMENTS The authors gratefully acknowledge helpful feedback from an anonymous referee. The authors gratefully acknowledge the support of the United States National Science Foundation (NSF) for the construction and oper- ation of the LIGO Laboratory and Advanced LIGO as well as the Science and Technology Facilities Council (STFC) of the United Kingdom, the Max-Planck-Society (MPS), and the State of Niedersachsen/Germany for support of the construction of Advanced LIGO and construction and operation of the GEO600 detector. Additional support for Advanced LIGO was provided by the Australian Research Council. The authors gratefully acknowledge the Italian Istituto Nazionale di Fisica Nucleare (INFN), the French Centre National de la Recherche Scientifique (CNRS) and the Foundation for Fundamental Research on Matter supported by the Netherlands Organisation for Scientific Research, for the construction and operation of the Virgo detector and the creation and support of the EGO consortium. The authors also gratefully acknowledge research support from these agencies as well as by the Council of Scientific and Industrial Research of India, Department of Science and Technology, India, Science & Engineering Research Board (SERB), India, Ministry of Human Resource Development, India, the Spanish Ministerio de Economía y Competitividad, the Conselleria d’Economia i Competitivitat and Conselleria d’Educació, Cultura i Universitats of the Govern de les Illes Balears, the National Science Centre of Poland, the European Commission, the Royal Society, the Scottish Funding Council, the Scottish Universities Physics Alliance, the Hungarian Scientific Research Fund (OTKA), the Lyon Institute of Origins (LIO), the National Research Foundation of Korea, Industry Canada and the Province of Ontario through the Ministry of Economic Development and Innovation, the National Science and Engineering Research Council Canada, the Brazilian Ministry of Science, Technology, and Innovation, the Leverhulme Trust, the Research Corporation, Ministry of Science and Technology (MOST), Taiwan and the Kavli Foundation. The authors gratefully acknowledge the support of the NSF, STFC, MPS, INFN, CNRS, and the State of Niedersachsen/ Germany for provision of computational resources. The SXS Collaboration also gratefully acknowledges Compute Canada, the Research Corporation, and California State University Fullerton for computational resources, as well as the support of the National Science Foundation, the Research Corporation for Science Advancement,and the Sherman Fairchild Foundation. The RIT team gratefully acknowledges the NSF for financial support, as well as Blue Waters and XSEDE for computational resources. This paper has been assigned the document number LIGO- P1500263. APPENDIX A: SIMULATION LIST In this appendix, we enumerate the simulations used in this work (Table II), providing a more detailed description of the simulations performed and their relationship to the literature. Unless otherwise noted, we extract ~ψ4;lmðfÞ [and therefore ~hlmðfÞ and hlmðtÞ] at infinity using a perturbative extrapolation [110] reexpressed in the Fourier domain; see PE+NR-Methods [11] for further details. RIT simulations: Binary black hole (BBH) data were evolved using the LAZEV [111] implementation of the moving puncture approach [42,43] with the conformal function W ¼ ffiffiffi χ p ¼ expð−2ϕÞ suggested by Ref. [112]. For the run presented here, we use centered, sixth-order finite differencing in space [113] and a fourth-order Runge- Kutta time integrator. (Note that we do not upwind the advection terms.) This code uses the EINSTEINTOOLKIT [114] / CACTUS [115] / CARPET [116] infrastructure. The CARPET mesh refinement driver provides a “moving boxes” style of mesh refinement. In this approach, refined grids of fixed size are arranged about the coordinate centers of both holes. The CARPET code then moves these fine grids about the computational domain by following the trajectories of the two BHs. The RIT group used AHFINDERDIRECT [117] to locate apparent horizons. The magnitude of the horizon spin is computed using the isolated horizon algorithm detailed in Ref. [118] and as implemented in Ref. [119]. Note that once we have the horizon spin, we can calculate the horizon mass via the Christodoulou formula mH ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi m2 irr þ S2H=ð4m2 irrÞ p , where mirr ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi A=ð16πÞp , A is the surface area of the horizon, and SH is the spin angular momentum of the BH (in units of M2). The 128 simulations reported in Zlochower and Lousto [18], denoted in Table II by RIT-Kicks, have only one black hole spinning with jχj ¼ 0.8. For a handful of these simulations, the estimate of the final black hole mass and spin has been updated since the original publication. These simulations include (a) a simulation with large transverse spins and several spin precession cycles which fits the data well [28]; (b) a wide range of simulations with large aligned and antialigned spins for mass ratios near and far from unity [74]; (c) a set of simulations with targeted mass ratios and spins, designed to systematically explore the parameter space and reconstruct generic recoil kicks DIRECTLY COMPARING GW150914 WITH NUMERICAL … PHYSICAL REVIEW D 94, 064035 (2016) 064035-17 when q > 1 [18]; (d) a set of equal mass simulations with large spins (0.8) and generic orientations, designed to systematically explore the parameter space and reconstruct recoil when q ¼ 1 [75]; and (e) 38 simulations of equal mass black holes with equal magnitude and precessing spins [120]. SXS simulations: SXS provided simulations from their public catalog—initially reported in Ref. [59]—as well as several selected follow-up simulations. The SXS Collaboration uses the Spectral Einstein Code (SpEC) [121] for evolution. Quasiequilibrium initial data are constructed in the extended conformal thin-sandwich formalism using a pseudospectral elliptic solver [122,123]. The evolution occurs on a grid extending from inner excision boundaries, slightly inside the apparent horizons, to an outer boundary on which constraint-pre- serving boundary conditions are imposed [124]. The code uses a first-order generalized harmonic representation of Einstein’s equations with damped harmonic gauge [125–129]. After merger, the grid is updated to include only one excision boundary [130,131]. The excision boundaries are dynamically adjusted to conform to the shapes of the apparent horizons [130,131]. The initial orbital eccentricity is reduced with an iterative procedure [132,133]. Other improvements have been applied to enable long simulations [134] and simulations of highly spinning black holes [135]. GT simulations: Initial data was evolved using the moving puncture approach with Maya, which was used in previous BH-BH studies [136–143]. The grid structure for each run consisted of ten levels of refinement provided by CARPET [116], a mesh refinement package for CACTUS [115]. Each successive level’s resolution decreased by a factor of 2. Sixth-order spatial finite differencing was used with the Baumgarte-Shapiro/Sasaki-Nakamura (BSSN) equations implemented with Kranc [144]. Simulations denoted by GT-Aligned refer to the z, zq, and zU series in Refs. [16,30]; the GT-Misaligned case refers to the S and Sq series; and GT-Tilting refers to the T and Tq series. Where available, we adopt the naming convention used in Ref. [78]. In particular, the 452 simulations in Ref. [78] surveyed the most extensive parameter space of BBH systems with 49 nonspinning, 81 aligned-spinning and 324 generic precessing spins BBH simulations. They covered mass ratios ranging from q ≤ 15 for nonspinning and q ≤ 8 for precessing spinning BBH systems, and included generic spin orientations and spin magnitudes, jaj < 0.8. BAM simulations: The Cardiff-UIB group provided 29 simulations using parameters similar to the event, with approximately random initial configurations within the 99% credible region inferred for GW150914 in LVC-PE [2]. These BBH simulations were produced by the BAM NR code [79,80]. The BAM code solves the Einstein evolution equations with the moving puncture approach using the BSSN [145,146] formulation of the 3þ 1 decomposed Einstein field equations. The BSSN equations are integrated with a fourth-order finite-difference Runge- Kutta time integrator, with a fixed time step along with a sixth-order accurate finite-difference algorithm based on the method of lines for spatial derivatives. The χ variation of the moving-puncture method is used where a new conformal factor is defined as χ ¼ ψ−4 which is finite at the puncture [42]. The lapse and shift gauge functions are evolved using the 1þ Log slicing condition and the Gamma driver shift condition respectively [42]. Conformally flat puncture initial data [147–149] are calcu- lated using the pseudospectral elliptic solver described in Ref. [150]. 1. Follow-up simulations Several groups performed new simulations in response to GW150914, indicated in Table II by an asterisk (�). Some of these simulations were made available for this analysis. The SXS group performed eight targeted simulations near the maximum a posteriori parameters reported in LVC-PE [2]. The RIT group performed a systematic follow-up campaign on nonprecessing binaries, targeting the mass ratio and spin range favored by LVC-PE [2]. This campaign included 52 new simulations of nonprecessing binaries in the range of mass ratio 1=2 ≤ q ≤ 1 for spinning binaries and up to q ¼ 1=6 without spin. So far the sequence also includes 11 new precessing simulations in the observatio- nally relevant mass ratio range of 1=3 ≤ q ≤ 3=4 to further calibrate results. B. P. ABBOTT et al. PHYSICAL REVIEW D 94, 064035 (2016) 064035-18 TABLE II. List of simulations: Table of simulations used in this work. Columns indicate the group; an (internal) shorthand for the simulation; the mass ratio; and the components of the dimensionless spins χ1 ¼ Si=m2 i ; the effective aligned spin ξ; the estimated initial starting orbital frequency Mω0; and (where available) the final black mass and spin. (We indicate where the black hole mass and spin was unavailable by using X for the corresponding entry.) An asterisk (�) denotes a new simulation motivated by GW150914; a (þ) denotes one of the simulations reported in LVC-detect [1]; an S denotes simulations used to calibrate the SEOB model [6] used in LVC- detect [1]; and a p denotes simulations used to calibrate the IMRP model [27] used in LVC-detect [1]. (The printed table only shows a few entries from each group; the full table is available as online Supplementary Material [100].) Name Key q χ1;x χ1;y χ1;z χ2;x χ2;y χ2;z χeff Mω0 Mf=M af RIT-Generic D10.50_q0.1667_a0.0_0.0_n100(�) 6.000 � � � � � � � � � � � � � � � � � � � � � 0.026 0.986 0.372 RIT-Generic D10_q0.33_a-0.8_xi0_n100(�) 2.999 0.757 0.030 0.259 � � � � � � −0.800 −0.006 0.029 0.965 0.756 RIT-Generic D10_q0.33_a0.8_xi0_n120(�) 2.999 0.754 0.031 −0.268 � � � � � � 0.800 −0.001 0.030 0.970 0.607 RIT-Generic D10_q0.50_a-0.50_0.50_n100(�) 2.000 � � � � � � 0.500 � � � � � � −0.500 0.167 0.028 0.953 0.751 RIT-Generic D10_q0.50_a-0.8_xi0_n100(�) 2.000 0.696 0.059 0.392 � � � −0.006 −0.801 −0.005 0.030 0.956 0.768 SXS-All SXS:BBH:0001[p] 1.000 � � � � � � � � � � � � � � � � � � � � � 0.012 0.952 0.686 SXS-All SXS:BBH:0010 1.501 0.248 0.028 −0.433 � � � � � � � � � −0.260 0.014 0.962 0.563 SXS-All SXS:BBH:0100 1.500 � � � � � � � � � � � � � � � � � � � � � 0.012 0.955 0.664 SXS-All SXS:BBH:0101 1.501 � � � � � � −0.500 � � � � � � � � � −0.300 0.011 0.963 0.540 SXS-All SXS:BBH:0102 1.500 0.496 0.051 −0.001 0.494 0.071 −0.001 −0.001 0.014 0.954 0.695 RIT-Kicks RIT:BBH:NQ16TH115PH0 6.000 0.725 � � � −0.338 � � � � � � � � � −0.290 0.033 0.991 0.554 RIT-Kicks RIT:BBH:NQ16TH115PH120 6.000 −0.363 0.628 −0.338 � � � � � � � � � −0.290 0.034 0.991 0.552 RIT-Kicks RIT:BBH:NQ16TH115PH150 6.000 −0.628 0.363 −0.338 � � � � � � � � � −0.290 0.034 0.991 0.556 RIT-Kicks RIT:BBH:NQ16TH115PH30 6.000 0.628 0.363 −0.338 � � � � � � � � � −0.290 0.032 0.991 0.553 RIT-Kicks RIT:BBH:NQ16TH115PH60 6.000 0.363 0.628 −0.338 � � � � � � � � � −0.290 0.034 0.991 0.556 RIT-OlderWork RIT:BBH:KTH22.5PH0 1.000 −0.026 0.304 0.760 −0.008 0.310 −0.759 0.001 0.042 0.960 0.695 RIT-OlderWork RIT:BBH:KTH22.5PH120 1.000 −0.272 −0.157 0.757 −0.272 −0.157 −0.757 � � � 0.043 0.961 0.698 RIT-OlderWork RIT:BBH:KTH22.5PH150 1.000 −0.157 −0.272 0.757 −0.157 −0.272 −0.757 � � � 0.043 0.961 0.697 RIT-OlderWork RIT:BBH:KTH22.5PH30 1.000 −0.185 0.257 0.756 −0.157 0.272 −0.757 −0.001 0.042 0.960 0.695 RIT-OlderWork RIT:BBH:KTH22.5PH60 1.000 −0.297 0.138 0.751 −0.272 0.157 −0.757 −0.003 0.042 0.960 0.695 GT GT:BBH:564 1.000 � � � � � � −0.400 � � � � � � −0.400 −0.400 0.026 0.961 0.560 GT GT:BBH:476 1.000 � � � � � � −0.200 � � � � � � −0.200 −0.200 0.025 0.956 0.624 GT (0.0,1.0) 1.000 � � � � � � � � � � � � � � � � � � � � � 0.030 0.952 0.686 GT (0,1.0,’M100’) 1.000 � � � � � � � � � � � � � � � � � � � � � 0.029 0.951 0.687 GT (0.0,1.0,’M120’,’D11’) 1.000 � � � � � � � � � � � � � � � � � � � � � 0.029 0.951 0.686 GT GT:BBH:456 1.500 0.346 � � � 0.200 � � � � � � 0.400 0.280 0.024 0.947 0.753 GT GT:BBH:455 1.500 0.424 � � � 0.424 � � � � � � 0.600 0.495 0.020 0.937 0.822 GT GT:BBH:457 1.500 0.520 � � � 0.300 � � � � � � 0.600 0.420 0.021 X X GT GT:BBH:764 1.500 0.600 � � � � � � � � � � � � 0.600 0.240 0.021 X X GT GT:BBH:458 2.000 0.346 � � � 0.200 � � � � � � 0.400 0.267 0.023 0.954 0.722 GT GT:BBH:550 2.000 0.424 � � � −0.424 � � � � � � 0.600 −0.083 0.032 0.964 0.549 GT GT:BBH:545 2.000 � � � � � � −0.600 � � � � � � 0.600 −0.200 0.033 0.967 0.465 GT GT:BBH:556 2.000 −0.600 � � � � � � � � � � � � 0.600 0.200 0.032 0.955 0.698 GT EK_D6.2_a0.6_th000_M77 1.000 0.584 0.143 0.002 −0.584 −0.143 0.002 0.002 0.070 0.951 0.686 GT GT:BBH:482 1.000 0.520 0.300 � � � −0.520 −0.300 � � � � � � 0.071 0.951 0.684 GT GT:BBH:483 1.000 0.424 0.424 � � � −0.424 −0.424 � � � � � � 0.071 0.950 0.683 GT GT:BBH:484 1.000 0.300 0.520 � � � −0.300 −0.520 � � � � � � 0.072 0.950 0.681 GT GT:BBH:485 1.000 � � � 0.600 � � � � � � −0.600 � � � � � � 0.072 0.949 0.680 GT aa_b5_a0.2_M77 1.000 � � � � � � 0.200 � � � � � � −0.200 � � � 0.027 X X GT aa_b5_a0.4_M77 1.000 � � � � � � 0.400 � � � � � � −0.400 � � � 0.028 X X GT aa_b5_a0.6_M77 1.000 � � � � � � 0.600 � � � � � � −0.600 � � � 0.029 X X GT aa_b5_a0.8_M77 1.000 � � � � � � 0.800 � � � � � � −0.800 � � � 0.031 X X GT fr_b5_a0.6_random2_M77 1.000 � � � � � � 0.600 � � � � � � −0.600 � � � 0.027 X X GT fr_b3.1_a0.6_oth.000_M77 1.000 � � � � � � 0.600 −0.600 � � � � � � 0.300 0.057 X X GT fr_b3.1_a0.6_oth.015_M77 1.000 0.155 � � � 0.580 −0.600 � � � � � � 0.290 0.057 X X GT fr_b3.1_a0.6_oth.030_M77 1.000 0.300 � � � 0.520 −0.600 � � � � �