I'm Assistant Professor of Statistical Astronomy at Princeton University's Department of Astrophysical Sciences and the Center for Statistics and Machine Learning. I lead the Princeton Astronomy Data Lab, where we develop algorithms that accurately capture the properties of billions of stars and galaxies despite the limitations of real-world instruments.
My central research question right now is: how can we optimally combine multiple data sets to extract more information than from any individual analysis? My lab designs a system that combines data from the upcoming surveys LSST, Euclid, and WFIRST at the pixel level. We develop techniques for source separation, mixture modeling, and data fusion, using proximal techniques and neural networks.
On an even larger scale, we optimize the full scientific duty cycle, from instrument design to observing strategy to data analysis for maximum yield: precision measurements and discovery potential. Funded by the Schmidt Futures Foundation, we build modern statistical and machine learning methods and focus on the target selection of the upcoming PFS survey.
With the increasing complexity of the matrix factorization problem we solve inside of scarlet, we've reached the limits of the traditional proximal gradient method. That method requires analytic knowledge and computation of the "correct" step sizes, which isn't possible for data fusion (think LSST + WFIRST). Instead, we use the powerful optimization method Adam to provide adaptive step sizes on the fly for a new proximal quasi-Newton method. It's not just better, it's also faster.
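To make the idea concrete, here is a minimal sketch of an Adam update followed by a proximal step, assuming a toy non-negative least-squares problem. This is not the scarlet implementation: the function name `proximal_adam`, the toy data, and all parameter values are hypothetical and chosen for illustration only. For a non-negativity constraint the projection is separable per coordinate, so it is unaffected by Adam's per-coordinate rescaling; other constraints may need more care.

```python
import numpy as np

def proximal_adam(grad, prox, x0, lr=0.05, beta1=0.9, beta2=0.999,
                  eps=1e-8, n_iter=2000):
    """Adam step followed by a proximal operator (illustrative sketch)."""
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)  # running mean of the gradient
    v = np.zeros_like(x)  # running mean of the squared gradient
    for t in range(1, n_iter + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)  # bias correction
        v_hat = v / (1 - beta2**t)
        # per-coordinate adaptive step, no analytic step size needed
        x = prox(x - lr * m_hat / (np.sqrt(v_hat) + eps))
    return x

# Toy problem (assumed for illustration): min 0.5 * ||A x - b||^2 with x >= 0
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5))
x_true = np.array([1.0, 0.0, 2.0, 0.5, 0.0])
b = A @ x_true

def loss(x):
    return 0.5 * np.sum((A @ x - b) ** 2)

def grad(x):
    return A.T @ (A @ x - b)

def prox_nonneg(x):
    # projection onto the non-negative orthant
    return np.maximum(x, 0.0)

x_fit = proximal_adam(grad, prox_nonneg, np.zeros(5))
print(loss(np.zeros(5)), loss(x_fit))
```

The step size never has to be derived from the problem: the running gradient statistics set it per coordinate, which is the property that matters when the likelihood combines instruments with very different characteristics.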
Strike 1: In this paper we develop the source separation framework scarlet for image analysis.
HSC (shown on the left) and LSST have imaging so deep that sources routinely overlap, rendering measurements for single sources largely obsolete.
We used Constrained Matrix Factorization (see earlier post below) with several custom constraints and fast likelihood computation to achieve robust deblending.
For kicks, we also looked at how well we can separate AGN jets from their host galaxies.
The answer: very well!
More scarlet analyses coming soon.
In many areas of science, one wants to separate multiple contributions to a given signal. If those are strictly linear superpositions, as is often the case, one can and should use the Constrained Matrix Factorization (consider yourself encouraged). However, the existing algorithms were slow and didn't allow us to flexibly apply constraints/priors to the solution. In this paper, Fred Moolekamp and I devise a scheme that is both fast and flexible. The code is of course public. Running source separation on multispectral images works like a charm!
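For readers who want to see what a constrained factorization looks like in practice, here is a minimal sketch, assuming the simplest case: a data matrix Y approximated by a product A S with both factors constrained to be non-negative, solved by alternating proximal gradient steps. It is not the algorithm from the paper or the public code; the function `factorize_nonneg` and the toy data are hypothetical. Other constraints would be applied by swapping in a different proximal operator.

```python
import numpy as np

def prox_nonneg(X):
    # proximal operator of the non-negativity constraint: clip at zero
    return np.maximum(X, 0.0)

def factorize_nonneg(Y, k, n_iter=200, seed=0):
    """Alternating proximal gradient for Y ~ A @ S with A, S >= 0 (sketch)."""
    rng = np.random.default_rng(seed)
    A = rng.random((Y.shape[0], k))
    S = rng.random((k, Y.shape[1]))
    losses = []
    for _ in range(n_iter):
        # step size 1/L, with L the Lipschitz constant of the gradient in A
        L_A = np.linalg.norm(S @ S.T, 2) + 1e-12
        A = prox_nonneg(A - ((A @ S - Y) @ S.T) / L_A)
        # same for S, using the freshly updated A
        L_S = np.linalg.norm(A.T @ A, 2) + 1e-12
        S = prox_nonneg(S - (A.T @ (A @ S - Y)) / L_S)
        losses.append(0.5 * np.sum((Y - A @ S) ** 2))
    return A, S, losses

# Toy data (assumed): 3 non-negative components seen in 6 "bands" over 40 "pixels"
rng = np.random.default_rng(1)
Y = rng.random((6, 3)) @ rng.random((3, 40))
A_fit, S_fit, losses = factorize_nonneg(Y, k=3)
print(losses[0], losses[-1])
```

In a multispectral image, the rows of Y would be bands and the columns pixels, so A holds one spectrum per component and S one spatial map per component.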
In a study with David Spergel and former Princeton undergrad Arianna Lanz, I tried something mildly crazy: how well can we center on a star whose core is completely saturated, using only the diffraction spikes? I was skeptical... But it turns out this is really precise for sufficiently bright stars and much less sensitive to pixel-level artifacts. When done with WFIRST, we should be able to reach precisions better than 10 μas, and that would allow us to detect Earth-mass planets around the nearby stars and provide masses for many of the WFIRST coronagraph/starshade targets.
In astronomy and elsewhere (social as well as physical sciences), it's quite common to have incompletely observed data or even gaps. For density estimation that's a problem because there are now two processes: the one we care about that generated the samples, and another one that removed some of them. I've developed a new method to correct for the latter by drawing "missing" samples from the model itself. It works even with noise and an unrelated background. Together with Andy Goulding I've applied it directly to the photon events from Chandra because it has gaps, noise, and a prominent particle background.
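The core idea of drawing "missing" samples from the model can be sketched in a few lines. This is a deliberately stripped-down illustration under strong assumptions: a single 1-D Gaussian, a known selection window, no noise, and no background. It is not the published method, and the function `fit_gaussian_with_imputation` and all numbers are hypothetical.

```python
import numpy as np

def fit_gaussian_with_imputation(observed, is_observable, n_iter=50, seed=0):
    """Fit a 1-D Gaussian to truncated data by imputing from the model (sketch)."""
    rng = np.random.default_rng(seed)
    mu, sigma = observed.mean(), observed.std()  # naive starting point
    n_obs = observed.size
    for _ in range(n_iter):
        # draw from the current model and see which draws would have been lost
        draws = rng.normal(mu, sigma, size=4 * n_obs)
        seen = is_observable(draws)
        p_seen = max(seen.mean(), 1e-3)
        # number of lost samples implied by n_obs observed ones
        n_missing = int(round(n_obs * (1 - p_seen) / p_seen))
        imputed = draws[~seen][:n_missing]
        # refit on observed plus imputed samples
        full = np.concatenate([observed, imputed])
        mu, sigma = full.mean(), full.std()
    return mu, sigma

# Toy data (assumed): standard normal samples, everything above 0.5 is lost
rng = np.random.default_rng(1)
samples = rng.normal(0.0, 1.0, size=5000)

def is_observable(x):
    return x < 0.5

observed = samples[is_observable(samples)]
mu_naive, sigma_naive = observed.mean(), observed.std()
mu_fit, sigma_fit = fit_gaussian_with_imputation(observed, is_observable)
print(mu_naive, sigma_naive, mu_fit, sigma_fit)
```

The naive fit is biased because it only sees what survived the selection; the imputation loop counteracts this by letting the model fill in what the selection removed, then refitting.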
When you want to do cosmology with galaxy clusters, you need to know how massive they are. I led the team that determined the masses of optically detected clusters in DES Science Verification data by aggregating their gravitational shear signal. We also quantified every systematic contamination we could think of. Now we know that earlier work on SPT SZ (blue 68% confidence contours) and on weak lensing with SDSS (gray) is fully consistent with ours (red), while we extend the former to lower-mass systems and the latter from redshifts z < 0.3 to z < 0.8. Get ready for DES cluster cosmology!
How do you get many people to do something boring? Turn it into a game! For the rather tedious task of identifying flaws in Dark Energy Survey images, I created a crowdsourcing app that allows survey participants to explore fully reduced images simply in their browsers.
We aggregate their reports when something is fishy, and reward the effort with honor and glory. Symmetry magazine reported on the project a while back, but in the paper we tell the entire story. Key findings: 1) Collaborative discovery is fun! 2) We found and fixed a number of flaws in hardware and software, and made the survey better. Win-win!
Recently, the ASAS-SN program detected a super-luminous supernova, and follow-up spectroscopy revealed that it’s the brightest one on record (cf. Dong et al. 2015). I got a request from ASAS-SN colleagues at OSU to check whether the host galaxy (identified as APMUKS(BJ) B215839.70-615403.9) is in the DES footprint. It sure is. From our latest Year-2 data we publicly released the deepest image of the host and added 5-band optical photometry to existing NIR and IR observations. We determined the galaxy to be a massive old elliptical, highly unusual for a SLSN host.
For the first paper with data from the Dark Energy Survey, I used the Dark Energy Camera (DECam) on the CTIO 4-m telescope to study four massive galaxy clusters. The targeted clusters were well known (one of them is the famous Bullet Cluster), so that the findings from our data could be cross-checked with existing results. In a larger international team, we validated the instrument and the analysis pipelines for the delicate task of weak gravitational lensing. And then we utilized the enormous 3 deg² field of view to reveal extended filamentary features from which these clusters accrete their material.
With images and spectroscopy from SDSS, I measured, for the first time, the really weak lensing signal of cosmic voids.
These huge underdense volumes of the cosmic web imprint tiny distortions on the shapes of background galaxies, which we extracted from the data (Nautilus reported). While only marginally significant, equivalent to a 2.9 σ detection, this measurement confirms that voids are as underdense in dark matter as they are in galaxies.