Running the MegaBEAST

Background

The MegaBEAST is specifically set up to use the output of BEAST runs. It requires the BEAST results to be spatially reordered, so that all the stars in a region are split into smaller pixels. For example, the BEAST results for a PHAT brick will consist of many files, with each file giving the results for one 5x5 arcsec^2 pixel. The MegaBEAST uses the BEAST nD pPDFs, usually in the form of a small subset of the full pPDF points sampled from the region with significant probability.

The MegaBEAST expects to find filenames in the format name_XX_YY_lnp.hd5, where XX and YY are the x and y pixel coordinates. The translation between the x,y values and ra,dec is given in the name_nstars.fits file that was created when the BEAST spatial reordering was done.
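The naming convention above can be illustrated with a small helper that builds and parses such filenames. This is a minimal sketch, not MegaBEAST code: the function names parse_lnp_filename and lnp_filename are hypothetical, and it assumes XX and YY are non-negative integers written without any other separators.

```python
import re
from pathlib import Path

# Assumed pattern: <name>_<XX>_<YY>_lnp.hd5, with XX and YY as non-negative integers.
_LNP_PATTERN = re.compile(r"^(?P<name>.+)_(?P<x>\d+)_(?P<y>\d+)_lnp\.hd5$")


def parse_lnp_filename(path):
    """Return (name, x, y) for a spatially reordered lnp file, or None if no match."""
    match = _LNP_PATTERN.match(Path(path).name)
    if match is None:
        return None
    return match.group("name"), int(match.group("x")), int(match.group("y"))


def lnp_filename(prefix, x, y):
    """Build the filename for pixel (x, y) from a path+prefix string."""
    return f"{prefix}_{x}_{y}_lnp.hd5"
```

A helper like this can be used to check which pixels have files on disk before starting a run, for example by applying parse_lnp_filename to every file in the output directory and skipping the entries that return None.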

More information on the BEAST spatial reordering can be found in the BEAST documentation.

Running

Once the spatially reordered BEAST data are ready, the MegaBEAST is run from the command line.

$ python megabeast.py input_file

The available options are listed with the --help flag.

$ python megabeast.py --help
usage: megabeast.py [-h] [-v] megabeast_input_file

positional arguments:
  megabeast_input_file  Name of the file that contains settings, filenames,
                        etc

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         verbose output

The input file is an ASCII file that contains the run settings and file names listed below. An example can be found here.

  • Information from the BEAST
    • beast_seds_filename: SEDs from the BEAST
    • beast_noise_filename: noise model from the BEAST
    • av_prior_model, rv_prior_model, fA_prior_model: dictionaries with the priors used when running the BEAST
    • nstars_filename: FITS file with the number of stars in each spatially-reorganized pixel
    • lnp_file_prefix: path+prefix for the spatially reorganized log likelihood files
  • Information for the MegaBEAST
    • projectname: the name to use for the folders and files generated by the MegaBEAST
    • fit_param_names: list of the names of the parameters to fit
    • min_for_fit: minimum number of stars needed in a pixel for fitting
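As a quick sanity check before a run, the settings listed above can be collected in a dictionary and tested for completeness. This is a minimal sketch under stated assumptions: the helper missing_settings is hypothetical and not part of the MegaBEAST, every value in example_settings is a placeholder, and whether the MegaBEAST treats all of these settings as mandatory is not confirmed here.

```python
# Setting names taken from the list above.
EXPECTED_KEYS = (
    "beast_seds_filename",
    "beast_noise_filename",
    "av_prior_model",
    "rv_prior_model",
    "fA_prior_model",
    "nstars_filename",
    "lnp_file_prefix",
    "projectname",
    "fit_param_names",
    "min_for_fit",
)


def missing_settings(settings):
    """Return the expected setting names that are absent from `settings`."""
    return [key for key in EXPECTED_KEYS if key not in settings]


# Placeholder values only; replace them with the files and priors from your BEAST run.
example_settings = {
    "beast_seds_filename": "example_seds.grid.hd5",
    "beast_noise_filename": "example_noisemodel.grid.hd5",
    "av_prior_model": {"name": "flat"},  # placeholder prior dictionary
    "rv_prior_model": {"name": "flat"},  # placeholder prior dictionary
    "fA_prior_model": {"name": "flat"},  # placeholder prior dictionary
    "nstars_filename": "example_nstars.fits",
    "lnp_file_prefix": "spatial/example",
    "projectname": "example_megabeast",
    "fit_param_names": ["Av"],  # placeholder parameter list
    "min_for_fit": 20,  # placeholder threshold
}

print(missing_settings(example_settings))
```

An empty list means every setting named above is present. The check says nothing about whether the values themselves are valid.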

MegaBEAST wrapper

The examples also include a wrapper that can both run the MegaBEAST (as described above) and create diagnostic plots.

$ python megabeast_wrapper.py --help
usage: megabeast_wrapper.py [-h] [-m] [-p] [-v] megabeast_input_file

positional arguments:
  megabeast_input_file  Name of the file that contains settings, filenames,
                        etc

optional arguments:
  -h, --help            show this help message and exit
  -m, --run_megabeast   Run the MegaBEAST
  -p, --diagnostic_plots
                        Generate diagnostic plots
  -v, --verbose         verbose output

If the diagnostic plots are requested, the wrapper creates a PDF file with two or more plots:

  • completeness vs Av, which evaluates how well stars with a given Av are recovered
  • histograms of Av (optionally log spaced) within each of the spatially-reorganized pixels
  • if chi2_plot is set for plot_input_data: histograms of Av, but with one (or more) different cuts on chi2

MegaBEAST outputs

The results of the MegaBEAST are the best-fit values of the ensemble model for the A(V) distribution. Currently, only the A(V) ensemble model is implemented. The output filenames have the format projectname_param_bestfit.fits, where the allowed values of param are AV1, AV2, sigma1, sigma2, and N12_ratio. The A(V) ensemble model has two lognormal components, each described by N (number of stars), AV (the A(V) at the peak of the distribution), and sigma (width of the lognormal). The MegaBEAST does not fit for the total number of stars, so only the ratio between the normalizations of the two lognormals (N12_ratio) is fit.
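To make the role of the five fitted parameters concrete, the two-component model can be sketched as a weighted sum of two lognormal densities. This is an illustrative sketch, not the MegaBEAST implementation. The function names are hypothetical. It assumes that AV1 and AV2 are the modes (peak positions) of the components, and that the total weight is normalized to one so that only the ratio of the two normalizations matters.

```python
import math


def lognormal_pdf(av, av_peak, sigma):
    """Lognormal density in A(V) whose maximum (mode) is at av_peak."""
    if av <= 0.0:
        return 0.0  # the lognormal is only defined for positive A(V)
    # For a lognormal, mode = exp(mu - sigma^2), so mu follows from the peak position.
    mu = math.log(av_peak) + sigma ** 2
    norm = 1.0 / (av * sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((math.log(av) - mu) / sigma) ** 2)


def two_lognormal_pdf(av, av1, av2, sigma1, sigma2, n12_ratio):
    """Sum of two lognormals with weight1 / weight2 == n12_ratio and unit total weight."""
    weight2 = 1.0 / (1.0 + n12_ratio)
    weight1 = 1.0 - weight2
    return weight1 * lognormal_pdf(av, av1, sigma1) + weight2 * lognormal_pdf(
        av, av2, sigma2
    )
```

For example, two_lognormal_pdf(0.5, av1=0.3, av2=1.5, sigma1=0.4, sigma2=0.6, n12_ratio=3.0) evaluates a model in which the first component carries three times the weight of the second. Because the weights sum to one, the total number of stars never enters, which is why only N12_ratio appears among the fitted parameters.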

Maps of the best fit parameters can be created with parameter_maps.py. The optional input n_col (default is 2) sets the number of columns of plots to put on the page.

from parameter_maps import parameter_maps
parameter_maps('megabeast_input.txt', n_col=2)