Day 3 - Get the Weak Lensing Signals

In this session, we will use the things described in the last two sessions and make a weak lensing measurement pipeline for ourselves. We will start with writing out the different steps required to get the pipeline ready. We then code up each of them as individual functions, which will be called in our main code.

We are here studying the matter distribution around SDSS redmapper galaxy clusters - arXiv:1303.3562 by the weak lensing technique using shape catalog data from HSC S16A data - arXiv:1705.06745. We suggest readers to have a look at these references to have a better understanding of data provided to them. The readers can also compare their results with the same study done on these clusters using shape catalog from SDSS - arXiv:1707.01907.

We highly suggest readers to make a separate notebook for this session as it will be useful for later use.

In the codes below I have left some black spaces indicated using ??. Please fill up the ?? in the code below and make sure you are doing it with right equations and conversions.

Required Steps

Reading data from the catalog and appying the selection cuts.
Computing the tangential shear \(e_{\rm t}\) and inverse critical density \(\Sigma^{-1}_{\rm crit}\).
\(\Delta \Sigma (R)\) measurements using cKDTree and writing the output to a file.
Plotting the weak lensing signal.

[15]:

#loading the required packages
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import cKDTree
from astropy.cosmology import FlatLambdaCDM
import glob

Reading data from the catalog and appying the selection cuts.

We are going to use the selection cuts on lensing sample in the day-1 session. The below code can be use for other cuts too but for now lets stick to defaults as given below.

[8]:

# selection cut on the lens sample
def lens_select(zmin=0.1, zmax=0.33, lammin=55, lammax=100):
    #please check the file path properly
    data = pd.read_csv('/home/idies/workspace/Storage/divyar/IAGRG_2022/DataStore/redmapper.dat', delim_whitespace=1)
    #sample selection cut
    idx  = (data['lambda']>lammin) & (data['lambda']<=lammax)
    idx  = idx & (data['zred']>zmin) & (data['zred']<=zmax)
    ra   = data['ra'].values[idx]
    dec  = data['dec'].values[idx]
    zred = data['zred'].values[idx]
    #as we have no weights to apply we set them to unity
    wgt  = ra*1.0/ra
    print('number of lenses=%d'%len(ra))
    return ra, dec, zred, wgt

Similar we define a function for sources and collect the data from it. Please note that there are many columns in source file but for now we are only using some of them. Here we are only using

ra_gal, decgal : ra and dec for the sources
e1gal, e2gal: decribes the shapes of the sources
wgal, rms_egal: weights and Intrinsic shape dispersion per component
zphotgal: redshift of the sources

For now we will neglect the data in other columns. As it is used for correcting the biases for our measurements and we will decribe how to use them at the end of this session.

[9]:

def read_sources(ifil):
    # various columns in sources
    # ragal, decgal, e1gal, e2gal, wgal, rms_egal, mgal, c1gal, c2gal, R2gal, zphotgal
    data = pd.read_csv(ifil, delim_whitespace=1).values
    zphotgal = data[:,-1]
    # sanity checks on the sources data
    idx = (np.sum(np.isnan(data), axis=1)==0) &  (zphotgal>0)
    datagal = np.zeros((np.sum(idx),7))
    datagal[:,:6] = data[idx,:6]
    datagal[:,6]  = data[idx,-1]
    # collects only -  ragal, decgal, e1gal, e2gal, wgal, rms_egal, zphotgal
    return datagal

Computing the tangential shear \(e_{\rm t}\) and inverse critical density \(\Sigma^{-1}_{\rm crit}\)

Now we will follow the formalism described in Prof Surhud More’s lectures. We will first code up function to compute tangential component of the ellipticity given the positions for the lenses (\(\alpha_l\), \(\delta_l\)) and sources(\(\alpha_s\), \(\delta_s\)) along with the shape measurements (\(e_1\),\(e_2\)) for the sources.

We will first compute angle \(\theta\) between a given lens-source pair.

\[\cos \theta = \cos \delta_l \cos \delta_s \cos(\alpha_l - \alpha_s) + \sin\delta_l \sin \delta_s\]

where \(\delta_{l,s}\) and \(\alpha_{l,s}\) are ra and dec for lens(l) and source(s). The tangential component of ellipticity \(e_t\) is given as

\[e_t = - e_1 \cos 2\phi - e_2 \sin 2\phi\]

\[\sin \phi = \frac{-\sin \delta_l \cos \delta_s + \cos \delta_l \sin \delta_s \cos(\alpha_s - \alpha_l)}{|\sin\theta|}\]

\[\cos \phi = \frac{\cos\delta_l \sin(\alpha_s - \alpha_l)}{|\sin\theta|}\]

For more info related to above equations refer to Lecture Notes for Tangential-shear-computation

Coding up these equations requires trigonometric functions from numpy package. Please check them out - sin, cos

Please note that these functions by default uses angles in radians and here we are working with catalog data with (ra,dec) in degrees. So, we need to do the needful conversion first

[12]:

# lra, ldec - lenses position
# sra, sdec - sources position
# se1 and se2 - source ellipticities
def get_et(lra, ldec, sra, sdec, se1, se2):
    # angle_in_radian = angle_in_degrees * np.pi/180
    lra  = ??
    # fill up the blanks
    ldec = ??
    sra  = ??
    sdec = ??
    # c_theta = cos_theta, s_theta = sin_theta
    # use the equations above and complete the expressions
    c_theta = np.cos(??)*np.cos(sdec) * ?? + np.sin(ldec)*np.sin(??)
    s_theta = np.sqrt(1-c_theta**2)

    # phi to get the compute the tangential shear
    c_phi   = ?? *1.0/s_theta
    s_phi   = (-np.sin(ldec)*np.cos(sdec) + ?? )*1.0/s_theta
    # tangential shear
    e_t     = - se1*(??) - se2*(??)

    return e_t

Exercise:

Tell what are you getting the output for \(e_t\) if you use input lra=0.0, ldec=0.0, sra=0.123, sdec=0.045, se1 = 4.5e-2, se2 = 1.7e-2.
running command : print(get_et(lra=0.0, ldec=0.0, sra=0.123, sdec=0.045, se1 = 4.5e-2, se2 = 1.7e-2)) in the next cell below the function defination cell. Please add cell below using the insert option on the top.

we will now move on to writing a function to get \(\Sigma^{-1}_{\rm crit} (z_l,z_s)\) given the lense redshift (\(z_l\)) and source redshift (\(z_s\)) and it also needs a instance of astropy class (cc). Remember we need to create a instance for the astropy cosmo class given the input cosmological parameters. In the current code structure we will initiate the astopy cosmo class in our main code.

To review you can refer to Day-1 docs page to see how astropy is used to get cosmological distances. Please note that here we are working in comoving coordinates to do the signal computations so the corresponding \(\Sigma^{-1}_{\rm crit} (z_l,z_s)\) is given as

\[\Sigma^{-1}_{\rm crit}(z_l,z_s) = \frac{4\pi G}{c^2} \frac{d_{\rm ang}(z_l) d_{\rm ang}(z_l, z_s) (1 + z_l)^2} {d_{\rm ang}(z_s)}\]

where \(d_{\rm ang}(z)\) is the angular diameter distance for redshift \(z\).

[4]:

def get_sigma_crit_inv(lzred, szred, cc):
    # some important constants for the sigma crit computations
    gee = 4.301e-9 #km^2 Mpc M_sun^-1 s^-2 gravitational constant
    cee = 3e5 #km s^-1
    # sigma_crit_calculations for a given lense-source pair
    sigm_crit_inv = ?? * cc.angular_diameter_distance_z1z2(lzred, szred).value * (1.0 + lzred)**?? * 1.0/cc.angular_diameter_distance(szred).value
    sigm_crit_inv = sigm_crit_inv * 4*np.pi*gee*1.0/cee**2
    sigm_crit_inv = 1e12*sigm_crit_inv #esd's are in pc not in Mpc

    return sigm_crit_inv

Exercise:

In the cell below the function defination, first initiate the cosmo class from astropy using “cc = FlatLambdaCDM(H0=100, Om0=0.999)”.
running command : print(get_sigma_crit_inv(lzred=0.33, szred=0.8, cc=cc))

Remember how to convert from ra,dec (degrees) to x,y,z coordinates.

[ ]:

def get_xyz(ra, dec):
    ra = ra*np.pi/??
    dec = dec*??
    x = np.cos(dec)*np.cos(ra)
    y = np.cos(??)*np.sin(??)
    z = np.sin(dec)
    return x, y, z

Exercise:

Tell the output of : print(get_xyz(30, 60))

\(\Delta \Sigma\) measurements using cKDTree and writing the output to a file.

Our aim in this section is to write the main function to measure weak lensing signal \(\Delta \Sigma (R)\). As discussed in the lectures the \(\Delta \Sigma (R)\) for a comoving projected radial bin \(R\) as given by

\[\Delta \Sigma (R) = \frac{\sum_{ls} w_{\rm ls} e_{t,ls} (\Sigma^{-1}_{\rm crit})^{-1}}{2\mathcal{R} \sum_{ls} w_{\rm ls}}\]

where \(w_{ls}\) is the source-lense weights given as \(w_{ls} = w_l w_s \Sigma^{-2}_{\rm crit}\) and the \(\mathcal{R}\) is the responsivity given as \(\mathcal{R} = 1 - \frac{\sum_{\rm ls} w_{ls} e^2_{\rm rms}}{\sum_{\rm ls} w_{ls}}\). The \(e_{\rm rms}\) is the intrinsic shape dispersion given by the shape catalog for individual galaxies.

Decription of algorithm used in the code below (run_pipe) to get the signal \(\Delta \Sigma\).

Global Input Parameters:
1. Omegam : Omega matter - will be used in cosmological distance calculation, default = 0.315
2. rmin : Minimum projected radius - will be used for setting projected radial range, default = 0.2
3. rmax : Maximum projected radius - will be used for setting projected radial range, default = 2.0
4. nbins : number of projected radial bins - use to setup the logarithmic binning, default = 10
5. zdiff : a small offset added to lense redshift to select cleaner background sources \(z_s > (z_l + {\rm zdiff})\), default = 0.4
6. outputfile: dat file name for the output, default = ‘iagrg_dsigma.dat’
Initialize the astropy cosmo then getting projected radial binning array ready
We are computing the numerators and denominators for the signals separately and they have to be initialize first:
1. sumdsig_num : array for \(\Delta\Sigma\) numerator
2. sumdsigsq_num: array for estimating the shape noise (covered in lectures) numerators
3. sumwls : array for summed lense-source weights
4. sumwls_resp : array for \(\Delta\Sigma\) numerator(we can push the 1 into summed w_ls - look at the expression)
Collect lenses data and convert the position (ra,dec) –> (x,y,z) using the above defined functions. Then use these x,y,z to get a lens tree using cKDtree.
We also need to set a maximum search radius for the query_ball_point (refer to day-2 tutorials). The maximum is theratio between maximum projected radius rmax and minimum comoving distance to our lense sample - rmax/(comoving_distance(zlmin)), zlmin - minimum lense redshift.
Now we use glob package to collect the file paths to our many many files. We loop over these path and read each file using the read_source function define above.
Then we take the source ra, dec and convert them to x,y,z so that we can query around each source using lense tree and collect lenses within maximum search radius. This will give us a list of ids of lenses for each sources.
Once we have a list of indices of lenses for all galaxies in the file. We then loop over the galaxies in individual file to compute the array of comoving projected radial distance for the each galaxy with it’s lenses that we got in the last step. Here we also put a cut that for a given lense-source pair with the source redshift \(z_s\) and lense redshift \(z_l\) , \(z_s > z_l +\) zdiff and we also skip sources which doesn’t have any lense around them.
The projected distance array from last step then used to look for corresponding bin number in our binning scheme as described in the Global parameters. Then we use the above equations along with the above defined functions to get the values of array defined in step 3.
As said in step 6, we run step 7-9 for all the source files in our directory and then write the resultant output in a file named by the Global Parameters.

[ ]:

[6]:

def run_pipe(Omegam=0.315, rmin=0.2, rmax=2.0, nbins=10, zdiff=0.4, outputfile = 'iagrg_dsigma.dat'):
    #set the cosmology with omegaM parameter
    cc = FlatLambdaCDM(H0=100, Om0=Omegam) # fixing H0=100 to set units in Mpc h-1

    # set the projected radial binning
    rmin  =  rmin
    rmax  =  rmax
    nbins = nbins #10 radial bins for our case
    rbins = np.logspace(np.log10(rmin), np.log10(rmax), nbins + 1)
    rdiff = np.log10(rbins[1]*1.0/rbins[0]) # difference in log binning

    # initializing arrays for signal compuations
    sumdsig_num   = np.zeros(len(rbins[:-1]))
    sumdsigsq_num = np.zeros(len(rbins[:-1]))
    sumwls        = np.zeros(len(rbins[:-1]))
    sumwls_resp   = np.zeros(len(rbins[:-1]))

    # getting the lenses data
    lra, ldec, lred, lwgt = ??
    # convert lense ra and dec into x,y,z cartesian coordinates
    lx, ly, lz = ??
    # putting kd tree around the lenses
    lens_tree = ??
    print('lenses tree is ready\n')

    # setting maximum search radius
    dcommin = cc.comoving_distance(np.min(lred)).value
    dismax  = (rmax*1.0/(dcommin))

    # lets first catch the file list for sources
    sflist = np.sort(glob.glob('/home/idies/workspace/Storage/divyar/IAGRG_2022/DataStore/hsc/*.dat'))

    # Ready to pounce on the source data
    for ifil in sflist:
        # source data in datagal array  using read source function
        datagal = ??
        Ngal = len(datagal[:,0])  # total number of galaxies in the source file

        # first two entries are ra and dec for the sources
        allragal  = datagal[:,0]
        alldecgal = datagal[:,1]

        # ra and dec to x,y,z for sources
        allsx, allsy, allsz = ??

        # query in a ball around individual sources and collect the lenses ids with a maximum radius
        slidx = lens_tree.query_ball_point(np.transpose([allsx, allsy, allsz]), dismax)

        # looping over all the galaxies
        for igal in range(Ngal):
            ragal    = datagal[igal,0]
            decgal   = datagal[igal,1]
            e1gal    = datagal[igal,2]
            e2gal    = datagal[igal,3]
            wgal     = datagal[igal,4]
            rms_egal = datagal[igal,5]
            zphotgal = datagal[igal,6]

            # array of lenses indices
            lidx = np.array(slidx[igal])

            # skip sources which doesn't have any lenses around them
            if len(lidx)==0:
                continue

            # selecting a cleaner background
            zcut = (lred[lidx] < (zphotgal - zdiff)) #only taking the foreground lenses

            # again skipping the onces which doesn't satisfy the above criteria
            if np.sum(zcut)==0.0:
                continue

            # collecting the  data of lenses around individual source
            lidx   = lidx[zcut] # this will catch the array indices for our lenses
            sra    = ragal
            sdec   = decgal

            l_ra   = lra[lidx]
            l_dec  = ldec[lidx]
            l_zred = lred[lidx]
            l_wgt  = lwgt[lidx]

            sx, sy, sz = ?? # individual galaxy sra,sdec-->sx,sy,sz
            lx, ly, lz = ?? # individual galaxy lra,ldec-->lx,ly,lz

            # getting the radial separations for a lense source pair
            sl_sep = np.sqrt((lx - sx)**2 + (ly - sy)**2 + (lz - sz)**2)
            sl_sep = sl_sep * cc.comoving_distance(l_zred).value

            # here we are binning the separation in projected radial bins
            # ll will enumerate the lenses
            for ll,sep in enumerate(sl_sep):

                #check whether separation sits inside our projected radial range
                if sep<rmin or sep>rmax:
                    continue

                # finding the corresponding radial bins for separations
                rb = int(np.log10(sep*1.0/rmin)*1/rdiff)

                # get tangantial components given positions and shapes
                e_t = ??(lra = l_ra[ll], ldec = l_dec[ll], sra = ??, sdec = ??, se1 = ??,  se2 = ??)

                # sigma_crit_calculations for a given lense-source pair with cc as astropy cosmo instance
                sigm_crit_inv = ??(l_zred[ll], ??, cc)

                # following equations given in the surhud's lectures
                w_ls    = l_wgt[ll] * wgal * (??)**??
                w_ls_by_av_sigc_inv = l_wgt[ll] * ?? * sigm_crit_inv

                # separate numerator and denominator computation
                sumdsig_num[rb]   += w_ls_by_av_sigc_inv  * e_t
                sumdsigsq_num[rb] += (w_ls_by_av_sigc_inv * e_t)**2
                sumwls[rb]        += w_ls
                sumwls_resp[rb]   += w_ls * (1-rms_egal**2)

        print(ifil)

    outputfile = outputfile
    fout = open(outputfile, "w")
    fout.write("# 0:rmin/2+rmax/2 1:DeltaSigma  2:SN_ErrDeltaSigma\n")
    for i in range(len(rbins[:-1])):
        rrmin = rbins[i]
        rrmax = rbins[i+1]
        Resp       = sumwls_resp[i]*1.0/sumwls[i]
        dsigma     = sumdsig_num[i]*1.0/sumwls[i]/2./Resp # follow from the equation above
        dsigma_err = np.sqrt(sumdsigsq_num[i])*1.0/sumwls[i]/2./Resp

        fout.write("%le\t%le\t%le\n"%(rrmin/2.0+rrmax/2.0, dsigma, dsigma_err))
    fout.write("#OK")
    fout.close()

    return 0

[ ]:

run_pipe()

Plotting the weak lensing signal

[ ]:

dat = np.loadtxt('iagrg_dsigma.dat')

plt.errorbar(dat[:,0], dat[:,1], yerr=dat[:,2], fmt='.', capsize=3, label='Data')
plt.legend()

plt.xlabel(r'$R[{\rm h^{-1}Mpc}]$')
plt.ylabel(r'$\Delta\Sigma [{\rm h M_\odot pc^{-2}}]$')
plt.xscale('log')
plt.yscale('log')

The bias corrected measurements

For doing a bias corrected measurements one has to add few more things in our \(\Delta \Sigma\) expression. Interested reader can check this expression for \(\Delta \Sigma\) in arXiv:2107.05641 and can modify the above code to get the bias corrected measurements.

Please note that within each source file we have already provided you the data coloumns required to do these computations.

[ ]: