The models that Li et al. are using are absolutely enormous, especially compared to what we are looking at.
To better drive this home, here is a set of galaxies created using our parameters, compared to a sampling of these models. The cutout sizes are a STAGGERING 1501×1501 pixels, 6 times the size of our already large cutouts! The limits are the same, and you can see just how massive (and bright!) these galaxies are.
Lastly, here are the cutouts I’m sending in for low and high density simulations. The low density cutouts are selected using the top left of the 2D histogram, as we discussed at the last meeting.
Lots of trial and error later and I have some good updates.
TBriDGE now has some nice additions which will speed up the pipeline quite significantly.
Background subtraction can be done on a set of profiles quickly. No more confusing methods.
To the isophote fitting, I added the option to set an alarm using the Python signal package. Using this, I can set a maximum allowed time when trying to extract a profile. If the alarm is triggered, the fitting stops and continues to the next object. This both gets rid of issues due to infinite loops in photutils AND prevents profile extractions from getting “stuck” in rare cases. Users can adjust the time limit to whatever is reasonable.
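Roughly, the alarm mechanism works like this (a minimal sketch; `extract_profile` and the wrapper name are placeholders, not TBriDGE's actual API). Note that signal.alarm is Unix-only and must be used from the main thread:

```python
import signal

class ProfileTimeout(Exception):
    pass

def _alarm_handler(signum, frame):
    # Raised inside whatever the process is doing when the alarm fires
    raise ProfileTimeout

def fit_with_timeout(extract_profile, cutout, max_seconds=60):
    signal.signal(signal.SIGALRM, _alarm_handler)
    signal.alarm(max_seconds)          # start the countdown (whole seconds)
    try:
        return extract_profile(cutout)
    except ProfileTimeout:
        return None                    # give up on this object and move on
    finally:
        signal.alarm(0)                # always cancel any pending alarm
```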
Additional methods to save and load sets of profiles and cutouts have been added, meaning I can speed up work by not having to repeat certain processing.
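As a rough illustration, save/load helpers like these can be as simple as a pickle round trip (the function names here are illustrative, not TBriDGE's actual methods):

```python
import pickle

# Minimal sketch: serialize a set of extracted profiles to disk so
# reprocessing isn't needed on the next run
def save_profiles(profiles, filename):
    with open(filename, "wb") as f:
        pickle.dump(profiles, f)

def load_profiles(filename):
    with open(filename, "rb") as f:
        return pickle.load(f)
```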
Below are the single-bin profiles for various gradients, for both low-density and high-density cutouts. This is with masking fixed, and the violet line is post background extraction. You might notice that for the low-density regions the background-subtracted curves are ABOVE the blue curves in places. This is due, I would assume, to a negative median background, which for the blue curve cancels out light from nearby objects.
Above are the gradients I used to test the effect of severe gradient differences. The models that I get after adding them to a random selection of these backgrounds are below. Below that are the bins (with extracted medians). Please note that since I was doing these locally, I extracted the profiles without the linear mode active (so less data, but much less computationally intensive).
Carrying on with our discussion on Thursday, I generated the “number within 126 pixels vs magnitude of brightest object within 126 pixels” plot. Not as nice a correlation as I think we expected. We can check the code if we want to, but I’m decently confident that I have it right.
To get cutouts of different densities (which the simulation code is now adjusted to run with), I first selected 3 regions that I had a decent number of cutouts for.
Following this, I gathered a bunch of cutouts and visually selected 20 from each. (I can add more or remove certain ones as needed.)
Low, Medium, and High Density Cutouts from the 2D Histogram
Visually selected density cutouts (hopefully it’s clear which belong to which density region).
We can look at these in greater depth during the meeting.
As for the Cedar job, it’s still running! This is not surprising to me, as 10x the objects will be roughly 10x the run-time. The place where I divide the code up to multiple cores is at the bin level, so the run-time will scale with the number of objects. No crashes this time though, so it should run to completion (even if we are waiting a while longer). Fingers crossed! If this isn’t working, I could in theory alter the code to run in multiprocessing mode at the object level, which would make single bins run faster. Not entirely sure if it is worth the hassle, but it’s possible with a couple hours of troubleshooting.
So I ran into some roadblocks over the past couple of days while getting density catalogues. Reviewing the code revealed an issue, and fixing it resulted in a different 2D distribution. I’ve included how I get the number of galaxies within 126 pixels, as well as the distance to the 5th nearest neighbour for random points. This is AFTER fixing the code, but a sanity check would be great!
# c is a SkyCoord object containing all galaxies from the CLAUDS catalog with i < 24.5
# r_u is a SkyCoord object with all the random point locations (100000 of them)
import astropy.units as u

rows = []
for coord in r_u:
    # The sum of nearby gives the number of objects (since the comparison yields a boolean array)
    nearby = c.separation(coord) < (126 * 0.168) * u.arcsec
    # Get the 5th nearest neighbour; match_to_catalog_sky returns
    # (index, on-sky separation, 3D distance), so the separation comes for free
    idx, sep2d, _ = coord.match_to_catalog_sky(c, nthneighbor=5)
    rows.append([coord.ra.deg, coord.dec.deg, nearby.sum(), sep2d.arcsec])
The updated 2D histogram, as well as the same plot with the clusters from Angelo’s catalogue overlaid, look like this:
As you can see, you get this strange bow-tie shape, but there definitely seems to be a more defined distribution than before. The empty space on the bottom left is a consequence of how I compute the values: the 5th nearest neighbour can’t be closer than 21 arcseconds (126 pixels) if there are fewer than 5 objects within that radius.
There are still bugfixes to be done on the simulation code. An initial attempt to run the code on Cedar (with the now-incorrect catalogues) failed unfortunately, and I’m still looking into why. I had to retool the code quite a bit to use a density catalogue instead of picking images at random, and while it worked in tests I wasn’t entirely confident that I would have results by our next meeting.
That being said, much of the code is in place to quickly remake the density catalogues if there are issues. I also wrote some code to match these positions with the COSMOS coverage map, meaning getting the required positions is a simple exercise. I am still working on making a system that allows you to manually select which profiles it extracts (for the benefit of time), but it is definitely giving me trouble.
I also wrote some code to generate some cutouts, which can be seen below. I selected three bins (left and upper, middle, and lower right), and grabbed cutouts from them. There doesn’t seem to be much difference visually (after fixing up the code) so perhaps there are some bugs that need to be found.
I have an idea to rectify this though, at least for the technical paper. Since we are running sims, why don’t we simulate regions of different densities? Since we have KDEs, we could make a simulated background and add galaxies at random (using 2D Sersic profiles). I’ve done this sort of testing before, so it should not be very difficult. Another advantage to this is that we can simulate regions that won’t be found in COSMOS (such as, say, hyper-dense environments that are beyond the scope of the density KDEs I get for COSMOS).
UPDATE: Here is the 2D KDE (after cleaning up the code) with both the CAMIRA clusters as well as the spectroscopic clusters chosen from the BCGs. I was able to speed up the process of getting the 5th nearest neighbour, but wasn’t able to find a method to quickly get the number of objects within a given distance. As for the plot, it looks like the X-ray clusters have greater numbers towards the bottom of the plot (as expected), but there is still a distribution. Maybe those groups have low numbers? If you have a group of 4 galaxies tightly clustered but isolated, the distance to the 5th nearest neighbour will be high.
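For what it’s worth, both quantities can be computed quickly with a k-d tree; here is a generic sketch in pixel coordinates with fake positions (this is not the sky-coordinate code used above):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
galaxies = rng.uniform(0, 4000, (5000, 2))   # stand-in galaxy positions, in pixels
points = rng.uniform(0, 4000, (100, 2))      # stand-in random query points

tree = cKDTree(galaxies)
# Distances to the 5 nearest neighbours for all query points at once;
# the last column is the 5th-nearest-neighbour distance
dists, _ = tree.query(points, k=5)
d5 = dists[:, -1]
# Number of galaxies within 126 pixels of each query point
counts = np.array([len(idx) for idx in tree.query_ball_point(points, r=126)])
```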
To begin, I changed the colorbar to not include a vmin, and I also changed things to a log scale. I also changed the binning to be on a step size of 1. The second plot shows a re-binned version (sizes 5×5). I also include the number of objects in each bin.
I also took Angelo’s cluster catalogue (the CAMIRA clusters), took the cluster positions from that, and ran them through the same process I used to get the 2D histogram. Overplotted as red points, it looks like this:
There definitely seems to be more clustering towards the bottom of the plot (as expected), but there’s quite a distribution! Perhaps the clusters with few members, or more diffuse clusters spatially are causing this spread?
If I apply a mask to the cluster catalogue by asking for clusters with more than 20 members, the plot then looks like the one below. What I’m curious about is that many of them don’t have many galaxies within the 126-pixel radius (at least, quite a few are well below 20). Possible reasons include the magnitude cut on the galaxy catalogue, or the clusters being more diffuse than I would expect.
So as we look to find a function to fit the data, I’ve settled on the sigmoid function as it seems to do a really good job. I modified it a little bit to better fit our purposes, with added parameters to adjust the vertical/horizontal stretch, and the horizontal displacement. The equation for the model I created is as follows:
A adjusts the vertical stretch, b adjusts the displacement, and c adjusts the horizontal stretch. I’m pretty sure this is the appropriate way to incorporate stretches and shifts into the curve shape.
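In symbols, one parameterization consistent with this description (my assumption; the actual form may differ slightly) is:

f(x) = A / (1 + e^(−c(x − b)))

where A scales the curve vertically, b shifts it horizontally, and c controls the horizontal stretch (larger c gives a steeper transition).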
One thing I discovered with the local-background-subtraction profiles is that we hit a value of 1 and then stay there, which is different from the other profiles. This is due, I believe, to the presence of many zero values in that intensity profile: since they are all zeros, you’re dividing input by input and getting 1 throughout. Below is an example:
Regardless, let’s fit a few of these profiles with that modified sigmoid function and see what we get! I use Astropy to generate a custom model, and fit it to the data with the LevMarLSQFitter algorithm. Below are examples for 6 bins:
For the most part, it does a really solid job! I also have the r^2 values plotted, but the general fit to the shape seems pretty good.
I also finished an ACENET run of the local background subtraction. For some strange reason, a row of bins is missing (I’m looking into why – since it appears in both the mean and median extractions, it’s definitely in how I specified the bins before running the simulation).
Below are the model matrices for both. The first is using the mean, the second is using the medians.
As for densities, I took 100000 random points and measured the number of galaxies within 251 pixels, and the distance to the 5th nearest neighbor. I saved all of this to a table and am finishing up prep for the simulation code to use locations from a density file.
The first plot I worked on making is the collapsed matrix plot, where we show both the profiles and the differences in the same plot. I chose to collapse the rows, but I could in theory do this for columns just as easily. An example is shown below.
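Collapsing the rows can be sketched generically in numpy (the toy matrix here stands in for the real simulation output):

```python
import numpy as np

# Toy model matrix: rows x columns of simulation bins, each cell
# holding a 1D profile of length 50
rng = np.random.default_rng(1)
matrix = rng.normal(1.0, 0.1, (4, 6, 50))

# Collapse the rows: combine each column's profiles into one representative profile
collapsed = np.nanmedian(matrix, axis=0)      # shape (6, 50)
# Differences relative to the first column's collapsed profile,
# so profiles and differences can share one plot
differences = collapsed - collapsed[0]
```

Collapsing columns instead is just `axis=1`.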
The run of local background subtraction failed on ACENET yet again, this time because I forgot to update the number of cores I wanted it to run with. Since I really want to show some results, here are the results from 3 bins, which I ran on my laptop.
I estimated a global background and plotted the results as the orange curve. The local-background-subtracted plots are in violet. The first matrix plot shows the mean isophote analysis, and the second uses the median. We can look at these in greater detail at the meeting.
I am also working on density plots. I take random points, unmasked and within u, and look at the number of galaxies with i < 24.5 within 251 pixels. The histogram for 20000 random points is included below:
The mask at its finest resolution looks like this (with 900000 points). I also include a lower resolution mask (selecting only some of the points) which I used for the following plots processed locally on my own machine.
As per Marcin’s instructions, I took a look at using the nearest neighbour method to see which densities we can trust. Some of my preliminary results (I’ll have to run the more detailed analysis on ACENET) are included below. I chose the colorbar scaling somewhat arbitrarily (so we could see the edges), but that can be adjusted.
I also have it set up so that I can get an output table. This means I can output all these results to the main table, so it will be easier to use in the simulations.
I will also have the single bin results to look at tomorrow!
Unfortunately, I do not have figures that show local background subtraction. Despite my testing it locally on my own machine (where it ran in test mode without issue), it failed on Cedar without my noticing. I’ve been doing bug fixes, but I’ve been getting put into the priority queue quite a bit, which has been severely hampering my progress. It is overwhelmingly likely an import issue that went unnoticed during local tests.
On a brighter note, I do have results both for the u-band PSFs and for the density estimates. I already sent the u-band PSF plots via email, but I will include them here. Numbers are still smaller, but looking much better than before! I also have the median in the top left corner for each band.
For the density plots, I changed the method around to ensure that it was indeed in arcseconds. This is what I get, as we saw on Thursday. I also got a density map corresponding to the number of objects within 251 pixels. There are fewer objects than before, but this was just in the interest of generating a quick update for the blog. I will have higher “resolution” for the meeting on Monday.