Impact craters are distributed randomly on Mars and many other celestial bodies. Their radius most likely follow an exponential distribution. By estimating the mean of the exponential distribution in question, selecting 100 random locations, and determining how many lie in (at least) one crater, you can determine the age of the celestial body.
This turns out to be a very interesting mathematical problem, even allowing you to compute the first few decimals of exp(-Pi) using some simple laboratory experiment, then compare it with the real value. The solution involves one of the most well known special (transcendental) mathematical functions.
The problem
The problem can be stated as follows: if you randomly throw discs of radius R — R being a random variable with density f(r) — over the two dimensional plane, at a rate of v discs per unit area on average, what proportion of the plane will be covered by these discs? Some of these discs will be overlapping. Of course, the answer depends on v and on R.
Before reading any further, I strongly encourage you to answer this question yourself, either by performing simulations, or by mathematical computations. This is a classic coverage problem in stochastic geometry.
The solution
First, let’s assume that R is constant. Then, the probability that a random location won’t be covered by at least one disc, is equal to the probability that no disc center is closer than R distance units, to this location. That is, there is a circle of radius R, centered around the location in question, not containing any center of any disc. Given the fact that disc centers follow a Poisson point process of intensity v, the probability for such an event is Exp(- v Pi R^2), as the number of expected points of a Poisson process, falling in a given area, follows an exponential distribution of mean equal to v times the area. In this case, the area (of a circle of radius R) is Pi R^2.
If R is a random variable with density f(r), then instead of one Poisson process for the disc centers, we have a superimposition of an infinite number of Poisson processes, each one corresponding to a specific value R = r. Thus, the probability for a location to not be covered by any disc center becomes INT{ Exp(- v Pi r^2) * f(r) dr }, INT denoting an integral over all possible values of R.
Thus, the proportion p of the plane covered by these discs is equal to
p = 1 – INT{ Exp(- v Pi r^2) * f(r) dr }
Interesting cases
In the case of the craters on a celestial body, one can expect R (the radius of the craters) to have an exponential distribution, and thus p – the proportion of the surface covered by these craters – can be expressed in terms of the error function (aka the bell curve). Here we assume that locally we can approximate a large portion of a planet by a plane, just like when we draw Earth maps.
Also, if R is constant and equal to (say) R = 1, and if v = 1, then p = 1 – Ex(-Pi). This allows you to easily perform Monte-Carlo simulations (throwing these discs at random and computing coverage) to get an estimate for Exp(-Pi).
Related article: what is the probability to have twin stars?
DSC Resources
- Career: Training | Books | Cheat Sheet | Apprenticeship | Certification | Salary Surveys | Jobs
- Knowledge: Research | Competitions | Webinars | Our Book | Members Only | Search DSC
- Buzz: Business News | Announcements | Events | RSS Feeds
- Misc: Top Links | Code Snippets | External Resources | Best Blogs | Subscribe | For Bloggers
Additional Reading
- What statisticians think about data scientists
- Data Science Compared to 16 Analytic Disciplines
- 10 types of data scientists
- 91 job interview questions for data scientists
- 50 Questions to Test True Data Science Knowledge
- 24 Uses of Statistical Modeling
- 21 data science systems used by Amazon to operate its business
- Top 20 Big Data Experts to Follow (Includes Scoring Algorithm)
- 5 Data Science Leaders Share their Predictions for 2016 and Beyond
- 50 Articles about Hadoop and Related Topics
- 10 Modern Statistical Concepts Discovered by Data Scientists
- Top data science keywords on DSC
- 4 easy steps to becoming a data scientist
- 22 tips for better data science
- How to detect spurious correlations, and how to find the real ones
- 17 short tutorials all data scientists should read (and practice)
- High versus low-level data science
Follow us on Twitter: @DataScienceCtrl | @AnalyticBridge