Talk:Mark and recapture

This article is of interest to the following WikiProjects:

  • WikiProject Statistics: rated Start-class on the quality scale and Low-importance on the importance scale.
  • WikiProject Ecology: rated Start-class on the quality scale and Top-importance on the importance scale.
  • WikiProject Biology: rated Start-class on the project's quality scale and Mid-importance on the project's importance scale.
  • WikiProject Mathematics: rated Start-class and Low-importance. Field: Probability and statistics.

This article has been marked as needing immediate attention.

Other methods

Jolly's method should also be mentioned.

Yes, the Cormack–Jolly–Seber method should be described here. It is of extreme importance. 24.147.119.33 17:51, 20 March 2007 (UTC) gukarma

Sources

Some sources, references, links and related topics would be nice. Would anyone with knowledge of the subject care to add them?

--Lucas Gallindo 22:52, 12 July 2007 (UTC)

Lincoln Index

Please see Talk:Lincoln Index to discuss relation of content here with that of the newer article. Melcombe (talk) 12:20, 2 December 2010 (UTC)

Merge discussion

I would like to propose that Tag and release be merged with Mark and recapture, for two main reasons:

  1. Tag and release has very little content
  2. Tag and release is just another mark and recapture method

Jamesmcmahon0 (talk) 12:04, 2 May 2013 (UTC)

  • Oppose – Tag and release can include Mark and recapture, but it can also be quite different. For example, small archival tags can be attached to marine animals such as fish. These archival tags can be equipped with a camera or with sensors that monitor and log things like salinity, temperature, depth, acceleration, and pitch and roll. They are designed to detach at a later date and float to the surface, where some method can be used to retrieve the logged data. This has nothing to do with "marking" or recapture. Tag and release is often linked with catch and release, and is a widely used term, particularly in fisheries and among recreational fishermen. --Epipelagic (talk) 04:21, 3 May 2013 (UTC)
  • Support – In school, we learned about Mark and recapture as Tag and release. I actually have never heard of mark and recapture and "tag and release" is what I always knew "mark and recapture" as. Mark and recapture is definitely known as tag and release in many textbooks and the MCPS curriculum. 173.79.218.246 (talk) 09:32, 29 May 2013 (UTC)
Also, an example of an organization that refers to the practice as "tag and release" despite not being related to fishing: http://www.thevlm.org/turtle_tag_release.aspx 173.79.218.246 (talk) 09:36, 29 May 2013 (UTC)

Statistical treatment

I was asked here to improve this article. However, my expertise is in Bayesian statistics, so any contribution of mine on Mark and Recapture might be considered original research.

Assume that K animals out of a population of unknown size N have been marked. Later, n animals are captured, of which k turn out to be marked.

N−K animals are unmarked. N−n animals are uncaptured. n−k captured animals are unmarked. K−k marked animals are uncaptured. N−K−n+k uncaptured animals are unmarked. As all these numbers are non-negative, the following inequalities result: (N ≥ K+n−k) and (n ≥ k) and (K ≥ k) and (k ≥ 0).

Knowing K, n and k, the problem is to estimate N.

Probability distribution

The conditional probability (k|N) of observing k, knowing N (and n and K), is given by the hypergeometric distribution.

(k|N)=\frac{\binom K k \binom{N-K}{n-k}}{\binom N n}
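
As a quick numerical check of the formula above, here is a minimal Python sketch using only the standard library. The function name `recapture_probability` and the population size N = 40 are illustrative assumptions, not taken from the article; exact fractions are used so that the probabilities can be seen to sum to exactly 1.

```python
from fractions import Fraction
from math import comb

def recapture_probability(k, N, K, n):
    """Hypergeometric probability (k|N): k marked animals among n captured,
    when K of the N animals in the population are marked."""
    if k < 0 or k > K or n - k < 0 or n - k > N - K:
        return Fraction(0)          # outcome violates one of the inequalities
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

# Illustrative values only: K = 11 marked, n = 10 captured, N = 40 assumed.
K, n, N = 11, 10, 40
distribution = [recapture_probability(k, N, K, n) for k in range(n + 1)]
total = sum(distribution)           # exact arithmetic, so this is exactly 1
```

The guard clause mirrors the non-negativity inequalities listed earlier in this section, so impossible outcomes get probability zero rather than raising an error.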

But we are interested in estimating N knowing k.

Credibility distribution

When there is no prior knowledge regarding N and k (that is, with a uniform prior on N), the credibility distribution (N|k) is proportional to the likelihood function (k|N).

(N|k)=\frac{(k|N)}{\sum_{N=K+n-k}^\infty (k|N)}

Inserting the expression for (k|N) and cancelling the common factor:

(N|k)=\frac{\frac{\binom{N-K}{n-k}}{\binom N n}}{\sum_{N=K+n-k}^\infty  \frac{\binom{N-K}{n-k}}{\binom N n}}

The denominator series is convergent for k ≥ 2.
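
The convergence condition can be seen from the large-N behaviour of the terms. The following is a sketch of that step, using only the symbols already defined above:

```latex
\frac{\binom{N-K}{n-k}}{\binom N n}
=\frac{n!}{(n-k)!}\,\frac{(N-K)!}{(N-K-n+k)!}\,\frac{(N-n)!}{N!}
\sim\frac{n!}{(n-k)!}\,N^{n-k}\,N^{-n}
=\frac{n!}{(n-k)!}\,N^{-k}\qquad(N\to\infty)
```

So the series behaves like the sum of N^{-k}, which converges for k ≥ 2 and diverges for k = 0 and k = 1.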

Graphs

load 'plot' NB. plotting software 
LF =: 4 : 0  NB. Likelihood Function
'K n k'=:x
(N>:K+n-k)*((n-k)!N-K)%n!N=:i.y
)
g =: [: 'dot; labels 1 0 ; pensize 4' & plot LF
  11 10 0 g 701
  11 10 1 g 501
  11 10 2 g 301
  11 10 3 g 101
  11 10 4 g 71
[Five graphs: Likelihood for Mark and Recapture, K=11, n=10, for k=0, 1, 2, 3 and 4.]

This J program created the five graphs to the right, showing the likelihood functions for the total number of animals, N, for K=11 marked animals and n=10 captured animals, and for k = 0, 1, 2, 3 and 4 recaptured marked animals.

When k=0 we did not recapture any marked animals, so the mark and recapture observation does nothing to lower the upper limit on how many animals there are. The likelihood function has no maximum, and so the maximum likelihood estimate is infinite.

When k=1 we recaptured a single marked animal, and the likelihood function has a maximum at N=110, but the median is infinite.

When k=2 we recaptured two marked animals. The maximum likelihood is at N=55, and the median is at N=137. A 95% credible interval is 19≤N≤1367, but the mean value is infinite.

When k=3 we recaptured three marked animals. The maximum likelihood is at N=36, and the median is at N=57. A 95% credible interval is 18≤N≤236. The mean value is finite but the standard deviation is infinite.

When k=4 the maximum likelihood is at N=27. The median is at N=36. The 95% credible interval is 17≤N≤96.

The frequentist formulas from the article give the estimates N ≈ Kn/k = 27.5 and N ≈ (K+1)(n+1)/(k+1)−1 = 25.4.

  b=.6000 LF~a=.11 10 4
  +/0>:2-/\b
27
  *`%/a
27.5
  *`%/&.:>:a
25.4
  +/0 0.95 0.5<:/~(%{:)+/\b
17 96 36
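
For readers who do not use J, the session above can be approximated in Python. This is only a sketch under stated assumptions: the names `likelihood`, `quantile` and `N_MAX` are mine, the sum is truncated at N < 6000 as in the J line `6000 LF~a`, and the lower bound of the interval is taken to be the smallest N with non-zero likelihood, which is how I read the J expression.

```python
from math import comb

def likelihood(N, K, n, k):
    """Likelihood of N up to a constant factor: C(N-K, n-k) / C(N, n)."""
    if N < K + n - k:
        return 0.0                  # fewer animals than were actually seen
    return comb(N - K, n - k) / comb(N, n)

K, n, k = 11, 10, 4
N_MAX = 6000                        # truncation point, as in the J session
values = [likelihood(N, K, n, k) for N in range(N_MAX)]

# Maximum likelihood estimate: the N with the largest likelihood.
mle = max(range(N_MAX), key=values.__getitem__)

# Frequentist point estimates quoted in the text.
lincoln_petersen = K * n / k
chapman = (K + 1) * (n + 1) / (k + 1) - 1

def quantile(p):
    """Smallest N whose normalised cumulative likelihood exceeds p."""
    total = sum(values)
    running = 0.0
    for N, value in enumerate(values):
        running += value
        if running / total > p:
            return N
    return None                     # not reached for p < 1

lowest, median, upper = quantile(0.0), quantile(0.5), quantile(0.95)
```

With K=11, n=10, k=4 this should agree with the J output above: a maximum at N=27, the estimates 27.5 and 25.4, and the triple 17, 96, 36 (here held in `lowest`, `upper` and `median`).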

Bo Jacoby (talk) 06:14, 6 March 2014 (UTC).

Order of magnitude and statistical uncertainty

Knowing the credibility distribution function, (N|k), one can compute the order of magnitude, μ, and the statistical uncertainty, σ, of the unknown number N.

N\approx \mu \pm \sigma

where

\mu =\sum_{N=K+n-k}^\infty  (N|k)N
\sigma^2+\mu^2 =\sum_{N=K+n-k}^\infty (N|k)N^2
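
The two sums can be approximated numerically by cutting them off at a large N. The sketch below does this in Python; the function name `truncated_moments` and the cut-off `n_max` are assumptions made for illustration. The common factor of the likelihood cancels in the normalisation, so only the ratio of binomial coefficients is needed.

```python
from math import comb, sqrt

def truncated_moments(K, n, k, n_max=20000):
    """Approximate mu and sigma by cutting the infinite sums off at n_max.

    n_max is an arbitrary cut-off chosen for this sketch. The dropped tail
    shrinks only like a power of n_max, so small k needs a large n_max.
    """
    start = K + n - k                       # smallest N consistent with the data
    sizes = range(start, n_max)
    weights = [comb(N - K, n - k) / comb(N, n) for N in sizes]
    total = sum(weights)                    # normalising constant of (N|k)
    mean = sum(N * w for N, w in zip(sizes, weights)) / total
    second = sum(N * N * w for N, w in zip(sizes, weights)) / total
    return mean, sqrt(second - mean ** 2)

mu, sigma = truncated_moments(11, 10, 6)
```

For K=11, n=10, k=6 the result is close to 22.5 and 7.5, the values given in the Example section. For k=4 the second sum converges slowly, so a truncated value of σ comes out noticeably too small unless `n_max` is made much larger.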

Bo Jacoby (talk) 14:10, 27 February 2014 (UTC).


Summation

A closed form for the above sums can be found using Gosper's algorithm. However, WolframAlpha does not immediately do it [1], but the following detour does the trick.

Define the sums

S_Q = {\sum_{N=K+n-k}^\infty \frac{\binom{N-K}{n-k}\binom{N}{Q}}{\binom{N}{n}}} for Q = 0, 1, 2

so

\mu=\frac{S_1}{S_0}

and

\frac{\sigma^2}{\mu}+\mu-1= 2\frac{S_2}{S_1}
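
The second relation may not be obvious at first sight. A sketch of the intermediate steps, using the definitions of S_Q, μ and σ given above and the fact that \binom{N}{2} = N(N-1)/2:

```latex
\frac{S_Q}{S_0}=\sum_{N=K+n-k}^\infty (N|k)\binom{N}{Q}

\frac{S_1}{S_0}=\sum_{N=K+n-k}^\infty (N|k)\,N=\mu

2\frac{S_2}{S_0}=\sum_{N=K+n-k}^\infty (N|k)\,N(N-1)=\sigma^2+\mu^2-\mu

2\frac{S_2}{S_1}=\frac{\sigma^2+\mu^2-\mu}{\mu}=\frac{\sigma^2}{\mu}+\mu-1
```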

The sums are evaluated [2]

\sum _{N=K+n-k}^{m-2} \frac{\binom{N-K}{n-k}\binom{N}{Q}}{\binom{N}{n}}=
\frac{\binom{n+K-k}{Q}}{\binom{n+K-k}{n}}A_Q
-\frac{\binom{m-1}{Q}}{\binom{m-1}{n}}\binom{m-K-1}{n-k}B_Q

where

A_Q=\,_2F_1(1+K-k,1+n-k;1+K+n-k-Q;1)

and

B_Q=\,_3F_2(1,m-K,m-n;m-Q,m-(K+n-k);1)

are generalized hypergeometric functions.

The limiting case for m → ∞ is

S_Q=\frac{\binom{K+n-k}{Q}}{\binom{K+n-k}{n}}A_Q

and so

Q \frac{S_Q}{S_{Q-1}}=(K+n-k-Q+1){A_Q\over A_{Q-1}}

Gauss's theorem

_2F_1 (a,b;c;1)=\frac{(c-1)!}{(c-a-1)!} \frac{(c-a-b-1)!}{(c-b-1)!}

gives the simplification

A_Q=\frac{(K+n-k-Q)!}{(K-Q-1)!}\frac{(k-Q-2)!}{(n-Q-1)!}

so

\frac{A_Q}{A_{Q-1}}=\frac{K-Q}{K+n-k-Q+1}\frac{n-Q}{k-Q-1}

and

Q \frac{S_Q}{S_{Q-1}}=\frac{K-Q}1\frac{n-Q}{k-Q-1}

So μ and σ are given by

\mu=\frac{K-1}1\frac{n-1}{k-2} for k≥3

and

\frac{\sigma^2}{\mu}+\mu-1=\frac{K-2}1\frac{n-2}{k-3} for k≥4.

and the final result is [3]

N\approx \frac{K-1}1\frac{n-1}{k-2}\pm\sqrt{\frac{K-1}1\frac{n-1}{k-2}\frac{K-k+1}{k-2}\frac{n-k+1}{k-3}}
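
The step from the two expressions for μ and σ²/μ+μ−1 to this final result can be sketched as follows (valid for k ≥ 4, where both expressions are finite):

```latex
\sigma^2=\mu\left(\frac{(K-2)(n-2)}{k-3}-\mu+1\right)

\frac{(K-2)(n-2)}{k-3}-\frac{(K-1)(n-1)}{k-2}+1
=\frac{(K-k+1)(n-k+1)}{(k-2)(k-3)}

\sigma^2=\frac{(K-1)(n-1)}{k-2}\cdot\frac{(K-k+1)(n-k+1)}{(k-2)(k-3)}
```

As a check with K=11, n=10, k=4: μ = 10·9/2 = 45 and σ² = 45·(8·7)/(2·1) = 1260, so σ ≈ 35.5.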

Bo Jacoby (talk) 09:51, 25 July 2014 (UTC).

Example

   ]a=.11 10 4, 11 10 5,: 11 10 6
11 10 4
11 10 5
11 10 6
   MR=.[:({.,.[:%:{.*[:>:-~/)[:(*`%`:3"1)0 1-~/1 1 2-~"1/]
   MR a
  45 35.4965
  30 14.4914
22.5     7.5

This calculation for K = 11 and n = 10 shows that if k = 4 then N ≈ 45 ± 35.5. If k=5 then N ≈ 30 ± 14.5. If k=6 then N ≈ 22.5 ± 7.5.
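
The same calculation can be written out in Python for readers who find the J verb hard to follow. This is a sketch of the closed-form result from the Summation section, not a translation of the J code; the function name `posterior_mean_sd` is an assumption.

```python
from math import sqrt

def posterior_mean_sd(K, n, k):
    """Closed-form mean and standard deviation of N.

    The mean needs k >= 3 and the standard deviation needs k >= 4;
    this sketch simply refuses anything below k = 4.
    """
    if k < 4:
        raise ValueError("the standard deviation is finite only for k >= 4")
    mean = (K - 1) * (n - 1) / (k - 2)
    variance = mean * (K - k + 1) * (n - k + 1) / ((k - 2) * (k - 3))
    return mean, sqrt(variance)

# Same inputs as the J example: K = 11, n = 10 and k = 4, 5, 6.
results = {k: posterior_mean_sd(11, 10, k) for k in (4, 5, 6)}
```

Here `results[4]`, `results[5]` and `results[6]` correspond to the three rows of the J output above.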

Bo Jacoby (talk) 20:07, 8 July 2014 (UTC).


Thanks for this. I think I mostly followed what you have here; the material I've been looking at uses frequentist MLE approximations, so seeing it done from a Bayesian perspective is very interesting. Have you seen any references for this, or is it purely your own work? Jamesmcmahon0 (talk) 11:18, 4 March 2014 (UTC)