A Jackknife Variance Estimator for Unequal Probability Sampling

Summary

The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.

Keywords: Inclusion probabilities; Linearization; Pseudovalues; Smooth function of means; Stratification

1. Introduction

Jackknife methods are widely used for standard error estimation in sample surveys (e.g. Wolter (1985) and Shao and Tu (1995)). Tukey's (1958) original idea of jackknife variance estimation has been developed to handle stratified multistage sampling by Lee (1973), Jones (1974), Kish and Frankel (1974) and Krewski and Rao (1981), among others, and the properties of various forms of the jackknife estimator for this case have been studied both theoretically and empirically (e.g. Krewski and Rao (1981), Rao and Wu (1985), Kovar et al. (1988), Rao et al. (1992) and Shao and Tu (1995)). The restriction of the jackknife method to stratified multistage designs constrains its applicability compared, for example, with linearization estimators, which have been defined for any unequal probability sampling design without replacement (Särndal et al. (1992), section 5.5). In this paper we address this constraint by proposing, in Section 3, a jackknife variance estimator, which is applicable for the same general class of sampling designs.

Our approach is based on the analogy between the jackknife and linearization methods, in which the analytic derivative in linearization is replaced by a numerical approximation (Davison and Hinkley (1997), page 50). The estimator that is proposed is a jackknife analogue of a standard linearization variance estimator for unequal probability designs. The same estimator was effectively also proposed by Campbell (1980) in an impressively general paper, which seems unfortunately to have received little attention in the subsequent survey sampling literature. This paper goes beyond Campbell (1980) by investigating the properties of this estimator both theoretically and numerically.

The class of point estimators, for which the variance estimator proposed is defined, is set out in Section 2. We demonstrate in Section 4 that the estimator is consistent for the same asymptotic variance as the linearization estimator. We support this result with a small simulation study in Section 6 comparing the sampling properties of our estimator with three existing jackknife variance estimators that are described in Section 5.

2. The class of point estimators

Before considering variance estimation, it is necessary to define the point estimator, the variance of which is to be estimated. We consider a finite population 𝒰={1,…,i,…,N} containing N units and suppose that values yqi, q=1,…,Q, for Q survey variables are associated with the unit that is labelled i. We assume that a sample 𝒮⊂𝒰 is selected according to a probability sampling design and that there is no non-response.

We motivate the class of point estimators by first defining a class of population parameters θ of interest. We assume that this parameter can be expressed as a function of means, $\theta = g(\mu_1,\ldots,\mu_Q)$, where g(·) is a smooth function (see Appendix A) from $\mathbb{R}^Q$ to $\mathbb{R}$ and $\mu_q$ is the finite population mean, $\mu_q = N^{-1}\sum_{i\in\mathcal{U}} y_{qi}$. This definition of θ includes most parameters of interest arising in common survey applications, such as ratios, subpopulation means and correlation and regression coefficients. We assume that θ is a scalar for simplicity although the approach could be generalized to multivariate θ.

We now define the point estimator $\hat\theta$ as the substitution estimator $\hat\theta = g(\hat\mu_1,\ldots,\hat\mu_Q)$, where

$$\hat\mu_q = \sum_{i\in\mathcal{S}} w_i\, y_{qi}$$

is the Hájek (1981) ratio estimator of $\mu_q$, the weight $w_i$ is given by

$$w_i = \frac{1}{\hat{N}\pi_i}, \tag{1}$$

$\hat{N} = \sum_{i\in\mathcal{S}} \pi_i^{-1}$ is an unbiased estimator of N and $\pi_i$ denotes the first-order inclusion probability of unit i. Many parameters of interest in surveys, e.g. ratios and correlation coefficients, are invariant to multiplication of each $\mu_q$ in $g(\mu_1,\ldots,\mu_Q)$ by a common constant; in such cases the specification of $\hat{N}$ in equation (1) is arbitrary and $\hat\theta$ could be viewed alternatively as a function of estimated totals.
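As an illustration, the Hájek weights (1) and the substitution estimator can be sketched as follows. This is a minimal NumPy sketch with invented data; the population, the Poisson design used for the draw, the choice g = ratio and all variable names are ours, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population of N units with Q = 2 survey variables and
# unequal first-order inclusion probabilities pi_i (all values invented).
N = 500
y = rng.uniform(1.0, 5.0, size=(N, 2))
pi = rng.uniform(0.05, 0.30, size=N)

# Draw one Poisson sample purely for illustration; any without-replacement
# design with known pi_i would do.
S = np.flatnonzero(rng.uniform(size=N) < pi)

# Hajek weights (1): w_i = 1 / (N_hat * pi_i) with N_hat = sum_S 1/pi_i,
# so the weights sum to one over the sample.
N_hat = np.sum(1.0 / pi[S])
w = 1.0 / (N_hat * pi[S])

mu_hat = w @ y[S]                    # Hajek estimates of (mu_1, mu_2)
theta_hat = mu_hat[0] / mu_hat[1]    # substitution estimator for g = ratio
```

By construction the Hájek weights sum to one, which is what makes $\hat\theta$ invariant to a common rescaling of the means.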

3. The proposed jackknife variance estimator

We adopt a design-based approach and consider the estimation of the variance of θ^ with respect to the sampling design. We propose to estimate this variance by

$$\widehat{\mathrm{var}}(\hat\theta) = \sum_{i\in\mathcal{S}}\sum_{j\in\mathcal{S}} \frac{\pi_{ij}-\pi_i\pi_j}{\pi_{ij}}\,\varepsilon(i)\,\varepsilon(j), \tag{2}$$

where $\pi_{ij}$ denotes the probability that both units i and j are selected,

$$\varepsilon(i) = (1-w_i)\{\hat\theta - \hat\theta_{(i)}\}, \tag{3}$$

$\hat\theta_{(j)} = g(\hat\mu_{1(j)},\ldots,\hat\mu_{Q(j)})$, $\hat\mu_{q(j)} = \sum_{i\in\mathcal{S}_j}\pi_i^{-1} y_{qi}/\hat{N}_{(j)}$, $\hat{N}_{(j)} = \sum_{i\in\mathcal{S}_j}\pi_i^{-1}$, $\mathcal{S}_j$ consists of $\mathcal{S}$ with the jth unit deleted and n is the size of the sample $\mathcal{S}$.

The estimator in equation (2) takes the form of the variance estimator of Horvitz and Thompson (1952) for the sample sum of empirical influence values (Davison and Hinkley (1997), chapter 2), where these empirical influence values are numerically approximated by the jackknife pseudovalues. This is analogous to the linearization variance estimator (Särndal et al. (1992), page 175) which takes the same form but with the empirical influence values obtained by analytic differentiation. This perspective was first set out by Campbell (1980), who noted how both these estimators could be constructed but did not evaluate their properties in detail.

The factor $1-w_i$ is a correction for unequal $\pi_i$, reducing the contribution of observations which have higher $\pi_i$-values and thus make smaller contributions to the variance. The inclusion of this factor ensures that equation (2) reduces to the usual linearization variance estimator (Särndal et al. (1992), page 182) when $\hat\theta$ is the Hájek estimator $\hat\mu_1$, say, in which case $\varepsilon(i)$ reduces to $w_i(y_{1i}-\hat\mu_1)$. The $(1-w_i)$-correction was suggested by Campbell (1980), who noted an algebraic equivalence with the weighted jackknife method of Hinkley (1977).
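To make the construction concrete, here is a minimal NumPy sketch of estimator (2) with pseudovalues (3); the function and variable names are ours, not from the paper. For the Hájek mean itself the pseudovalue equals $w_i(y_{1i}-\hat\mu_1)$ exactly, so (2) coincides with the linearization estimator, and the example checks this under simple random sampling without replacement:

```python
import numpy as np

def jackknife_variance(y_s, pi_s, pi_ij, g):
    """Estimator (2): sum_i sum_j [(pi_ij - pi_i pi_j)/pi_ij] eps_i eps_j,
    with pseudovalues eps_i = (1 - w_i)(theta_hat - theta_hat_(i)) as in (3)."""
    n = len(pi_s)
    a = 1.0 / pi_s
    w = a / a.sum()                       # Hajek weights (1)
    theta = g(w @ y_s)
    eps = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        w_del = a[keep] / a[keep].sum()   # weights after deleting unit i
        eps[i] = (1.0 - w[i]) * (theta - g(w_del @ y_s[keep]))
    D = (pi_ij - np.outer(pi_s, pi_s)) / pi_ij
    return eps @ D @ eps

# Check: for theta_hat = mu_hat_1 (g the identity on one variable) under
# SRSWOR, (2) coincides with the linearization variance estimator.
n, N = 5, 20
y_s = np.array([[2.0], [3.5], [1.0], [4.2], [2.8]])   # toy sample values
pi_s = np.full(n, n / N)
pi_ij = np.full((n, n), n * (n - 1) / (N * (N - 1)))
np.fill_diagonal(pi_ij, pi_s)

v_jack = jackknife_variance(y_s, pi_s, pi_ij, lambda m: m[0])

# Linearization estimator: same quadratic form in w_i (y_i - mu_hat).
w = np.full(n, 1.0 / n)
e = w * (y_s[:, 0] - y_s[:, 0].mean())
D = (pi_ij - np.outer(pi_s, pi_s)) / pi_ij
v_lin = e @ D @ e
```

The agreement of `v_jack` and `v_lin` in this linear case is exact; for nonlinear g the two differ by the remainder terms analysed in Appendix A.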

4. Consistency

In this section we consider the design consistency of the variance estimator proposed. Building on the analogy between linearization and jackknife variance estimation, we follow the approach of Särndal et al. (1992), who treated the linearization variance estimator under an unequal probability design as an estimator of an approximate linearized variance and then referred to other evidence that this approximate variance agrees well with the actual variance in large samples (Särndal et al. (1992), page 175). The approximate linearized variance (Robinson and Särndal, 1983), $\mathrm{var}(\hat\theta)_L$, in our case (using expressions (5.5.10) and (5.7.4) in Särndal et al. (1992)) is given by

$$\mathrm{var}(\hat\theta)_L = \nabla(\mu)^{\mathrm T}\,\Sigma\,\nabla(\mu), \tag{4}$$

where

$$\Sigma = \frac{1}{N^2}\sum_{i\in\mathcal{U}}\sum_{j\in\mathcal{U}} \frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,(y_i-\mu)(y_j-\mu)^{\mathrm T}, \qquad \nabla(x) = \Bigl(\frac{\partial g(\mu)}{\partial\mu_1},\ldots,\frac{\partial g(\mu)}{\partial\mu_Q}\Bigr)^{\mathrm T}\Big|_{\mu=x},$$

$y_i = (y_{1i},\ldots,y_{Qi})^{\mathrm T}$, $\nabla(x)$ denotes the gradient of g(·) at $x\in\mathbb{R}^Q$ and it is assumed that g(·) is continuous and differentiable at $\mu = (\mu_1,\ldots,\mu_Q)^{\mathrm T}$.

To demonstrate the consistency of the proposed variance estimator for the approximate linearized variance, we first define our asymptotic framework. Let {𝒮t} be a sequence of samples selected from the sequence of nested finite populations {𝒰t} of sizes Nt by a sequence of sampling designs, such that 𝒮t is composed of a fixed number nt of distinct elements selected from 𝒰t (nt<Nt) for t=1,2,…. For simplicity of notation, the index t will be suppressed in what follows and all limiting processes will be understood to be as t→∞. We shall denote by $\to_p$ and $\to_D$ respectively convergence in probability and convergence in distribution as t→∞.

Theorem 1

Provided that the linearization variance estimator (11) is design consistent and under regularity assumptions that are given in Appendix A, the proposed variance estimator (2) is also design consistent, i.e.

$$\widehat{\mathrm{var}}(\hat\theta)/\mathrm{var}(\hat\theta)_L \to_p 1. \tag{5}$$

The proof of theorem 1 is given in Appendix A.

It follows as a corollary of theorem 1 that if

$$\{\hat\theta - \theta\}\,\mathrm{var}(\hat\theta)_L^{-1/2} \to_D N(0,1), \tag{6}$$

i.e. if appropriate conditions hold for the linearization variance estimator to generate asymptotically valid confidence intervals, then by Slutsky's lemma

$$\{\hat\theta - \theta\}\,\widehat{\mathrm{var}}(\hat\theta)^{-1/2} \to_D N(0,1).$$

Confidence intervals based on $\widehat{\mathrm{var}}(\hat\theta)$ will then be asymptotically valid.

The key requirement for condition (6) to hold is that the Horvitz–Thompson estimators underlying the definition of θ^ are asymptotically normal. Sufficient conditions for asymptotic normality have been investigated to a limited extent in the survey sampling literature, but some examples of conditions are given by Hájek (1964) and Rosén (1972).

5. Alternative jackknife variance estimators

For comparison with the variance estimator proposed, we now consider some alternative jackknife estimators that have been proposed in the literature. The standard jackknife variance estimator of θ^ (Tukey, 1958) is defined by

$$\widehat{\mathrm{var}}(\hat\theta)_J = \frac{n-1}{n}\sum_{i\in\mathcal{S}} (\hat\theta_{(i)} - \bar\theta)^2, \tag{7}$$

where $\bar\theta = n^{-1}\sum_{i\in\mathcal{S}}\hat\theta_{(i)}$. If we ignore the finite population correction and if we assume that the sample is selected with simple random sampling without replacement, equation (2) reduces to equation (7). The variance estimator in equation (7) has been shown to be consistent for independent and identically distributed observations (e.g. Shao (1989, 1993) and Shao and Tu (1995)).
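For the sample mean, estimator (7) collapses to the familiar s²/n (with no finite population correction), which the following sketch with toy data of ours verifies:

```python
import numpy as np

def jackknife_standard(theta_deleted):
    """Estimator (7): (n-1)/n * sum_i (theta_(i) - theta_bar)^2."""
    t = np.asarray(theta_deleted, dtype=float)
    n = len(t)
    return (n - 1) / n * np.sum((t - t.mean()) ** 2)

# Delete-one means of a toy sample; for the mean, (7) equals s^2 / n,
# where s^2 is the usual sample variance with denominator n - 1.
y = np.array([2.0, 3.5, 1.0, 4.2, 2.8])
n = len(y)
theta_del = np.array([np.delete(y, i).mean() for i in range(n)])
v_jack = jackknife_standard(theta_del)
```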

For the case of stratified simple random sampling without replacement, Lee (1973) (see also Kish and Frankel (1974)) proposed the variance estimator

$$\widehat{\mathrm{var}}(\hat\theta)_{ST} = \sum_{h=1}^{H}\frac{n_h-1}{n_h}\sum_{i\in\mathcal{S}_h} (\hat\theta_{(i)} - \hat\theta)^2, \tag{8}$$

where $\mathcal{S}_h$ is the sample of size $n_h$ in the hth stratum $\mathcal{U}_h$. For comparison, equation (2) reduces under this design to

$$\widehat{\mathrm{var}}(\hat\theta) = \sum_{h=1}^{H}\Bigl(1-\frac{n_h}{N_h}\Bigr)\frac{n_h-1}{n_h}\sum_{i\in\mathcal{S}_h} (\hat\theta_{(i)} - \bar\theta_h)^2, \tag{9}$$

where $\bar\theta_h = n_h^{-1}\sum_{i\in\mathcal{S}_h}\hat\theta_{(i)}$. Ignoring the finite population correction, equation (9) is the jackknife estimator that was proposed by Jones (1974). Thus, when $\hat\theta \approx \bar\theta_h$ and the finite population correction is negligible, equation (8) is close to equation (9). It is worth noting that equation (9) naturally includes a finite population correction which is absent in equation (8).

Rao et al. (1992) described a customary 'delete cluster' jackknife variance estimator for a general weighted point estimator in stratified multistage designs. For the case when the clusters are single units and the weights are the Horvitz–Thompson weights $\pi_j^{-1}$, their estimator reduces to

$$\widehat{\mathrm{var}}(\hat\theta)_R = \sum_{h=1}^{H}\frac{n_h-1}{n_h}\sum_{i\in\mathcal{S}_h} (\hat\theta^*_{(i)} - \hat\theta)^2, \tag{10}$$

where $\hat\theta^*_{(i)}$ is computed by omitting unit i ∈ 𝒮h and by modifying the weights so that $\pi_j^{-1}$ is replaced by $n_h\,\pi_j^{-1}/(n_h-1)$ for the remaining j ∈ 𝒮h and the weight stays unaltered for all other j.
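The weight modification behind $\hat\theta^*_{(i)}$ can be sketched as follows (a hypothetical helper of ours, not code from the paper). With equal inclusion probabilities within a stratum, the $n_h/(n_h-1)$ rescaling preserves the stratum's total weight after the deletion:

```python
import numpy as np

def rao_deleted_weights(pi, strata, i):
    """Weights behind theta*_(i) in (10): drop unit i (in stratum h) and
    replace pi_j^{-1} by n_h pi_j^{-1} / (n_h - 1) for the other sampled
    units j of stratum h; weights in other strata stay unaltered."""
    w = 1.0 / np.asarray(pi, dtype=float)
    in_h = strata == strata[i]         # units sharing unit i's stratum
    n_h = in_h.sum()
    w[in_h] *= n_h / (n_h - 1.0)
    w[i] = 0.0                         # unit i is omitted
    return w

# Two strata with equal probabilities within each (toy design of ours).
pi = np.array([0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25])
strata = np.array([0, 0, 0, 0, 1, 1, 1, 1])
w_star = rao_deleted_weights(pi, strata, 0)
```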

6. Monte Carlo study

In this section, the proposed variance estimator (2) is compared numerically with the alternative jackknife estimators (7), (8) and (10). We use a population frame given in Valliant et al. (2000), appendix B, and available at the John Wiley World Wide Web site ftp://ftp.wiley.com/public/sci_tech_med/finite_populations. This population frame is extracted from the September 1976 Current Population Survey in the USA. We duplicate this population frame five times to create an artificial population of N=2390 individuals from which samples will be selected. This population is stratified into H=3 strata. The variables that are of interest are the number of hours worked per week ($y_{1i}$) and the weekly wages ($y_{2i}$). The population parameter that is considered is the finite population correlation coefficient between these two variables, $\rho = \sigma_{12}(\sigma_1^2\sigma_2^2)^{-1/2}$, where $\sigma_{12} = \sum_{i\in\mathcal{U}}(y_{1i}-\mu_1)(y_{2i}-\mu_2)$ and $\sigma_k^2 = \sum_{i\in\mathcal{U}}(y_{ki}-\mu_k)^2$ (k=1,2). The population value is ρ=0.49. We propose to estimate ρ by

$$\hat\rho = \hat\sigma_{12}(\hat\sigma_1^2\hat\sigma_2^2)^{-1/2},$$

where $\hat\sigma_{12} = \sum_{i\in\mathcal{S}} w_i(y_{1i}-\hat\mu_1)(y_{2i}-\hat\mu_2)$ and $\hat\sigma_k^2 = \sum_{i\in\mathcal{S}} w_i(y_{ki}-\hat\mu_k)^2$ (k=1,2).

We consider a stratified sampling design with proportional allocation with at least two units selected per stratum, using the Chao (1982) sampling design for selection within each stratum. The πi are proportional to a skewed size variable correlated with the y2i, with a correlation coefficient of 0.83. The size variable has a coefficient of variation of 1.22, a Fisher coefficient of skewness of 3.13 and a kurtosis of 14.7. The πij are computed exactly by using an expression given by Chao (1982).

For each simulation, 10 000 samples were selected to compute the empirical relative bias

$$\mathrm{RB}(\%) = 100\,\frac{\mathrm{bias}\{\widehat{\mathrm{var}}(\hat\rho)\}}{\mathrm{var}(\hat\rho)},$$

where $\mathrm{bias}\{\widehat{\mathrm{var}}(\hat\rho)\} = E\{\widehat{\mathrm{var}}(\hat\rho)\} - \mathrm{var}(\hat\rho)$, and the empirical relative root-mean-square error

$$\mathrm{RRMSE}(\%) = 100\,\frac{\mathrm{MSE}\{\widehat{\mathrm{var}}(\hat\rho)\}^{1/2}}{\mathrm{var}(\hat\rho)}$$

of equations (2), (7), (8) and (10). The variance $\mathrm{var}(\hat\rho)$ is the empirical variance of the 10 000 observed values of $\hat\rho$.
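The two Monte Carlo criteria can be computed from the replicated variance estimates as follows (a small helper of ours, shown with toy numbers rather than the paper's simulation output):

```python
import numpy as np

def rb_rrmse(var_hats, true_var):
    """Empirical relative bias and relative root-mean-square error (both in %)
    of a variance estimator over Monte Carlo replicates."""
    v = np.asarray(var_hats, dtype=float)
    rb = 100.0 * (v.mean() - true_var) / true_var
    rrmse = 100.0 * np.sqrt(np.mean((v - true_var) ** 2)) / true_var
    return rb, rrmse

# Toy replicates with mean 1.0: RB = 0 and RRMSE = 100 * sqrt(0.04) = 20.
rb, rrmse = rb_rrmse([1.2, 0.8], 1.0)
```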

The relative bias for the various estimators is given in Table 1 for several sampling fractions f=n/N. The second column gives the relative bias of ρ^, RB(ρ^). Estimators (7), (8) and (10) seriously overestimate the variance. For all the sampling fractions that were considered, the proposed estimator (2) has negligible bias. Table 2 gives RRMSE for equations (2), (7), (8) and (10). We see that the proposed estimator (2) has the smallest RRMSE for almost every value of f.

Table 1

Relative bias (%) with and without finite population correction (FPC)

f      RB(ρ̂)   Eq. (2)   Eq. (7)   Eq. (7)+FPC   Eq. (8)   Eq. (8)+FPC   Eq. (10)   Eq. (10)+FPC
0.03   −6.16     1.18     20.18        16.56       16.86       13.34        18.66        15.09
0.05   −4.30    −1.08     12.85         7.23       11.05        5.52        12.22         6.63
0.07   −2.76    −2.34      9.33         1.65        8.12        0.52         8.99         1.33
0.10   −2.08     0.43     11.39         0.25       10.53       −0.52        11.20         0.08
0.12   −1.93    −0.01     10.58        −2.69        9.88       −3.31        10.46        −2.81
0.15   −1.30     1.70     12.69        −4.24       12.11       −4.73        12.60        −4.31
0.20   −0.88     0.77     12.96        −9.63       12.53       −9.98        12.91        −9.67
0.40   −0.45    −1.16     22.68       −26.39       22.44      −26.53       22.66       −26.40


Table 2

Relative root-mean-square error (%) with and without finite population correction (FPC)

f      Eq. (2)   Eq. (7)   Eq. (7)+FPC   Eq. (8)   Eq. (8)+FPC   Eq. (10)   Eq. (10)+FPC
0.03    91.13    126.78      122.52      123.71      119.61      124.46       120.29
0.05    74.95     97.67       92.28       96.44       91.20       96.86        91.55
0.07    66.67     81.56       75.34       80.84       74.78       81.10        74.95
0.10    59.25     71.24       63.29       70.74       62.96       71.00        63.10
0.12    55.35     64.88       56.39       64.50       56.18       64.74        56.29
0.15    50.08     58.15       48.41       57.83       48.29       58.03        48.33
0.20    43.24     50.36       40.11       50.13       40.09       50.30        40.07
0.40    28.67     40.17       33.05       40.00       33.15       40.14        33.05


To see whether the difference between the bias of equations (2), (7), (8) and (10) is due to the finite population correction, we have multiplied the variance estimators (7), (8) and (10) by 1−f. The RB- and the RRMSE-values are given in the columns that are headed by ‘with FPC’ in Tables 1 and 2. We see that, for large sampling fractions, this correction tends to lead to underestimation of the variance. For small sampling fractions, the finite population correction cannot eliminate the large positive bias. This may be caused by the skewness of the πi and the small sample size.

7. Discussion

The jackknife variance estimator that is proposed in equation (2) is applicable to general unequal probability designs and is design consistent in circumstances where the linearization variance estimator is consistent. A Monte Carlo study shows that the estimator proposed can demonstrate clear improvements compared with existing jackknife estimators. It naturally includes a finite population correction which is usually absent in the standard jackknife methods and may be of particular use for surveys with large sampling fractions.

The jackknife method proposed may be extended in various ways. Point estimators, such as calibration estimators (e.g. Deville and Särndal (1992)), which employ auxiliary population information may often be expressible as functions of means if the function g(·) may be specified in terms of this auxiliary finite population information. The method may in principle be extended to other point estimators which may be expressed as differentiable functionals (Hampel, 1974; Campbell, 1980), although it is well known that the consistency result will not extend to all non-smooth functions of means, such as quantiles.

The practical advantage of the method proposed is its breadth of applicability. A potential disadvantage is that it is constructed by deleting one sample element at a time in contrast with the usual deletion of clusters and this may lead to a major increase in computation. Furthermore, the method assumes that joint inclusion probabilities πij for sample units are available. If not, then various approximations to these joint inclusion probabilities may be used (e.g. Hájek (1964) and Berger (1998)). Multistage sampling with unequal probability sampling without replacement at each stage merits particular further research. The application of the method proposed when the first- and second-order inclusion probabilities are available for each stage of sampling and the potential use of equation (2) at each stage could be considered and compared with standard jackknife methods which delete primary sampling units.
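When exact $\pi_{ij}$ are unavailable, one common approximation of the kind alluded to above is the Hájek-type formula $\pi_{ij} \approx \pi_i\pi_j\{1-(1-\pi_i)(1-\pi_j)/d\}$ with $d=\sum_{k\in\mathcal{U}}\pi_k(1-\pi_k)$, intended for high-entropy fixed-size designs. The sketch below is our implementation of that formula, not code from the paper; under simple random sampling without replacement it is close to the exact joint probability:

```python
import numpy as np

def hajek_pi_ij_approx(pi):
    """Hajek-type approximation pi_ij ~ pi_i pi_j {1 - (1-pi_i)(1-pi_j)/d},
    d = sum_k pi_k (1 - pi_k), a rough stand-in when exact joint inclusion
    probabilities are unavailable."""
    pi = np.asarray(pi, dtype=float)
    d = np.sum(pi * (1.0 - pi))
    pij = np.outer(pi, pi) * (1.0 - np.outer(1.0 - pi, 1.0 - pi) / d)
    np.fill_diagonal(pij, pi)          # pi_ii = pi_i by convention
    return pij

# Under SRSWOR (pi_i = n/N for every population unit) the approximation
# should be close to the exact value n(n-1) / {N(N-1)}.
n, N = 10, 100
pij = hajek_pi_ij_approx(np.full(N, n / N))
exact = n * (n - 1) / (N * (N - 1))
```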

Acknowledgements

The authors are grateful to J. N. K. Rao (Carleton University, Canada) and to two referees for helpful comments.

References

Berger, Y. G. (1998) Rate of convergence to asymptotic variance for the Horvitz–Thompson estimator. J. Statist. Planng Inf., 74, 149–168.

Campbell, C. (1980) A different view of finite population estimation. Proc. Surv. Res. Meth. Sect. Am. Statist. Ass., 319–324.

Chao, M. T. (1982) A general purpose unequal probability sampling plan. Biometrika, 69, 653–656.

Davison, A. C. and Hinkley, D. V. (1997) Bootstrap Methods and their Application. Cambridge: Cambridge University Press.

Deville, J. C. and Särndal, C. E. (1992) Calibration estimators in survey sampling. J. Am. Statist. Ass., 87, 376–382.

Hájek, J. (1964) Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann. Math. Statist., 35, 1491–1523.

Hájek, J. (1981) Sampling from a Finite Population. New York: Dekker.

Hampel, F. R. (1974) The influence curve and its role in robust estimation. J. Am. Statist. Ass., 69, 383–393.

Harville, D. A. (1997) Matrix Algebra from a Statistician's Perspective. New York: Springer.

Hinkley, D. V. (1977) Jackknifing in unbalanced situations. Technometrics, 19, 285–292.

Horvitz, D. G. and Thompson, D. J. (1952) A generalization of sampling without replacement from a finite universe. J. Am. Statist. Ass., 47, 663–685.

Isaki, C. T. and Fuller, W. A. (1982) Survey design under the regression superpopulation model. J. Am. Statist. Ass., 77, 89–96.

Jones, H. L. (1974) Jackknife estimation of functions of stratum means. Biometrika, 61, 343–348.

Kish, L. and Frankel, M. R. (1974) Inference from complex samples (with discussion). J. R. Statist. Soc. B, 36, 1–37.

Kovar, J. G., Rao, J. N. K. and Wu, C. F. J. (1988) Bootstrap and other methods to measure errors in survey estimates. Can. J. Statist., 16, 25–45.

Krewski, D. and Rao, J. N. K. (1981) Inference from stratified samples: properties of the linearization, jackknife and balanced repeated replication methods. Ann. Statist., 9, 1010–1019.

Lee, K. (1973) Variance estimation in stratified sampling. J. Am. Statist. Ass., 68, 336–342.

Rao, J. N. K. and Wu, C. F. J. (1985) Inference from stratified samples: second-order analysis of three methods for nonlinear statistics. J. Am. Statist. Ass., 80, 620–630.

Rao, J. N. K., Wu, C. F. J. and Yue, K. (1992) Some recent work on resampling methods for complex surveys. Surv. Methodol., 18, 209–217.

Robinson, P. M. and Särndal, C. E. (1983) Asymptotic properties of the generalized regression estimator in probability sampling. Sankhyā B, 45, 240–248.

Rosén, B. (1972) Asymptotic theory for successive sampling with varying probabilities without replacement, I. Ann. Math. Statist., 43, 373–397.

Särndal, C. E., Swensson, B. and Wretman, J. H. (1992) Model Assisted Survey Sampling. New York: Springer.

Shao, J. (1989) The efficiency and consistency of approximations to the jackknife variance estimators. J. Am. Statist. Ass., 84, 114–119.

Shao, J. (1993) Differentiability of statistical functionals and consistency of the jackknife. Ann. Statist., 21, 61–75.

Shao, J. and Tu, D. (1995) The Jackknife and Bootstrap. New York: Springer.

Tukey, J. W. (1958) Bias and confidence in not-quite large samples (abstract). Ann. Math. Statist., 29, 614.

Valliant, R., Dorfman, A. H. and Royall, R. M. (2000) Finite Population Sampling and Inference: a Prediction Approach. New York: Wiley.

Wolter, K. M. (1985) Introduction to Variance Estimation. New York: Springer.

Yates, F. and Grundy, P. M. (1953) Selection without replacement from within strata with probability proportional to size. J. R. Statist. Soc. B, 15, 253–261.

Appendix A: Assumptions and proof of theorem 1

The following assumptions will be made.

  • (a) $\widehat{\mathrm{var}}(\hat\theta)_L/\mathrm{var}(\hat\theta)_L \to_p 1$, where $\widehat{\mathrm{var}}(\hat\theta)_L$ is the linearization variance estimator that is given by

    $$\widehat{\mathrm{var}}(\hat\theta)_L = \nabla(\hat\mu)^{\mathrm T}\,\hat\Sigma\,\nabla(\hat\mu), \tag{11}$$

    where

    $$\hat\Sigma = \sum_{i\in\mathcal{S}}\sum_{j\in\mathcal{S}} D_{ij}\, w_i w_j\, (y_i-\hat\mu)(y_j-\hat\mu)^{\mathrm T},$$

    with $D_{ij} = (\pi_{ij}-\pi_i\pi_j)\pi_{ij}^{-1}$.

  • (b) $|1-w_i| \geq \alpha > 0$ for all i ∈ 𝒰, where α is a constant (free of t).

  • (c) $\liminf\{n\,\mathrm{var}(\hat\theta)_L\} > 0$.

  • (d) $(1/n)\sum_{i\in\mathcal{S}} w_i^{\tau}\,\|y_i-\hat\mu\|^{\tau} = O_p(n^{-\tau})$ for all $\tau \geq 2$, where ‖·‖ denotes the Euclidean norm defined by $\|A\| = \{\mathrm{tr}(A^{\mathrm T}A)\}^{1/2}$.

  • (e) $G_s = \sum_{i\in\mathcal{S}}\sum_{j\in\mathcal{S},\, j\neq i} (D^-_{ij})^2 = O_p(1)$, where

    $$D^-_{ij} = \begin{cases} D_{ij} & \text{if } D_{ij} < 0,\\ 0 & \text{otherwise.} \end{cases} \tag{12}$$

  • (f) $H_s = \sum_{i\in\mathcal{S}}\sum_{j\in\mathcal{S},\, j\neq i} (D^+_{ij})^2 = O_p(1)$, where

    $$D^+_{ij} = \begin{cases} 0 & \text{if } D_{ij} < 0,\\ D_{ij} & \text{otherwise.} \end{cases} \tag{13}$$

  • (g) $\nabla(x)$ is Lipschitz continuous of order δ>0 (e.g. Shao and Tu (1995), page 43) in the sense that

    $$\|\nabla(x_1)-\nabla(x_2)\| \leq \lambda\,\|x_1-x_2\|^{\delta}$$

    for a constant λ>0, where $x_1$ and $x_2$ are in the neighbourhood of μ.

  • (h) $\|\nabla(\hat\mu)\| = O_p(1)$.

Assumption (a) states that the linearization variance estimator is consistent. An example of sufficient conditions for this assumption to hold can be found in Krewski and Rao (1981). Assumption (b) ensures that none of the weights (1) can approach 1, which would represent a degenerate design. Assumption (c) holds in the standard circumstances where the linearized variance decreases at rate $n^{-1}$ (Shao and Tu (1995), page 260). It holds when $\mathrm{var}(\hat\theta)_L \geq \nu/n$, where ν is a positive constant. This inequality is similar to the Cramér–Rao lower bound. Assumption (d) is an assumption about the behaviour of the weights and the existence of moments of the $y_i$, which would hold, for example, if the $nw_i$ and the $y_i$ were bounded. Assumptions (e) and (f) are mild assumptions on the design, similar to ones in Isaki and Fuller (1982). For example, with simple random sampling without replacement, $G_s \leq 1-n/N = O_p(1)$ and $H_s = 0$. Moreover, if the condition of Yates and Grundy (1953) holds, $D_{ij} \leq 0$ for all $i \neq j$, implying that $H_s = 0$. Assumptions (g) and (h) are smoothness requirements on the function g(·).
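The claims about $G_s$ and $H_s$ under simple random sampling without replacement can be checked numerically; the following small sketch (design parameters ours) verifies that every off-diagonal $D_{ij}$ is negative, so that $H_s = 0$ while $G_s$ stays bounded:

```python
import numpy as np

# SRSWOR with n = 5 from N = 20: pi_i = n/N, pi_ij = n(n-1)/{N(N-1)}.
n, N = 5, 20
pi = np.full(n, n / N)
pi_ij = np.full((n, n), n * (n - 1) / (N * (N - 1)))
np.fill_diagonal(pi_ij, pi)

D = (pi_ij - np.outer(pi, pi)) / pi_ij   # D_ij = (pi_ij - pi_i pi_j)/pi_ij
off = ~np.eye(n, dtype=bool)
D_minus = np.where(D < 0, D, 0.0)        # definition (12)
D_plus = np.where(D < 0, 0.0, D)         # definition (13)

G_s = np.sum(D_minus[off] ** 2)          # bounded, consistent with (e)
H_s = np.sum(D_plus[off] ** 2)           # zero: Yates-Grundy condition holds
```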

A.1. Proof of theorem 1

From the mean value theorem, we have

$$\hat\theta - \hat\theta_{(i)} = g(\hat\mu) - g(\hat\mu_{(i)}) = \nabla(\xi_i)^{\mathrm T}(\hat\mu - \hat\mu_{(i)}) = \nabla(\hat\mu)^{\mathrm T}(\hat\mu - \hat\mu_{(i)}) + r_i^*,$$

where $\xi_i$ is a point between $\hat\mu$ and $\hat\mu_{(i)}$ and $r_i^*$ is the remainder given by

$$r_i^* = \{\nabla(\xi_i) - \nabla(\hat\mu)\}^{\mathrm T}(\hat\mu - \hat\mu_{(i)}).$$

Thus,

$$\varepsilon(i) = \nabla(\hat\mu)^{\mathrm T}(1-w_i)(\hat\mu - \hat\mu_{(i)}) + r_i,$$

where

$$r_i = (1-w_i)\,r_i^*. \tag{14}$$

It can be shown that

$$(1-w_i)(\hat\mu - \hat\mu_{(i)}) = w_i(y_i - \hat\mu), \tag{15}$$

implying that

$$\varepsilon(i) = \nabla(\hat\mu)^{\mathrm T} w_i(y_i - \hat\mu) + r_i. \tag{16}$$

Thus, by substituting equation (16) into equation (2), we obtain

$$\widehat{\mathrm{var}}(\hat\theta) = A + B + 2C$$

with

$$A = \nabla(\hat\mu)^{\mathrm T}\Bigl\{\sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D_{ij}\, w_i w_j\, (y_i-\hat\mu)(y_j-\hat\mu)^{\mathrm T}\Bigr\}\nabla(\hat\mu), \qquad B = \sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D_{ij}\, r_i r_j, \tag{17}$$

$$C = \sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D_{ij}\, r_i\, w_j (y_j - \hat\mu)^{\mathrm T}\nabla(\hat\mu). \tag{18}$$

Hence, theorem 1 follows if we may show that

$$A\,\mathrm{var}(\hat\theta)_L^{-1} \to_p 1, \tag{19}$$

$$B\,\mathrm{var}(\hat\theta)_L^{-1} \to_p 0, \tag{20}$$

$$C\,\mathrm{var}(\hat\theta)_L^{-1} \to_p 0. \tag{21}$$

Assumption (a) implies expression (19). It is therefore only necessary to show expressions (20) and (21). We start by showing expression (20). From equation (17),

$$B = -\frac12\sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D_{ij}(r_i-r_j)^2 + \frac12\sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D_{ij}(r_i^2+r_j^2).$$

Furthermore, by definition of $D^-_{ij}$ and $D^+_{ij}$ in expressions (12) and (13), we have

$$B \leq B_1 + B_2, \tag{22}$$

where

$$B_1 = -\frac12\sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D^-_{ij}(r_i-r_j)^2$$

and

$$B_2 = \frac12\sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D^+_{ij}(r_i^2+r_j^2).$$

By the Cauchy inequality,

$$\frac{B_1}{\mathrm{var}(\hat\theta)_L} \leq \frac{G_s^{1/2}}{2}\Bigl\{\frac{1}{\mathrm{var}(\hat\theta)_L^2}\sum_{i\in\mathcal S}\sum_{j\in\mathcal S}(r_i-r_j)^4\Bigr\}^{1/2}.$$

Now, as

$$\sum_{i\in\mathcal S}\sum_{j\in\mathcal S}(r_i-r_j)^4 = 2n\sum_{i\in\mathcal S}(r_i-\bar r)^4 + 6\Bigl\{\sum_{i\in\mathcal S}(r_i-\bar r)^2\Bigr\}^2$$

with $\bar r = n^{-1}\sum_{i\in\mathcal S} r_i$, we have

$$\frac{B_1}{\mathrm{var}(\hat\theta)_L} \leq \frac{G_s^{1/2}}{2}(B_3+B_4)^{1/2}, \tag{23}$$

where

$$B_3 = \frac{2n}{\mathrm{var}(\hat\theta)_L^2}\sum_{i\in\mathcal S}(r_i-\bar r)^4, \qquad B_4 = \frac{6}{\mathrm{var}(\hat\theta)_L^2}\Bigl\{\sum_{i\in\mathcal S}(r_i-\bar r)^2\Bigr\}^2. \tag{24}$$

Moreover,

$$B_3 \leq \tilde B_3 = \frac{2n}{\mathrm{var}(\hat\theta)_L^2}\sum_{i\in\mathcal S} r_i^4, \tag{25}$$

$$B_4 \leq \tilde B_4 = \frac{6}{\mathrm{var}(\hat\theta)_L^2}\Bigl(\sum_{i\in\mathcal S} r_i^2\Bigr)^2. \tag{26}$$

Thus, assumption (e) and inequality (23) imply that $B_1\,\mathrm{var}(\hat\theta)_L^{-1} \to_p 0$ if $\tilde B_3 \to_p 0$ and $\tilde B_4 \to_p 0$. The Cauchy inequality (e.g. Harville (1997), page 62) further implies that

$$|r_i| \leq \|\nabla(\xi_i)-\nabla(\hat\mu)\|\;\|(1-w_i)(\hat\mu-\hat\mu_{(i)})\|.$$

Combining this last inequality with equation (15), we obtain

$$|r_i| \leq \|\nabla(\xi_i)-\nabla(\hat\mu)\|\, w_i\,\|y_i-\hat\mu\|. \tag{27}$$

Assumption (g) implies that there are constants λ>0 and δ>0 such that

$$\|\nabla(\xi_i)-\nabla(\hat\mu)\| \leq \lambda\,\|\xi_i-\hat\mu\|^{\delta}. \tag{28}$$

As $\xi_i$ is a point between $\hat\mu$ and $\hat\mu_{(i)}$, we have $\|\xi_i-\hat\mu\| \leq \|\hat\mu-\hat\mu_{(i)}\|$. Combining this last inequality with equation (15), we obtain

$$\|\xi_i-\hat\mu\| \leq w_i\,|1-w_i|^{-1}\,\|y_i-\hat\mu\|,$$

which combined with inequality (28) gives

$$\|\nabla(\xi_i)-\nabla(\hat\mu)\| \leq \lambda\, w_i^{\delta}\,|1-w_i|^{-\delta}\,\|y_i-\hat\mu\|^{\delta}.$$

Now, using assumption (b), we have

$$\|\nabla(\xi_i)-\nabla(\hat\mu)\| \leq \lambda\,\alpha^{-\delta}\, w_i^{\delta}\,\|y_i-\hat\mu\|^{\delta}. \tag{29}$$

Thus, inequalities (27) and (29) imply that

$$|r_i| \leq \lambda\,\alpha^{-\delta}\, w_i^{1+\delta}\,\|y_i-\hat\mu\|^{1+\delta}. \tag{30}$$

First, we show that $\tilde B_3 \to_p 0$. Combining inequalities (25) and (30), we obtain

$$\tilde B_3 \leq 2\lambda^4\alpha^{-4\delta}\, n^4\,\{n\,\mathrm{var}(\hat\theta)_L\}^{-2}\Bigl(\frac1n\sum_{i\in\mathcal S} w_i^{4(1+\delta)}\|y_i-\hat\mu\|^{4(1+\delta)}\Bigr). \tag{31}$$

Assumption (c) implies that

$$\{n\,\mathrm{var}(\hat\theta)_L\}^{-2} = O(1). \tag{32}$$

Now assumption (d) and expressions (31) and (32) imply that $\tilde B_3 = n^4\, O_p(n^{-4(1+\delta)})$, i.e.

$$\tilde B_3 \to_p 0. \tag{33}$$

Secondly, we show that $\tilde B_4 \to_p 0$. Combining inequalities (26) and (30), we obtain

$$\tilde B_4 \leq 6\lambda^4\alpha^{-4\delta}\, n^4\,\{n\,\mathrm{var}(\hat\theta)_L\}^{-2}\Bigl(\frac1n\sum_{i\in\mathcal S} w_i^{2(1+\delta)}\|y_i-\hat\mu\|^{2(1+\delta)}\Bigr)^2. \tag{34}$$

Now assumption (d) and expressions (34) and (32) imply that $\tilde B_4 = n^4\,\{O_p(n^{-2(1+\delta)})\}^2$, i.e.

$$\tilde B_4 \to_p 0. \tag{35}$$

Thirdly, assumption (e) and expressions (23), (33) and (35) imply that

$$B_1\,\mathrm{var}(\hat\theta)_L^{-1} \to_p 0. \tag{36}$$

Now, we show that $B_2\,\mathrm{var}(\hat\theta)_L^{-1} \to_p 0$. We have by the Cauchy inequality

$$\frac{B_2}{\mathrm{var}(\hat\theta)_L} \leq \frac{H_s^{1/2}}{2}\Bigl\{\frac{1}{\mathrm{var}(\hat\theta)_L^2}\sum_{i\in\mathcal S}\sum_{j\in\mathcal S}(r_i^2+r_j^2)^2\Bigr\}^{1/2} = \frac{H_s^{1/2}}{2}\Bigl(\tilde B_3 + \frac{\tilde B_4}{3}\Bigr)^{1/2}.$$

Thus, assumption (f) and expressions (33) and (35) imply that

$$B_2\,\mathrm{var}(\hat\theta)_L^{-1} \to_p 0. \tag{37}$$

Consequently, expression (20) follows from expressions (36) and (37). To complete the proof we need to show expression (21). By the triangle inequality, equation (18) implies that

$$|C| \leq \sum_{i\in\mathcal S}\sum_{j\in\mathcal S} |D_{ij}|\,|r_i|\,|\tilde y_j| = C_1 + C_2$$

with $\tilde y_j = w_j(y_j-\hat\mu)^{\mathrm T}\nabla(\hat\mu)$, where

$$C_1 = \sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D^+_{ij}\,|r_i|\,|\tilde y_j|$$

and

$$C_2 = -\sum_{i\in\mathcal S}\sum_{j\in\mathcal S} D^-_{ij}\,|r_i|\,|\tilde y_j|.$$

By the Cauchy inequality, $C_1 \leq H_s^{1/2} C_3^{1/2}$ and $C_2 \leq G_s^{1/2} C_3^{1/2}$, with

$$C_3 = \sum_{i\in\mathcal S} r_i^2 \sum_{j\in\mathcal S} |\tilde y_j|^2. \tag{38}$$

Thus, expression (21) follows from assumptions (e) and (f), if we can show that $C_3\,\mathrm{var}(\hat\theta)_L^{-2} \to_p 0$. The Cauchy inequality implies that $|\tilde y_j| \leq w_j\,\|y_j-\hat\mu\|\,\|\nabla(\hat\mu)\|$. By substituting the last inequality and inequality (30) into equation (38), we obtain

$$C_3\,\mathrm{var}(\hat\theta)_L^{-2} \leq \|\nabla(\hat\mu)\|^2\,\lambda^2\alpha^{-2\delta}\, n^4\,\{n\,\mathrm{var}(\hat\theta)_L\}^{-2}\Bigl(\frac1n\sum_{i\in\mathcal S} w_i^{2(1+\delta)}\|y_i-\hat\mu\|^{2(1+\delta)}\Bigr)\Bigl(\frac1n\sum_{j\in\mathcal S} w_j^2\,\|y_j-\hat\mu\|^2\Bigr). \tag{39}$$

Now, from assumptions (c) and (d) and expressions (32) and (39), we have

$$C_3\,\mathrm{var}(\hat\theta)_L^{-2} = n^4\, O_p(n^{-2(1+\delta)})\, O_p(n^{-2}) \to_p 0,$$

which implies expression (21), completing the proof.
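Identity (15), the key algebraic step of the proof, can also be verified numerically; the following sketch (toy data of ours) checks that for Hájek weights the deleted-unit difference satisfies $(1-w_i)(\hat\mu-\hat\mu_{(i)}) = w_i(y_i-\hat\mu)$ exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n, Q = 6, 2
y = rng.uniform(1.0, 5.0, size=(n, Q))   # invented sample values
pi = rng.uniform(0.1, 0.5, size=n)       # invented inclusion probabilities

a = 1.0 / pi
w = a / a.sum()                          # Hajek weights w_i = 1/(N_hat pi_i)
mu_hat = w @ y

ok = True
for i in range(n):
    keep = np.arange(n) != i
    w_del = a[keep] / a[keep].sum()      # Hajek weights after deleting unit i
    lhs = (1.0 - w[i]) * (mu_hat - w_del @ y[keep])
    rhs = w[i] * (y[i] - mu_hat)
    ok = ok and np.allclose(lhs, rhs)    # identity (15), exact up to rounding
```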

© 2005 Royal Statistical Society

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
