1 Introduction

The classical one-dimensional median associated with a probability distribution P on the Borel subsets of the real line \(\mathbb {R}\) is any value \(\theta \) that satisfies the inequalities

$$\begin{aligned} P\bigl ((-\infty ,\theta ]\bigr ) \ge 1/2\,, \quad P\bigl ([\theta ,\infty )\bigr ) \ge 1/2\,. \end{aligned}$$
(1)

Under certain conditions, for example if P has a positive density, we have equalities in (1) and the median is unique. In general, the set of solutions is the bounded closed interval \(\left[ \hbox {Med}_-(P),\hbox {Med}_+(P)\right] \) with

$$\begin{aligned} \hbox {Med}_-(P)\,&:=\,\inf \bigl \{x\in \mathbb {R}\!: P\bigl ((-\infty ,x]\bigr )\ge 1/2\bigr \},\\ \hbox {Med}_+(P)\,&:=\,\sup \bigl \{x\in \mathbb {R}\!: P\bigl ([x,\infty )\bigr )\ge 1/2\bigr \} \end{aligned}$$

as the smallest and the largest median of P. Often, ‘the’ median of P is defined as the corresponding midpoint,

$$\begin{aligned} \hbox {Med}_m(P):= \bigl (\hbox {Med}_-(P) + \hbox {Med}_+(P)\bigr )/2. \end{aligned}$$
(2)

For a data set \(x_1,\ldots ,x_n\) of real numbers a (sample) median is any (distributional) median associated with the empirical distribution \(P_n:=n^{-1} \sum _{k=1}^n \delta _{x_k}\), where \(\delta _x\) denotes unit mass at x. With \(x_{(1)}\le \cdots \le x_{(n)}\) as the values obtained by arranging \(x_1,\ldots ,x_n\) in increasing order,

$$\begin{aligned} \hbox {Med}_m(P_n)=\hbox {Med}_-(P_n)=\hbox {Med}_+(P_n)=x_{\left( (n+1)/2\right) } \end{aligned}$$

is the unique sample median if n is odd, and we get \(\bigl [x_{(n/2)},x_{(n/2+1)}\bigr ]\) as the set of sample medians if n is even.
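
The following minimal sketch, assuming NumPy (the function name \(\mathtt{sample\_medians}\) is ours), computes these three quantities directly from the order statistics:

```python
# Sketch: smallest, largest and midpoint sample median, following (1)-(2)
# applied to the empirical distribution P_n; assumes NumPy.
import numpy as np

def sample_medians(x):
    """Return (Med_-, Med_+, Med_m) of the empirical distribution of x."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    if n % 2 == 1:                        # odd n: the unique median x_((n+1)/2)
        lo = hi = x[(n - 1) // 2]
    else:                                 # even n: the interval [x_(n/2), x_(n/2+1)]
        lo, hi = x[n // 2 - 1], x[n // 2]
    return lo, hi, (lo + hi) / 2

print(sample_medians([3.0, 1.0, 4.0, 1.0]))   # (1.0, 3.0, 2.0)
```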

Historically, the first attempt to generalize this to dimensions \(d>1\) was to use coordinates: If \(x_1,\ldots ,x_n\) are points in \(\mathbb {R}^d\) for some \(d\ge 2\) then a coordinatewise or marginal median of the empirical distribution \(P_n:=n^{-1} \sum _{k=1}^n \delta _{x_k}\) is any vector whose ith coordinate is a (one-dimensional) median of the respective ith coordinates of the data vectors. This is the sample version; for a probability measure P on the Borel subsets \(\mathcal {B}^d\) of \(\mathbb {R}^d\) the push-forwards \(P^{\pi _i}\) of P under the coordinate projections \(\pi _i\), \(i=1,\ldots ,d\), would be used. Formally, the marginal medians of P are elements of the d-dimensional interval \(\bigl [\hbox {MMed}_-(P),\, \hbox {MMed}_+(P)\bigr ]\) where \(\hbox {MMed}_\pm (P)\) have \(\hbox {Med}_\pm (P^{\pi _i})\) as their ith component, \(i=1,\ldots ,d\), and

$$\begin{aligned} \hbox {MMed}_m(P):=\bigl (\hbox {MMed}_-(P)+\,\hbox {MMed}_+(P)\bigr )/2 \end{aligned}$$
(3)

is the midpoint marginal median. It is well known that such component-wise extensions of the one-dimensional median are not equivariant with respect to orthogonal transformations and thus depend on the coordinate system chosen for the representation of the data. One way to repair this is by ‘averaging out’ (symmetrization), see Grübel (1996).
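
As a small illustration (not part of the original text; \(\mathtt{marginal\_median}\) is an ad hoc name), the coordinatewise construction amounts to applying the one-dimensional rule to each column of a data matrix:

```python
# Sketch: MMed_-, MMed_+ and the midpoint marginal median (3) of a data
# matrix X (rows = observations); reuses sample_medians from the sketch above.
import numpy as np

def marginal_median(X):
    cols = np.asarray(X, dtype=float).T
    lo, hi, mid = np.array([sample_medians(c) for c in cols]).T
    return lo, hi, mid        # MMed_-, MMed_+, MMed_m
```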

Fig. 1 The city data (see text for details)

Other generalizations of the one-dimensional median to higher dimensions start with some characterizing property of the classical version and then use a multidimensional analogue. For example, it is known that in the one-dimensional case any median \(\theta \) minimizes the function \(x\mapsto \int |y-x|\, P(dy)\), which leads to the spatial or \(L^1\)-median of the data set, defined as a minimizer of \(x\mapsto \sum _{k=1}^n\Vert x-x_k\Vert \), where \(\Vert \, \cdot \,\Vert \) denotes Euclidean distance. As this distance is invariant under orthogonal transformations, the spatial median does not depend on the chosen coordinates.
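
A standard way to compute the spatial median numerically is Weiszfeld's fixed-point iteration; the following sketch (our addition, not an algorithm from the paper) illustrates it:

```python
# Sketch: spatial (L^1) median via Weiszfeld's iteration, minimizing
# x |-> sum_k ||x - x_k||; a standard method, included here for illustration.
import numpy as np

def spatial_median(X, iters=500, tol=1e-10):
    X = np.asarray(X, dtype=float)
    m = X.mean(axis=0)                        # starting value
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(X - m, axis=1), 1e-12)
        w = 1.0 / d                           # inverse-distance weights
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < tol:
            break
        m = m_new
    return m
```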

Closer to the above equipartition property are the various concepts related to data depth, where for each \(x\in \mathbb {R}^d\) and each affine hyperplane H with \(x\in H\) the number of data points to one side of H is the basic ingredient: A centerpoint median is defined as any such x with the property that the minimum number is at least \(n/(d+1)\), i.e. a fraction \(1/(d+1)\) of the data, and Tukey’s median is a point that maximizes these data depth values; see e.g. Donoho and Gasko (1992). This whole area has attracted the attention of many researchers, often with emphasis on robustness. The literature on multivariate medians up to about 1990 is covered in the review of Small (1990), which includes an interesting discussion of the history of the concept. For more recent reviews of this subject we refer to the treatises of Oja (2013) and Dhar and Chaudhuri (2011).

In the present paper, following an idea of the late Dietrich Morgenstern, we again start with the marginal median as in the work of Grübel (1996), but then use a variant of the equipartition property instead of symmetrization. In dimension \(d=2\), for example, a coordinate system specifies four quarter spaces via the pairwise intersections of the half-spaces underlying the marginal median, and a quarter median of P may be defined as any vector \(\theta \in \mathbb {R}^2\) such that for some coordinate system centered at \(\theta \) all four of the respective quarter spaces have probability at least 1/4. Figure 1 shows a simple example that Professor Morgenstern would have liked: The data are the locations of the cities in Germany with a population of at least 100,000 (in 2017). We have \(n=76\) such cities in total, so we can aim for a partition with 19 cities in each open quarter space. One resulting quarter median lies near the exit Freudenberg of the A45 (Sauerlandlinie), with the North–South axis tilted by about 39\(^{\circ }\) counterclockwise; see the lines in Fig. 1. The data (longitude/latitude) for drawing the German border are extracted from the file \(\mathtt {gadm36\_DEU\_0\_sp.rds}\) available at https://gadm.org/data.html.

Investigation of this concept leads to a number of interesting questions, starting with the obvious ones concerning existence and uniqueness. Next, what are the equivariance properties of the quarter median? Further, in contrast to the spatial median a quarter median comes with a system of associated hyperplanes, which may be regarded as a part of the estimator. In the two-dimensional case these are just two orthogonal lines and can be parametrized by a counterclockwise rotation of the horizontal axis, thus providing a connection to a real interval. In higher dimensions, however, specification of a topological notion of nearness of systems of coordinate hyperplanes is less straightforward, but it is needed in connection with consistency of the estimators. For asymptotic normality even a local linearization is required. Finally, there is the algorithmic issue: How do we find a quarter median for a given data set?

Section 2 contains our main results, on existence, equivariance, and large sample behavior. Elliptical distributions are treated in some detail and we compare the quarter median with other estimators for several specific families. We also consider algorithmic aspects together with the problem of measurable selection from the respective solution sets. Section 3 presents numerical comparisons with existing multivariate medians. Proofs are collected in Sect. 4.

2 Results

We first assemble some formal definitions and then give our results in separate subsections.

2.1 Preliminaries and basic definitions

We regard vectors \(x\in \mathbb {R}^d\) as column vectors and write \(\langle x, y\rangle \) for the inner product of \(x,y\in \mathbb {R}^d\). The transpose of the vector x is \(x^{\text {t}}\), so that \(\langle x, y\rangle =x^{\text {t}}y\). Similarly, \(A^{\text {t}}\) denotes the transpose of a matrix A. We write \(\Vert \cdot \Vert \) for the Euclidean norm on \(\mathbb {R}^d\), \(\mathbb {S}_{d-1}\) is the unit sphere in \(\mathbb {R}^d\), \(\mathbb {O}(d)\) is the group of orthogonal \(d\times d\)-matrices, and \(\mathbb {SO}(d)\) denotes the group of rotations, consisting of those elements of \(\mathbb {O}(d)\) that have determinant 1.

If \((\varOmega ,\mathcal {A},P)\) is a probability space, if \((\varOmega ',\mathcal {A}')\) is a measurable space, and if the mapping \(S:\varOmega \rightarrow \varOmega '\) is \((\mathcal {A},\mathcal {A}')\)-measurable then we write \(P^S\) for the push-forward of P under S, i.e. \(P^S(A')=P(S^{-1}(A'))\) for all \(A'\in \mathcal {A}'\). If \((\varOmega ,\mathcal {A})=(\mathbb {R}^d,\mathcal {B}^d)\) and \(b\in \mathbb {S}_{d-1}\) or \(U\in \mathbb {O}(d)\) then \(P^{b}\) and \(P^U\) respectively refer to the push-forwards of P under the mappings \(x\mapsto \langle b, x\rangle \) and \(x\mapsto Ux\).

For \(a\in \mathbb {R}^d\) and \(b\in \mathbb {S}_{d-1}\) we formally define the associated half-spaces by

$$\begin{aligned} H_-(a;b) := \bigl \{x\in \mathbb {R}^d:\, \langle b,x-a\rangle \le 0\bigr \}, \ H_+(a;b):= \bigl \{x\in \mathbb {R}^d:\, \langle b,x-a\rangle \ge 0\bigr \}. \end{aligned}$$

Let \(e_1,\ldots ,e_d\) be the vectors of the canonical basis of \(\mathbb {R}^d\). Then a marginal median for a probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\) is any vector \(\theta \in \mathbb {R}^d\) with the property that

$$\begin{aligned} P\bigl (H_+(\theta ;e_i)\bigr )\ge 1/2,\ P\bigl (H_-(\theta ;e_i)\bigr )\ge 1/2 , \quad 1\le i\le d. \end{aligned}$$
(4)

For two orthogonal unit vectors \(b, b'\in \mathbb {S}_{d-1}\) and \(a\in \mathbb {R}^d\) the quarter spaces are the intersections of these half-spaces,

$$\begin{aligned} V_{\pm \pm }(a;b,b') := H_\pm (a;b)\cap H_\pm (a;b'), \end{aligned}$$

with the obvious notational convention for the four possible combinations of plus and minus signs. We then define a quarter median of a probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\) as any vector \(\theta \in \mathbb {R}^d\) with the property that, for some orthonormal basis \(B=\{b_1,\ldots ,b_d\}\) of \(\mathbb {R}^d\),

$$\begin{aligned} P\bigl (H_+(\theta ;b_i)\bigr )\ge 1/2,\ P\bigl (H_-(\theta ;b_i)\bigr )\ge 1/2 , \quad 1\le i\le d, \end{aligned}$$
(5)

and

$$\begin{aligned} P\bigl (V_{\pm \pm }(\theta ;b_i,b_j)\bigr )\ge 1/4, \quad 1\le i<j\le d, \end{aligned}$$
(6)

for all combinations of plus and minus signs. If (5) and (6) hold then we call the pair \((\theta ,U)\in \mathbb {R}^d\times \mathbb {O}(d)\), where U has rows \(b_1^{\text {t}},\ldots ,b_d^{\text {t}}\), a solution of the quarter median problem for the probability measure P. Again, the data versions of these notions refer to the empirical distribution \(P_n\).

If P is absolutely continuous then hyperplanes have zero probability and (5) follows from (6). In general, this is not the case as the following simple example shows: Let \(d=2\) and

$$\begin{aligned} P=\tfrac{1}{3}\bigl (\delta _{(0,0)^{\text {t}}}+\delta _{(1/2,1)^{\text {t}}} +\delta _{(1,1/2)^{\text {t}}}\bigr ). \end{aligned}$$

Then (6) holds with \(b_1=e_1\), \(b_2=e_2\) and \(\theta = (1,1/2)^{\text {t}}\), but (5) is not true. Further, with this choice of \(b_1,b_2\) it is easily verified that the unique marginal median \(\theta =(1/2,1/2)^{\text {t}}\) is a quarter median. This example also shows that the quarter median may not be unique. Further, if hyperplanes have probability zero, then the inequalities in (5) and (6) can be replaced by equalities, and with \((\theta ,U)\) as solution of (6) we then speak of a solution of the equipartition problem for P.
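
The conditions (5) and (6) are easy to check numerically for an empirical distribution; the following sketch (with the ad hoc name \(\mathtt{is\_quarter\_median}\)) does this with closed half- and quarter spaces and reproduces the three-point example above:

```python
# Sketch: check (5) and (6) for the empirical distribution of the rows of X,
# given a candidate theta and an orthonormal basis B (rows b_1^t,...,b_d^t).
import numpy as np

def is_quarter_median(theta, B, X):
    n, d = X.shape
    S = (X - theta) @ B.T                  # coordinates relative to (theta, B)
    for i in range(d):                     # half-space conditions (5)
        if (S[:, i] >= 0).sum() < n / 2 or (S[:, i] <= 0).sum() < n / 2:
            return False
    for i in range(d):                     # quarter-space conditions (6)
        for j in range(i + 1, d):
            for si in (1, -1):
                for sj in (1, -1):
                    if ((si * S[:, i] >= 0) & (sj * S[:, j] >= 0)).sum() < n / 4:
                        return False
    return True

X = np.array([[0, 0], [0.5, 1], [1, 0.5]])
print(is_quarter_median(np.array([1.0, 0.5]), np.eye(2), X))   # False: (5) fails
print(is_quarter_median(np.array([0.5, 0.5]), np.eye(2), X))   # True
```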

2.2 Existence and uniqueness

The existence of a median, and thus of a marginal median, follows immediately from the monotonicity of distribution functions. For the quarter median, counting the number of variables and the number of constraints may give a first impression. For simplicity, we temporarily assume that hyperplanes have probability zero, so that we have an equipartition problem. If \(d=2\), there are then three unknowns, two for the location parameter and one for the angle of rotation, see also Remark 1 (b) below, and one of the four constraints is redundant, so that the number of unknowns is the same as the number of equations. For \(d>2\) we first note that the conditions in (5) and (6) are not independent. Indeed, for a given basis \(\{b_1,\ldots ,b_d\}\) of \(\mathbb {R}^d\) it is enough to have probability 1/2 for \(H_+(\theta ;b_i)\), \(i=1,\ldots ,d\), and probability 1/4 for \(V_{++}(\theta ;b_i,b_j)\) for \(1\le i < j\le d\), so that \(d + d(d-1)/2\) conditions remain. On the other hand, we have \(d(d-1)/2\) parameters for the (special) orthogonal group, and there are d components of the location vector, hence the number of unknowns is again the same as the number of independent constraints. This, of course, is just a heuristic argument, as the respective conditions are nonlinear.

Using a general tool from algebraic topology, the homotopy invariance of the Brouwer degree of continuous mappings, Makeev (2007) solved the equipartition problem under the additional condition that P is absolutely continuous. A smoothing argument leads to the extension needed here.

Theorem 1

A solution \((\theta ,U)\) of the quarter median problem exists for every probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\).

This result raises an obvious question: Is there a stronger equipartition in dimension \(d>2\), such as an ‘octomedian’ if \(d=3\), with each of the octants receiving the same probability? This is certainly possible for specific distributions, such as the multivariate standard normal. However, if \(d=3\) then we have three unknowns for the location vector, and three for the rotation, which can be parametrized by the Euler angles, and the condition that the octants all have probability 1/8 would similarly lead to seven constraints, which is one too many. Obviously, the difference would be even greater for dimension \(d>3\) if all \(2^d\) intersections of coordinate half-spaces were to have the same probability. According to Makeev (2007, p. 554) this dimension counting argument already implies that a solution does not exist for a generic distribution.

The notion of uniqueness requires some attention. We say that a distribution P has a unique quarter median if \(\theta _1=\theta _2\) for any two solutions \((\theta _1,U_1)\) and \((\theta _2,U_2)\) of the quarter median problem. Obviously, the full pair \((\theta ,U)\) will never be unique: Any linear hyperplane, i.e. a subspace H of \(\mathbb {R}^d\) of dimension \(d-1\), may be specified as the orthogonal complement of the one-dimensional space (line) generated by a unit vector \(b\in \mathbb {S}_{d-1}\) via \(H(b):=\{x\in \mathbb {R}^d:\, b^{\text {t}}x=0\}\), but obviously \(-b\) would lead to the same hyperplane. This simple observation already implies that we may restrict U to be an element of \(\mathbb {SO}(d)\) when searching for a quarter median. Further, the conditions (5) and (6) are invariant under permutations of the rows of U. Conversely, given a set of coordinate hyperplanes, a corresponding basis is a set with elements that are unique up to a factor \(-1\). Putting this together we write \(U\sim V\) for \(U,V\in \mathbb {O}(d)\) if U and V lead to the same set of linear hyperplanes; we then say that the solution of the quarter median problem is unique if for any two solutions \((\theta _1,U_1)\) and \((\theta _2,U_2)\) we have \(\theta _1=\theta _2\) and \(U_1\sim U_2\).

The following formal approach will turn out to be useful. Let \(G_1\) be the subgroup of \(\mathbb {O}(d)\) that consists of the diagonal matrices with diagonal entries from \(\{-1,+1\}\), i.e. the subgroup that represents the compositions of reflections at the coordinate axes, and let \(G_2\) be the subgroup of permutation matrices, which corresponds to coordinate permutations. We write G for the subgroup generated by \(G_1\) and \(G_2\). This is the system of all \(d\times d\)-matrices A with entries \(a_{ij}\in \{-1,0,1\}\) and such that for some permutation \(\pi \) of \(\{1,\ldots ,d\}\) we have \(a_{ij}\not =0\) if and only if \(j=\pi (i)\). The above transition from equality to equivalence can then be regarded as a transition from the group \(\mathbb {O}(d)\) to the quotient space

$$\begin{aligned} \mathbb {H}(d) := \mathbb {O}(d)/G, \end{aligned}$$
(7)

with the equivalence classes \([U] :=\{V\in \mathbb {O}(d):\, V\sim U\}\), \(U\in \mathbb {O}(d)\), as elements. If we think of a solution of the quarter median problem as a pair \((\theta ,H)\in \mathbb {R}^d\times \mathbb {H}(d)\), then uniqueness means that two solutions are equal.

Remark 1

(a)

    The \(\theta \)-part of a solution \((\theta ,H)\) for the quarter median problem may be unique while the H-part is not. For example, if P has a density that depends on \(x\in \mathbb {R}^d\) only through \(\Vert x\Vert \) then 0 is the unique quarter median, but all \(H\in \mathbb {H}(d)\) lead to a solution (0, H).

(b)

    In dimension \(d=2\) there is an (almost) canonical choice of an element \(U\in \mathbb {O}(d)\) from an equivalence class \(H\in \mathbb {H}(d)\) corresponding to a set of hyperplanes, here just two orthogonal lines \(L_1,L_2\) meeting at the point \(0\in \mathbb {R}^2\). With a suitable half-open interval I of length \(\pi /2\), such as \(I=[-\pi /4,\pi /4)\), there is exactly one \(\alpha \in I\) such that the lines are represented by

    $$\begin{aligned} U_\alpha :=\begin{pmatrix} \cos (\alpha ) & -\sin (\alpha )\\ \sin (\alpha ) & \cos (\alpha )\end{pmatrix}, \end{aligned}$$
    (8)

    which is the counterclockwise rotation of the canonical coordinate lines by the angle \(\alpha \), in the sense that \(L_i=\{x\in \mathbb {R}^2:\, b_i^{\text {t}}x=0\}\), where \(b_1^{\text {t}},b_2^{\text {t}}\) are the rows of \(U_\alpha \). The corresponding topology would then be that of angles modulo \(\pi /2\), i.e. of the quotient space \(\mathbb {R}/\tfrac{\pi }{2}\mathbb {Z}\); a small numerical sketch of this parametrization follows after this remark.

(c)

    While we work with the space \(\mathbb {R}^d\) of column vectors throughout, the notions of quarter median and quarter median problem make sense for an arbitrary finite-dimensional Euclidean space \((E,\langle \cdot ,\cdot \rangle )\). If \((\theta ,U)\) solves the quarter median problem for a distribution P on \((\mathbb {R}^d,\mathcal {B}^d)\), then \(\{b_2,\ldots ,b_d\}\) is an orthonormal basis for the linear space \(E=H(b_1)\), which has dimension \(d-1\). Here \(b_i^{\text {t}}\) denotes the ith row of U, \(i=1,\ldots ,d\); equivalently, \(b_i\) is the ith column of \(U^{\text {t}}\). If \(d\ge 3,\) this can be used to obtain a projectivity or consistency property that relates \((\theta ,U)\) to a solution \((\theta _E,U_E)\) of the quarter median problem for the push-forward of P under the orthogonal projection \(\pi _E:\mathbb {R}^d\rightarrow E\). In fact, let \(\mathbb {S}_{d-1}^E=\{y\in E:\Vert y\Vert =1\}\) be the unit sphere in E. For \(c\in E\) and orthogonal unit vectors \(d,d'\in \mathbb {S}_{d-1}^E\) the associated half-spaces and quarter spaces are given by

    $$\begin{aligned} H_-^E(c;d) := \bigl \{y\in E:\, \langle d,y-c\rangle \le 0\bigr \}, \ H_+^E(c;d) := \bigl \{y\in E:\, \langle d,y-c\rangle \ge 0\bigr \}, \end{aligned}$$

    and

    $$\begin{aligned} V_{\pm \pm }^E(c;d,d') := H_\pm ^E(c;d)\cap H_\pm ^E(c;d'). \end{aligned}$$

    Using that a vector \(x\in \mathbb {R}^d\) can uniquely be written as \(x=\sum _{i=1}^d\langle x, b_i \rangle \,b_i\) we see that the orthogonal projection of x in E is \(x_E:=\pi _E(x)=\sum _{i=2}^d\langle x, b_i \rangle \,b_i\). It follows from \(\pi _E^{-1}\left( H_{\pm }^E(\theta _E;b_i)\right) = H_{\pm }(\theta ;b_i)\) for \(2\le i\le d\), and \(\pi _E^{-1}\left( V_{\pm \pm }^E(\theta _E;b_i,b_j)\right) = V_{\pm \pm }(\theta ;b_i,b_j)\) for \(2\le i<j\le d\), that \(P^{\pi _E}\left( H_{\pm }^E(\theta _E;b_i)\right) =P\left( H_{\pm }(\theta ;b_i)\right) \) for \(2\le i\le d\), and that \(P^{\pi _E}\left( V_{\pm \pm }^E(\theta _E;b_i,b_j)\right) =P\left( V_{\pm \pm }(\theta ;b_i,b_j)\right) \) for \(2\le i<j\le d\).
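
The parametrization in part (b) is easily made concrete; the sketch below (assuming NumPy; the names \(\mathtt{U\_alpha}\) and \(\mathtt{canonical\_angle}\) are ours) builds \(U_\alpha \) and recovers the canonical angle in \([-\pi /4,\pi /4)\) from an arbitrary representative of an equivalence class:

```python
# Sketch for Remark 1 (b): the rotation U_alpha of (8) and the canonical
# angle of the line pair encoded by the rows of U; permuting rows or flipping
# their signs (the group G) leaves the two lines, hence the angle, unchanged.
import numpy as np

def U_alpha(alpha):
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s], [s, c]])

def canonical_angle(U):
    """Angle alpha in [-pi/4, pi/4) with U ~ U_alpha; a row of U_alpha has
    direction angle -alpha (mod pi/2)."""
    a = -np.arctan2(U[0, 1], U[0, 0])
    return (a + np.pi / 4) % (np.pi / 2) - np.pi / 4

V = np.diag([1, -1]) @ U_alpha(0.3)[::-1]      # same lines, different matrix
print(np.isclose(canonical_angle(V), 0.3))     # True
```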

2.3 Equivariance

In the context of equivariance properties of location estimators for d-dimensional data we start with a family (often a group) \(\mathcal {S}\) of measurable transformations \(S:\mathbb {R}^d\rightarrow \mathbb {R}^d\) and we regard the location parameter \(\theta \in \mathbb {R}^d\) as a function \(\theta =T(P)\) of P. Recall that \(P^S\) is the push-forward of P under the transformation S. Then T is said to be \(\mathcal {S}\)-equivariant if

$$\begin{aligned} T(P^S)=(S\circ T)(P)\quad \text {for all }S\in \mathcal {S}. \end{aligned}$$

In cases where the parameter is not unique this might better be expressed as \(S(\theta )\) being a parameter of this type for \(P^S\) whenever \(\theta \) is one for P.

For example, with \(\mathcal {S}\) being one of the groups \(G_1\) and \(G_2\) that were used in connection with (7), it is easy to check that

$$\begin{aligned} S^{-1} \bigl (\hbox {MMed}_m(P^S)\bigr ) = \hbox {MMed}_m(P) \quad \text {for all } S\in \mathcal {S}, \end{aligned}$$
(9)

i.e. the specific marginal median introduced in (3) is equivariant with respect to reflections and coordinate permutations. It is easily seen that we may even take \(\mathcal {S}=G\). This specific marginal median is also equivariant with respect to separate rescalings, where the corresponding group is the set of all regular diagonal matrices.
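
A quick numerical check of (9), reusing \(\mathtt{marginal\_median}\) from the sketch above (for elements of G the identity holds exactly):

```python
# Sketch: verify (9) for a signed coordinate permutation S in G; the midpoint
# marginal median commutes with reflections and coordinate permutations.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(25, 3))                         # rows = observations
S = np.array([[0, -1, 0], [1, 0, 0], [0, 0, -1]])    # a signed permutation
mm = lambda Z: marginal_median(Z)[2]                 # midpoint marginal median
print(np.allclose(np.linalg.inv(S) @ mm(X @ S.T), mm(X)))   # True
```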

The general idea is of course that statistical inference should respect the structure of the data space. Equivariance of a location estimator with respect to shifts and orthogonal linear transformations means that the estimator interacts in this way with the full isometry group of Euclidean space.

At various stages the following function will be important,

$$\begin{aligned} \varPsi (U,P)\, :=\, U^{\text {t}}\,\hbox {MMed}_m(P^U), \end{aligned}$$
(10)

with \(U\in \mathbb {O}(d)\) and P a probability measure on \((\mathbb {R}^d,\mathcal {B}^d)\); \(\varPsi (U,P)\) is the specific marginal median in the coordinate system associated with U, transformed back to canonical coordinates. By definition, a quarter median is always a marginal median in a specific coordinate system. Below, marginal medians in the set

$$\begin{aligned} \mathcal {M}(P)\, :=\, \bigl \{\eta \in \mathbb {R}^d:\, \eta _i\in \bigl \{\hbox {Med}_-(P^{\pi _i}),\,\hbox {Med}_m(P^{\pi _i}),\,\hbox {Med}_+(P^{\pi _i})\bigr \},\ 1\le i\le d\bigr \} \end{aligned}$$
(11)

will be of particular interest.
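
An empirical version of (10) is immediate, reusing \(\mathtt{marginal\_median}\) from above:

```python
# Sketch: Psi(U, P_n) = U^t MMed_m(P_n^U); the data are rotated into the
# coordinate system given by the rows of U, the midpoint marginal median is
# taken there, and the result is mapped back to canonical coordinates.
import numpy as np

def Psi(U, X):
    Y = X @ U.T                    # rows of Y are U x_k, i.e. data under P_n^U
    return U.T @ marginal_median(Y)[2]
```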

Proposition 1

Let P be a probability distribution on \((\mathbb {R}^d,\mathcal {B}^d)\), and let \(\mathcal {S}\) be the set of Euclidean motions.

(a):

If \(\theta \) is a quarter median for the distribution P then, for all \(S\in \mathcal {S}\), \(S(\theta )\) is a quarter median for \(P^S\).

(b):

If \(U,V\in \mathbb {O}(d)\) are such that \(U\sim V\) then \(\varPsi (U,P)=\varPsi (V,P)\).

(c):

If \((\theta ,U)\) solves the quarter median problem for P, then there is a marginal median \(\eta \in \mathcal {M}(P^U)\) such that \((\tilde{\theta },U)\) with \(\tilde{\theta }=U^{\text {t}}\eta \) is also a solution.

(d):

If \((\theta ,U)\) solves the quarter median problem for P and \(S\in \mathcal {S}\), \(S(x)=Ax+b\), then \((A\theta + b, U\!A^{\text {t}})\) solves the quarter median problem for \(P^S\). Moreover, if \(P^b\) has a unique median for all \(b \in \mathbb {S}_{d-1}\),

$$\begin{aligned} \varPsi \bigl (U\!A^{\text {t}},P^S\bigr )\, = \, S\bigl (\varPsi (U,P)\bigr ). \end{aligned}$$
(12)
(e):

Suppose that \(P^b\) has a unique median for all \(b \in \mathbb {S}_{d-1}\). Then, the function \(\varPsi (\,\cdot \,,P):\mathbb {O}(d)\rightarrow \mathbb {R}^d\) is continuous.

Part (a) is the equivariance on the set level. Part (b) shows that \(\varPsi (\cdot ,P)\) may be regarded as a function on \(\mathbb {H}(d)\). Part (c) is of importance when developing an algorithm that searches for a solution of the empirical quarter median problem: For each \(U\in \mathbb {O}(d)\) the set (11) of potential \(\eta \)-values offers only three choices for each coordinate. It is tempting to think that this could be reduced further to the respective midpoint, but the uniform distribution on the six points \((-2,-2)^{\text {t}},(-1,3)^{\text {t}},(1,-1)^{\text {t}},(2,2)^{\text {t}},(3,4)^{\text {t}},(4,0)^{\text {t}}\) in the plane provides a counterexample: The pair \(\bigl ((2,2)^{\text {t}},U\bigr )\) with \(U=\mathrm{diag}(1,1)\) solves the quarter median problem for this distribution, but the midpoint of the associated rectangle \([1,2]\times [0,2]\) of marginal medians would lead to \(\bigl ((1.5,1)^{\text {t}},U\bigr )\), which is not a solution. Part (d) states the Euclidean motion equivariance of the quarter median for distributions P with the property that \(P^b\) has a unique median for all \(b\in \mathbb {S}_{d-1}\), a condition that will be used repeatedly below.
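
The six-point counterexample can be confirmed with the sketch \(\mathtt{is\_quarter\_median}\) from Sect. 2.1:

```python
# Sketch: numerical check of the six-point counterexample; (2,2)^t with the
# canonical axes solves the quarter median problem, the midpoint (1.5,1)^t
# of the rectangle of marginal medians does not.
import numpy as np

X = np.array([[-2, -2], [-1, 3], [1, -1], [2, 2], [3, 4], [4, 0]], dtype=float)
print(is_quarter_median(np.array([2.0, 2.0]), np.eye(2), X))   # True
print(is_quarter_median(np.array([1.5, 1.0]), np.eye(2), X))   # False
```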

Similar to the spatial median and the orthomedian introduced by Grübel (1996), the quarter median is not affine equivariant. In particular, it is not equivariant with respect to separate rescalings of the coordinates, in contrast to the specific marginal median. For distributions P with the property that \(P^b\) has a unique median for all \(b\in \mathbb {S}_{d-1}\), however, equivariance does hold (and is used in the proofs below) for simultaneous rescalings, where the group \(\mathcal {S}\) consists of all positive scalar multiples of the identity matrix.

Fig. 2 The median curve for the city data

Any quarter median is a marginal median in a suitably chosen coordinate system, which we may take to be generated by an orthogonal matrix with determinant 1. Grübel (1996) introduced the orthomedian of P as the integral of the function \(\varPsi (\cdot ,P)\) over \(\mathbb {SO}(d)\) with respect to the Haar measure with total mass 1. Thus, the orthomedian may be seen as the expected value of this function with U chosen uniformly at random, and its orthogonal equivariance is then an immediate consequence of the fact that the Haar measure is invariant under the group operations. In contrast, the quarter median picks a marginal median from the range of the set-valued function

$$\begin{aligned} U\mapsto \Bigl \{U^{\text {t}}\eta : \, \eta \in \bigl [\hbox {MMed}_-(P^U),\,\hbox {MMed}_+(P^U)\bigr ]\Bigr \} \end{aligned}$$

via the partition constraint (6). Figure 2 shows the range of the function \(U\mapsto \varPsi (U,P_n)\) for the empirical distribution \(P_n\) associated with the city data in Fig. 1, where \(n=76\). The parts colored red correspond to possible quarter medians.

2.4 Asymptotics

Throughout this section we assume that P is a probability measure on \((\mathbb {R}^d,\mathcal {B}^d)\) with the property that

$$\begin{aligned} P^a~\text {has a unique median for all}~a \in \mathbb {S}_{d-1},\end{aligned}$$
(13)

and that \(X_1,X_2,\ldots \) are independent copies of a random vector X with distribution P. We further assume that, for each \(n\in \mathbb {N}\), the pair \((\theta _n,U_n)\in \mathbb {R}^d\times \mathbb {O}(d)\) with \(\theta _n=\theta _n(X_1,\ldots ,X_n)\), \(U_n=U_n(X_1,\ldots ,X_n)\), is a solution of the quarter median problem for the (random) empirical distribution \(P_n\) associated with \(X_1,\ldots ,X_n\). We also assume that \(\theta _n\) and \(U_n\) are measurable with respect to the respective Borel \(\sigma \)-fields on \((\mathbb {R}^d)^n\). A discussion of the associated selection problem and an extension of the results to more general random elements is given in Sect. 2.6.

We deal with consistency first and then consider asymptotic normality. Recall that uniqueness and convergence refer not to the U-matrices themselves, but to the associated equivalence classes \([U]\in \mathbb {H}(d)\), hence we have to specify what convergence of the U-part means. On \(\mathbb {O}(d)\) we use the topology induced by \(\mathbb {R}^{d\times d}\), where a sequence of matrices converges if all entries converge individually. It is well known that this makes \(\mathbb {O}(d)\) a compact and separable Hausdorff space. We use the quotient topology on \(\mathbb {H}(d)\), which is the finest topology with the property that the mapping \(\mathbb {O}(d)\rightarrow \mathbb {H}(d)\), \(U\mapsto [U]\), is continuous. As we have \(\mathbb {H}(d)=\mathbb {O}(d)/G\) with some finite group \(G\subset \mathbb {O}(d)\), the space \(\mathbb {H}(d)\) with this topology is again a compact and separable Hausdorff space.

Theorem 2

Suppose that (13) holds.

(a):

If the quarter median for P is unique and given by \(\theta \in \mathbb {R}^d\), then \(\theta _n\) converges almost surely to \(\theta \) as \(n\rightarrow \infty \).

(b):

If the solution of the quarter median problem for P is unique and given by \((\theta ,H)\in \mathbb {R}^d\times \mathbb {H}(d)\), then \((\theta _n,[U_n])\) converges almost surely to \((\theta ,H)\) as \(n\rightarrow \infty \).

In connection with distributional asymptotics we require (13) and also that P is absolutely continuous. Let \(M_a=M(a)\) be the (unique) median of \(P^a\), \(a\in \mathbb {S}_{d-1}\). As in the paper of Grübel (1996) we further assume that \(P^a\) has a density \(f_a\) such that

$$\begin{aligned}&(a,t)\rightarrow f_a(t)\text { is continuous at each}~(a,t) \in D:=\big \{\big (a,M_a\big ):a\in \mathbb {S}_{d-1}\big \}, \end{aligned}$$
(14)
$$\begin{aligned}&f_a(M_a)>0~\text {for each}~a\in \mathbb {S}_{d-1}, \end{aligned}$$
(15)

and, in addition, that

$$\begin{aligned}&\text {there is a point}~\theta \in \mathbb {R}^d~\text {such that }M_a=a^{\text {t}}\theta \text { for each}~a\in \mathbb {S}_{d-1}. \end{aligned}$$
(16)

For example, (16) is satisfied if P is symmetric about \(\theta \) in the sense that \((X_1-\theta )\) and \(-(X_1-\theta )\) have the same distribution. Of course, if (13) and (16) hold then the quarter median for P is unique and given by \(\theta \).

For \(U\in \mathbb {O}(d)\) with rows \(b_1^{\text {t}},\ldots ,b_d^{\text {t}}\,\) let

$$\begin{aligned} \varDelta (P,U) \, :=\, \mathrm{diag}\biggl (\frac{1}{4f_{b_1}(M_{b_1})^2},\cdots , \frac{1}{4f_{b_d}(M_{b_d})^2}\biggr ). \end{aligned}$$
(17)

Of course, \(f_a\) and \(M_a\) also depend on P. Finally, we write \(X_n\rightarrow _{\text {\tiny d}}Q\) for convergence in distribution of a sequence \((X_n)_{n\in \mathbb {N}}\) of random variables to a random variable with distribution Q, and \(N_d(\mu ,\varSigma )\) for the d-dimensional normal distribution with mean vector \(\mu \) and covariance matrix \(\varSigma \).

Theorem 3

Suppose that the conditions (13)–(16) hold and that the solution of the quarter median problem for the absolutely continuous probability measure P is unique and given by \((\theta ,H)\). Then, in the above setup, and with \(U\in H\),

$$\begin{aligned} \sqrt{n}\bigl (\theta _n-\theta ) \, \rightarrow _{\text {\tiny d}}\, N_d\bigl (0,\varSigma (P,U)\bigr ) \quad \text {as } n\rightarrow \infty , \end{aligned}$$
(18)

where \(\,\varSigma (P,U):= U^{\text {t}}\varDelta (P,U)U\) and \(\varDelta (P,U)\) is given by (17).

As part of the proof we will show that \(\varSigma (P,U)\) does not depend on the choice of the element U of H.

2.5 Elliptical distributions

We now consider a special family of distributions in more detail. Let \(h:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be a function with the property

$$\begin{aligned} 0 \,<\, c_d(h):= \int _{\mathbb {R}^d} h(\Vert x\Vert ^2)\, dx \, <\, \infty . \end{aligned}$$
(19)

Then, for each \(\mu \in \mathbb {R}^d\) and each positive definite matrix \(\varSigma \in \mathbb {R}^{d\times d}\),

$$\begin{aligned} f(x) = f(x;\mu ,\varSigma )\; =\; \frac{1}{c_d(h) (\det \varSigma )^{1/2}} \,h\bigl ((x-\mu )^{\text {t}}\varSigma ^{-1}(x-\mu )\bigr ), \quad x\in \mathbb {R}^d, \end{aligned}$$

is the density of a probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\), the elliptical distribution with location vector \(\mu \) and dispersion matrix \(\varSigma \). We abbreviate this to \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\). If \(\mu =0\) and \(\varSigma =I_d\) we write \(\mathrm{Sym}_d(h)\) for \(\mathrm{Ell}_d(0,I_d;h)\) and speak of a spherically symmetric distribution with the density \(h\big (\Vert x \Vert ^2\big )/c_d(h),\,x\in \mathbb {R}^d\). A transformation to spherical coordinates leads to

$$\begin{aligned} c_d(h)\, =\, \frac{2\pi ^{d/2}}{\varGamma (d/2)} \int _0^\infty r^{d-1} h(r^2)\, dr =\, \frac{\pi ^{d/2}}{\varGamma (d/2)} \int _0^\infty r^{\frac{d}{2}-1} h(r)\, dr. \end{aligned}$$
(20)

The function h is said to be the density generator of the elliptical distributions \(\mathrm{Ell}_d(\mu ,\varSigma ;h).\) For \(h(t)=\exp (-t/2)\) the associated constant is \(c_d(h)= (2\pi )^{d/2}\), and we obtain the multivariate normal distributions \(N_d(\mu ,\varSigma )\).

For later purposes we list some properties of elliptical distributions. Clearly, if \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) and if \(T:\mathbb {R}^d\rightarrow \mathbb {R}^d\), \(T(x) = Ax + b\) with \(A\in \mathbb {R}^{d\times d}\) regular, \(b\in \mathbb {R}^d\), then \(P^T=\mathrm{Ell}_d(A\mu +b, A \varSigma A^{\text {t}}; h)\). Further, for \(d\ge 2\) the univariate marginal densities of \(P=\mathrm{Sym}_d(h)\) coincide and are given by

$$\begin{aligned} g_1(y)=g(y^2)/c_d(h),\ y\in \mathbb {R}, \end{aligned}$$

where

$$\begin{aligned} g(y)\&=\ \int _{\mathbb {R}^{d-1}} h\big (y + x_2^2+ \cdots + x_d^2\big ) \, dx_2\ldots dx_d\\&=\ \frac{\pi ^{\frac{d-1}{2}}}{\varGamma \big (\frac{d-1}{2}\big )}\int _{y}^\infty h(t)\big (t-y \big )^{\frac{d-1}{2}-1}\,dt,\quad y\ge 0. \end{aligned}$$

The function g is nonincreasing and continuous on \((0,\infty )\). Hence the univariate marginal distributions are unimodal with mode 0 and their density is continuous on \(\mathbb {R}{\setminus } \{0\}\); see e.g. Fang et al. (1990, p. 37).

In what follows we deal with generators h that satisfy the condition

$$\begin{aligned} \sup _{0\le t\le \epsilon } h(t)<\infty \quad \text {for some finite interval }[0,\epsilon ]. \end{aligned}$$
(21)

Then g and \(g_1\) are bounded and continuous. Additionally, with

$$\begin{aligned} c_k(h)\, :=\, \frac{2\pi ^{k/2}}{\varGamma (k/2)} \int _0^\infty r^{k-1} h(r^2)\, dr \, =\, \frac{\pi ^{k/2}}{\varGamma (k/2)} \int _0^\infty r^{\frac{k}{2}-1} h(r)\, dr \end{aligned}$$
(22)

for \(k=1,\ldots ,d-1\), we have \(0<c_k(h)<\infty \); in particular, \(g(0)=c_{d-1}(h).\) Finally, the transformation law mentioned above can be generalized to \(k\times d\)-matrices with rank \(k<d\) at the cost of changing the function h. For example, the first marginal distribution associated with \(\mathrm{Ell}_d(0,\varSigma ;h)\) is \(\mathrm{Ell}_1(0, \sigma _{11}; g)\) if \(\varSigma =(\sigma _{ij})_{i,j=1}^d\), and has a density that is bounded, continuous and strictly positive at 0. More generally, if \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) then the density \(f_a\) of \(P^a\) for \(a\in \mathbb {S}_{d-1}\) is given by

$$\begin{aligned} f_a(t)\,=\,\frac{1}{c_d(h) (a^{\text {t}}\varSigma a)^{1/2}}\,g\left( \frac{(t-a^{\text {t}}\mu )^2}{a^{\text {t}}\varSigma a}\right) , \quad t\in \mathbb {R}, \end{aligned}$$

\(M_a=a^{\text {t}}\mu \) is the unique median of \(P^a\), \(f_a(M_a)\) is positive for all \(a\in \mathbb {S}_{d-1},\) and the function \((a,t)\rightarrow f_a(t)\) is continuous. Hence the assumptions (13)–(16) of Theorem 3 are satisfied.

As \(\varSigma \) is positive definite and symmetric the principal axes theorem applies and gives the representation

$$\begin{aligned} \varSigma \,=\, U^{\text {t}}\, \mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\, U \end{aligned}$$
(23)

with some \(U\in \mathbb {O}(d)\) and \(\lambda _1\ge \lambda _2\ge \cdots \ge \lambda _d>0\), where the \(\lambda _i\)’s are the eigenvalues of \(\varSigma \). The distribution \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) is said to be strictly elliptical if no two of these eigenvalues coincide. The next theorem deals with the uniqueness of (the solution of) the quarter median (problem) for elliptical distributions. Note that condition (21) is not imposed there.
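
Numerically, a representation (23) can be obtained from a symmetric eigendecomposition; a sketch, assuming NumPy:

```python
# Sketch: obtain (23) with numpy.linalg.eigh; eigh returns ascending
# eigenvalues, so we reorder to lambda_1 >= ... >= lambda_d, and the rows
# of U are the transposed eigenvectors.
import numpy as np

Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
lam, V = np.linalg.eigh(Sigma)          # ascending eigenvalues, eigenvector columns
lam, V = lam[::-1], V[:, ::-1]          # descending order
U = V.T                                 # rows of U as in (23)
print(np.allclose(U.T @ np.diag(lam) @ U, Sigma))   # True
```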

Theorem 4

If \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\), then the quarter median of P is unique and given by \(\mu \). Moreover, if P is strictly elliptical, then the solution of the quarter median problem is unique and given by \((\mu ,[U])\) with U as in (23).

For samples from strictly elliptical distributions with density generators satisfying (21) we obtain asymptotic normality for a sequence of associated quarter medians \(\theta _n=\theta _n(X_1,\ldots ,X_n)\). Part (a) of the following result is essentially a corollary to Theorem 3. In the symmetric case we only have weak uniqueness, hence we need to consider this situation separately. Intermediate cases, where only some of the eigenvalues coincide, can be treated similarly, but require a certain amount of bookkeeping.

Theorem 5

Let \(X_n\), \(n\in \mathbb {N}\), be independent random vectors with distribution \(P=\mathrm{Ell}_d(\mu ,\varSigma ; h)\), with h satisfying (21). Let \(c_{d}(h)\) and \(c_{d-1}(h)\) be as in (19) and (22), and let

$$\begin{aligned} \sigma _{{{\textsc {QMed}}}}^2 := c_{d}(h)^2/(2c_{d- 1}(h))^2. \end{aligned}$$
(a):

Suppose that \((\theta _n,U_n)_{n\in \mathbb {N}}\) is a sequence of random variables with values in \(\mathbb {R}^d\times \mathbb {O}(d)\) such that, for all \(n\in \mathbb {N}\), \((\theta _n,U_n)\) solves the quarter median problem for \(P_n\). Then, if P is strictly elliptical,

$$\begin{aligned} \sqrt{n}(\theta _n-\mu ) \; \rightarrow _{\text {\tiny d}}\; N_d\big (0,\sigma _{{{\textsc {QMed}}}}^2\,\varSigma \big )\quad \text {as } n\rightarrow \infty . \end{aligned}$$
(24)
(b):

Suppose that \(\varSigma =\lambda I_d\) for some \(\lambda >0\). Let \((U_n)_{n\in \mathbb {N}}\) be an arbitrary sequence of elements of \(\mathbb {O}(d)\) and suppose that, for all \(n\in \mathbb {N}\), \(\theta _n=U_n^{\text {t}}\eta _n\), where \(\eta _n\) is a marginal median of \(P_n^{U_n}\). Then

$$\begin{aligned} \sqrt{n}(\theta _n-\mu ) \; \rightarrow _{\text {\tiny d}}\; N_d\big (0,\sigma _{{{\textsc {QMed}}}}^2\,\varSigma \big )\quad \text {as } n\rightarrow \infty . \end{aligned}$$
(25)

Remarkably, the covariance matrix of the limiting normal distribution is always a fixed multiple of the dispersion matrix \(\varSigma \). This is in contrast to other familiar estimators of \(\mu \), such as the empirical spatial median \({\textsc {SMed}}(X_1,\dots ,X_n)\). In fact, with reference to Bai et al. (1990), where the asymptotics of the spatial median are considered in an even more general context, Somorčík (2006) states that in the d-dimensional case, under some weak conditions,

$$\begin{aligned} \sqrt{n}\bigl ({\textsc {SMed}}(X_1,\dots ,X_n)-\mu \bigr ) \; \rightarrow _{\text {\tiny d}}\; N_d(0,V), \end{aligned}$$
(26)

where the asymptotic covariance matrix is given by

$$\begin{aligned} V=D_1^{-1}D_2D_1^{-1} \end{aligned}$$
(27)

with

$$\begin{aligned} D_1=E\left( \frac{1}{\Vert X-\mu \Vert }\left( I_d - \frac{(X-\mu )(X-\mu )^{\text {t}}}{\Vert X-\mu \Vert ^2}\right) \right) \end{aligned}$$

and

$$\begin{aligned} D_2=E\left( \frac{(X-\mu )(X-\mu )^{\text {t}}}{\Vert X-\mu \Vert ^2}\right) . \end{aligned}$$

In the special spherically symmetric case \(\varSigma =I_d\) the identity (27) simplifies to

$$\begin{aligned} V=\sigma _{{\textsc {SMed}}}^2I_d, \end{aligned}$$

where

$$\begin{aligned} \sigma _{{\textsc {SMed}}}^2=\frac{d}{(d-1)^2}\left( E\left( \frac{1}{\Vert X-\mu \Vert }\right) \right) ^{-2}. \end{aligned}$$
(28)

It was shown by Grübel (1996) that in this symmetric case the empirical orthomedian has the same asymptotic covariance matrix as the spatial median. In what follows we compare the asymptotic covariances of several estimators of \(\mu \), in particular the empirical quarter median \({{\textsc {QMed}}}(X_1,\dots ,X_n)\), the sample mean \(\frac{1}{n}\sum _{j=1}^nX_j\), and the maximum likelihood estimator \(\hbox {ML}(X_1,\dots ,X_n)\), for some specific distribution families, with the main focus on strictly elliptical families. In all cases, the limit distribution of the standardized estimators is a centered multivariate normal, and by asymptotic covariance we mean the respective covariance matrix.

Example 1

We work out the details for some specific distribution families in the case of dimension \(d=2\).

(a) For normal distributions \(P=N_2(\mu ,\varSigma )\) Theorem 5 (a) leads to

$$\begin{aligned} \sqrt{n}({{\textsc {QMed}}}(X_1,\dots ,X_n)-\mu ) \; \rightarrow _{\text {\tiny d}}\; N_2\big (0,\tfrac{\pi }{2}\varSigma \big ). \end{aligned}$$
(29)

The maximum likelihood estimator of \(\mu \) is the sample mean; its (asymptotic) covariance is simply \(\varSigma \) itself, hence the asymptotic efficiency of the quarter median is \(2/\pi \approx 64\%\). Of course, the quarter median is more robust with respect to gross outliers in the data. In the symmetric case \(\varSigma =I_2\) both orthomedian and spatial median have asymptotic covariance \(\tfrac{4}{\pi }I_2\).

For non-symmetric two-dimensional normal distributions with \(\varSigma =\mathrm{diag}(1,\lambda )\), \(0<\lambda <1\), Brown (1983) and Grübel (1996) obtained expressions for the asymptotic covariance matrices of the spatial median and the orthomedian. Both are of diagonal form, with the two diagonal entries given by expressions involving infinite series and double integrals, respectively. These can be used numerically; see Brown (1983, Table 1) and Grübel (1996, Table 1). For the second entry, which refers to the shorter axis, the quarter median has the simple explicit value \(\lambda \pi /2\), which is smaller than the corresponding value for the spatial median if \(\lambda <0.038\), and smaller than that for the orthomedian if \(\lambda <0.1\).

(b) Taking \(h(t)=\exp (-\tfrac{1}{2}\sqrt{t}),\,t\ge 0,\) we obtain the bivariate doubly exponential distributions \(\mathrm{MDE}_2(\mu ,\varSigma )\), with densities given by

$$\begin{aligned} f(x;\mu ,\varSigma )=\frac{1}{8\pi }\frac{1}{\sqrt{\det \varSigma }} \exp \left( -\tfrac{1}{2}\left( (x-\mu )^{\text {t}}\varSigma ^{-1}(x-\mu )\right) ^{\frac{1}{2}}\right) ,~x\in \mathbb {R}^2. \end{aligned}$$

The asymptotic covariance matrix for the sample mean of independent two-dimensional random vectors \(X_1,\dots ,X_n\) with the density \(f(\cdot ;\mu ,\varSigma )\) is given by \(\mathrm{cov}(X_1)=12\,\varSigma \); see, e.g. Gómez et al. (1998, Proposition 3.2 (iii)). Using

$$\begin{aligned} c_1(h)=2\int _0^\infty e^{-\frac{1}{2}t}\,dt =4, \quad c_2(h)=2\pi \int _0^\infty t e^{-\frac{1}{2} t}\,dt=8\pi , \end{aligned}$$

we obtain \(\sigma ^2_{{{\textsc {QMed}}}}=\pi ^2\), so that \(\pi ^2\varSigma \) is the asymptotic covariance matrix for the sample quarter median. For a given fixed symmetric positive definite \(\varSigma \) the Fisher information matrix can be calculated to be \(\frac{1}{8}\varSigma ^{-1}\) (see, e.g. Mitchell 1989, formulas (3.2), (3.3)). So, by the general asymptotic theory of maximum likelihood estimation \(8\varSigma \) is the asymptotic covariance of the maximum likelihood estimator \(\hbox {ML}(X_1,\dots ,X_n).\) In the symmetric case \(\varSigma =I_d\), it follows from (26) and (28) that

$$\begin{aligned} \sqrt{n}\bigl ({\textsc {SMed}}(X_1,\dots ,X_n)-\mu \bigr ) \; \rightarrow _{\text {\tiny d}}\; N_2(0,8I_2), \end{aligned}$$
(30)

which means that the asymptotic covariances of \({\textsc {SMed}}(X_1,\dots ,X_n)\) and \(\hbox {ML}(X_1,\dots ,X_n)\) coincide in this case. This is not surprising, because \({\textsc {SMed}}(X_1,\dots ,X_n)\) is the maximum likelihood estimator of \(\mu \) in this symmetric case.

(c) With \(h(t)=(1+t)^{-3/2}\) we obtain a class of bivariate Cauchy distributions \(\mathrm{MC}_2(\mu ,\varSigma )\) with densities

$$\begin{aligned} f(x;\mu ,\varSigma )=\frac{1}{2\pi }\frac{1}{\sqrt{\det \varSigma }} \left( 1 + (x-\mu )^{\text {t}}\varSigma ^{-1}(x-\mu )\right) ^{-\frac{3}{2}},~x\in \mathbb {R}^2. \end{aligned}$$

The associated marginal distributions are univariate Cauchy distributions. For this distribution family the sample mean is not even consistent, hence we do not take it into consideration as an estimator for \(\mu \). Again we use Mitchell (1989, formulas (3.2), (3.3)) to see that for given fixed positive definite \(\varSigma \) the Fisher information matrix is \(\frac{3}{5}\varSigma ^{-1}.\) Thus, by the general asymptotic theory of maximum likelihood estimators,

$$\begin{aligned} \hbox {ML}(X_1,\dots ,X_n) \; \rightarrow _{\text {\tiny d}}\; N_2\left( 0,\tfrac{5}{3}\varSigma \right) . \end{aligned}$$

Using

$$\begin{aligned} c_1(h)=2\int _0^\infty \big (1+t^2\big )^{-\frac{3}{2}}\,dt =2, \quad c_2(h)=2\pi \int _0^\infty t\big (1+t^2\big )^{-\frac{3}{2}} \,dt=2\pi \end{aligned}$$

we arrive at the value \(\tfrac{\pi ^2}{4}\varSigma \) for the asymptotic covariance matrix of the sample quarter median. In the symmetric case \(\varSigma =I_d\), using (26) and (28), it follows that

$$\begin{aligned} \sqrt{n}\bigl ({\textsc {SMed}}(X_1,\dots ,X_n)-\mu \bigr ) \; \rightarrow _{\text {\tiny d}}\; N_2(0,2I_2). \end{aligned}$$
(31)

(d) With \(h(t)=(1-t)^2\) for \(0\le t \le 1\) and 0 elsewhere we have a special class of symmetric bivariate Pearson type II distributions \(\mathrm{SMPII}_2(\mu ,\varSigma )\) with densities

$$\begin{aligned} f(x;\mu ,\varSigma )=\frac{3}{\pi }\frac{1}{\sqrt{\det \varSigma }} \max \bigl \{0, 1 - (x-\mu )^{\text {t}}\varSigma ^{-1}(x-\mu )\bigr \}^2,~x\in \mathbb {R}^2. \end{aligned}$$

The asymptotic covariance matrix for the sample mean of independent two-dimensional random vectors \(X_1,\dots ,X_n\) with the density \(f(\cdot ;\mu ,\varSigma )\) is \(\mathrm{cov}(X_1)=\frac{1}{8}\varSigma \); see Fang et al. (1990, p. 89). The centered bivariate limit normal distribution of the maximum likelihood estimator has the covariance matrix \(\frac{1}{12}\varSigma \). Using

$$\begin{aligned} c_1(h)=2\int _0^1 \big (1-t^2\big )^2\,dt=\frac{16}{15}, \quad c_2(h)=2\pi \int _0^1 t\big (1-t^2\big )^2\,dt=\frac{\pi }{3} \end{aligned}$$

it follows that \(\frac{25}{1024}\pi ^2\varSigma \) is the asymptotic covariance matrix for the sample quarter median. In the symmetric case \(\varSigma =I_d\) we obtain

$$\begin{aligned} \sqrt{n}\bigl ({\textsc {SMed}}(X_1,\dots ,X_n)-\mu \bigr ) \; \rightarrow _{\text {\tiny d}}\; N_2\left( 0,\frac{25}{128}I_2\right) . \end{aligned}$$

(e) Taking \(h(t)=\frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2},\,t\ge 0,\) we obtain the symmetric bivariate logistic distributions \(\mathrm{SML}_2(\mu ,\varSigma )\), with densities given by

$$\begin{aligned} f(x;\mu ,\varSigma )=\frac{2}{\pi \sqrt{\det \varSigma }} \frac{\exp \left( -(x-\mu )^{\text {t}}\varSigma ^{-1}(x-\mu )\right) }{\left( 1+\exp \left( -(x-\mu )^{\text {t}}\varSigma ^{-1}(x-\mu )\right) \right) ^2},~x\in \mathbb {R}^2. \end{aligned}$$

For \(X\sim \mathrm{SML}_2(0,I_2)\) the distribution of \(R^2=\Vert X\Vert ^2\) is the univariate half-logistic distribution with density \(2\frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2},\,t\ge 0\). Then,

$$\begin{aligned} m_{\mathrm{SML}_2}:=\frac{1}{2}E(R^2)\,=\,\int _0^\infty t \frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2}\,dt\,=\,\log 2\,\approx \,0.69314718 \end{aligned}$$
(32)

and \(\mathrm{cov}(X)=m_{\mathrm{SML}_2}I_2.\) From this we deduce that the asymptotic covariance matrix for the sample mean of independent two-dimensional random vectors \(X_1,\dots ,X_n\) with the above density \(f(\cdot ;\mu ,\varSigma )\) is \(\mathrm{cov}(X_1)=m_{\mathrm{SML}_2}\varSigma \). With

$$\begin{aligned} k_{\mathrm{SML}_2}:=4\int _0^\infty t\left( \frac{1-\exp (-t)}{1+\exp (-t)}\right) ^2\frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2}\,dt\,\approx \,1.59086291 \end{aligned}$$
(33)

the Fisher information matrix is \(k_{\mathrm{SML}_2}\varSigma ^{-1}\), and the centered bivariate limit normal distribution of the maximum likelihood estimator has the covariance matrix \(k_{\mathrm{SML}_2}^{-1}\varSigma \). Further, with

$$\begin{aligned} c_1(h)&=\int _0^\infty t^{-\tfrac{1}{2}}\frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2}\,dt \approx 0.67371824,\\ c_2(h)&=\pi \int _0^\infty \frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2}\,dt=\frac{\pi }{2} \end{aligned}$$

we obtain that \(\frac{c_2(h)^2}{(2c_1(h))^2}\varSigma \, \approx \,1.35901156\varSigma \) is the covariance matrix of the centered limiting normal distribution of the quarter median. In the symmetric case \(\varSigma =I_d\) we get

$$\begin{aligned} \sqrt{n}\bigl ({\textsc {SMed}}(X_1,\dots ,X_n)-\mu \bigr ) \; \rightarrow _{\text {\tiny d}}\; N_2\left( 0,\frac{1}{2c_1(h)^2} I_2\right) , \end{aligned}$$

where \(\frac{1}{2c_1(h)^2}\approx 1.10157328\). Numerical values for the ratios \(k_{\mathrm{SML}_2}^{-1}/m_{\mathrm{SML}_2}\) and \(k_{\mathrm{SML}_2}^{-1}/(\frac{c_2(h)}{2c_1(h)})^2\) are shown in the last line of Table 1.
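
The constants appearing in parts (a)–(e) can be checked by numerical quadrature; a sketch, assuming SciPy (the helper \(\mathtt{c\_k}\) is ours):

```python
# Sketch: numerical check of sigma^2_QMed = (c_2(h) / (2 c_1(h)))^2 for the
# density generators of Example 1, using the first integral form in (22).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def c_k(h, k, upper=np.inf):
    val, _ = quad(lambda r: r ** (k - 1) * h(r ** 2), 0, upper)
    return 2 * np.pi ** (k / 2) / gamma(k / 2) * val

generators = {                    # generator and upper integration limit
    "N_2":     (lambda t: np.exp(-t / 2),                      np.inf),
    "MDE_2":   (lambda t: np.exp(-np.sqrt(t) / 2),             np.inf),
    "MC_2":    (lambda t: (1 + t) ** (-1.5),                   np.inf),
    "SMPII_2": (lambda t: (1 - t) ** 2,                        1.0),   # support [0,1]
    "SML_2":   (lambda t: np.exp(-t) / (1 + np.exp(-t)) ** 2,  np.inf),
}

for name, (h, up) in generators.items():
    c1, c2 = c_k(h, 1, up), c_k(h, 2, up)
    print(f"{name:8s} sigma^2_QMed = {(c2 / (2 * c1)) ** 2:.5f}")
# pi/2, pi^2, pi^2/4, 25 pi^2/1024 and approx. 1.35901, as computed above
```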

We summarize some special results presented in Example 1 for the respective strictly elliptical cases in Table 1, where the relative asymptotic efficiencies \(\mathrm{eff}(\mathrm{Mean}, \mathrm{ML})\) and \(\mathrm{eff}({{\textsc {QMed}}},\mathrm{ML})\) of the mean and the quarter median with respect to the maximum likelihood estimator are shown; these are the ratios \(\sigma _\mathrm{ML}^2/\sigma _\mathrm{Mean}^2\) and \(\sigma _\mathrm{ML}^2/\sigma _{{{\textsc {QMed}}}}^2\).

Table 1 Relative asymptotic efficiencies (ratios computed from Example 1)

\(N_2(\mu ,\varSigma )\): \(\mathrm{eff}(\mathrm{Mean},\mathrm{ML})=1\), \(\mathrm{eff}({{\textsc {QMed}}},\mathrm{ML})=2/\pi \approx 0.6366\)
\(\mathrm{MDE}_2(\mu ,\varSigma )\): \(\mathrm{eff}(\mathrm{Mean},\mathrm{ML})=2/3\approx 0.6667\), \(\mathrm{eff}({{\textsc {QMed}}},\mathrm{ML})=8/\pi ^2\approx 0.8106\)
\(\mathrm{MC}_2(\mu ,\varSigma )\): \(\mathrm{eff}(\mathrm{Mean},\mathrm{ML})\) not defined, \(\mathrm{eff}({{\textsc {QMed}}},\mathrm{ML})=20/(3\pi ^2)\approx 0.6755\)
\(\mathrm{SMPII}_2(\mu ,\varSigma )\): \(\mathrm{eff}(\mathrm{Mean},\mathrm{ML})=2/3\approx 0.6667\), \(\mathrm{eff}({{\textsc {QMed}}},\mathrm{ML})=1024/(300\pi ^2)\approx 0.3459\)
\(\mathrm{SML}_2(\mu ,\varSigma )\): \(\mathrm{eff}(\mathrm{Mean},\mathrm{ML})\approx 0.9069\), \(\mathrm{eff}({{\textsc {QMed}}},\mathrm{ML})\approx 0.4625\)

In contrast to the other multivariate versions of the median that appear in the above example, the quarter median goes beyond providing a location estimate, as it comes with an equipartition basis. In the elliptical case \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) the basis elements are unit eigenvectors of \(\varSigma \) and thus contain information about the dispersion parameter. In Theorem 3 the asymptotic behavior of \((U_n)_{n\in \mathbb {N}}\) is important. In the strictly elliptical case we have consistency, and we may assume that convergence takes place in \(\mathbb {SO}(d)\); see also the beginning of the proof of Theorem 3. We aim at a second order statement about the distributional asymptotics of this sequence.

In the following theorem we only consider \(d=2\), but see the ensuing remarks. We can then use the parametrization of \(\mathbb {SO}(2)\) given in Remark 1 (b) by an angle from a half-open interval I of length \(\pi /2\). We assume that \(U=U_\alpha \), with I chosen such that \(\alpha \) lies in its interior. Then consistency implies that \(U_n=U_{\alpha _n}\) with I-valued random variables \(\alpha _n\) that converge to \(\alpha \) with probability 1. It then makes sense to consider the distributional asymptotics of \(\sqrt{n}(\alpha _n-\alpha )\) as \(n\rightarrow \infty \).

Theorem 6

Let \(P=\mathrm{Ell}_2(\mu ,\varSigma ;h)\) and let \(X_n, \theta _n, U_n\), \(n\in \mathbb {N}\), be as in Theorem 5 (a). Let \(\lambda _1\) and \(\lambda _2\) be the eigenvalues of \(\,\varSigma \), with \(\lambda _1>\lambda _2\). Then, with \(\alpha \) and \(\alpha _n\) as defined above,

$$\begin{aligned} \sqrt{n}(\alpha _n-\alpha )\; \rightarrow _{\text {\tiny d}}\; N\bigl (0,\sigma ^2(\lambda _1,\lambda _2)\bigr ), \end{aligned}$$

where

$$\begin{aligned} \sigma ^2(\lambda _1,\lambda _2)\, =\, \frac{c_2(h)^2\lambda _1\lambda _2}{4c_1(h)^2(\sqrt{\lambda _1}-\sqrt{\lambda _2})^2}. \end{aligned}$$
(34)

Moreover, \(\alpha _n\) and \(\theta _n\) are asymptotically independent.

In particular, the asymptotic variance in (34) is large if the eigenvalues are close to each other. This is to be expected as \(\mathrm{Ell}_2(\mu ,\varSigma ;h)\) is then close to a symmetric distribution, where U is not unique.

In dimensions higher than two a possible approach could be based on the representation of the Lie group \(\mathbb {SO}(d)\) by its Lie algebra \(\mathfrak {so}(d)\), which consists of the skew-symmetric matrices \(A\in \mathbb {R}^{d\times d}\), where \(U\in \mathbb {SO}(d)\) is written as the matrix exponential \(U=\exp (A)\) of some \(A\in \mathfrak {so}(d)\); see Muirhead (1982, Theorem A9.7) for a proof of the latter assertion, and Hall (2004) for an elementary introduction to Lie groups, Lie algebras, and representations. This leads to an asymptotic distribution Q for \((U^{\text {t}}U_n)^{\sqrt{n}}\) as \(n\rightarrow \infty \), where Q is a probability measure on (the Borel subsets of) \(\mathbb {SO}(d)\), and it avoids any ambiguities caused by parametrization. In fact, Q is the distribution of the random orthogonal matrix \(\exp (A)\), where A is a random skew symmetric matrix with jointly normal subdiagonal entries. If \(d=2\), as in the above theorem, then

$$\begin{aligned} U_\alpha = \exp \bigl (A(\alpha )\bigr )\quad \text {with } A(\alpha ):= \begin{pmatrix} 0 & -\alpha \\ \alpha & 0\end{pmatrix}, \end{aligned}$$

and A is then specified by its one subdiagonal entry, which has a central normal distribution with variance given by (34).
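
For \(d=2\) the correspondence between \(\mathfrak {so}(2)\) and \(\mathbb {SO}(2)\) is easily verified numerically; a sketch using scipy.linalg.expm:

```python
# Sketch: the matrix exponential of the skew-symmetric A(alpha) reproduces
# the rotation U_alpha of (8).
import numpy as np
from scipy.linalg import expm

alpha = 0.3
A = np.array([[0.0, -alpha], [alpha, 0.0]])        # element of so(2)
R = np.array([[np.cos(alpha), -np.sin(alpha)],
              [np.sin(alpha),  np.cos(alpha)]])    # U_alpha from (8)
print(np.allclose(expm(A), R))                     # True
```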

2.6 Measurability, selection and algorithms

Let S be a topological space with Borel \(\sigma \)-field \(\mathcal {B}(S)\). We write \(C_b(S)\) for the set of bounded continuous functions \(f:S\rightarrow \mathbb {R}\). Let \(Z,Z_1,Z_2,\ldots \) be S-valued random variables, i.e. \(\mathcal {B}(S)\)-measurable functions on some probability space (which may depend on the respective variable). In its classical form, convergence in distribution \(Z_n\rightarrow _{\text {\tiny d}}Z\) of \(Z_n\) to Z as \(n\rightarrow \infty \) means that

$$\begin{aligned} \lim _{n\rightarrow \infty }Ef(Z_n) = Ef(Z) \quad \text {for all }f\in C_b(S). \end{aligned}$$
(35)

This is the notion that we used in Sect. 2.4. An extension of this concept, due to Hoffmann–Jørgensen, can be applied if the \(Z_n\)’s are not measurable with respect to the full Borel \(\sigma \)-field on S; roughly, the expectations \(Ef(Z_n)\) in (35) are then replaced by outer expectations \(E^\star f(Z_n)\). Similarly, almost sure convergence now refers to outer probabilities. The research monographs of Dudley (1999) and van der Vaart and Wellner (1996) give an in-depth treatment of this circle of ideas, together with a variety of applications. This extension appears in our proof of Theorem 3 in connection with empirical processes; it can also be used if the estimators \((\theta _n,U_n)\) are not measurable. We refer the reader to the paper of Kuelbs and Zinn (2013), where this is worked out in detail for the related situation of quantile processes.

While this avoids the problem of choosing an estimator from the respective solution set in a measurable way, it is of independent interest whether such a selection is possible. For the classical one-dimensional median and the marginal median this can obviously be done by choosing the midpoint of the respective (component) interval, but no such simple rule seems to exist for the quarter median.

We think of estimators as functions of the random variables \(X_i\), \(i\in \mathbb {N}\), and it is then enough to establish measurability for these functions. To be precise, for a given \(n\in \mathbb {N}\) and an n-tuple \((x_1,\dots ,x_n)\in (\mathbb {R}^d)^n\) we denote by \(\big (\theta _n(x_1,\dots ,x_n),U_n(x_1,\dots ,x_n)\big )\in \mathbb {R}^d\times \mathbb {SO}(d)\) a solution of the quarter median problem for the empirical distribution \(P_{n;x_1,\dots ,x_n}=n^{-1}\sum _{k=1}^n\delta _{x_k}\).

Proposition 2

There exists a function \(\tau :(\mathbb {R}^d)^n\rightarrow \mathbb {R}^d\times \mathbb {SO}(d)\) that is measurable with respect to the Borel \(\sigma \)-algebra on \((\mathbb {R}^d)^n\) and the Borel \(\sigma \)-algebra on \(\mathbb {R}^d\times \mathbb {SO}(d)\) such that, for each \((x_1,\dots ,x_n)\in (\mathbb {R}^d)^n\),

$$\begin{aligned} \big (\theta _n(x_1,\dots ,x_n),U_n(x_1,\dots ,x_n)\big )\, :=\, \tau (x_1,\dots ,x_n) \end{aligned}$$

is a solution of the quarter median problem for \(P_{n;x_1,\dots ,x_n}\).

Moreover, \(\tau \) may be chosen to be permutation invariant in the sense that

$$\begin{aligned} \tau (x_{\pi (1)},\ldots ,x_{\pi (n)}) = \tau (x_1,\ldots ,x_n) \end{aligned}$$
(36)

for all permutations \(\pi \) of \(\{1,\ldots ,n\}\).

Permutation invariance means that the estimates depend on the data only through the respective empirical distribution.

Fig. 3 Number of cities in the upper right quadrant, \(\alpha \in [0,\pi /2)\)

Selecting a solution in a measurable way from the respective set of all solutions may seem to be a mere corollary of having an algorithm that returns an estimate for every data input \(x_1,\ldots ,x_n\). Indeed, without such an algorithm a statistical procedure would be of limited use. In dimension two, and using the parametrization in Remark 1 (b), we obtain a real function on the interval \([0,\pi /2)\) by counting the number of data points in the upper right quadrant specified by \(U_\alpha \) and \(\hbox {MMed}_m(P^{U_\alpha })\); see Fig. 3 for a plot of this function for the city data in Fig. 1, where the counts refer to the open quarter space. By construction, all half-spaces contain at least half of the data points, so each angle that leads to a count of n/4 (if n is divisible by 4) gives a quarter median. For absolutely continuous distributions this approach, together with the intermediate value theorem, leads to a proof of Theorem 1 in dimension 2, and it may be used numerically via bisection. Such a bisection argument also works for functions that are piecewise constant and have jumps of size \(\pm 1\) only, provided a solution exists.
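
The scan just described is straightforward to implement; the following sketch (reusing \(\mathtt{U\_alpha}\), \(\mathtt{marginal\_median}\) and \(\mathtt{is\_quarter\_median}\) from earlier sketches, and assuming that n is divisible by 4) replaces bisection by a simple grid search:

```python
# Sketch: for each angle alpha, take the midpoint marginal median in the
# U_alpha coordinates, count the data points in the open upper-right
# quadrant, and confirm candidates with the conditions (5) and (6).
import numpy as np

def scan_quarter_median(X, grid=10000):
    n = len(X)
    for alpha in np.linspace(0.0, np.pi / 2, grid, endpoint=False):
        U = U_alpha(alpha)
        theta = U.T @ marginal_median(X @ U.T)[2]
        S = (X - theta) @ U.T
        if 4 * ((S[:, 0] > 0) & (S[:, 1] > 0)).sum() == n \
                and is_quarter_median(theta, U, X):
            return theta, U
    return None                     # no solution found on this grid
```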

In what follows, still considering the case \(d=2\), we present a different algorithm. Let \(x_1,\dots ,x_n\) be \(n\ge 2\) pairwise distinct data points in \(\mathbb {R}^2\). For \(1\le i<j\le n\) let \(b_{ij}=\frac{x_j-x_i}{\Vert x_j-x_i\Vert }\in \mathbb {S}_1\) be the direction of the line through \(x_i\) and \(x_j,\) and let \(b_{ij}'\in \mathbb {S}_1\) be orthogonal to \(b_{ij}.\) With the empirical distribution \(P_n=n^{-1}\sum _{k=1}^n \delta _{x_k}\) we associate the finite set \(\mathscr {L}(P_n)\) of pairs \((\theta _{ij},U_{ij})\), with \(U_{ij}\in \mathbb {O}(2)\) as the matrix with row vectors \(b_{ij}^{\text {t}}\) and \({b_{ij}'}^{\text {t}}\), and \(\theta _{ij}=U_{ij}^{\text {t}}\eta \) with some \(\eta \in \mathcal {M}\bigl (P_n^{U_{ij}}\bigr )\).

Theorem 7

There is an element \((\theta _{ij},U_{ij}) \in \mathscr {L}(P_n)\) that solves the quarter median problem for \(P_n\).

Hence, successively checking the conditions (5) and (6) for all \(\left( \theta _{ij},U_{ij}\right) \in \mathscr {L}(P_n)\), \(1\le i<j\le n\), provides a solution of the quarter median problem for \(P_n\). This procedure, however, may lead to an estimator that is not permutation invariant in the sense of (36), meaning that it would not be a function of the empirical distribution \(P_n\). For our introductory data example, see Fig. 1, this happens for both of the above algorithms: Fig. 3 shows that different solutions appear depending on whether the angles are scanned clockwise or counterclockwise. Similarly, the algorithm based on Theorem 7 could lead to the (i,j)-pair corresponding to the cities Berlin and Frankfurt am Main, or to the pair München and Dortmund, for example.
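A hypothetical implementation of this enumeration (our own sketch, with the conditions (5) and (6) read as the closed half-space and quarter-space mass requirements \(\ge 1/2\) and \(\ge 1/4\)) might look as follows:

qm_from_pairs <- function(X) {
  n <- nrow(X)                              # pairwise distinct rows assumed
  for (i in 1:(n - 1)) for (j in (i + 1):n) {
    v  <- X[j, ] - X[i, ]
    b  <- v / sqrt(sum(v^2))                # direction b_ij
    bp <- c(-b[2], b[1])                    # orthogonal direction b_ij'
    U  <- rbind(b, bp)                      # rows b_ij^t and b_ij'^t
    Y  <- X %*% t(U)
    eta <- apply(Y, 2, median)              # an element of M(P_n^{U_ij})
    Z  <- sweep(Y, 2, eta)                  # data centered at eta
    ok <- all(colMeans(Z >= 0) >= 1/2, colMeans(Z <= 0) >= 1/2,
              mean(Z[, 1] >= 0 & Z[, 2] >= 0) >= 1/4,
              mean(Z[, 1] >= 0 & Z[, 2] <= 0) >= 1/4,
              mean(Z[, 1] <= 0 & Z[, 2] >= 0) >= 1/4,
              mean(Z[, 1] <= 0 & Z[, 2] <= 0) >= 1/4)
    if (ok) return(list(theta = drop(t(U) %*% eta), U = U))
  }
  NULL                                      # unreachable by Theorem 7
}

The order in which the pairs (i,j) are visited determines which element of \(\mathscr {L}(P_n)\) is returned; this is precisely the lack of permutation invariance just described.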

3 Numerical comparison

We present a small simulation study on the performance of the quarter median (\({{\textsc {QMed}}}\)) as a location estimator, comparing it with that of three other procedures—the spatial median (\({\textsc {SMed}}\)); Oja’s simplicial median (\(\textsc {OMed}\)), see Oja (1983); and Tukey’s half-space median (\({\textsc {TMed}}\)), see Tukey (1975). For a detailed description of these classical estimators we refer to the survey papers of Small (1990) and Oja (2013). A Monte Carlo study on the performance of the three (and some other bivariate location) estimators is given by Massé and Plante (2003).

Samples are drawn from the strictly elliptical distributions \(N_2(\mu ,\varSigma )\) (bivariate normal), \(\mathrm{MDE}_2(\mu ,\varSigma )\) (bivariate doubly exponential), \(\mathrm{SMPII}_2(\mu ,\varSigma )\) (symmetric bivariate Pearson type II), and \(\mathrm{SML}_2(\mu ,\varSigma )\) (symmetric bivariate logistic) introduced in Example 1. The bivariate Cauchy distributions \(\mathrm{MC}_2(\mu ,\varSigma )\) also discussed there are not taken into consideration for the simulation study, as the Oja median does not exist for this distribution family. In fact, for a given distribution P on \(\mathbb {R}^2\), independent \(X_j\sim P\), \(j=1,2\), and \(\theta \in \mathbb {R}^2\), let \(\varDelta (X_1,X_2,\theta )\) be the area of the triangle in \(\mathbb {R}^2\) with vertices \(X_1,X_2,\theta .\) If the expectation \(\gamma _P(\theta ):=E\left( \varDelta (X_1,X_2,\theta )\right) \) is finite for all \(\theta \in \mathbb {R}^2\), the Oja median is a point \(\theta _\mathrm{Oja}(P)\in \mathbb {R}^2\) that minimizes the function \(\theta \mapsto \gamma _P(\theta )\). Obviously, \(\gamma _P\equiv \infty \) for \(P=\mathrm{MC}_2(0,\varSigma )\), while for the other elliptical distributions P under consideration \(\theta _\mathrm{Oja}(P)=\mu .\)

Without loss of generality we take \(\mu =0\). The dispersion matrix chosen is \(\varSigma =\begin{pmatrix} 1 &{} 0 \\ 0 &{} \lambda \end{pmatrix}\), with \(\lambda \in \{0.01,0.1,0.5,0.9\}\). Repeatedly, with \(r=10{,}000\) replications, samples of size \(n=100\) are drawn from the underlying distribution. Let \({\textsc {SMed}}_{n,i},\textsc {OMed}_{n,i},{\textsc {TMed}}_{n,i},{{\textsc {QMed}}}_{n,i}\) be the observed values of the corresponding estimators obtained in the ith repetition. Table 2 shows the numerical values of the components \(\hat{m}_1,\hat{m}_2\) of the empirical means and the numerical values of the eigenvalues \(\hat{l}_1,\hat{l}_2\) of the empirical covariance matrices of \(\sqrt{n}{\textsc {SMed}}_{n,i},\sqrt{n}\textsc {OMed}_{n,i},\sqrt{n}{\textsc {TMed}}_{n,i},\sqrt{n}{{\textsc {QMed}}}_{n,i}\), \(i=1,\ldots ,r.\) The simulations are conducted with the statistical software environment R; the estimators \({\textsc {SMed}}\), \(\textsc {OMed}\), and \({\textsc {TMed}}\) are easily computed with the functions med(...,method="Spatial"), med(...,method="CR20",...) and TukeyMedian(...) provided by the additional R packages depth and TukeyRegion. Note that, in the latter package, the Tukey median is defined to be the barycenter of the Tukey region.
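For the normal case, a single replication can be sketched as follows; this is a hedged illustration only, as the return-value fields and the routine qm_from_pairs() sketched above are our own assumptions, and the package interfaces may differ between versions:

library(MASS)          # mvrnorm for sampling N_2(mu, Sigma)
library(depth)         # med
library(TukeyRegion)   # TukeyMedian
one_rep <- function(n = 100, lambda = 0.5) {
  X <- mvrnorm(n, mu = c(0, 0), Sigma = diag(c(1, lambda)))
  list(smed = med(X, method = "Spatial")$median,
       omed = med(X, method = "CR20")$median,     # Oja median, as in the text
       tmed = TukeyMedian(X)$barycenter,          # field name assumed
       qmed = qm_from_pairs(X)$theta)             # sketch from above
}
## reps <- replicate(10000, one_rep(lambda = 0.5), simplify = FALSE)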

Roughly, for the bivariate distributions considered the Oja median seems to perform best; it is followed by the Tukey median, the spatial median and the quarter median (in this order). But note the exceptional case \(\lambda =0.01\), with smaller values of \(\hat{l}_2\) and \(\hat{l}_1\hat{l}_2\) for \({{\textsc {QMed}}}\) than for \({\textsc {SMed}}\). Note also that, as pointed out above, the quarter median contains information about the dispersion parameter and thus goes beyond providing a location estimate.

Table 2 Components \(\hat{m}_1,\hat{m}_2\) of the empirical means, and eigenvalues \(\hat{l}_1,\hat{l}_2\) of the empirical covariance matrices of the \(\sqrt{n}\)-scaled estimators \({\textsc {SMed}}\), \(\textsc {OMed}\), \({\textsc {TMed}}\), \({{\textsc {QMed}}}\) obtained by simulations with 10,000 replications for strictly elliptical distributions with dispersion matrix \(\varSigma =\begin{pmatrix} 1 &{} 0 \\ 0 &{} \lambda \end{pmatrix}\), \(\lambda \in \{0.01,0.1,0.5,0.9\}\); sample size \(n=100\)

For samples taken from bivariate normal distributions with dispersion matrix \(\varSigma =\begin{pmatrix} 1 &{} 0 \\ 0 &{} \lambda \end{pmatrix}\) the limit distributions of \({\textsc {SMed}}\), \(\textsc {OMed}\), and \({{\textsc {QMed}}}\) as \(n\rightarrow \infty \) are the centered bivariate normal distributions with the diagonal covariance matrices \(\varSigma _{{\textsc {SMed}}}=\begin{pmatrix} l_{1,{\textsc {SMed}}} &{} 0 \\ 0 &{} l_{2,{\textsc {SMed}}} \end{pmatrix}\), see Brown (1983), where also expressions for the diagonal elements \(l_{1,{\textsc {SMed}}},l_{2,{\textsc {SMed}}}\) and their numerical values for some special values of \(\lambda \) are given, \(\varSigma _{\textsc {OMed}}=\begin{pmatrix} 4/\pi &{} 0 \\ 0 &{} 4\lambda /\pi \end{pmatrix}\), see Oja and Niinimaa (1985), and \(\varSigma _{{{\textsc {QMed}}}}=\begin{pmatrix} \pi /2 &{} 0 \\ 0 &{} \pi \lambda /2 \end{pmatrix}\). The limit distribution of \({\textsc {TMed}}\) is the distribution of \(\arg \sup _{t\in \mathbb {R}^2} \inf _{u\in \mathbb {S}_{1}} \left( Z(u) - \frac{1}{2\pi \lambda ^{1/2}} u^{\text {t}}t\right) \), where \(\left( Z(u),u\in \mathbb {S}_{1}\right) \) is a centered Gaussian process with the covariance function \(\mathrm{cov}\left( Z(u),Z(v)\right) =P\left( V_{++}(0;u,v)\right) -1/4\), \(u,v\in \mathbb {S}_{1}\); see Nolan (1999). To the best of our knowledge, nothing is known about the type of this distribution. In this respect, see for example also the interesting results on the asymptotic behaviour of the empirical Tukey depth process given by Massé (2004).

Table 3 shows the values of \(\hat{l}_1,\hat{l}_2\) and, for comparison, that of the diagonal elements \(l_1,l_2\) of \(\varSigma _{{\textsc {SMed}}},\varSigma _{\textsc {OMed}}\), and \(\varSigma _{{{\textsc {QMed}}}}\).

Table 3 Bivariate normal case: Empirical eigenvalues \(\hat{l}_1,\hat{l}_2\) and eigenvalues \(l_1,l_2\) of \(\varSigma _{{\textsc {SMed}}},\varSigma _{\textsc {OMed}}\), and \(\varSigma _{{{\textsc {QMed}}}}\)
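The theoretical entries for \(\textsc {OMed}\) and \({{\textsc {QMed}}}\) in Table 3 can be reproduced directly from the diagonal limit covariance matrices given above (those for \({\textsc {SMed}}\) are taken from Brown (1983)); a minimal computation:

lambda <- c(0.01, 0.1, 0.5, 0.9)
data.frame(lambda,
           l1_OMed = 4 / pi,  l2_OMed = 4 * lambda / pi,   # Sigma_OMed
           l1_QMed = pi / 2,  l2_QMed = pi * lambda / 2)   # Sigma_QMed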

4 Proofs

4.1 Proof of Theorem 1

Using Makeev’s result, we only need to remove the smoothness assumption.

For an arbitrary probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\) let \(P_\epsilon =P * N_d(0,\epsilon I_d)\) be the convolution of P with a centered d-dimensional normal distribution with independent components that all have variance \(\epsilon >0\). Then \(P_\epsilon \) is absolutely continuous with an everywhere positive density, so that a solution \((\theta _\epsilon , U_\epsilon )\) of the equipartition problem exists. Let \(|\!|\!|\cdot |\!|\!|\) be a matrix norm on \(\mathbb {R}^{d\times d}\) that is compatible with the Euclidean norm \(\Vert \cdot \Vert \) on \(\mathbb {R}^d\). From \(\theta _\epsilon =U_\epsilon ^{\text {t}}U_\epsilon \theta _\epsilon \) it then follows that

$$\begin{aligned} \Vert \theta _\epsilon \Vert \le |\!|\!|U_\epsilon ^{\text {t}}|\!|\!|\Vert U_\epsilon \theta _\epsilon \Vert \le s \Vert U_\epsilon \theta _\epsilon \Vert , \end{aligned}$$
(37)

where \(s=\sup _{U\in \mathbb {SO}(d)}|\!|\!|U^{\text {t}}|\!|\!|<\infty .\) Let \(t=\limsup _{\epsilon \rightarrow 0} \Vert U_\epsilon \theta _\epsilon \Vert \). Then, by the definition of the limit superior and the compactness of \(\mathbb {O}(d)\), there exists a sequence \((\epsilon _n)_{n\in \mathbb {N}}\) with \(\epsilon _n\rightarrow 0\) such that \(\Vert U_{\epsilon _n}\theta _{\epsilon _n}\Vert \rightarrow t \) and \(U_{\epsilon _n}\rightarrow U \in \mathbb {O}(d)\) as \(n\rightarrow \infty \). As a consequence, \(P_{\epsilon _n}^{U_{\epsilon _n}}\rightarrow P^U\) weakly. Due to the fact that \(U_{\epsilon _n}\theta _{\epsilon _n}\) is a marginal median of \(P_{\epsilon _n}^{U_{\epsilon _n}}\) we deduce from this that the sequence \(\left( \Vert U_{\epsilon _n}\theta _{\epsilon _n}\Vert \right) _{n=1}^\infty \) is bounded; hence, by (37), the sequence \((\theta _{\epsilon _n})_{n=1}^\infty \) is also bounded. Therefore, \(\theta _{\epsilon _{n'}}\rightarrow \theta \in \mathbb {R}^d\) and \(U_{\epsilon _{n'}}\rightarrow U \in \mathbb {O}(d)\) along some subsequence \((\epsilon _{n'})\) of \((\epsilon _{n}).\) To see that \((\theta ,U)\) solves the quarter median problem for P we use that \(P_{\epsilon _{n'}}^{U_{\epsilon _{n'}}}\rightarrow P^U\) weakly and argue as follows: With \(b_{\epsilon _{n'},1}^{\text {t}},\dots ,b_{\epsilon _{n'},d}^{\text {t}}\) and \(b_1^{\text {t}},\dots ,b_d^{\text {t}}\) as the row vectors of \(U_{\epsilon _{n'}}\) and U, respectively, we have that, for each \(1\le i<j\le d\) and each \(\delta >0\),

$$\begin{aligned} \frac{1}{4}&\ \le \ \limsup _{n'\rightarrow \infty }P_{\epsilon _{n'}}\big (V_{+-}\big (\theta _{\epsilon _{n'}};b_{\epsilon _{n'},i},b_{\epsilon _{n'},j}\big )\big )\\&\ \le \ \limsup _{n'\rightarrow \infty }P_{\epsilon _{n'}}\big (b_{\epsilon _{n'},i}^{\text {t}}x \ge b_{i}^{\text {t}}\theta -\delta , \, b_{\epsilon _{n'},j}^{\text {t}}x \le b_{j}^{\text {t}}\theta + \delta \big )\\&\ \le \ P\big (b_{i}^{\text {t}}x \ge b_{i}^{\text {t}}\theta -\delta , \, b_{j}^{\text {t}}x \le b_{j}^{\text {t}}\theta + \delta \big ). \end{aligned}$$

Let \(\delta \downarrow 0\) to obtain \(P\big (V_{+-}(\theta ;b_i,b_j)\big )\ge \frac{1}{4}.\) The other inequalities stated in (5) and (6) can be verified in the same way.

4.2 Proof of Proposition 1

(a) Let \(\theta \) be a quarter median for P with \(b_1,\ldots ,b_d\) an associated orthonormal basis of \(\mathbb {R}^d\) such that the constraints (5) and (6) are satisfied. Further, let \(S(x)=Ax+c\) with \(A\in \mathbb {O}(d)\), \(c\in \mathbb {R}^d\), be a Euclidean motion. Then \(b_1',\ldots ,b_d'\) with \(b_i':=Ab_i\), \(1\le i\le d\), is again an orthonormal basis for \(\mathbb {R}^d\), and it is easily checked that

$$\begin{aligned}&S^{-1}\bigl (H_{\pm }\big (A\theta + c;b_i'\big )\bigr ) \, = \, H_{\pm }(\theta ;b_i),\quad 1\le i \le d,\\&S^{-1}\bigl (V_{\pm \pm }\big (A\theta + c;b_i',b_j'\big )\bigr ) \, = \, V_{\pm \pm }(\theta ;b_i,b_j),\quad 1\le i<j\le d. \end{aligned}$$

In view of the definition of push-forwards this implies that \(S(\theta )\) is a quarter median for \(P^S\), with \(b_1',\ldots ,b_d'\) an associated equipartition basis.

(b) If \(V=AU\) with \(A\in G\) then

$$\begin{aligned} \varPsi (V,P)\, =\, (AU)^{\text {t}}\, \hbox {MMed}_m\bigl (\big (P^U\big )^A\bigr )\, =\, U^{\text {t}}A^{\text {t}}A \, \hbox {MMed}_m\big (P^U\big )\, =\, \varPsi (U,P), \end{aligned}$$

where we have used (9) and \(G\subset \mathbb {O}(d)\).

(c) Let \(b_1^{\text {t}},\ldots ,b_d^{\text {t}}\) be the row vectors of U. With

$$\begin{aligned} \theta _U=(\theta _{U,1},\ldots ,\theta _{U,d})^{\text {t}}\,:= \, U \theta = \big (b_1^{\text {t}}\theta ,\ldots ,b_d^t\theta \big )^{\text {t}}\end{aligned}$$

it holds that \(\theta =U^{\text {t}}\theta _U=\sum _{i=1}^d\theta _{U,i}b_i\) and that, by definition, \(\theta _U\) is a marginal median of \(P^U.\) We can now choose an element \(\tilde{\theta }_U=(\tilde{\theta }_{U,1},\ldots ,\tilde{\theta }_{U,d})^{\text {t}}\in \mathcal {M}\left( P^U\right) \) such that

$$\begin{aligned} P\big (b_i^{\text {t}}x \ge \theta _{U,i}\big )=P\big (b_i^{\text {t}}x \ge \tilde{\theta }_{U,i}\big ),\qquad P\big (b_i^{\text {t}}x \le \theta _{U,i}\big )=P\big (b_i^{\text {t}}x \le \tilde{\theta }_{U,i}\big ) \end{aligned}$$

for \(1\le i\le d\), and

$$\begin{aligned} P\big (b_i^{\text {t}}x \ge \theta _{U,i},\, b_j^{\text {t}}x \ge \theta _{U,j}\big )&=P\big (b_i^{\text {t}}x \ge \tilde{\theta }_{U,i},\, b_j^{\text {t}}x \ge \tilde{\theta }_{U,j}\big )\\ P\big (b_i^{\text {t}}x \ge \theta _{U,i},\, b_j^{\text {t}}x \le \theta _{U,j}\big )&=P\big (b_i^{\text {t}}x \ge \tilde{\theta }_{U,i},\, b_j^{\text {t}}x \le \tilde{\theta }_{U,j}\big )\\ P\big (b_i^{\text {t}}x \le \theta _{U,i},\, b_j^{\text {t}}x \ge \theta _{U,j}\big )&=P\big (b_i^{\text {t}}x \le \tilde{\theta }_{U,i},\, b_j^{\text {t}}x \ge \tilde{\theta }_{U,j}\big )\\ P\big (b_i^{\text {t}}x \le \theta _{U,i},\, b_j^{\text {t}}x \le \theta _{U,j}\big )&=P\big (b_i^{\text {t}}x \le \tilde{\theta }_{U,i},\, b_j^{\text {t}}x \le \tilde{\theta }_{U,j}\big ) \end{aligned}$$

for \(1\le i<j\le d\). Then putting \(\tilde{\theta }=U^{\text {t}}\tilde{\theta }_U=\sum _{i=1}^d \tilde{\theta }_{U,i} b_i\) we have

$$\begin{aligned} P\big (b_i^{\text {t}}x \ge b_i^{\text {t}}\theta \big )=P\big (b_i^{\text {t}}x \ge b_i^{\text {t}}\tilde{\theta }\big ),\qquad P\big (b_i^{\text {t}}x \le b_i^{\text {t}}\theta \big )=P\big (b_i^{\text {t}}x \le b_i^{\text {t}}\tilde{\theta }\big ) \end{aligned}$$

for \(1\le i\le d\), and

$$\begin{aligned} P\bigl (b_i^{\text {t}}x \ge b_i^{\text {t}}\theta ,\, b_j^{\text {t}}x \ge b_j^{\text {t}}\theta \bigr )&=P\bigl (b_i^{\text {t}}x \ge b_i^{\text {t}}\tilde{\theta },\, b_j^{\text {t}}x \ge b_j^{\text {t}}\tilde{\theta }\bigr )\\ P\bigl (b_i^{\text {t}}x \ge b_i^{\text {t}}\theta ,\, b_j^{\text {t}}x \le b_j^{\text {t}}\theta \bigr )&=P\bigl (b_i^{\text {t}}x \ge b_i^{\text {t}}\tilde{\theta },\, b_j^{\text {t}}x \le b_j^{\text {t}}\tilde{\theta }\bigr )\\ P\bigl (b_i^{\text {t}}x \le b_i^{\text {t}}\theta ,\, b_j^{\text {t}}x \ge b_j^{\text {t}}\theta \bigr )&=P\bigl (b_i^{\text {t}}x \le b_i^{\text {t}}\tilde{\theta },\, b_j^{\text {t}}x \ge b_j^{\text {t}}\tilde{\theta }\bigr )\\ P\bigl (b_i^{\text {t}}x \le b_i^{\text {t}}\theta ,\, b_j^{\text {t}}x \le b_j^{\text {t}}\theta \bigr )&=P\bigl (b_i^{\text {t}}x \le b_i^{\text {t}}\tilde{\theta },\, b_j^{\text {t}}x \le b_j^{\text {t}}\tilde{\theta }\bigr ) \end{aligned}$$

for \(1\le i<j\le d\). Thus, \((\tilde{\theta },U)\) solves the quarter median problem for P.

(d) The first statement may be regarded as an equivariance property of solution pairs for the quarter median problem; its proof proceeds as in (a). For the second statement we note that \(A\in \mathbb {O}(d)\) and then calculate, using the shift equivariance of the marginal median,

$$\begin{aligned} \varPsi (U\! A^{\text {t}},P^S)\&=\ (U\! A^{\text {t}})^{\text {t}}\, \hbox {MMed}_m\bigl (\big (P^S\big )^{U\! A^{\text {t}}}\bigr ) \\&=\ A U^{\text {t}}\big (\hbox {MMed}_m\big (P^{UA^{\text {t}}A}\big ) + U\!A^{\text {t}}b\big )\\&=\ A\varPsi (U,P) + b\quad =\ S\big (\varPsi (U,P)\big ). \end{aligned}$$

(e) Let \(U\in \mathbb {O}(d)\) and \((U_n)_{n=1}^\infty \) be a sequence of elements in \(\mathbb {O}(d)\) converging to U. Due to \(P^{U_n}\rightarrow P^U\) weakly and the fact that \(P^{b}\) has a unique median for all \(b\in \mathbb {S}_{d-1},\) it follows that \(\hbox {MMed}_m(P^{U_n})\rightarrow \, \hbox {MMed}_m(P^U).\) Then, from

$$\begin{aligned} \Vert U_n^{\text {t}}\,&\hbox {MMed}_m\big (P^{U_n}\big ) - U^{\text {t}}\,\hbox {MMed}_m\big (P^U\big )\Vert \\&= ~\Vert U_n^{\text {t}}\,\hbox {MMed}_m\big (P^{U_n}\big ) - U_n^{\text {t}}\,\hbox {MMed}_m\big (P^U\big ) + U_n^{\text {t}}\,\hbox {MMed}_m\big (P^U\big ) - U^{\text {t}}\,\hbox {MMed}_m\big (P^U\big )\Vert \\&\le ~|\!|\!|U_n^{\text {t}}|\!|\!|\,\Vert \hbox {MMed}_m\big (P^{U_n}\big ) - \,\hbox {MMed}_m\big (P^U\big )\Vert + |\!|\!|U_n^{\text {t}}- U^{\text {t}}|\!|\!|\,\Vert \hbox {MMed}_m\big (P^U\big )\Vert \end{aligned}$$

and the compactness of \(\mathbb {O}(d)\) we deduce that \( U_n^{\text {t}}\,\hbox {MMed}_m(P^{U_n})\rightarrow U^{\text {t}}\,\hbox {MMed}_m(P^U).\)

4.3 Proof of Theorem 2

We will make use of the fact that if \(m_n,n\in \mathbb {N},\) are medians of univariate distributions that converge weakly as \(n\rightarrow \infty \) to a univariate distribution with unique median m, then \(\lim _{n\rightarrow \infty } m_n=m\). Further, we only prove the second part of the theorem as the arguments for (a) are similar.

Let \((\varOmega ,\mathcal {A},\mathbb {P})\) be a probability space on which the random vectors \(X_1,X_2,\ldots \) are defined, so that \(\mathbb {P}^{X_j}=P\) for all \(j\in \mathbb {N}\). Then there is a \(\mathbb {P}\)-null set \(N\in \mathcal {A}\) such that for each \(\omega \in N^c\)

$$\begin{aligned} P_{n;\omega }:=\frac{1}{n}\sum _{j=1}^n\delta _{X_j(\omega )}\rightarrow P\quad \text {weakly}. \end{aligned}$$

Fix \(\omega \in N^c\); until further notice we omit the argument \(\omega \) below. By definition, the quarter median \(\theta _n\) of \(P_n\) can be written as \(\theta _n=U_n^{\text {t}}\eta _n\) with some \(U_n\in \mathbb {O}(d)\) and some \(\eta _n\in [\hbox {MMed}_-(P_n^{U_n}),\,\hbox {MMed}_+(P_n^{U_n})]\) (both may depend on \(\omega \in N^c\)). Let \((U_{n'})\) be a subsequence of \((U_n)\). As \(\mathbb {O}(d)\) is compact, a subsubsequence \((U_{n''})\) converges to some \(V\in \mathbb {O}(d)\), and we have weak convergence \(P_{n''}^{U_{n''}}\rightarrow P^V\) by the extended continuous mapping theorem; see Billingsley (1968, Theorem 5.5). As the distribution \(P^b\) has a unique median for all \(b\in \mathbb {S}_{d-1}\), the marginal median \(\eta _{n''}\) of \(P_{n''}^{U_{n''}}\) converges to the marginal median \(\eta \) of \(P^V\). Now fix \(i,j\) with \(1\le i < j\le d\). Let \(a_{n''},b_{n''}\) and \(a,b\) be the corresponding rows of \(U_{n''}\) and V, respectively. Using the same arguments as in the proof of Theorem 1 we then obtain for each \(\epsilon >0\), and with m instead of \(n''\),

$$\begin{aligned} \frac{1}{4}\&\le \ \limsup _{m} P_{m}\bigl ( a_{m}^{\text {t}}x \le a_m^{\text {t}}\theta _m, \ b_m^{\text {t}}x \ge b_m^{\text {t}}\theta _m \bigr )\\&\le \ \limsup _{m} P_{m}\bigl ( a_m^{\text {t}}x \le a^{\text {t}}\eta + \epsilon , \ b_m^{\text {t}}x \ge b^{\text {t}}\eta -\epsilon \bigr )\\&\le \ P\bigl (a^{\text {t}}x \le a^{\text {t}}\eta + \epsilon , \ b^{\text {t}}x \ge b^{\text {t}}\eta -\epsilon \bigr ), \end{aligned}$$

where in the first line we have used that \((\theta _m,U_m)\) solves the quarter median problem for \(P_m\). Letting \(\epsilon \downarrow 0\) we get \(P\left( V_{-+}(\eta ; a,b) \right) \, \ge \, 1/4\). More generally, and using the same argument, we obtain \(P\bigl (V_{\pm \pm }(\eta ;a,b)\bigr )\ge 1/4\) as well as \(P\bigl (H_{\pm }(\eta ;a)\bigr )\ge 1/2\) and \(P\bigl (H_{\pm }(\eta ;b)\bigr )\ge 1/2\). Recall that some of the quantities depend on the \(\omega \) chosen above so that, for example,

$$\begin{aligned} P\bigl (H_+(\eta ;a)\bigr )\; =\; P\bigl (H_+(\eta (\omega );a(\omega ))\bigr ) \; = \; P\bigl (\{x\in \mathbb {R}^d:\, \langle a(\omega ),x-\eta (\omega )\rangle \ge 0\}\bigr ). \end{aligned}$$

In summary, we have shown that every limit point \((\eta (\omega ),V(\omega ))\) of the sequence \((\theta _n(\omega ),U_n(\omega ))_{n\in \mathbb {N}}\) is a solution of the quarter median problem for P. Hence, if the solution is unique and given by \((\theta ,H)\in \mathbb {R}^d\times \mathbb {H}(d)\), then, on a set of probability 1, \(\eta =\theta \) and \([V]=H\).

4.4 Proof of Theorem 3

Invariance with respect to shifts means that we may assume \(\theta =0\). As already noticed in Sect. 4.3, the quarter median \(\theta _n\) can be written as \(\theta _n=U_n^{\text {t}}\eta _n\) with \(\eta _n\in [\hbox {MMed}_-(P_n^{U_n}),\,\hbox {MMed}_+(P_n^{U_n})]\). Theorem 2 implies that \(([U_n])_{n\in \mathbb {N}}\) converges almost surely to H with respect to the quotient topology on \(\mathbb {H}(d)\). The mapping \(p:\mathbb {O}(d)\rightarrow \mathbb {H}(d)\), \(U\mapsto [U]\), associates to each \(H\in \mathbb {H}(d)\) a finite fiber \(p^{-1}(\{H\})\), and \(\mathbb {O}(d)\) is a covering space for \(\mathbb {H}(d)\). General results from algebraic topology, see e.g. tom Dieck (1991, Section II.6), imply that p may locally be inverted to obtain homeomorphisms to open subsets of the individual sheets of the covering space. In particular, we may assume that the matrices \(U_n\) converge in \(\mathbb {O}(d)\) to some \(U\in H\) almost surely as \(n\rightarrow \infty \).

For each \(a\in \mathbb {S}_{d-1}\) and \(n\in \mathbb {N}\) let \(M_n^\pm (a):=\hbox {Med}_\pm \bigl (P_n^a\bigr )\), and let M(a) be the uniquely determined median of \(P^a\). We define two stochastic processes \(Y_n^\pm =\bigl (Y_n^\pm (a),a\in \mathbb {S}_{d-1}\bigr )\) with index set \(\mathbb {S}_{d-1}\) by

$$\begin{aligned} Y_n^\pm (a)=\sqrt{n}\big (M_n^\pm (a)-M(a)\big )\quad \text {for all } a\in \mathbb {S}_{d-1}, \end{aligned}$$

and will use results presented by Grübel (1996) and adopt arguments used by Kuelbs and Zinn (2013) to prove that

$$\begin{aligned} Y_n^\pm \rightarrow _{\text {\tiny d}}Y\quad \text {as }n\rightarrow \infty . \end{aligned}$$
(38)

Here \(Y=(Y_a,a\in \mathbb {S}_{d-1})\) is a centered Gaussian process with continuous paths and covariance function

$$\begin{aligned} \mathrm{cov}(Y_a,Y_b) \, = \, \frac{K(a,b)}{f_a(M(a))f_b(M(b))}, \end{aligned}$$
(39)

with

$$\begin{aligned} K(a,b)\, = \, P\bigl (\{x\in \mathbb {R}^d:\, a^{\text {t}}x\le M(a), \, b^{\text {t}}x\le M(b)\}\bigr ) \,-\, \frac{1}{4}\,. \end{aligned}$$
(40)

We regard the \(Y_n^\pm \) as processes with paths in \(\ell _\infty (\mathbb {S}_{d-1})\), the space of real-valued bounded functions on \(\mathbb {S}_{d-1}\) endowed with the supremum norm, and the symbol ‘\(\rightarrow _{\text {\tiny d}}\)’ refers to weak or distributional convergence in the sense of Hoffmann–Jørgensen; see e.g. van der Vaart and Wellner (1996) and Dudley (1999) for details.

For the proof of (38) we start with a functional central limit theorem for the empirical processes and then use an almost sure representation in order to be able to work with individual paths in an \(\epsilon \)\(\delta \) style.

Let \(F(a,\cdot ):=F_a(\cdot )\) be the distribution function of \(a^{\text {t}}X\) and let

$$\begin{aligned} F_n(a,\cdot )\,=\,F_{a,n}(\cdot )\, =\, \frac{1}{n}\sum _{j=1}^n 1\big (a^{\text {t}}X_j\le \cdot \,\big ) \end{aligned}$$

be the (random) distribution function associated with \(P_n^a\), \(a\in \mathbb {S}_{d-1}\). Here \(1(\,\cdot \,)\) denotes the indicator function of its (logical) argument. We introduce the empirical process \(Z_n=\left( \sqrt{n}\left( F_n(a,t)-F(a,t)\right) ,(a,t)\in \mathbb {S}_{d-1}\times \mathbb {R}\right) \) as a process with paths in \(\ell _\infty (\mathbb {S}_{d-1}\times \mathbb {R})\). We obtain a semimetric d on \(\mathbb {S}_{d-1}\times \mathbb {R}\) via

$$\begin{aligned} d\big ((a,s),(b,t)\big )^2 = E\Bigl (\bigl (1(a^{\text {t}}X\!\le \! s) - P(a^{\text {t}}X\!\le \! s)\bigr ) - \bigl (1(b^{\text {t}}X\!\le \! t) - P(b^{\text {t}}X\!\le \! t)\bigr )\Bigr )^2, \end{aligned}$$

\((a,s),(b,t)\in \mathbb {S}_{d-1}\times \mathbb {R}\). As noticed in Grübel (1996, Proof of Theorem 1), we then have \(Z_n\rightarrow _{\text {\tiny d}}Z_0\), where \(Z_0=(Z_0(a,t),(a,t)\in \mathbb {S}_{d-1}\times \mathbb {R})\) is a centered Gaussian process with covariance function

$$\begin{aligned} E\bigl (Z_0(a,s)Z_0(b,t)\bigr )\, =\, E\Bigl (\bigl (1(a^{\text {t}}X\!\le \! s) - P(a^{\text {t}}X\!\le \! s)\bigr ) \bigl (1(b^{\text {t}}X\!\le \! t) - P(b^{\text {t}}X\!\le \! t)\bigr )\Bigr ), \end{aligned}$$

\((a,s),(b,t)\in \mathbb {S}_{d-1}\times \mathbb {R}\), and sample paths that are bounded and continuous with respect to d. The assumptions on P ensure that the covariance function is continuous with respect to the usual topology on \(\mathbb {S}_{d-1}\times \mathbb {R}\). This implies that the process \(Z_0\) has continuous sample paths.

For the remainder of the proof we abbreviate ‘almost surely’ to ‘a.s.’, and convergence refers to \(n\rightarrow \infty \) unless specified otherwise.

By the Skorokhod–Dudley–Wichura representation theorem (Dudley 1999, Theorem 3.5.1) there exists a probability space \((\tilde{\varOmega },\tilde{\mathcal {A}},\tilde{\mathbb {P}})\) together with perfect and measurable maps \(g_n:\tilde{\varOmega }\rightarrow \varOmega \), \(n\in \mathbb {N}_0\), such that \(\tilde{\mathbb {P}}^{g_n}=\mathbb {P}\) for all \(n\in \mathbb {N}_0\), and such that, with \(\tilde{Z}_n:= Z_n\circ g_n\),

$$\begin{aligned} \sup _{(a,t)\in \mathbb {S}_{d-1}\times \mathbb {R}}\bigl |\tilde{Z}_n(a,t)-\tilde{Z}_0(a,t)\bigr | \, \rightarrow \, 0\quad \tilde{\mathbb {P}}\text {-{a.s.}}. \end{aligned}$$
(41)

As \(F_n\) is a deterministic function of \(Z_n\) it follows that \(\tilde{F}_n:= F_n\circ g_n\) has the same distribution under \(\tilde{\mathbb {P}}\) as \(F_n\) under \(\mathbb {P}\), \(n\in \mathbb {N}\). This extends to the associated one-dimensional projections and their medians, and to the \(Y_n\)-processes, where \(\tilde{M}_n^\pm (a):=\hbox {Med}_\pm \bigl (\tilde{F}_n(a,\cdot )\bigr )\) and \(\tilde{Y}_n^\pm :=Y_n^\pm \circ g_n\), \(n\in \mathbb {N}\).

Arguing as in Grübel (1996, Proof of Theorem 1) we obtain that

$$\begin{aligned} \tilde{Y}_n^\pm (a)\; \rightarrow \; \tilde{Y}(a) := -f_a\bigl (M(a)\bigr )^{-1}\tilde{Z}_0\bigl (a,M(a)\bigr )\quad \tilde{\mathbb {P}}\text {-a.s.} \end{aligned}$$

for all \(a\in \mathbb {S}_{d-1}\), and that

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}\bigl |\tilde{Y}_n^\pm (a)\bigr |\ = \ O(1) \quad \tilde{\mathbb {P}}\text {-{a.s.}}. \end{aligned}$$
(42)

For the integral of the marginal medians needed by Grübel (1996) this was enough, but here we need the stronger statement

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}\bigl |\tilde{Y}_n^\pm (a) - \tilde{Y}(a)\bigr |\rightarrow 0\quad \tilde{\mathbb {P}}\text {-{a.s.}}. \end{aligned}$$
(43)

For the proof of (43) we adopt some arguments given in Kuelbs and Zinn (2013, Proof of Proposition 1). First we note that \(F\bigl (a,M(a)\bigr )=1/2\) and then write

$$\begin{aligned} F\bigl (a,M(a)+h\bigr ) - 1/2 = f_a\bigl (M(a)\bigr )h + r(a,h)h, \end{aligned}$$
(44)

where \(r(a,h)=f_a\bigl (M(a)+\zeta _{a,h}h\bigr )-f_a\bigl (M(a)\bigr )\) with some \(\zeta _{a,h}\in [0,1]\). Using compactness of \(\mathbb {S}_{d-1}\) we see that the assumption (14) implies

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}|r(a,h)|\le \sup _{a\in \mathbb {S}_{d-1}}\sup _{|s|\le |h|}\left| f_a\left( M(a)+s\right) -f_a\left( M(a)\right) \right| \rightarrow 0\quad \text {as }h\rightarrow 0. \end{aligned}$$

Further, (42) yields

$$\begin{aligned} T_n\, := \, \sup _{a\in \mathbb {S}_{d-1}}\bigl |\tilde{M}_n^\pm (a)-M(a)\bigr |\, \rightarrow \, 0\quad \tilde{\mathbb {P}}\text {-{a.s.}} \end{aligned}$$

in view of the definition of the Y-processes. Taken together this gives, \(\tilde{\mathbb {P}}\)-a.s.,

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}\bigl |r\big (a,\tilde{M}_n^\pm (a)-M(a)\big )\bigr |\, \le \sup _{a\in \mathbb {S}_{d-1}}\;\sup _{|s|\le T_n}\bigl |f_a\bigl (M(a)+s\bigr )-f_a\bigl (M(a)\bigr ) \bigr |\, \rightarrow \, 0, \end{aligned}$$

and using (42) again we get

$$\begin{aligned} \tilde{B}_n^\pm :=\sup _{a\in \mathbb {S}_{d-1}}|\tilde{Y}_n^\pm (a)||r(a,\tilde{M}_n^\pm (a)-M(a))|\rightarrow 0\quad \tilde{\mathbb {P}}\text {-{a.s.}}. \end{aligned}$$

From (41) and the continuity of \(a\mapsto \tilde{Z}_0\left( a,M(a)\right) \) it follows that, again \(\tilde{\mathbb {P}}\)-a.s.,

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}\left| \sqrt{n}\left( \tilde{F}_{n}\big (a,\tilde{M}_n^\pm (a)\big ) - F\big (a,\tilde{M}_n^\pm (a)\big )\right) - \tilde{Z}_0\big (a,M(a)\big )\right| \; \rightarrow \; 0. \end{aligned}$$
(45)

Absolute continuity of P implies that samples from P are in general position with probability 1, meaning that at most d points are on a hyperplane. Hence at most d of the variables \(a^{\text {t}}X_1,\ldots ,a^{\text {t}}X_n\) coincide, and thus, given that the range of \(\tilde{F}_n=F_n\circ g_n\) is contained in the range of \(F_n\),

$$\begin{aligned} \bigl | \tilde{F}_n\bigl (a,\tilde{M}_n^\pm (a)\bigr )\, - \, 1/2\,\bigr | \ \le \ d/n \end{aligned}$$

with probability 1. Using this and (45) we get

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}\left| \sqrt{n}\Big (F\big (a,\tilde{M}_n^\pm (a)\big ) - 1/2 \Big ) + \tilde{Z}_0\big (a,M(a)\big )\right| \; \rightarrow \; 0 \quad \tilde{\mathbb {P}}\text {-a.s.}. \end{aligned}$$

From

$$\begin{aligned} \sqrt{n}\Bigl (F\big (a,\tilde{M}_n^\pm (a)\big ) - 1/2 \Bigr ) \,=\, \tilde{Y}_n^\pm (a)\left( f_a\big (M(a)\big )+r\big (a,\tilde{M}_n^\pm (a)-M(a)\big )\right) , \end{aligned}$$

which holds by (44), we obtain that, \(\tilde{\mathbb {P}}\)-a.s.,

$$\begin{aligned} \tilde{A}_n^\pm :=\sup _{a\in \mathbb {S}_{d-1}} \left| \tilde{Y}_n^\pm (a)\left( f_a\big (M(a)\big ) + r\big (a,\tilde{M}_n^\pm (a)-M(a)\big )\right) + \tilde{Z}_0\big (a,M(a)\big )\right| \rightarrow 0. \end{aligned}$$

Thus,

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}\left| \tilde{Y}_n^\pm (a)f_a\big (M(a)\big )+\tilde{Z}_0\big (a,M(a)\big )\right| \le \tilde{A}_n^\pm +\tilde{B}_n^\pm \rightarrow 0\quad \tilde{\mathbb {P}}\text {-{a.s.},} \end{aligned}$$

which finishes the proof of (43) as \(a\mapsto f_a(M(a))\) is bounded away from 0 on \(\mathbb {S}_{d-1}\).

Returning to the un-tilded variables we see that this establishes the functional limit theorems in (38).

We note in passing that \(\sup _{a\in \mathbb {S}_{d-1}}\bigl |\tilde{Y}_n^+(a)-\tilde{Y}_n^-(a)\bigr |\,\rightarrow \, 0\) \(\tilde{\mathbb {P}}\)-a.s., which implies

$$\begin{aligned} \sup _{a\in \mathbb {S}_{d-1}}\left| \sqrt{n}\left( M_n^+(a) - M_n^-(a)\right) \right| \rightarrow 0\quad \text {in } \mathbb {P}\text {-probability}. \end{aligned}$$
(46)

This shows that the diameter of the interval \([\hbox {MMed}_-(P_n^{U_n}),\,\hbox {MMed}_+(P_n^{U_n})]\) shrinks fast enough for the choice of a vector from this interval to be irrelevant for the distributional asymptotics.

Now let \(b_n(i)^{\text {t}}\) and \(b(i)^{\text {t}}\) be the ith row of \(U_n\), \(n\in \mathbb {N}\), and U respectively, for \(i=1,\dots ,d\). In view of \(U_n\rightarrow U\) in \(\mathbb {O}(d)\) we have \(b_n(i)\rightarrow b(i)\) \(\mathbb {P}\)-a.s. for \(i=1,\ldots ,d\). Let \(W_n:=\sqrt{n}\eta _n = U_n\sqrt{n}(\theta _n-\theta )\) and

$$\begin{aligned} W_n^\pm := \sqrt{n} \begin{pmatrix} M_n^\pm (b_n(1))\\ \vdots \\ M_n^\pm (b_n(d))\end{pmatrix}, \quad W:=\begin{pmatrix} Y(b(1))\\ \vdots \\ Y(b(d)) \end{pmatrix}. \end{aligned}$$

Note that \(W_n\) is an element of the d-dimensional interval \([W_n^-,W_n^+]\) with probability 1. As the paths of the limit process are continuous the functional limit theorems (38) now imply \(W_n^\pm \rightarrow _{\text {\tiny d}}W\), hence

$$\begin{aligned} W_n \; \rightarrow _{\text {\tiny d}}\; W, \end{aligned}$$
(47)

with \(W\sim N_d(0,\varXi )\), where the entries of \(\varXi \) can be calculated from (39) and (40). Indeed, as \(M\equiv 0\) and as (0, U) solves the quarter median problem for P, we obtain \(K(b_i,b_i)=1/4\) and \(K(b_i,b_j)=0\) if \(i\not = j\), so that \(\varXi =\varDelta (P,U)\) with \(\varDelta (P,U)\) the diagonal matrix given in (17).

We now apply \(U_n^{\text {t}}\) on the left hand side, \(U^{\text {t}}\) on the right hand side of (47) and use (Billingsley 1968, Theorem 5.5) again to obtain the claimed asymptotic normality of \(\sqrt{n}(\theta _n-\theta )\), together with the mean and covariance matrix of the limit distribution.

It remains to show that \(\varSigma (P,U)\) does not depend on the choice of U from the set H. Indeed, if \(V=QU\) with Q a signed permutation matrix, i.e. a product of row transpositions and sign changes of rows, then \(\varDelta (P,V)=Q\varDelta (P,U)Q^{\text {t}}\), and hence \(V^{\text {t}}\varDelta (P,V)V=U^{\text {t}}\varDelta (P,U)U\).

For the proof the assumption (16) is important. Without such an assumption we would need the decomposition of \(\sqrt{n}\bigl (M_n^\pm (b_n(i)) - M(b(i))\bigr )\) into the two terms \(\sqrt{n}\bigl (M_n^\pm (b_n(i)) - M(b_n(i))\bigr )\) and \(\sqrt{n}\bigl (M(b_n(i)) - M(b(i))\bigr )\). The first could still be analyzed with the above arguments, but for the second term the local behaviour of the function \(a\mapsto M(a)\) and distributional asymptotics of the estimates \(U_n\) would become important; see also Theorem 6.

Further, (46) and the ensuing comment show that the original estimate \(\theta _n\) need not be permutation invariant. This may seem surprising as the proof relies on empirical process theory, which deals with the data through their empirical distribution. Here it turned out to be enough that the estimate can be squeezed between two other estimates that are such functions of the empirical distribution.

4.5 Proof of Theorem 4

The transformation law for d-dimensional elliptical distributions, together with the equivariance from Proposition 1 (a), means that we may assume that \(\mu =0\) and that \(\varSigma =\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Recall that \(e_1,\ldots ,e_d\) is the canonical basis of \(\mathbb {R}^d\). For a subset I of \(\{1,\ldots ,d\}\) let

$$\begin{aligned} A_I := \bigl \{x\in \mathbb {R}^d:\, x^{\text {t}}e_i\ge 0\text { for }i\in I, x^{\text {t}}e_i\le 0\text { for }i\notin I\bigr \}. \end{aligned}$$

The assumptions imply that \(P^T=P\) for all reflections \(T:\mathbb {R}^d\rightarrow \mathbb {R}^d\) about hyperplanes \(H(e_i)\), hence \(P(A_I)\) does not depend on I; also, \(\sum _{I\subset \{1,\ldots ,d\}}P(A_I)=1\). A quarter space such as \(\{x\in \mathbb {R}^d:\, x^{\text {t}}e_i\ge 0, \, x^{\text {t}}e_j\ge 0\}\) with \(i\not =j\) can be written as the union of \(2^{d-2}\) such sets \(A_I\), where the intersections are all P-null sets, hence all of these have probability \(2^{d-2}2^{-d}=1/4\). This shows that \((0,I_d)\) solves the quarter median problem for P.

In order to see that the quarter median is unique it is enough to show that \(\mathcal {M}(P^U)=\{0\}\) for all \(U\in \mathbb {O}(d)\) under the above assumptions on P. This follows from the fact that \(\mathrm{Ell}_d(0,\varSigma ;h)\) is invariant under the reflection \(x\mapsto -x\), \(x\in \mathbb {R}^d\).

It remains to show that the solution of the quarter median problem is unique in the strictly elliptical case. To prove this, we recall that \(\mu =0\) and that \(\varSigma =\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Assume that (0, [U]) with \(U\in \mathbb {O}(d)\) is a solution of the quarter median problem. Let \(b_1^{\text {t}},\dots ,b_d^{\text {t}}\) be the row vectors of U. Then \(P^U=\mathrm{Ell}_d(0,\tilde{\varSigma };h)\), where \(\tilde{\varSigma }=\left( \tilde{\sigma }_{ij}\right) _{1\le i,j\le d}=U\varSigma U^{\text {t}}\). With \(U_{ij}\) as the \(2\times d\) matrix with the rows \(b_i^{\text {t}},b_j^{\text {t}},\,1\le i<j\le d,\) we see that \(P_{ij}:=P^{U_{ij}}\) is the elliptical distribution \(\mathrm{Ell}_2(0,\tilde{\varSigma }_{ij};h_2)\) with dispersion matrix

$$\begin{aligned} \tilde{\varSigma }_{ij}=U_{ij}\,\varSigma \,U_{ij}^{\text {t}}= \begin{pmatrix}\tilde{\sigma }_{ii} &{}\quad \tilde{\sigma }_{ij}\\ \tilde{\sigma }_{ji} &{}\quad \tilde{\sigma }_{jj}\end{pmatrix} \end{aligned}$$

and density generator \(h_2\) which is equal to h if \(d=2\) and given by

$$\begin{aligned} h_2(u)=\frac{\pi ^{d/2-1}}{\varGamma (d/2-1)}\int _u^\infty (t-u)^{d/2-2}\,h(t)\,dt\quad \text {for }u>0, \end{aligned}$$

if \(d\ge 3\); see, e.g. Fang et al. (1990, Section 2.2.3). According to Grübel (1996, p. 1466) the probability assigned by \(P_{ij}\) to the left lower quadrant is given by

$$\begin{aligned} \frac{1}{4}+\frac{1}{2\pi }\arcsin \left( \tilde{\sigma }_{ij}/(\tilde{\sigma }_{ii}\tilde{\sigma }_{jj})^{1/2}\right) , \end{aligned}$$

which is equal to \(\frac{1}{4}\) if and only if \(\tilde{\sigma }_{ij}=0\), i.e. if and only if \(\tilde{\varSigma }_{ij}\) is a diagonal matrix with positive diagonal elements. Therefore, \(\tilde{\varSigma }=U\varSigma U^{\text {t}}\) is a diagonal matrix, \(\tilde{\varSigma }=\mathrm{diag}(\tilde{\lambda }_1,\dots ,\tilde{\lambda }_d)\) say, or equivalently, \(\varSigma =U^{\text {t}}\tilde{\varSigma }U=U^{\text {t}}\mathrm{diag}(\tilde{\lambda }_1,\dots ,\tilde{\lambda }_d)U\). It follows from this that the set of eigenvalues \(\bigl \{\tilde{\lambda }_1,\ldots ,\tilde{\lambda }_d\bigr \}\) of \(\tilde{\varSigma }\) and the set of eigenvalues \(\bigl \{\lambda _1,\ldots ,\lambda _d \bigr \}\) of \(\varSigma \) coincide and, as a consequence, that there is a permutation matrix \(\varPi \) such that \(\varPi ^{\text {t}}\varSigma \varPi =\tilde{\varSigma }\), which in turn implies \(U^{\text {t}}\varPi ^{\text {t}}\varSigma \varPi U = \varSigma .\) Thus, putting \(U^{\text {t}}\varPi ^{\text {t}}=:Q=(q_{ij})\) we have \(Q\varSigma =\varSigma Q,\) i.e. \(q_{ij}\lambda _j=\lambda _iq_{ij}.\) As the \(\lambda _j\) are pairwise distinct this gives \(q_{ij}=0\) for \(i\not =j\). Thus Q is a diagonal matrix with entries \(\pm 1\) on the diagonal, so that \(U\sim I_d.\) Taken together this shows that \((0,[I_d])\) is the unique solution of the quarter median problem.
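As a quick plausibility check of the quadrant formula from Grübel (1996) used above, one may simulate the Gaussian case with \(\tilde{\sigma }_{ii}=\tilde{\sigma }_{jj}=1\) and \(\tilde{\sigma }_{ij}=\rho \); a minimal sketch:

## Monte Carlo check: P(lower left quadrant) = 1/4 + arcsin(rho)/(2*pi)
set.seed(1)
rho <- 0.6; n <- 1e6
x <- rnorm(n); y <- rho * x + sqrt(1 - rho^2) * rnorm(n)
c(empirical = mean(x <= 0 & y <= 0),
  formula   = 1/4 + asin(rho) / (2 * pi))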

4.6 Proof of Theorem 5

(a) We first assume that \(\mu =0\) and that \(\varSigma =\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Using Theorem 3 we only need to evaluate the diagonal elements of \(\varDelta (P,U)\), and because of Theorem 4 we may take U to be the identity matrix. In particular, \(b_i=e_i\) for \(i=1,\ldots ,d\). Then \(f_{e_i}\) is the ith marginal density of P, and the calculations preceding Theorem 4 lead to

$$\begin{aligned} f_{e_i}\bigl (M(e_i)\bigr )\, =\, \frac{c_{d-1}(h)}{\sqrt{\lambda _i}c_d(h)}, \quad i=1,\ldots ,d. \end{aligned}$$

Putting this together we see that, for such P, the asymptotic covariance matrix is indeed the specified multiple of \(\varSigma \).

We now use the equivariance properties of the quarter median to ‘bootstrap’ this to general strictly elliptical distributions. Hence suppose that \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) and that \(\lambda _1> \lambda _2> \cdots > \lambda _d\) are the eigenvalues of \(\varSigma \). Then, for some \(U\in \mathbb {O}(d)\), \(\varSigma =U^{\text {t}}\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)U\). If \(X_i\), \(i\in \mathbb {N}\), are independent with distribution P then the random variables \(Y_i:= U(X_i-\mu )\), \(i\in \mathbb {N}\), are independent with distribution \(\mathrm{Ell}_d(0,\mathrm{diag}(\lambda _1,\ldots ,\lambda _d);h)\). As \(U\bigl (\theta _n - \mu \bigr )\) is a quarter median of \(Y_1,\dots ,Y_n\) the first part of the proof leads to

$$\begin{aligned} \sqrt{n}\,U\bigl (\theta _n - \mu \bigr ) \rightarrow _{\text {\tiny d}}N_d(0,\varXi ), \end{aligned}$$

with \(\varXi =\sigma _{{{\textsc {QMed}}}}^2\,\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Using the well-known properties of weak convergence and the behavior of multivariate normal distributions under affine transformations we finally obtain (24).

(b) Suppose that the eigenvalues of \(\varSigma \) coincide and let \((U_n)_{n\in \mathbb {N}}\) be as in the statement of the second part of the theorem. Then, for any subsequence \((U_{n'})\) there exists a subsubsequence \((U_{n''})\) such that \(U_{n''}\rightarrow U_0\) for some \(U_0\in \mathbb {O}(d)\) as \(n''\rightarrow \infty \). With m for \(n''\) and \(b_m(i)^{\text {t}}\), \(b(i)^{\text {t}}\) the ith row of \(U_m\) respectively \(U_0\), and using the same arguments as for (47), we obtain

$$\begin{aligned} \sqrt{m} \begin{pmatrix} M_{m}(b_m(1))\\ \vdots \\ M_{m}(b_m(d))\end{pmatrix} \; \rightarrow _{\text {\tiny d}}\; \begin{pmatrix} Y(b(1))\\ \vdots \\ Y(b(d)) \end{pmatrix} \quad \text {as }m\rightarrow \infty . \end{aligned}$$

The symmetry of P implies that the limit distribution does not depend on \(U_0\).

4.7 Proof of Theorem 6

As the result is somewhat tangential to the topic of multivariate location estimation we only provide the main ideas together with some arguments specific to the quarter median application.

The general approach consists in writing the parameter \(\theta =\theta (P)\) as the solution of an equation \(\varPsi _P(\theta )=z\) with a suitably chosen function \(\varPsi _P\) and known z. Let \(\varPsi _n\) be the empirical version of \(\varPsi _P\) and suppose that \(\varPsi _n(\theta _n)\) is sufficiently close to z for large \(n\in \mathbb {N}\). Then, under certain conditions, properties of the estimator sequence \((\theta _n)_{n\in \mathbb {N}}\) can be obtained by localizing \(\varPsi _P\) near \(\theta \) and then using the delta method. An excellent general exposition of this approach can be found in the textbook of van der Vaart (1998), see also van der Vaart (2002). Specifically, van der Vaart (1998, Theorem 5.9) together with van der Vaart (2002, Theorem 6.17) may serve as a guide towards filling in the details. Ultimately, this would also lead to an alternative proof for Theorem 5 in the case \(d=2\) that does not make use of the results presented by Grübel (1996).

For \(a,b>0\) let

$$\begin{aligned} E(a,b)\, :=\, \Bigl \{(x,y)^{\text {t}}\in \mathbb {R}^2:\, \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1\Bigr \} \, =\, \bigl \{(a\cos (t),b\sin (t))^{\text {t}}:\, 0\le t<2\pi \bigr \} \end{aligned}$$

be the boundary of the two-dimensional centered ellipse with main half-axes parallel to the unit vectors \(e_1\) and \(e_2\) and of length a and b respectively. On this set we consider the push-forward \(Q(a,b):=\mathrm{unif}(\mathbb {S}_{1})^{T_{a,b}}\) of the uniform distribution on \(\mathbb {S}_{1}=E(1,1)\) under the mapping \(T_{a,b}:\mathbb {R}^2\rightarrow \mathbb {R}^2\), \((x,y)^{\text {t}}\mapsto (a x,b y)^{\text {t}}\). If \(a\not = b\) then the solution of the quarter median problem for Q(ab) is unique and given by \(\mu =(0,0)^{\text {t}}\) and \(U=\mathrm{diag}(1,1)=U_\alpha \) with \(\alpha =0\).

Let P be a probability measure on the Borel subsets of \(\mathbb {R}^2\). We introduce the function \(\varPsi _P:\mathbb {R}\times \mathbb {R}\times [-\pi /4,\pi /4)\rightarrow \mathbb {R}^3\), where the components \(\varPsi _{P,i}\) are the respective probabilities of the upper right, upper left and lower left quarter spaces associated with the argument. Formally, and with \(b_\alpha ,b_\alpha '\) the columns of \(U_\alpha ^{\text {t}}\),

$$\begin{aligned} \varPsi _{P,1}(x,y,\alpha ) = P\bigl (V_{++}((x,y)^{\text {t}};b_\alpha ,b_\alpha ')\bigr ), \end{aligned}$$

and \(\,\varPsi _{P,2}(x,y,\alpha ) = P\bigl (V_{-+}(\cdots )\bigr )\), \(\,\varPsi _{P,3}(x,y,\alpha ) = P\bigl (V_{--}(\cdots )\bigr )\).

Suppose that \(P=Q(a,b)\) with \(a\not =b\). Then elementary geometric considerations lead to the following matrix of partial derivatives of \(\,\varPsi _{P}\) at \((x,y,\alpha )=(0,0,0)\),

$$\begin{aligned} M(a,b)\, :=\, \frac{1}{2\pi ab} \begin{pmatrix} -b &{}\quad b &{}\quad b \\ -a &{}\quad -a &{}\quad a \\ b-a &{}\quad a-b &{}\quad b-a \end{pmatrix}. \end{aligned}$$
(48)

For later use we note that

$$\begin{aligned} M(a,b)^{-1} = \pi \begin{pmatrix} 0 &{}\quad -b &{}\quad ab/(b-a) \\ a &{}\quad -b &{}\quad 0 \\ a &{}\quad 0 &{}\quad ab/(b-a)\end{pmatrix}. \end{aligned}$$
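Both (48) and the displayed inverse are easily confirmed numerically; a small check, for one arbitrary pair \(a\not =b\):

## Check that the stated matrix is indeed M(a,b)^{-1}.
a <- 2; b <- 3
M <- (1 / (2 * pi * a * b)) * rbind(c(-b, b, b),
                                    c(-a, -a, a),
                                    c(b - a, a - b, b - a))
Minv <- pi * rbind(c(0, -b, a * b / (b - a)),
                   c(a, -b, 0),
                   c(a, 0, a * b / (b - a)))
max(abs(M %*% Minv - diag(3)))   # zero up to rounding error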

In order to obtain the derivative for more general P we use some structural properties of symmetric and elliptical distributions. Note that Q(r,r) is the uniform distribution on the boundary of the sphere with radius r. If \(X\sim \mathrm{Sym}_2(h)\) then \(\Vert X\Vert \) has density

$$\begin{aligned} f_{\Vert X\Vert }(r)\, =\, \frac{2\pi }{c_2(h)}\, r\,h(r^2),\quad r>0, \end{aligned}$$

and given \(\Vert X\Vert =r\), X is uniformly distributed on \(\{x\in \mathbb {R}^2:\, \Vert x\Vert =r\}\). Taken together this implies the mixture representation

$$\begin{aligned} \mathrm{Sym}_2(h)\, =\, \frac{2\pi }{c_2(h)}\, \int _0^\infty Q(r,r) \, r \, h(r^2)\, dr. \end{aligned}$$
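For instance, in the Gaussian case \(h(t)=e^{-t/2}\) with \(c_2(h)=2\pi \), the radial density above reduces to the Rayleigh density \(r\,e^{-r^2/2}\), which is easily confirmed by simulation:

## ||X|| for X ~ N_2(0, I_2) should be Rayleigh with mean sqrt(pi/2) ~ 1.2533.
set.seed(1)
r <- sqrt(rowSums(matrix(rnorm(2e5), ncol = 2)^2))
c(mean(r), sqrt(pi / 2))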

Elliptical distributions are affine transformations of symmetric distributions. In particular,

$$\begin{aligned} P\, :=\, \mathrm{Ell}_2\bigl ((0,0)^{\text {t}}, \mathrm{diag}(a^2,b^2);h\bigr ) \; =\; \mathrm{Sym}_2(h)^{T_{a,b}}, \end{aligned}$$

so that

$$\begin{aligned} P \; =\; \frac{2\pi }{c_2(h)}\, \int _0^\infty Q(ar,br) \, r \, h(r^2)\, dr. \end{aligned}$$

Inserting quarter spaces we obtain a relation between \(\varPsi _P\) and the \(\varPsi \)-functions corresponding to the measures Q(ab), \(a,b>0\). This in turn can be used to relate the derivative D(P) of \(\varPsi _P\) at (0, 0, 0) to the derivatives obtained earlier for Q(ab), and we arrive at

$$\begin{aligned} D(P) \, =\, \frac{2\pi }{c_2(h)}\, \int _0^\infty M(ar,br)\, r \, h(r^2)\, dr \, = \, \frac{\pi c_1(h)}{c_2(h)}\, M(a,b), \end{aligned}$$

and thus

$$\begin{aligned} D(P)^{-1} \, =\, \frac{c_2(h)}{c_1(h)}\, \begin{pmatrix} 0 &{}\quad -b &{}\quad ab/(b-a) \\ a &{}\quad -b &{}\quad 0 \\ a &{}\quad 0 &{}\quad ab/(b-a)\end{pmatrix}. \end{aligned}$$

If \((x,y,\alpha )\) solves the quarter median problem for an absolutely continuous P, then \(\varPsi _P(x,y,\alpha )=(1/4,1/4,1/4)\). For a sample of size n from P the random vector of observation counts in the quarter spaces associated with the solution of the quarter median problem for P has a multinomial distribution with parameters n and (1/4, 1/4, 1/4, 1/4). The multivariate central limit theorem shows that the standardized counts in the three quarter spaces corresponding to the components of \(\varPsi \) are asymptotically normal with mean vector 0 and covariance matrix

$$\begin{aligned} \varSigma _0 := \begin{pmatrix} 3/16 &{}\quad -1/16 &{}\quad -1/16 \\ -1/16 &{}\quad 3/16 &{}\quad -1/16 \\ -1/16 &{}\quad -1/16 &{}\quad 3/16 \end{pmatrix}. \end{aligned}$$

We may now apply the delta method to obtain asymptotic normality for the triplet consisting of the quarter median coordinates and the rotation angle for a sequence of independent random variables with distribution \(P\,=\,\mathrm{Ell}_2\bigl ((0,0)^{\text {t}}, \mathrm{diag}(a^2,b^2);h\bigr )\). The limiting normal distribution is centered and has covariance matrix

$$\begin{aligned} \varSigma _1\, =\, (D(P)^{-1})^{\text {t}}\, \varSigma _0 \, D(P)^{-1} \; =\; \frac{c_2(h)^2}{4c_1(h)^2}\, \begin{pmatrix} a^2 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad b^2 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad a^2b^2/(a-b)^2 \end{pmatrix}. \end{aligned}$$
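The form of \(\varSigma _1\) can be verified numerically as well; here with the factor \(c_2(h)/c_1(h)\) set to 1 for the comparison:

## (D^{-1})^t Sigma_0 D^{-1} should equal diag(a^2, b^2, a^2 b^2/(a-b)^2)/4.
a <- 2; b <- 3
Dinv <- rbind(c(0, -b, a * b / (b - a)),
              c(a, -b, 0),
              c(a, 0, a * b / (b - a)))
Sigma0 <- (4 * diag(3) - matrix(1, 3, 3)) / 16
S1 <- t(Dinv) %*% Sigma0 %*% Dinv
max(abs(S1 - diag(c(a^2, b^2, a^2 * b^2 / (a - b)^2)) / 4))   # ~ 0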

It remains to extend this to \(P=\mathrm{Ell}_2(\mu ,\varSigma ;h)\). We use essentially the same argument as at the end of Sect. 4.6 (a). By assumption and Theorem 4 we have \(\varSigma =U_\alpha ^{\text {t}}\mathrm{diag}(\lambda _1,\lambda _2)U_\alpha \), where \(U_\alpha \) is the matrix representing the rotation by an angle \(\alpha \) in the interior of the half-open interval I (see (8)), and \((\mu ,U_\alpha )\) is the unique solution of the quarter median problem for P. Theorem 2 implies that \(\alpha _n-\alpha \in (-\pi /4,+\pi /4)\) with probability 1 for n sufficiently large. Let \(Y_i:= U_\alpha ^{\text {t}}(X_i-\mu )\), \(i\in \mathbb {N}\). Then, with probability 1 for n large enough, \(\left( U_\alpha ^{\text {t}}({{\textsc {QMed}}}(X_1,\ldots ,X_n)-\mu ),U_{\alpha _n(X_1,\dots ,X_n)-\alpha }\right) \) is a solution of the quarter median problem for \(Y_1,\dots ,Y_n\). Noting that \(\lambda _1=a^2\) and \(\lambda _2=b^2\) we may now apply the above asymptotic normality result to the Y-sequence to obtain

$$\begin{aligned} \begin{pmatrix}U_\alpha ^{\text {t}}&{}\quad 0 \\ 0 &{}\quad 1\end{pmatrix} \begin{pmatrix}\sqrt{n}\big ({{\textsc {QMed}}}(X_1,\ldots ,X_n)-\mu \big )\\ \sqrt{n}\big (\alpha _n(X_1,\ldots ,X_n)-\alpha \big )\end{pmatrix} \ \rightarrow _{\text {\tiny d}}\ N_3(0,\varSigma _1). \end{aligned}$$

Consequently,

$$\begin{aligned} \begin{pmatrix}\sqrt{n}\big ({{\textsc {QMed}}}(X_1,\ldots ,X_n)-\mu \big )\\ \sqrt{n}\big (\alpha _n(X_1,\ldots ,X_n)-\alpha \big )\end{pmatrix} \ \rightarrow _{\text {\tiny d}}\ N_3\left( 0, \quad \sigma _{{{\textsc {QMed}}}}^2 \begin{pmatrix}\varSigma &{}\quad 0 \\ 0 &{}\quad \frac{\lambda _1\lambda _2}{(\sqrt{\lambda _1} -\sqrt{\lambda _2})^2}\end{pmatrix}\right) , \end{aligned}$$
(49)

which is the statement of the theorem.

4.8 Proof of Proposition 2

We denote by \(\mathcal C\) the set of non-empty closed subsets of \(\mathbb {R}^d\times \mathbb {SO}(d).\) For \(x_1,\ldots ,x_n\in \mathbb {R}^d\) let \(\varGamma (x_1,\dots ,x_n)\) be the (by Theorem 1 non-empty) set of solutions \((\theta ,U)\in \mathbb {R}^d\times \mathbb {SO}(d)\) of the quarter median problem for \(P_{n;x_1,\dots ,x_n}\). As permutations of \(x_1,\ldots ,x_n\) do not change \(P_{n;x_1,\dots ,x_n}\) we may regard \(\varGamma \) as a function on \(E:=(\mathbb {R}^d)^n/\sim \), where \((x_1,\ldots ,x_n),(y_1,\ldots ,y_n)\in (\mathbb {R}^d)^n\) are equivalent if \(y_i=x_{\pi (i)}\) for \(i=1,\ldots ,n\) for some permutation \(\pi \) of the set \(\{1,\ldots ,n\}\). We aim to prove that the function \(\varGamma \) has values in \(\mathcal C\) and that it is measurable in the sense that the set

$$\begin{aligned} A(C)\, := \, \bigl \{ z\in E:\, \varGamma (z)\cap C\ne \emptyset \bigr \} \end{aligned}$$
(50)

is a Borel set for each \(C\in \mathcal C\). Then, by the measurable selection theorem of Kuratowski and Ryll–Nardzewski (see, e.g. Rockafellar 1976, 1.C Corollary) there exists a measurable function \(\tau :E\rightarrow \mathbb {R}^d\times \mathbb {SO}(d)\) such that \(\tau (z)\in \varGamma (z)\) for all \(z\in E\). In fact, we will prove that the set A(C) is closed whenever \(C\in \mathcal C\).

Arguing as in the first part of the proof of Theorem 1 in Sect. 4.1 we see that if \((\theta _\ell ,U_\ell )_{\ell \in \mathbb {N}}\) is a sequence of elements in \(\varGamma (z)\) converging to \((\theta ,U)\in \mathbb {R}^d\times \mathbb {SO}(d)\) as \(\ell \rightarrow \infty \), then \((\theta ,U)\in \varGamma (z)\). This shows that the function \(\varGamma \) has values in \(\mathcal C\).

Now, more generally, fix \(C\in \mathcal C\), let A(C) be as in (50), and let \((z_\ell )_{\ell \in \mathbb {N}}\) be a sequence of elements of A(C) that converges to some element \(z\in E\) as \(\ell \rightarrow \infty \). As pointed out in connection with the transition from \(\mathbb {O}(d)\) to \(\mathbb {H}(d)\) in Sect. 4.4 we can find \((x_1^{(\ell )},\dots ,x_n^{(\ell )})\in z_\ell \), \(\ell \in \mathbb {N}\), and \((x_1,\dots ,x_n)\in z\) such that \((x_1^{(\ell )},\dots ,x_n^{(\ell )})\) converges to \((x_1,\dots ,x_n)\) in the space \((\mathbb {R}^d)^n\). Then for each \(\ell \in \mathbb {N}\) there exists an associated element \((\theta _\ell ,U_\ell )\) of \(\varGamma (x_1^{(\ell )},\dots ,x_n^{(\ell )})\). Arguing as in Sect. 4.1 again, we see that along some subsequence \((\ell ')\) it holds that \((\theta _{\ell '},U_{\ell '})\rightarrow (\theta ,U)\in \mathbb {R}^d\times \mathbb {SO}(d)\) and

$$\begin{aligned} P_{n;x_1^{(\ell ')},\dots ,x_n^{(\ell ')}}^{U_{\ell '}}\ \rightarrow \ P_{n;x_1,\dots ,x_n}^U \quad \text {weakly.} \end{aligned}$$

With \(b_1^{\text {t}},\dots ,b_d^{\text {t}}\) as the row vectors of U this gives \(P_{n;x_1,\dots ,x_n}\big (V_{\pm \pm }(\theta ;b_i,b_j)\big )\ge \frac{1}{4}\). Consequently, \((\theta ,U)\in \varGamma (z)\); also, \(z\in A(C)\) as C is closed.

4.9 Proof of Theorem 7

Let \(\theta \) be a quarter median of \(P_n\), with associated orthogonal directions \(b,b'\in \mathbb {S}_{1}.\) Let \(L_{\theta ,b}=\{\theta + tb:t\in \mathbb {R}\},\,L_{\theta ,b'}=\{\theta + tb':t\in \mathbb {R}\}\) be the axes of the rectangular coordinate system with origin \(\theta \) and directions \(b,b'\). We consider two cases.

(i) \(\theta \in \{x_1,\dots ,x_n\}\). We may assume without loss of generality that \(\theta =x_1,\) and define

$$\begin{aligned} \psi =\min \bigl (\min \bigl \{\arccos (|b_{1j}^{\text {t}}b|):2\le j\le n\bigr \}, \,\min \bigl \{\arccos (|b_{1j}^{\text {t}}b'|):2\le j\le n\bigr \}\bigr ). \end{aligned}$$

There are orthogonal directions \(b_*,b_*'\in \mathbb {S}_{1}\) obtained from \(b,b'\) by the same clockwise or counterclockwise rotation through the angle \(\psi \), where \(b_*\) or \(b_*'\) is an element of \(\{-b_{12},b_{12},\ldots ,-b_{1n},b_{1n}\}\). By construction, the number of data points \(x_j\) lying in \(V_{\pm \pm }(\theta ;b_{*},b_{*}')\) is greater than or equal to the number of data points \(x_j\) lying in \(V_{\pm \pm }(\theta ;b,b')\); also, the numbers of data points \(x_j\) lying in \(H_{\pm }(\theta ;b_*)\) and in \(H_{\pm }(\theta ;b_*')\) are greater than or equal to the numbers of data points \(x_j\) lying in \(H_{\pm }(\theta ;b)\) and in \(H_{\pm }(\theta ;b')\), respectively. It follows that \((\theta ,U_*)\) with \(U_*\) as the matrix with the row vectors \(b_*,b_*'\) solves the quarter median problem for \(P_n\). Thus, by Proposition 1 (c), there is an \(\eta \in \mathcal {M}\bigl (P_n^{U_*}\bigr )\) such that \((\theta _*,U_*)\) with \(\theta _*=U_*^{\text {t}}\eta \) also solves the quarter median problem for \(P_n\). Note that \(\theta _* = \alpha b_* +\beta b_*'\), where \(\alpha \) is a median of \(P_n^{b_*}\) and \(\beta \) is a median of \(P_n^{b_*'}\). Because \(b_*\) or \(b_*'\) is an element of \(\{-b_{12},b_{12},\ldots ,-b_{1n},b_{1n}\}\), and

$$\begin{aligned} \hbox {Med}_\pm \left( P_n^{-b_*}\right) =-\hbox {Med}_\mp \left( P_n^{b_*}\right) ,\qquad \hbox {Med}_\pm \bigl (P_n^{-b_*'}\bigr )=-\hbox {Med}_\mp \bigl (P_n^{b_*'}\bigr ), \end{aligned}$$
(51)

it follows that \((\theta _*,U_*)\in \mathscr {L}(P_n).\)

(ii) \(\theta \notin \{x_1,\dots ,x_n\}\). Let \(\tilde{x}\in \{x_1,\dots ,x_n\}\) be a data point with the shortest distance to \(L_{\theta ,b}\), i.e. \(\inf \{\Vert \tilde{x}-y\Vert :y\in L_{\theta ,b}\}=\min \left\{ \inf \{\Vert x_j-y\Vert :y\in L_{\theta ,b}\}:1\le j\le n\right\} \), and let \(\tilde{\tilde{x}}\in \{x_1,\dots ,x_n\}\) be a data point with the shortest distance to \(L_{\theta ,b'}\). With \(\tilde{\theta }\) as the intersection point of the line parallel to \(L_{\theta ,b}\) through \(\tilde{x}\) and the line parallel to \(L_{\theta ,b'}\) through \(\tilde{\tilde{x}}\), we have that \((\tilde{\theta },U)\) with U as the matrix with the row vectors \(b,b'\) is a solution of the quarter median problem for \(P_n.\) If \(\tilde{x}=\tilde{\tilde{x}}\) then \(\tilde{\theta }=\tilde{x}=\tilde{\tilde{x}}\) and we arrive at case (i). So, we can (and do) assume that there are two points, \(x_1,x_2\) say, such that \(x_1\in L_{\theta ,b},\,x_2\in L_{\theta ,b'}.\) Now we put \(B:=\{b_{1j}:2\le j\le n\}\cup \{b_{2j}:3\le j\le n\}\) and define

$$\begin{aligned} \psi =\min \bigl (\min \bigl \{\arccos (|d^{\text {t}}b|):d\in B\bigr \}, \,\min \bigl \{\arccos (|d^{\text {t}}b'|):d\in B\bigr \}\bigr ). \end{aligned}$$

There are orthogonal directions \(b_*,b_*'\in \mathbb {S}_{1}\) obtained from \(b,b'\) by the same clockwise or counterclockwise rotation through the angle \(\psi \), where \(b_*\) or \(b_*'\) is an element of \(B\cup (-B)\). With \(L_1=\{x_1 + tb_*:t\in \mathbb {R}\}\) and \(L_2=\{x_2 + tb_*':t\in \mathbb {R}\}\) we obtain the axes of a new rectangular coordinate system. The origin \(\theta \) of the coordinate system with the axes \(L_{\theta ,b}\) and \(L_{\theta ,b'}\) and the origin \(\theta _*\) of the new system are both located on the Thales circle over the line segment connecting \(x_1\) and \(x_2\). By construction, the number of data points \(x_j\) lying in \(V_{\pm \pm }(\theta _*;b_*,b_*')\) is greater than or equal to the number of data points \(x_j\) lying in \(V_{\pm \pm }(\theta ;b,b')\). Further, the numbers of data points \(x_j\) lying in \(H_{\pm }(\theta _*;b_*)\) and in \(H_{\pm }(\theta _*;b_*')\) are greater than or equal to the numbers of data points \(x_j\) lying in \(H_{\pm }(\theta ;b)\) and in \(H_{\pm }(\theta ;b')\), respectively. Consequently, \((\theta _*,U_*)\) with \(U_*\) as the matrix with the row vectors \(b_*,b_*'\) solves the quarter median problem for \(P_n\). Again, by Proposition 1 (c), there is an \(\eta \in \mathcal {M}\bigl (P_n^{U_*}\bigr )\) such that \((\theta _{**},U_*)\), with \(\theta _{**}=U_*^{\text {t}}\eta \), also solves the quarter median problem for \(P_n\). Here, \(b_*\) or \(b_*'\) is an element of \(B\cup (-B)\), so that, again by (51), we have \(\bigl (\theta _{**},U_*\bigr ) \in \mathscr {L}(P_n)\).