3. Estimating the income density function

Eric Graf and Yves Tillé


In a design-based approach to finite populations, inference is made with respect to the sampling design $P(S)$ used to select the sample $S$ from the population $U$ of finite size $N$. In this approach, only the sample inclusion indicators are random; all other quantities are fixed.
The income distribution function at the population level is then a step function, $F_y(x) = \sum_{k \in U} \mathbf{1}_{y_k \le x} / N$, and its derivative, the density function, does not exist because of the discontinuities. Unless one is willing to adopt a model-based approach, with a superpopulation model to justify the term income density function, the distribution function must be artificially smoothed to make it differentiable. It is therefore a misuse of language when we speak here of a density function. With this smoothing in mind, Deville (2000) and Osier (2009) propose estimating the income density function with a Gaussian kernel:

$$
\begin{aligned}
K(u) &= \frac{1}{h\sqrt{2\pi}}\, e^{-u^2/2}, \qquad u = \frac{x - y_k}{h},\\
\hat f_1(x) &= \frac{1}{\hat N}\sum_{k \in S} w_k\, K\!\left(\frac{x - y_k}{h}\right)\\
&= \frac{1}{h\sqrt{2\pi}}\,\frac{1}{\hat N}\sum_{k \in S} w_k \exp\!\left[-\frac{(x - y_k)^2}{2h^2}\right].
\end{aligned}
\tag{3.1}
$$

Here, $h$ is the bandwidth, which Osier estimates by $\hat h = \hat\sigma \hat N^{-0.2}$, and $\hat\sigma$ is the estimated standard deviation of the empirical income distribution:

$$
\hat\sigma = \sqrt{\frac{\sum_{k \in S} w_k y_k^2}{\hat N} - \left(\frac{\sum_{k \in S} w_k y_k}{\hat N}\right)^{2}} = \sqrt{\frac{\sum_{k \in S} w_k y_k^2}{\hat N} - \bar y_w^{2}}.
$$
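The Gaussian kernel estimator (3.1) and the bandwidth $\hat h = \hat\sigma \hat N^{-0.2}$ can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that $\hat N$ is the sum of the weights $w_k$, and the function names `weighted_sd` and `gaussian_density` are hypothetical.

```python
import math

def weighted_sd(y, w):
    """Weighted standard deviation: sqrt(sum(w*y^2)/N_hat - ybar_w^2)."""
    n_hat = sum(w)  # assumption: N_hat is the sum of the weights
    mean = sum(wk * yk for yk, wk in zip(y, w)) / n_hat
    second_moment = sum(wk * yk * yk for yk, wk in zip(y, w)) / n_hat
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(second_moment - mean * mean, 0.0))

def gaussian_density(x, y, w, h=None):
    """Estimator (3.1) at x; h defaults to sigma_hat * N_hat**(-0.2)."""
    n_hat = sum(w)
    if h is None:
        h = weighted_sd(y, w) * n_hat ** (-0.2)
    total = sum(wk * math.exp(-((x - yk) ** 2) / (2.0 * h * h))
                for yk, wk in zip(y, w))
    return total / (n_hat * h * math.sqrt(2.0 * math.pi))
```

Because every observation enters `weighted_sd` through its square, a single very large income inflates the default bandwidth for the whole distribution.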

Note that this estimate of $\sigma$ is not robust, being very sensitive to extreme values of $y$. Income data very often have a long right tail in which very high values are possible; these are referred to as representative outliers in the sense of Chambers (1986) and Hulliger (1999). As our simulations in Section 4 show, this can strongly bias our variance estimates. Verma and Betti (2011) also use a kernel, recalling that, according to Silverman (1986), the choice of kernel is not crucial for ensuring the convergence of $\hat f(y)$ to $f(y)$, whereas the choice of bandwidth is.
They use a value recommended by Silverman for distributions with positive skewness, $h = 0.79\,(\hat Q_{75} - \hat Q_{25})\,\hat N^{-0.2}$. In their conclusions, they note that the linearization method can be problematic because of irregularities in the empirical density function. We would add that these problems are all the more concerning because survey data frequently contain clusters of observations at certain values (owing to rounding or to questions asked in income brackets), which can complicate density estimation. The remainder of the article describes solutions we put forward to reduce the bias of the estimated variance.
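The quartile-based bandwidth $h = 0.79\,(\hat Q_{75} - \hat Q_{25})\,\hat N^{-0.2}$ can be sketched as follows. The text does not say how the weighted quartiles are computed, so this sketch assumes the simplest definition (the smallest observed value whose cumulative weight share reaches the target) and takes $\hat N$ to be the sum of the weights. The names `weighted_quantile` and `iqr_bandwidth` are hypothetical.

```python
def weighted_quantile(y, w, alpha):
    """Smallest observed value whose cumulative weight share reaches alpha."""
    pairs = sorted(zip(y, w))          # sort observations with their weights
    threshold = alpha * sum(w)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]                # fallback for rounding near alpha = 1

def iqr_bandwidth(y, w):
    """h = 0.79 * (Q75 - Q25) * N_hat**(-0.2)."""
    n_hat = sum(w)                     # assumption: N_hat is the sum of weights
    iqr = weighted_quantile(y, w, 0.75) - weighted_quantile(y, w, 0.25)
    return 0.79 * iqr * n_hat ** (-0.2)
```

With eight equally weighted incomes, replacing the largest one by an extreme value leaves both quartiles, and hence this bandwidth, unchanged, which is the robustness the interquartile range is meant to provide.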

3.1 Using the logarithm

One solution which, as will be seen below, gives very good results is simply to use the logarithm to estimate the density at $x$. If we set $v = \log(x + a)$, where $x$ is the income and $a$ is a positive real number, for example equal to $(|\min_k(y_k)| + 1)$ when there are negative or zero incomes (neglecting the fact that $a$ would be estimated), we have

$$
F_v(v) = P(V \le v) = P\bigl(\log(Y + a) \le v\bigr) = P(Y \le e^{v} - a) = F_y(e^{v} - a),
$$

where $V$ and $Y$ would be random variables. Hence,

$$
f_v(v) = \frac{dF_v(v)}{dv} = \frac{dF_y(e^{v} - a)}{dv} = f_y(e^{v} - a)\, e^{v}.
$$

In other words, $f_v(v) = f_y(x)\,(x + a)$, which gives the following estimator of the density at $x$:

$$
\hat f_2(x) = \frac{\hat f_v(v)}{x + a} = \frac{\hat f_v\bigl(\log(x + a)\bigr)}{x + a}. \tag{3.2}
$$

The density of $Y$ at $x$ can therefore be evaluated by estimating the density of the logarithm of the shifted variable and dividing it by the value of the shifted variable, $x + a$, at the point of interest. The property remains valid in a finite population. Using the logarithm has the advantage of reducing the leverage exerted by large income values in the kernel approximation of the density. The simulations show that this very simple method greatly reduces the bias.
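Estimator (3.2) can be sketched in Python as below. This is an illustration under stated assumptions rather than the authors' code: the density on the log scale is estimated with a weighted Gaussian kernel, $\hat N$ is taken to be the sum of the weights, and the log-scale bandwidth `h_log` is a hypothetical parameter that the caller must supply, since the text does not specify how it is chosen.

```python
import math

def gaussian_density(x, y, w, h):
    """Weighted Gaussian kernel estimate at x with bandwidth h."""
    n_hat = sum(w)  # assumption: N_hat is the sum of the weights
    total = sum(wk * math.exp(-((x - yk) ** 2) / (2.0 * h * h))
                for yk, wk in zip(y, w))
    return total / (n_hat * h * math.sqrt(2.0 * math.pi))

def log_density(x, y, w, h_log, a=None):
    """Estimator (3.2): density of log(y + a) at log(x + a), divided by x + a."""
    if a is None:
        a = abs(min(y)) + 1.0          # shift suggested in the text
    if x + a <= 0:
        raise ValueError("x + a must be positive")
    v_obs = [math.log(yk + a) for yk in y]   # observations on the log scale
    v = math.log(x + a)                      # evaluation point on the log scale
    return gaussian_density(v, v_obs, w, h_log) / (x + a)
```

The division by `x + a` is the Jacobian of the transformation $v = \log(x + a)$; without it, the result would be a density for $v$ rather than for the income itself.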

3.2 Nearest neighbours with a minimum bandwidth

Deville (2000) sketches another way of estimating the density, of the "nearest neighbour" type (see Silverman 1986), using the kernel

$$
K_D(u) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le u < b,\\[4pt] 0 & \text{otherwise,} \end{cases}
$$

with $u = y_k$, and where the choice of $a$ and $b$, satisfying $x \in [a, b]$, remains to be determined and could depend on $x$. The distance $(b - a)$ represents the bandwidth $h$. The density estimate would then be

$$
\begin{aligned}
\hat f_D(x, a, b) &= \frac{1}{\hat N}\sum_{k \in S} w_k\, K_D(y_k)\\
&= \frac{1}{\hat N}\sum_{k \in S} w_k\, \frac{1}{b - a}\, \mathbf{1}_{y_k \in [a, b)}\\
&= \frac{\hat F_y(b) - \hat F_y(a)}{b - a}, \qquad x \in [a, b)
\end{aligned}
\tag{3.3}
$$

with $\hat F_y(x) = \sum_{k \in S} w_k\, \mathbf{1}_{y_k \le x} / \hat N$.

Note that the density estimate (3.3) is not a continuous function and that it would not be well suited to estimating density values at the far ends of the tails of the distribution. Since our work does not rely heavily on the tails of the distribution, we consider this approach as an option.
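The uniform-kernel estimate (3.3) amounts to the weighted share of observations falling in $[a, b)$, divided by the width of the interval. A minimal Python sketch, assuming $\hat N$ is the sum of the weights and using the hypothetical name `uniform_density`:

```python
def uniform_density(y, w, a, b):
    """Weighted share of observations in [a, b), divided by the width b - a."""
    if b <= a:
        raise ValueError("b must be greater than a")
    n_hat = sum(w)  # assumption: N_hat is the sum of the weights
    mass = sum(wk for yk, wk in zip(y, w) if a <= yk < b)
    return mass / (n_hat * (b - a))
```

The returned value is the same for every $x$ in $[a, b)$, which makes the discontinuity of the estimate visible: it jumps whenever an observation enters or leaves the interval.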

Our second proposal for estimating the density at $x$ draws on the idea above. It is of the "nearest neighbour" type but also imposes a minimum bandwidth: the method uses at least the $p$ observations nearest to the point $x$, subject to a minimum bandwidth $h(p) \ge h_{\text{opt}}$, where

$$
h_{\text{opt}} = 0.9\,\min\!\left(\hat\sigma,\ \frac{\hat Q_{75} - \hat Q_{25}}{1.34}\right)\hat N^{-1/5}
$$

is Silverman's (1986) rule of thumb for determining the bandwidth. This value is also the default bandwidth used by the R function density when none is specified. This solution is more robust than (3.1) and avoids the problems encountered when several values $y_k$ are very close to one another, which happens frequently because respondents tend to round their income.
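The sketch below computes a weighted analogue of Silverman's rule of thumb in its standard form, in which the 1.34 divisor applies to the interquartile range, as in R's `bw.nrd0`. It is an assumption-laden illustration: $\hat N$ is taken to be the sum of the weights, the weighted quartiles use a simple cumulative-weight definition that the text does not prescribe, and the function names are hypothetical.

```python
import math

def weighted_quantile(y, w, alpha):
    """Smallest observed value whose cumulative weight share reaches alpha."""
    pairs = sorted(zip(y, w))
    threshold = alpha * sum(w)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]                # fallback for rounding near alpha = 1

def silverman_bandwidth(y, w):
    """h_opt = 0.9 * min(sigma_hat, IQR / 1.34) * N_hat**(-1/5)."""
    n_hat = sum(w)  # assumption: N_hat is the sum of the weights
    mean = sum(wk * yk for yk, wk in zip(y, w)) / n_hat
    second = sum(wk * yk * yk for yk, wk in zip(y, w)) / n_hat
    sigma = math.sqrt(max(second - mean * mean, 0.0))
    iqr = weighted_quantile(y, w, 0.75) - weighted_quantile(y, w, 0.25)
    return 0.9 * min(sigma, iqr / 1.34) * n_hat ** (-0.2)
```

Taking the minimum of the two scale measures is what limits the influence of extreme incomes: when an outlier inflates $\hat\sigma$, the interquartile term takes over.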

With the values $y_k$, $k = 1, \ldots, n$, assumed to be sorted in increasing order, the width $h(p)$ of the window around $x$ is initially determined by the $p$ nearest observations, with $p \ll n$. For the simulations presented in the next section, after various trials, the initial $p$ was set to 30. The density assigned at $x$ is the density estimated at the nearest observed point $y_j$ less than or equal to $x$, that is, $j = \max(k \mid y_k \le x)$, $k = 1, \ldots, n$. The bandwidth at $x$ will in fact depend on the $p_j$ nearest observations around $y_j$, with $p_j \ge p$. It will be denoted $h(p_j)$ in what follows as a reminder. The density is thus estimated only at observed points, with no smoothing or interpolation carried out between the $\hat f(y_j)$. The algorithm for estimating $\hat f(y_j)$ is as follows (see also Figure 3.1):

Figure 3.1

1. The initial width of the window around the point $y_j$, with $p_j = p$, is defined by:

$$h(p_j) = \frac{y_u + y_{u+1}}{2} - \frac{y_\ell + y_{\ell-1}}{2}; \qquad u = \begin{cases} j + p_j/2 - 1 & \text{if } p_j \text{ is even} \\ j + \lfloor p_j/2 \rfloor & \text{if } p_j \text{ is odd} \end{cases} \qquad \ell = j - \lfloor p_j/2 \rfloor .$$

2. If the window width $h(p_j)$ thus obtained is smaller than $h_{\mathrm{opt}}$, both bounds are moved outward:

upper bound: $u \to u + 1$, as long as $u < n$,

lower bound: $\ell \to \ell - 1$, as long as $\ell > 1$,

which implies $p_j \to p_j + 2$, unless $u = n$ or $\ell = 1$, in which case there are no longer the same number of points to the left and to the right of $y_j$.

3. Repeat step 2 until $h(p_j) \ge h_{\mathrm{opt}}$.

4. The estimated density at $x$ is then given by

$$\hat{f}(x) = \hat{f}(y_j) = \begin{cases} \dfrac{p_j}{n\,h(p_j)} & \text{without weighting,} \\[2ex] \dfrac{\sum_{k \,\in\, \{p_j \text{ nearest to } y_j\}} w_k^{\mathrm{std}}}{n\,h(p_j)} & \text{with weighting,} \end{cases}$$

with the standardized weights $w_k^{\mathrm{std}} = w_k / \bar{w}$, $k = 1, \ldots, n$.
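Steps 1 to 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`window_width`, `density_at`, `density_at_x`) are hypothetical, the sample `y` is assumed to be sorted in ascending order, and the treatment of the sample edges (clamping the indices to $1$ and $n$ and reusing the extreme observation when $y_{u+1}$ or $y_{\ell-1}$ does not exist) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right


def window_width(y, l, u):
    """h = (y_u + y_{u+1})/2 - (y_l + y_{l-1})/2, with 1-based indices.

    Assumption: at the sample edges the missing neighbour is replaced
    by the extreme observation itself.
    """
    n = len(y)
    upper = (y[u - 1] + y[min(u + 1, n) - 1]) / 2
    lower = (y[l - 1] + y[max(l - 1, 1) - 1]) / 2
    return upper - lower


def density_at(y, j, p, h_opt, w=None):
    """Density estimate at the j-th order statistic (1-based) of sorted y."""
    n = len(y)
    # Step 1: initial bounds for a window of p observations around y_j.
    u = j + p // 2 - 1 if p % 2 == 0 else j + p // 2
    l = j - p // 2
    u, l = min(u, n), max(l, 1)  # assumption: clamp to the sample range
    h = window_width(y, l, u)
    # Steps 2 and 3: widen the window until h >= h_opt or the sample is exhausted.
    while h < h_opt and (u < n or l > 1):
        if u < n:
            u += 1
        if l > 1:
            l -= 1
        h = window_width(y, l, u)
    if h <= 0:
        raise ValueError("zero window width (tied observations)")
    p_j = u - l + 1  # number of observations finally used
    # Step 4: unweighted or weighted estimate.
    if w is None:
        return p_j / (n * h)
    w_bar = sum(w) / n
    return sum(w[k - 1] / w_bar for k in range(l, u + 1)) / (n * h)


def density_at_x(y, x, p, h_opt, w=None):
    """Impute at x the estimate at y_j, with j = max(k | y_k <= x)."""
    j = bisect_right(y, x)  # count of observations <= x
    if j == 0:
        # Assumption: the text does not cover x below the smallest observation.
        raise ValueError("x is smaller than every observation")
    return density_at(y, j, p, h_opt, w)
```

For ten equally spaced values 1, 2, ..., 10 with $p = 4$ and $j = 5$, the initial window covers observations 3 to 6, so $h = 6.5 - 2.5 = 4$ and the unweighted estimate is $4/(10 \times 4) = 0.1$, as expected for evenly spread data.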

The number of observations $p_j$ used in the calculation may vary and depends on the local curvature of the empirical distribution function. The condition $h(p_j) \ge h_{\mathrm{opt}}$ guarantees a minimum window width where many observations are concentrated in a small interval. The procedure is made even more robust by combining this approach with the previous one, that is, by estimating the density of the logarithm of the variable and dividing it by the value of the variable on the original (non-logarithmic) scale:

$$\hat{f}_3(x) = \frac{\hat{f}\left(\log(x + a)\right)}{x + a}. \qquad (3.4)$$
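The division by $x + a$ in (3.4) is the Jacobian of the logarithmic transformation. A short derivation, assuming $x + a > 0$ and writing $Z = \log(X + a)$, with $F_Z$ and $f_Z$ the (smoothed) distribution and density functions of $Z$; this notation is introduced here for illustration only:

$$F_X(x) = P(X \le x) = P\left(\log(X + a) \le \log(x + a)\right) = F_Z\left(\log(x + a)\right),$$

$$f_X(x) = \frac{d}{dx} F_Z\left(\log(x + a)\right) = f_Z\left(\log(x + a)\right) \cdot \frac{1}{x + a}.$$

Replacing $f_Z$ with the nearest-neighbour estimate $\hat{f}$ computed on the log-transformed values gives (3.4).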

3.3 Robustness of the linearized variable

As mentioned above, for the median and other quantiles, Croux (1998) notes that the empirical influence function, or linearized variable, estimated from the sample is not as robust as it appears, even when the density function is known. We verified this for the SILC data used in the simulations, modelled with a generalized beta distribution of the second kind (GB2) using the R function profml.gb2 (Graf and Nedyalkova 2011). For small samples ($n \le 100$), the potential bias of the linearized variable caused by too many extreme values can also bias the variance estimate computed from it. For larger samples ($n \ge 1{,}000$), the relative bias in the variance estimated with the empirical linearized variable, compared with the theoretical one, can reach 5%. It is, however, below one percent in absolute value in three quarters of the cases.
