Clem's bookmarks

terminology - Gamma distribution what is scale and rate - Cross Validated

An interesting explanation of the rate parameter in the gamma distribution. I had never really understood why everyone works with the rate when the scale is so much more intuitive. It turns out I simply don't use the distribution in the context where the rate makes sense.

In short, a sum of n independent exponential variables with rate (intensity) beta follows a gamma distribution with shape n and rate beta.
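
A quick simulation makes the statement concrete. This is only a sketch: the values of `n_terms`, `beta` and `n_sim` are arbitrary choices of mine, not taken from the linked answer.

```r
# Illustrative values (assumptions, not from the linked post)
set.seed(1)
n_terms <- 5      # number of exponential variables summed (gamma shape)
beta    <- 2      # common rate of the exponentials (gamma rate)
n_sim   <- 1e5    # number of simulated sums

# Each row holds n_terms independent Exp(beta) draws; sum across each row
draws <- matrix(rexp(n_sim * n_terms, rate = beta), ncol = n_terms)
s <- rowSums(draws)

# Compare empirical moments with those of Gamma(shape = n_terms, rate = beta)
c(mean_sim = mean(s), mean_gamma = n_terms / beta)
c(var_sim  = var(s),  var_gamma  = n_terms / beta^2)

# Kolmogorov-Smirnov distance to the gamma CDF (should be small)
ks <- ks.test(s, "pgamma", shape = n_terms, rate = beta)
ks$statistic
```

With the scale parameterization the same check would use `scale = 1 / beta`, which is where the rate version reads more naturally: the rate of the sum is just the rate of the underlying events.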

stats

September 22, 2025 at 14:24:53 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/489998/gamma-distribution-what-is-scale-and-rate

[R] lmer, p-values and all that

On degrees of freedom in an lmer fit.

stats

March 3, 2025 at 11:08:35 GMT+1 · permalink

·

https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html

Marginal effect zoo

An interesting book.

stats

December 2, 2024 at 12:22:25 GMT+1 · permalink

·

https://marginaleffects.com/

xkcd: Y-Axis

Always handy to keep around and pull out at the right moment.

stats

November 21, 2024 at 17:20:01 GMT+1 · permalink

·

https://xkcd.com/2023/

StatsRef.com | The sourcebook for statistics

Interesting resource.

stats

November 8, 2024 at 16:26:00 GMT+1 · permalink

·

https://www.statsref.com/

generalized linear model - difference between GLM covariance matrix from MLE vs. IRLS for non-canonical link - Cross Validated

On how covariance matrices are computed for GLMs in R.

stats

July 31, 2024 at 16:41:20 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/635130/difference-between-glm-covariance-matrix-from-mle-vs-irls-for-non-canonical-lin

markov chain montecarlo - Fisher information vs Posterior Covariance - Cross Validated

An interesting post stating that the inverse of the Fisher information matrix matches the posterior covariance, but it gives no reference for this. I understand why the MAP estimate coincides with the MLE under improper uniform priors, but I do not see why the inverse Fisher information should theoretically equal the posterior covariance. I would really like to settle this point, because it would help a lot with the study I am currently working on.
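
A toy case where everything is available in closed form helps to see what is going on. The sketch below is my own example, not the one from the post: a Poisson model with an improper flat prior on the rate. The posterior is then Gamma(S + 1, rate = n), with S the sum of the counts, so its variance is (S + 1) / n^2, while the inverse Fisher information evaluated at the MLE is S / n^2. The two are not equal, but their ratio tends to 1 as n grows, which is the kind of asymptotic agreement described by the Bernstein-von Mises theorem.

```r
set.seed(42)
lambda_true <- 3                      # illustrative value (assumption)
sizes <- c(10, 100, 1000, 10000)      # sample sizes to compare

res <- t(sapply(sizes, function(n) {
  x <- rpois(n, lambda_true)
  S <- sum(x)
  c(n = n,
    inv_fisher = (S / n) / n,         # inverse Fisher information at the MLE S / n
    post_var   = (S + 1) / n^2)       # variance of the Gamma(S + 1, rate = n) posterior
}))

# Ratio of posterior variance to inverse Fisher information: (S + 1) / S
res <- cbind(res, ratio = res[, "post_var"] / res[, "inv_fisher"])
res
```

So in this example the equality is only approximate, and it gets better with the sample size.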

stats

July 24, 2024 at 21:34:18 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/605657/fisher-information-vs-posterior-covariance

Case Study III: Model Selection Uncertainty

On using the bootstrap with validate and ols to get a measure of model uncertainty in the model selection procedure (comparison with stepAIC).

stats

June 7, 2024 at 10:27:50 GMT+2 · permalink

·

https://conservancy.umn.edu/server/api/core/bitstreams/16468fbc-dbd9-4b8d-a5c7-09d03ff48821/content

Hey—let’s collect all the stupid things that researchers say in order to deflect legitimate criticism | Statistical Modeling, Causal Inference, and Social Science

Interesting.

stats

March 27, 2024 at 13:40:27 GMT+1 · permalink

·

https://statmodeling.stat.columbia.edu/2024/03/23/hey-lets-collect-all-the-stupid-things-that-researchers-say-in-order-to-deflect-legitimate-criticism/

Bayes Rules! An Introduction to Applied Bayesian Modeling

A book on Bayesian statistics.

stats

March 19, 2024 at 14:41:18 GMT+1 · permalink

·

https://www.bayesrulesbook.com/

lognormal distribution - Variance of $X$ and Variance of $\log(X)$. How to relate them? - Cross Validated

Worth keeping at hand.

stats

November 21, 2023 at 22:01:38 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/418313/variance-of-x-and-variance-of-logx-how-to-relate-them

When/is it ever appropriate to use the mean of ratios? : AskStatistics

Another interesting idea. In the end, it all depends on what the ratio is being estimated for; I don't know why I overthink it so much. Together with the previous bookmark, this solves my problem...

stats

July 27, 2023 at 09:05:26 GMT+2 · permalink

·

https://www.reddit.com/r/AskStatistics/comments/m7h8et/whenis_it_ever_appropriate_to_use_the_mean_of/

NLMR is not available on CRAN anymore · Issue #95 · ropensci/NLMR · GitHub

RandomFields(Utils) is no longer maintained, it's official. The message from the author/maintainer:

Dear Users of RandomFields(Utils),

it is a de facto decision of CRAN that CRAN does not support any
further updates of the auxiliary package RandomFieldsUtils since April 2022.
So, I do not have any hope that a new version of RandomFields will be accepted by CRAN, eventually.

The future of my R packages is very unclear. The currently most likely scenario is to put the latest versions on github and to move to Julia for future programming.

Many thanks to you, Kurt and Uwe for the great support the past years.

Best,
Martin

stats

March 28, 2023 at 08:50:45 GMT+2 · permalink

·

https://github.com/ropensci/NLMR/issues/95

Chernoff bound - Wikipedia

Interesting. If a variable X is a sum of other independent variables, this bound can be used to build a fairly tight confidence interval. Tighter than Vysochanskij-Petunin; worth keeping at hand.
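
To get a feel for the gain, here is a small sketch on a sum of Bernoulli variables (a binomial). All the numbers are arbitrary choices of mine. It uses the multiplicative form of the Chernoff bound, P(X >= (1 + delta) mu) <= exp(-delta^2 mu / (2 + delta)), and the Vysochanskij-Petunin bound 4 / (9 k^2), which is two-sided and assumes a unimodal distribution with k > sqrt(8/3). Both are therefore upper bounds on the upper tail P(X >= mu + k sigma).

```r
n <- 1000; p <- 0.1            # illustrative binomial (assumption)
mu    <- n * p
sigma <- sqrt(n * p * (1 - p))
k     <- c(3, 4, 5)            # deviations from the mean, in units of sigma
delta <- k * sigma / mu        # same deviations, relative to the mean

# Multiplicative Chernoff bound on P(X >= (1 + delta) * mu)
chernoff <- exp(-delta^2 * mu / (2 + delta))

# Vysochanskij-Petunin bound on P(|X - mu| >= k * sigma)
vp <- 4 / (9 * k^2)

# Exact upper tail P(X >= mu + k * sigma) for reference
exact <- pbinom(ceiling(mu + k * sigma) - 1, n, p, lower.tail = FALSE)

data.frame(k, exact, chernoff, vp)
```

In this setting the Chernoff bound beats Vysochanskij-Petunin for the values of k shown, and the gap widens further out in the tail. Closer to the mean the ordering can flip, so the comparison is worth redoing for the case at hand.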

stats

March 17, 2023 at 12:11:52 GMT+1 · permalink

·

https://en.wikipedia.org/wiki/Chernoff_bound

A paper used capital T’s instead of error bars. But wait, there’s more! – Retraction Watch

This is brilliant.

marrant · stats

December 14, 2022 at 21:40:09 GMT+1 · permalink

·

https://retractionwatch.com/2022/12/05/a-paper-used-capital-ts-instead-of-error-bars-but-wait-theres-more/

Relationship between poisson and exponential distribution - Cross Validated

A crystal-clear derivation of the exponential distribution of waiting times in a Poisson process.
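
The core of the argument fits in one line. The notation is mine, not necessarily that of the linked answer: $N(t)$ is the number of events in $[0, t]$ of a Poisson process with rate $\lambda$, and $T$ is the waiting time until the first event. $T$ exceeds $t$ exactly when no event occurs before $t$:
$$
P(T > t) = P(N(t) = 0) = \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = e^{-\lambda t}
$$
so $P(T \le t) = 1 - e^{-\lambda t}$, which is the CDF of an exponential distribution with rate $\lambda$.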

stats

September 29, 2022 at 20:46:40 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/2092/relationship-between-poisson-and-exponential-distribution

If Nothing Goes Wrong, Is Everything All Right? Interpreting Zero Numerators | JAMA | JAMA Network

Interesting article: if I sample n individuals and find no positives, what is the maximum plausible risk of being positive? The rule here: there is a 95% chance that the risk is below 3/n and, following the same reasoning, an 86% chance that it is below 2/n.

Logic: we want 95% confidence, i.e. a significance level of 0.05. The largest risk p still compatible with n zeros satisfies (1-p)^n = 0.05, so p = 1 - 0.05^(1/n), which is roughly -log(0.05)/n ~= 3/n.

Check in R:

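set.seed(777)
n <- 10:100                             # sample sizes
p <- seq(0, 0.5, length.out = 1000)     # candidate prevalences
g <- sapply(n, function(ni) {
  # 100 binomial draws of size ni for each prevalence (one column per value of p)
  m0 <- sapply(p, function(y) rbinom(100, size = ni, prob = y))
  cs <- colSums(m0 == 0)                # number of all-negative samples per prevalence
  css <- cumsum(cs) / sum(cs)           # cumulative share of the zeros
  p[max(which(css < 0.95))]             # prevalence below which 95% of the zeros fall
})
plot(n, g, xlab = "Sample size",
     ylab = "Prevalence accounting for 95% of the zeros")
lines(n, 3 / n, col = "red", lwd = 2)   # rule-of-three curve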

Math check. Consider the series:
$$
\sum_{k=0}^{\infty} \frac{z^k}{k!} = \exp(z)
$$
Setting $z = \log(0.05)/n$ lets us expand $\exp z =
0.05^{1/n}$ as follows:
$$
0.05^{1/n} = \sum_{k=0}^{\infty} \frac{\log(0.05)^k}{n^k k!}
$$
For $n$ large enough, keeping the first two terms gives:
$$
0.05^{1/n} \approx 1 + \log(0.05)/n
$$
so the threshold is $1 - 0.05^{1/n} \approx -\log(0.05)/n$, and $-\log(0.05) \approx 3$.
Following the same reasoning, for an 86\% interval,
$-\log(0.14) \approx 2$ and the threshold is $2/n$.
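
The approximation can also be checked directly, without simulation, by comparing the exact threshold 1 - alpha^(1/n) with the shortcut. A minimal sketch, with sample sizes picked arbitrarily:

```r
n <- c(10, 30, 100, 300, 1000)   # illustrative sample sizes (assumption)

# Exact thresholds: largest p such that (1 - p)^n is still >= alpha
exact_95 <- 1 - 0.05^(1 / n)     # alpha = 0.05
exact_86 <- 1 - 0.14^(1 / n)     # alpha = 0.14

data.frame(n,
           exact_95, rule_3 = 3 / n,
           exact_86, rule_2 = 2 / n)
```

The shortcut is always slightly above the exact value, hence a bit conservative, and the relative gap shrinks as n grows.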

stats

August 12, 2022 at 14:06:26 GMT+2 · permalink

·

https://jamanetwork.com/journals/jama/article-abstract/385438

At last! Incontrovertible evidence (p=0.0001) that people over 40 are older, on average, than people under 40. | Statistical Modeling, Causal Inference, and Social Science

gold.

marrant · stats

August 8, 2022 at 16:20:08 GMT+2 · permalink

·

https://statmodeling.stat.columbia.edu/2022/08/08/at-last-incontrovertible-evidence-that-people-over-40-are-older-on-average-than-people-under-40/

machine learning - How to understand the drawbacks of K-means - Cross Validated

Interesting.

stats

June 14, 2022 at 21:43:04 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/133656/how-to-understand-the-drawbacks-of-k-means

Christoph Molnar sur Twitter : "Why I hate cluster analysis • What's longer, 10 cm 📏, 200$ 💵or a change from red to blue 🎨??? • What's the difference between banana 🍌 and apple 🍎? What about lemon 🍋? • How many clusters? 2? 10? 100? I can do whatever you want, no wrong answers!" / Twitter

Worth keeping at hand; there are interesting things in the replies.

stats

June 9, 2022 at 13:40:25 GMT+2 · permalink

·

https://twitter.com/ChristophMolnar/status/1534095622638272513

R torch for statistics (not just machine learning). | Ryan Giordano, statistician.

Automatic differentiation in R.

stats

May 11, 2022 at 11:06:41 GMT+2 · permalink

·

https://rgiordan.github.io/code/2022/04/01/rtorch_example.html

Reaction time distributions: an interactive overview

The shifted lognormal for the distribution of reaction times.

stats

January 27, 2022 at 21:10:35 GMT+1 · permalink

·

https://lindeloev.shinyapps.io/shiny-rt/

distributions - The sum of independent lognormal random variables appears lognormal? - Cross Validated

A sum of log-normal variables can be approximated by a log-normal distribution. A hack, but interesting.

stats

January 26, 2022 at 12:13:17 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/238529/the-sum-of-independent-lognormal-random-variables-appears-lognormal

CDC Announces Plan To Send Every U.S. Household Pamphlet On Probabilistic Thinking

Worth keeping at hand for training courses.

marrant · stats

January 19, 2022 at 10:53:04 GMT+1 · permalink

·

https://www.theonion.com/cdc-announces-plan-to-send-every-u-s-household-pamphle-1848354068

Restarting NIMBLE MCMC

How to restart an MCMC run, and how to define your own sampler with NIMBLE.

bayésien · mcmc · stats

November 23, 2021 at 12:09:25 GMT+1 · permalink

·

http://danielturek.github.io/public/restartingMCMC/restartingMCMC.html

Le paradoxe de Simpson illustré par des données de vaccination contre le Covid-19

A nice explanation of Simpson's paradox.

stats

November 6, 2021 at 13:25:00 GMT+1 · permalink

·

https://theconversation.com/le-paradoxe-de-simpson-illustre-par-des-donnees-de-vaccination-contre-le-covid-19-170159

Poisson Process: The Limiting Case of the Bernoulli Process

The Poisson process is a continuous-time version of the Bernoulli process.
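
One way to see the limit numerically: split a unit of time into n small steps, each with success probability lambda / n, and compare the resulting binomial counts with the Poisson(lambda) distribution as n grows. A sketch with values of my own choosing:

```r
lambda  <- 2                           # expected events per unit of time (assumption)
k       <- 0:10                        # counts at which the two pmfs are compared
n_steps <- c(10, 100, 1000, 10000)     # Bernoulli trials per unit of time

# Largest pointwise gap between Binomial(n, lambda / n) and Poisson(lambda)
max_gap <- sapply(n_steps, function(n) {
  max(abs(dbinom(k, size = n, prob = lambda / n) - dpois(k, lambda)))
})

data.frame(n_steps, max_gap)
```

The gap shrinks roughly in proportion to 1 / n as the time grid gets finer.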

stats

October 24, 2021 at 21:54:08 GMT+2 · permalink

·

https://stephens999.github.io/fiveMinuteStats/bernoulli_poisson_process.html

ELFI - Engine for Likelihood-Free Inference — ELFI 0.8.0 documentation

Looks interesting for ABC.

ABC · stats

October 14, 2021 at 13:40:14 GMT+2 · permalink

·

https://elfi.readthedocs.io/en/latest/

Un garçon pas comme les autres (Bayes): $(n-1)$-sane in the membrane

Dividing by (n-1) when computing the variance corrects a bias in the estimate of the population variance. But the bias is so small that it is peanuts, and there is hardly any situation in which correcting it would prove useful. Simpson sums it up well:

The n vs (n−1) denominator for a variance estimator is a curiosity. It is the source of thrilling (Not thrilling) exercises or exam questions. But it is not interesting.

It could maybe set up the idea that MLEs are not unbiased. But even then, the useless correction term is not needed. Just let it be slightly biased and move on with your life.

Because if that is the biggest bias in your analysis, you are truly blessed.

Amen.
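
To put numbers on "slightly biased", here is a small simulation sketch. The sample size, the true variance and the normal model are my own choices. It compares the bias of the n-denominator estimator with the sampling variability of the estimator itself.

```r
set.seed(123)
n      <- 10      # sample size (assumption)
sigma2 <- 4       # true variance (assumption)
n_sim  <- 1e5     # number of simulated samples

# One simulated sample per row, then the sum of squared deviations per row
x  <- matrix(rnorm(n * n_sim, sd = sqrt(sigma2)), ncol = n)
ss <- rowSums((x - rowMeans(x))^2)

est_unbiased <- ss / (n - 1)   # usual (n - 1) denominator
est_mle      <- ss / n         # n denominator, expectation (n - 1) / n * sigma2

c(mean_unbiased = mean(est_unbiased),
  mean_mle      = mean(est_mle),
  bias_mle      = mean(est_mle) - sigma2,
  sd_estimator  = sd(est_unbiased))
```

Even with n as small as 10, the bias of the n version is several times smaller than the standard deviation of the estimate, which is the point of the quote.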

stats

October 10, 2021 at 22:16:34 GMT+2 * · permalink

·

https://dansblog.netlify.app/posts/2021-10-11-n-sane-in-the-membrane/

variance - Finding a right way of sampling 1/X knowing X follows the Moschopoulos distribution (sum of Gamma distribution with different (shape/rate parameters) - Cross Validated

The Moschopoulos distribution is that of a sum of independent gamma variables with different parameters.

stats

October 8, 2021 at 09:59:53 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/545053/finding-a-right-way-of-sampling-1-x-knowing-x-follows-the-moschopoulos-distribut

Linear model with log-transformed response vs. generalized linear model with log link - Cross Validated

Modeling the log of the mean of a log-normal variable is not the same thing as modeling the mean of the log of a log-normal variable. A good explanation.
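
For a log-normal variable with parameters mu and sigma, the mean of the log is mu, while the log of the mean is mu + sigma^2 / 2. The sketch below checks this on simulated data; the parameter values are arbitrary.

```r
set.seed(2021)
mu    <- 1       # meanlog (assumption)
sigma <- 0.8     # sdlog (assumption)
y <- rlnorm(1e5, meanlog = mu, sdlog = sigma)

mean_of_log <- mean(log(y))   # target of a linear model on log(y): mu
log_of_mean <- log(mean(y))   # target of a log-link model on y: mu + sigma^2 / 2

c(mean_of_log  = mean_of_log,
  log_of_mean  = log_of_mean,
  observed_gap = log_of_mean - mean_of_log,
  theory_gap   = sigma^2 / 2)
```

The gap depends only on sigma, so with a constant sigma the two approaches differ by a shift of the intercept; if sigma varies with the covariates, the slopes differ as well.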

stats

October 5, 2021 at 13:35:55 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/47840/linear-model-with-log-transformed-response-vs-generalized-linear-model-with-log/48679#48679

Interpreting elpd_diff - loo package - Modeling - The Stan Forums

How to interpret elpd differences and their standard errors. Great answer from Vehtari.

elpd · stats

June 11, 2021 at 12:53:20 GMT+2 · permalink

·

https://discourse.mc-stan.org/t/interpreting-elpd-diff-loo-package/1628

The Kalman Filter

An explanation of the Kalman filter.

stats

April 23, 2021 at 18:07:12 GMT+2 · permalink

·

http://www.cs.unc.edu/~welch/kalman/

Parallelized loops with R | Blas M. Benito

Worth keeping at hand.

info · R · stats

March 25, 2021 at 10:32:15 GMT+1 · permalink

·

https://www.blasbenito.com/post/02_parallelizing_loops_with_r/

Count transformation models - Siegfried - - Methods in Ecology and Evolution - Wiley Online Library

To read.

alire · stats

April 17, 2020 at 08:26:10 GMT+2 * · permalink

·

https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13383?af=R

How post-hoc power calculation is like a shit sandwich « Statistical Modeling, Causal Inference, and Social Science

Really interesting.

gelman · stats

June 20, 2019 at 08:26:10 GMT+2 · permalink

·

https://statmodeling.stat.columbia.edu/2019/01/13/post-hoc-power-calculation-like-shit-sandwich/

[1312.6536] Spatial and Spatio-Temporal Log-Gaussian Cox Processes: Extending the Geostatistical Paradigm

Spatio-temporal survival models.

stats · survie

April 5, 2019 at 14:56:08 GMT+2 · permalink

·

https://arxiv.org/abs/1312.6536

Linking bacterial populations with health | Stanford News

There’s been a movement which has said that most research is wrong. It’s making people feel they’re doing something wrong, but that’s not the problem. The problem is that the publication system pushes you because you can only publish if you get a good, that is, small, p-value [a statistical test that indicates whether results could be due to chance]. Researchers then massage the data until they get the p-value and then it’s not reproducible. But if we were much more transparent and said, “You’re allowed to publish things which are significant or not significant because it’s useful down the road and just publish all your data and the code you used for the analyses” – if you’re transparent about what you’re doing, there’s much less opportunity to shoehorn the data into some wrong conclusion.

I feel that people misuse summaries in statistics. They feel as if statistics is going to summarize everything into one value, as if one p-value is going to summarize five years of work. It’s ridiculous. Everything is multidimensional, it’s complex. But if we could publish more of the negative results and all of the data, we would advance science much faster, because people would get insight from the negative results.

stats

March 6, 2019 at 10:48:02 GMT+1 · permalink

·

https://news.stanford.edu/2019/03/01/linking-bacterial-populations-health/

Rank-normalized split-Rhat and effective sample size

Interesting.

bayésien · mcmc · stats

March 4, 2019 at 17:21:47 GMT+1 · permalink

·

https://avehtari.github.io/rhat_ess/rhat_ess.html

Marchenko–Pastur distribution - Wikipedia

Interesting.

stats

March 4, 2019 at 09:19:31 GMT+1 · permalink

·

https://en.wikipedia.org/wiki/Marchenko%E2%80%93Pastur_distribution

Calculate the standard error of any function using the delta method |

Very interesting!

stats

January 8, 2019 at 11:26:46 GMT+1 · permalink

·

https://oliviergimenez.github.io/post/delta-method/

Standard errors for lasso prediction using R - Cross Validated

Estimating standard errors for lasso regressions is apparently a real mess.

lasso · stats

November 7, 2018 at 09:58:49 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/91462/standard-errors-for-lasso-prediction-using-r

poisson distribution - How does glmnet handle overdispersion? - Cross Validated

Great explanation of why overdispersion does not matter in lasso regression.

stats

November 7, 2018 at 09:53:53 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/101031/how-does-glmnet-handle-overdispersion

glmnet: penalized maximum likelihood

The vignette is nice.

stats

November 7, 2018 at 09:52:23 GMT+1 · permalink

·

https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html

Overdispersed Poisson et bootstrap | Freakonometrics

Interesting: he uses the gamma distribution to simulate a quasi-Poisson. Unfortunately, his links are dead. But I like the idea and am keeping it at hand.
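
One possible reading of the idea, which is not necessarily the exact construction used in the post: a quasi-Poisson variable is defined only through its mean mu and its variance phi * mu, and a gamma with shape mu / phi and scale phi has exactly those two moments. The sketch below uses arbitrary values for `mu` and `phi`.

```r
set.seed(5770)
mu  <- 5      # target mean (assumption)
phi <- 3      # dispersion parameter, variance = phi * mu (assumption)

# Gamma with mean shape * scale = mu and variance shape * scale^2 = phi * mu
y <- rgamma(1e5, shape = mu / phi, scale = phi)

c(mean = mean(y), var = var(y), var_over_mean = var(y) / mean(y))
```

The simulated values are continuous rather than integer counts, so this only mimics the mean-variance relationship, which is all the quasi-Poisson specification pins down.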

stats

November 4, 2018 at 11:45:09 GMT+1 · permalink

·

https://freakonometrics.hypotheses.org/5770

Manuel d’analyse spatiale | Insee

Oh, there is a lot of good material here!

geographie · stats

October 31, 2018 at 09:33:59 GMT+1 · permalink

·

https://www.insee.fr/fr/information/3635442

Significance: Vol 15, No 5

Special issue on statistics in court. Looks interesting...

stats

October 5, 2018 at 13:55:58 GMT+2 · permalink

·

https://rss.onlinelibrary.wiley.com/toc/17409713/2018/15/5

Demystifying the Integrated Tail Probability Expectation Formula: The American Statistician: Vol 0, No 0

An interesting take on the notion of expectation. Very much mathematical statistics, but it does give an intuition of the expectation as the difference between two areas, which can be computed in two different ways.
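
The "difference of two areas" can be checked numerically. For a variable with CDF F, the expectation equals the area between F and 1 on the positive half-line minus the area under F on the negative half-line. A sketch with a normal distribution whose parameters are chosen arbitrarily:

```r
m <- 1; s <- 2    # mean and sd of an illustrative normal distribution (assumption)

# Area between the CDF and 1 over [0, Inf): integral of 1 - F(x)
upper <- integrate(function(x) pnorm(x, m, s, lower.tail = FALSE), 0, Inf)$value

# Area under the CDF over (-Inf, 0]: integral of F(x)
lower <- integrate(function(x) pnorm(x, m, s), -Inf, 0)$value

c(upper = upper, lower = lower, difference = upper - lower, mean = m)
```

The difference between the two areas recovers the mean, without ever integrating x times the density.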

maths · stats

September 28, 2018 at 16:54:55 GMT+2 · permalink

·

https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1497541

Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA

A must-read. Too.
It's crazy how many high-quality free books are coming out these days.

alire · inla · stats

September 22, 2018 at 18:31:29 GMT+2 · permalink

·

https://becarioprecario.bitbucket.io/spde-gitbook/

Statistical Rethinking Fall 2017 - YouTube

Interesting channel. Worth following.

bayesien · stats

September 7, 2018 at 11:44:45 GMT+2 · permalink

·

https://www.youtube.com/playlist?list=PLDcUM9US4XdM9_N6XUUFrhghGJ4K25bFc