Clem's bookmarks

Case Study III: Model Selection Uncertainty

On using the bootstrap with validate and ols to get a measure of model uncertainty in the model selection procedure (compared with stepAIC).

stats

June 7, 2024 at 10:27:50 GMT+2 · permalink

·

https://conservancy.umn.edu/server/api/core/bitstreams/16468fbc-dbd9-4b8d-a5c7-09d03ff48821/content

Hey—let’s collect all the stupid things that researchers say in order to deflect legitimate criticism | Statistical Modeling, Causal Inference, and Social Science

Interesting

stats

March 27, 2024 at 13:40:27 GMT+1 · permalink

·

https://statmodeling.stat.columbia.edu/2024/03/23/hey-lets-collect-all-the-stupid-things-that-researchers-say-in-order-to-deflect-legitimate-criticism/

Bayes Rules! An Introduction to Applied Bayesian Modeling

A book on Bayesian statistics

stats

March 19, 2024 at 14:41:18 GMT+1 · permalink

·

https://www.bayesrulesbook.com/

lognormal distribution - Variance of $X$ and Variance of $\log(X)$. How to relate them? - Cross Validated

Worth keeping handy

stats

November 21, 2023 at 22:01:38 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/418313/variance-of-x-and-variance-of-logx-how-to-relate-them

When/is it ever appropriate to use the mean of ratios? : AskStatistics

Another interesting idea. In the end, it all depends on what the ratio is being estimated for; I don't know why I keep overthinking this. Together with the previous one, this solves my problem...

stats

July 27, 2023 at 09:05:26 GMT+2 · permalink

·

https://www.reddit.com/r/AskStatistics/comments/m7h8et/whenis_it_ever_appropriate_to_use_the_mean_of/

NLMR is not available on CRAN anymore · Issue #95 · ropensci/NLMR · GitHub

RandomFields(Utils) is no longer maintained, it's official. The message from the author/maintainer:

Dear Users of RandomFields(Utils),

it is a de facto decision of CRAN that CRAN does not support any
further updates of the auxiliary package RandomFieldsUtils since April 2022.
So, I do not have any hope that a new version of RandomFields will be accepted by CRAN, eventually.

The future of my R packages is very unclear. The currently most likely scenario is to put the latest versions on github and to move to Julia for future programming.

Many thanks to you, Kurt and Uwe for the great support the past years.

Best,
Martin

stats

March 28, 2023 at 08:50:45 GMT+2 · permalink

·

https://github.com/ropensci/NLMR/issues/95

Chernoff bound - Wikipedia

Interesting. If a variable X is a sum of other (independent) variables, this bound can be used to build a fairly narrow confidence interval for it. Tighter than Vysochanskij-Petunin; worth keeping handy.
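
To get a feel for the gap between the two bounds, here is a minimal sketch for a sum of independent Bernoulli variables. The values of `n`, `p` and `threshold` are hypothetical, and the multiplicative form exp(-δ²μ/(2+δ)) is only one of several Chernoff-type bounds listed on the Wikipedia page.

```python
import math

def binom_upper_tail(n, p, k_min):
    """Exact P(X >= k_min) for X ~ Binomial(n, p), with each term computed on the log scale."""
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1 - p))
        total += math.exp(log_pmf)
    return total

# Hypothetical example: X is the sum of n independent Bernoulli(p) variables.
n, p, threshold = 1000, 0.1, 150
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
delta = threshold / mu - 1            # threshold = (1 + delta) * mu

exact = binom_upper_tail(n, p, threshold)

# Multiplicative Chernoff bound: P(X >= (1 + delta) * mu) <= exp(-delta^2 * mu / (2 + delta))
chernoff = math.exp(-delta ** 2 * mu / (2 + delta))

# Vysochanskij-Petunin (unimodal X, lam > sqrt(8/3)): P(|X - mu| >= lam * sigma) <= 4 / (9 * lam^2)
lam = (threshold - mu) / sigma
vp = 4 / (9 * lam ** 2)

print(f"exact tail  : {exact:.3g}")
print(f"Chernoff    : {chernoff:.3g}")
print(f"V-P         : {vp:.3g}")
```

The Chernoff bound uses the independence of the summands, which is why it can beat a bound that only assumes unimodality and a finite variance.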

stats

March 17, 2023 at 12:11:52 GMT+1 · permalink

·

https://en.wikipedia.org/wiki/Chernoff_bound

A paper used capital T’s instead of error bars. But wait, there’s more! – Retraction Watch

This is brilliant

marrant · stats

December 14, 2022 at 21:40:09 GMT+1 · permalink

·

https://retractionwatch.com/2022/12/05/a-paper-used-capital-ts-instead-of-error-bars-but-wait-theres-more/

Relationship between poisson and exponential distribution - Cross Validated

A crystal-clear derivation of the exponential distribution of waiting times in a Poisson process
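
The core of the argument fits in one line, assuming a homogeneous Poisson process with rate $\lambda$ (the symbols $N(t)$ for the number of events in $[0, t]$ and $T$ for the waiting time until the first event are my notation, not necessarily that of the linked answer):
$$
P(T > t) = P(N(t) = 0) = \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = e^{-\lambda t}
$$
so that $P(T \le t) = 1 - e^{-\lambda t}$, which is the distribution function of an exponential variable with rate $\lambda$.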

stats

September 29, 2022 at 20:46:40 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/2092/relationship-between-poisson-and-exponential-distribution

If Nothing Goes Wrong, Is Everything All Right? Interpreting Zero Numerators | JAMA | JAMA Network

Interesting article: if I sample n individuals and find no positives, what is the maximum risk of being positive? The rule here: we can be 95% confident that the risk is below 3/n, and, following the same reasoning, 86% confident that it is below 2/n.

Logic: we want 95% confidence, i.e. a significance level of 0.05. The probability of finding no positives among n individuals is (1-p)^n; setting it to 0.05 gives p = 1 - 0.05^(1/n), which is roughly -log(0.05)/n ~= 3/n.

Check in R:

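set.seed(777)
n <- 10:100                           # sample sizes
p <- seq(0, 0.5, length.out = 1000)   # candidate prevalences
g <- sapply(n, function(ni) {
    # 100 binomial draws of size ni for each prevalence (one column per value of p)
    m0 <- sapply(p, function(y) rbinom(100, size = ni, prob = y))
    cs <- colSums(m0 == 0)            # number of all-negative samples per prevalence
    css <- cumsum(cs) / sum(cs)       # cumulative share of the zeros
    p[max(which(css < 0.95))]         # largest prevalence below the 95% mark
})
plot(n, g, xlab = "Sample size",
     ylab = "Prevalence covering 95% of the zeros")
lines(n, 3 / n, col = "red", lwd = 2) # rule of three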

Mathematical check. Consider the series:
$$
\sum_{k=0}^{\infty} \frac{z^k}{k!} = \exp(z)
$$
Setting $z = \log(0.05)/n$ lets us expand $\exp(z) =
0.05^{1/n}$ as follows:
$$
0.05^{1/n} = \sum_{k=0}^{\infty} \frac{\log(0.05)^k}{n^k \, k!}
$$
If $n$ is large enough, we keep only the first two terms:
$$
0.05^{1/n} \approx 1 + \log(0.05)/n
$$
so that $1 - 0.05^{1/n} \approx -\log(0.05)/n$, and $-\log(0.05) \approx 3$.
Following the same reasoning, if we set the confidence level at 86\%,
then the threshold is 2/n.
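
The approximation can also be checked directly against the exact bound. A minimal sketch; the function names are mine and the sample sizes are arbitrary:

```python
def exact_upper_bound(n, confidence=0.95):
    """Risk p solving (1 - p)^n = 1 - confidence, i.e. the exact upper bound
    on the risk after observing 0 positives out of n."""
    return 1 - (1 - confidence) ** (1 / n)

def rule_of_thumb(n, numerator=3):
    """Approximate upper bound: 3/n at 95% confidence, 2/n at 86%."""
    return numerator / n

for n in (10, 30, 100, 1000):
    print(n,
          round(exact_upper_bound(n, 0.95), 5), round(rule_of_thumb(n, 3), 5),
          round(exact_upper_bound(n, 0.86), 5), round(rule_of_thumb(n, 2), 5))
```

The rule of thumb is always slightly conservative, since 1 - exp(-x) < x for x > 0.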

stats

August 12, 2022 at 14:06:26 GMT+2 · permalink

·

https://jamanetwork.com/journals/jama/article-abstract/385438

At last! Incontrovertible evidence (p=0.0001) that people over 40 are older, on average, than people under 40. | Statistical Modeling, Causal Inference, and Social Science

gold.

marrant · stats

August 8, 2022 at 16:20:08 GMT+2 · permalink

·

https://statmodeling.stat.columbia.edu/2022/08/08/at-last-incontrovertible-evidence-that-people-over-40-are-older-on-average-than-people-under-40/

machine learning - How to understand the drawbacks of K-means - Cross Validated

Interesting

stats

June 14, 2022 at 21:43:04 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/133656/how-to-understand-the-drawbacks-of-k-means

Christoph Molnar on Twitter: "Why I hate cluster analysis • What's longer, 10 cm 📏, 200$ 💵or a change from red to blue 🎨??? • What's the difference between banana 🍌 and apple 🍎? What about lemon 🍋? • How many clusters? 2? 10? 100? I can do whatever you want, no wrong answers!" / Twitter

Worth keeping handy; there are interesting things in the replies.

stats

June 9, 2022 at 13:40:25 GMT+2 · permalink

·

https://twitter.com/ChristophMolnar/status/1534095622638272513

R torch for statistics (not just machine learning). | Ryan Giordano, statistician.

Automatic differentiation in R

stats

May 11, 2022 at 11:06:41 GMT+2 · permalink

·

https://rgiordan.github.io/code/2022/04/01/rtorch_example.html

Reaction time distributions: an interactive overview

The shifted lognormal as a distribution for reaction times

stats

January 27, 2022 at 21:10:35 GMT+1 · permalink

·

https://lindeloev.shinyapps.io/shiny-rt/

distributions - The sum of independent lognormal random variables appears lognormal? - Cross Validated

A sum of lognormals can be approximated by a lognormal distribution. A bit of a hack, but interesting

stats

January 26, 2022 at 12:13:17 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/238529/the-sum-of-independent-lognormal-random-variables-appears-lognormal

CDC Announces Plan To Send Every U.S. Household Pamphlet On Probabilistic Thinking

Worth keeping handy for training courses

marrant · stats

January 19, 2022 at 10:53:04 GMT+1 · permalink

·

https://www.theonion.com/cdc-announces-plan-to-send-every-u-s-household-pamphle-1848354068

Restarting NIMBLE MCMC

How to restart an MCMC run, and how to define your own sampler with NIMBLE

bayésien · mcmc · stats

November 23, 2021 at 12:09:25 GMT+1 · permalink

·

http://danielturek.github.io/public/restartingMCMC/restartingMCMC.html

Le paradoxe de Simpson illustré par des données de vaccination contre le Covid-19

A nice explanation of Simpson's paradox

stats

November 6, 2021 at 13:25:00 GMT+1 · permalink

·

https://theconversation.com/le-paradoxe-de-simpson-illustre-par-des-donnees-de-vaccination-contre-le-covid-19-170159

Poisson Process: The Limiting Case of the Bernoulli Process

The Poisson process is the continuous-time limit of the Bernoulli process.
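
A quick numerical illustration of the limit, as a sketch under my own assumptions: one unit of time is cut into n slots, each slot is a Bernoulli trial with success probability λ/n, and the resulting Binomial(n, λ/n) counts are compared with Poisson(λ) counts. The rate `lam` is a hypothetical value.

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0  # hypothetical expected number of events per unit of time

def max_gap(n, k_max=10):
    """Largest pointwise gap between Binomial(n, lam/n) and Poisson(lam) for k = 0..k_max."""
    return max(abs(binom_pmf(k, n, lam / n) - poisson_pmf(k, lam))
               for k in range(k_max + 1))

gaps = {n: max_gap(n) for n in (10, 100, 1000, 10000)}
for n, gap in gaps.items():
    print(n, f"{gap:.2e}")
```

The gap shrinks roughly like 1/n as the time slots get finer.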

stats

October 24, 2021 at 21:54:08 GMT+2 · permalink

·

https://stephens999.github.io/fiveMinuteStats/bernoulli_poisson_process.html

ELFI - Engine for Likelihood-Free Inference — ELFI 0.8.0 documentation

Looks interesting for ABC (approximate Bayesian computation).

ABC · stats

October 14, 2021 at 13:40:14 GMT+2 · permalink

·

https://elfi.readthedocs.io/en/latest/

Un garçon pas comme les autres (Bayes): $(n-1)$-sane in the membrane

Dividing by (n-1) when computing the variance corrects a bias in the estimate of the population variance. But the bias is so small that it hardly matters, and there is almost no situation in which correcting it would turn out to be useful. Simpson sums it up well:

The n vs (n−1) denominator for a variance estimator is a curiosity. It is the source of thrilling (Not thrilling) exercises or exam questions. But it is not interesting.

It could maybe set up the idea that MLEs are not unbiased. But even then, the useless correction term is not needed. Just let it be slightly biased and move on with your life.

Because if that is the biggest bias in your analysis, you are truly blessed.

Amen.
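
For the record, the size of the bias is easy to see by simulation. A minimal sketch with standard normal data (true variance 1); the seed, sample size and number of replicates are arbitrary choices of mine:

```python
import random

random.seed(1)  # arbitrary seed, for reproducibility only

def variance_estimates(sample):
    """Return the (divide by n, divide by n-1) variance estimates of a sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)
    return ss / n, ss / (n - 1)

def average_estimates(n, reps=20000):
    """Average both estimators over many standard normal samples of size n."""
    total_n = total_n1 = 0.0
    for _ in range(reps):
        sample = [random.gauss(0.0, 1.0) for _ in range(n)]
        v_n, v_n1 = variance_estimates(sample)
        total_n += v_n
        total_n1 += v_n1
    return total_n / reps, total_n1 / reps

biased, unbiased = average_estimates(n=5)
print(f"divide by n   : {biased:.3f}  (theory: (n-1)/n = 0.8)")
print(f"divide by n-1 : {unbiased:.3f}  (theory: 1)")
```

The expected value of the divide-by-n estimator is (n-1)/n times the true variance, so the relative bias is already down to 1% at n = 100.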

stats

October 10, 2021 at 22:16:34 GMT+2 * · permalink

·

https://dansblog.netlify.app/posts/2021-10-11-n-sane-in-the-membrane/

variance - Finding a right way of sampling 1/X knowing X follows the Moschopoulos distribution (sum of Gamma distribution with different (shape/rate parameters) - Cross Validated

The Moschopoulos distribution is that of a sum of gamma variables with different parameters.

stats

October 8, 2021 at 09:59:53 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/545053/finding-a-right-way-of-sampling-1-x-knowing-x-follows-the-moschopoulos-distribut

Linear model with log-transformed response vs. generalized linear model with log link - Cross Validated

Modeling the log of the mean of a lognormal variable is not the same as modeling the mean of the log of a lognormal variable. A good explanation.
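
In symbols, assuming $\log Y \sim N(\mu, \sigma^2)$ (my notation, not necessarily that of the linked answer), the two targets differ by half the variance on the log scale:
$$
E[\log Y] = \mu
\qquad \text{whereas} \qquad
\log E[Y] = \mu + \frac{\sigma^2}{2}
$$
A linear model fitted to $\log Y$ describes the first quantity, while a GLM with a log link describes the second.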

stats

October 5, 2021 at 13:35:55 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/47840/linear-model-with-log-transformed-response-vs-generalized-linear-model-with-log/48679#48679

Interpreting elpd_diff - loo package - Modeling - The Stan Forums

How to interpret elpd differences and their SE. Great answer from Vehtari.

elpd · stats

June 11, 2021 at 12:53:20 GMT+2 · permalink

·

https://discourse.mc-stan.org/t/interpreting-elpd-diff-loo-package/1628

The Kalman Filter

An explanation of the Kalman filter

stats

April 23, 2021 at 18:07:12 GMT+2 · permalink

·

http://www.cs.unc.edu/~welch/kalman/

Parallelized loops with R | Blas M. Benito

Worth keeping handy.

info · R · stats

March 25, 2021 at 10:32:15 GMT+1 · permalink

·

https://www.blasbenito.com/post/02_parallelizing_loops_with_r/

Count transformation models - Siegfried - - Methods in Ecology and Evolution - Wiley Online Library

To read

alire · stats

April 17, 2020 at 08:26:10 GMT+2 * · permalink

·

https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13383?af=R

How post-hoc power calculation is like a shit sandwich « Statistical Modeling, Causal Inference, and Social Science

Really interesting

gelman · stats

June 20, 2019 at 08:26:10 GMT+2 · permalink

·

https://statmodeling.stat.columbia.edu/2019/01/13/post-hoc-power-calculation-like-shit-sandwich/

[1312.6536] Spatial and Spatio-Temporal Log-Gaussian Cox Processes: Extending the Geostatistical Paradigm

Spatial and spatio-temporal point-process models (log-Gaussian Cox processes; despite the name, these are not Cox survival models)

stats · survie

April 5, 2019 at 14:56:08 GMT+2 · permalink

·

https://arxiv.org/abs/1312.6536

Linking bacterial populations with health | Stanford News

There’s been a movement which has said that most research is wrong. It’s making people feel they’re doing something wrong, but that’s not the problem. The problem is that the publication system pushes you because you can only publish if you get a good, that is, small, p-value [a statistical test that indicates whether results could be due to chance]. Researchers then massage the data until they get the p-value and then it’s not reproducible. But if we were much more transparent and said, “You’re allowed to publish things which are significant or not significant because it’s useful down the road and just publish all your data and the code you used for the analyses” – if you’re transparent about what you’re doing, there’s much less opportunity to shoehorn the data into some wrong conclusion.

I feel that people misuse summaries in statistics. They feel as if statistics is going to summarize everything into one value, as if one p-value is going to summarize five years of work. It’s ridiculous. Everything is multidimensional, it’s complex. But if we could publish more of the negative results and all of the data, we would advance science much faster, because people would get insight from the negative results.

stats

March 6, 2019 at 10:48:02 GMT+1 · permalink

·

https://news.stanford.edu/2019/03/01/linking-bacterial-populations-health/

Rank-normalized split-Rhat and effective sample size

Interesting.

bayésien · mcmc · stats

March 4, 2019 at 17:21:47 GMT+1 · permalink

·

https://avehtari.github.io/rhat_ess/rhat_ess.html

Marchenko–Pastur distribution - Wikipedia

Interesting

stats

March 4, 2019 at 09:19:31 GMT+1 · permalink

·

https://en.wikipedia.org/wiki/Marchenko%E2%80%93Pastur_distribution

Calculate the standard error of any function using the delta method |

Very interesting!

stats

January 8, 2019 at 11:26:46 GMT+1 · permalink

·

https://oliviergimenez.github.io/post/delta-method/

Standard errors for lasso prediction using R - Cross Validated

Estimating standard errors for lasso regressions is clearly a real mess.

lasso · stats

November 7, 2018 at 09:58:49 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/91462/standard-errors-for-lasso-prediction-using-r

poisson distribution - How does glmnet handle overdispersion? - Cross Validated

Great explanation of why overdispersion does not matter in lasso regression.

stats

November 7, 2018 at 09:53:53 GMT+1 · permalink

·

https://stats.stackexchange.com/questions/101031/how-does-glmnet-handle-overdispersion

glmnet: penalized maximum likelihood

The vignette is nice

stats

November 7, 2018 at 09:52:23 GMT+1 · permalink

·

https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html

Overdispersed Poisson et bootstrap | Freakonometrics

Interesting: he uses the gamma distribution to simulate a quasi-Poisson. Unfortunately, his links are dead. Still, I like the idea and I'm keeping it handy.

stats

November 4, 2018 at 11:45:09 GMT+1 · permalink

·

https://freakonometrics.hypotheses.org/5770

Manuel d’analyse spatiale | Insee

Oh, there is a lot of good material here!

geographie · stats

October 31, 2018 at 09:33:59 GMT+1 · permalink

·

https://www.insee.fr/fr/information/3635442

Significance: Vol 15, No 5

Special issue on statistics in court. Looks interesting...

stats

October 5, 2018 at 13:55:58 GMT+2 · permalink

·

https://rss.onlinelibrary.wiley.com/toc/17409713/2018/15/5

Demystifying the Integrated Tail Probability Expectation Formula: The American Statistician: Vol 0, No 0

An interesting look at the notion of expectation. Heavy on mathematical statistics, but it does give an intuition of the expectation as the difference between two areas, which can be computed in two different ways.
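
For reference, the usual statement of the formula, with $F$ the distribution function of $X$ (my notation; the article may use a different but equivalent form):
$$
E[X] = \int_0^{\infty} \bigl(1 - F(x)\bigr)\,dx \; - \int_{-\infty}^{0} F(x)\,dx
$$
The first integral is the area above the curve of $F$ to the right of zero, and the second is the area below it to the left of zero.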

maths · stats

September 28, 2018 at 16:54:55 GMT+2 · permalink

·

https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1497541

Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA

An absolute must-read. This one too.
It's crazy how many high-quality free books are coming out these days.

alire · inla · stats

September 22, 2018 at 18:31:29 GMT+2 · permalink

·

https://becarioprecario.bitbucket.io/spde-gitbook/

Statistical Rethinking Fall 2017 - YouTube

Interesting channel. Worth following.

bayesien · stats

September 7, 2018 at 11:44:45 GMT+2 · permalink

·

https://www.youtube.com/playlist?list=PLDcUM9US4XdM9_N6XUUFrhghGJ4K25bFc

ELFI: Engine for Likelihood-Free Inference

Yet another new piece of software that looks great. To read

alire · stats

September 3, 2018 at 10:31:52 GMT+2 · permalink

·

http://jmlr.org/papers/v19/17-374.html

The risks of alcohol (again) – WintonCentre – Medium

I like Spiegelhalter...

divers · stats

August 24, 2018 at 13:40:47 GMT+2 · permalink

·

https://medium.com/wintoncentre/the-risks-of-alcohol-again-2ae8cb006a4a

Sensitivity of binomial N‐mixture models to overdispersion: The importance of assessing model fit - Knape - - Methods in Ecology and Evolution - Wiley Online Library

To read

alire · ecology · stats

August 21, 2018 at 09:34:18 GMT+2 · permalink

·

https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.13062?af=R

The Markov-chain Monte Carlo Interactive Gallery

Great site illustrating how various MCMC algorithms work.

informatique · mcmc · stats

July 27, 2018 at 08:59:16 GMT+2 · permalink

·

https://chi-feng.github.io/mcmc-demo/

Spatial autocorrelation test

Interesting code: uses inverse squared distances as weights in an autocorrelation test.
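
To fix ideas, here is a minimal sketch of Moran's I with inverse squared distance weights. This is my own illustration, not a port of the linked R script, and it only computes the statistic (no significance test); the function name and the toy data are hypothetical.

```python
import math

def morans_i(coords, values, power=2.0):
    """Moran's I with weights w_ij = 1 / d_ij^power (and w_ii = 0)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    weighted_cross = 0.0   # sum over i != j of w_ij * z_i * z_j
    weight_sum = 0.0       # sum over i != j of w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / math.dist(coords[i], coords[j]) ** power
            weight_sum += w
            weighted_cross += w * z[i] * z[j]
    return (n / weight_sum) * weighted_cross / sum(zi * zi for zi in z)

# Toy data: ten points on a line.
coords = [(float(i), 0.0) for i in range(10)]
gradient = [float(i) for i in range(10)]        # smooth trend: neighbours look alike
alternating = [float(i % 2) for i in range(10)] # checkerboard: neighbours differ

print(morans_i(coords, gradient))     # positive autocorrelation
print(morans_i(coords, alternating))  # negative autocorrelation
```

Squaring the distance makes the weights drop off quickly, so the statistic is driven mostly by the nearest neighbours.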

autocorrélation · R · spdep · stats

July 16, 2018 at 11:18:27 GMT+2 · permalink

·

https://msu.edu/~ashton/classes/866/notes/lect21/neighbor.R

Google Groups: Tree depth and Leapfrog

This is exactly my problem. Moral: reduce the curvature and standardize the predictors.

mcmc · stan · stats

July 4, 2018 at 13:29:29 GMT+2 · permalink

·

https://groups.google.com/forum/

Accessing the contents of a stanfit object

How to extract the elements of a stanfit object, in particular the "under the hood" ones such as the tree depth or the step size: treedepth and stepsize.

mcmc · stan · stats

July 4, 2018 at 13:01:59 GMT+2 · permalink

·

https://cran.r-project.org/web/packages/rstan/vignettes/stanfit-objects.html

Number of iterations - General - The Stan Forums

«All that stuff about running a million iterations and thinning by 10k is irrelevant for Stan/HMC, don’t do that.»

MCMC · stan · stats

July 4, 2018 at 11:40:38 GMT+2 · permalink

·

http://discourse.mc-stan.org/t/number-of-iterations/1674/2

Stan Best Practices · stan-dev/stan Wiki

On convergence, I'm rather surprised: they note "In practice we have found that requiring Rhat < 1.1 is a good default requirement for each parameter." Yet chains do not really look good with Rhat at 1.1... They also note " A good check for such issues is the number of effective samples per iteration -- if N_eff / N < 0.001 then you should be suspect of the effective sample size calculation." I have 1500 iterations and N_eff values of 135 for my worst parameter. I don't like the look of the MCMC traces, but this recommendation would suggest that I may be too much of a purist on this one. I'll try to dig into this question...

mcmc · stan · stats

July 4, 2018 at 11:04:25 GMT+2 · permalink

·

https://github.com/stan-dev/stan/wiki/Stan-Best-Practices

Statistical ecology comes of age

To read

alire · ecology · stats

July 3, 2018 at 17:17:48 GMT+2 · permalink

·

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298184/

Thomas Bayes and the crisis in science – TheTLS

To read.

alire · stats

June 29, 2018 at 22:00:32 GMT+2 · permalink

·

https://www.the-tls.co.uk/articles/public/thomas-bayes-science-crisis/

Letter to the ISBA Membership Regarding Misconduct Allegations | International Society for Bayesian Analysis

They reacted quickly.

stats

June 29, 2018 at 16:12:51 GMT+2 · permalink

·

https://bayesian.org/misconduct-letter/

[1806.10639] An Introduction to Animal Movement Modeling with Hidden Markov Models using Stan for Bayesian Inference

Great document!

mouvement · stats

June 29, 2018 at 15:37:20 GMT+2 · permalink

·

https://arxiv.org/abs/1806.10639

Practical distance sampling in R

Interesting resource.

ecology · R · stats

June 28, 2018 at 15:57:35 GMT+2 · permalink

·

http://converged.yt/RDistanceBook/index.html

Data Visualization

Great book, an absolute must-read.

And I really do mean absolute.

Via Mathieu.

ggplot · stats

June 27, 2018 at 14:01:55 GMT+2 · permalink

·

http://socviz.co/

An Interview with Alan Gelfand | methods.blog

Worth a look, it seems really good.

mcmc · stats

June 26, 2018 at 12:32:21 GMT+2 · permalink

·

https://methodsblog.wordpress.com/2018/06/26/alan-gefland/

nimble-dev/AHMnimble: Examples from Kéry

The examples from Kéry and Royle translated to NIMBLE

mcmc · nimble · stats

June 26, 2018 at 11:35:51 GMT+2 · permalink

·

https://github.com/nimble-dev/AHMnimble

Evaluating Wikipedia as a Self-Learning Resource for Statistics: You Know They'll Use It: The American Statistician: Vol 0, No 0

Wikipedia is not recommended for self-learning statistics.

science · stats

June 21, 2018 at 13:56:41 GMT+2 · permalink

·

https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1392360

Advanced Bayesian Multile... The R Journal

Now this interests me... about the brms package. To read

alire · stats

June 21, 2018 at 13:47:47 GMT+2 · permalink

·

https://journal.r-project.org/archive/2018/RJ-2018-017/index.html

An asymptotic approximation to the N‐mixture model for the estimation of disease prevalence - Brintz - - Biometrics - Wiley Online Library

To read

alire · stats

June 7, 2018 at 08:42:12 GMT+2 · permalink

·

https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.12913?af=R

No-U-turn sampling for fast Bayesian inference in ADMB and TMB: Introducing the adnuts and tmbstan R packages

Wow, things are moving fast! An absolute must-read...

alire · stats

June 4, 2018 at 13:20:46 GMT+2 · permalink

·

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0197954

Applications for deep learning in ecology | bioRxiv

I keep hearing about this... I should set aside some time to dig into it...

ecologie · informatique · stats

June 1, 2018 at 09:27:38 GMT+2 · permalink

·

https://www.biorxiv.org/content/early/2018/05/30/334854

Handling convergence failures in glmer

Lots of useful leads

glmer · stats

May 20, 2018 at 21:43:24 GMT+2 · permalink

·

https://rstudio-pubs-static.s3.amazonaws.com/33653_57fc7b8e5d484c909b615d8633c01d51.html

An Introduction to Greta · R Views

Greta for MCMC. Looks great...

mcmc · stats

May 15, 2018 at 12:33:32 GMT+2 · permalink

·

https://rviews.rstudio.com/2018/04/23/on-first-meeting-greta/

R - Trend estimation for short time series - Stack Overflow

Interesting answer from Ben Bolker, based on the lmPerm package, which fits linear models with permutation tests. Worth digging into, some day.

stats

May 15, 2018 at 10:08:32 GMT+2 · permalink

·

https://stackoverflow.com/questions/24739097/r-trend-estimation-for-short-time-series

How to make a shaded relief in R

A shaded relief map in R. Relies on the author's metR package, available on GitHub

R · stats

May 14, 2018 at 13:11:22 GMT+2 · permalink

·

https://eliocamp.github.io/codigo-r/2018/02/how-to-make-shaded-relief-in-r/

Aki's favorite scientific books (so far) - Statistical Modeling, Causal Inference, and Social Science

Aki Vehtari's list of favorite statistics books. The ones I know are indeed essential. I still have to read the others!

stats

May 14, 2018 at 12:57:53 GMT+2 · permalink

·

http://andrewgelman.com/2018/05/14/aki_books/

[1805.01124] A Coefficient of Determination (R2) for Linear Mixed Models

Oh? This interests me

alire · stats

May 5, 2018 at 14:50:58 GMT+2 · permalink

·

https://arxiv.org/abs/1805.01124

[1804.06788] Validating Bayesian Inference Algorithms with Simulation-Based Calibration

Looks interesting, so: to read.

bayesienne · stats

April 19, 2018 at 09:21:00 GMT+2 · permalink

·

https://arxiv.org/abs/1804.06788

Is this outbreak over?

When does an outbreak end?

ecologie · stats

April 16, 2018 at 20:55:26 GMT+2 · permalink

·

https://reconlearn.netlify.com/post/practical-outbreakend/

optimization - Step-by-step example of reverse-mode automatic differentiation - Cross Validated

Great illustration of the approach. The example by ffriend is crystal clear.

math · stats

April 15, 2018 at 18:08:05 GMT+2 · permalink

·

https://stats.stackexchange.com/questions/224140/step-by-step-example-of-reverse-mode-automatic-differentiation

The Multivariable Chain Rule - HMC Calculus Tutorial

The multivariable chain rule, used for /reverse-mode algorithmic differentiation/, which is in turn used in Stan.

In short, if z = h(x,y), with (i) x = f(t) and (ii) y = g(t), then dz/dt = (∂h/∂x)*(dx/dt) + (∂h/∂y)*(dy/dt)

A good explanation of why.
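
A quick numerical sanity check of the rule, as a sketch: the functions `h`, `f` and `g` below are arbitrary examples of mine, and the chain-rule value is compared with a central finite difference of the composed function.

```python
import math

# Arbitrary example functions (hypothetical, for illustration only)
def f(t):            # x = f(t)
    return math.cos(t)

def g(t):            # y = g(t)
    return t ** 2

def h(x, y):         # z = h(x, y)
    return x ** 2 * y + math.sin(y)

def dz_dt_chain_rule(t):
    x, y = f(t), g(t)
    dh_dx = 2 * x * y              # partial derivative of h with respect to x
    dh_dy = x ** 2 + math.cos(y)   # partial derivative of h with respect to y
    dx_dt = -math.sin(t)
    dy_dt = 2 * t
    return dh_dx * dx_dt + dh_dy * dy_dt

def dz_dt_finite_difference(t, eps=1e-6):
    z = lambda s: h(f(s), g(s))
    return (z(t + eps) - z(t - eps)) / (2 * eps)

t0 = 0.7
print(dz_dt_chain_rule(t0), dz_dt_finite_difference(t0))
```

Reverse-mode differentiation applies the same rule, but accumulates the products starting from the output and working back to the inputs.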

math · stats

April 15, 2018 at 18:00:12 GMT+2 · permalink

·

https://www.math.hmc.edu/calculus/tutorials/multichainrule/

Re: R2 measure in mixed models?

Interesting skepticism from Douglas Bates about generalizing R2 to mixed models.

mixte · modèle · stats

April 14, 2018 at 17:38:28 GMT+2 · permalink

·

http://thread.gmane.org/gmane.comp.lang.r.lme4.devel/3281

Getting Genetics Done: Using the "Divide by 4 Rule" to Interpret Logistic Regression Coefficients

A fun little trick from Gelman and Hill: in a logistic regression, the slope of the curve is steepest where a + bX = 0.

The derivative of exp(a+bX)/(1+exp(a+bX)) with respect to X is b*exp(a+bX)/(1+exp(a+bX))^2.

So the maximum slope of the curve, reached at a + bX = 0, is b*exp(0)/(1+exp(0))^2 = b/4.

In other words, given a logistic regression with slope b, dividing b by 4 gives an approximation of the maximum difference in the probability that y=1 for each one-unit increase in X.
For example, if the regression coefficient is 0.8, then a one-unit increase in x corresponds to an increase of at most about 0.8/4=0.2 in the probability that y=1.

Of course, the approximation works best when the predicted probability is close to 0.5, and either when beta is close to 0 or when x varies little (see Ben Bolker's comment).

Could always come in handy.
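
The rule is easy to check numerically. A minimal sketch with the coefficient 0.8 from the example above; the intercept `a` and the choice of centering the one-unit step on the steepest point are my own assumptions:

```python
import math

def inv_logit(u):
    return 1.0 / (1.0 + math.exp(-u))

a, b = 0.0, 0.8   # a is hypothetical, b is the coefficient from the example

# Slope of the logistic curve where a + b*x = 0, i.e. p = 0.5: b * p * (1 - p) = b / 4
p_mid = inv_logit(0.0)
max_slope = b * p_mid * (1 - p_mid)

# Actual change in probability for a one-unit increase in x centered on that point
x_mid = -a / b
actual_change = inv_logit(a + b * (x_mid + 0.5)) - inv_logit(a + b * (x_mid - 0.5))

print(f"b / 4         : {b / 4:.4f}")
print(f"max slope     : {max_slope:.4f}")
print(f"actual change : {actual_change:.4f}")
```

The actual change stays a little below b/4 because the curve flattens as soon as it moves away from p = 0.5.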

stats · truc

April 13, 2018 at 21:30:32 GMT+2 · permalink

·

http://www.gettinggeneticsdone.com/2010/12/using-divide-by-4-rule-to-interpret.html

[1804.02921] Distributional Regression Forests for Probabilistic Precipitation Forecasting in Complex Terrain

"In many classical models this only captures the location of the distribution but over the last decade there has been increasing interest in distributional regression approaches modeling all parameters including location, scale, and shape."

There are distributional modeling methods, but they assume that the relevant predictors are already known. There are predictor selection methods, but they do not allow distributional modeling. Hence distributional trees and forests. This looks like fun.

predictive · stats

April 11, 2018 at 11:44:29 GMT+2 · permalink

·

https://arxiv.org/abs/1804.02921

This is what “power = .06” looks like. Get used to it. - Statistical Modeling, Causal Inference, and Social Science

An important graph. When the effect is small and the noise is large (hence when power is low, here 0.06), focusing on significant effects yields estimates whose magnitude is 9 times larger than the true effect and that have a one-in-four chance of having the wrong sign.
Related to the previous article in my shaarli: the noisier a study is, the less we can trust its significant effects.
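
These figures can be approximated analytically. A minimal sketch assuming a normally distributed estimate and a two-sided 5% test; the function name and the input values (true effect 2, standard error 8.1) are illustrative choices of mine that give a power close to 0.06, not numbers taken from this note.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def design_errors(effect, se, z_crit=1.959964):
    """Power, wrong-sign probability and mean exaggeration ratio among significant
    estimates, assuming estimate ~ Normal(effect, se^2) with effect > 0."""
    theta = effect / se
    p_right = 1.0 - norm_cdf(z_crit - theta)   # significant with the correct sign
    p_wrong = norm_cdf(-z_crit - theta)        # significant with the wrong sign
    power = p_right + p_wrong
    type_s = p_wrong / power
    # E[|estimate| / se] restricted to significant estimates (truncated normal means)
    mean_abs = (theta * p_right + norm_pdf(z_crit - theta)
                - theta * p_wrong + norm_pdf(z_crit + theta))
    exaggeration = mean_abs / power / theta
    return power, type_s, exaggeration

power, type_s, exaggeration = design_errors(effect=2.0, se=8.1)
print(f"power        : {power:.3f}")
print(f"wrong sign   : {type_s:.2f}")
print(f"exaggeration : {exaggeration:.1f}")
```

With these inputs the wrong-sign probability comes out near one in four and the average exaggeration near 9, in line with the figures quoted above.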

gelman · stats

April 10, 2018 at 12:19:00 GMT+2 · permalink

·

http://andrewgelman.com/2014/11/17/power-06-looks-like-get-used/

The "What does not kill my statistical significance makes it stronger" fallacy - Statistical Modeling, Causal Inference, and Social Science

"So, we’ve seen from statistical analysis that the “What does not kill my statistical significance makes it stronger” is a fallacy: Actually, the noisier the study, the less we learn from statistical significance."
The thing is, when a study is very noisy, a significant result will tend to point to an effect whose magnitude is larger than the true effect, and whose sign may even go the wrong way.

gelman · stats

April 10, 2018 at 12:15:24 GMT+2 · permalink

·

http://andrewgelman.com/2017/02/06/not-kill-statistical-significance-makes-stronger-fallacy/

Statistical vignette of the day as a teaching tool | Dynamic Ecology

Interesting stories to read...

stats

April 5, 2018 at 15:29:40 GMT+2 · permalink

·

https://dynamicecology.wordpress.com/2018/03/19/statistical-vignette-of-the-day-as-a-teaching-tool/

Mixed impressions in species distribution modeling – Rapid Ecology

To read: Fourcade et al., cited in the blog post. Looks pretty good.

alire · ecologie · stats

March 2, 2018 at 11:23:53 GMT+1 · permalink

·

https://rapidecology.com/2018/02/26/mixed-impressions-in-species-distribution-modeling/

Multiple source spatial cluster detection via multi-criteria analysis | SpringerLink

Looks interesting... To read

alire · ecologie · stats

February 25, 2018 at 19:50:17 GMT+1 · permalink

·

https://link.springer.com/article/10.1007%2Fs10651-018-0403-9

Spatial data and the tidyverse

To study seriously.

alire · informatique · R · stats

February 9, 2018 at 09:48:03 GMT+1 · permalink

·

http://www.robinlovelace.net/presentations/spatial-tidyverse.html

Phase-Amplitude Separation and Modeling of Spherical Trajectories

Apparently there are statistical methods for modeling paths on a sphere (for example animal migrations, hurricane tracks, etc.). I did not know that.
No need for it right now, but it's good to know it exists.

alire · stats

January 22, 2018 at 10:27:57 GMT+1 · permalink

·

http://amstat.tandfonline.com/doi/full/10.1080/10618600.2017.1340892

Methods in Ecology and Evolution - Wiley Online Library

Special issue of MEE on expert opinion elicitation. To fetch and read.

alire · elicitation · statistique · stats

January 13, 2018 at 08:39:07 GMT+1 · permalink

·

http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/

Does Akaike play dice :: Mike Meredith

Great explanation of why AIC should not be used automatically. Very nice example.

statistiques · stats

January 12, 2018 at 16:52:08 GMT+1 · permalink

·

http://www.mikemeredith.net/blog/2017/Akaike_dice.htm

"Participants reported being hungrier when they walked into the café (mean = 7.38, SD = 2.20) than when they walked out [mean = 1.53, SD = 2.70, F(1, 75) = 107.68, P < 0.001]." - Statistical Modeling, Causal Inference, and Social Science

Excellent!

"Participants reported being hungrier when they walked into the café (mean = 7.38, SD = 2.20) than when they walked out [mean = 1.53, SD = 2.70, F(1, 75) = 107.68, P < 0.001]."

stats

December 22, 2017 at 16:22:25 GMT+1 · permalink

·

http://andrewgelman.com/2016/07/08/29495/

Alice's Adventures in Numberland

Apparently this would be an interesting read. The blog is old-school, with no RSS feed, but it seems to contain some quite interesting material...
To read some day...

alire · science · stats

December 21, 2017 at 10:41:01 GMT+1 · permalink

·

https://www.math.uci.edu/~asilverb/Adventures.html

Mixed Effects Random Forests in Python – Towards Data Science

Random forests with random effects do exist!
In Python, but they exist...

maths · stats

December 18, 2017 at 15:33:38 GMT+1 · permalink

·

https://towardsdatascience.com/mixed-effects-random-forests-6ecbb85cb177

How to communicate risk to the public

Interesting, based on psychology research.

divers · sciences · stats

December 4, 2017 at 21:56:56 GMT+1 · permalink

·

http://www.decisionsciencenews.com/2010/12/03/some-ideas-on-communicating-risks-to-the-general-public/

New publication on using hypervolumes for niche modelling – Macrosystems ecology lab

To read. I had heard of the method and was not a fan (the kernel method becomes less and less efficient as the dimension of the ecological space increases). Apparently there is a debate. So: to read.

ecologie · niche · stats

December 4, 2017 at 10:43:45 GMT+1 · permalink

·

http://benjaminblonder.org/2017/07/06/new-publication-on-using-hypervolumes-for-niche-modelling/

The Substitute for p-Values: Journal of the American Statistical Association: Vol 112, No 519

To read

alire · stats

November 21, 2017 at 21:28:12 GMT+1 · permalink

·

http://www.tandfonline.com/doi/full/10.1080/01621459.2017.1311264

Distance sampling with camera traps - Howe - 2017 - Methods in Ecology and Evolution - Wiley Online Library

Interesting application of distance sampling to camera-trap data. To read

alire · ecology · stats · écologie

November 8, 2017 at 15:29:44 GMT+1 · permalink

·

http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12790/abstract;jsessionid=BFF425D59ADDA49F453106D5A0AD803B.f01t01

[1710.08162] bridgesampling: An R Package for Estimating Normalizing Constants

Yet another new approach for estimating normalizing constants in Bayesian analyses

bayésienne · stats

October 25, 2017 at 11:43:08 GMT+2 · permalink

·

https://arxiv.org/abs/1710.08162

Probability and computer limitations :: Mike Meredith

Very interesting article on very small or very large probabilities which, because of finite-precision representation, cannot be represented on a computer.
When working with them, there are plenty of little tricks to avoid surprises, whether you want to compute these probabilities, add them, or compute 1-p.
Really interesting
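
The usual tricks all come down to working on the log scale. A minimal sketch of three of them, using only the standard library; the helper names are mine and this is not code from the linked article:

```python
import math

def log_sum_exp(log_ps):
    """log(p1 + p2 + ...) computed from log(p1), log(p2), ... without underflow."""
    m = max(log_ps)
    return m + math.log(sum(math.exp(lp - m) for lp in log_ps))

def log1m_exp(log_p):
    """log(1 - p) computed from log(p) < 0, stable for p close to 0 and close to 1."""
    if log_p > -math.log(2.0):                  # p > 0.5: 1 - p is small, use expm1
        return math.log(-math.expm1(log_p))
    return math.log1p(-math.exp(log_p))         # p <= 0.5: p is small, use log1p

# 1. Computing: a product of many probabilities underflows, its log does not.
naive = 0.3 ** 2000                 # underflows to 0.0
log_p = 2000 * math.log(0.3)        # about -2408, perfectly representable

# 2. Adding: p + p on the log scale.
log_2p = log_sum_exp([log_p, log_p])

# 3. Computing 1 - p for a tiny p: the naive version loses p entirely.
tiny = 1e-20
naive_log_1mp = math.log(1.0 - tiny)            # 0.0, since 1.0 - 1e-20 == 1.0
stable_log_1mp = log1m_exp(math.log(tiny))      # about -1e-20

print(naive, log_p, log_2p, naive_log_1mp, stable_log_1mp)
```

The same idea carries over to R, where the standard functions `log1p()` and `expm1()` play the same role.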

informatique · statistiques · stats

October 23, 2017 at 14:45:29 GMT+2 · permalink

·

http://www.mikemeredith.net/blog/2017/UnderOverflow.htm

Model comparison: Deviance-based approaches

A fairly interesting course on model comparison, with in particular a very interesting passage that helps build a more intuitive understanding of the "effective number of parameters"

AIC · DIC · information · model · multimodel · statistics · stats

October 23, 2017 at 12:39:34 GMT+2 · permalink

·

https://web.as.uky.edu/statistics/users/pbreheny/701/S13/notes/2-19.pdf

How Humans See Data - John Rauser - Velocity Amsterdam 2016 - YouTube

I watched the first few minutes, it looks great. A talk on building graphs based on Cleveland's model.
To watch in more detail.

exploratoire · graphique · stats

October 8, 2017 at 15:31:46 GMT+2 · permalink

·

https://www.youtube.com/watch?v=fSgEeI2Xpdc

Bayesian Inference for Multistate ‘Step and Turn’ Animal Movement in Continuous Time | SpringerLink

Looks really interesting. To read.

alire · stats

October 3, 2017 at 22:14:42 GMT+2 · permalink

·

https://link.springer.com/article/10.1007/s13253-017-0286-5

Beyond subjective and objective in statistics - Gelman - 2017 - Journal of the Royal Statistical Society: Series A (Statistics in Society) - Wiley Online Library

An article that looks interesting. To read.

alire · stats

October 2, 2017 at 15:36:13 GMT+2 · permalink

·

http://onlinelibrary.wiley.com/doi/10.1111/rssa.12276/full