Frank Nielsen (@FrnkNlsn)'s Twitter Profile
Frank Nielsen

@FrnkNlsn

Machine Learning & AI, Information Sciences & Information Geometry, Distances & Statistical models, HPC.
"Geometry defines the architecture of spaces" @SonyCSL

ID:117258094

Link: https://franknielsen.github.io/index.html
Joined: 25-02-2010 01:38:54

5.2K Tweets

23.4K Followers

1.3K Following

Frank Nielsen (@FrnkNlsn):

Slide deck on 3 concepts for statistical divergences:

- *comparative convexity* and a generalization of Bregman divergences
- computing divergences with *maximal invariant*
- embedding Fisher-Rao metrics onto submanifolds and *projective divergences*
👉franknielsen.github.io/SlidesVideo/

Frank Nielsen (@FrnkNlsn):

Distances omnipresent in information sciences:
- Probability: convergence thms
- Statistics: estimators/scoring rules
- Information theory: mutual information
- Signal proc.: factorization
- ML: loss functions
- Information geometry: canonical structures

franknielsen.github.io/Divergence/

Frank Nielsen (@FrnkNlsn):

Precursors of statistical manifolds in information geometry:
- Space of statistical parameters by Hotelling (1930)
- Statistical field by Mahalanobis (1936)
- Population space by Rao (1945)

2 meanings for statistical manifolds:
- mfd of statistical models
- mfd with dual structure

Frank Nielsen (@FrnkNlsn):

I agree with this quote after replacing 'is not about' with 'is not only about' 🙂

'Mathematics is not about numbers, equations, computations, or algorithms: it is about understanding'
-- William Paul Thurston

Frank Nielsen (@FrnkNlsn):

3 definitions of the Jensen-Shannon divergence yielding 3 different generalizations!

① JSD = Symmetrization of KLD

② JSD = concave gap induced by Shannon entropy

③ JSD = variational KLD divergence

mdpi.com/1099-4300/23/4…
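Definitions ① and ② can be checked numerically. A minimal sketch in Python/NumPy (the two discrete distributions are illustrative):

```python
import numpy as np

def kld(p, q):
    # Kullback-Leibler divergence between discrete distributions
    return float(np.sum(p * np.log(p / q)))

def jsd_symmetrized(p, q):
    # ① JSD = symmetrized KLD to the mixture m = (p + q)/2
    m = 0.5 * (p + q)
    return 0.5 * kld(p, m) + 0.5 * kld(q, m)

def jsd_entropy_gap(p, q):
    # ② JSD = concave gap of Shannon entropy: H((p+q)/2) - (H(p) + H(q))/2
    H = lambda d: -float(np.sum(d * np.log(d)))
    return H(0.5 * (p + q)) - 0.5 * (H(p) + H(q))

p = np.array([0.4, 0.6])
q = np.array([0.7, 0.3])
assert abs(jsd_symmetrized(p, q) - jsd_entropy_gap(p, q)) < 1e-12
```

Expanding the logarithms shows the two expressions agree term by term, which is why the assertion holds for any pair of distributions.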

Frank Nielsen (@FrnkNlsn):

Talk next week on recent works on distances with applications in statistics & ML at the meeting edmsa.sciencesconf.org

Introducing concepts:
- comparative convexity & Bregman divergences
- maximal invariant & f-divergences
- Fisher-Rao & projective distances

x.com/frnknlsn/statu…

Frank Nielsen (@FrnkNlsn):

The Fisher-Rao geometry of the statistical model of univariate normal distributions amounts to the Poincaré hyperbolic upper half-plane U.

A *band* of U embeds isometrically onto the pseudosphere of negative curvature.

Hilbert proved that no full isometric embedding of U into R³ is possible.
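As a concrete illustration, the Fisher-Rao distance between two univariate normals can be computed through this hyperbolic model; a sketch using the standard change of coordinates (μ, σ) ↦ (μ/√2, σ) into U:

```python
import math

def fisher_rao_normal(mu1, sigma1, mu2, sigma2):
    # Map (mu, sigma) to the Poincaré upper half-plane via (mu/sqrt(2), sigma);
    # the Fisher-Rao distance is then sqrt(2) times the hyperbolic distance there.
    x1, y1 = mu1 / math.sqrt(2.0), sigma1
    x2, y2 = mu2 / math.sqrt(2.0), sigma2
    # hyperbolic distance on the upper half-plane:
    # acosh(1 + (dx^2 + dy^2) / (2 y1 y2))
    delta = ((x1 - x2) ** 2 + (y1 - y2) ** 2) / (2.0 * y1 * y2)
    return math.sqrt(2.0) * math.acosh(1.0 + delta)

# Same mean: the distance reduces to sqrt(2) * |log(sigma2/sigma1)|
print(fisher_rao_normal(0.0, 1.0, 0.0, math.e))  # ≈ sqrt(2) ≈ 1.4142
```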

Frank Nielsen (@FrnkNlsn):

Overview of the key principles of the main statistical distances used in machine learning:
- f-divergences
- Bregman divergences
- Optimal transport

👉franknielsen.github.io/Divergence/ind…

(Figure from mdpi.com/1099-4300/22/1… )
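A minimal sketch of the Bregman-divergence principle from this overview (the generator F and the points are illustrative):

```python
import numpy as np

def bregman(F, grad_F, t1, t2):
    # B_F(t1 : t2) = F(t1) - F(t2) - <t1 - t2, grad F(t2)>:
    # the gap between F and its linear approximation taken at t2
    return F(t1) - F(t2) - float(np.dot(t1 - t2, grad_F(t2)))

# Generator F(x) = ||x||^2 recovers the squared Euclidean distance
F = lambda x: float(np.dot(x, x))
gF = lambda x: 2.0 * x
a = np.array([1.0, 2.0])
b = np.array([3.0, 1.0])
print(bregman(F, gF, a, b))  # 5.0 = (1-3)^2 + (2-1)^2
```

Other generators yield other members of the family; e.g. the negative Shannon entropy generator produces the KLD.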

Frank Nielsen (@FrnkNlsn):

Prof. Mahalanobis (famous for the Mahalanobis Δ² distance) mentions an analogy with relativity: model the space of distributions by a 'statistical field'!

He introduced the work of Einstein and Minkowski, with an English translation published by the University of Calcutta.

👉 franknielsen.github.io/IG/

Frank Nielsen (@FrnkNlsn):

Nice to see that at least 6 papers at #ICML24 have coresets in their titles!
Hope for green data ML/AI!

x.com/frnknlsn/statu…

Frank Nielsen (@FrnkNlsn):

Softplus is a strictly convex activation function that smooths ReLU.
It is obtained from LogSumExp (LSE) by fixing one argument to zero: LSE itself is only convex, but this restriction makes it strictly convex and yields the softplus function.
tinyurl.com/MonteCarloIG
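The LSE-softplus relation can be sketched directly:

```python
import math

def logsumexp(xs):
    # LSE(x1, ..., xn) = log(exp(x1) + ... + exp(xn)),
    # computed stably by shifting by max(xs)
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def softplus(x):
    # Fixing one LSE argument to zero yields softplus(x) = log(1 + exp(x))
    return logsumexp([x, 0.0])

print(softplus(0.0))  # log(2) ≈ 0.6931
```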

Frank Nielsen (@FrnkNlsn):

Variational divergences = minimizers of averages of divergences.

Sibson's information radius yields Jensen-Shannon variational divergence with centroid = arithmetic mean.

Symmetrize any divergence as a variational divergence using an abstract mean:

mdpi.com/1099-4300/23/4…
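A numerical check of the variational definition (a sketch; the two distributions are illustrative): among candidate centroids on the simplex, the arithmetic mean minimizes the average KLD, and the minimum value is the JSD.

```python
import numpy as np

def kld(p, q):
    # Kullback-Leibler divergence between discrete distributions
    return float(np.sum(p * np.log(p / q)))

def avg_kld(q, dists):
    # objective of the variational definition: average KLD to candidate centroid q
    return float(np.mean([kld(p, q) for p in dists]))

p = np.array([0.4, 0.6])
r = np.array([0.7, 0.3])
m = 0.5 * (p + r)  # arithmetic-mean centroid attains the minimum (= JSD)

rng = np.random.default_rng(0)
for _ in range(1000):
    q = rng.dirichlet([1.0, 1.0])  # random candidate centroid on the simplex
    assert avg_kld(m, [p, r]) <= avg_kld(q, [p, r]) + 1e-12
```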

Frank Nielsen (@FrnkNlsn):

My short reviews:

① Excellent panorama of information geometry (IG) by its Founder. Most accessible text!
⬇️
② Great description of dual structures of IG with exercises but need diff geo background
⬇️
③ Most rigorous text. Foundations, statistical invariance
