Johannine
Tags:

Miscellaneous

Category: Numbers

I was delighted to read economist.com's recent editorial, "Gendercide: the worldwide war on baby girls", which unmasks the myth surrounding the skewed sex ratio in China.

Conventional thinking blames the illiteracy of parents, poverty and the controversial one-child policy for the skewed sex ratio in China. Thanks to economist.com, these opinions turn out to be distorted, and the case against them rests on sound statistical evidence.

Based on prudent analysis and a careful presentation of the data, the article successfully demonstrates that conventional thinking may be

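Tags:

Miscellaneous

The financial crisis remains shrouded in fog, and the tug-of-war between Germany and Greece goes on. A question that has run through the entire history of finance has now been raised again by the Economist website: can financial instruments actually promote economic development?

http://www.economist.com/debate/debates/overview/166?source=hptextfeature

As I see it, everything has its limits. The most basic form of finance is bank deposits and lending, whose purpose is to increase liquidity and put capital to effective use. Even today, when financial instruments are bewilderingly varied, from stocks and bonds to options and futures, they have never escaped this essence. From start to finish, finance is merely a tool for increasing liquidity: by lending and by dividing up risk, it lets capital and the real economy work together more efficiently, but it has no productive power of its own.

The sad thing about modern society is that many people mistake financiers for creators of social wealth and put far too high a price on their economic activity. The seemingly sophisticated financial innovation that financiers carry out with complex statistical tools is, in substance, little more than a way of fooling investors into paying higher commissions. More and more people, refusing to see this, pour large sums into financial activity, to the point of losing touch with the actual state of the economy; the only possible result is to overdraw future wealth. In this bubble-filled age the Madoff affair is by no means an isolated case; in fact our brokerages and listed companies are all, to a greater or lesser degree, doing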

Tags:

Miscellaneous

In the quantile regression model for longitudinal data proposed by Koenker, he tried to build the L1-norm approach by analogy with the L2 norm of mean regression. The difficulty lies in choosing the penalty term that reflects the random-effects part: with an L1-norm penalty, the structure of a pilot covariance matrix cannot be used. Then I started to think: why not use the eigenvalues?

Eigenvalue and eigenvector decomposition may be the most charming part of matrix algebra. It compresses the information in an N×N-dimensional space into N dimensions, as in the spectral decomposition, which was applied to signal transmission for television before the advent of digital technology. In fact, it drives all the classical multivariate analysis techniques: Principal Component Analysis, Partial Least Squares Regression, Factor Analysis, Independent Co
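As a small illustration of the decomposition mentioned above, here is a minimal sketch that assumes a real symmetric matrix (for example a covariance matrix). It uses NumPy's `numpy.linalg.eigh`; the function names `spectral_decomposition` and `reconstruct` are made up for this example and do not come from the post.

```python
import numpy as np

def spectral_decomposition(a):
    """Return eigenvalues and orthonormal eigenvectors of a symmetric matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    return eigenvalues, eigenvectors

def reconstruct(eigenvalues, eigenvectors, k=None):
    """Rebuild the matrix from its k largest-magnitude eigenpairs (all if k is None)."""
    order = np.argsort(np.abs(eigenvalues))[::-1]
    if k is not None:
        order = order[:k]
    q = eigenvectors[:, order]
    # Sum of lambda_i * q_i * q_i^T over the selected eigenpairs.
    return (q * eigenvalues[order]) @ q.T

# Hypothetical 3x3 symmetric matrix used only for demonstration.
a = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 1.0]])
values, vectors = spectral_decomposition(a)
full = reconstruct(values, vectors)        # uses all 3 eigenpairs
rank1 = reconstruct(values, vectors, k=1)  # keeps only the dominant direction
```

With all eigenpairs the matrix is recovered up to rounding; keeping fewer gives a low-rank approximation, which is the compression idea that techniques such as Principal Component Analysis build on.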


Draw a line on a balloon, and draw a circle on it as well. Stretch and twist the balloon in whatever direction you like, without breaking it. Whatever figure you create, the line remains a line and the circle remains a closed curve. There are no breaks.

This captures the idea of continuity. The surface is continuous because:

 

(2009-11-13 21:22)
Tags:

Miscellaneous

Ratio estimation is an important topic, and not only in the survey sampling literature. In microeconomics, a marginal effect is in fact a ratio statistic. In the medical industry, the cost-effectiveness ratio also interests researchers, since in practice we consider not only whether a certain medicine can cure a disease but also how much the treatment costs. A treatment with a high economic cost may not be practical.

Now turn to the statistics behind this. In every case we consider the estimation of R=X/Y, where X and Y are both random variables. In survey sampling R is only an intermediate quantity and the real interest is in estimating X, so Y is chosen to have a relatively small variance. Thus, in estimation we neglect the variance of Y, and the formula is rather simple.
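To make the simplification concrete, here is a minimal sketch under my own assumptions: paired observations, R estimated by the ratio of sample means, and variances obtained from the standard first-order (delta-method) expansion. The post does not give its formula, so the "simplified" version below (dropping every term that involves the variability of Y) is only one plausible reading, and the function name is hypothetical.

```python
import numpy as np

def ratio_variance_delta(x, y):
    """Estimate R = mean(x) / mean(y) with two first-order variance approximations.

    simplified: treats mean(y) as a known constant (variance of Y neglected).
    full:       keeps the variance of mean(y) and its covariance with mean(x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    r = x.mean() / y.mean()
    # Covariance matrix of the two sample means.
    cov = np.cov(x, y, ddof=1) / n
    var_xbar, var_ybar, cov_xy = cov[0, 0], cov[1, 1], cov[0, 1]
    simplified = var_xbar / y.mean() ** 2
    full = (var_xbar - 2.0 * r * cov_xy + r ** 2 * var_ybar) / y.mean() ** 2
    return r, simplified, full
```

When Y has no spread the two expressions coincide; how far they drift apart otherwise depends on the variance of Y and on its correlation with X.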

 


The world of mathematics is a marvelous creation of nature, discovered by human intelligence. Although its various elements are identified by different symbols and formulas, their essence turns out to be the same time and again.

For example, when we study arithmetic we learn how to add, subtract, multiply and divide numbers. Operators such as '+', '-', '*' and '/' are first applied in the space of real numbers. Then we move on to study matrix algebra, l

Tags:

Miscellaneous

In sampling theory, the ratio estimator is an important statistic. Since it exploits information from an auxiliary variable, whose population distribution is more easily obtained, the sampling variance is greatly reduced. However, the theoretical variance of a ratio estimator is hard to obtain, and many statisticians approximate it by neglecting the variance of the denominator of the estimator. This method inevitably underestimates the variance. It is argued that the approximation is quite feasible if the sample size is large enough. In this article I carry out a rigorous Monte Carlo experiment on a sampling example. Both the theoretical variance and the approximate estimate are calculated, and I find that even with a small sample size the approximation is quite close to the true value. Further discussion is presented, demonst
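The kind of experiment described above could be sketched roughly as follows. This is not the author's code or data: the population, the sample size, the number of replications and all names are assumptions made for illustration, and the approximation used is the textbook linearised variance for simple random sampling without replacement, in which the sample mean of the auxiliary variable in the denominator is replaced by its population mean.

```python
import numpy as np

def simulate_ratio_estimator(pop_size=2000, n=30, reps=5000, seed=0):
    """Compare the Monte Carlo variance of a ratio estimator with its linearised approximation."""
    rng = np.random.default_rng(seed)
    # Hypothetical finite population: the target is roughly proportional to the auxiliary variable.
    aux = rng.gamma(shape=4.0, scale=5.0, size=pop_size)
    target = 2.0 * aux + rng.normal(0.0, 3.0, size=pop_size)
    true_ratio = target.sum() / aux.sum()

    # Linearised variance: (1 - f) / n * S_d^2 / mean(aux)^2, with d = target - R * aux.
    residual = target - true_ratio * aux
    f = n / pop_size
    approx_var = (1.0 - f) / n * residual.var(ddof=1) / aux.mean() ** 2

    # Monte Carlo: draw many simple random samples and recompute the estimator each time.
    estimates = np.empty(reps)
    for i in range(reps):
        idx = rng.choice(pop_size, size=n, replace=False)
        estimates[i] = target[idx].mean() / aux[idx].mean()
    return true_ratio, approx_var, estimates.var(ddof=1)
```

Calling `simulate_ratio_estimator(n=10)` and `simulate_ratio_estimator(n=100)` and comparing the last two returned values is one way to see how the gap between the two variances changes with the sample size.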
Tags:

Mathematics

The Nine Chapters on the Mathematical Art, an ancient Chinese mathematical work, already sheds some light on the characteristics of Hilbert space. The Gougu Theorem in this book actually coincides with Parseval's formula in two-dimensional Euclidean space. Time series analysis deals a lot with distances between random variables, and such distances are defined in a special type of Hilbert space: the square-integrable functions on a probability space. Although the algorithm for linear time series regression resembles that of least-squares linear regression, in linear regression the variables are defined directly on Euclidean space, while in time series calculations the distances in Euclidean space only approximate distances in the square-integrable function space. Spectral analysis is based upon an isomorphism between two types of Hilbert spaces, using an Ito integrati
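The remark about the Gougu Theorem and Parseval's formula can be checked numerically in the plane. The sketch below expands a vector in an orthonormal basis (a rotation of the standard axes) and compares the sum of squared coefficients with the squared length. The helper names are invented for this example.

```python
import math

def rotated_basis(theta):
    """Orthonormal basis of the plane obtained by rotating the standard axes by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c, s), (-s, c)]

def coefficients(v, basis):
    """Inner products of v with each basis vector."""
    return [v[0] * b[0] + v[1] * b[1] for b in basis]

def squared_norm(values):
    """Sum of squares of the entries."""
    return sum(c * c for c in values)

v = (3.0, 4.0)  # legs 3 and 4, so the squared hypotenuse is 25
coeffs = coefficients(v, rotated_basis(0.7))
# Parseval in two dimensions: the squared length does not depend on the orthonormal basis.
same_length = math.isclose(squared_norm(coeffs), squared_norm(v))
```

In the standard basis the coefficients are just the two legs of the right triangle, so the identity reduces to the Gougu (Pythagorean) relation.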
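Tags:

Miscellaneous

Category: Writing

A neural network is a powerful tool: with just a single hidden layer it can uncover almost any form of nonlinear relationship. Its only drawbacks are its black-box nature and its convergence behavior.

The design idea behind using a neural network for nonlinear dimensionality reduction is really ingenious: the original variables are used directly as both the input and the output, and the values on the hidden-layer neurons then serve as the reduced representation.

For example, to reduce a 100-dimensional variable to 2 dimensions, you only need to build a special neural network whose input and output layers both have 100 units and whose hidden layer has 2. Because the output layer is the original variables themselves, the 2-dimensional hidden variables fit the information in the original variables as closely as possible.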
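A 100-2-100 network of this kind (usually called an autoencoder) might be sketched as follows. Everything concrete here is an assumption made for illustration: the synthetic data, the tanh activation on the hidden layer, the linear output layer, plain gradient descent and the chosen learning rate. The post specifies none of these.

```python
import numpy as np

def make_demo_data(n=200, d=100, seed=0):
    """Hypothetical data: d observed variables driven by 2 hidden factors."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 2))
    x = np.tanh(z @ rng.normal(scale=0.5, size=(2, d)))
    x = (x - x.mean(axis=0)) / x.std(axis=0)  # standardise each column
    return x / np.sqrt(d)                     # keep each row's norm near 1

def train_autoencoder(x, n_hidden=2, lr=0.1, epochs=1000, seed=0):
    """Train input -> tanh hidden layer -> linear output to reproduce x."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)           # hidden layer: the low-dimensional code
        err = h @ w2 + b2 - x              # reconstruction error
        losses.append(float(np.sum(err ** 2) / n))
        g_out = 2.0 * err / n              # gradient of the loss w.r.t. the output
        g_pre = (g_out @ w2.T) * (1.0 - h ** 2)  # back through tanh
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x.T @ g_pre)
        b1 -= lr * g_pre.sum(axis=0)

    def encode(data):
        """Map rows of data to their n_hidden-dimensional codes."""
        return np.tanh(data @ w1 + b1)

    return encode, losses
```

Typical use would be `encode, losses = train_autoencoder(make_demo_data())`, after which `encode(x)` returns one 2-dimensional point per row of `x`.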

 

 

 

However, such a model does not look like a purely nonlinear model, because from the input layer to the hidden layer,

(2009-02-27 18:28)
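Tags:

Miscellaneous

Expressed as a model, Darwin's theory of evolution is a hereditary process: first, individuals are sampled non-uniformly according to their fitness; then they are randomly paired and crossed over; finally they undergo mutation. This mathematical model is widely adopted in today's optimization problems under the name genetic algorithm.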
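The three steps just described (fitness-weighted sampling, random pairing with crossover, mutation) can be sketched on bit strings as below. This is a minimal illustration rather than the setup used in the post: single-point crossover, the parameter values and the function name are all assumptions, the fitness must be non-negative, the population size must be even, and chromosomes need at least 2 bits.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=40, generations=60,
                      mutation_rate=0.02, seed=0):
    """Return the fittest bit string seen during a simple generational GA."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # 1. Non-uniform sampling: fitter individuals are drawn more often.
        weights = [fitness(ind) + 1e-9 for ind in population]
        parents = rng.choices(population, weights=weights, k=pop_size)
        # 2. Random pairing and single-point crossover.
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = rng.randrange(1, n_bits)
            children.append(a[:cut] + b[cut:])
            children.append(b[:cut] + a[cut:])
        # 3. Mutation: flip each bit with a small probability.
        population = [[bit ^ (rng.random() < mutation_rate) for bit in child]
                      for child in children]
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best

# Toy fitness (count of ones); in variable selection each bit could mark
# whether a variable is included in the model.
best = genetic_algorithm(sum, n_bits=20)
```

For a real subset-selection problem the fitness function would fit a model on the selected variables and return a score, which is where most of the running time goes.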

 

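A certain teacher seems especially fond of this computational method from the biological world; whatever the simulation experiment, the advice is to do it with a genetic algorithm. A genetic algorithm is indeed easy to run, but its convergence is really slow. Recently I was working on dimensionality reduction for 830 chemical variables and wanted to try out the power of the genetic algorithm. Sure enough, from the afternoon until now it has evolved for more than 30,000 generations without producing anything meaningful. Then again, perhaps my computer is just too slow. Doing all-subsets regression would mean fitting 36288748080758658160676 models, which is unimaginable; even if the genetic algorithm sped things up 100-fold, it would still have to compute 3.6E22 models and run for 3.6E20 generations. With my CPU, I estimate it would not finish even by this time next year.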

 

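Come to think of it, this is not surprising. Under purely random selection and renewal, the evolution of species in nature is no different: the 30,000 generations bred in one afternoon would correspond to 600,000 years in nature, and how different were the creatures of 600,000 years ago from those of today? But a modern CPU can take those 600,000 years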
