I am
delighted to read economist.com's recent editorial unmasking
the myth of sex disproportion in China: Gendercide, the worldwide
war on baby girls.
Conventional
thinking blames the illiteracy of parents, poverty and
the controversial one-child policy for the sex disproportion
in China. Thanks to economist.com, these opinions have turned
out to be distorted, as shown on sound statistical grounds.
Based on
prudent analysis and cautious data display, the article has
successfully demonstrated that conventional thinking may
be
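The financial crisis remains shrouded in fog, and the tug-of-war between Germany and Greece continues. A question that has run through the entire history of finance has now been raised again by the Economist website: can financial instruments actually promote economic development?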
http://www.economist.com/debate/debates/overview/166?source=hptextfeature
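In my view, everything has its limits. The most basic form of finance is bank deposits and lending, whose purpose is to increase liquidity and put funds to effective use. Even today, with financial instruments as varied and complex as stocks, bonds, options and futures, finance has never departed from this essence. From start to finish, finance is merely a tool for increasing liquidity: by lending and by dividing up risk, it lets capital operations and the real economy work together more effectively, but it has no productive power of its own.
The sad thing about modern society is that many people mistake financiers for creators of social wealth and price their economic activity far too highly. The seemingly profound financial innovation that financiers carry out with complex statistical tools is in substance nothing more than a way to fool investors into paying higher commissions. More and more people, refusing to see this, pour large sums into financial activity, to the point of losing touch with the actual state of the economy; the result can only be to overdraw future wealth. In this bubble-filled age, the Madoff phenomenon is by no means an isolated case; in fact our brokerages and listed companies are all, to a greater or lesser extent, doing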
In
the quantile longitudinal model proposed by Koenker, he tried
to base the L1 norm on the L2 norm of mean regression. The
difficulty lies in choosing the penalty term that reflects the random
effect part: with an L1 norm penalty term, the structure of a
pilot covariance matrix cannot be used. Then I started to think:
why not use the eigenvalues?
Eigenvalue and
eigenvector decomposition may be the most charming part of
matrix algebra: it compresses the information in an N*N-dimensional
space into N dimensions, as in the spectral decomposition, which
was applied in the signal transmission used by our televisions before
the advance of digital technology. In fact, it drives all the
classical multidimensional analysis techniques: Principal
Component Analysis, Partial Least Squares Regression, Factor
Analysis, Independent Co
Draw a line on a balloon. Draw a circle on
it as well. Stretch and twist the balloon in whatever
direction you like, without breaking it. Whatever figure you
create, the line remains a line, and the circle remains a closed
area. There are no
breaks.
This characterizes the idea of continuity.
The surface is continuous
because:
Ratio estimation is
an important topic, and not only in the sampling survey literature. In
microeconomics, the marginal effect is in fact a ratio statistic. In
the medical industry, the cost-effectiveness ratio also interests
researchers, since in practice we take into consideration
not only whether a certain medicine can cure a disease, but also how much
the treatment costs. A treatment with a high economic
cost may not be applicable.
Let us turn back to
the statistical literature behind it. In all cases we consider
the estimation of R=X/Y, where X and Y are both random variables.
In survey sampling R is only an intermediate variable; the
real interest is in estimating X, so Y is chosen to have a
relatively small variance. Thus, in estimation we neglect the
variance of Y, and the formula is rather simple.
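To make the effect of neglecting the variance of Y concrete, here is a minimal sketch, not taken from the original post. It assumes X and Y are independent and normally distributed, and the means and standard deviations are purely illustrative. It compares a simulated variance of R=X/Y with the simple formula that treats Y as a constant, and with a first-order (delta method) approximation that keeps the variance of Y.

```python
import random
import statistics

# Illustrative parameters (assumptions, not values from the post).
MU_X, SD_X = 10.0, 2.0
MU_Y, SD_Y = 5.0, 0.1   # Y has a relatively small variance


def simulate_ratio_variance(n_rep=20000, seed=0):
    """Monte Carlo variance of R = X / Y for independent normal X and Y."""
    rng = random.Random(seed)
    ratios = [rng.gauss(MU_X, SD_X) / rng.gauss(MU_Y, SD_Y)
              for _ in range(n_rep)]
    return statistics.pvariance(ratios)


def simple_variance():
    """Approximation that treats the denominator Y as a constant."""
    return SD_X ** 2 / MU_Y ** 2


def delta_variance():
    """First-order approximation that keeps Var(Y), assuming independence."""
    r = MU_X / MU_Y
    return (SD_X ** 2 + r ** 2 * SD_Y ** 2) / MU_Y ** 2


mc = simulate_ratio_variance()
simple = simple_variance()
delta = delta_variance()
print("Monte Carlo:", mc)
print("Y treated as constant:", simple)
print("delta method:", delta)
```

Under these assumptions the simple formula is always slightly smaller than the delta-method value, and the gap shrinks as the variance of Y shrinks.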
The
world of mathematics is a marvelous creation of nature,
discovered by human intelligence. Although its various elements are
identified by various symbols and formulas, their
essence turns out to be the same time and again.
For
example, when we study arithmetic we learn how to add, subtract,
multiply and divide numbers. Operators such as '+', '*', '-' and '/'
are first applied in the space of real variables. Then we move on
to study matrix algebra,
l
In sampling theory,
the ratio estimator is an important statistic. Since it uses
information from the auxiliary variable, whose population
distribution is more easily obtained, the sampling variance is
greatly reduced. However, the theoretical variance of a ratio
estimator is hard to obtain, and many statisticians approximate
it by neglecting the variance of the denominator of
the estimator. This method inevitably leads to an
underestimate of the variance. It is argued that if
the sample size is large enough this approximation is quite
feasible. In this article I carry out a strict Monte-Carlo
experiment on a sampling instance. Both the theoretical variance and
the estimate using the approximation are calculated, and I find that
even with a small sample size the approximation is quite close to
the real value. Further discussion is presented,
demonst
The
Nine Chapters on the Mathematical Art, an ancient Chinese mathematical
work, already shed some light on the characteristics of
Hilbert space. The Gougu Theorem mentioned in this book actually coincides
with the Parseval formula in two-dimensional Euclidean
space. Time series analysis deals a lot with distances
between random variables, and such distances are defined in a
special type of Hilbert space: the square integrable functions on a
probability space. Although in linear time series regressions the
algorithm mimics that of least squares linear regression, in
linear regression the variables are defined directly on
Euclidean space, while in time series calculation the application
of distances in Euclidean space only approximates distances in the
aforementioned space of square integrable functions. Spectral analysis is
based upon an isomorphism between two types of Hilbert spaces,
using an Ito integrati
(2009-03-12 15:31)
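A neural network is a powerful tool: with just one hidden layer it can uncover almost any form of nonlinear relationship. Its only drawbacks are its black-box nature and its convergence behavior.
The design idea behind using neural networks for nonlinear dimension reduction is truly ingenious: the original variables serve directly as both the input and the output variables, and the values at the middle-layer neurons can then be taken as the result of the dimension reduction.
For example, to reduce a 100-dimensional variable to 2 dimensions, one only needs to build a special neural network whose input and output layers both have 100 units and whose middle layer has 2. Because the output layer is the original variables themselves, the 2-dimensional middle-layer variables fit the information in the original variables as closely as possible.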
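The idea can be sketched in a few lines of plain Python. This is only an illustrative toy, not the network used in the post: the tanh middle layer, the linear output layer, the learning rate and the synthetic data are all assumptions. To keep the run short it uses 8 input variables instead of 100; the 100-to-2 case differs only in the width of the input and output layers.

```python
import math
import random


def make_data(n_samples=60, n_in=8, seed=1):
    """Synthetic data: n_in observed variables driven by 2 hidden factors."""
    rng = random.Random(seed)
    mix = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(n_in)]
    data = []
    for _ in range(n_samples):
        z = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append([m[0] * z[0] + m[1] * z[1] for m in mix])
    return data


def train_autoencoder(data, n_hidden=2, epochs=200, lr=0.01, seed=0):
    """Train input -> tanh middle layer -> linear output to reproduce the input."""
    rng = random.Random(seed)
    n_in = len(data[0])
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
    b2 = [0.0] * n_in

    def encode(x):
        # Middle-layer values: the reduced representation.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(w1, b1)]

    def decode(h):
        # Output layer: reconstruction of the original variables.
        return [sum(w * hj for w, hj in zip(row, h)) + b
                for row, b in zip(w2, b2)]

    def mse():
        total = 0.0
        for x in data:
            out = decode(encode(x))
            total += sum((o - xi) ** 2 for o, xi in zip(out, x))
        return total / (len(data) * n_in)

    loss_before = mse()
    for _ in range(epochs):
        for x in data:
            h = encode(x)
            err = [o - xi for o, xi in zip(decode(h), x)]
            # Backpropagate to the middle layer before touching w2.
            grad_h = [sum(err[i] * w2[i][j] for i in range(n_in)) * (1 - h[j] ** 2)
                      for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    w2[i][j] -= lr * err[i] * h[j]
                b2[i] -= lr * err[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    w1[j][k] -= lr * grad_h[j] * x[k]
                b1[j] -= lr * grad_h[j]
    return encode, decode, loss_before, mse()


data = make_data()
encode, decode, loss_before, loss_after = train_autoencoder(data)
print("reconstruction error before/after training:", loss_before, loss_after)
print("2-dimensional code of the first sample:", encode(data[0]))
```

The function `encode` plays the role of the dimension reduction: it maps each observation to the two middle-layer values.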

However, such a model does not look like a purely nonlinear model, because from the input layer to the middle layer,
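Expressed as a model, Darwin's theory of evolution is a process of inheritance: first, individuals are sampled non-uniformly according to their fitness; then they are randomly paired for crossover; finally, they undergo a round of genetic mutation. This mathematical model is widely adopted in optimization problems today, under the name: genetic algorithm.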
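The three steps can be sketched as follows. This is a minimal toy version, not the program used in this post: the bit-string encoding (one could read each bit as "variable included or not"), the population size, the mutation rate and the toy fitness function that simply counts the ones are all assumptions made for illustration.

```python
import random


def genetic_algorithm(fitness, n_bits, pop_size=40, generations=60,
                      p_mut=0.02, seed=0):
    """Maximize a non-negative fitness over bit strings of length n_bits."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # 1. Non-uniform sampling: fitter individuals are drawn more often.
        weights = [fitness(ind) + 1e-9 for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # 2. Random pairing and one-point crossover.
        rng.shuffle(parents)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = rng.randint(1, n_bits - 1)
            children.append(a[:cut] + b[cut:])
            children.append(b[:cut] + a[cut:])
        # 3. Mutation: flip each bit with a small probability.
        pop = [[bit ^ 1 if rng.random() < p_mut else bit for bit in child]
               for child in children]
        # Remember the best individual seen so far.
        best = max(pop + [best], key=fitness)
    return best


# Toy fitness: the number of ones in the bit string.
best = genetic_algorithm(sum, n_bits=20)
print(best, sum(best))
```

In a variable selection problem the toy fitness would be replaced by a model quality criterion, which is where almost all of the computing time goes.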
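A certain teacher seems especially fond of this computational method from the biological world; whatever the simulation experiment, the advice is: do it with a genetic algorithm. In fact a genetic algorithm is very convenient to run; its convergence, however, is really slow. Recently I was working on dimension reduction for 830 chemical variables and wanted to try out the power of the genetic algorithm; sure enough, from this afternoon until now it has evolved for more than 30,000 generations without producing anything worthwhile. Then again, maybe my computer is just too slow. An all-subsets regression would require computing 36288748080758658160676 models in total, which is simply unimaginable; even if the genetic algorithm sped things up 100 times, it would still have to compute 3.6E22 models and evolve for 3.6E20 generations. With my CPU, I estimate it would not be finished by this time next year.
Come to think of it, this is not surprising: under purely random selection and renewal, the evolution of species in nature is no different. The 30,000 generations bred in one afternoon would amount to something like 600,000 years in nature, and how different could the organisms of 600,000 years ago be from those of today? But a modern CPU can turn 600,000 years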