Learn about CUDA_zerup

http://blog.sina.com.cn/u/1343680057


Learn about CUDA

(2010-11-28 10:58:28)

Tags: cuda, gpu, it

Category: Academic Research -- Wasting Away for Its Sake

Definition: CUDA is an acronym for Compute Unified Device Architecture.

Advantages over traditional GPGPU (general-purpose GPU computing through graphics APIs):

Scattered reads – code can read from arbitrary addresses in memory.

Shared memory – CUDA exposes a fast on-chip shared memory region (up to 48KB per multiprocessor on Fermi GPUs) that can be shared among the threads of a block.

Faster downloads and readbacks to and from the GPU.

Full support for integer and bitwise operations, including integer texture lookups.
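The scattered-read and shared-memory points above can be sketched in a few lines. This is only a minimal illustration, not a tuned reduction: the names gatherSum, gatherSumHost and kBlockSize are made up for this example, error checking on the CUDA runtime calls is omitted, and the kernel assumes it is launched with exactly kBlockSize threads per block.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical block size: a power of two and a multiple of the 32-thread warp.
constexpr int kBlockSize = 256;

// Each block gathers up to kBlockSize values and writes their sum to partial[blockIdx.x].
__global__ void gatherSum(const int* data, const int* idx, int n, int* partial) {
    __shared__ int tile[kBlockSize];                 // shared by all threads of the block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? data[idx[i]] : 0;  // scattered read from an arbitrary address
    __syncthreads();
    // Tree reduction inside shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = tile[0];
}

// Host helper: returns the sum of data[idx[i]] over all i.
long long gatherSumHost(const std::vector<int>& data, const std::vector<int>& idx) {
    if (idx.empty()) return 0;
    for (int j : idx) assert(j >= 0 && static_cast<std::size_t>(j) < data.size());
    const int n = static_cast<int>(idx.size());
    const int blocks = (n + kBlockSize - 1) / kBlockSize;

    int *dData = nullptr, *dIdx = nullptr, *dPartial = nullptr;
    cudaMalloc(&dData, data.size() * sizeof(int));
    cudaMalloc(&dIdx, idx.size() * sizeof(int));
    cudaMalloc(&dPartial, blocks * sizeof(int));
    cudaMemcpy(dData, data.data(), data.size() * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dIdx, idx.data(), idx.size() * sizeof(int), cudaMemcpyHostToDevice);

    gatherSum<<<blocks, kBlockSize>>>(dData, dIdx, n, dPartial);

    // This copy waits for the kernel to finish before reading the partial sums.
    std::vector<int> partial(blocks);
    cudaMemcpy(partial.data(), dPartial, blocks * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dData);
    cudaFree(dIdx);
    cudaFree(dPartial);

    long long total = 0;
    for (int p : partial) total += p;  // finish the reduction on the CPU
    return total;
}
```

Calling gatherSumHost with data {1, 2, 3, 4} and indices {3, 3, 0} reads data[3], data[3] and data[0], so the expected sum is 9.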

Limitations:

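As of this writing, Fermi GPUs (compute capability 2.0) have nearly full support for C++, but member functions cannot be virtual.

Texture rendering is not supported (not a concern for our purposes).

Floating-point rounding is limited: on compute capability 1.x devices, single-precision arithmetic supports only the round-to-nearest-even and chop (round-toward-zero) modes.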

The bus bandwidth and latency between the CPU and the GPU may be a bottleneck.

Threads are scheduled in groups of 32 called warps, so for best performance threads should run in groups of at least 32, with the total number of threads in the thousands. Branches in the program code do not impact performance significantly, provided that all 32 threads of a warp take the same execution path; when they diverge, the hardware executes each path in turn.
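The point about branches can be made concrete with two small kernels. This is a sketch under stated assumptions: branchPerThread, branchPerWarp, runBranchKernel and the 128-thread block size are hypothetical names and values chosen for illustration, error checking is omitted, and no timing is claimed here. In the first kernel the even and odd threads of one warp take different paths; in the second the condition is the same for all 32 threads of a warp.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kWarpSize = 32;           // threads per warp on current NVIDIA GPUs
constexpr int kThreadsPerBlock = 128;   // hypothetical choice: four warps per block

// Divergent: neighbouring threads in the same warp take different paths.
__global__ void branchPerThread(int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (threadIdx.x % 2 == 0) out[i] = i * 2;
    else                      out[i] = i + 1;
}

// Uniform: the condition depends only on the warp index,
// so all 32 threads of a warp take the same path.
__global__ void branchPerWarp(int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int warp = threadIdx.x / kWarpSize;
    if (warp % 2 == 0) out[i] = i * 2;
    else               out[i] = i + 1;
}

// Host helper: runs one of the two kernels over n elements and returns the output.
std::vector<int> runBranchKernel(bool perWarp, int n) {
    std::vector<int> result(n);
    if (n <= 0) return result;
    int* dOut = nullptr;
    cudaMalloc(&dOut, n * sizeof(int));
    const int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
    if (perWarp) branchPerWarp<<<blocks, kThreadsPerBlock>>>(dOut, n);
    else         branchPerThread<<<blocks, kThreadsPerBlock>>>(dOut, n);
    // The copy back waits for the kernel to complete.
    cudaMemcpy(result.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return result;
}
```

With runBranchKernel(true, 256), threads 0 to 31 of each block all take the first path and threads 32 to 63 all take the second, so no warp has to execute both. Timing the two variants, for example with CUDA events, is left out of this sketch.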
