Java 8 parallelStream API: thread pool exhaustion
(2015-12-22 22:55:50)
Tags: java8, foreach, parallelStream, thread pool, hang
Category: study
So what about parallel processing?
One of the most advertised features of streams is that they allow
automatic parallelization of processing. One can find amazing
demonstrations on the web, mostly based on the same example: a
program contacting a server to get the values corresponding to a
list of stocks and finding the highest one not exceeding a given
limit value. Such an example may show a speed increase of 400%.
But this example has little to do with parallel processing. It is an example of concurrent processing, which means that the speed increase will also be observed on a single-processor computer. This is because the main part of each "parallel" task is waiting. Parallel processing is about running, at the same time, tasks that do not wait, such as intensive calculations.
Automatic parallelization will generally not give the expected result for at least two reasons:
- The speed increase is highly dependent upon the kind of task and the parallelization strategy. Above all, the best strategy depends on the type of task.
- The speed increase is highly dependent upon the environment. In some environments, parallelizing can easily make things slower.
Whatever the kind of tasks to parallelize, the strategy applied by parallel streams will be the same, unless you devise this strategy yourself, which will remove much of the interest of parallel streams. Parallelization requires:
- A pool of threads to execute the subtasks,
- Dividing the initial task into subtasks,
- Distributing subtasks to threads,
- Collating the results.
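The four requirements above can be made concrete with a small Fork/Join sketch. This is only an illustration under assumed names (`ForkJoinSum`, `SumTask`, and the `THRESHOLD` value are hypothetical, not from the original article), and it is not necessarily how parallel streams split work internally. It shows a pool of threads, a task dividing itself into subtasks, subtasks being handed to the pool, and results being collated.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSum {

    // Hypothetical task: sums data[from, to) by recursive splitting.
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000; // arbitrary cut-off for this sketch
        private final int[] data;
        private final int from;
        private final int to;

        SumTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                // Small enough: compute directly, no further splitting.
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            // Divide the task into two subtasks.
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                      // distribute: queue the left half for the pool
            long rightSum = right.compute();  // work on the right half in the current thread
            return left.join() + rightSum;    // collate the partial results
        }
    }

    // Runs the sum in a pool created for this call and shut down afterwards.
    static long parallelSum(int[] data, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        System.out.println(parallelSum(data, 4));
    }
}
```

Every one of these steps (creating tasks, queuing them, joining them) costs something, which is the overhead the text refers to.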
Without going into details, all this implies some overhead. Parallelization will show amazing results when:
- Some tasks imply blocking for a long time, such as accessing a remote service, or
- There are not many threads running at the same time, and in particular no other parallel stream.
If all subtasks imply intense calculation, the potential gain is
limited by the number of available processors.
Java
The worst case is when the application runs in a server or a container alongside other applications, and subtasks do not imply waiting. In such a case (for example, running in a J2EE server), parallel streams will often be slower than serial ones. Imagine a server serving hundreds of requests each second. Chances are high that several streams will be evaluated at the same time, so the work is already parallelized. A new layer of parallelization at the business level will most probably make things slower.
Worse: chances are high that business applications will see a speed increase in the development environment and a decrease in production. And that is the worst possible situation.
Edit: for a better understanding of why parallel streams in Java 8 (and the Fork/Join pool in Java 7) are broken, refer to these excellent articles by Edward Harned:
What streams are good for
Streams are a useful tool because they allow lazy evaluation. This is very important in several respects:
- They allow functional programming style using bindings.
- They allow for better performance by removing iteration. Iteration occurs with evaluation. With streams, we can bind dozens of functions without iterating.
- They allow easy parallelization for tasks that include long waits.
- Streams may be infinite (since they are lazy). Functions may be bound to infinite streams without problem. Upon evaluation, there must be some way to make them finite. This is often done through a short circuiting operation.
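A minimal sketch of the laziness and infinite-stream points above, using only standard `java.util.stream` calls. The class and method names (`LazyStreams`, `evenSquares`, `firstEvenSquares`) and the `calls` counter are hypothetical, added only for illustration. Binding `map` and `filter` to an infinite stream evaluates nothing. Work happens only when the terminal operation runs, and the short-circuiting `limit` makes the stream finite.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreams {

    // Counts how many times the mapping function has actually run.
    static final AtomicInteger calls = new AtomicInteger();

    // Binds two functions to an infinite stream; nothing is evaluated here.
    static Stream<Integer> evenSquares() {
        return Stream.iterate(1, i -> i + 1)   // 1, 2, 3, ... without end
                .map(i -> {
                    calls.incrementAndGet();
                    return i * i;
                })
                .filter(sq -> sq % 2 == 0);
    }

    // limit() is the short-circuiting operation that makes the stream finite;
    // collect() is the terminal operation that triggers evaluation.
    static List<Integer> firstEvenSquares(int n) {
        return evenSquares().limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = evenSquares();
        System.out.println("calls before evaluation: " + calls.get());
        System.out.println(pipeline.limit(3).collect(Collectors.toList()));
        System.out.println("calls after evaluation: " + calls.get());
    }
}
```

Without the `limit` call, `collect` on this stream would never return.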
What streams are not good for
Streams should be used with great caution when processing intensive
computation tasks. In particular, by default, all parallel streams
will use the same ForkJoinPool (the common pool), sized by default
to match the number of cores in the computer on which the program
is running (one fewer worker threads than cores, with the calling
thread also taking part).
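The size of the shared pool can be checked directly. The sketch below uses only standard JDK calls; the class and method names (`CommonPoolInfo` and its helpers) are hypothetical. On a typical multi-core JVM with default settings, the reported parallelism is expected to be the number of available processors minus one, because the thread that starts the stream also does part of the work.

```java
import java.util.Set;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CommonPoolInfo {

    // Number of processors the JVM reports as available.
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the pool shared by all parallel streams by default.
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    // Names of the threads that actually ran elements of one parallel stream.
    static Set<String> threadsUsedByParallelStream() {
        return IntStream.range(0, 10_000)
                .parallel()
                .mapToObj(i -> Thread.currentThread().getName())
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println("cores: " + cores());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
        System.out.println("threads used: " + threadsUsedByParallelStream());
    }
}
```

The common pool's size can be changed for the whole JVM with the system property `java.util.concurrent.ForkJoinPool.common.parallelism`, but that still leaves every parallel stream sharing one pool.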
If evaluation of one parallel stream results in a very long running
task, this may be split into as many long running sub-tasks as
there are threads in the pool, one for each thread. From there, no
other parallel stream can be processed because all threads will be
occupied. So, for computation-intensive stream evaluation, one
should always use a specific ForkJoinPool.
To do this, one may create a Callable and submit it to that pool.
This way, other parallel streams (using their own ForkJoinPool)
will not be blocked by this one. In other words, we would need a
pool of ForkJoinPool instances.
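The dedicated-pool approach can be sketched as follows. This is an illustration under stated assumptions, not the original article's listing: `DedicatedPool`, `veryLongProcessing`, and `processInOwnPool` are hypothetical names, and the technique relies on an implementation behavior that is widely used but, as far as I know, not guaranteed by the stream API documentation, namely that a parallel stream whose terminal operation is started from inside a ForkJoinPool task runs in that pool instead of the common one.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class DedicatedPool {

    // Hypothetical stand-in for a CPU-bound computation.
    static long veryLongProcessing(int n) {
        long acc = 0;
        for (int i = 0; i < 100_000; i++) {
            acc += ((long) n * i) % 7;
        }
        return acc;
    }

    // Evaluates the parallel stream inside its own pool, so the common pool
    // stays free for other parallel streams.
    static List<Long> processInOwnPool(List<Integer> input, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // The terminal operation (collect) runs inside the submitted task.
            Callable<List<Long>> task = () -> input.parallelStream()
                    .map(DedicatedPool::veryLongProcessing)
                    .collect(Collectors.toList());
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for result", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("stream evaluation failed", e.getCause());
        } finally {
            pool.shutdown(); // release the pool's threads
        }
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 200)
                .boxed()
                .collect(Collectors.toList());
        System.out.println(processInOwnPool(input, 4).size());
    }
}
```

Creating a pool per call, as done here for brevity, has its own cost; in a real application the pool would more likely be created once and reused.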
If a program is to be run inside a container, one must be very
careful when using parallel streams. Never use the default pool in
such a situation unless you know for sure that the container can
handle it. In a Java