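For testing purposes I wrote a CPU stress program: it simply runs N for-loop iterations split across M threads.
I run this program with a large number of threads, say 200.
But in Task Manager I see that the thread count never exceeds some small value, say 9, and the Thread.Start method waits for previously started threads to finish.
This looks like ThreadPool behavior, but I expected a regular System.Threading.Thread to start right away in any case, without waiting for anything.
The code below reproduces the issue and includes a workaround option: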
    using System;
    using System.Diagnostics;
    using System.Threading;

    namespace HeavyLoad
    {
        class Program
        {
            static long s_loopsPerThread;
            static ManualResetEvent s_startFlag;

            static void Main(string[] args)
            {
                long totalLoops = (long)5e10;
                int threadsCount = 200;
                s_loopsPerThread = totalLoops / threadsCount;

                Thread[] threads = new Thread[threadsCount];
                var watch = Stopwatch.StartNew();
                for (int i = 0; i < threadsCount; i++)
                {
                    Thread t = new Thread(IntensiveWork);
                    t.IsBackground = true;
                    threads[i] = t;
                }
                watch.Stop();
                Console.WriteLine("Creating took {0} ms", watch.ElapsedMilliseconds);

                // *** Comment out s_startFlag creation to change the behavior ***
                // s_startFlag = new ManualResetEvent(false);

                watch = Stopwatch.StartNew();
                foreach (var thread in threads)
                {
                    thread.Start();
                }
                watch.Stop();
                Console.WriteLine("Starting took {0} ms", watch.ElapsedMilliseconds);

                if (s_startFlag != null)
                    s_startFlag.Set();

                watch = Stopwatch.StartNew();
                foreach (var thread in threads)
                {
                    thread.Join();
                }
                watch.Stop();
                Console.WriteLine("Waiting took {0} ms", watch.ElapsedMilliseconds);

                Console.ReadLine();
            }

            private static void IntensiveWork()
            {
                if (s_startFlag != null)
                    s_startFlag.WaitOne();

                for (long i = 0; i < s_loopsPerThread; i++)
                {
                    // hot point
                }
            }
        }
    }
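Case 1: If the s_startFlag creation is commented out, each thread begins CPU-intensive work as soon as it is started. In this case I get low concurrency (about 9 threads), and almost all of the time is spent in the thread-starting code: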
Creating took 0 ms
Starting took 4891 ms
Waiting took 63 ms
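Case 2: But if I create s_startFlag, every new thread waits until it is set. In this case all 200 threads start concurrently and I get the expected values: starting takes little time, the work takes a long time, and Task Manager shows 200 threads: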
Creating took 0 ms
Starting took 27 ms
Waiting took 4733 ms
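The Case 2 workaround can be packaged as a small helper. This is only a minimal sketch of the same idea, not code from the original program: the GatedStart class and StartAll method are hypothetical names, and the gate is a plain ManualResetEvent as in the listing above.

```csharp
using System;
using System.Threading;

// Hypothetical helper illustrating the Case 2 pattern: start every thread
// while all of them are blocked, then release them together.
static class GatedStart
{
    // Starts `count` threads that block on a shared gate, then opens the gate.
    // Returns the started threads so the caller can Join them.
    public static Thread[] StartAll(int count, Action work)
    {
        // The gate is intentionally not disposed: workers may still be waiting on it.
        var gate = new ManualResetEvent(false);
        var threads = new Thread[count];
        for (int i = 0; i < count; i++)
        {
            threads[i] = new Thread(() =>
            {
                gate.WaitOne(); // blocked until all threads have been started
                work();
            });
            threads[i].IsBackground = true;
            threads[i].Start(); // no worker is burning CPU yet
        }
        gate.Set(); // release all workers at once
        return threads;
    }
}
```

A caller would pass the busy loop as the work delegate, for example `GatedStart.StartAll(200, IntensiveWork)`, and then Join the returned threads.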
Why do the threads refuse to start in the first case? What limit am I hitting?
System:
- OS: Windows 7 Professional
- Framework: .NET 4.6
- CPU: Intel Core2 Quad Q9550 @ 2.83GHz
- RAM: 8 GB
Best answer: I did some research and found that high CPU load really does have a big impact on thread start time.
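First: I set totalLoops to a value 100 times larger to have more time to observe. I saw that the number of threads is not capped; the threads are just started very slowly. Each thread takes 1-2 seconds to start!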
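One way to see this directly is to time each Thread.Start call separately instead of timing the whole loop. The following is a rough sketch under my own assumptions: StartLatency, Measure and Spin are made-up names, and the busy loop mirrors the empty loop from the question.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical measurement helper: records how long each Thread.Start call takes
// while the previously started threads are already busy.
static class StartLatency
{
    // Returns one entry per thread: the duration of its Thread.Start call in ms.
    public static double[] Measure(int threadCount, long loopsPerThread)
    {
        var latencies = new double[threadCount];
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() => Spin(loopsPerThread)) { IsBackground = true };
            var watch = Stopwatch.StartNew();
            threads[i].Start();
            watch.Stop();
            latencies[i] = watch.Elapsed.TotalMilliseconds;
        }
        foreach (var t in threads)
            t.Join();
        return latencies;
    }

    // CPU-bound busy loop, same shape as IntensiveWork in the question.
    static void Spin(long loops)
    {
        for (long i = 0; i < loops; i++)
        {
        }
    }
}
```

Printing the returned array should show whether the first few starts are fast and the later ones slow down once every core is busy. The actual numbers depend on the machine and the runtime.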
Second: I explicitly bound the main thread to CPU core #0 and the worker threads to cores #1, #2 and #3 using the SetThreadAffinityMask function (https://sites.google.com/site/dotburger/threading/setthreadaffinitymask-1).
    Stopwatch watch;
    using (ProcessorAffinity.BeginAffinity(0))
    {
        watch = Stopwatch.StartNew();
        for (int i = 0; i < threadsCount; i++)
        {
            Thread t = new Thread(IntensiveWork);
            t.IsBackground = true;
            threads[i] = t;
        }
        watch.Stop();
        Console.WriteLine("Creating took {0} ms", watch.ElapsedMilliseconds);
    }
and
    using (ProcessorAffinity.BeginAffinity(1, 2, 3))
    {
        for (long i = 0; i < s_loopsPerThread; i++)
        {
        }
    }
Now the main thread has its own dedicated CPU core (within the process), and each worker thread starts after ~10 ms (totalLoops = 5e10).
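The ProcessorAffinity helper comes from the linked page and is not reproduced here. For readers who cannot reach the link, here is a rough idea of what such a helper might look like. It is a guess, not the linked implementation: AffinitySketch, MaskFor and Begin are hypothetical names, while GetCurrentThread and SetThreadAffinityMask are the real kernel32 functions. It is Windows only.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical stand-in for the linked ProcessorAffinity helper (Windows only).
static class AffinitySketch
{
    [DllImport("kernel32.dll")]
    static extern IntPtr GetCurrentThread();

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern UIntPtr SetThreadAffinityMask(IntPtr hThread, UIntPtr dwThreadAffinityMask);

    // Builds a bit mask with one bit set per zero-based core index.
    public static ulong MaskFor(params int[] cores)
    {
        ulong mask = 0;
        foreach (int core in cores)
        {
            if (core < 0 || core > 63)
                throw new ArgumentOutOfRangeException(nameof(cores));
            mask |= 1UL << core;
        }
        return mask;
    }

    // Pins the calling thread to the given cores; Dispose restores the previous mask.
    public static IDisposable Begin(params int[] cores)
    {
        Thread.BeginThreadAffinity(); // keep the managed thread on its current OS thread
        UIntPtr previous = SetThreadAffinityMask(GetCurrentThread(), new UIntPtr(MaskFor(cores)));
        if (previous == UIntPtr.Zero) // the API returns 0 on failure
        {
            int error = Marshal.GetLastWin32Error();
            Thread.EndThreadAffinity();
            throw new InvalidOperationException("SetThreadAffinityMask failed, error " + error);
        }
        return new Restore(previous);
    }

    sealed class Restore : IDisposable
    {
        readonly UIntPtr _previous;

        public Restore(UIntPtr previous) { _previous = previous; }

        public void Dispose()
        {
            SetThreadAffinityMask(GetCurrentThread(), _previous);
            Thread.EndThreadAffinity();
        }
    }
}
```

With this sketch, `using (AffinitySketch.Begin(0)) { ... }` would play the role of ProcessorAffinity.BeginAffinity(0) in the snippets above. The affinity applies to whichever thread calls Begin, so the worker threads have to call it themselves inside their thread procedure.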
Creating took 0 ms
Starting took 2282 ms
Waiting took 3681 ms
I also found this statement on MSDN:
When you call the Thread.Start method on a thread, that thread might
or might not start executing immediately, depending on the number of
processors and the number of threads currently waiting to execute.
https://msdn.microsoft.com/en-us/library/1c9txz50(v=vs.110).aspx
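Conclusion: The Thread.Start method is very sensitive to the number of actively working threads. The performance impact can be severe: a slowdown of more than a hundred times (4891 ms versus 27 ms to start the 200 threads in the question).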