> Eric Dumazet a écrit :
>> Bill Fink a écrit :
>>> On Wed, 1 Oct 2008, Neil Horman wrote:
>>>
>>>> Hey all-
>>>> Since Eric mentioned the use of statistical analysis to determine if
>>>> hash chains were growing improperly, I thought I would take a stab
>>>> at such an
>>>> approach. I'm no statistics expert, but it would seem to me that simply
>>>> computing the standard deviation of all the non-zero chain lengths
>>>> would give a
>>>> good guide point to determine if we needed to invalidate our hash
>>>> table. I've
>>>> written the below implementation. I've not tested it (I'll be
>>>> doing that here
>>>> shortly for the next few days), but I wanted to post it to get
>>>> feedback from you
>>>> all on the subject. The highlights are:
>>>>
>>>> 1) We add a counter to rt_hash_bucket structure to track the length
>>>> of each
>>>> individual chain. I realize this is sub-optimal, as it adds
>>>> potentially lots of
>>>> memory to hash table as a whole (4 bytes * number of hash buckets).
>>>> I'm afraid
>>>> I've not come up with a better way to track that yet. We also
>>>> track the total
>>>> number of route entries and the number of non-zero length chains.
>>>> Lastly we
>>>> also maintain a global maximum chain length which defines the
>>>> longest chain we
>>>> will tolerate in the route table. This patch defines it as the
>>>> mean chain
>>>> length plus one standard deviation.
>>>
>>> I believe the general rule of thumb for something like this is at
>>> least two standard deviations. For a normal distribution, one standard
>>> deviation covers about 68 % of the sample universe, while two standard
>>> deviations covers about 95 % (three standard deviations covers 99.73 %).
>>> See the Wikipedia entry:
>>>
>>>
http://en.wikipedia.org/wiki/Standard_deviation
>>>
>>
>> Thanks Bill for the pointer, this is the trick.
>>
>> I believe we should target "4σ 99.993666% " case.
>>
>> But we dont need to really compute Standard deviation at runtime, only
>> find an (upper) approximation of it.
>>
>> For elasticity=4 and 512*1024 samples (mean < 4), I guess 4σ can be
>> approximated by 20 or something.
>>
>
> Good estimation of Standard Deviation could be computed for free
> in rt_check_expire(). (each runs scans 20% of hash table with default
> tunables timeout & ip_rt_gc_interval)
>
> We could update 4σ estimation in this function, every minute (ip_rt_gc_interval)
>
> At softirq time we then can detect a particular hash chain is
> longer than 4σ estimation and trigger an appropriate action.
>
> This action is to : flush table, and while we do that, expand hash table
> if its current size is under ip_rt_max_size/elasticity...
>