池州市网站建设_网站建设公司_前端工程师_seo优化
2025/12/25 21:11:27 网站建设 项目流程

问题描述:

dpdk-testpmd在超过128核双numa场景中,启动失败问题,问题日志如下,扫描内存的时候,无法使用numa1的内存。

...EAL:Detected lcore0as core0on socket0EAL:Detected lcore127as core215on socket0EAL:Skipped lcore128as core0on socket1EAL:Maximum logical cores by configuration:128EAL:Detected CPU lcores:128EAL:Detected NUMA nodes:1EAL:huge_pre_init setted2......EAL:Hugepage/dev/hugepages//rtemap_1 is on socket 1EAL:Hugepage/dev/hugepages//rtemap_0 is on socket 0EAL:num of reserve hugepage on socket0:1EAL:num of reserve hugepage on socket1:1EAL:num of reserve hugepage on socket2:0......EAL:Allocating1pages of size1024M on socket0EAL:Trying to obtain current memory policy.EAL:Setting policy MPOL_PREFERREDforsocket0EAL:Restoring previous memory policy:0EAL:Allocating1pages of size1024M on socket1EAL:Trying to obtain current memory policy.EAL:Setting policy MPOL_PREFERREDforsocket1EAL:eal_memalloc_alloc_seg_bulk():couldn't find suitable memseg_list EAL:Restoring previous memory policy:0EAL:FATAL:Cannot init memory EAL:Cannot init memory EAL:Error-exiting with code:1Cause:::invalid EAL arguments

原因分析:

一开始只是关注了error的log,以为是大页分配失败了,没有关注到前面的日志信息。

EAL: Setting policy MPOL_PREFERREDforsocket1EAL: eal_memalloc_alloc_seg_bulk(): couldn'tfindsuitable memseg_list

其实刚开始的日志也非常的重要,这里其实早早的就暴露了问题,dpdk-testpmd启动的时候,只探测了0-127核,跳过了128以及之后的核,最终只扫了numa0,故只能用numa0的内存。

EAL: Detected lcore0as core0on socket0EAL: Detected lcore127as core215on socket0EAL: Skipped lcore128as core0on socket1EAL: Maximum logical cores by configuration:128EAL: Detected CPU lcores:128EAL: Detected NUMA nodes:1EAL: huge_pre_init setted2

分析代码,可以看到cpu count的上限是由 RTE_MAX_LCORE决定的。而它是由config/x86/meson.build 中的 dpdk_conf.set('RTE_MAX_LCORE', 128)这个决定。

./lib/eal/common/eal_common_lcore.c

继续分析config/x86/meson.build代码,如下,可以看到通过修改编译,增加-Dmax_lcores来修改cpu数量来解决。

config/x86/meson.build

commit8ef09fdc506b76d505d90e064d1f73533388b640 Author:Juraj Linkeš<juraj.linkes@pantheon.tech>Date:Tue Aug1712:45:562021+0200build:add optional NUMA and CPU counts detection Add an option to automatically discover the host's NUMA and CPU counts and use those valuesfora non cross-build.Give users the option to override the per-archdefaultvalues or values from cross files by specifying them on the command line with-Dmax_lcores and-Dmax_numa_nodes.Signed-off-by:Juraj Linkeš<juraj.linkes@pantheon.tech>Reviewed-by:Honnappa Nagarahalli<honnappa.nagarahalli@arm.com>Reviewed-by:David Christensen<drc@linux.vnet.ibm.com>Acked-by:Bruce Richardson<bruce.richardson@intel.com>diff--git a/config/x86/meson.build b/config/x86/meson.build index704ba3db56..29f3dea181100644---a/config/x86/meson.build+++b/config/x86/meson.build @@-70,3+70,5@@elseendif dpdk_conf.set('RTE_CACHE_LINE_SIZE',64)+dpdk_conf.set('RTE_MAX_LCORE',128)+dpdk_conf.set('RTE_MAX_NUMA_NODES',32)

解决方案:

编译dpdk-testpmd的时候,增加参数-Dmax_lcores=256,如下解决。
meson build -Dmax_lcores=256

小结:

dpdk在使用的时候,还是和机型绑定在一起的,对于核数和numa数量的增加,都会影响dpdk的使用,需要进行适配。比较好的是,dpdk在开发之初,考虑到了这个问题,通过调整编译参数就可以解决。

需要专业的网站建设服务?

联系我们获取免费的网站建设咨询和方案报价,让我们帮助您实现业务目标

立即咨询