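I am updating an old Linux driver that transfers data via DMA into user-space pages, which the application passes down and the driver pins with get_user_pages().
My hardware is a new x86 Xeon-based board with 12 GB of RAM.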
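The driver fetches data from VME into a PCIe FPGA, which should then write it to main memory. For each page I call dma_map_page(), check the result with dma_mapping_error(), and write the returned DMA address into the buffer descriptors of the DMA controller. Then I start the DMA. (We can also see the transfer starting on a tracer attached to the FPGA.)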
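However, when I get the DMA-complete IRQ, I cannot see any data. As a cross-check, I can access the same VME address space in PIO mode, and that works. I also tried writing values to page_address(page) of the user pages, and the application can see them. So far, so good.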
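Digging deeper into the problem, I checked the usual documentation such as DMA-API.txt, but I could not find any other way of doing this, and I did not see one in other drivers either.
My kernel is a self-compiled 64-bit 4.4.59 with various debug options (DMA-API debugging etc.) set to yes.
I also looked through drivers/iommu/ for debugging possibilities, but there are only a few pr_debug calls.
Interestingly, I have another driver, an Ethernet driver, that supports a NIC attached to PCI. That one works without any problems!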
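When I dump and compare the dma_addr_t values that come back, I see the following.
The NIC driver allocates memory for buffer descriptors etc. via dma_alloc_coherent(), and its addresses are in the lower 4 GB: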
[ 3127.800567] dma_alloc_coherent: memVirtDma = ffff88006eeab000, memPhysDma = 000000006eeab000
[ 3127.801041] dma_alloc_coherent: memVirtDma = ffff880035d9b000, memPhysDma = 0000000035d9b000
[ 3127.801373] dma_alloc_coherent: memVirtDma = ffff88006ecd4000, memPhysDma = 000000006ecd4000
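In the VME driver, the user-space page passed to dma_map_page() lies above 4 GB, and the DMA address looks different: 0xffffe010 (including the offset from the application).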
pageAddr=ffff88026b4b1000 off=10 dmaAddr=00000000ffffe010 length=100
DMA_BIT_MASK(32) is set in both drivers, and our FPGA core is 32 bits wide.
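Question: are there special prerequisites I have to meet to make this DMA work? I read that highmem memory cannot be used for DMA; is that still the case?
Part of dmesg: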
[ 0.539839] debug: unmapping init [mem 0xffff880037576000-0xffff880037ab2fff]
[ 0.549502] DMA-API: preallocated 65536 debug entries
[ 0.549509] DMA-API: debugging enabled by kernel config
[ 0.549545] DMAR: Host address width 46
[ 0.549550] DMAR: DRHD base: 0x000000fbffc000 flags: 0x1
[ 0.549573] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 0.549580] DMAR: RMRR base: 0x0000007bc14000 end: 0x0000007bc23fff
[ 0.549585] DMAR: ATSR flags: 0x0
[ 0.549590] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x0
[ 0.549779] DMAR: dmar0: Using Queued invalidation
[ 0.549784] DMAR: dmar0: Number of Domains supported <65536>
[ 0.549796] DMAR: Setting RMRR:
[ 0.549809] DMAR: Set context mapping for 00:14.0
[ 0.549812] DMAR: Setting identity map for device 0000:00:14.0 [0x7bc14000 - 0x7bc23fff]
[ 0.549820] DMAR: Mapping reserved region 7bc14000-7bc23fff
[ 0.549829] DMAR: Set context mapping for 00:1d.0
[ 0.549831] DMAR: Setting identity map for device 0000:00:1d.0 [0x7bc14000 - 0x7bc23fff]
[ 0.549838] DMAR: Mapping reserved region 7bc14000-7bc23fff
[ 0.549845] DMAR: Prepare 0-16MiB unity mapping for LPC
[ 0.549853] DMAR: Set context mapping for 00:1f.0
[ 0.549855] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 0.549861] DMAR: Mapping reserved region 0-ffffff
[ 0.549892] DMAR: Intel(R) Virtualization Technology for Directed I/O
...
[ 0.551725] iommu: Adding device 0000:00:00.0 to group 10
[ 0.551753] iommu: Adding device 0000:00:01.0 to group 11
[ 0.551780] iommu: Adding device 0000:00:01.1 to group 12
[ 0.551806] iommu: Adding device 0000:00:02.0 to group 13
[ 0.551833] iommu: Adding device 0000:00:02.2 to group 14
[ 0.551860] iommu: Adding device 0000:00:03.0 to group 15
[ 0.551886] iommu: Adding device 0000:00:03.2 to group 16
[ 0.551962] iommu: Adding device 0000:00:05.0 to group 17
[ 0.551995] iommu: Adding device 0000:00:05.1 to group 17
[ 0.552027] iommu: Adding device 0000:00:05.2 to group 17
[ 0.552059] iommu: Adding device 0000:00:05.4 to group 17
[ 0.552083] iommu: Adding device 0000:00:14.0 to group 18
[ 0.552134] iommu: Adding device 0000:00:16.0 to group 19
[ 0.552166] iommu: Adding device 0000:00:16.1 to group 19
[ 0.552191] iommu: Adding device 0000:00:19.0 to group 20
[ 0.552216] iommu: Adding device 0000:00:1d.0 to group 21
[ 0.552272] iommu: Adding device 0000:00:1f.0 to group 22
[ 0.552305] iommu: Adding device 0000:00:1f.3 to group 22
[ 0.552332] iommu: Adding device 0000:01:00.0 to group 23
[ 0.552360] iommu: Adding device 0000:03:00.0 to group 24
[ 0.552437] iommu: Adding device 0000:04:00.0 to group 25
[ 0.552473] iommu: Adding device 0000:04:00.1 to group 25
[ 0.552510] iommu: Adding device 0000:04:00.2 to group 25
[ 0.552546] iommu: Adding device 0000:04:00.3 to group 25
[ 0.552575] iommu: Adding device 0000:05:00.0 to group 26
[ 0.552605] iommu: Adding device 0000:05:00.1 to group 27
Best answer: For completeness, we found the answer. The cause was something completely different: a PCIe protocol error in the FPGA's PCIe core...