Reading a list of files as a Java 8 Stream

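I have a (possibly long) list of binary files that I want to read lazily. There will be too many files to load into memory. I'm currently reading them as a MappedByteBuffer using FileChannel.map(), but that probably isn't required. I want the method readBinaryFiles(…) to return a Java 8 Stream so that the files are loaded lazily, as they are accessed.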

public List<FileDataMetaData> readBinaryFiles(
        List<File> files,
        int numDataPoints,
        int dataPacketSize) throws IOException {

    List<FileDataMetaData> fmdList = new ArrayList<FileDataMetaData>();

    IOException lastException = null;
    for (File f: files) {

        try {
            FileDataMetaData fmd = readRawFile(f, numDataPoints, dataPacketSize);
            fmdList.add(fmd);
        } catch (IOException e) {
            logger.error("", e);
            lastException = e;
        }
    }

    if (null != lastException)
        throw lastException;

    return fmdList;
}


//  The List<DataPacket> returned will be in the same order as in the file.
public FileDataMetaData readRawFile(File file, int numDataPoints, int dataPacketSize) throws IOException {

    FileDataMetaData fmd;
    FileChannel fileChannel = null;
    try {
        fileChannel = new RandomAccessFile(file, "r").getChannel();
        long fileSz = fileChannel.size();
        ByteBuffer bbRead = ByteBuffer.allocate((int) fileSz);
        MappedByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileSz);

        buffer.get(bbRead.array());
        List<DataPacket> dataPacketList = new ArrayList<DataPacket>();

        while (bbRead.hasRemaining()) {

            int channelId = bbRead.getInt();
            long timestamp = bbRead.getLong();
            int[] data = new int[numDataPoints];
            for (int i=0; i<numDataPoints; i++) 
                data[i] = bbRead.getInt();

            DataPacket dp = new DataPacket(channelId, timestamp, data);
            dataPacketList.add(dp);
        }

        fmd = new FileDataMetaData(file.getCanonicalPath(), fileSz, dataPacketList);

    } catch (IOException e) {
        logger.error("", e);
        throw e;
    } finally {
        if (null != fileChannel) {
            try {
                fileChannel.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    return fmd;
}

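Returning fmdList.stream() from readBinaryFiles(…) won't accomplish this, because by then the contents of every file will already have been read into memory, which I can't afford.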

Other approaches to reading the contents of multiple files as a Stream rely on Files.lines(), but I need to read binary files.

I'm willing to do this in Scala or Golang if those languages support this use case better than Java.

I'd appreciate any pointers on how to lazily read the contents of multiple binary files.

Best answer: This should be sufficient:

return files.stream().map(f -> readRawFile(f, numDataPoints, dataPacketSize));

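…if, that is, you're willing to remove throws IOException from the signature of the readRawFile method. You could have that method catch IOException internally and wrap it in an UncheckedIOException. (The catch with deferred execution is that the exceptions have to be deferred as well.)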
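For illustration, here is a minimal sketch of that idea, not a drop-in replacement for the code in the question. The class name LazyBinaryFiles and the Packet type are hypothetical stand-ins (Packet plays the role of DataPacket, and FileDataMetaData is left out to keep the example self-contained). It assumes the same record layout as the question (an int channel id, a long timestamp, then numDataPoints ints) and reads each file through a plain FileChannel read rather than FileChannel.map(). Note that readBinaryFiles now returns a Stream instead of a List.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyBinaryFiles {

    // Hypothetical stand-in for the DataPacket class from the question.
    static final class Packet {
        final int channelId;
        final long timestamp;
        final int[] data;

        Packet(int channelId, long timestamp, int[] data) {
            this.channelId = channelId;
            this.timestamp = timestamp;
            this.data = data;
        }
    }

    // Reads one whole file. No checked exception in the signature:
    // an IOException is rethrown as UncheckedIOException so the method
    // can be used inside Stream.map().
    static List<Packet> readRawFile(Path file, int numDataPoints) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or EOF is reached
            }
            buf.flip();

            List<Packet> packets = new ArrayList<>();
            while (buf.hasRemaining()) {
                int channelId = buf.getInt();
                long timestamp = buf.getLong();
                int[] data = new int[numDataPoints];
                for (int i = 0; i < numDataPoints; i++) {
                    data[i] = buf.getInt();
                }
                packets.add(new Packet(channelId, timestamp, data));
            }
            return packets;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Nothing is read here. Each file is opened only when a terminal
    // operation pulls the corresponding element from the stream.
    static Stream<List<Packet>> readBinaryFiles(List<Path> files, int numDataPoints) {
        return files.stream().map(f -> readRawFile(f, numDataPoints));
    }
}
```

Because the stream is lazy, a failure surfaces at the terminal operation (forEach, collect, and so on) rather than at the call to readBinaryFiles, so that is where a caller would catch UncheckedIOException. A short-circuiting operation such as findFirst() only reads as many files as it needs.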
