Support stream Chunk reading in Python API #974

@watwea-heaviside

In the Python mcap reader API, each chunk is read into memory in full before its records are parsed. This often causes out-of-memory (OOM) errors on systems with limited RAM.

Main reader loop:

if isinstance(next_item, ChunkIndex):
    self._stream.seek(next_item.chunk_start_offset + 1 + 8, io.SEEK_SET)
    chunk = Chunk.read(ReadDataStream(self._stream))
    for index, record in enumerate(
        breakup_chunk(chunk, validate_crc=self._validate_crcs)
    ):
        if isinstance(record, Message):
            channel = summary.channels[record.channel_id]
            if topics is not None and channel.topic not in topics:
                continue
            if start_time is not None and record.log_time < start_time:
                continue
            if end_time is not None and record.log_time >= end_time:
                continue
            if channel.schema_id == 0:
                schema = None
            else:
                schema = summary.schemas[channel.schema_id]
            message_queue.push(
                (
                    (schema, channel, record),
                    next_item.chunk_start_offset,
                    index,
                )
            )

Chunk data read line:
data = stream.read(data_length)
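
For illustration, here is a rough sketch of what reading a chunk's records incrementally could look like, instead of pulling all `data_length` bytes in with a single `read`. It uses only the standard library. `iter_records` is a hypothetical helper, not part of the mcap package. It assumes the chunk body is uncompressed and relies on the MCAP record framing (1-byte opcode, 8-byte little-endian length, then the payload).

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def iter_records(stream: BinaryIO, data_length: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (opcode, payload) pairs from `data_length` bytes of record data.

    Hypothetical helper: assumes `stream` is positioned at the start of an
    uncompressed chunk body. Only one record is held in memory at a time.
    """
    remaining = data_length
    while remaining > 0:
        # Each record starts with a 1-byte opcode and an 8-byte LE length.
        header = stream.read(9)
        if len(header) != 9:
            raise EOFError("truncated record header")
        opcode, length = struct.unpack("<BQ", header)
        remaining -= 9
        if length > remaining:
            raise ValueError("record extends past end of chunk data")
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("truncated record payload")
        remaining -= length
        yield opcode, payload
```

For compressed chunks, the same loop would presumably need to read from a streaming decompressor wrapped around the file rather than from the file itself. CRC validation could be kept by feeding each piece into a running `zlib.crc32` value instead of checksumming the whole buffer at once.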
