Azure Event Hubs Kafka endpoint corrupts message size, resulting in CorruptRecordError (Expected 1190 bytes, Read 6)

LathaSree 20 Reputation points
2025-12-31T03:04:52.3733333+00:00

Hello Azure Support Team and Community,

We have isolated a critical issue within the Azure Event Hubs Kafka compatibility layer that is preventing us from consuming messages. The issue is severe and highly specific:

Problem: Messages produced to Event Hubs via the Kafka protocol are confirmed intact (readable by the native AMQP SDK), but when accessed via the Kafka endpoint, the payload is corrupted, leading to a CorruptRecordError in the consumer.

Key Diagnostic Findings (Reproducible)

  1. __Data Validity Confirmed:__ The native AMQP consumer reads all messages successfully (old and new).

  2. __Producer Library:__ The producer uses `kafka-python`.

  3. __Consumer Attempts:__ We have tested two different consumer libraries, with the same failure result:

     __Attempt A:__ A `confluent-kafka` consumer failed to retrieve any message value (null/empty).

     __Attempt B:__ A `kafka-python` consumer failed with the specific size mismatch error below.

### The Specific Error

When using the `kafka-python` consumer, we receive the following, definitive error message for both new and existing messages:

> __`An error occurred: [Error 2] CorruptRecordError: Invalid record size: expected to read 1190 bytes in record payload, but instead read 6`__

This confirms that the Azure Kafka endpoint correctly stores the original message size (1190 bytes) but is failing to retrieve and serve the correct payload content, sending only 6 bytes of corrupt data instead.

### Producer Configuration Details

__Producer Library:__ `kafka-python`

__API Version (Suspected Culprit):__ `configs["api_version"] = (0, 10)`

__Payload:__ UTF-8 JSON bytes (Confirmed non-empty)
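For completeness, here is a minimal sketch of how the producer is configured. The namespace FQDN, connection string, and payload below are illustrative placeholders, not our real values; the config dict is passed straight to `kafka.KafkaProducer(**configs)`:

```python
import json

# Illustrative placeholders only; real namespace and connection string redacted.
configs = {
    "bootstrap_servers": ["mynamespace.servicebus.windows.net:9093"],
    "security_protocol": "SASL_SSL",
    "sasl_mechanism": "PLAIN",
    "sasl_plain_username": "$ConnectionString",
    "sasl_plain_password": "Endpoint=sb://...",  # Event Hubs connection string
}
configs["api_version"] = (0, 10)  # the suspected culprit

# Payload: UTF-8 JSON bytes, confirmed non-empty before sending.
payload = json.dumps({"example": "record"}).encode("utf-8")
assert isinstance(payload, bytes) and len(payload) > 0

# The message is then sent with:
#   producer = kafka.KafkaProducer(**configs)
#   producer.send("<event-hub-name>", payload)
```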

### Request

We require immediate guidance on this specific `CorruptRecordError`. Since the issue is proven to be outside our client code (AMQP works, and the error reports an internal record size failure), this points to a fundamental bug in the Event Hubs Kafka compatibility service.

Is there a known configuration or broker version setting (beyond `broker.version.fallback: 1.0.0`) required to prevent this message corruption when reading data produced by `kafka-python`?

Thank you for your urgent assistance.

Azure Event Hubs

Answer accepted by question author
  1. Anonymous
    2025-12-31T05:00:38.9933333+00:00

    Hi @LathaSree

    Thank you for contacting Microsoft Q&A. Please find below the detailed steps to address the reported issue.

    This failure is not caused by your payload or AMQP pipeline—your own diagnostic confirms AMQP consumers read the full 1190‑byte JSON correctly. The corruption occurs only on the Kafka endpoint, and the error:

    CorruptRecordError: Invalid record size: expected 1190 bytes … read 6

    is a classic symptom of a Kafka client using an older message format version than the broker expects.

    Azure Event Hubs' Kafka endpoint supports Kafka API version 1.0+ only (and internally aligns with the Kafka 1.0+ message format). When the `kafka-python` producer is explicitly forced to use:

    ```python
    api_version = (0, 10)
    ```

    it switches to the Kafka 0.10 message format, which Event Hubs does NOT support. Event Hubs then correctly stores the message length metadata, but body retrieval fails because the consumer expects a Kafka 0.10 format envelope that Event Hubs never produced.

    This mismatch leads directly to the “expected X bytes, read Y bytes” corruption behavior.

    This is why AMQP works (AMQP path is unaffected) and both Kafka consumers fail (both read from the Kafka endpoint using a mismatched wire format).

    __Solution__

    1. Remove or update the forced API version

    Do NOT force `api_version = (0, 10)` in `kafka-python`. Instead, use API auto-negotiation:

    ```python
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=[f"{your_eventhub_fqdn}:9093"],
        security_protocol="SASL_SSL",
        sasl_mechanism="PLAIN",
        sasl_plain_username="$ConnectionString",
        sasl_plain_password=event_hubs_connection_string,
        api_version_auto_timeout_ms=60000,  # allow auto-negotiation
    )
    ```

    Or explicitly set the correct modern API version:

    ```python
    api_version = (1, 0)  # or higher, depending on your kafka-python version
    ```

    Event Hubs requires Kafka 1.0+ (per Microsoft Learn: Troubleshoot connectivity issues - Azure Event Hubs | Microsoft Learn).

    This single change resolves the corrupted‑record behavior.

    2. Validate your compression configuration

    If you're using Snappy (which is common in Kafka), be aware that Event Hubs only supports GZIP for Kafka messages. Snappy or LZ4 can cause partial reads or decode failures.

    From the official docs on compression limits: Configure Azure Event Hubs and Kafka data flow endpoints in Azure IoT Operations | Microsoft Learn

    Ensure your producer includes:

    ```python
    compression_type="gzip"
    ```

    3. Ensure your consumer also uses Kafka 1.0+

    For `kafka-python`, remove the forced version or set:

    ```python
    api_version = (1, 0)
    ```

    For `confluent-kafka` (a librdkafka client, configured via `api.version.request` and `broker.version.fallback` rather than `api_version`), leave API version auto-detection enabled, which is the default, instead of forcing a pre-1.0 fallback. This matches Event Hubs' Kafka wire format and stops the corruption.
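    Putting the consumer side together, here is a minimal sketch of `kafka-python` consumer settings for Event Hubs, written as a plain dict so the version pin is explicit. The helper name and placeholder values are illustrative, not from any SDK:

    ```python
    def event_hubs_consumer_config(namespace_fqdn: str, connection_string: str,
                                   group_id: str = "$Default") -> dict:
        """Build kafka-python KafkaConsumer kwargs for an Event Hubs namespace."""
        return {
            "bootstrap_servers": [f"{namespace_fqdn}:9093"],
            "security_protocol": "SASL_SSL",
            "sasl_mechanism": "PLAIN",
            "sasl_plain_username": "$ConnectionString",
            "sasl_plain_password": connection_string,
            "api_version": (1, 0),          # Kafka 1.0+ wire format, as Event Hubs requires
            "auto_offset_reset": "earliest",
            "group_id": group_id,
        }

    # Placeholder values; pass the real namespace and connection string instead.
    config = event_hubs_consumer_config("mynamespace.servicebus.windows.net",
                                        "Endpoint=sb://...")
    # Usage: kafka.KafkaConsumer("<event-hub-name>", **config)
    print(config["api_version"])  # (1, 0)
    ```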


2 additional answers

  1. LathaSree 20 Reputation points
    2026-01-09T03:10:18.2733333+00:00

    The client has corrected the API version, and we are now able to read messages through Kafka.

    Thank you, team, for your help.


  2. LathaSree 20 Reputation points
    2026-01-05T22:44:11.3133333+00:00

    Hi VRISHABHANATH PATIL & Manoj,

    Thank you for the detailed explanation.

    We are currently checking this with the client and have asked them to update the Kafka API version to 1.0+ (or remove the forced 0.10 setting) as suggested. We are also reviewing the compression configuration on their side.

    I’ll share an update as soon as I hear back from them or once the change is applied and validated.

    Thanks again for the support — this has been very helpful.

    Best regards, Latha

