Sample Avro File Format

Apache Avro is a data serialization system that offers rich data structures. An Avro file is a container file that stores persistent data. Avro integrates simply with dynamic languages, and no code generation is required to read or write data files.

{
  "type": "record",
  "name": "thecodebuzz_schema",
  "namespace": "thecodebuzz.avro",
  "fields": [
    {
      "name": "username",
      "type": "string",
      "doc": "Name of the user account on Thecodebuzz.com"
    },
    {
      "name": "email",
      "type": "string",
      "doc": "The email of the user logging message on the blog"
    },
    {
      "name": "timestamp",
      "type": "long",
      "doc": "time in seconds"
    }
  ],
  "doc": "A basic schema for storing thecodebuzz blogs messages"
}

AVRO Schema Example

{
  "type": "record",
  "name": "AvrosampleNetCore.AccountDetails",
  "fields": [
    {
      "name": "AccountId",
      "type": "int"
    },
    {
      "name": "AccountName",
      "type": "string"
    },
    {
      "name": "Accounts",
      "type": [
        "null",
        {
          "type": "array",
          "items": {
            "type": "record",
            "name": "AvrosampleNetCore.SubAccounts",
            "fields": [
              {
                "name": "AccountId",
                "type": "int"
              },
              {
                "name": "AccountType",
                "type": [ "null", "string" ]
              }
            ]
          }
        }
      ]
    }
  ]
}
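To make the nested schema more concrete, here is a minimal sketch in Python that checks whether a record fits it. It uses only the standard library and is not the API of any Avro library. The `matches` helper and the sample `account` record are assumptions made for illustration, and only the types used in this schema (null, int, string, array, record, union) are handled.

```python
import json

# The AccountDetails schema from above, in compact form.
SCHEMA_JSON = """
{"type": "record", "name": "AvrosampleNetCore.AccountDetails", "fields": [
  {"name": "AccountId", "type": "int"},
  {"name": "AccountName", "type": "string"},
  {"name": "Accounts", "type": ["null", {"type": "array", "items": {
    "type": "record", "name": "AvrosampleNetCore.SubAccounts", "fields": [
      {"name": "AccountId", "type": "int"},
      {"name": "AccountType", "type": ["null", "string"]}]}}]}
]}
"""

def matches(value, schema):
    """Return True if `value` fits `schema` (hypothetical helper, subset of Avro types)."""
    if isinstance(schema, list):
        # A union matches if any one of its branches matches.
        return any(matches(value, branch) for branch in schema)
    if isinstance(schema, str):
        if schema == "null":
            return value is None
        if schema == "string":
            return isinstance(value, str)
        if schema == "int":
            # Avro int is a 32-bit signed integer; bool is excluded explicitly.
            return (isinstance(value, int) and not isinstance(value, bool)
                    and -2**31 <= value < 2**31)
        return False  # other primitive types are omitted in this sketch
    if schema["type"] == "array":
        return isinstance(value, list) and all(matches(v, schema["items"]) for v in value)
    if schema["type"] == "record":
        fields = schema["fields"]
        return (isinstance(value, dict)
                and set(value) == {f["name"] for f in fields}
                and all(matches(value[f["name"]], f["type"]) for f in fields))
    return False

schema = json.loads(SCHEMA_JSON)

# Hypothetical sample data: one account with two sub-accounts.
account = {
    "AccountId": 1,
    "AccountName": "Main",
    "Accounts": [
        {"AccountId": 2, "AccountType": None},
        {"AccountId": 3, "AccountType": "savings"},
    ],
}

print(matches(account, schema))
```

Because "Accounts" is a union of "null" and an array, a record with "Accounts" set to None also fits the schema, while a record that leaves the field out entirely does not.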

The Avro file format offers the following:

In Hadoop, Avro is a row-based storage format and a widely used serialization platform. Avro stores the data definition (the schema) in JSON format, so it is easy for any program to read and interpret. The data itself is stored in a binary format, which makes it compact and efficient.
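To illustrate why the binary format is compact, here is a rough sketch in Python of how the Avro specification encodes a long (zigzag plus variable-length bytes) and a string (a length followed by UTF-8 bytes). A record is simply its fields encoded one after another in schema order, with no field names. The helper names and the sample record are assumptions for illustration; a real project would use an Avro library rather than hand-written encoders.

```python
import json

def encode_long(n):
    """Encode a 64-bit integer with zigzag + variable-length encoding."""
    z = (n << 1) ^ (n >> 63)      # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:              # more than 7 bits left: emit a continuation byte
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def encode_string(s):
    """Encode a string as its UTF-8 byte length (a long) followed by the bytes."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw

def encode_user(record):
    """Encode a record of the thecodebuzz_schema: fields in schema order, no names."""
    return (encode_string(record["username"])
            + encode_string(record["email"])
            + encode_long(record["timestamp"]))

# Hypothetical sample record.
user = {"username": "alice", "email": "alice@example.com", "timestamp": 1700000000}

binary = encode_user(user)
as_json = json.dumps(user).encode("utf-8")
print(len(binary), "bytes as Avro binary")
print(len(as_json), "bytes as JSON text")
```

The binary form is smaller mainly because field names and quoting are not repeated for every record; they live once in the schema.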

Avro also handles schema changes such as added, missing, and changed fields. As a result, old programs can read new data and new programs can read old data. This makes the format ideal for storing data in a data lake landing zone. The Avro format has a flexible data structure that allows us to create records containing an array, an enumerated type, and a sub-record.
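The idea of old and new programs reading each other's data can be sketched as follows. This is a simplified illustration in Python, not how an Avro library is implemented: a real reader resolves the writer's schema against the reader's schema while decoding, whereas this sketch works on already decoded dictionaries. The `resolve` helper and the added "country" field with its default are hypothetical.

```python
# Version 1 of the reader schema: the original three fields.
READER_V1 = {
    "type": "record",
    "name": "thecodebuzz_schema",
    "fields": [
        {"name": "username", "type": "string"},
        {"name": "email", "type": "string"},
        {"name": "timestamp", "type": "long"},
    ],
}

# Version 2 adds a hypothetical field with a default value.
READER_V2 = {
    "type": "record",
    "name": "thecodebuzz_schema",
    "fields": READER_V1["fields"] + [
        {"name": "country", "type": "string", "default": "unknown"},
    ],
}

def resolve(record, reader_schema):
    """Project a decoded record onto the reader's schema (hypothetical helper)."""
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            result[name] = record[name]          # field known to both sides
        elif "default" in field:
            result[name] = field["default"]      # field added later: use the default
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return result                                # fields unknown to the reader are dropped

old_record = {"username": "alice", "email": "alice@example.com", "timestamp": 1700000000}
new_record = dict(old_record, country="IN")

# A new program reading old data fills in the default.
print(resolve(old_record, READER_V2))
# An old program reading new data ignores the extra field.
print(resolve(new_record, READER_V1))
```

Giving every newly added field a default is what keeps both directions working; without one, a new reader has no value to use for old records.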

The Avro file format is a strong candidate for storing data in a data lake landing zone. An Avro file is made up of blocks, where each block records its object count and size and can hold compressed data. There is no need to store the schema separately, because it is kept in the file header and read together with the data.
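The layout of an Avro object container file, as described in the Avro specification, can be sketched in plain Python: a header with the magic bytes, a metadata map carrying the schema, and a 16-byte sync marker, followed by data blocks that each store an object count, a byte size, the serialized objects, and the sync marker again. The sketch below uses the "null" codec (no compression) and a hypothetical single-field `Event` schema; the function names are illustrative, and the output has not been checked against a real Avro library.

```python
import io
import json
import os

MAGIC = b"Obj\x01"  # first four bytes of every Avro container file

def encode_long(n):
    """Zigzag + variable-length encoding of a long."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def decode_long(stream):
    """Read one zigzag variable-length long from a binary stream."""
    shift = result = 0
    while True:
        byte = stream.read(1)[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)

def encode_bytes(b):
    return encode_long(len(b)) + b

def write_header(schema, sync):
    """Header: magic, metadata map (schema and codec), sync marker."""
    meta = {"avro.schema": json.dumps(schema).encode(), "avro.codec": b"null"}
    out = bytearray(MAGIC)
    out += encode_long(len(meta))                 # one map block holding all entries
    for key, value in meta.items():
        out += encode_bytes(key.encode()) + encode_bytes(value)
    out += encode_long(0)                         # a zero count ends the map
    return bytes(out) + sync

def write_block(encoded_objects, sync):
    """Data block: object count, payload size in bytes, payload, sync marker."""
    payload = b"".join(encoded_objects)
    return encode_long(len(encoded_objects)) + encode_long(len(payload)) + payload + sync

def read_container(data):
    """Return the embedded schema and a list of (object count, payload) blocks."""
    stream = io.BytesIO(data)
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro container file")
    meta = {}
    while True:
        count = decode_long(stream)
        if count == 0:
            break
        if count < 0:                             # negative count: a byte size follows
            decode_long(stream)
            count = -count
        for _ in range(count):
            key = stream.read(decode_long(stream)).decode()
            meta[key] = stream.read(decode_long(stream))
    sync = stream.read(16)
    blocks = []
    while stream.tell() < len(data):
        count = decode_long(stream)
        size = decode_long(stream)
        payload = stream.read(size)
        if stream.read(16) != sync:
            raise ValueError("sync marker mismatch")
        blocks.append((count, payload))
    return json.loads(meta["avro.schema"]), blocks

# Hypothetical schema: a record with one long field encodes as just that long.
event_schema = {"type": "record", "name": "Event",
                "fields": [{"name": "timestamp", "type": "long"}]}
sync_marker = os.urandom(16)
timestamps = [1700000000, 1700000060, 1700000120]

file_bytes = (write_header(event_schema, sync_marker)
              + write_block([encode_long(t) for t in timestamps], sync_marker))
schema_back, blocks = read_container(file_bytes)
print(schema_back["name"], [count for count, _ in blocks])
```

Since the schema travels in the header, a reader needs nothing but the file itself to interpret the blocks.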

The Avro format examples above can be used and adapted according to your needs.