Apache Avro is a data serialization system that offers rich data structures. An Avro file is a container file that stores persistent data. Avro integrates easily with dynamic languages, and no code generation is required to read or write data files.
AVRO FILE EXAMPLE 1
{
  "type": "record",
  "name": "thecodebuzz_schema",
  "namespace": "thecodebuzz.avro",
  "fields": [
    {
      "name": "username",
      "type": "string",
      "doc": "Name of the user account on Thecodebuzz.com"
    },
    {
      "name": "email",
      "type": "string",
      "doc": "The email of the user logging message on the blog"
    },
    {
      "name": "timestamp",
      "type": "long",
      "doc": "time in seconds"
    }
  ],
  "doc": "A basic schema for storing thecodebuzz blogs messages"
}
Sample Avro File Format
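To make the role of the schema concrete, here is a minimal Python sketch that parses the record schema above and checks a record against it. It uses only the standard library rather than a real Avro library, the schema is trimmed to the keys the check needs, and `check_record`, `good`, and `bad` are hypothetical names made up for this illustration.

```python
import json

# The record schema from the example above, trimmed to the keys used here.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "thecodebuzz_schema",
  "namespace": "thecodebuzz.avro",
  "fields": [
    {"name": "username", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "timestamp", "type": "long"}
  ]
}
"""

# Python types accepted for the two Avro primitives this schema uses.
PYTHON_TYPES = {"string": str, "long": int}


def check_record(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record fits."""
    problems = []
    for field in schema["fields"]:
        name, avro_type = field["name"], field["type"]
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], PYTHON_TYPES[avro_type]):
            problems.append(f"{name} should be {avro_type}")
    return problems


schema = json.loads(SCHEMA_JSON)
# An Avro full name is the namespace and the name joined by a dot.
full_name = f'{schema["namespace"]}.{schema["name"]}'

good = {"username": "alice", "email": "alice@example.com", "timestamp": 1700000000}
bad = {"username": "alice", "timestamp": "yesterday"}

print(full_name)
print(check_record(good, schema))
print(check_record(bad, schema))
```

A real Avro library performs a much stricter check; this sketch only shows that the schema is ordinary JSON that any program can load and apply.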
AVRO Schema Example
{
  "type": "record",
  "name": "AvrosampleNetCore.AccountDetails",
  "fields": [
    {
      "name": "AccountId",
      "type": "int"
    },
    {
      "name": "AccountName",
      "type": "string"
    },
    {
      "name": "Accounts",
      "type": [
        "null",
        {
          "type": "array",
          "items": {
            "type": "record",
            "name": "AvrosampleNetCore.SubAccounts",
            "fields": [
              {
                "name": "AccountId",
                "type": "int"
              },
              {
                "name": "AccountType",
                "type": [
                  "null",
                  "string"
                ]
              }
            ]
          }
        }
      ]
    }
  ]
}
Sample Avro Schema
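The schema above combines a union (`["null", ...]`), an array, and a nested record. The following Python sketch shows which values such a schema accepts. It is a toy checker covering only the types used in this schema, not a replacement for an Avro library, and `matches` and the sample records are hypothetical names for this illustration. One simplification to note: a field missing from a record is treated as null.

```python
SUB_ACCOUNTS = {
    "type": "record",
    "name": "AvrosampleNetCore.SubAccounts",
    "fields": [
        {"name": "AccountId", "type": "int"},
        {"name": "AccountType", "type": ["null", "string"]},
    ],
}

ACCOUNT_DETAILS = {
    "type": "record",
    "name": "AvrosampleNetCore.AccountDetails",
    "fields": [
        {"name": "AccountId", "type": "int"},
        {"name": "AccountName", "type": "string"},
        {"name": "Accounts", "type": ["null", {"type": "array", "items": SUB_ACCOUNTS}]},
    ],
}


def matches(value, avro_type) -> bool:
    """Toy check of a Python value against an Avro type (subset only)."""
    if isinstance(avro_type, list):
        # Union: the value is valid if any one branch accepts it.
        return any(matches(value, branch) for branch in avro_type)
    if isinstance(avro_type, dict):
        if avro_type["type"] == "array":
            return isinstance(value, list) and all(
                matches(item, avro_type["items"]) for item in value
            )
        if avro_type["type"] == "record":
            # Simplification: a missing field is looked up as None (null).
            return isinstance(value, dict) and all(
                matches(value.get(f["name"]), f["type"]) for f in avro_type["fields"]
            )
        return False
    if avro_type == "null":
        return value is None
    if avro_type == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if avro_type == "string":
        return isinstance(value, str)
    return False


with_subs = {
    "AccountId": 1,
    "AccountName": "main",
    "Accounts": [
        {"AccountId": 2, "AccountType": None},
        {"AccountId": 3, "AccountType": "savings"},
    ],
}
without_subs = {"AccountId": 1, "AccountName": "main", "Accounts": None}
wrong = {"AccountId": 1, "AccountName": "main", "Accounts": [{"AccountId": "x"}]}

print(matches(with_subs, ACCOUNT_DETAILS))
print(matches(without_subs, ACCOUNT_DETAILS))
print(matches(wrong, ACCOUNT_DETAILS))
```

The `"null"` branch is what makes `Accounts` and `AccountType` optional: a record may carry `None` there and still fit the schema.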
The Avro data source for Apache Spark supports the following:
- Partitioning: You can read and write partitioned data without any extra configuration.
- Record names: You can set the record name and namespace by passing a map of parameters with recordName and recordNamespace.
- Schema conversion: Avro records and Apache Spark SQL types are converted automatically.
- Compression: You can choose the compression to use when writing Avro out to disk.
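The features listed above could be used from PySpark roughly as follows. This is a sketch under several assumptions: a Spark installation with the Avro data source available, a DataFrame that has a column to partition by (the column name `year` is hypothetical), and illustrative values for the record name and namespace. The helper names `write_partitioned_avro` and `read_avro` are made up for this example.

```python
# Write options for Spark's Avro data source.
AVRO_WRITE_OPTIONS = {
    "recordName": "AccountDetails",          # hypothetical record name
    "recordNamespace": "AvrosampleNetCore",  # hypothetical namespace
    "compression": "snappy",
}


def write_partitioned_avro(df, path, partition_column="year"):
    """Write a Spark DataFrame as Avro files partitioned by one column."""
    (
        df.write
        .format("avro")
        .options(**AVRO_WRITE_OPTIONS)
        .partitionBy(partition_column)
        .save(path)
    )


def read_avro(spark, path):
    """Read an Avro directory back into a DataFrame using a SparkSession."""
    return spark.read.format("avro").load(path)
```

Schema conversion needs no code here: Spark derives the Avro schema from the DataFrame when writing and rebuilds the DataFrame schema when reading.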
In the Hadoop ecosystem, Avro is a widely used, row-based storage and serialization format. Avro stores the data definition (the schema) in JSON, so any program can read and interpret it easily. The data itself is stored in a binary format, which makes it compact and efficient.
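To see why the binary form is compact, here is a small Python sketch of how Avro's binary encoding writes a record with two strings and a long, as in the username/email/timestamp schema shown earlier. Avro writes a long as a zig-zag, variable-length integer and a string as its byte length followed by UTF-8 bytes. The sketch covers only the record body, not the container file around it, and the function names and sample record are hypothetical.

```python
import json


def encode_long(n: int) -> bytes:
    """Encode a 64-bit signed integer as a zig-zag variable-length integer."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small magnitudes become small numbers
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, with the "more bytes" flag
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 length (a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_user_record(record: dict) -> bytes:
    """Fields are written in schema order, with no names or separators."""
    return (
        encode_string(record["username"])
        + encode_string(record["email"])
        + encode_long(record["timestamp"])
    )


record = {"username": "alice", "email": "alice@example.com", "timestamp": 1700000000}
binary = encode_user_record(record)
as_json = json.dumps(record).encode("utf-8")

print(len(binary), "bytes as Avro binary")
print(len(as_json), "bytes as JSON text")
```

The binary form is smaller because field names are not repeated in every record; they live once in the schema.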
Avro also handles schema changes such as added, missing, and changed fields. As a result, old programs can read new data and new programs can read old data. This makes the format well suited to storing data in a data lake landing zone. The Avro format also has a flexible data structure that lets us create records containing arrays, enumerated types, and sub-records.
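The idea behind reading old data with a new schema, and new data with an old one, can be sketched in a few lines of Python. This is a simplified illustration of the rule that a reader ignores fields it does not know and fills in defaults for fields the writer did not provide. It is not the full Avro resolution algorithm, and the `country` field, its default, and the `resolve` helper are hypothetical.

```python
# Version 1 of the schema's fields, and version 2 with one added field.
V1_FIELDS = [
    {"name": "username", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "timestamp", "type": "long"},
]
V2_FIELDS = V1_FIELDS + [
    {"name": "country", "type": "string", "default": "unknown"},  # hypothetical field
]


def resolve(writer_record: dict, reader_fields: list) -> dict:
    """Project a written record onto the fields the reader expects."""
    resolved = {}
    for field in reader_fields:
        name = field["name"]
        if name in writer_record:
            resolved[name] = writer_record[name]
        elif "default" in field:
            resolved[name] = field["default"]  # added field: use the default
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return resolved  # fields the reader does not declare are dropped


old_record = {"username": "alice", "email": "alice@example.com", "timestamp": 1700000000}
new_record = dict(old_record, country="NL")

print(resolve(old_record, V2_FIELDS))  # new program reading old data
print(resolve(new_record, V1_FIELDS))  # old program reading new data
```

The default value is what keeps the change compatible: without it, the new program would have no value for `country` when reading old data.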
The Avro file format is a strong candidate for storing data in a data lake landing zone. An Avro file is divided into blocks, and each block holds an object count, a size, and the (optionally compressed) objects. There is no need to store the schema separately, because it is kept in the file header and read together with the data.
The examples above show the Avro format in practice. Go through them and adapt them to your needs.
Hello, I am Denail Soovy. I am a developer working with a range of technologies. I am passionate about teaching and teach many students every day. I want to share my knowledge with developers and anyone else who needs it.
I try to make learning easy for every student through simple, up-to-date blog posts.