Ingestion
You can send Log events to Parseable via HTTP POST requests with data as JSON payload. You can use the HTTP output plugins of all the common logging agents like FluentBit, Vector, syslog-ng, LogStash, among others to send log events to Parseable.
You can also directly integrate Parseable with your application via REST API calls.
Log Streams
Log streams are logical (and physical) collections of related log events. For example, in a Kubernetes cluster, you can have a log stream for each application or a log stream for each namespace - depending on how you want to query the data. A log stream is identified by a unique name, and role based access control, alerts, and notifications are supported at the log stream level.
To start sending logs, you'll need to create a log stream first, via the Console Create Log Stream
button.
Schema
Schema is the structure of the log event. It defines the fields and their types. Parseable supports two types of schema - dynamic and static. You can choose the schema type while creating the log stream. Additionally, if you want to enforce a specific schema, you'll need to send that schema at the time of creating the log stream.
At any point in time, you can fetch the schema of a log stream on the Console or the Get Schema API.
Dynamic
Log streams by default have dynamic schema. This means you don't need to define a schema for a log stream. The Parseable server detects the schema from first event. If there are subsequent events (with new schema), it updates internal schema accordingly.
Log data formats evolve over time, and users prefer a dynamic schema approach, where they don't have to worry about schema changes, and they are still able to ingest events to a given stream.
For dynamic schema, Parseable doesn't allow changing the type of an existing column whose type is already set. For example, if a column is detected as string
in the first event, it can't be changed to int
or timestamp
in a later event. If you'd like to force a specific schema, you can set the schema while creating the stream.
Static
In some cases, you may want to enforce a specific schema for a log stream. You can do this by setting the static schema flag while creating the log stream. This schema will be enforced for all the events ingested to the stream. You'll need to provide the schema in the form of a JSON object with field names and their types, with the create stream API call. The following types are supported in the schema: string
, int
, float
, timestamp
, boolean
.
Partitioning
By default, the log events are partitioned based on the p_timestamp
field. p_timestamp
is an internal field added by Parseable to each log event. This field specifies the time when the Parseable server received this event. Parseable adds this field to ensure there is always a time axis to the log events, so it becomes easier to query the events based on time. Refer to the historical data ingestion section for more details.
You can also partition the log events based on a custom time field. For example, if you're sending events that contain a field called datetime
(a column that has a timestamp in a valid format), you can specify this field as the partition field. This helps speed up the query performance when you're querying based on the partition field. Refer to the custom partitioning section for more details.
Flattening
Nested JSON objects are automatically flattened. For example, the following JSON object
{
"foo": {
"bar": "baz"
}
}
will be flattened to
{
"foo.bar": "baz"
}
before it gets stored. While querying, this field should be referred to as foo.bar
. For example, select foo.bar from <stream-name>
. The flattened field will be available in the schema as well.
Batching and Compression
Wherever applicable, we recommend enabling the log agent's compression and batching features to reduce network traffic and improve ingestion performance. The maximum payload size in Parseable is 10 MiB (10485760 Bytes). The payload can contain single log event as a JSON object or multiple log events in a JSON array. There is no limit to the number of batched events in a single call.
Timestamp
Correct time is critical to understand the proper sequence of events. Timestamps are important for debugging, analytics, and deriving transactions. We recommend that you include a timestamp in your log events formatted in RFC3339 format.
Parseable uses the event-received timestamp and adds it to the log event in the field p_timestamp
. This ensures there is a time reference in the log event, even if the original event doesn't have a timestamp. If you'd like to use your own timestamp instead for partitioning of data, please refer the documentation here.