Parseable is a free and open source, purpose-built log observability system. We use advancements in big data and analytics to bring the best of efficiency, simplicity, and performance to log data. Learn why and how we're building Parseable.
Motivation
Traditionally, logging has been seen as only a text search problem. Log data was mostly textual and volumes were low, so ingestion and storage were not really issues. This led us to today, where all the logging platforms are primarily indexing-based text search engines.
Log volumes are exploding now. With the new emphasis on observability, log data plays an even bigger role in overall business reliability and security. So today's logging challenges are different: ingestion, storage, correlation, and observability, all at scale.
With increased demand, the software industry responded with fully managed service offerings (SaaS). Such offerings abstract away the complexities of running a large-scale log unification platform, and are great at time to value. As soon as you have some data ingested, you can see dashboards, get alerts, and go about your business.
But pretty soon, users realize they are in a walled garden with limited flexibility and high costs. Moving data out of the system is difficult, and integration with other systems is not possible. Additionally, these systems are black boxes with no visibility into how your data is stored or processed.
If you're looking for freedom, interoperability, and complete data ownership, and want to avoid vendor lock-in, there are just no great open source alternatives.
What is Parseable?
Parseable is a free and open source, purpose-built log observability system. It is a lightweight, high-throughput system designed to ingest, store, and query log data at scale.
Written in Rust to guarantee memory safety, performance, and stability. In our tests, the Parseable server uses 50% less CPU and 80% less memory than Elasticsearch for a similar ingestion workload.
Apache Arrow and Parquet for the best of interoperability and performance. Using open data formats like Apache Arrow and Parquet as the backing formats gives you full physical access to your data. More importantly, you have the freedom to use tools from the huge Parquet ecosystem to deeply analyze and visualize your data.
Cloud native, stateless design. Traditional log stores manage several tiers on disk and snapshots on object stores. Local disk-based state management runs counter to the cloud native paradigm. Some of the most performant, petascale analytical systems already use object stores as their primary store, with the best results. We're bringing this approach to log data.
Wide availability. In the absence of a simple logging engine, developers have been conditioned to think about logging and reliability only at later stages. With Parseable, we want to change this. A system like Parseable will enable logging and observability to be first-class citizens in the software development lifecycle.
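To make the columnar-format point above more concrete, here is a minimal pure-Python sketch of the idea behind formats like Arrow and Parquet: the same log events held row-wise and then column-wise. It does not use the Arrow or Parquet libraries and is not how Parseable is implemented; the field names, sample events, and helper functions are hypothetical.

```python
# Hypothetical sample log events, row-oriented: one dict per event.
rows = [
    {"ts": 1, "level": "info", "msg": "server started"},
    {"ts": 2, "level": "error", "msg": "disk full"},
    {"ts": 3, "level": "info", "msg": "request served"},
    {"ts": 4, "level": "error", "msg": "timeout"},
]


def to_columns(events):
    """Pivot row-oriented events into one list per field.

    Assumes every event carries the same set of fields, so the
    resulting lists stay aligned by position.
    """
    columns = {}
    for event in events:
        for field, value in event.items():
            columns.setdefault(field, []).append(value)
    return columns


def count_level(columns, level):
    """Count events with a given level by scanning only the 'level' column."""
    return sum(1 for value in columns["level"] if value == level)


columns = to_columns(rows)
error_count = count_level(columns, "error")
print(error_count)
```

The point of the sketch is that an aggregate over one field only has to read that field's values, which is one reason columnar layouts tend to suit analytical queries over logs.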
Design Goals
Best time to value. Make it easy for our users to set up and deploy Parseable. Ingest log data seamlessly from a wide variety of sources and generate valuable insights within minutes.
Performance. With Rust and columnar data formats as the base, Parseable is geared for high throughput and low latency. We'll keep adding performance improvements as we move forward.
Leverage cloud era infrastructure. Elasticity and commodity storage are key benefits of the cloud era. It is possible to scale Parseable up or down based on load. You can even turn off the container or VM it is running on if there is no traffic, and spin it back up later. In the future, as we build distributed Parseable, we'll stay true to this design and allow scaling elastically, on the fly.
Extensibility. Today, it is difficult to use a dashboard or query engine of your choice with any logging platform. We want to change this with an extensible, API-based design, an open storage data format, and a free and open source platform.
Current State
As of writing this post, Parseable is available as a binary and as a container image for Linux and macOS (x86). If you're interested in Windows or ARM support for the Parseable release, please let us know.
Quick start and installation flows are well documented. The API is also well documented, with samples available on our Postman workspace.
Log ingestion is documented for popular logging agents like Fluent Bit, Vector, and syslog-ng. In many cases users prefer Kafka or compatible systems as their log data pipeline and use a sink connector to ingest data into Parseable. Refer to the Kafka and Redpanda docs for this approach.
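For readers who want to see what sending events over HTTP looks like without an agent, here is a rough sketch that builds (but does not send) a POST request carrying a batch of JSON log events. The endpoint path, server address, stream name, credentials, and the `build_ingest_request` helper are all assumptions for illustration; check the API documentation for the actual routes and authentication scheme.

```python
import base64
import json
import urllib.request


def build_ingest_request(base_url, stream, events, username, password):
    """Build (but do not send) an HTTP POST with a batch of JSON log events.

    The URL path used here is an assumption for illustration only.
    """
    url = f"{base_url}/api/v1/logstream/{stream}"  # assumed endpoint path
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(events).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",  # assumes HTTP basic auth
        },
        method="POST",
    )


request = build_ingest_request(
    "http://localhost:8000",  # hypothetical local server address
    "demo",                   # hypothetical stream name
    [{"level": "info", "msg": "hello"}],
    "user",                   # placeholder credentials
    "pass",
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch safe to run without a server, and makes the payload and headers easy to inspect.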
Parseable ships with a built-in UI for log search. It lets you filter logs based on tags and metadata fields and perform text search. We have big plans for the UI, but if you're looking for a more familiar interface to start with, try out the Parseable Grafana data source plugin. It lets you plug Parseable data into your Grafana dashboard for deeper analysis.
The server is thoroughly tested. We have been able to push a single-node deployment to 100K+ events per second, with JSON log events of 1 KB each. Standard search queries (without caching, served from the object store) take a few seconds, with the majority of the time spent in network calls.
Parseable is already being used in production at several organizations.
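As a back-of-the-envelope check on what that ingestion rate means in raw volume, the arithmetic can be sketched as follows. It assumes 1 KB = 1,000 bytes and a rate sustained for a full day, and it counts raw event bytes only (no compression or protocol overhead); those assumptions are ours, not measured figures.

```python
EVENTS_PER_SECOND = 100_000  # figure quoted in the post
EVENT_SIZE_BYTES = 1_000     # assumes 1 KB = 1,000 bytes
SECONDS_PER_DAY = 24 * 60 * 60

# Raw payload volume, before any compression or overhead.
bytes_per_second = EVENTS_PER_SECOND * EVENT_SIZE_BYTES
bytes_per_day = bytes_per_second * SECONDS_PER_DAY

mb_per_second = bytes_per_second / 1_000_000
tb_per_day = bytes_per_day / 1_000_000_000_000

print(f"{mb_per_second:.0f} MB/s, {tb_per_day:.2f} TB/day")
# prints "100 MB/s, 8.64 TB/day"
```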
License
Parseable is licensed under AGPLv3. We chose AGPLv3 (over any source-available license) because it is an OSI-approved license. Being open source will always be at the core of who we are, and we believe that AGPLv3 allows us to stay true to the free and open source software ideology.
Additionally, AGPLv3 provides a framework for us to build a sustainable business around the open source project. It allows us to capture some of the value the project creates commercially, so we can continue to invest in the project and the community. We believe this is the best way to ensure the long-term success of the project.
Future
As the community and adoption grow, we want to deliver a more refined experience for the log unification use case and make it even easier to ingest data from a wide variety of ecosystems. Upcoming key integrations:
- AWS Lambda extension.
- AWS ELB & Kinesis log ingestion.
- OpenTelemetry collector.
On the backend, we're working on making Parseable schemaless, which allows an even better user experience for microservice and container-based architectures. Additionally, as we see more production use cases and higher volumes, we'll work towards evolving Parseable into a distributed platform with high availability. We'll also work on optimizing the query engine for better performance, and possibly look at caching.
On the data visualization and analysis side, we will improve the Parseable UI extensively to allow better data drill-down, correlation, and context information per log event.
Follow along on our roadmap for more details.
Get Involved
If you just found out about Parseable and found it interesting, visit us on our website, or catch up on Twitter @parseableio. The Parseable Slack community is available to help you anytime, whether you have an issue with setup or usage, or just want to chat.
Attribution
Parseable uses several open source components under the hood. We'd like to thank the community for their work and contributions. Some of the key components are