A feature extraction and storage technology that enables real-time analytics and AI initiatives, making model-ready data accessible, usable, and reusable across organizations.
Purpose-built feature storage that automatically converts all of your data into a model-ready format.
Our taps extract features from raw data at the source (wherever it is – even if your data is in a multitude of data centers) and pull those features into FeatureBase. Feature extraction can be executed server-side or client-side.
With client-side extraction, less data is transferred, which lowers cost. Features can be ingested into FeatureBase in two ways: continuous updates (streaming) and batch. FeatureBase pulls from sources such as Kafka and ingests data formats such as CSV, JSON, Avro, and Parquet.
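To make the client-side extraction point concrete, here is a minimal sketch of why extracting features before transfer shrinks the payload. The event fields, the `extract_features` function, and the bucketing rule are all hypothetical illustrations and are not part of FeatureBase or its taps.

```python
import json


def extract_features(event: dict) -> dict:
    """Hypothetical client-side extraction: keep only model-relevant features."""
    return {
        "user_id": event["user_id"],
        # Boolean feature derived from a long raw string.
        "is_mobile": "Mobile" in event["user_agent"],
        # Bucket the amount into steps of 100, capped at bucket 9.
        "amount_bucket": min(int(event["amount"] // 100), 9),
    }


def payload_size(records: list) -> int:
    """Size in bytes of the records when serialized as JSON."""
    return len(json.dumps(records).encode("utf-8"))


# Made-up raw events; "session_blob" stands in for bulky raw data.
raw_events = [
    {"user_id": 1, "user_agent": "Mozilla/5.0 (iPhone; Mobile)",
     "amount": 250.0, "session_blob": "x" * 200},
    {"user_id": 2, "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
     "amount": 1250.0, "session_blob": "y" * 200},
]

features = [extract_features(event) for event in raw_events]
raw_bytes = payload_size(raw_events)
feature_bytes = payload_size(features)

print(f"raw payload: {raw_bytes} bytes, feature payload: {feature_bytes} bytes")
```

Only the extracted features would cross the network in this sketch, which is the source of the transfer savings described above.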
Store + Manage Features
Features are stored in-memory in a high performance model-ready format. Users experience low-latency exploration of features and are able to manage users and infrastructure via the control plane. FeatureBase employs a purpose-built feature storage solution rather than cobbling together existing data-centric technologies.
Leverage our APIs, such as HTTP, gRPC (Python), and Postgres Wire Protocol (SQL), for end-user applications including real-time decisioning, analytical applications, and model training. Queries and transforms happen within this step in your model or code.
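As a rough sketch of what calling an HTTP API from application code might look like, the snippet below builds (but does not send) a query request using only the Python standard library. The host, port, `/query` path, SQL text, and table name are placeholders assumed for illustration; they are not taken from the FeatureBase documentation, so check the actual API reference before use.

```python
from urllib.request import Request

# All of these values are hypothetical placeholders, not documented endpoints.
HYPOTHETICAL_BASE_URL = "http://localhost:8080"
HYPOTHETICAL_QUERY_PATH = "/query"


def build_query_request(base_url: str, sql: str) -> Request:
    """Build an HTTP POST request carrying a SQL string as the body."""
    return Request(
        url=base_url.rstrip("/") + HYPOTHETICAL_QUERY_PATH,
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_query_request(
    HYPOTHETICAL_BASE_URL,
    "SELECT COUNT(*) FROM customers WHERE region = 'us'",  # made-up table
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is assumed.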
FeatureBase simplifies, accelerates, and improves control over data to power real-time analytics and AI. FeatureBase is an overlay to conventional big data systems that automatically extracts features, not data, from each of the underlying data sources or data lakes and stores them in one centralized feature storage platform. FeatureBase maintains up-to-the-millisecond data updates with little to no upfront data preparation. This is achieved by reducing the dimensionality of the original data, effectively collapsing conventional data models (such as relational or star schemas) into a highly optimized format.
‘AI Ready’ feature storage that continuously extracts and updates features in real-time
Track and filter time at a feature level
Supports ML Workloads
High concurrency queries for machine-scale analytics and ML
Single Point of Access
Centralized, ultra low latency access to all of your data
Lossless reduction in data footprint, up to 85%, without copying or moving data
Performant Joins at query time, with no pre-aggregation or pre-processing
Extension framework enabling seamless integration into existing environments
FeatureBase is beneficial for organizations that need to access large quantities of real-time data transactions each day joined with many fragmented, terabyte-scale data sources. The workloads where we add the greatest value are the complex analytical ones where source data is fragmented across silos, and where a user or machine wants to apply a number of filters or criteria to a query that will return a subset of that data upon which to take business actions.
We have a variety of ingest plugins, including bulk SQL loaders, Kafka connectors (supporting Avro and the Confluent Schema Registry), and change data capture (CDC) plugins. We are constantly adding new ingest plugins; those that will be in production soon include Spark and Parquet. Our experienced customer engineering managers are trained to bring these integrations, customized for your complex data environment, to production during the implementation period.
FeatureBase extracts features at the original data source and then homomorphically compresses them for transmission and storage. The core format allows granular scans at a feature-by-feature level, rather than scans over a columnar or tabular data format. This enables breakthrough analytical performance, allowing for unprecedented iteration speed in feature engineering on the totality of large data sets.
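One common way to get feature-by-feature scans is a bitmap index, where each feature value owns a bitmap of the records that have it and filters become bitwise operations. The text above does not describe FeatureBase's internal format, so the sketch below is only an illustration of that general technique; the `FeatureBitmaps` class and the sample records are hypothetical.

```python
from collections import defaultdict


class FeatureBitmaps:
    """Toy bitmap index: one Python int per (feature, value) pair.

    Bit i of a bitmap is set when record i has that feature value.
    """

    def __init__(self):
        self._bitmaps = defaultdict(int)

    def set(self, feature: str, value: str, record_id: int) -> None:
        self._bitmaps[(feature, value)] |= 1 << record_id

    def rows(self, feature: str, value: str) -> int:
        # .get avoids creating empty entries for unknown values.
        return self._bitmaps.get((feature, value), 0)


def record_ids(bitmap: int) -> list:
    """Return the positions of the set bits, lowest first."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


# Made-up records for illustration.
records = [
    {"region": "us", "plan": "pro"},
    {"region": "eu", "plan": "free"},
    {"region": "us", "plan": "free"},
    {"region": "us", "plan": "pro"},
]

store = FeatureBitmaps()
for record_id, record in enumerate(records):
    for feature, value in record.items():
        store.set(feature, value, record_id)

# Filters touch only the bitmaps for the features involved.
us_and_pro = store.rows("region", "us") & store.rows("plan", "pro")
eu_or_pro = store.rows("region", "eu") | store.rows("plan", "pro")

print(record_ids(us_and_pro))
print(record_ids(eu_or_pro))
```

A query with several criteria reads one bitmap per criterion and combines them, without scanning whole rows or columns.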
FeatureBase can be deployed in hybrid environments, and is priced based on your organization’s unique consumption demands.
Molecula also offers a Managed Service option for FeatureBase. This offering is designed for companies that need fast, scalable analytics and don’t have the bandwidth to learn and implement a new solution.
Ingest data from a Kafka topic into FeatureBase
Ingest data via Kafka Connect into FeatureBase
Ingest data from MySQL database into FeatureBase
Ingest data from SQL Server database into FeatureBase
Ingest data from Snowflake data warehouse into FeatureBase
Ingest data from Cassandra database into FeatureBase
Ingest data from Teradata data warehouse into FeatureBase
Ingest Spark data streams into FeatureBase
Ingest Parquet files into FeatureBase
Ingest files from your S3 instances into FeatureBase
Ingest data from your BigQuery data warehouse into FeatureBase
Ingest data from any database into FeatureBase via ODBC
Monitor your instance with Prometheus
Monitor your instance with Splunk
Monitor your instance with Jaeger
Collect metrics about your instance with StatsD
Monitor your instance with OpenTracing
Monitor your instance with Datadog
Query your FeatureBase data from your Jupyter Notebook
Create Pandas data frames from FeatureBase
Query your FeatureBase data using Snowflake
Query your FeatureBase data using RStudio
Query your FeatureBase data using RAPIDS
Query your FeatureBase data with JDBC Driver
Leverage our gRPC API to connect custom applications
Visualize and query your real-time, mission-critical data
Read more about our product and related resources—
Unlock Human Potential with the Power of Real-Time Data: Molecula is an enterprise feature store […]
Calculate Your TCO with Molecula—
Molecula's novel approach to data access is game-changing for your machine-scale analytics and AI. By simplifying real-time analytics and AI infrastructure, Molecula can reduce footprint by 60-90% and save you time, resources, and headaches. Come see for yourself.