How is FeatureBase used by ML engineers/data scientists?
There are four stages in the machine learning life cycle where data scientists are using FeatureBase today.
- Most critically, data scientists with the proper permissions can use FeatureBase to get immediate, centralized access to continuously updated records about an organization's most important data. This data might cover customers, patients, merchants, and devices, and originate from dozens or even hundreds of systems. They can do this without asking IT to architect, deploy, or manage infrastructure for every project.
- They can explore data iteratively and in real time, which reduces or, often, completely eliminates the long information-request cycles between the data scientist and the data engineer or IT.
- FeatureBase eliminates the category-to-integer phase of data preparation because its core data format performs this encoding natively.
- Data scientists can work with FeatureBase directly from Jupyter notebooks using our Python Client Library, and they also continue to export data from FeatureBase into pandas DataFrames to leverage libraries like scikit-learn and imblearn. Creating pandas DataFrames from FeatureBase lets data scientists work with a much larger sample size.
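
For readers unfamiliar with the category-to-integer phase mentioned above, here is a minimal sketch of what that step typically looks like when done by hand in pandas. It does not use the FeatureBase client library. The DataFrame, column names, and values are hypothetical and exist only to illustrate the kind of manual encoding that a natively encoded format makes unnecessary.

```python
import pandas as pd

# Hypothetical sample records; column names and values are illustrative only.
records = pd.DataFrame(
    {
        "customer_id": [101, 102, 103, 104],
        "segment": ["retail", "enterprise", "retail", "smb"],
    }
)

# Manual category-to-integer step: convert the text column to a categorical
# type, then take the integer code pandas assigns to each distinct label.
segment = records["segment"].astype("category")
records["segment_code"] = segment.cat.codes

# Keep the code-to-label lookup so the integers can be decoded later.
code_to_label = dict(enumerate(segment.cat.categories))

print(records)
print(code_to_label)
```

In a hand-built pipeline, this mapping has to be created, stored, and kept consistent across every categorical column and every refresh of the data, which is the overhead the bullet above refers to.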