Mat Keep, Senior Director of Product Marketing for MongoDB, is a high-performing product marketing and management leader with a track record of developing and delivering high-growth product adoption and go-to-market strategies. He is focused on building businesses around open…
5 key questions for app-driven analytics
The data that drives applications and the data that drives analytics typically live in separate domains of an organization’s data estate, mainly because they serve different strategic purposes.
Applications are used to engage customers, while analytics are for insights. The two classes of workloads have different requirements—such as read and write access patterns, concurrency, and latency—so organizations typically deploy purpose-built databases and duplicate data between them to satisfy the unique requirements of each use case.
Although these systems are different, they are also highly interdependent in today’s digital economy. Application data is fed into analytics platforms, where it is combined and enriched with other operational and historical data, supplemented with business intelligence (BI), machine learning (ML) and predictive analytics, and sometimes fed back into applications to deliver richer experiences.
Consider, for example, an e-commerce system that segments users by demographic data and previous purchases and then makes relevant recommendations the next time they visit the site.
The process of moving data between the two types of systems is here to stay. But today it is not enough. The current digital economy, with its seamless user experiences that customers have come to expect, requires applications to also become smarter, autonomously performing intelligent actions in real time on our behalf.
Alongside smarter apps, companies also need faster insights so they know what is happening “in the moment.”
To meet these requirements, we can no longer rely solely on copying data out of our operational systems to centralized analysis repositories. Moving data takes time and creates too much separation between application events and analytical actions.
Instead, analytics processing must “shift left” to the data source – to the applications themselves. We call this shift application-driven analytics. And it’s a shift that both developers and analytics teams must be ready to embrace.
Define required capabilities
Embracing the shift is one thing; having the capabilities to implement it is another. In this article, we’ll break down the capabilities required to implement application-driven analytics into the following five critical questions for developers:
- How do developers access the tools they need to build sophisticated analytics queries directly into application code?
- How do developers make sense of massive streams of time series data?
- How do developers create intelligent applications that automatically respond to events in real time?
- How do developers combine live application data in hot database storage with legacy data in cooler cloud storage to make predictions?
- How can developers bring analytics into applications without compromising performance?
1. How do developers access the tools they need to build sophisticated analytics queries directly into application code?
To unlock the latent power of application data that exists across the data landscape, developers rely on the ability to perform CRUD (create, read, update, and delete) operations, sophisticated aggregations, and data transformations.
The primary tool for delivering these capabilities is an API that allows them to query data in any way they need, from simple lookups to building more sophisticated data processing pipelines. Developers need the API implemented as an extension of their preferred programming language to stay “in the zone” while working through problems in a state of flow.
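As a rough illustration of the range from "simple lookups" to "more sophisticated data processing pipelines," here is a minimal sketch in plain Python. It is not any particular database's API: the `orders` records and the `match` and `group_sum` helpers are hypothetical names invented for this example, and a real query API would push this work down to the database rather than run it in application memory.

```python
from collections import defaultdict

# Hypothetical in-memory order records standing in for application data.
orders = [
    {"customer": "ana", "status": "shipped", "total": 40.0},
    {"customer": "ben", "status": "shipped", "total": 15.5},
    {"customer": "ana", "status": "pending", "total": 22.0},
    {"customer": "ana", "status": "shipped", "total": 10.0},
]

def match(records, predicate):
    """Filter stage: keep only the records that satisfy the predicate."""
    return [r for r in records if predicate(r)]

def group_sum(records, key, field):
    """Group stage: sum `field` for each distinct value of `key`."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r[field]
    return dict(totals)

# Simple lookup: all orders for one customer.
ana_orders = match(orders, lambda r: r["customer"] == "ana")

# Two-stage pipeline: keep shipped orders, then total revenue per customer.
shipped = match(orders, lambda r: r["status"] == "shipped")
shipped_revenue = group_sum(shipped, "customer", "total")
```

Because each stage is an ordinary function in the host language, stages compose without the developer switching to a separate query dialect, which is the "in the zone" property described above.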
Alongside a powerful API, developers need a versatile query engine and indexing that return results as efficiently as possible. Without an index, the database engine must scan every record to find a match. With an index, the database can find relevant results faster and with less overhead.
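The difference between scanning and indexed lookup can be sketched in a few lines of Python. This is a simplified model, assuming a hypothetical `users` collection and using a dictionary as a stand-in for an index; production databases typically use tree-based index structures, but the effect on the number of records examined is similar in spirit.

```python
# Hypothetical records; in a real database these would be stored documents or rows.
users = [{"id": i, "email": f"user{i}@example.com"} for i in range(1000)]

def find_by_scan(records, email):
    """Without an index: inspect records one by one until a match is found."""
    examined = 0
    for record in records:
        examined += 1
        if record["email"] == email:
            return record, examined
    return None, examined

# Build the index once: a mapping from the indexed field to its record.
email_index = {record["email"]: record for record in users}

def find_by_index(index, email):
    """With an index: one hash lookup, independent of collection size."""
    return index.get(email)

target = "user999@example.com"
scanned_record, records_examined = find_by_scan(users, target)
indexed_record = find_by_index(email_index, target)
```

In this sketch the scan examines all 1,000 records to reach the last one, while the indexed lookup goes straight to it. The trade-off is that the index must be built and kept up to date as data changes.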
As developers begin to interact with the database programmatically, they need tools that give them visibility into query performance so they can tune and optimize. Examples include monitoring tools that provide real-time server and database metrics, identify performance issues, and offer recommendations, such as index and schema suggestions, to further streamline database queries.
2. How do developers make sense of massive streams of time series data?
Time series data is typical in many modern applications. Internet of Things (IoT) sensor data, financial transactions, clickstreams and logs enable companies to arrive at valuable insights. Developers need the ability to query and analyze this data across rolling time windows while filling in any gaps in incoming data. They also need a way to visualize this data in real time to understand complex trends.
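To make "rolling time windows while filling in any gaps" concrete, here is a minimal sketch in plain Python. It assumes readings keyed by a whole-minute offset and uses linear interpolation as the gap-filling strategy; both choices, along with the `fill_gaps` and `rolling_mean` names and the sample values, are assumptions made for illustration.

```python
def fill_gaps(readings, start, end):
    """Linearly interpolate missing minutes between known readings.

    `readings` maps a minute offset to a value; `start` and `end` must be
    present in `readings` so every gap has a known point on each side.
    """
    known = sorted(readings)
    filled = {}
    for left, right in zip(known, known[1:]):
        span = right - left
        delta = readings[right] - readings[left]
        for t in range(left, right):
            filled[t] = readings[left] + (t - left) * delta / span
    filled[known[-1]] = readings[known[-1]]
    return [filled[t] for t in range(start, end + 1)]

def rolling_mean(values, window):
    """Average over each trailing window of `window` consecutive points."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical sensor temperatures keyed by minute; minutes 2 and 3 never arrived.
sensor = {0: 20.0, 1: 22.0, 4: 28.0, 5: 30.0}
series = fill_gaps(sensor, 0, 5)
smoothed = rolling_mean(series, window=3)
```

Filling gaps before windowing keeps every window the same width, so a dropped reading does not silently skew the average.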
Another key requirement is a mechanism that automates life cycle management for time series data. As data ages, it should be moved out of hot storage to avoid overloading active systems; however, there is still value in this data, especially in aggregated form, for historical analysis.
So organizations need a systematic way to tier this data into affordable object storage while maintaining their ability to access and query it for the insights it can yield.
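One way to picture this life cycle is a job that splits readings by age and keeps an aggregated view of what it archives. The sketch below is an assumption-laden illustration in plain Python: the seven-day retention period, the daily roll-up, and the `tier_by_age` and `daily_rollup` helpers are all hypothetical, and a real platform would typically handle this declaratively rather than in application code.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def tier_by_age(readings, now, max_hot_age):
    """Split readings into those kept hot and those old enough to archive."""
    cutoff = now - max_hot_age
    hot = [r for r in readings if r["ts"] >= cutoff]
    cold = [r for r in readings if r["ts"] < cutoff]
    return hot, cold

def daily_rollup(readings):
    """Aggregate archived readings into one average per calendar day."""
    buckets = defaultdict(list)
    for r in readings:
        buckets[r["ts"].date()].append(r["value"])
    return {day: sum(vals) / len(vals) for day, vals in buckets.items()}

now = datetime(2024, 1, 10, 12, 0)
readings = [
    {"ts": datetime(2024, 1, 1, 8, 0), "value": 10.0},
    {"ts": datetime(2024, 1, 1, 20, 0), "value": 14.0},
    {"ts": datetime(2024, 1, 9, 9, 0), "value": 18.0},
    {"ts": datetime(2024, 1, 10, 11, 0), "value": 19.0},
]

# Assumed policy for this sketch: anything older than seven days leaves hot storage.
hot, cold = tier_by_age(readings, now, max_hot_age=timedelta(days=7))
archive_summary = daily_rollup(cold)
```

Here the two readings from January 1 move to the cold tier and are summarized as a single daily average, while the recent readings stay hot for low-latency access.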
3. How do developers create intelligent applications that automatically respond to events in real time?
Modern applications must be able to continuously analyze data in real time as they react to live events. Applications must be able to access real-time data changes and then automatically execute application code in response to the event, allowing developers to build reactive, real-time analytics into the app.
Dynamic pricing in a ride-hailing service, recalculating delivery times in a logistics app as traffic conditions change, triggering a service call when a factory machine component starts to fail, or initiating a trade when stock markets move: these are just a few examples of in-app analytics that require continuous real-time data analysis.
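The pattern of reacting to data changes by running application code can be sketched with a toy publish-and-subscribe feed. Everything here is hypothetical: the `ChangeFeed` class stands in for whatever change notification mechanism a database provides, and the vibration threshold is an invented value for the factory machine example.

```python
from collections import defaultdict

class ChangeFeed:
    """Toy stand-in for a database change feed with registered triggers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, collection, handler):
        """Register a function to run whenever `collection` changes."""
        self._handlers[collection].append(handler)

    def publish(self, collection, document):
        """Simulate a write: deliver the changed document to each handler."""
        for handler in self._handlers[collection]:
            handler(document)

service_calls = []
VIBRATION_LIMIT = 7.5  # assumed threshold, for illustration only

def schedule_service_if_failing(reading):
    """Trigger logic: open a service call when vibration exceeds the limit."""
    if reading["vibration"] > VIBRATION_LIMIT:
        service_calls.append(
            {"machine": reading["machine"], "reason": "high vibration"}
        )

feed = ChangeFeed()
feed.on("sensor_readings", schedule_service_if_failing)

# Two simulated writes: only the second crosses the threshold.
feed.publish("sensor_readings", {"machine": "press-1", "vibration": 3.2})
feed.publish("sensor_readings", {"machine": "press-2", "vibration": 9.1})
```

The application never polls for new readings; the analysis runs as a side effect of the write itself, which is what keeps the response close to the event.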
4. How do developers combine live application data in hot database storage with legacy data in cooler cloud storage to make predictions?
Data is increasingly distributed across different applications, microservices and even cloud providers. Some of this data consists of recently ingested time series measurements or orders placed in your e-commerce store, and it resides in hot database storage. Other data sets consist of older data that can be archived in lower-cost cloud object storage.
Organizations must be able to query, mix and analyze fresh data coming in from microservices and IoT devices alongside cooler data, APIs and third-party data sources residing in object stores in ways not possible with traditional databases.
The ability to bring all key data elements together is critical to understanding trends and making predictions, whether handled by a human or as part of a machine learning process.
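A single query that spans hot and cold tiers can be approximated in a few lines. In this sketch the hot tier is a Python list and the cold tier is a string of JSON lines, standing in for an operational database and an object store respectively; the data, the `read_archive` helper, and the `monthly_revenue` function are all hypothetical.

```python
import json
from itertools import chain

# Hot tier: recent orders, as they might sit in the operational database.
hot_orders = [
    {"order_id": 103, "month": "2024-03", "total": 120.0},
    {"order_id": 104, "month": "2024-03", "total": 80.0},
]

# Cold tier: older orders archived as JSON lines, as they might sit in object storage.
archived_jsonl = "\n".join([
    '{"order_id": 101, "month": "2024-01", "total": 100.0}',
    '{"order_id": 102, "month": "2024-02", "total": 150.0}',
])

def read_archive(jsonl_text):
    """Parse archived JSON lines lazily, one record at a time."""
    for line in jsonl_text.splitlines():
        if line.strip():
            yield json.loads(line)

def monthly_revenue(*sources):
    """Run one aggregation across every source, regardless of where it lives."""
    totals = {}
    for order in chain(*sources):
        totals[order["month"]] = totals.get(order["month"], 0.0) + order["total"]
    return dict(sorted(totals.items()))

revenue = monthly_revenue(hot_orders, read_archive(archived_jsonl))
```

The caller sees one result covering all three months and does not need to know which tier each record came from. That unified view is the input a trend analysis or a machine learning model would consume.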
5. How can developers bring analytics into applications without compromising performance?
Live, customer-facing applications must serve many concurrent users while ensuring low, predictable latency, and they must do so consistently at scale. Any slowdown degrades the customer experience and drives customers towards competitors. In an oft-cited study, Amazon found that just 100 milliseconds of extra load time cost them 1% in sales. So it’s critical that analytics requests on live data don’t impact app performance.
A distributed architecture can help enforce isolation between the transactional and analytical sides of an application within a single database cluster. You can also use sophisticated replication techniques to move data to systems that are completely isolated but look like a single system to the app.
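Workload isolation within a single cluster can be pictured as routing by workload type. The following sketch is a simplified model, not a real driver or cluster API: the `Node` and `Cluster` classes, the "operational" and "analytics" role labels, and the query strings are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A cluster member; `role` marks which workload it is reserved for."""
    name: str
    role: str  # "operational" or "analytics" (labels assumed for this sketch)
    served: list = field(default_factory=list)

    def run(self, query):
        self.served.append(query)
        return f"{self.name} ran {query}"

class Cluster:
    """Routes each query to a node dedicated to its workload type."""

    def __init__(self, nodes):
        self._by_role = {}
        for node in nodes:
            self._by_role.setdefault(node.role, []).append(node)
        self._next = {role: 0 for role in self._by_role}

    def execute(self, query, workload="operational"):
        nodes = self._by_role[workload]
        node = nodes[self._next[workload] % len(nodes)]  # round-robin within the role
        self._next[workload] += 1
        return node.run(query)

primary = Node("node-a", "operational")
analytics = Node("node-b", "analytics")
cluster = Cluster([primary, analytics])

# The app talks to one cluster; heavy analytics never lands on the operational node.
cluster.execute("get_cart(user=42)")
cluster.execute("revenue_by_region()", workload="analytics")
cluster.execute("place_order(user=42)")
```

The application addresses a single logical system, yet the long-running analytical query consumes resources only on the node reserved for it, so customer-facing latency stays predictable.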
The bridge to app-driven analytics
As application-driven analytics becomes pervasive, a developer data platform is needed to unify the core data services required to build smarter apps and improve business visibility.
A developer data platform bridges the traditional divide between transactional and analytical workloads in an elegant and integrated data architecture, serving as a single platform that manages a common data set for both developers and analysts. It minimizes data movement and duplication, eliminates data silos, reduces architectural complexity and unlocks analytics faster on live operational data. The final, critical requirement is that it does all of this while meeting the most demanding needs for robustness, scale and privacy.
To learn more, read Application-Driven Analytics: Defining the Next Wave of Modern Apps.