Datos IO wants to provide DR for distributed databases

Startup Datos IO eyes 15-minute RPO, with point-in-time recovery of distributed databases used in cloud applications.

Startup Datos IO has come out of stealth, with an eye on the data protection market for a new generation of distributed databases built for cloud and big data applications.

The data protection technology targets scale-out databases, such as MongoDB, Cassandra, Redis, Google BigTable and Amazon Dynamo. Datos IO has started an early access program for its product platform, and its newly raised funding will be used to build out sales and marketing, as well as ongoing product development.

The startup, based in San Jose, Calif., claims seven customer deployments and 16 product betas in the pipeline. Datos IO expects its distributed database product to be generally available in 2016.

"We believe we have a first-mover advantage," said Puneet Agarwal, a partner at San Francisco-based venture capital firm True Ventures and a Datos IO director. "Five of the top databases are all distributed, so there is a clear movement in this direction. Recovery management is a natural component that needs to exist. It's basic recovery management in scale-out databases. This is a very hard problem to solve."

Datos IO CEO Tarun Thakur -- formerly of Data Domain, Veritas and IBM Research -- and Prasenjit Sarkar, previously a master inventor at IBM Research, founded the company in June 2014. Five of the startup's 23 employees are from IBM Research labs. Datos IO is operating on the premise that business requirements are changing, and that application and database architects have new recovery requirements for data stores built on scale-out designs.

Its distributed database software is designed to provide point-in-time versioning and recovery for scale-out databases, allowing enterprise users to fix operational errors before they are replicated and corrupt other nodes.

Datos IO is using a distributed versioning platform designed for recovery point objectives (RPOs) as low as 15 minutes. The versions are fault-resilient, follow application-defined quorum and are stored in native format. The platform will use semantic deduplication to reduce capacity usage by removing duplicate records from databases.
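To make the idea of record-level deduplication concrete, here is a minimal Python sketch. It is not Datos IO's implementation, which the company has not described in detail; it only assumes that replicas of a distributed database hold overlapping copies of the same records and that a content hash is enough to spot the duplicates. The function names and sample data are hypothetical.

```python
import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Hash a record's logical content, ignoring the order of its fields."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(replica_dumps: list[list[dict]]) -> list[dict]:
    """Keep one copy of each distinct record across all replica dumps."""
    seen: set[str] = set()
    unique: list[dict] = []
    for dump in replica_dumps:
        for record in dump:
            fingerprint = record_fingerprint(record)
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append(record)
    return unique


# Hypothetical data: three replicas holding overlapping copies of three rows.
node_a = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]
node_b = [{"name": "ada", "id": 1}, {"id": 3, "name": "cy"}]
node_c = [{"id": 2, "name": "bob"}, {"id": 3, "name": "cy"}]

stored = deduplicate([node_a, node_b, node_c])
print(f"{len(stored)} unique records kept out of 6 copies")
```

Because the fingerprint is computed on the record's content rather than on raw storage blocks, the same row is recognized as a duplicate even when replicas lay it out differently, which is the general sense in which such deduplication is called "semantic."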

Users will be able to restore an entire keyspace or a single column family, without manual steps in the process. Copying and scripts are not needed. The versions are in native format and are database-consistent, so no repair is required when restoring.

"They provide recovery software that can roll back an instance of a database before a failure occurs," said Deni Connor, founding analyst of SSG-NOW. "The new databases are used for building cloud applications, and big data distributed across a cluster of nodes and wide-geographical areas. I haven't seen any product like this. I do expect to see more."

Datos IO has raised $12.5 million in Series A funding from Lightspeed Venture Partners and True Ventures, and another $2.75 million in seed funding.