Google Cloud Storage ha tenido desde hace tiempo la habilidad de operar Hadoop de manera que los desarrolladores pueden realizar analítica avanzada en sus plataformas de computing distribuido. Ahora, Google está tratando de simplificar este roceso con un nuevo conector que, según la compañía, facilita el correr Hadoop en la plataforma Google Cloud. Más abajo el resto del artículo escrito por Alex Williams en el portal techcrun.com
The Google Cloud Storage connector for Hadoop manages the cluster and file system for developers so they can focus on the logic of their data processing without the complexity of setting it up and managing themselves.
Google originally developed the Google File System in 2003. It is now the basis for Hadoop, an open-source distributed computing environment managed by the Apache Software Foundation that allows for data to be kept in small pieces on server clusters and then processed to do data analytics. Out of Hadoop has emerged a diverse ecosystem anchored by companies such as Cloudera and Hortonworks.
The new Google Connector for Hadoop is based on Colossus, the latest iteration of the company’s cloud storage system. The new service uses a simple connector library, allowing Hadoop to run directly against Google Cloud Storage, giving the customer the ability to leverage Google’s expertise in large data processing.
Google lists a number of benefits that come with the new connector. Developers can start up a Hadoop cluster pretty quickly as it is all managed in one place on Google Cloud Storage. It leverages Google’s scalability, giving the user higher availability. It costs less as there is no need to keep two copies of the data. With most traditional Hadoop systems, the user has to keep a copy of the data for running Hadoop and another for backup. Instead, Google handles it all through its storage system.
Hadoop has hit the mainstream of the data-analytics world. As I noted in a post last month, Hadoop is an important technology for Internet companies like Twitter that process data by the petabyte. It is also of increasing importance for more traditional organizations that also now must process unprecedented amounts of information.
But Hadoop has traditionally been a complex undertaking, requiring people with multiple talents to unleash its potential. Google Cloud Storage Connector for Hadoop shows how Hadoop is becoming more accessible and readily available as a service more so than a complex undertaking that requires any number of skills to leverage.