Business people, Big Data is coming for you.
Software that captures lots of data and uses it to make predictions has mostly been the province of engineers skilled in arcane databases and statisticians capable of developing complex algorithms. As the Big Data business grows, however, software makers are domesticating their products in the hope they will prove attractive to a broader population.
Cloudera, which offers a popular version of the open source data-processing system called Hadoop, released software on Wednesday that makes it possible to run queries through a more mainstream SQL interface. SQL, a query language adopted by Oracle, Microsoft and others, is known to millions of business analysts.
“This enables us to talk to a whole other class of customer,” said Mike Olson, the chief executive of Cloudera. “The knock against Hadoop was that it is too complex.”
There is a reason for that. Hadoop is one of several systems for so-called unstructured data that grew out of work at Yahoo and Google, after those two companies found they had previously unimaginable amounts of data about activities like people’s Web-surfing habits. Put into databases designed to handle this unstructured data, then analyzed, this information was valuable for figuring out things like what advertisement to put in front of each individual Web surfer.
Now, with more commerce, content and social behavior online, Hadoop-like systems are valuable to mainstream corporations. Cloudera, which was formed by veterans of Google, Yahoo and Oracle, was among the first to make a commercial management product to go with the open source Hadoop.
Cloudera’s new SQL offering, named Impala, is based on a project called Dremel that began inside Google. Mr. Olson said Google had released papers on Dremel, but Cloudera was the first to make a public version.
Like Hadoop itself, Impala will be open source, and Cloudera will make money from subscriptions to its management software. The Hadoop product was also improved, Mr. Olson said, so complex queries could now be performed up to 30 times faster.
This is not the only way companies are trying to reach more Big Data customers. Last week Teradata released a no-cost trial version of a combination database-analysis program that is capable of handling traditional SQL queries as well as larger data analysis work.
The product, which comes from Teradata’s acquisition of Aster Data, has more than 50 analytical functions, including social network analysis and fraud detection. The target audience includes business analysts as much as highly trained data scientists. It comes with tutorials, presumably in the hope that prospective customers will love the test product enough to buy a full-featured production version.