Mature, robust, Open Source relational Big Data warehousing solution provides advanced "SQL-on-Hadoop®" functionality and support.
Forest Hill, MD —9 March 2015— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache™ Tajo™ v0.10.0, the latest version of the advanced Open Source data warehousing system in Apache Hadoop®.
Apache Tajo is used for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load process) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources. By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities.
"Tajo has evolved over the last couple of years into a mature 'SQL-on-Hadoop' engine," said Hyunsik Choi, Vice President of Apache Tajo. "The improved JDBC driver in this release allows users to easily access Tajo as if users use traditional RDBMSs. We have verified new JDBC driver on many commercial BI solutions and various SQL tools. It was easy and works successfully."
Tajo v0.10.0 reflects dozens of new features and improvements, including:
- Oracle and PostgreSQL catalog store support
- Direct JSON file support
- HBase storage integration (allowing users to directly access HBase tables through Tajo)
- Improved JDBC driver for easier use of JDBC applications
- Improved Amazon S3 support
A complete overview of all new enhancements can be found in the project release notes at https://dist.apache.org/repos/dist/dev/tajo/tajo-0.10.0-rc1/relnotes.html
Described as "a dark horse in the race for mass adoption" by GigaOM, Tajo is in use at numerous organizations worldwide, including Gruter, Korea University, Melon, NASA JPL Radio Astronomy and Airborne Snow Observatory projects, and SK Telecom for processing Web-scale data sets in real time.
Byeong Hwa Yun, Project Leader at Melon, said, "Congratulations on 0.10.0 release! Melon is the biggest music streaming service company in S. Korea. We use Tajo as an ETL tool as well as an analytical processing system. We have experienced that Tajo makes our ETL jobs faster 1.5x-10x than Hive does. Besides, HBase storage integration in this release enables our analytic pipeline simpler. We hope that Tajo has a large role to play in the Apache Hadoop ecosystem."
"I'm very happy with that Tajo has rapidly developed in recent years," said Jihoon Son, member of the Apache Tajo Project Management Committee. "One of the most impressive parts is the improved support on Amazon S3. Thanks to the EMR bootstrap, users can exploit Tajo's advanced SQL functionalities on AWS with just a few clicks."
Availability and Oversight
Apache Tajo software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Tajo, visit http://tajo.apache.org/ and https://twitter.com/ApacheTajo
About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server, the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and follow https://twitter.com/TheASF
# # #