Up Your Game: How to Rock Data Quality Checks in the Data Lake

Posted by Adam Diaz on Feb 7, 2017 2:52:06 PM

Common sense tells us one can’t use data unless its quality is understood. Data quality checks are critical for the data lake, but it’s not unusual for companies to gloss over this process in the rush to move data into less costly, scalable Hadoop storage, especially during initial adoption. After all, isn't landing data in Hadoop with little definition of schema or data quality what Hadoop is all about? Once data has landed in a raw zone in Hadoop, reality quickly sets in: for data to be useful, both structure and data quality must be applied. Defining data quality rules becomes particularly important depending on what sort of data you’re bringing into the data lake; for example, large volumes of data from machines and sensors. Validation is essential because this data comes from an external environment and probably hasn’t gone through any quality checks.
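
As a rough illustration of what "defining data quality rules" for sensor data might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names (`sensor_id`, `temperature_c`), the acceptable temperature range, and the `Rule` and `validate` helpers are hypothetical and not taken from the post or from any particular product. It simply splits raw records into those that pass every rule and those that fail, keeping the names of the failed rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical record shape: one raw sensor reading parsed into a dict.
Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A named data quality rule; `check` returns True when the record passes."""
    name: str
    check: Callable[[Record], bool]


def _is_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Illustrative rules only; field names and thresholds are assumptions.
RULES: List[Rule] = [
    Rule("sensor_id_present",
         lambda r: isinstance(r.get("sensor_id"), str) and r["sensor_id"] != ""),
    Rule("temperature_is_number",
         lambda r: _is_number(r.get("temperature_c"))),
    Rule("temperature_in_range",
         lambda r: _is_number(r.get("temperature_c"))
         and -50.0 <= r["temperature_c"] <= 150.0),
]


def validate(
    records: List[Record], rules: List[Rule] = RULES
) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Split records into (passed, rejected); rejected keeps the failed rule names."""
    passed: List[Record] = []
    rejected: List[Tuple[Record, List[str]]] = []
    for record in records:
        failures = [rule.name for rule in rules if not rule.check(record)]
        if failures:
            rejected.append((record, failures))
        else:
            passed.append(record)
    return passed, rejected


if __name__ == "__main__":
    raw = [
        {"sensor_id": "a1", "temperature_c": 21.5},
        {"sensor_id": "", "temperature_c": 21.5},
        {"sensor_id": "b2", "temperature_c": "hot"},
    ]
    good, bad = validate(raw)
    print(f"{len(good)} passed, {len(bad)} rejected")
    for record, failures in bad:
        print(record, "failed:", ", ".join(failures))
```

In a real data lake, checks like these would more likely run inside the ingestion pipeline, with rejected records routed to a separate location for review rather than printed.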


Topics: Hadoop, Big Data Ecosystem, Bedrock, Data Lake Solutions, Data Warehouse, Data Lake, Metadata Management

How Data Lakes Work

Posted by Ben Sharma on Jan 26, 2017 8:06:19 AM

Excerpt from ebook, Architecting Data Lakes: Data Management Architectures for Advanced Business Use Cases, by Ben Sharma and Alice LaPlante.


Topics: Hadoop, Ben Sharma, Big Data Ecosystem, Data Warehouse, Data Lake, Data Management

The Executive Guide to Data Warehouse Augmentation

Posted by Rajesh Nadipalli on Jan 19, 2017 3:12:13 PM

The traditional data warehouse (DW) is constrained in terms of storage capacity and processing power. That’s why the overall footprint of the data warehouse is shrinking as companies look for more efficient ways to store and process big data. Although data warehouses are still used effectively by many companies for complex data analytics, creating a hybrid architecture by migrating storage and large-scale or batch processing to a data lake enables companies to save on storage and processing costs and get more value from their data warehouse for business intelligence activities.


Topics: Hadoop, Data Warehouse, Data Lake, Data Management

The Business Case for Data Lakes

Posted by Ben Sharma on Jan 3, 2017 12:58:46 PM

Excerpt from ebook, Architecting Data Lakes: Data Management Architectures for Advanced Business Use Cases, by Ben Sharma and Alice LaPlante.


Topics: Big Data Ecosystem, Data Lake Solutions, Data Warehouse, Data Lake

Zaloni Zip: Building a Modern Data Lake Architecture Pt. 2

Posted by Rajesh Nadipalli on Dec 13, 2016 11:27:48 AM

In the last video, we looked at the pain points of traditional data warehouse architecture and the high-level architecture of the next-generation data lake based on Hadoop.

In this video, I will discuss the key components you need to build a new architecture.


Topics: Hadoop, Big Data Ecosystem, Data Warehouse, Zaloni Zip, Data Lake, Data Management

Hadoop and Transactions are Not What You Think

Posted by Adam Diaz on Nov 29, 2016 12:53:19 PM

Does Hadoop have the ability to support transactions? This is one of the most common questions I hear from folks who are new to Hadoop and searching for the best technology for their specific use case. Folks from the RDBMS world tend to start by mapping what they know from years of experience onto similar functionality in Hadoop, and data provided via transactions is obviously a common use case. Unfortunately, many still have not heard of the improvements made in Hive since the early days of the Hadoop wars or, more specifically, the Hive/Impala wars. Of course, Spark has also burst onto the scene, but many folks still use Hive as the go-to technology, especially where integration with their favorite BI tool is involved.


Topics: Hadoop, Big Data Ecosystem, Data Warehouse, Spark

Zaloni Zip: Data Warehouse Architecture

Posted by Rajesh Nadipalli on Nov 2, 2016 11:53:26 AM

In the latest Zaloni Zip, Raj Nadipalli discusses how to modernize your data warehouse (DW) architecture. He specifically addresses the traditional DW architecture, its pain points, and the modern data lake architecture.


Topics: Hadoop, Big Data Ecosystem, Data Warehouse, Zaloni Zip

So, you want to be a tech visionary? An executive guide to data lakes

Posted by Greg Wood on Aug 19, 2016 11:47:54 AM

You’ve heard it time and time again: cloud is the future; those who don’t adopt modern big data practices will fall behind the pack; the next wave of IT disruption is right around the corner. And yet, at the same time, budgets are shrinking, demand is growing and pressure on the IT organization to show value is at an all-time high. As an executive, you have the full force of your business behind you and more options than ever to achieve both short- and long-term goals with business data. So many options, in fact, that the landscape has become a confusing, often contradictory mess of competing solutions.


Topics: Hadoop, Big Data Ecosystem, Data Lake Solutions, Data Warehouse