How to Get Your Ingested Data Production-Ready in Just 8 Weeks

Posted by Kelly Schupp, VP, Data-Driven Marketing on Feb 23, 2017 3:21:01 PM

Whether you’re planning for a data lake implementation or have a proof-of-concept (POC) in place, one thing is clear: You must hydrate the data lake before it can be ready for production. Depending on the complexity and volume of your data, ingestion and preparation can take at least six months.

Read More

Topics: Big Data Ecosystem, Bedrock, Data Lake, Data Management, Data Governance, Metadata Management

What Is Metadata and Why Is It Critical in Today’s Data Environment?

Posted by Scott Gidley on Feb 15, 2017 3:43:06 PM

Excerpt from ebook, Understanding Metadata: Create the Foundation for a Scalable Data Architecture, by Federico Castanedo and Scott Gidley.

Read More

Topics: Big Data Ecosystem, Data Lake, Data Management, Metadata Management

Zaloni Zip: Solving the Challenges of Hybrid Data Lake Architecture

Posted by Parth Patel on Feb 14, 2017 3:47:07 PM

In this Zaloni Zip, we will discuss the challenges of a hybrid data lake architecture and how Zaloni’s centralized data lake management platform tackles those challenges head-on.

Read More

Topics: Big Data Ecosystem, Bedrock, Zaloni Zip, Data Lake, Data Management, Data Governance, Metadata Management

New Releases of Bedrock and Mica Expand Data Lake Beyond Hadoop

Posted by Kelly Schupp, VP, Data-Driven Marketing on Feb 9, 2017 9:33:09 AM

With our latest Bedrock and Mica updates, we’re pushing the boundaries of what has up until now typically defined a data lake: Hadoop. Why are we moving in this direction? Because it makes sense for our clients, who need a solution to centralize management of data from siloed data systems, legacy databases and hybrid architectures. Our solutions support the concept of a data lake beyond Hadoop to encompass a more holistic, enterprise-wide approach. By constructing a “logical” data lake architecture versus a physical one, we can give companies transparency into all of their data regardless of its location, enable application of enterprise-wide governance capabilities, and allow for expanded, controlled access for self-serve business users across the organization.
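The idea of a “logical” data lake can be pictured as a single catalog that records where each dataset physically lives. A minimal sketch, with invented dataset names, systems, and locations, might look like this:

```python
# Hypothetical sketch of a "logical" data lake: one catalog maps dataset
# names to their physical systems (Hadoop, a legacy RDBMS, cloud storage),
# so discovery and governance happen in one place even though the data
# itself stays distributed. All names and locations are invented.

catalog = [
    {"name": "clickstream", "system": "hdfs", "location": "/data/raw/clickstream"},
    {"name": "customers", "system": "oracle", "location": "crm.customers"},
    {"name": "images", "system": "s3", "location": "s3://lake/images/"},
]

def locate(name):
    """Resolve a dataset name to its physical system and location."""
    for entry in catalog:
        if entry["name"] == name:
            return entry["system"], entry["location"]
    raise KeyError(name)

locate("customers")  # ("oracle", "crm.customers")
```

Because consumers go through the catalog rather than the storage layer directly, access policies and lineage can be enforced uniformly across all three systems.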

Read More

Topics: Hadoop, Big Data Ecosystem, Bedrock, Zaloni News, Data Lake, Data Management, Mica, Data Governance, Metadata Management

Up Your Game: How to Rock Data Quality Checks in the Data Lake

Posted by Adam Diaz on Feb 7, 2017 2:52:06 PM

Common sense tells us one can’t use data unless its quality is understood. Data quality checks are critical for the data lake, but it’s not unusual for companies to initially gloss over this process in the rush to move data into less-costly, scalable Hadoop storage, especially during initial adoption. After all, isn’t landing data in Hadoop with little definition of schema or data quality what Hadoop is all about? Once data lands in a raw zone in Hadoop, the reality quickly sets in that both structure and data quality must be applied before the data is useful. Defining data quality rules becomes particularly important depending on what sort of data you’re bringing into the data lake; for example, large volumes of data from machines and sensors. Validating this data is essential because it comes from an external environment and probably hasn’t gone through any quality checks.
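The kind of rule-based validation described above can be sketched in a few lines. This is a hypothetical illustration, not Bedrock’s implementation; the field names and thresholds are invented:

```python
# Hypothetical sketch: applying simple data-quality rules to raw sensor
# records landed in a raw zone, then routing records to a trusted zone
# or a quarantine. Field names and thresholds are invented.

def validate(record):
    """Return a list of rule violations for one sensor record."""
    violations = []
    if record.get("sensor_id") is None:
        violations.append("missing sensor_id")
    temp = record.get("temperature")
    if temp is None:
        violations.append("missing temperature")
    elif not (-40.0 <= temp <= 125.0):  # plausible operating range
        violations.append("temperature out of range")
    return violations

raw_zone = [
    {"sensor_id": "s-1", "temperature": 21.5},
    {"sensor_id": None, "temperature": 900.0},
]

# Records that pass all rules move on; the rest are held for review.
trusted = [r for r in raw_zone if not validate(r)]
quarantined = [r for r in raw_zone if validate(r)]
```

Keeping the rules as data-driven checks like these makes it easy to add new rules as new sources are onboarded, rather than baking validation into each ingestion job.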

Read More

Topics: Hadoop, Big Data Ecosystem, Bedrock, Data Lake Solutions, Data Warehouse, Data Lake, Metadata Management

Deriving Value from the Data Lake

Posted by Ben Sharma on Jan 18, 2017 2:48:23 PM

Excerpt from ebook, Architecting Data Lakes: Data Management Architectures for Advanced Business Use Cases, by Ben Sharma and Alice LaPlante.

Read More

Topics: Ben Sharma, Big Data Ecosystem, Data Lake, Data Management, Data Governance, Metadata Management

3 Keys to Creating an Enterprise-scale Security Model for the Data Lake

Posted by Parth Patel on Dec 21, 2016 10:30:28 AM

We’re seeing data lake environments grow from the size of tens of terabytes to the colossal scale of petabytes. As a result, more enterprises are questioning how on earth they should go about governing something so huge and complex. From our perspective, a policy-based or attribute-based security model is paramount in terms of creating an enterprise-scale security model for the data lake. Leveraging metadata, a policy-based security model automates permissions and access – and is really the only way to confidently secure big data while still allowing the access necessary to democratize use and derive value from the data lake.
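To make the attribute-based idea concrete, here is a minimal sketch of a policy check driven by metadata attributes rather than per-file grants. The attribute names, datasets, and policy are invented for illustration:

```python
# Hypothetical sketch of an attribute-based access check: a decision is
# made by comparing a user's attributes against attributes attached to a
# dataset's metadata, instead of maintaining per-file permissions.
# All names and the policy itself are invented.

DATASET_METADATA = {
    "claims_2016": {"sensitivity": "pii", "department": "insurance"},
    "web_logs": {"sensitivity": "public", "department": "marketing"},
}

def can_read(user_attrs, dataset):
    meta = DATASET_METADATA[dataset]
    # Policy 1: PII-tagged data requires PII clearance.
    if meta["sensitivity"] == "pii" and not user_attrs.get("pii_cleared"):
        return False
    # Policy 2: users may only read their own department's data.
    return user_attrs.get("department") == meta["department"]

analyst = {"department": "insurance", "pii_cleared": True}
intern = {"department": "marketing", "pii_cleared": False}

can_read(analyst, "claims_2016")  # True: cleared, right department
can_read(intern, "claims_2016")   # False: the PII policy blocks access
```

Because the policies reference metadata attributes, tagging a new dataset at ingestion time is enough to secure it; no per-dataset permission grants are needed.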

Read More

Topics: Big Data Ecosystem, Data Lake, Data Management, Data Governance, Metadata Management

Zaloni Zip: Using Transient Clusters and Keeping Your Metadata

Posted by Parth Patel on Dec 6, 2016 9:26:47 AM

As the name suggests, transient clusters are compute clusters that automatically shut down and stop billing when processing is finished. However, using this cost-effective approach has been an issue because metadata is automatically deleted by the cloud provider when a transient cluster is shut down.

This is noteworthy because metadata is the key to getting value from big data. Therefore, most enterprises have opted to pay for persistent compute across the board in order to maintain the metadata. How can enterprises leverage transient clusters for cost savings and maintain their metadata?

Read More
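One common pattern is to snapshot the cluster’s metadata to an external store before shutdown and restore it onto the next transient cluster. The sketch below is a hypothetical illustration of that pattern, with a plain dict standing in for an external database or object store:

```python
# Hypothetical sketch: persist a transient cluster's table metadata to an
# external store before the cluster terminates, then rehydrate it onto a
# freshly provisioned cluster. The "store" here is a dict standing in for
# an external database or object store; all names are invented.

external_store = {}

def snapshot_metadata(cluster_metadata, store):
    """Copy table metadata off the cluster before shutdown."""
    store.update({name: dict(entry) for name, entry in cluster_metadata.items()})

def restore_metadata(store):
    """Rehydrate metadata onto a new transient cluster."""
    return {name: dict(entry) for name, entry in store.items()}

# First transient cluster registers a table, snapshots, and shuts down.
cluster_a = {"sales": {"location": "s3://bucket/sales/", "format": "parquet"}}
snapshot_metadata(cluster_a, external_store)

# A later transient cluster starts empty and restores the same metadata.
cluster_b = restore_metadata(external_store)
```

Because the metadata outlives any single cluster, compute can be billed only while jobs run, while the catalog of tables, partitions, and lineage remains continuously available.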

Topics: Hadoop, Big Data Ecosystem, Bedrock, Zaloni Zip, Data Management, Metadata Management

Metadata is Critical for Fishing in the Big Data Lake

Posted by Andy Oram on Nov 21, 2016 11:17:05 AM

Excerpt from report, Managing the Data Lake: Moving to Big Data Analysis, by Andy Oram, editor at O’Reilly Media.

Read More

Topics: Hadoop, Big Data Ecosystem, Bedrock, Data Lake, Data Management, Metadata Management

Webinar: Understanding Metadata – Why It's Essential to Your Big Data Solution and How to Manage It Well

Posted by Kelly Schupp, VP, Data-Driven Marketing on May 24, 2016 2:49:49 PM

How much does metadata impact your big data solution and your business?

The answer might surprise you. Metadata is essential for managing, migrating, accessing, and deploying a big data solution. Without it, enterprises have limited visibility into the data itself and can’t trust its quality, negating the value of the data in the first place. Creating end-to-end data visibility allows you to keep track of data, enable search and query across big data systems, safeguard your data, and reduce risk.

Read More

Topics: Big Data Ecosystem, Metadata Management